In this chapter, we cover two very powerful machine-learning algorithms. Although these techniques have complex mathematical formulations, efficient algorithms and reliable software packages have been developed to apply them to a wide range of practical problems. We will (1) describe neural networks as analogues of biological neurons, (2) develop, hands-on, a neural net that can be trained to compute the square-root function, (3) describe support vector machine (SVM) classification, and (4) complete several case studies, including optical character recognition (OCR), the Iris flowers, Google Trends and the Stock Market, and Quality of Life in chronic disease.
Later, in Chapter 22, we will provide more details and additional examples of deep neural network learning. For now, let’s start by exploring the magic inside the machine learning black box.
An Artificial Neural Network (ANN) model mimics the biological brain's response to multisource (sensory-motor) stimuli (inputs). An ANN simulates the brain with a network of interconnected artificial neuron cells that acts as a massively parallel processor. Of course, it uses a network of artificial nodes, not brain cells, and it learns from training data.
When we have three signals (or inputs) \(x_1\), \(x_2\), and \(x_3\), the first step is to weight the features (\(w\)'s) according to their importance. The weighted signals are then summed by the "neuron cell", and this sum is passed on through an activation function, denoted by \(f\), which generates the output \(y\). A typical output has the following mathematical relationship to the inputs. The weights \(\{w_i\}_{i\ge 1}\) control the weighted averaging of the inputs \(\{x_i\}\) used to evaluate the activation function. The constant-factor weight \(w_0\) and the corresponding bias term \(b\) allow us to shift (offset) the entire activation function to the left or right. \[y(x)=f\left (w_0 \underbrace{b}_{bias}+\sum_{i=1}^n \overbrace{w_i}^{weights} \underbrace{x_i}_{inputs}\right ).\]
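As a concrete illustration, here is a minimal sketch in R (all function and variable names are illustrative, not from the text) of a single artificial neuron evaluating \(y(x)\) for three inputs. It uses the sigmoid, one of the activation functions discussed below, as the example \(f\).

```r
# A single artificial neuron: y = f(w0*b + sum(w_i * x_i)),
# using the sigmoid as an example activation function f.
sigmoid <- function(z) 1 / (1 + exp(-z))

neuron_output <- function(x, w, w0 = 1, b = 0, f = sigmoid) {
  f(w0 * b + sum(w * x))              # bias offset plus weighted sum of inputs
}

# Example: three input signals weighted by their importance
x <- c(0.5, -1.2, 3.0)                # inputs x1, x2, x3
w <- c(0.8,  0.1, 0.4)                # weights w1, w2, w3
neuron_output(x, w, b = -0.5)         # a single scalar output y in (0, 1)
```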
There are three important components for building a neural network: (1) the activation function, which transforms a neuron's combined (net) input into a single output signal; (2) the network topology (architecture), which describes the number of neurons, the number of layers, and the way they are connected; and (3) the training algorithm, which specifies how the connection weights are set in order to learn from the data.

Let's look at each of these components one by one.
One of these functions is the threshold activation function, which produces an output signal only once a specified input threshold has been attained.
\[ f(x)= \left\{ \begin{array}{ll} 0 & x<0 \\ 1 & x\geq 0 \\ \end{array} \right. . \]
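For concreteness, a minimal R sketch of this step function (the function name is illustrative) is shown below.

```r
# Threshold (unit step) activation: 0 for negative net input, 1 otherwise.
# ifelse() keeps it vectorized, so it works on whole input vectors.
threshold <- function(x) ifelse(x < 0, 0, 1)

threshold(c(-2, -0.1, 0, 1.5))   # 0 0 1 1
```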
This is the simplest form of activation function, but it is rarely used in real-world applications. The most commonly used alternative is the sigmoid activation function, \(f(x)=\frac{1}{1+e^{-x}}\), where the Euler number \(e\) is defined as \(\displaystyle\lim_{n\longrightarrow\infty}{\left ( 1+\frac{1}{n}\right )^n}\). The output signal is no longer binary; it can be any real number between 0 and 1.
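The following sketch (again, names are illustrative) evaluates the sigmoid and plots it against the threshold function defined above, showing how it smooths the hard 0/1 jump into values spanning the interval (0, 1).

```r
# Sigmoid activation: maps any real input smoothly into the interval (0, 1).
sigmoid <- function(x) 1 / (1 + exp(-x))

x <- seq(-6, 6, by = 0.1)
plot(x, sigmoid(x), type = "l", ylab = "f(x)",
     main = "Sigmoid vs. threshold activation")
lines(x, ifelse(x < 0, 0, 1), lty = 2)   # the step function, for comparison
legend("topleft", legend = c("sigmoid", "threshold"), lty = c(1, 2), bty = "n")
```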
Other activation functions might also be useful: