In this chapter, we cover several powerful black-box machine learning and artificial intelligence techniques. These techniques have complex mathematical formulations; however, efficient algorithms and reliable software packages have been developed to apply them to a wide range of practical problems. We will (1) describe neural networks as analogues of biological neurons, (2) develop, hands-on, a neural network that can be trained to compute the square-root function, (3) describe support vector machine (SVM) classification, (4) present the random forest as an ensemble ML technique, and (5) analyze several case-studies, including optical character recognition (OCR), the Iris flowers, Google Trends and the Stock Market, and Quality of Life in chronic disease.
Later, in Chapter 14, we will provide more details and additional examples of deep neural network learning. For now, let's start by exploring the mechanics inside these black-box machine learning approaches.
An Artificial Neural Network (ANN) model mimics the biological brain's response to multisource (sensory-motor) stimuli (inputs). An ANN simulates the brain using a network of interconnected neuron cells that act as a massively parallel processor. Of course, ANNs rely on graphs of artificial nodes, not actual brain cells, to model intrinsic process characteristics from observational data.
The basic ANN component is a cell node. Suppose we have an input \(x=\{x_i\}\) to the node, feeding information from upstream network nodes, and one output propagating the information downstream through the network. The first step in fitting an ANN involves estimating the weight coefficients of the input features. These weights (\(w\)'s) correspond to the relative importance of each input. The weighted signals are then summed by the "neuron cell" and the sum is passed on through an activation function, denoted by \(f(\cdot)\). The last step generates an output \(y\) at the end of each node. A typical output has the following mathematical relationship to the inputs: the weights \(\{w_i\}_{i\ge 1}\) control the weighted averaging of the inputs, \(\{x_i\}\), used to evaluate the activation function, while the constant-factor weight \(w_o\) and the corresponding bias term \(b\) allow us to shift (offset) the entire activation function to the left or right. \[\underbrace{y(x)}_{output}=f\left (w_o \underbrace{b}_{bias}+\sum_{i=1}^n \overbrace{w_i}^{weights} \underbrace{x_i}_{inputs}\right ).\]
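To make this node-level computation concrete, here is a minimal R sketch of a single artificial neuron that evaluates \(y(x)=f\left(w_o b+\sum_i w_i x_i\right)\) with a sigmoid activation. The specific inputs, weights, and bias values below are arbitrary illustrative assumptions, not parameters fitted to any dataset.

```r
# A minimal sketch of one artificial "neuron": form the weighted sum
# w_o*b + sum(w_i * x_i) and pass it through a sigmoid activation f().
# All numeric values are illustrative assumptions, not fitted estimates.
sigmoid <- function(z) 1 / (1 + exp(-z))

x   <- c(0.5, -1.2, 0.3)    # inputs x_1, ..., x_n
w   <- c(0.8,  0.1, -0.4)   # weights w_1, ..., w_n
w_o <- 1                    # constant-factor weight
b   <- 0.2                  # bias term

y <- sigmoid(w_o * b + sum(w * x))   # output y(x)
y
```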
There are three important components for building a neural network: (1) an activation function, which transforms a node's combined input signal into a single output signal; (2) a network topology (architecture), which describes the number of nodes and layers and how they are connected; and (3) a training algorithm, which specifies how the connection weights are set.
Let's look at each of these components one by one.
There are many alternative activation functions. One example is the threshold activation function, which produces an output signal only when a specified input threshold has been reached.
\[f(x)= \left\{ \begin{array}{ll} 0 & x<0 \\ 1 & x\geq 0 \\ \end{array} \right. .\]
This is the simplest form of an activation function, and it is rarely used in real-world applications. The most commonly used alternative is the sigmoid (logistic) activation function, \(f(x)=\frac{1}{1+e^{-x}}\), where the Euler number \(e\) is defined by the limit \(e=\displaystyle\lim_{n\longrightarrow\infty}{\left ( 1+\frac{1}{n}\right )^n}\). The output signal is no longer binary; it can be any real number in the interval \((0, 1)\).
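The short R sketch below contrasts the threshold and sigmoid activation functions by plotting both over an arbitrary input range (the range and plot styling are illustrative choices).

```r
# Sketch: compare the threshold (unit-step) and sigmoid activation functions.
threshold <- function(x) ifelse(x >= 0, 1, 0)
sigmoid   <- function(x) 1 / (1 + exp(-x))

x <- seq(-5, 5, length.out = 200)    # illustrative input range
plot(x, sigmoid(x), type = "l", col = "blue", ylab = "f(x)",
     main = "Threshold vs. sigmoid activation")
lines(x, threshold(x), type = "s", col = "red")   # step-style line
legend("topleft", legend = c("sigmoid", "threshold"),
       col = c("blue", "red"), lty = 1)
```

Note how the sigmoid provides a smooth, differentiable transition between 0 and 1, whereas the threshold function jumps abruptly at 0; this smoothness is what makes the sigmoid convenient for gradient-based training.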
Other activation functions might also be useful: