CFA Level II: Machine Learning (2)
Common algorithms under each machine-learning category: their basic principles and applications.
Edited 2020-01-05 03:30:49
ML Algorithms
Supervised ML Algorithms
Penalized regression
LASSO (least absolute shrinkage and selection operator)
λ > 0
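LASSO adds a penalty of λ (λ > 0) times the sum of the absolute coefficient values to the sum of squared errors. A minimal numpy sketch of that penalized objective (the data, coefficient vectors, and λ value are made up for illustration):

```python
import numpy as np

def lasso_objective(X, y, coefs, lam):
    """LASSO penalized-regression objective:
    sum of squared errors + lambda * sum of absolute coefficient values."""
    residuals = y - X @ coefs
    sse = np.sum(residuals ** 2)
    penalty = lam * np.sum(np.abs(coefs))  # lambda > 0 shrinks coefficients toward zero
    return sse + penalty

# Hypothetical toy data: evaluate the objective for two candidate coefficient vectors
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, 2.0])
dense = np.array([0.9, 0.9])
sparse = np.array([1.0, 0.0])
print(lasso_objective(X, y, dense, lam=1.0))   # SSE 0.06 + penalty 1.8
print(lasso_objective(X, y, sparse, lam=1.0))  # SSE 2.00 + penalty 1.0
```

Minimizing this objective over the coefficients (rather than the SSE alone) is what drives some coefficients exactly to zero, performing feature selection.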
classification
Support vector machine (SVM)
Support vector machine (SVM) is a linear classifier that determines the hyperplane that optimally separates the observations into two sets of data points
classification
linear classifier (a binary classifier)
soft margin classification

Applications: particularly suited to small- and medium-size but complex high-dimensional data sets, such as corporate financial statements or bankruptcy databases. Investors seek to predict company failures to identify stocks to avoid or to short sell, and SVM can generate a binary classification (e.g., bankruptcy likely vs. bankruptcy unlikely) using many fundamental and technical feature variables.
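The linear-classifier idea can be sketched from scratch with hinge-loss subgradient descent (an illustrative sketch, not the quadratic-programming solver a real SVM library uses; the two features and the bankruptcy labels are hypothetical):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear classifier by minimizing hinge loss plus an L2 penalty
    with (sub)gradient descent. Labels y must be in {-1, +1}."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:              # point inside the margin: move the hyperplane
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                       # correctly classified with margin: only shrink w
                w -= lr * lam * w
    return w, b

# Hypothetical two-feature data (e.g., profitability vs. leverage scores)
X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -1.5], [-1.0, -2.0]])
y = np.array([1, 1, -1, -1])  # +1 = bankruptcy unlikely, -1 = bankruptcy likely
w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)    # the separating hyperplane is w·x + b = 0
```

The soft-margin idea appears in the `margin < 1` branch: points inside the margin are penalized rather than forbidden, which is what lets SVM tolerate some misclassified training observations.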
K-nearest neighbor (KNN)
classifies a new observation by finding similarities ("nearness") between it and its k nearest neighbors in the existing data set (e.g., grading a diamond by looking at the diamond's k nearest neighbors)
A critical challenge of KNN is defining what it means to be “similar” (or near).
Besides the selection of features, an important decision relates to the distance metric used to model similarity because an inappropriate measure will generate poorly performing models.

Applications: including bankruptcy prediction, stock price prediction, corporate bond credit rating assignment, and customized equity and bond index creation.
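A minimal pure-Python KNN classifier, using plain Euclidean distance as the similarity metric (the features, labels, and k value are hypothetical; as noted above, the distance metric is itself a modeling choice):

```python
import math
from collections import Counter

def knn_classify(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    Uses Euclidean distance; a poor distance metric would degrade results."""
    dists = sorted(
        (math.dist(x, query), lab) for x, lab in zip(train, labels)
    )
    votes = Counter(lab for _, lab in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features, e.g., two credit ratios per bond issuer
train = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8)]
labels = ["investment_grade"] * 3 + ["high_yield"] * 2
print(knn_classify(train, labels, query=(1.1, 1.0), k=3))  # investment_grade
```

Note there is no training step: KNN stores the data and defers all computation to prediction time, which is why feature and distance-metric choices dominate its performance.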
Classification and Regression Tree (CART)
CART is applied to binary classification or regression.

Applications: enhancing detection of fraud in financial statements, generating consistent decision processes in equity and fixed-income selection, and simplifying communication of investment strategies to clients.
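The core CART step is choosing, at each node, the binary split that most reduces impurity. A sketch of one such step using Gini impurity (a common CART criterion; the single-feature fraud data is made up):

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """One CART step: find the (feature, threshold) binary split that
    minimizes the weighted Gini impurity of the two child nodes."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best  # (weighted impurity, feature index, threshold)

# Hypothetical single feature (e.g., an accruals ratio) vs. a fraud flag
X = [[0.1], [0.2], [0.8], [0.9]]
y = ["no_fraud", "no_fraud", "fraud", "fraud"]
print(best_split(X, y))  # splitting at 0.2 yields two pure children
```

A full tree applies this step recursively to each child node until a stopping criterion (depth, node size, purity) is met; a regression tree would minimize within-node variance instead of Gini impurity.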
ensemble learning
Categories
aggregation of heterogeneous learners
different types of algorithms combined together with a voting classifier
aggregation of homogenous learners
a combination of the same algorithm, using different training data that are based, for example, on a bootstrap aggregating
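Either way, the ensemble's prediction comes from combining the individual learners' outputs, e.g. with a hard majority vote. A minimal sketch of that voting step (the class labels are hypothetical):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class predictions from several learners with a hard vote:
    the most common prediction wins."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of three different classifiers for one observation
print(majority_vote(["buy", "sell", "buy"]))  # buy
```

With heterogeneous learners the votes come from different algorithms; with homogeneous learners they come from the same algorithm trained on different bootstrap samples.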
举例
Bootstrap Aggregating (Bagging)
The original training data set is used to generate n new training data sets, or "bags" of data, by sampling with replacement.
The algorithm can now be trained on n independent data sets that will generate n new models.
Bagging is a very useful technique because it helps to improve the stability of predictions and protects against overfitting the model.
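The bag-generation step can be sketched in a few lines of stdlib Python (the data and number of bags are arbitrary):

```python
import random

def bootstrap_bags(data, n_bags, seed=0):
    """Draw n_bags bootstrap samples: each bag is the same size as the
    original training set and is drawn with replacement, so observations
    can repeat within a bag."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_bags)]

train = list(range(10))
bags = bootstrap_bags(train, n_bags=5)
print(len(bags), len(bags[0]))  # 5 bags, each of the original size 10
```

One model is then trained per bag, and their predictions are aggregated (e.g., by majority vote), which is what stabilizes predictions and guards against overfitting any single sample.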
random forest
a collection of a large number of decision trees trained via a bagging method.
For example, a CART algorithm would be trained using each of the n independent data sets (from the bagging process) to generate the multitude of different decision trees that make up the random forest classifier.
Drawback: random forest is a black-box-type algorithm; its aggregated predictions are difficult to interpret.

Unsupervised ML Algorithms
dimension reduction
Principal Components Analysis (PCA)
PCA is used to summarize or reduce highly correlated features of data into a few main, uncorrelated composite variables.
A composite variable is a variable that combines two or more variables that are statistically strongly related to each other
two key concepts: eigenvectors and eigenvalues
The eigenvectors define new, mutually uncorrelated composite variables that are linear combinations of the original features.
An eigenvalue gives the proportion of total variance in the initial data that is explained by each eigenvector.

Drawback: black box (the composite variables it produces are difficult to label or interpret)
Applications: typically performed as part of exploratory data analysis, before training another supervised or unsupervised learning model.
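The eigenvector/eigenvalue mechanics above can be sketched in numpy via an eigen-decomposition of the covariance matrix (the two correlated features are simulated for illustration):

```python
import numpy as np

def pca(X):
    """PCA via eigen-decomposition of the covariance matrix of centered data.
    Eigenvectors define the uncorrelated composite variables; each eigenvalue
    measures the variance that its eigenvector explains."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]       # sort components by variance, descending
    return eigvals[order], eigvecs[:, order]

# Two highly correlated hypothetical features
rng = np.random.default_rng(1)
f1 = rng.normal(size=200)
X = np.column_stack([f1, f1 * 0.9 + rng.normal(scale=0.1, size=200)])
eigvals, eigvecs = pca(X)
explained = eigvals / eigvals.sum()  # proportion of total variance per eigenvector
print(explained)  # the first component captures most of the variance
```

Because the two input features are strongly related, nearly all of the variance loads on the first composite variable, which is exactly the dimension-reduction effect described above.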
clustering
A cluster contains a subset of observations from the data set such that all the observations within the same cluster are deemed “similar.”
k-means clustering
K-means is a relatively old algorithm that repeatedly partitions observations into a fixed number, k, of non-overlapping clusters

Applications: data exploration for discovering patterns in high-dimensional data, or deriving alternatives to existing static industry classifications.
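The "repeatedly partitions" loop alternates two steps: assign each observation to its nearest centroid, then move each centroid to the mean of its assigned points. A minimal numpy sketch (the four points and k are arbitrary):

```python
import numpy as np

def k_means(X, k, iters=20, seed=0):
    """Lloyd's algorithm: repeat (1) assign each point to the nearest
    centroid, (2) move each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Distance of every point to every centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two obvious hypothetical clusters
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
labels, centroids = k_means(X, k=2)
print(labels)  # the two nearby pairs land in the same cluster
```

Note that k is fixed in advance and the clusters are non-overlapping by construction: each point gets exactly one label per iteration.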
hierarchical clustering
agglomerative clustering (or bottom-up)
begins with each observation being treated as its own cluster.
divisive clustering (or top-down)
starts with all the observations belonging to a single cluster
Dendrograms
a tree diagram that visualizes the sequence of cluster merges (agglomerative) or splits (divisive)
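The agglomerative (bottom-up) variant can be sketched in pure Python with single linkage, i.e. measuring cluster distance by the nearest pair of members (one of several common linkage choices; the 1-D points are arbitrary):

```python
def agglomerative(points, n_clusters):
    """Bottom-up clustering: start with each observation as its own cluster,
    then repeatedly merge the two closest clusters until n_clusters remain.
    Single linkage: cluster distance = distance between nearest members."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair of clusters
    return clusters

print(agglomerative([0.0, 0.2, 5.0, 5.3, 9.0], n_clusters=3))
```

Recording the order and distance of each merge is exactly the information a dendrogram plots.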
Deep Learning and Reinforcement Learning
Neural Networks

(4-5-1) Neural Network
4: input layer
one input node for each of the 4 features
5: hidden layer
a single hidden layer with 5 nodes, where learning occurs in training and inputs are processed on trained nets
1: output layer
here consisting of a single node for the target variable y
summation operator
A functional part of a neural network’s node that multiplies each input value received by a weight and sums the weighted values to form the total net input, which is then passed to the activation function.
activation function
A functional part of a neural network’s node that transforms the total net input received into the final output of the node. The activation function operates like a light dimmer switch that decreases or increases the strength of the input.
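One node's forward computation combines exactly these two parts: the summation operator forms the total net input, and the activation function squashes it. A sketch using a sigmoid activation (the weights and inputs are hypothetical):

```python
import math

def node_output(inputs, weights, bias):
    """One neural-network node: the summation operator multiplies each input
    by its weight and sums them into the total net input; a sigmoid
    activation then squashes that net input into (0, 1)."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias  # summation operator
    return 1.0 / (1.0 + math.exp(-net))                       # activation function

# 4 input features feeding one hidden node (weights chosen arbitrarily)
print(node_output([1.0, 0.5, -0.5, 2.0], [0.2, -0.1, 0.4, 0.1], bias=0.0))
```

The sigmoid behaves like the "dimmer switch" described above: large positive net inputs push the output toward 1, large negative ones toward 0, and inputs near zero give outputs near 0.5.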
forward propagation
The process of passing input values forward through the network's layers, with each node applying its summation operator and activation function, to compute the network's predicted output.
backward propagation
The process of adjusting weights in a neural network, to reduce total error of the network, by moving backward through the network’s layers.
learning rate
A parameter that affects the magnitude of adjustments in the weights in a neural network.
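The weight-update rule behind backward propagation, and the learning rate's role in it, can be shown with a one-parameter toy error function (the function and rate are made up for illustration):

```python
def gradient_descent_step(w, grad, learning_rate):
    """One backpropagation-style update: move the weight against its error
    gradient, scaled by the learning rate."""
    return w - learning_rate * grad

# Toy error: error(w) = (w - 3)^2, so the gradient is 2 * (w - 3)
w = 0.0
for _ in range(50):
    w = gradient_descent_step(w, grad=2 * (w - 3), learning_rate=0.1)
print(round(w, 4))  # converges toward the error-minimizing weight w = 3
```

A smaller learning rate makes each adjustment more cautious (slower convergence); too large a rate can overshoot the minimum and fail to converge, which is why the learning rate is a key hyperparameter.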
Applications: a variety of tasks characterized by non-linearities and complex interactions among features.
deep learning nets (DLNs)
Algorithms based on complex neural networks with many hidden layers (at least 3, often many more) that address highly complex tasks such as image classification, face recognition, speech recognition, and natural language processing.
Reinforcement learning (RL)
Machine learning in which a computer learns, through trial and error aimed at maximizing a reward, from interacting with itself (or with data generated by the same algorithm).