12. 選擇合適的估算器#

通常,解決機器學習問題中最困難的部分可能是找到適合該工作的估算器。不同的估算器更適合不同類型的資料和不同的問題。

下面的流程圖旨在為使用者提供一些粗略的指導,說明如何針對資料嘗試哪些估算器來處理問題。按一下下面的圖表中的任何估算器以查看其文件。 嘗試下一個橘色箭頭應解讀為「如果這個估算器沒有達到預期的結果,則跟隨箭頭嘗試下一個」。使用滾輪放大和縮小,按一下並拖曳以四處平移。您也可以下載圖表: ml_map.svg

START
START
>50
samples
>50...
get
more
data
get...
NO
NO
predicting a
category
predicting...
YES
YES
do you have
labeled
data
do you hav...
YES
YES
predicting a
quantity
predicting...
NO
NO
just
looking
just...
NO
NO
predicting
structure
predicting...
NO
NO
tough
luck
tough...
<100K
samples
<100K...
YES
YES
SGD
Classifier
SGD...
NO
NO
Linear
SVC
Linear...
YES
YES
text
data
text...
Kernel
Approximation
Kernel...
KNeighbors
Classifier
KNeighbors...
NO
NO
SVC
SVC
Ensemble
Classifiers
Ensemble...
Naive
Bayes
Naive...
YES
YES
classification
classification
number of
categories
known
number of...
NO
NO
<10K
samples
<10K...
<10K
samples
<10K...
NO
NO
NO
NO
YES
YES
MeanShift
MeanShift
VBGMM
VBGMM
YES
YES
MiniBatch
KMeans
MiniBatch...
NO
NO
clustering
clustering
KMeans
KMeans
YES
YES
Spectral
Clustering
Spectral...
GMM
GMM
<100K
samples
<100K...
YES
YES
few features
should be
important
few features...
YES
YES
SGD
Regressor
SGD...
NO
NO
Lasso
Lasso
ElasticNet
ElasticNet
YES
YES
RidgeRegression
RidgeRegression
SVR(kernel="linear")
SVR(kernel="linea...
NO
NO
SVR(kernel="rbf")
SVR(kernel="rbf...
Ensemble
Regressors
Ensemble...
regression
regression
Ramdomized
PCA
Ramdomized...
YES
YES
<10K
samples
<10K...
Kernel
Approximation
Kernel...
NO
NO
IsoMap
IsoMap
Spectral
Embedding
Spectral...
YES
YES
LLE
LLE
dimensionality
reduction
dimensionality...
scikit-learn
algorithm cheat sheet
scikit-learn...
TRY
NEXT
TRY...
TRY
NEXT
TRY...
TRY
NEXT
TRY...
TRY
NEXT
TRY...
TRY
NEXT
TRY...
TRY
NEXT
TRY...
TRY
NEXT
TRY...
Text is not SVG - cannot display