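Displaying Pipelines
In a Jupyter Notebook, pipelines are displayed as an HTML diagram by default, which corresponds to set_config(display='diagram'). To disable the HTML representation, use set_config(display='text').
To see more details about a step in the pipeline visualization, click on that step.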
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
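Outside a notebook, the diagram is not rendered automatically. As a minimal sketch, assuming you want to inspect the same diagram in a browser, the HTML can be generated with sklearn.utils.estimator_html_repr and written to a file. The file name pipeline_diagram.html is a hypothetical choice made for this illustration.

```python
from pathlib import Path

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# Build the HTML diagram for this estimator as a plain string.
html = estimator_html_repr(pipe)

# Hypothetical output path; open the file in a browser to click through the steps.
output_path = Path("pipeline_diagram.html")
output_path.write_text(html, encoding="utf-8")
```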
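Displaying a Pipeline with a Preprocessing Step and Classifier
This section constructs a Pipeline with a preprocessing step, StandardScaler, and a classifier, LogisticRegression, and displays its visual representation.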
from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)
To display the pipeline as a diagram, use display='diagram', which is the default.
set_config(display="diagram")
pipe # click on the diagram below to see the details of each step
To view the pipeline as text, change to display='text'.
set_config(display="text")
pipe
Pipeline(steps=[('preprocessing', StandardScaler()),
('classifier', LogisticRegression())])
Restore the default display.
set_config(display="diagram")
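When the text display is only needed temporarily, switching the global setting and then switching it back by hand is easy to forget. A minimal sketch, assuming the sklearn.config_context context manager is acceptable in place of set_config, scopes the change to a with block:

```python
from sklearn import config_context, get_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# Remember whatever display mode is currently configured.
default_display = get_config()["display"]

with config_context(display="text"):
    # Inside the block the display setting is overridden.
    display_inside = get_config()["display"]
    text_repr = repr(pipe)

# On exit, the previous setting is restored automatically.
display_after = get_config()["display"]
print(display_inside, display_after)
```

This is a design choice rather than a requirement: set_config changes the setting for the rest of the session, while config_context limits the change to the enclosed code.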
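Displaying a Pipeline Chaining Multiple Preprocessing Steps and a Classifier
This section constructs a Pipeline with multiple preprocessing steps, StandardScaler and PolynomialFeatures, and a classifier step, LogisticRegression, and displays its visual representation.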
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step
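To make the effect of the chained preprocessing steps concrete, the sketch below fits the same kind of pipeline on hypothetical toy data and inspects what reaches the classifier. The data, the variable names, and the raised max_iter are assumptions made for this illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Hypothetical toy data: 20 samples, 2 features, 10 samples per class.
first = np.linspace(-1, 1, 20)
X = np.column_stack([first, first[::-1] ** 2])
y = (first > 0).astype(int)

pipe = Pipeline(
    [
        ("standard_scaler", StandardScaler()),
        ("polynomial", PolynomialFeatures(degree=3)),
        # max_iter is raised only to keep this toy fit quiet (an assumption).
        ("classifier", LogisticRegression(C=2.0, max_iter=1000)),
    ]
)
pipe.fit(X, y)

# pipe[:-1] is a sub-pipeline holding every step except the final classifier.
X_transformed = pipe[:-1].transform(X)
print(X_transformed.shape)  # (20, 10)
```

With 2 input features and degree=3, PolynomialFeatures produces 10 columns (the bias term plus all monomials up to degree 3), which is what the classifier is trained on.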
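Displaying a Pipeline with Dimensionality Reduction and a Classifier
This section constructs a Pipeline with a dimensionality reduction step, PCA, and a classifier, SVC, and displays its visual representation.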
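No code accompanies this section here. A minimal sketch of such a pipeline could look like the following; the step names, n_components=4, and the linear kernel are illustrative assumptions, not values confirmed by the text above.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

steps = [
    # Assumed settings, chosen only for illustration.
    ("reduce_dim", PCA(n_components=4)),
    ("classifier", SVC(kernel="linear")),
]
pipe = Pipeline(steps)
pipe  # in a notebook, this renders the diagram for the two steps
```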
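Displaying a Complex Pipeline Chaining a Column Transformer
This section constructs a complex Pipeline with a ColumnTransformer and a classifier, LogisticRegression, and displays its visual representation.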
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe  # click on the diagram below to see the details of each step
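The column lists passed to ColumnTransformer refer to column names of the input, so the pipeline expects something like a pandas DataFrame with those columns. As a rough sketch, assuming pandas is available and using entirely hypothetical toy data, fitting an equivalent pipeline could look like this:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with the four column names used by the transformer.
X = pd.DataFrame(
    {
        "state": ["CA", "NY", "CA", "TX", "NY", "TX"],
        "gender": ["F", "M", "M", "F", "F", "M"],
        "age": [25.0, 32.0, np.nan, 51.0, 46.0, 38.0],
        "weight": [60.0, 82.0, 75.0, np.nan, 68.0, 90.0],
    }
).astype({"state": object, "gender": object})
y = np.array([0, 1, 0, 1, 0, 1])

# make_pipeline names the steps automatically; otherwise same estimators as above.
numeric_preprocessor = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
categorical_preprocessor = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    OneHotEncoder(handle_unknown="ignore"),
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))

# Missing numeric values are imputed inside the pipeline before scaling.
pipe.fit(X, y)
predictions = pipe.predict(X)
```

Each branch of the ColumnTransformer only sees its own columns, which is why the numeric and categorical preprocessing can differ.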
Displaying a Grid Search over a Pipeline with a Classifier
This section constructs a GridSearchCV over a Pipeline with a RandomForestClassifier and displays its visual representation.
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", RandomForestClassifier())]
)
param_grid = {
    "classifier__n_estimators": [200, 500],
    # "auto" is no longer accepted for max_features in recent scikit-learn
    # releases and would fail when the search is fitted; None uses all features.
    "classifier__max_features": ["sqrt", "log2", None],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}
grid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)
grid_search  # click on the diagram below to see the details of each step
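The keys of param_grid follow the "<step name>__<parameter>" convention for addressing parameters of a step inside a pipeline. A minimal sketch, using a deliberately smaller hypothetical pipeline and grid, shows how get_params can be used to check that the grid keys are valid before running a search:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical reduced pipeline; the step name "classifier" matches the grid prefix.
pipe = Pipeline(
    steps=[("scaler", StandardScaler()), ("classifier", RandomForestClassifier())]
)

# Every parameter of a step is exposed as "<step name>__<parameter>".
classifier_params = sorted(
    name for name in pipe.get_params() if name.startswith("classifier__")
)

# Hypothetical small grid, chosen only for illustration.
param_grid = {
    "classifier__n_estimators": [10, 20],
    "classifier__max_depth": [2, 4],
}

# Each grid key should be one of the names reported by get_params().
unknown_keys = set(param_grid) - set(pipe.get_params())
print(unknown_keys)  # set()

grid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)
```

Checking the keys this way catches a misspelled step or parameter name early, before any model is fitted.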