cluster_optics_xi

sklearn.cluster.cluster_optics_xi(*, reachability, predecessor, ordering, min_samples, min_cluster_size=None, xi=0.05, predecessor_correction=True)

Automatically extract clusters according to the Xi-steep method.

Parameters:

reachability, ndarray of shape (n_samples,): Reachability distances calculated by OPTICS (reachability_).
predecessor, ndarray of shape (n_samples,): Predecessors calculated by OPTICS.
ordering, ndarray of shape (n_samples,): OPTICS ordered point indices (ordering_).
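min_samples, int > 1 or float between 0 and 1: The same as the min_samples given to OPTICS. Up and down steep regions cannot contain more than min_samples consecutive non-steep points. Expressed as an absolute number or as a fraction of the number of samples (rounded to be at least 2).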
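min_cluster_size, int > 1 or float between 0 and 1, default=None: Minimum number of samples in an OPTICS cluster, expressed as an absolute number or as a fraction of the number of samples (rounded to be at least 2). If None, the value of min_samples is used instead.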
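xi, float between 0 and 1, default=0.05: Determines the minimum steepness on the reachability plot that constitutes a cluster boundary. For example, a point in the reachability plot is an upward point when the ratio of its reachability to that of its successor is at most 1-xi.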
predecessor_correction, bool, default=True: Correct clusters based on the calculated predecessors.
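Two of the rules above can be made concrete with a small NumPy sketch. It is illustrative only, not the library's implementation: `resolve_size` and `steep_masks` are hypothetical helper names, the fractional size is assumed to be floored before being clamped to 2, and the downward criterion is assumed to mirror the upward one stated for xi. The input values are made up and finite; a real reachability array would first be reordered as `reachability[ordering]` and typically contains `inf`.

```python
import numpy as np


def resolve_size(value, n_samples):
    """Hypothetical helper: turn a fractional size into an absolute count."""
    if value <= 1:
        # Fraction of the data set, floored here (an assumption) and
        # clamped so that the result is at least 2.
        return max(2, int(value * n_samples))
    return int(value)


def steep_masks(reachability_plot, xi):
    """Flag steep upward/downward points between consecutive plot entries."""
    r = np.asarray(reachability_plot, dtype=float)
    ratio = r[:-1] / r[1:]  # point i compared with its successor i + 1
    steep_up = ratio <= 1 - xi  # upward rule as stated for xi
    steep_down = ratio >= 1 / (1 - xi)  # assumed mirror of the upward rule
    return steep_up, steep_down


# Illustrative reachability values, already in OPTICS order (not real output).
plot = [1.0, 1.0, 2.0, 2.0, 1.0]
up, down = steep_masks(plot, xi=0.05)
print(up.tolist())    # [False, True, False, False]
print(down.tolist())  # [False, False, False, True]

# Fractions scale with the number of samples; integers pass through.
print(resolve_size(0.1, 50), resolve_size(0.01, 50), resolve_size(5, 50))  # 5 2 5
```

A larger xi demands a sharper jump between neighbouring reachability values before a point counts as steep, so fewer cluster boundaries are detected.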

Returns:

labels, ndarray of shape (n_samples,): The labels assigned to samples. Points that are not included in any cluster are labeled -1.
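clusters, ndarray of shape (n_clusters, 2): The list of clusters, one [start, end] pair per row, with all indices inclusive. The clusters are ordered by (end, -start) (ascending), so that larger clusters encompassing smaller clusters come after those nested smaller clusters. Since labels does not reflect the hierarchy, usually len(clusters) > len(np.unique(labels)).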
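The [start, end] rows can be read as inclusive ranges of positions in ordering. The sketch below shows one way to recover the sample indices of each cluster, check the (end, -start) ordering, and pick out the rows that contain no nested cluster. The `ordering` and `clusters` arrays are hypothetical values chosen for illustration, and `contains` is a helper introduced here, not part of the library.

```python
import numpy as np

# Hypothetical inputs for illustration; not the output of a real OPTICS run.
ordering = np.array([4, 0, 2, 1, 5, 3])
clusters = np.array([[0, 2], [3, 5], [0, 5]])

# Each row is an inclusive [start, end] range of positions in `ordering`,
# so the slice must run to end + 1.
members = [ordering[start:end + 1] for start, end in clusters]
for (start, end), idx in zip(clusters, members):
    print(f"[{start}, {end}] -> samples {idx.tolist()}")

# Rows are expected in ascending (end, -start) order.
# np.lexsort treats its last key as the primary sort key.
order = np.lexsort((-clusters[:, 0], clusters[:, 1]))
is_sorted = np.array_equal(order, np.arange(len(clusters)))


def contains(outer, inner):
    """True when the inclusive range `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# A row is a leaf when no other row lies inside its range.
leaves = [
    i for i, c in enumerate(clusters)
    if not any(j != i and contains(c, d) for j, d in enumerate(clusters))
]
print(is_sorted, leaves)  # True [0, 1]
```

Here the third row spans both smaller clusters, which illustrates how the hierarchy can hold more rows than there are distinct labels.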

Examples

>>> import numpy as np
>>> from sklearn.cluster import cluster_optics_xi, compute_optics_graph
>>> X = np.array([[1, 2], [2, 5], [3, 6],
...               [8, 7], [8, 8], [7, 3]])
>>> ordering, core_distances, reachability, predecessor = compute_optics_graph(
...     X,
...     min_samples=2,
...     max_eps=np.inf,
...     metric="minkowski",
...     p=2,
...     metric_params=None,
...     algorithm="auto",
...     leaf_size=30,
...     n_jobs=None
... )
>>> min_samples = 2
>>> labels, clusters = cluster_optics_xi(
...     reachability=reachability,
...     predecessor=predecessor,
...     ordering=ordering,
...     min_samples=min_samples,
... )
>>> labels
array([0, 0, 0, 1, 1, 1])
>>> clusters
array([[0, 2],
       [3, 5],
       [0, 5]])
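As a follow-up, the returned arrays can be summarized with plain NumPy. The sketch below copies the labels and clusters values printed above into literals rather than recomputing them, then separates noise points (label -1) from clustered points and counts the samples per label.

```python
import numpy as np

# Values copied from the example output above.
labels = np.array([0, 0, 0, 1, 1, 1])
clusters = np.array([[0, 2], [3, 5], [0, 5]])

# Points outside every cluster carry the label -1.
noise_mask = labels == -1

# Distinct cluster labels and the number of samples assigned to each.
cluster_ids, sizes = np.unique(labels[~noise_mask], return_counts=True)

print(int(noise_mask.sum()))  # 0
print(cluster_ids.tolist())   # [0, 1]
print(sizes.tolist())         # [3, 3]

# The hierarchy has more rows than there are distinct labels.
print(len(clusters) > len(cluster_ids))  # True
```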