sparse_encode

sklearn.decomposition.sparse_encode(X, dictionary, *, gram=None, cov=None, algorithm='lasso_lars', n_nonzero_coefs=None, alpha=None, copy_cov=True, init=None, max_iter=1000, n_jobs=None, check_input=True, verbose=0, positive=False)

Sparse coding.

Each row of the result is the solution to a sparse coding problem. The goal is to find a sparse array code such that:

X ~= code * dictionary

Read more in the User Guide.
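The relation X ~= code * dictionary can be checked numerically by multiplying the returned codes back with the dictionary. The sketch below reuses the toy data from the Examples section of this page; the names `reconstruction` and `max_abs_error` are introduced here for illustration only.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

# Toy data: 2 samples with 3 features, and a dictionary of 5 atoms.
X = np.array([[-1.0, -1.0, -1.0], [0.0, 0.0, 3.0]])
dictionary = np.array(
    [[0, 1, 0], [-1, -1, 2], [1, 1, 1], [0, 1, 1], [0, 2, 1]],
    dtype=np.float64,
)

# A very small alpha means almost no L1 penalty, so the fit is nearly exact.
code = sparse_encode(X, dictionary, alpha=1e-10)

# (n_samples, n_components) @ (n_components, n_features) -> (n_samples, n_features)
reconstruction = code @ dictionary
max_abs_error = np.abs(X - reconstruction).max()
print(code.shape, max_abs_error)
```

With a larger alpha the codes become sparser and the reconstruction error generally grows, so the approximation sign in the formula should be read literally.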

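Parameters:

X : array-like of shape (n_samples, n_features)

Data matrix.

dictionary : array-like of shape (n_components, n_features)

The dictionary matrix against which to solve the sparse coding of the data. Some of the algorithms assume normalized rows for meaningful output.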

gram : array-like of shape (n_components, n_components), default=None

Precomputed Gram matrix, dictionary * dictionary'.

cov : array-like of shape (n_components, n_samples), default=None

Precomputed covariance, dictionary * X'.
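As a rough illustration of the gram and cov parameters, the sketch below builds both matrices by hand with the shapes documented above and passes them in. The random data, the sizes, and the choice of algorithm='omp' are assumptions made for the example, not recommendations.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))            # (n_samples, n_features)
dictionary = rng.standard_normal((12, 8))  # (n_components, n_features)
# Normalize each atom (row) to unit length.
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

gram = dictionary @ dictionary.T  # (n_components, n_components)
cov = dictionary @ X.T            # (n_components, n_samples)

# Same problem solved twice: once letting sparse_encode derive the
# matrices itself, once with the precomputed ones.
code_auto = sparse_encode(X, dictionary, algorithm="omp", n_nonzero_coefs=3)
code_pre = sparse_encode(
    X, dictionary, gram=gram, cov=cov, algorithm="omp", n_nonzero_coefs=3
)
print(np.allclose(code_auto, code_pre))
```

Precomputing mainly pays off when the same dictionary is reused across many calls, since the Gram matrix depends only on the dictionary.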

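algorithm : {'lasso_lars', 'lasso_cd', 'lars', 'omp', 'threshold'}, default='lasso_lars'

The algorithm used:

'lars': uses the least angle regression method (linear_model.lars_path);
'lasso_lars': uses Lars to compute the Lasso solution;
'lasso_cd': uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). lasso_lars will be faster if the estimated components are sparse;
'omp': uses orthogonal matching pursuit to estimate the sparse solution;
'threshold': sets to zero all coefficients of the projection dictionary * data' that are smaller than the regularization value.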
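The five algorithm options listed above can be tried side by side on the same data. This is a minimal sketch on random inputs; the regularization values in `settings` are arbitrary illustrative choices, and the printed sparsity will differ on other data.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))            # (n_samples, n_features)
dictionary = rng.standard_normal((12, 8))  # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# Each algorithm reads a different regularization parameter.
settings = {
    "lars": {"n_nonzero_coefs": 3},
    "lasso_lars": {"alpha": 0.1},
    "lasso_cd": {"alpha": 0.1},
    "omp": {"n_nonzero_coefs": 3},
    "threshold": {"alpha": 0.5},
}

codes = {}
for name, kwargs in settings.items():
    codes[name] = sparse_encode(X, dictionary, algorithm=name, **kwargs)
    nnz = np.count_nonzero(codes[name], axis=1)
    print(f"{name:>10}: shape={codes[name].shape}, nonzeros per row={nnz}")
```

Whatever the algorithm, the result has one row per sample and one column per dictionary atom.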

n_nonzero_coefs : int, default=None

Number of nonzero coefficients to target in each row of the solution. This is only used by algorithm='lars' and algorithm='omp', and is overridden by alpha in the omp case. If None, then n_nonzero_coefs=int(n_features / 10).
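To see n_nonzero_coefs at work, the sketch below runs algorithm='omp' with a few target values and counts the nonzero entries in each row of the code. The data and the values of `k` are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))            # (n_samples, n_features)
dictionary = rng.standard_normal((12, 8))  # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

max_nnz = {}
for k in (1, 2, 4):
    code = sparse_encode(X, dictionary, algorithm="omp", n_nonzero_coefs=k)
    # Count the active atoms for every sample and keep the largest count.
    max_nnz[k] = int(np.count_nonzero(code, axis=1).max())
    print(f"n_nonzero_coefs={k}: at most {max_nnz[k]} nonzeros in any row")
```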

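alpha : float, default=None

If algorithm='lasso_lars' or algorithm='lasso_cd', alpha is the penalty applied to the L1 norm. If algorithm='threshold', alpha is the absolute value of the threshold below which coefficients will be set to zero. If algorithm='omp', alpha is the tolerance parameter: the value of the reconstruction error targeted. In this case, it overrides n_nonzero_coefs. If None, defaults to 1.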
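The two most common readings of alpha can be illustrated as follows. The sketch assumes random data, and `proj`, `big_alpha`, and `below` are helper names introduced here; it only checks the qualitative behaviour described above (a very large L1 penalty removes every atom, and projections below the threshold yield zero coefficients).

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))            # (n_samples, n_features)
dictionary = rng.standard_normal((12, 8))  # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# Projection of every sample onto every atom: (n_samples, n_components).
proj = X @ dictionary.T

# Lasso: a small penalty keeps some atoms active, while a penalty far
# above every projection magnitude should leave none.
big_alpha = 10.0 * np.abs(proj).max()
code_small = sparse_encode(X, dictionary, algorithm="lasso_lars", alpha=0.05)
code_big = sparse_encode(X, dictionary, algorithm="lasso_lars", alpha=big_alpha)
print(np.count_nonzero(code_small), np.count_nonzero(code_big))

# Threshold: atoms whose projection magnitude is below alpha get zero.
alpha = 0.5
code_thr = sparse_encode(X, dictionary, algorithm="threshold", alpha=alpha)
below = np.abs(proj) < alpha
print(np.all(code_thr[below] == 0))
```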

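copy_cov : bool, default=True

Whether to copy the precomputed covariance matrix; if False, it may be overwritten.

init : ndarray of shape (n_samples, n_components), default=None

Initialization value of the sparse codes. Only used if algorithm='lasso_cd'.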

max_iter : int, default=1000

Maximum number of iterations to perform if algorithm='lasso_cd' or 'lasso_lars'.

n_jobs : int, default=None

Number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See the Glossary for more details.

check_input : bool, default=True

If False, the input arrays X and dictionary will not be checked.

verbose : int, default=0

Controls the verbosity; the higher the value, the more messages are shown.

positive : bool, default=False

Whether to enforce positivity when finding the encoding.

Added in version 0.20.
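A short sketch of the positive option, using 'lasso_cd' and 'threshold' on random data. The algorithm choices and the alpha value are assumptions made for this example; not every algorithm is expected to accept positive=True.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))            # (n_samples, n_features)
dictionary = rng.standard_normal((12, 8))  # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# Without the constraint, codes can contain negative coefficients.
code_free = sparse_encode(X, dictionary, algorithm="lasso_cd", alpha=0.1)

# With positive=True, every coefficient should be non-negative.
code_cd = sparse_encode(
    X, dictionary, algorithm="lasso_cd", alpha=0.1, positive=True
)
code_thr = sparse_encode(
    X, dictionary, algorithm="threshold", alpha=0.1, positive=True
)
print(code_free.min(), code_cd.min(), code_thr.min())
```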

Returns:

code : ndarray of shape (n_samples, n_components)

The sparse codes.

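See also

sklearn.linear_model.lars_path: Compute Least Angle Regression or Lasso path using the LARS algorithm.
sklearn.linear_model.orthogonal_mp: Solve Orthogonal Matching Pursuit problems.
sklearn.linear_model.Lasso: Linear model trained with L1 prior as regularizer.
SparseCoder: Find a sparse representation of data from a fixed precomputed dictionary.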
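For comparison with the SparseCoder entry above, the following sketch encodes the same data through the estimator interface and through sparse_encode. It assumes the `transform_algorithm` and `transform_n_nonzero_coefs` parameters of SparseCoder correspond to `algorithm` and `n_nonzero_coefs` here; the data is random and purely illustrative.

```python
import numpy as np
from sklearn.decomposition import SparseCoder, sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))            # (n_samples, n_features)
dictionary = rng.standard_normal((12, 8))  # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# Estimator interface: the dictionary is fixed at construction time.
coder = SparseCoder(
    dictionary=dictionary,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=3,
)
code_estimator = coder.transform(X)

# Function interface: the dictionary is passed on every call.
code_function = sparse_encode(X, dictionary, algorithm="omp", n_nonzero_coefs=3)
print(np.allclose(code_estimator, code_function))
```

The estimator form is convenient inside a pipeline, while the function form suits one-off encodings.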

Examples

>>> import numpy as np
>>> from sklearn.decomposition import sparse_encode
>>> X = np.array([[-1, -1, -1], [0, 0, 3]])
>>> dictionary = np.array(
...     [[0, 1, 0],
...      [-1, -1, 2],
...      [1, 1, 1],
...      [0, 1, 1],
...      [0, 2, 1]],
...    dtype=np.float64
... )
>>> sparse_encode(X, dictionary, alpha=1e-10)
array([[ 0.,  0., -1.,  0.,  0.],
       [ 0.,  1.,  1.,  0.,  0.]])