Dimensionality Reduction
Model Types
Principal Component Analysis
- Requires feature scaling
Sample Code
from sklearn.decomposition import PCA
# n_components is the number of principal components (extracted features) to keep
# start with 2 and increase it if the model's predictions are not good enough
pca = PCA(n_components = 2)
X_train = pca.fit_transform(X_train)
# Train Model and Predict
# This example uses Logistic Regression
# Other classification models can also be used
from sklearn.linear_model import LogisticRegression
c_lr = LogisticRegression(random_state = 0)
c_lr.fit(X_train, y_train)
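# sc is assumed to be the feature scaler (e.g. StandardScaler) already fitted on the training data
# X_test must go through the same scaling and PCA projection as X_train
y_pred_lr = c_lr.predict(pca.transform(sc.transform(X_test)))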
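The advice to start with 2 components and increase as needed can be complemented by looking at how much variance each component explains. The following is a minimal sketch, not part of the original notes: the wine dataset, the 90% threshold, and the variable names are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Example data (assumption: any numeric feature matrix would work here)
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# PCA requires feature scaling; fit the scaler on the training data only
sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)

# n_components=None keeps every component so the full variance profile is visible
pca_full = PCA(n_components=None)
pca_full.fit(X_train_scaled)

# Cumulative share of variance explained by the first k components
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest k whose cumulative explained variance reaches 90% (illustrative threshold)
n_components = int(np.searchsorted(cumulative, 0.90)) + 1
print(n_components, cumulative[:n_components])
```

The resulting n_components can then be passed to PCA in place of the fixed value 2. Explained variance is only a heuristic; the downstream model's accuracy remains the final check.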
Linear Discriminant Analysis
- Requires feature scaling
Sample Code
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
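# n_components is the final number of extracted features
# it cannot exceed min(number of classes - 1, number of features),
# so n_components = 2 needs at least 3 classes
lda = LDA(n_components = 2)
# LDA is supervised, so fit_transform also takes the dependent variable (y_train)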
X_train = lda.fit_transform(X_train, y_train)
# Train Model and Predict
# This example uses Logistic Regression
# Other classification models can also be used
from sklearn.linear_model import LogisticRegression
c_lr = LogisticRegression(random_state = 0)
c_lr.fit(X_train, y_train)
# sc is assumed to be the feature scaler (e.g. StandardScaler) already fitted on the training data
y_pred_lr = c_lr.predict(lda.transform(sc.transform(X_test)))
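Because LDA finds directions that separate the classes, the number of components it can produce is bounded by the number of classes. The sketch below shows one way to derive a valid n_components from the data instead of hard-coding it. The wine dataset (3 classes, 13 features) and the variable names are assumptions for illustration only.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.preprocessing import StandardScaler

# Example data with 3 classes and 13 features (assumption for illustration)
X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# LDA can extract at most min(n_classes - 1, n_features) components
n_classes = len(set(y))
max_components = min(n_classes - 1, X_scaled.shape[1])

lda = LDA(n_components=max_components)
# LDA is supervised: the labels are passed to fit_transform
X_lda = lda.fit_transform(X_scaled, y)
print(max_components, X_lda.shape)
```

For a binary classification problem this bound is 1, so requesting 2 components would raise an error.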
Kernel PCA
- Requires feature scaling
Sample Code
from sklearn.decomposition import KernelPCA
# n_components is the number of extracted features to keep
# start with 2 and increase it if the model's predictions are not good enough
kpca = KernelPCA(n_components = 2, kernel = 'rbf')
X_train = kpca.fit_transform(X_train)
# Train Model and Predict
# This example uses Logistic Regression
# Other classification models can also be used
from sklearn.linear_model import LogisticRegression
c_lr = LogisticRegression(random_state = 0)
c_lr.fit(X_train, y_train)
# sc is assumed to be the feature scaler (e.g. StandardScaler) already fitted on the training data
y_pred_lr = c_lr.predict(kpca.transform(sc.transform(X_test)))
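The samples above apply the scaler, the reducer, and the classifier by hand, which makes it easy to forget a step when transforming X_test. One possible alternative, not taken from the original notes, is to chain the three steps in a scikit-learn pipeline. The make_moons data and all variable names here are assumptions for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Non-linearly separable toy data (assumption for illustration)
X, y = make_moons(n_samples=300, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaling, Kernel PCA, and the classifier are fitted together on the training data
model = make_pipeline(
    StandardScaler(),
    KernelPCA(n_components=2, kernel='rbf'),
    LogisticRegression(random_state=0),
)
model.fit(X_train, y_train)

# predict() applies the fitted scaler and Kernel PCA to X_test before classifying
y_pred = model.predict(X_test)
accuracy = model.score(X_test, y_test)
print(accuracy)
```

The same pattern should work with PCA or LDA in place of KernelPCA, since the pipeline passes y_train to each step's fit method.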