This article takes a look at principal component analysis (PCA). In the course of the article, we will work our way through the most important theoretical knowledge and create an implementation of PCA in Python.
The content of this blog post was originally published as a Jupyter notebook here. You can run the code interactively in your browser using Binder.
What is principal component analysis?
In simple terms, principal component analysis (PCA) is a technique to perform dimensionality reduction. It has been around for more than 100 years and is still heavily used. Although PCA is most often applied to find a lower-dimensional representation of data, it can also be used for other purposes, e.g., to detect simple patterns in data.
Why do we need principal component analysis?
A lot of data is high-dimensional. Imagine patient data collected in a hospital. For each patient, we can have hundreds or even thousands of measurements (blood pressure, heart rate, respiratory rate, etc.). Working with such data is difficult – it is expensive to store, hard to analyze, and almost impossible to visualize.
Luckily, many dimensions in high-dimensional data are often redundant and can be expressed by combining some of the other dimensions. Also, the key part of the data is often contained in a more compact lower-dimensional structure. Consequently, we can simplify high-dimensional data points using dimensionality reduction techniques like principal component analysis (PCA).
Key ideas of PCA
- PCA finds a lower-dimensional representation of data by constructing new features (called principal components) which are linear combinations of the original features
- The new features are selected carefully: PCA looks for features that can summarize the data best without losing too much information
Two ways of deriving PCA
Let’s say we have an i.i.d. dataset \(\boldsymbol{X}\) with \(D\) dimensions and a mean value of \(0\):
\(\boldsymbol{X} = \{\boldsymbol{x}_1, …, \boldsymbol{x}_N\}, \boldsymbol{x}_n \in \mathbb{R}^D\). The data covariance matrix (which will be needed later on) is computed as follows:
$$ \boldsymbol{S} = \frac{1}{N} \sum_{n=1}^{N} \boldsymbol{x}_n \boldsymbol{x}_n^T$$
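As a small illustration (with placeholder data, not part of the implementation further below), this formula translates almost literally into NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                  # placeholder data: N = 5 points, D = 3 dimensions
X = X - X.mean(axis=0)                       # center the data so it has zero mean
N = X.shape[0]

# S = (1/N) * sum_n x_n x_n^T, computed exactly as in the formula above
S = sum(np.outer(x_n, x_n) for x_n in X) / N
print(S.shape)                               # (3, 3)
```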
In PCA our goal is to find projections of the datapoints \(\boldsymbol{x}_n\) that are as similar to the original data points as possible but have lower dimensionality. We can approach this goal from two perspectives:
- Maximum variance perspective: We try to find a low-dimensional representation that maximizes the variance of the projected data.
- Projection perspective: We try to find a low-dimensional representation that minimizes the average reconstruction error between the original data and the reconstructed data.
Both approaches reach the same result.
Maximum variance perspective
In the maximum variance perspective, PCA is derived as an algorithm that tries to find a transformation matrix \(\boldsymbol{B}\) which projects the original data \(\boldsymbol{X}\) to a low-dimensional space, ideally without losing information. Let’s say that this low-dimensional space has dimension \(M\).
How can we make sure that we find a matrix \(\boldsymbol{B}\) that retains as much information as possible? It turns out that this is achieved when \(\boldsymbol{B}\) projects the data in a way that maximizes the variance of the data in the (new) low-dimensional space.
Let’s take a closer look at \(\boldsymbol{B}\). We can define \(\boldsymbol{B}\) as a collection of vectors \(\boldsymbol{b}_1, …, \boldsymbol{b}_M\):
$$
\boldsymbol{B} := [\boldsymbol{b}_1, …, \boldsymbol{b}_M] \in \mathbb{R}^{D \times M}
$$
The vectors \(\boldsymbol{b}_m\) are called basis vectors and form the axes of the new \(M\)-dimensional space we project our data to.
When deriving which vectors \(\boldsymbol{b}_m\) we should select to maximize the variance, we will find that maximizing the variance is equivalent to selecting those vectors that belong to the largest eigenvalues of the data covariance matrix \(\boldsymbol{S}\). That means that we can construct our matrix \(\boldsymbol{B}\) by first computing the eigenvalues and eigenvectors of the covariance matrix \(\boldsymbol{S}\) and then selecting the \(M\) eigenvectors with the largest eigenvalues.
To be more precise: \(\boldsymbol{b}_1\) is the eigenvector with the largest eigenvalue and, in the context of PCA, is called the first principal component. \(\boldsymbol{b}_2\) is the eigenvector with the second-largest eigenvalue and is called the second principal component, and so on.
If you are interested in the derivation, take a look at chapter 10.2 of the book Mathematics for Machine Learning.
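As a rough sketch of this selection step (the helper name top_m_eigenvectors is made up for illustration and assumes the covariance matrix \(\boldsymbol{S}\) has already been computed):

```python
import numpy as np

def top_m_eigenvectors(S, M):
    """Return the M eigenvectors of S belonging to the largest eigenvalues."""
    eig_vals, eig_vecs = np.linalg.eigh(S)   # eigh: suitable for the symmetric matrix S
    order = np.argsort(eig_vals)[::-1]       # indices sorted from largest to smallest eigenvalue
    return eig_vecs[:, order[:M]]            # columns are b_1, ..., b_M

# Example usage: B = top_m_eigenvectors(S, M=2) gives a D x 2 projection matrix
```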
Once we have found our projection matrix \(\boldsymbol{B}\), we can transform each data vector \(\boldsymbol{x}_n\) by multiplying it with \(\boldsymbol{B}\). This will give us a low-dimensional compressed representation of \(\boldsymbol{x}_n\):
$$
\boldsymbol{z}_n = \boldsymbol{B}^T \boldsymbol{x}_n \in \mathbb{R}^M
$$
Using a single matrix multiplication, we can compute the compressed representations of all data points at once: \(\boldsymbol{Z} = \boldsymbol{X} \boldsymbol{B} \in \mathbb{R}^{N \times M}\).
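A tiny sketch of both forms of the projection, using placeholder arrays rather than the real data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                   # placeholder data: N = 5, D = 3
B = np.linalg.qr(rng.normal(size=(3, 2)))[0]  # placeholder orthonormal basis: D = 3, M = 2

z_0 = B.T @ X[0]   # compressed representation of a single data point, shape (2,)
Z = X @ B          # compressed representations of all data points at once, shape (5, 2)
```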
Projection perspective
Another way to derive PCA is to consider the original data points \(\boldsymbol{x}_n\) and their reconstructions \(\boldsymbol{\tilde{x}}_n\). In this perspective, we are trying to find reconstructions \(\boldsymbol{\tilde{x}}_n\) that minimize the average squared Euclidean distance between the original data points and their reconstructions: \(J = \frac{1}{N} \sum_{n=1}^{N}||\boldsymbol{x}_n - \boldsymbol{\tilde{x}}_n ||^2\).
You can find more details about this perspective in chapter 10.3 of the book Mathematics for Machine Learning.
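A small sketch of the reconstruction error \(J\), again with placeholder arrays; it uses the fact that, for an orthonormal \(\boldsymbol{B}\), the reconstruction is \(\boldsymbol{\tilde{x}}_n = \boldsymbol{B}\boldsymbol{B}^T\boldsymbol{x}_n\):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                     # placeholder data: N = 5, D = 3
X = X - X.mean(axis=0)                          # center the data
B = np.linalg.qr(rng.normal(size=(3, 2)))[0]    # placeholder orthonormal basis: D = 3, M = 2

X_tilde = X @ B @ B.T                           # reconstructions of all data points
J = np.mean(np.sum((X - X_tilde) ** 2, axis=1)) # average squared Euclidean distance
print(J)
```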
Interpreting the projection matrix
In the beginning, I mentioned that PCA constructs new features (the principal components) that are linear combinations of the original features. Let’s take a closer look at what this means.
Each data point is stored in a vector with \(D\) elements: \(\boldsymbol{x}_n = [x_{n1}, …, x_{nD}]\). Each of the \(D\) dimensions represents a different feature. For example, think of an image with \(D = 64\) pixels. We can describe each data point as a linear combination of the features:
$$
\boldsymbol{x}_n = x_{n1} \cdot \text{feature}_1 + x_{n2} \cdot \text{feature}_2 + … + x_{nD} \cdot \text{feature}_D
$$
Using the example of pixels this would correspond to:
$$
\boldsymbol{x}_n = x_{n1} \cdot \text{pixel}_1 + x_{n2} \cdot \text{pixel}_2 + … + x_{nD} \cdot \text{pixel}_D
$$
When PCA projects the data to a low-dimensional space, it also uses linear combinations of the features. This becomes evident in the projection equation we saw above:
$$
\boldsymbol{z}_n = \boldsymbol{B}^T \boldsymbol{x}_n \in \mathbb{R}^M
$$
In this equation, we find the compressed representation \(\boldsymbol{z}_n\) of the data point \(\boldsymbol{x}_n\) by performing a matrix multiplication. The new, compressed representation of a data point \(\boldsymbol{x}_n\) is given by a linear combination of all its feature values. We can make this clear by considering an example.
Let’s say our data matrix \(\boldsymbol{X}\) has 3 data points and 2 features, so \(\boldsymbol{X} \in \mathbb{R}^{3 \times 2}\), and we consider only the first principal component when performing PCA. Hence, we have \(\boldsymbol{B} \in \mathbb{R}^{2 \times 1}\) with entries \(b_1\) and \(b_2\). When multiplying the data matrix with the projection matrix, we receive the compressed versions of the data points as a matrix \(\boldsymbol{Z} \in \mathbb{R}^{3 \times 1}\). Each one-dimensional code \(z_n\) is given by a linear combination of the original feature values: \(z_n = b_1 x_{n1} + b_2 x_{n2}\).
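With made-up numbers for this small example, the computation looks like this:

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # 3 data points with 2 features each
B = np.array([[0.6],
              [0.8]])        # first principal component (made-up values), shape (2, 1)

Z = X @ B                    # each z_n = 0.6 * x_n1 + 0.8 * x_n2
print(Z)                     # contains the values 2.2, 5.0 and 7.8
```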
Computing the eigenvectors
By now we know that we have to compute eigenvalues and eigenvectors to perform PCA. We have two ways to do that:
- Perform an eigendecomposition of the data covariance matrix \(\boldsymbol{S}\)
- Perform a singular value decomposition of the data matrix \(\boldsymbol{X}\)
The standard approach (which we will use in this tutorial) is the first approach. You can find more on approach two in chapter 10.4 of the Mathematics for Machine Learning book.
Note: In many applications, we only need the first few eigenvectors. Performing a full eigendecomposition or singular value decomposition would be computationally wasteful. Therefore, most software packages implement iterative methods that compute only the required number of eigenvectors, as in the sketch below.
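For example, SciPy provides an iterative (Lanczos-based) solver that returns only the \(k\) leading eigenpairs; a minimal sketch with a placeholder symmetric matrix:

```python
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(0)
A = rng.normal(size=(64, 64))
S = A @ A.T / 64                               # placeholder symmetric matrix

eig_vals, eig_vecs = eigsh(S, k=2, which="LM") # only the 2 largest-magnitude eigenpairs
print(eig_vals.shape, eig_vecs.shape)          # (2,) (64, 2)
```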
Summary of important terminology
- \(\boldsymbol{x}_n\) are our data points, collected in the dataset \(\boldsymbol{X} = \{\boldsymbol{x}_1, …, \boldsymbol{x}_N\}\), \(\boldsymbol{x}_n \in \mathbb{R}^D\).
- \(\boldsymbol{S}\) is the data covariance matrix.
- \(\boldsymbol{B}\) is the projection matrix and consists of column vectors \(\boldsymbol{b}_m\): \(\boldsymbol{B} := [\boldsymbol{b}_1, …, \boldsymbol{b}_M] \in \mathbb{R}^{D \times M}\).
- \(\boldsymbol{z}_n\) is the low-dimensional representation of \(\boldsymbol{x}_n\): \(\boldsymbol{z}_n = \boldsymbol{B}^T \boldsymbol{x}_n \in \mathbb{R}^M\).
Implementation
```python
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
import numpy as np
```
Dataset
We will use the digits dataset from scikit-learn for our implementation. It contains 8×8 images of handwritten digits (1797 in total). The data is stored in a data matrix with 1797 rows and 64 columns (corresponding to all 8×8=64 pixel values).
```python
digits = load_digits()
data, labels = digits.data, digits.target

print(f"Shape of data matrix: {data.shape}")
print(f"Shape of label matrix: {labels.shape}")

# Example image
plt.imshow(digits.images[42], cmap="gray");
```
```
Shape of data matrix: (1797, 64)
Shape of label matrix: (1797,)
```
Standardization
Before we can apply PCA, we should standardize the data. If you are wondering why this is necessary, check out this explanation on StackExchange. Standardization rescales the data to have a mean of zero and a standard deviation of one.
```python
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)

# If values have a standard deviation of zero, the normalization step
# will contain a division by zero. To prevent this we replace standard
# deviations of zero with one.
std[std == 0] = 1.
data_norm = (data - mean) / std

# We could also perform standardization using scikit-learn
from sklearn.preprocessing import StandardScaler
data_norm_sklearn = StandardScaler().fit_transform(data)
```
Computing the data covariance matrix
To find the principal components we first have to compute the data covariance matrix \(\boldsymbol{S}\):
$$
\boldsymbol{S} = \frac{1}{N} \sum_{n=1}^{N} \boldsymbol{x}_n \boldsymbol{x}_n^T
$$
Because the data points are stored as the rows of \(\boldsymbol{X}\), we can write this sum compactly as a matrix product:
$$
\boldsymbol{S} = \frac{1}{N} \boldsymbol{X}^T \boldsymbol{X}
$$
```python
n_samples, n_features = data_norm.shape
cov_matrix = (data_norm.T.dot(data_norm)) / n_samples

# We could also compute the covariance matrix with NumPy (see code below).
# Note: the NumPy implementation divides by n_samples - 1. This
# is known as Bessel's correction. Refer to the corresponding
# Wikipedia article if you are interested in more details.
cov_matrix_numpy = np.cov(data_norm.T)
```
Computing the eigenvalues and eigenvectors
To compute the eigenvalues and eigenvectors, we perform an eigendecomposition of the covariance matrix \(\boldsymbol{S}\). Doing this by hand is tedious and error-prone. Therefore, we will make use of NumPy to get both the eigenvalues and eigenvectors. If you want to learn more about how to perform an eigendecomposition, take a look at this Wikipedia article.
```python
eig_vals, eig_vecs = np.linalg.eig(cov_matrix)
print(f"Shape of eigenvalue matrix: {eig_vals.shape}")
print(f"Shape of eigenvectors matrix: {eig_vecs.shape}")
print("=====================================")

# We want to find the eigenvectors with the highest eigenvalues.
# To do so, we use argsort, which returns the indices that would
# sort the array of eigenvalues. Since we want to sort from largest
# to smallest value, we need to reverse the order of the result.
sorted_indices = np.argsort(eig_vals)[::-1]

# Look at the first few eigenvalues to make sure that we sorted them correctly
for idx in sorted_indices[:3]:
    print(f"Eigenvalue: {eig_vals[idx]}")
```
```
Shape of eigenvalue matrix: (64,)
Shape of eigenvectors matrix: (64, 64)
=====================================
Eigenvalue: 7.340688819618325
Eigenvalue: 5.832243185889725
Eigenvalue: 5.151093084500989
```
Choosing the number of eigenvalues
In order to decide how many eigenvalues we want to use, we should ask ourselves what our goal is. Remember that the number of eigenvalues that we select corresponds to the dimensionality of the new low-dimensional space we are creating.
Most of the time, PCA is used to reduce the dimensionality of data. But at the same time, we want to make sure that our solution is good, meaning that we do not lose a lot of information. Consequently, we should try to select the eigenvalues that explain most of the variance in the data.
Typically, the first few eigenvalues capture most of the variance whereas later ones do not add much information. When using tools like scikit-learn, we can specify how much variance should be explained by the solution so we do not have to choose the number of eigenvalues ourselves.
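For instance, scikit-learn's PCA accepts a fraction between 0 and 1 as n_components and then keeps just enough components to reach that share of explained variance (a short sketch, reusing the data_norm array defined above):

```python
from sklearn.decomposition import PCA

pca = PCA(n_components=0.9)   # keep enough components for ~90% explained variance
pca.fit(data_norm)
print(pca.n_components_)      # number of components that was actually selected
```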
In our case, we can decide how many principal components to use by computing the explained variance of the individual eigenvectors.
```python
sum_eig_vals = sum(eig_vals)
sorted_eig_vals = sorted(eig_vals, reverse=True)
explained_variance = [(eig_val / sum_eig_vals) * 100 for eig_val in sorted_eig_vals]
cumulative_explained_variance = np.cumsum(explained_variance)

plt.figure(figsize=(8, 6))
plt.bar(range(len(eig_vals)), explained_variance, label="Explained variance", color="red")
plt.xlabel("Index of principal components")
plt.ylabel("Explained variance in percent")
plt.title("Variance captured by the individual principal components")
plt.show()

plt.figure(figsize=(8, 6))
plt.bar(range(len(eig_vals)), cumulative_explained_variance, label="Cumulative explained variance", color="blue")
plt.xlabel("Number of principal components")
plt.ylabel("Cumulative explained variance in percent")
plt.title("Cumulative variance captured by the principal components")
plt.show()
```
The first plot shows that the first principal components explain more variance than the later ones. However, the amount of variance explained by each individual component is not very high. Consequently, we would lose a lot of information when using only the first few principal components. The second plot shows that we would have to keep 31 principal components to explain 90% of the variance. This is a lot, but still better than keeping all 64 values.
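As a quick sanity check, we can derive this number directly from the cumulative_explained_variance array computed above:

```python
# Smallest number of principal components whose cumulative explained variance reaches 90%
n_components_90 = int(np.argmax(cumulative_explained_variance >= 90)) + 1
print(n_components_90)   # 31 for this dataset
```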
For the purpose of this tutorial, we will keep only the 2 largest eigenvalues and their corresponding eigenvectors. This will allow us to visualize the result in two dimensions.
```python
# Select only the number of eigenvalues / principal components that we want to use
n_principal_comps = 2

eig_vals = eig_vals[sorted_indices[:n_principal_comps]]
eig_vecs = eig_vecs[:, sorted_indices[:n_principal_comps]]

print(f"Eigenvalues shape: {eig_vals.shape}")
print(f"Eigenvectors shape: {eig_vecs.shape}")
```
```
Eigenvalues shape: (2,)
Eigenvectors shape: (64, 2)
```
Constructing the projection matrix
After finding the eigenvectors/principal components, we can arrange them into a matrix (our projection matrix \(\boldsymbol{B}\)).
```python
vector_list = [vec.reshape(n_features, 1) for vec in eig_vecs.T]
proj_matrix = np.hstack(vector_list)  # This is our projection matrix B
print(f"Shape of projection matrix: {proj_matrix.shape}")
```
```
Shape of projection matrix: (64, 2)
```
Projecting the data
In order to actually project the original data \(\boldsymbol{X}\) into the lower-dimensional space, we multiply it with the matrix \(\boldsymbol{B}\):
$$
\boldsymbol{Z} = \boldsymbol{X} \boldsymbol{B} \in \mathbb{R}^{N \times M}
$$
```python
data_proj = data_norm.dot(proj_matrix)  # This is our compressed data matrix Z
print(f"Shape of projected data matrix: {data_proj.shape}")
```
```
Shape of projected data matrix: (1797, 2)
```
Plotting the transformed data
With only two principal components, we can plot the transformed data easily. One axis will correspond to the first principal component, the other axis to the second.
```python
labels_str = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]

plt.figure(figsize=(8, 6))
scatter = plt.scatter(data_proj[:, 0], data_proj[:, 1], c=labels)
plt.legend(handles=scatter.legend_elements()[0], labels=labels_str, loc="upper left")
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title("Compressed digit dataset with labels")
plt.show();

# Stackoverflow post to add legend:
# https://stackoverflow.com/questions/17411940/matplotlib-scatter-plot-legend
```
Comparison to scikit-learn
In practice, it does not make sense to perform PCA manually. Instead, we can use packages like scikit-learn. This simplifies the computation substantially. Let’s apply principal component analysis as contained in scikit-learn to the digits dataset and compare it to our solution. As visible in the plot below, our implementation of PCA gives the same result as the scikit-learn version!
```python
from sklearn.decomposition import PCA

sklearn_pca = PCA(n_components=2)
data_proj_sklearn = sklearn_pca.fit_transform(data_norm)

# The eigenvectors in scikit-learn point in the opposite direction
# compared to our eigenvectors (in other words: they are flipped). This
# does not make a difference regarding the solution. However, to make
# sure that we get exactly the same plot, we multiply the result by -1.
data_proj_sklearn *= -1

plt.figure(figsize=(8, 6))
scatter = plt.scatter(data_proj_sklearn[:, 0], data_proj_sklearn[:, 1], c=labels)
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(handles=scatter.legend_elements()[0], labels=labels_str, loc="upper left")
plt.title("Compressed digits dataset with labels (sklearn version)")
plt.show()
```
Sources and further reading
The basis for this notebook is chapter 10 of the book Mathematics for Machine Learning. I can highly recommend reading through the chapter to get a deeper understanding of PCA.
Another fantastic explanation of PCA is given in this post on StackExchange.
Case studies can also be helpful for building understanding. An interesting application of PCA can be found here:
https://iopscience.iop.org/article/10.1088/2632-2153/abd87c
In this study, PCA was applied in conjunction with k-means clustering to characterize monolayer materials, identifying physically distinct areas in the samples.