1. Dimensionality Reduction
Dimensionality Reduction: Simplifying complex data.
If you have a dataset with hundreds of features, unsupervised algorithms can compress them into a few essential variables while keeping most of the core information.
Dimensionality reduction has a role in both supervised and unsupervised learning; however, it serves a different purpose in each.
Here is how it works across both paradigms:
1. Dimensionality Reduction in Unsupervised Learning
This is the most common and intuitive home for dimensionality reduction. In unsupervised learning, you don't have target labels (y-values); you only have the raw features (X-values).
The objective is to find hidden structure, simplify the data, and visualize it while preserving as much of the core information as possible.
The method compresses the data by finding new, combined axes that capture the most variance or geometric structure in the data.
Common Algorithms:
PCA (Principal Component Analysis): A linear method that rotates the data to find the directions (principal components) of maximum variance.
t-SNE and UMAP: Non-linear methods primarily used to map high-dimensional data into 2D or 3D space for cluster-based visualizations.
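As a rough illustration of the PCA idea above, here is a minimal sketch that uses NumPy's SVD rather than a dedicated library. The `pca` helper and the synthetic data are assumptions made up for this example, not a reference implementation.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio

rng = np.random.default_rng(0)
# Synthetic data: 200 points in 3 features that vary mostly along one direction.
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

Z, components, ratio = pca(X, n_components=1)
print(Z.shape)  # (200, 1)
print(ratio)    # fraction of total variance kept by the first component
```

Because the three features are almost perfectly tied to the single hidden variable `t`, one component should retain nearly all of the variance here. Real datasets are rarely this clean.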
2. Dimensionality Reduction in Supervised Learning
In supervised learning, you have target labels (e.g., predicting if an email is "spam" or "not spam").
Dimensionality reduction is often used as a preprocessing step before training a predictive model.
The Goal: To fight the "Curse of Dimensionality." Having too many features (like thousands of columns in a spreadsheet) can cause models to overfit, run incredibly slowly, or get confused by noise.
How it's used: It reduces the feature space to the most informative components so algorithms like Random Forests, Support Vector Machines, or Linear Regression can often predict the target more accurately and efficiently.
Two main approaches:
Unsupervised Preprocessing: You can run an unsupervised method like PCA on your features before passing them to a supervised model. The reduction step doesn't know about the labels, so it keeps the directions with the most variance and discards the rest. This often removes noise, but it can also discard a low-variance direction that happens to be useful for prediction.
Supervised Dimensionality Reduction: Some reduction techniques explicitly look at the labels to find the best reduction. The classic example is LDA (Linear Discriminant Analysis), which projects the data onto axes that maximize the separation between class means relative to the spread within each class. It produces at most one fewer axis than the number of classes.
Whether you are trying to understand the underlying shape of unlabelled data or to optimize a model that predicts a specific target, shrinking the feature space is useful in both settings.
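To make the difference between the two approaches concrete, here is a minimal sketch of the two-class case (Fisher's linear discriminant) written directly in NumPy, next to the top PCA direction on the same data. The `fisher_lda_direction` helper and the synthetic dataset are assumptions for illustration only.

```python
import numpy as np

def fisher_lda_direction(X, y):
    """Two-class Fisher discriminant: the single axis that best separates the classes."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: how spread out each class is around its own mean.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
n = 200
# Synthetic data: feature 0 is high-variance noise shared by both classes,
# feature 1 has low variance but is the only one that separates the classes.
X0 = np.column_stack([rng.normal(0, 5, n), rng.normal(0, 0.5, n)])
X1 = np.column_stack([rng.normal(0, 5, n), rng.normal(2, 0.5, n)])
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

lda_dir = fisher_lda_direction(X, y)
z = X @ lda_dir  # 1D projection that uses the labels

# Top PCA direction on the same data, computed without the labels.
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
pca_dir = Vt[0]

print("LDA direction:", np.round(lda_dir, 3))
print("PCA direction:", np.round(pca_dir, 3))
```

In this deliberately constructed case, the label-aware direction lines up with feature 1, while the label-blind PCA direction follows the noisy feature 0. That is the trade-off between the two approaches in miniature.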
Three related terms come up in this context: feature dimensionality, space dimensionality, and object dimensionality. They sound very similar and all deal with the concept of "dimensions," but they look at your data from different perspectives: the columns, the environment, and the individual items. Here is a breakdown to help you tell them apart.
1. Feature Dimensionality (Columns)
Feature dimensionality refers to the number of distinct attributes, variables, or properties measured for each data point. In a standard spreadsheet, this is simply the number of columns.
The Concept: If you are building a model to predict house prices, and you collect data on Square Footage, Number of Bedrooms, Zip Code, and Year Built, your feature dimensionality is 4.
Why it matters: This is the most common term used when discussing the "Curse of Dimensionality." High feature dimensionality means your model has to juggle a massive number of variables, which can lead to overfitting.
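In code, feature dimensionality is usually just the second entry of a data matrix's shape. The sketch below assumes NumPy; the rows are invented purely for illustration.

```python
import numpy as np

feature_names = ["square_footage", "bedrooms", "zip_code", "year_built"]

# Hypothetical rows, made up for this example: one row per house.
houses = np.array([
    [1500, 3, 94110, 1985],
    [2200, 4, 73301, 2004],
    [900, 2, 10001, 1962],
])

n_samples, n_features = houses.shape
print(n_samples, n_features)  # 3 4
```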
2. Space Dimensionality (Container)
Space dimensionality refers to the mathematical environment (the coordinate system) where your data lives. It is directly determined by the feature dimensionality.
The Concept: If a data point has 2 features, it lives in a 2D space (a flat plane with an X and Y axis). If it has 3 features, it lives in a 3D space (a volume with X, Y, and Z axes). If it has 100 features, it lives in a 100-dimensional vector space.
Why it matters: Geometry changes drastically as space dimensionality increases. A fixed number of data points becomes increasingly sparse in high-dimensional space, and pairwise distances tend to become nearly equal, making concepts like "distance" or "nearest neighbors" less meaningful.
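The effect on distances can be checked with a small experiment. The sketch below assumes NumPy and uniformly random points in a unit hypercube. The `relative_contrast` helper is a made-up name for this example. It measures how much farther the farthest point is than the nearest one, relative to the nearest.

```python
import numpy as np

def relative_contrast(n_dims, n_points=500, seed=0):
    """(max - min) / min distance from one query point to uniform random points."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_dims))
    query = rng.random(n_dims)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

contrasts = {d: relative_contrast(d) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"{d:>4} dims: relative contrast = {c:.2f}")
```

The contrast should shrink as the number of dimensions grows: in a high-dimensional space the nearest and farthest neighbors end up at almost the same distance, which is why nearest-neighbor reasoning becomes less informative.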
3. Object Dimensionality (The Shape Inside the Container)
Object dimensionality (sometimes referred to as intrinsic dimensionality) describes the true, structural shape of the data itself, regardless of how many features it has.
The Concept: Imagine a piece of paper. It is a 2D object. Now, crumple that paper into a ball and throw it into a 3D room.
The space dimensionality is 3D (because you need X, Y, and Z coordinates to locate any point on the paper).
The object dimensionality is still 2D, because the paper itself is fundamentally a flat, two-dimensional sheet wrapped up in a higher space.
This is the main reason dimensionality reduction works. If you have 50 features (50D space), but the data actually lies on a smooth, winding 3-dimensional surface (a manifold) embedded inside that space, the object dimensionality is 3.
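Linear methods like PCA look for a flat, lower-dimensional subspace that captures most of the variance, while non-linear methods like t-SNE and UMAP try to preserve the structure of a curved surface so that it can be unwrapped and seen clearly in fewer dimensions.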
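The gap between space dimensionality and object dimensionality can be sketched in the simplest (linear) case. The example below assumes NumPy and a hypothetical random linear map that places 2D data inside a 50D space. It then counts how many principal directions carry essentially all of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))    # the "true" 2D coordinates
embedding = rng.normal(size=(2, 50))  # hypothetical linear map into 50D
X = latent @ embedding + 1e-3 * rng.normal(size=(300, 50))  # tiny measurement noise

X_centered = X - X.mean(axis=0)
singular_values = np.linalg.svd(X_centered, compute_uv=False)
variance_ratio = singular_values**2 / np.sum(singular_values**2)

# Number of components needed to explain 99.9% of the variance.
estimated_dim = int(np.searchsorted(np.cumsum(variance_ratio), 0.999) + 1)
print(X.shape[1], estimated_dim)  # 50 2
```

This variance-counting trick only recovers the object dimensionality when the data lies on a flat subspace. For a curved shape like the crumpled paper, a linear method would typically report more dimensions than the sheet really has.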
Quick Analogy Summary
Think of a museum exhibition:
Feature Dimensionality: The number of specific details written on the plaque about a statue (height, weight, age, material = 4).
Space Dimensionality: The 3D room the statue is standing in.
Object Dimensionality: The actual geometric structure of the statue itself (e.g., it's a flat, 2D silhouette cutout standing in that 3D room).