This notebook demonstrates dimensionality reduction with Principal Component Analysis (PCA) on the Fashion-MNIST dataset. You'll work through a realistic scenario: reducing the 784 pixel features (one per pixel of a 28×28 image) to a smaller set of components while keeping classification accuracy close to the baseline. The notebook covers:
- Loading and visualizing Fashion-MNIST data (70,000 grayscale images of clothing)
- Training a baseline Random Forest classifier on the full 784-dimensional dataset
- Applying PCA to reduce dimensionality
- Comparing model performance and training time with different numbers of components
- Visualizing data in 2D and 3D using principal components
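The baseline-versus-PCA comparison listed above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: it assumes scikit-learn and NumPy are installed, uses a small synthetic stand-in for the data so it runs without the CSV files, and the helper `fit_and_score`, the component counts, and the forest size are all hypothetical choices.

```python
import time

import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Fashion-MNIST (an assumption for this sketch):
# 600 samples, 784 "pixel" features, 10 classes.
rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=600)
centers = rng.normal(0.0, 1.0, size=(10, 784))
X = centers[y] + rng.normal(0.0, 2.0, size=(600, 784))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def fit_and_score(X_tr, X_te):
    """Train a Random Forest and return (test accuracy, training seconds)."""
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    start = time.perf_counter()
    clf.fit(X_tr, y_train)
    elapsed = time.perf_counter() - start
    return clf.score(X_te, y_test), elapsed


# Baseline: all 784 features.
results = {"baseline (784)": fit_and_score(X_train, X_test)}

# Reduced: project onto the leading principal components.
for n_components in (10, 50):  # illustrative values only
    pca = PCA(n_components=n_components, random_state=0)
    X_train_pca = pca.fit_transform(X_train)  # fit PCA on training data only
    X_test_pca = pca.transform(X_test)        # reuse the same projection
    results[f"PCA ({n_components})"] = fit_and_score(X_train_pca, X_test_pca)

for name, (acc, secs) in results.items():
    print(f"{name}: accuracy={acc:.3f}, train_time={secs:.2f}s")
```

Fitting PCA on the training split alone and then applying `transform` to the test split keeps test data out of the projection, which makes the accuracy comparison fair.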
pixel-features.ipynb        # Main notebook
data/
    fashion-mnist_train.csv
    fashion-mnist_test.csv
- Download the two CSV files (fashion-mnist_train.csv and fashion-mnist_test.csv) from the Fashion-MNIST dataset page on Kaggle
- Place both CSV files in the data/ directory
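Once the files are in place, loading them might look like the sketch below. It assumes pandas is installed and that each CSV has a `label` column followed by 784 pixel columns (`pixel1` to `pixel784`), which is how the Kaggle files are commonly laid out but is not confirmed here. The helper `load_split` is hypothetical, and the demo reads a tiny in-memory CSV so the code runs without the real data.

```python
import io

import numpy as np
import pandas as pd


def load_split(csv_source):
    """Read one Fashion-MNIST CSV and return (X, y) as NumPy arrays.

    Assumes a 'label' column plus one column per pixel (hypothetical layout).
    """
    df = pd.read_csv(csv_source)
    y = df["label"].to_numpy()
    # Scale raw pixel intensities from 0-255 to 0-1.
    X = df.drop(columns="label").to_numpy(dtype=np.float32) / 255.0
    return X, y


# Tiny in-memory stand-in with the assumed column layout: two rows, 784 pixels each.
header = "label," + ",".join(f"pixel{i}" for i in range(1, 785))
rows = ["3," + ",".join(["0"] * 784), "7," + ",".join(["255"] * 784)]
demo_csv = io.StringIO("\n".join([header] + rows))

X_demo, y_demo = load_split(demo_csv)
print(X_demo.shape, y_demo)

# With the real files in place, the call would be along the lines of:
# X_train, y_train = load_split("data/fashion-mnist_train.csv")
# X_test, y_test = load_split("data/fashion-mnist_test.csv")
```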
Open pixel-features.ipynb in your editor (VS Code, Jupyter Notebook, or JupyterLab) and run the cells in order.