Credit Card Customer Clustering Tutorial

A hands-on tutorial demonstrating K-means clustering to segment credit card customers into actionable business groups using the UCI Credit Card Default dataset.

Overview

This tutorial walks through the complete process of customer segmentation using unsupervised machine learning. You'll learn to:

Transform raw transactional data into behavioral features
Apply K-means clustering to identify customer segments
Select the optimal number of clusters using the Elbow Method and Silhouette Score
Interpret clusters for business decision-making

Dataset

Source: UCI Credit Card Default dataset (Kaggle)

The dataset contains 30,000 credit card customers with 6 months of payment history and transaction data.

Prerequisites

Python 3.7 or higher
Jupyter Notebook or VS Code with Jupyter extension

Installation

1. Clone or Download

Download this repository or navigate to the project directory:

cd /path/to/Unit6-Practice

2. Install Dependencies

The tutorial uses the following Python packages:

pandas
numpy
matplotlib
seaborn
scikit-learn

Install all dependencies using the first code cell in the notebook, or run:

pip install pandas numpy matplotlib seaborn scikit-learn

3. Verify Data File

Ensure the dataset is located at:

data/UCI_Credit_Card.csv
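
One way to confirm the file is in place before opening the notebook is a small path check run from the project directory. This is a minimal sketch; the helper name `dataset_available` is hypothetical and not part of the tutorial.

```python
from pathlib import Path


def dataset_available(path="data/UCI_Credit_Card.csv"):
    """Return True if a regular file exists at the given (relative) path."""
    return Path(path).is_file()


# Report the result for the default location used by the tutorial.
if dataset_available():
    print("Dataset found")
else:
    print("Dataset missing: expected data/UCI_Credit_Card.csv")
```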

How to Run

Option 1: Using Jupyter Notebook

Launch Jupyter Notebook:
```
jupyter notebook
```
Open credit_groups.ipynb
Run cells sequentially from top to bottom using Shift + Enter

Option 2: Using VS Code

Open the project folder in VS Code
Open credit_groups.ipynb
Select a Python kernel when prompted
Run cells sequentially using the play button or Shift + Enter

Tutorial Structure

Part 1: Data Loading and Feature Engineering

Load the UCI Credit Card dataset
Create three behavioral features using RFM methodology:
- Recency: Most recent payment status
- Frequency: Count of on-time payments over 6 months
- Monetary: Average monthly payment amount
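
The three features above could be derived roughly as follows. This is a sketch under stated assumptions, not the notebook's exact code: it assumes the Kaggle CSV's column names (`PAY_0`, `PAY_2` to `PAY_6` for repayment status, `PAY_AMT1` to `PAY_AMT6` for payment amounts) and treats a status of 0 or below as "no delay". The function name `build_rfm` and the sample rows are made up for illustration.

```python
import pandas as pd

# Assumed column names from the Kaggle CSV (the status columns skip PAY_1).
STATUS_COLS = ["PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]
AMOUNT_COLS = [f"PAY_AMT{i}" for i in range(1, 7)]


def build_rfm(df):
    """Derive three behavioral features per customer (illustrative definitions)."""
    rfm = pd.DataFrame(index=df.index)
    # Recency: repayment status of the most recent month.
    rfm["recency"] = df["PAY_0"]
    # Frequency: number of months without a delay (status <= 0 is an assumption).
    rfm["frequency"] = (df[STATUS_COLS] <= 0).sum(axis=1)
    # Monetary: mean payment amount across the six months.
    rfm["monetary"] = df[AMOUNT_COLS].mean(axis=1)
    return rfm


# Tiny made-up sample so the sketch runs without the real file.
sample = pd.DataFrame({
    "PAY_0": [-1, 2], "PAY_2": [-1, 2], "PAY_3": [0, 0],
    "PAY_4": [0, 1], "PAY_5": [-1, 0], "PAY_6": [-1, 0],
    "PAY_AMT1": [1000, 0], "PAY_AMT2": [1200, 100], "PAY_AMT3": [900, 0],
    "PAY_AMT4": [1100, 0], "PAY_AMT5": [1000, 200], "PAY_AMT6": [800, 0],
})
print(build_rfm(sample))
```

With the real data, the same function would be applied to `pd.read_csv("data/UCI_Credit_Card.csv")`, provided the column names match the assumption above.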

Part 2: Data Preparation

Visualize feature distributions
Standardize features using StandardScaler
Prepare data for clustering
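
K-means relies on Euclidean distance, so a feature measured in large units (such as payment amounts) would otherwise dominate features on small scales (such as a count from 0 to 6). The sketch below shows the standardization step on a made-up feature matrix; the values and the column meaning are assumptions for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix: columns stand in for recency, frequency, monetary.
X = np.array([
    [-1.0, 6.0, 1000.0],
    [2.0, 3.0, 50.0],
    [0.0, 5.0, 400.0],
    [1.0, 2.0, 150.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# After scaling, each column has mean ~0 and population standard deviation ~1.
print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

Keeping the fitted `scaler` around makes it possible to call `scaler.inverse_transform` later, for example to report cluster centers in the original units.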

Part 3: Selecting Optimal K

Test K values from 2 to 10
Apply the Elbow Method (inertia analysis)
Calculate Silhouette Scores
Visualize both metrics to select optimal K
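
The K search described above can be sketched as a loop that records inertia and the silhouette score for each candidate K. This example runs on synthetic blobs from `make_blobs` rather than the tutorial's data, and the `n_init` and `random_state` settings are assumptions, so the numbers it prints say nothing about the real dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled feature matrix (assumption: 3 features).
X, _ = make_blobs(n_samples=300, centers=4, n_features=3, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

k_values = range(2, 11)  # K = 2 .. 10
inertias, silhouettes = [], []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    labels = model.fit_predict(X_scaled)
    inertias.append(model.inertia_)  # within-cluster sum of squares (Elbow Method)
    silhouettes.append(silhouette_score(X_scaled, labels))

# The silhouette peak is one candidate; the elbow in inertia is judged visually.
best_k = k_values[int(np.argmax(silhouettes))]
for k, inertia, sil in zip(k_values, inertias, silhouettes):
    print(f"K={k}: inertia={inertia:.1f}, silhouette={sil:.3f}")
print("Highest silhouette at K =", best_k)
```

Plotting `inertias` and `silhouettes` against `k_values` gives the two charts used to pick K. On the full 30,000-row dataset the silhouette computation can be slow; `silhouette_score` accepts a `sample_size` argument to score a random subset instead.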

Part 4: Final Clustering

Train final K-means model with selected K value
Assign cluster labels to all customers
Analyze cluster characteristics
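
A minimal sketch of the final fit and a per-cluster profile is shown below. The RFM values are made up, and `selected_k = 3` is a hypothetical choice used only so the example runs; in the tutorial the value would come from Part 3.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up RFM table; in the tutorial this would be the engineered features.
rfm = pd.DataFrame({
    "recency": [-1, -1, 0, 2, 3, 2, -1, 0],
    "frequency": [6, 6, 5, 1, 0, 2, 6, 5],
    "monetary": [5000, 4500, 1200, 100, 50, 150, 4800, 1000],
})

# Scale before clustering so no single feature dominates the distance.
X_scaled = StandardScaler().fit_transform(rfm)

selected_k = 3  # hypothetical choice; use the K picked in Part 3
kmeans = KMeans(n_clusters=selected_k, n_init=10, random_state=42)
rfm["cluster"] = kmeans.fit_predict(X_scaled)

# Per-cluster averages in the original units, plus cluster sizes.
profile = rfm.groupby("cluster").agg(
    recency=("recency", "mean"),
    frequency=("frequency", "mean"),
    monetary=("monetary", "mean"),
    customers=("monetary", "size"),
)
print(profile)
```

Reading the profile in original units rather than scaled units makes it easier to match each cluster to a business description. Cluster numbers themselves are arbitrary labels and can change between runs with different seeds.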

Part 5: Business Interpretation

Visualize customer segments
Interpret clusters as business segments:
- Champions: High payment amounts, always on time
- Solid Performers: Consistent on-time payers
- High Rollers: Large payments but inconsistent
- At-Risk: Declining payment behavior
- Problem Accounts: Frequent late payments, low amounts
