A comprehensive time series analysis of the Capital Bikeshare system in Washington D.C., demonstrating fundamental concepts in time series forecasting including data preparation, decomposition, and forecasting methods.
This analysis uses the Bike Sharing in Washington D.C. Dataset from Kaggle, which contains daily bike rental counts for 2011-2012 along with weather and seasonal information.
This notebook demonstrates:
- Time Series Data Preparation
  - Loading and filtering temporal data
  - Handling missing dates
  - Creating proper time series indexes
- Trend Analysis with Moving Averages
  - 7-day (weekly) moving averages
  - 30-day (monthly) moving averages
  - Visual trend identification
- Seasonal Pattern Detection
  - Using pandas datetime accessors
  - Day-of-week analysis
  - Weekday vs. weekend patterns
- Time Series Decomposition
  - Separating trend, seasonal, and residual components
  - Additive vs. multiplicative models
  - Understanding seasonal patterns
- Simple Forecasting Methods
  - Naive baseline (simple average)
  - Seasonal naive (repeating weekly patterns)
  - Forecast accuracy evaluation (RMSE)
  - Understanding forecast limitations

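
As a rough illustration of how several of these steps fit together, here is a minimal sketch using only pandas and NumPy. It is not the notebook's code: the series is synthetic (a linear trend plus a made-up weekly pattern), and the names `weekly`, `horizon`, and `rmse` are assumptions chosen for this example. It fills missing dates, computes 7-day and 30-day moving averages, averages by day of week, and compares the simple-average baseline with a seasonal naive forecast by RMSE.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for daily rental counts (NOT the real dataset):
# a linear trend plus a hypothetical day-of-week effect.
dates = pd.date_range("2011-01-01", periods=70, freq="D")
weekly = np.array([0, 5, 10, 10, 5, -10, -20])  # Monday..Sunday, sums to zero
counts = 100 + np.arange(70) + weekly[dates.dayofweek.to_numpy()]
series = pd.Series(counts, index=dates, name="cnt").astype(float)
series = series.drop(series.index[[10, 25]])  # simulate two missing dates

# 1. Data preparation: restore a complete daily index and fill the gaps.
full_index = pd.date_range(series.index.min(), series.index.max(), freq="D")
series = series.reindex(full_index).interpolate(method="time")

# 2. Trend: 7-day and 30-day moving averages (leading values are NaN).
ma7 = series.rolling(window=7).mean()
ma30 = series.rolling(window=30).mean()

# 3. Seasonality: mean count per day of week (0 = Monday).
by_dow = series.groupby(series.index.dayofweek).mean()

# 4. Forecasting: hold out the last 14 days and compare two baselines.
horizon = 14
train, test = series.iloc[:-horizon], series.iloc[-horizon:]
naive = np.full(horizon, train.mean())  # simple average of the history
seasonal_naive = np.tile(train.iloc[-7:].to_numpy(), horizon // 7)  # repeat last week


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


rmse_naive = rmse(test, naive)
rmse_seasonal = rmse(test, seasonal_naive)
print(f"Naive RMSE: {rmse_naive:.2f}, seasonal naive RMSE: {rmse_seasonal:.2f}")
```

On this synthetic series the seasonal naive forecast scores a lower RMSE than the simple average because it preserves the weekly shape, but it still lags the trend. That gap is one example of the forecast limitations listed above. Results on the real data will differ.
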
- Python 3.7 or higher
- Jupyter Notebook or VS Code with Jupyter extension
- Clone or download this repository
- Install required packages:
    pip install pandas numpy matplotlib scikit-learn statsmodels
  Or install all at once:
    pip install -r requirements.txt
- Verify the data files are present:
  - data/day.csv - Daily bike rental data
  - data/hour.csv - Hourly bike rental data (optional)
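
The data file check can also be scripted. The following is a small sketch using only the standard library. The helper name `check_data_files` is hypothetical and not part of this repository, and it assumes you run it from the repository root.

```python
from pathlib import Path


def check_data_files(root="."):
    """Return (missing_required, missing_optional) as lists of relative paths."""
    required = ["data/day.csv"]   # needed by the notebook
    optional = ["data/hour.csv"]  # listed as optional in this README
    base = Path(root)
    missing_required = [p for p in required if not (base / p).is_file()]
    missing_optional = [p for p in optional if not (base / p).is_file()]
    return missing_required, missing_optional


if __name__ == "__main__":
    missing_required, missing_optional = check_data_files()
    for path in missing_required:
        print(f"Missing required file: {path}")
    for path in missing_optional:
        print(f"Optional file not found: {path}")
    if not missing_required:
        print("All required data files are present.")
```
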
- Open bike_share_analysis.ipynb in Jupyter Notebook or VS Code
- Run cells sequentially from top to bottom (the notebook is designed to be run in order)
- Each section builds on previous sections, so make sure to run all cells in Part 1 before moving to Part 2, etc.
Unit5-Practice/
├── README.md                  # This file
├── bike_share_analysis.ipynb  # Main analysis notebook
├── requirements.txt           # Python dependencies
└── data/
    └── day.csv                # Daily aggregated bike rental data
Assignment 3: Building Time Series Forecasts