
A Python-based NLP project that classifies airline tweets as positive, neutral, or negative with ~80% accuracy using Logistic Regression. Features text preprocessing with NLTK and TF-IDF, and visualizations with Matplotlib/Seaborn. Built to analyze user feedback, with applications in UX analytics and social impact.


nic-stack/Twitter-Sentiment-Analysis


Twitter-Sentiment-Analysis

Description

This Python-based NLP project classifies airline tweets as positive, neutral, or negative with ~80% accuracy using Logistic Regression. It preprocesses text with NLTK (tokenization, lemmatization), vectorizes it with TF-IDF, and uses Matplotlib and Seaborn visualizations to highlight sentiment trends. Built to analyze user feedback, it has applications in UX analytics and social impact, reflecting my passion for data-driven solutions inspired by entrepreneurial experiences.

Features

Dataset: Twitter Airline Sentiment dataset (~14,000 tweets from Kaggle).

Preprocessing: Tokenization, stopword removal, lemmatization, and TF-IDF vectorization.

Model: Logistic Regression with scikit-learn.

Visualizations: Sentiment distribution and confusion matrix.
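The preprocessing and modeling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of sentiment_analysis.py: the `clean_tweet` helper, the tiny `STOPWORDS` set, and the example tweets are all made up for this sketch, and a plain regex tokenizer stands in for NLTK's tokenization, stopword list, and lemmatizer so the snippet runs without downloading NLTK data.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in stopword list (assumption); NLTK's stopword corpus is much larger.
STOPWORDS = {"a", "an", "the", "is", "was", "my", "i", "this", "to", "and"}


def clean_tweet(text):
    """Lowercase, drop @mentions and URLs, keep alphabetic tokens, remove stopwords."""
    text = re.sub(r"@\w+|https?://\S+", " ", text.lower())
    tokens = re.findall(r"[a-z']+", text)
    return " ".join(t for t in tokens if t not in STOPWORDS)


# Made-up example tweets for illustration only (not taken from the dataset).
texts = [
    "@airline I love this airline, great crew!",
    "Great flight and friendly staff",
    "@airline my flight was delayed again, terrible service",
    "Worst experience ever, lost my luggage",
    "What time does the flight to Boston depart?",
    "Is there wifi on the plane?",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

# TF-IDF turns each cleaned tweet into a weighted term vector;
# Logistic Regression then learns one set of weights per sentiment class.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_tweet),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

prediction = model.predict(["I love this airline!"])[0]
print(prediction)
```

With only six training tweets the prediction is not meaningful; the point is the shape of the pipeline. On the real dataset you would split the data into training and test sets before fitting.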

Requirements

Python 3.8+

Libraries: pandas, numpy, nltk, scikit-learn, matplotlib, seaborn

Install: pip install -r requirements.txt

Files

sentiment_analysis.py: Main script for data processing, modeling, and visualization.

sentiment_distribution.png: Bar plot of sentiment distribution.

confusion_matrix.png: Confusion matrix of model performance.

(Optional) sentiment_analysis.ipynb: Jupyter Notebook version.

Dataset not included; download it from Kaggle.

Usage

Clone the repository: git clone https://github.com/nic-stack/Twitter-Sentiment-Analysis

Install dependencies: pip install -r requirements.txt

Download Tweets.csv from Kaggle and place it in the project folder.

Run the script: python sentiment_analysis.py

Or open sentiment_analysis.ipynb in Jupyter Notebook.
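Before running the script, it can help to confirm that Tweets.csv is in place and readable. The sketch below uses only the standard library and counts tweets per sentiment label. It assumes the file has `airline_sentiment` and `text` columns, which is how the Kaggle release is laid out as far as I know; the `sentiment_counts` helper is hypothetical and not part of this repository.

```python
import csv
from collections import Counter


def sentiment_counts(csv_path):
    """Count tweets per sentiment label in a Tweets.csv-style file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Fail early with a clear message if the expected columns are absent.
        missing = {"airline_sentiment", "text"} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing expected columns: {sorted(missing)}")
        return Counter(row["airline_sentiment"] for row in reader)


# Example (run from the project folder once the dataset is downloaded):
# print(sentiment_counts("Tweets.csv"))
```

The counts it returns are the same numbers a sentiment distribution bar plot would display.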

Results

Accuracy: ~80% on test data.

Visualizations: Bar plot and confusion matrix show sentiment trends and model performance.

Sample Prediction: "I love this airline!" → Positive.
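For readers unfamiliar with these metrics, here is a small standard-library sketch of how accuracy and a confusion matrix are computed from true and predicted labels. The labels below are made up for illustration and are not the project's results; the function names are hypothetical, and in practice scikit-learn's `accuracy_score` and `confusion_matrix` do the same job.

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in labels] for t in labels]


# Made-up labels for illustration only.
y_true = ["negative", "negative", "neutral", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "positive", "negative"]

print(accuracy(y_true, y_pred))          # 0.6
print(confusion_matrix(y_true, y_pred))  # [[1, 1, 0], [0, 1, 0], [1, 0, 1]]
```

Off-diagonal cells show which sentiments get confused with each other, which a single accuracy figure hides.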

Future Improvements

Explore advanced models like BERT or LSTM.

Add real-time Twitter API integration.

Create Tableau dashboards for interactive visualizations.

Contact

GitHub: https://github.com/nic-stack

LinkedIn: https://www.linkedin.com/in/nicolette-mtisi

Email: nicmtisi@gmail.com

By Nicolette, an aspiring data scientist passionate about UX analytics and social impact through data.
