Description
This Python-based NLP project classifies airline tweets as positive, neutral, or negative using Logistic Regression, reaching roughly 75-80% accuracy on test data. Text is preprocessed with NLTK (tokenization, stopword removal, lemmatization) and then vectorized with TF-IDF; Matplotlib and Seaborn visualizations highlight sentiment trends. Built to analyze user feedback, it has applications in UX analytics and social impact, reflecting my passion for data-driven solutions inspired by entrepreneurial experiences.
Features
Dataset: Twitter Airline Sentiment dataset (~14,000 tweets from Kaggle).
Preprocessing: Tokenization, stopword removal, lemmatization, and TF-IDF vectorization.
Model: Logistic Regression with scikit-learn.
Visualizations: Sentiment distribution and confusion matrix.
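As a rough illustration of the preprocessing and model features listed above, the sketch below chains TF-IDF vectorization and Logistic Regression in a scikit-learn Pipeline. It is a minimal sketch, not the code in sentiment_analysis.py: the toy tweets, the `build_pipeline` helper, and the `max_iter` setting are assumptions, and it relies on TfidfVectorizer's built-in tokenizer and English stop-word list instead of the NLTK tokenization and lemmatization steps.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy stand-in data (hypothetical); the real project trains on Tweets.csv.
texts = [
    "I love this airline, great crew",
    "Wonderful flight and friendly staff",
    "My flight was delayed for hours, terrible",
    "Lost my luggage again, awful service",
    "What time does the flight board",
    "Is there wifi on this plane",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]


def build_pipeline():
    """Hypothetical helper: TF-IDF features followed by Logistic Regression."""
    return Pipeline([
        # Lowercases, tokenizes, and drops English stop words before weighting terms.
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        # max_iter is raised as a precaution against convergence warnings (assumption).
        ("clf", LogisticRegression(max_iter=1000)),
    ])


model = build_pipeline()
model.fit(texts, labels)

# Predict the sentiment of unseen text; each output is one of the three labels.
predictions = model.predict(["I love this airline!", "Terrible delay"])
print(list(predictions))
```

Keeping the vectorizer and classifier in one Pipeline means the same TF-IDF vocabulary is applied at training and prediction time, which avoids leaking test data into the features.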
Requirements
Python 3.8+
Libraries: pandas, numpy, nltk, scikit-learn, matplotlib, seaborn
Install: pip install -r requirements.txt
Files
sentiment_analysis.py: Main script for data processing, modeling, and visualization.
sentiment_distribution.png: Bar plot of sentiment distribution.
confusion_matrix.png: Confusion matrix of model performance.
sentiment_analysis.ipynb (optional): Jupyter Notebook version.
The dataset is not included; download it from Kaggle.
Usage
Clone the repository: git clone https://github.com/nic-stack/Twitter-Sentiment-Analysis
Install dependencies: pip install -r requirements.txt
Download Tweets.csv from Kaggle and place it in the project folder.
Run the script: python sentiment_analysis.py
Or open sentiment_analysis.ipynb in Jupyter Notebook.
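Once Tweets.csv is in the project folder, loading it might look like the sketch below. The `load_tweets` helper is hypothetical, and the column names `text` and `airline_sentiment` are assumptions about the Kaggle file's layout; check them against the CSV header before relying on them. An in-memory sample stands in for the real file so the snippet runs without the download.

```python
import io

import pandas as pd


def load_tweets(source):
    """Hypothetical helper: read the tweet text and sentiment label columns.

    `source` may be a file path such as "Tweets.csv" or any file-like object.
    The column names are assumptions about the dataset's layout.
    """
    df = pd.read_csv(source, usecols=["text", "airline_sentiment"])
    # Drop rows that are missing either the tweet or its label.
    df = df.dropna(subset=["text", "airline_sentiment"])
    return df.reset_index(drop=True)


# In-memory stand-in for Tweets.csv (made-up rows, for illustration only).
sample_csv = io.StringIO(
    "tweet_id,airline_sentiment,text\n"
    "1,positive,I love this airline!\n"
    "2,negative,My flight was delayed again\n"
)
tweets = load_tweets(sample_csv)
print(tweets["airline_sentiment"].value_counts())
```

With the real file, the call would be `load_tweets("Tweets.csv")`.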
Results
Accuracy: roughly 75-80% on test data.
Visualizations: The bar plot shows the sentiment distribution, and the confusion matrix shows model performance per class.
Sample Prediction: "I love this airline!" → Positive.
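For readers unfamiliar with how these numbers are produced, the sketch below computes accuracy and a confusion matrix with scikit-learn. The label lists are made up for illustration and are not the project's results; in practice `y_true` and `y_pred` would come from the held-out test split.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Fixed class order so the rows and columns of the matrix are predictable.
class_names = ["negative", "neutral", "positive"]

# Made-up labels for illustration only (not the project's actual output).
y_true = ["negative", "negative", "negative", "neutral",
          "neutral", "positive", "positive", "positive"]
y_pred = ["negative", "negative", "neutral", "neutral",
          "negative", "positive", "positive", "negative"]

# Fraction of predictions that match the true label.
accuracy = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes, in class_names order.
matrix = confusion_matrix(y_true, y_pred, labels=class_names)

print(f"Accuracy: {accuracy:.3f}")
print(matrix)
```

The resulting array can be passed to seaborn.heatmap to draw a plot along the lines of confusion_matrix.png.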
Future Improvements
Explore advanced models such as BERT or LSTM.
Add real-time Twitter API integration.
Create Tableau dashboards for interactive visualizations.
Contact
GitHub: https://github.com/nic-stack
LinkedIn: https://www.linkedin.com/in/nicolette-mtisi
Email: nicmtisi@gmail.com
By Nicolette, an aspiring data scientist passionate about UX analytics and social impact through data.