Sentiment Analysis with BERT

📌 Project Overview

This project implements sentiment analysis using BERT (Bidirectional Encoder Representations from Transformers). The goal is to classify text as positive or negative, using BERT's bidirectional context to capture how the surrounding words shape the meaning of each token.

The pipeline includes:

Data loading and preprocessing with a BERT-specific tokenizer.
Splitting data into training and validation sets.
Fine-tuning a pre-trained BERT model with early stopping to prevent overfitting.
Visualizing performance metrics (accuracy and loss).
Evaluating the model on your own text.
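
The early stopping step can be illustrated with a minimal sketch. This is not the project's implementation: the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the validation losses below are all assumptions made for illustration. The idea is to stop training once the validation loss has failed to improve for a set number of consecutive epochs.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # minimum decrease that counts as an improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, for illustration only (not results from this project).
val_losses = [0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.49]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "with best loss", stopper.best_loss)
```

In a real training loop, the model weights from the best epoch would typically be saved whenever `best_loss` improves, so that the checkpoint kept is the one from before overfitting began.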

⚙️ Installation and Setup

1. Clone the repository

git clone https://github.com/anniemburu/Sentimental-Analysis-with-BERT

2. Create and activate a virtual environment (Anaconda or Miniconda recommended)

conda create -n myenv python=3.9

conda activate myenv

3. Install dependencies

All dependencies are listed in requirements.txt. Install them with:

pip install -r requirements.txt

4. Data setup

The processed dataset is expected at datasets/processed/sentiment_data.csv. The data used in this project was sourced from Kaggle. To use a different dataset, modify data_split in src/data/preprocessing.py.
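
Before training, it can help to confirm that the processed file is where the pipeline expects it. The following is a small sketch using only the standard library; the helper name `preview_dataset` is hypothetical, and nothing is assumed about the CSV's column names, which are simply printed as found.

```python
import csv
from pathlib import Path

# Path taken from this README; resolved relative to the repository root.
PROCESSED_CSV = Path("datasets/processed/sentiment_data.csv")


def preview_dataset(path=PROCESSED_CSV, n_rows=3):
    """Return (header, first n_rows rows) of the CSV, or None if the file is missing."""
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        # zip stops after n_rows, so the rest of the file is never read.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


result = preview_dataset()
if result is None:
    print(f"Dataset not found at {PROCESSED_CSV}; see the Data Source section.")
else:
    header, rows = result
    print("Columns:", header)
    for row in rows:
        print(row)
```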

🚀 Training the Model

Run the training pipeline with:

python3 train.py

or

python3 -m train

This will:

Fine-tune the pre-trained BERT model on the training data.
Validate it on the validation set.
Save the trained model to src/models/sentimental_model/.
Generate training performance plots at src/results/model_performance.png.

You can add extra parameters as defined in src/utils/parser.py.
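
As a rough idea of what such a parser might look like, here is a sketch built on argparse. Only the --autodownload flag appears in this README; the function name `build_parser` and the `--epochs`, `--batch-size`, and `--lr` options, along with their defaults, are hypothetical examples and may not match src/utils/parser.py.

```python
import argparse


def build_parser():
    """Build a command-line parser; only --autodownload is taken from this README."""
    parser = argparse.ArgumentParser(description="Fine-tune BERT for sentiment analysis")
    # Flag mentioned in the Data Source section.
    parser.add_argument("--autodownload", action="store_true",
                        help="download the dataset automatically before training")
    # The options below are hypothetical examples, not confirmed project flags.
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--batch-size", type=int, default=16)
    parser.add_argument("--lr", type=float, default=2e-5)
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--autodownload", "--epochs", "5"])
print(args)
```

Check src/utils/parser.py, or run the training script with --help if it supports it, for the options the project actually accepts.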

🚀 Evaluate the Model

You can test the model on your own text by running:

python3 evaluate.py

or

python3 -m evaluate
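
For intuition about the final classification step, the sketch below turns a pair of class scores (logits) into a label and a confidence value. It does not load the model: the logits are made up, the helper names are hypothetical, and the label order (index 0 = negative, index 1 = positive) is an assumption that may differ from the project's mapping.

```python
import math

# Assumed label order; the project's actual mapping may differ.
LABELS = ("negative", "positive")


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return (label, confidence) for a pair of class logits."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
    return LABELS[idx], probs[idx]


# Hypothetical logits, standing in for the model's output on one input text.
label, confidence = predict_label([-1.2, 2.3])
print(label, round(confidence, 3))
```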

📊 Data Source

The data is sourced from Kaggle. You can download it manually, or have the training script download it automatically:

python3 -m train --autodownload

Preprocessing: the data is tokenized, padded to a fixed sequence length, and split into training and testing sets.
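
The padding and splitting steps can be sketched in plain Python. This is an illustration under stated assumptions, not the code in src/data/preprocessing.py: the function names, the 80/20 split, the seed, and the token ids are hypothetical, and a pad id of 0 matches the standard bert-base vocabularies but should be read from the tokenizer in practice.

```python
import random

PAD_ID = 0  # [PAD] id in the standard bert-base vocabularies; an assumption here


def pad_or_truncate(token_ids, max_len, pad_id=PAD_ID):
    """Return (input_ids, attention_mask), both exactly max_len long.

    Truncation here simply cuts the tail; a real BERT tokenizer also keeps
    the closing [SEP] token when it truncates.
    """
    ids = list(token_ids)[:max_len]
    mask = [1] * len(ids)             # 1 marks real tokens
    n_pad = max_len - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad  # 0 marks padding


def split_indices(n, test_fraction=0.2, seed=42):
    """Shuffle row indices reproducibly and return (train_indices, test_indices)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_fraction)
    return idx[n_test:], idx[:n_test]


# Hypothetical token ids for one short sentence.
input_ids, attention_mask = pad_or_truncate([101, 7, 8, 102], max_len=6)
train_idx, test_idx = split_indices(10)
print(input_ids, attention_mask)
print(len(train_idx), "training rows,", len(test_idx), "testing rows")
```

The attention mask lets the model ignore padded positions, which is why it is produced alongside the padded ids.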

📂 Project Structure

├── config
│   └── vars.yml
├── datasets/
│   ├── processed/
│   │   └── sentiment_data.csv
│   └── raw/
│       └── sentimentdataset.csv
├── src/
│   ├── data/
│   │   ├── preprocessing.py
│   │   └── data_loader.py
│   ├── models/
│   │   └── sentimental_model/
│   ├── results/
│   │   └── model_performance.png
│   └── utils/
│       └── parser.py
├── evaluate.py
├── train.py
├── requirements.txt
└── README.md

🔎 Findings & Results

TBA
