This project demonstrates the use of machine learning algorithms to classify Iris flower species based on their physical characteristics. It uses the classic Iris dataset and compares two classification models: K-Nearest Neighbors (KNN) and Gaussian Naive Bayes (NB).
- Dataset Used: Iris dataset (iris.csv)
- Features: Sepal Length, Sepal Width, Petal Length, Petal Width
- Target Variable: Species (Iris-setosa, Iris-versicolor, Iris-virginica)
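
The layout described above can be loaded roughly as follows. This is a minimal sketch, not the script's actual code: the column names, the `load_iris_frame` helper, and the assumption that the CSV has no header row are all hypothetical, and a few inline rows stand in for the real file.

```python
import io

import pandas as pd

# Hypothetical column names; the header in the real iris.csv may differ.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# A few rows in the layout described above, standing in for the real file.
SAMPLE_CSV = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def load_iris_frame(source):
    """Read a headerless Iris CSV (path or file-like object) into a DataFrame."""
    return pd.read_csv(source, header=None, names=COLUMNS)


df = load_iris_frame(io.StringIO(SAMPLE_CSV))
X = df[COLUMNS[:-1]]  # the four numeric features
y = df["species"]     # the target labels

print(df.shape)
print(sorted(y.unique()))
```

To use the repository's dataset instead, pass its path in place of the `io.StringIO` object, and drop `header=None` if the file already has a header row.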
.
├── iris (1).csv # Dataset file
├── iris_classification.py # Python script for analysis and modeling
├── README.md # Project description
- Data Exploration
  - Viewing head, tail, summary statistics
  - Checking for null values
- Data Visualization
  - Box plots
  - Histograms
  - Scatter matrix
- Model Training and Evaluation
  - Data splitting (train-test split)
  - Cross-validation
  - Model comparison (KNN, Naive Bayes)
- Final Model Validation
  - Accuracy metrics
  - Confusion matrix
  - Classification report
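
The exploration, training, and validation steps listed above can be sketched as follows; plotting is left out to keep the example short. This is an illustration under stated assumptions rather than the contents of `iris_classification.py`: it uses scikit-learn's bundled copy of the Iris data instead of the CSV, and the 80/20 split, the 10 folds, the random seeds, and the default hyperparameters are all choices made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: scikit-learn's bundled Iris copy, not the repository's CSV.
iris = load_iris(as_frame=True)
df = iris.frame  # four feature columns plus an integer "target" column

# Data exploration: first rows, summary statistics, null check.
print(df.head())
print(df.describe())
null_count = int(df.isnull().sum().sum())
print("null values:", null_count)

X, y = iris.data, iris.target

# Hold out a validation set (80/20 split and seed are assumptions).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Compare both models with stratified 10-fold cross-validation on the training set.
models = {"KNN": KNeighborsClassifier(), "NB": GaussianNB()}
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
cv_results = {}
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=kfold, scoring="accuracy")
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")

# Final validation: fit KNN on the training set, score it on the held-out set.
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
predictions = knn.predict(X_val)

accuracy = accuracy_score(y_val, predictions)
matrix = confusion_matrix(y_val, predictions)
report = classification_report(y_val, predictions, target_names=iris.target_names)

print("validation accuracy:", accuracy)
print(matrix)
print(report)
```

Cross-validating on the training portion only keeps the held-out set untouched until the final check, so the validation accuracy is not influenced by model selection.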
- pandas
- matplotlib
- scikit-learn
- Clone the repository:
    git clone https://github.com/paviabera/Classifiers
- Install dependencies (if needed):
    pip install pandas matplotlib scikit-learn
- Execute the script:
    python iris_classification.py

The K-Nearest Neighbors model achieved high accuracy in predicting Iris species on this dataset, which makes it a reasonable choice for this classification task.
- Pavia Bera
- Contact: paviabera@usf.edu
This project is open-source under the MIT License.