BrewModel: Hops Flavor & Aroma Profiler

Image Source: VinePair

Tomer D. & Romith C.

Report Bug · Request Feature

Table of Contents

- About The Project
- Getting Started
  - Prerequisites
  - Repository Cloning
  - Environment Setup
- Usage
  - Scraper
  - Cleaner
  - EDA
  - Classifier
  - Analyzer
- Acknowledgments

About The Project

Hops are primarily used as bittering, flavoring, and stabilizing agents in beer. In recent years, hops have become the center of attention in the beer industry, as hop-forward beers have become one of the most popular styles. Hop varieties are developed and grown in moderate climates around the world, and every branded variety has a unique flavor and aroma profile. This makes for an exciting and delicious reason to explore what flavor and aroma a hop can offer on its own, or together with other hops to build layers of flavor and aroma.

The purpose of this project is to build a processed dataset to explore some definitive hop characteristics, draw initial insights into geographical relationships between hops, and lay the groundwork for further model-building in future studies. The first step was to compile a comprehensive dataset of these characteristics, with both numeric brew values and an aroma profile for each hop. This was achieved by scraping BeerMaverick's database of a diverse set of 300+ hops from around the world. The raw data was thoroughly processed for exploratory studies and feature-engineered to create insightful visualizations and prepare for initial model-building. Preliminary models were then built with supervised tree-based ensemble methods (XGBoost & Random Forest) for a deeper look into classification techniques for beer hops.

(back to top)

Getting Started

This section walks through the steps to download a local copy of the project and reproduce the findings.

Prerequisites

Python 3.X and a means of opening iPython notebook (.ipynb) files are assumed. The links below walk through the steps needed to fulfill these prerequisites.

Windows:
https://medium.com/@kswalawage/install-python-and-jupyter-notebook-to-windows-10-64-bit-66db782e1d02
MacOS:
https://docs.python-guide.org/starting/install3/osx/
https://medium.com/@blessedmarcel1/how-to-install-jupyter-notebook-on-mac-using-homebrew-528c39fd530f

Repository Cloning

In a terminal, navigate to the directory where the repo should be saved.
```
cd c:\path\to\directory
```

Clone the repo.
```
git clone https://github.com/rc-9/tools1_project.git
```

Environment Setup

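Install virtual environment capability. This step is optional on Python 3.3+, where the built-in venv module used in the next step is already available.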
```
pip install virtualenv
```
Navigate to the directory of the cloned repo and create a virtual environment for the project.
```
python -m venv c:\path\to\directory\venvname
```
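Activate the virtual environment for the project, using the environment name chosen in the previous step (Windows path shown; on macOS, run `source venvname/bin/activate` instead).
```
.\venvname\Scripts\activate
```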
Install necessary modules from the provided requirements file.
```
pip install -r requirements.txt
```
Launch Jupyter from within the virtual environment to view and execute the code blocks, or run the scripts directly in the terminal to produce the output files.
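
To confirm that the interpreter behind a notebook or terminal session really is the project's virtual environment, a quick check along the following lines may help. This is a minimal sketch, not part of the repository: the function name `in_virtual_environment` is hypothetical, and the check assumes an environment created with `python -m venv`.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
    print("Interpreter:", sys.executable)
```

Running this in a notebook cell shows which interpreter the kernel uses, which is a common source of "module not found" errors after `pip install -r requirements.txt`.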

(back to top)

Usage

This section outlines the order in which to execute the iPython notebooks to retrieve and clean the necessary data, generate visuals, perform analysis, and build basic classification models.

Execute step1_scraper.ipynb to collect the data from BeerMaverick's Hops Database. The notebook takes 40+ minutes to run in full: the database contains over 300 hops, each with its own webpage of detailed info, so the scraper pauses between requests to finish without triggering automatic IP blocks. This step is optional and can be skipped to avoid the long run time, as the raw output is already provided in the repository's raw_data directory.
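
The wait-time idea can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's actual code: the names `scrape_with_delay` and `fetch_with_urllib`, the 5-second default, and the use of `urllib` are all hypothetical choices. The fetcher and the sleep function are passed in as parameters so the pacing logic can be tried without touching the network.

```python
import time
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def scrape_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 5.0,  # hypothetical default, not the notebook's value
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch each URL in turn, pausing between consecutive requests."""
    pages: Dict[str, str] = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        pages[url] = fetch(url)
    return pages


def fetch_with_urllib(url: str) -> str:
    """Download one page with the standard library (one possible fetcher)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


# Example call (performs real requests, so it is left commented out):
# pages = scrape_with_delay(list_of_hop_urls, fetch_with_urllib, delay_seconds=5.0)
```

As a rough sanity check on the run time, 40 minutes spread over 300 pages works out to about 8 seconds per page, including both the pause and the download itself.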

The raw data is stored in the raw_data directory, which contains the following CSV files:
- raw_hops_main.csv: primary data file to be used for cleaning & analysis
- raw_ref_aroma_types.csv: reference document for metadata info on aroma types
- raw_ref_brew_values.csv: reference document for metadata info on brew values
- raw_ref_hops_substitutions.csv: reference document for metadata info on pre-determined hop substitutions

Execute step2_cleaner.ipynb to wrangle and feature-engineer the raw data files from raw_data and store cleaned data into clean_data.
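
As an illustration of the kind of wrangling involved, the sketch below converts a textual value range into numeric fields. It assumes, without confirmation from the notebook, that some raw brew values are stored as strings such as `5.5-8.5%`; the function name `parse_range` and the derived midpoint feature are hypothetical.

```python
import re
from typing import Optional, Tuple

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_range(text: str) -> Optional[Tuple[float, float, float]]:
    """Turn a string such as '5.5-8.5%' into (low, high, midpoint)."""
    # Keep only the first two numbers so trailing units like 'mL/100g'
    # are not mistaken for part of the range.
    numbers = [float(match) for match in _NUMBER.findall(text)][:2]
    if not numbers:
        return None  # nothing numeric to recover, e.g. 'Unknown'
    low, high = numbers[0], numbers[-1]
    return low, high, (low + high) / 2


print(parse_range("5.5-8.5%"))  # (5.5, 8.5, 7.0)
print(parse_range("12%"))       # (12.0, 12.0, 12.0)
print(parse_range("Unknown"))   # None
```

Returning `None` for unparseable entries keeps missing values explicit, so they can be dropped or imputed in a later step instead of silently becoming zeros.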

The clean_data directory consists of the following csv files:
- cln_hops_brewvalues.csv: processed numerical data for various brew values for each hop
- cln_hops_profile.csv: processed categorical & boolean data of country, purpose, aroma info for each hop
- cln_ref_aroma_types.csv: processed reference document for metadata info on aroma types
- cln_ref_brew_values.csv: processed reference document for metadata info on brew values
- cln_ref_hops_substitutions.csv: processed reference document for metadata info on pre-determined hop substitutions

Execute eda_and_summary_visuals.ipynb, which uses the processed data files from clean_data to perform exploratory analysis, develop insights into the dataset, and produce summary visualizations of the data.
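
A first exploratory step of this kind might be a simple tally of hops per country. The sketch below is an assumption-laden illustration, not the notebook's code: the helper `count_by_column` is hypothetical, the column name `country` is a guess at how cln_hops_profile.csv is laid out, and the three sample rows are made up for demonstration.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_by_column(csv_file: TextIO, column: str) -> Counter:
    """Count how often each value appears in one column of a CSV file."""
    reader = csv.DictReader(csv_file)
    # Rows with a missing or empty value in the column are skipped.
    return Counter(row[column].strip() for row in reader if row.get(column))


# Illustrative in-memory data; a real run would open the cleaned profile
# file instead, e.g. open("clean_data/cln_hops_profile.csv", newline="").
sample = io.StringIO("hop,country\nCitra,USA\nMosaic,USA\nSaaz,Czech Republic\n")
counts = count_by_column(sample, "country")
print(counts.most_common())
```

The resulting counts can feed directly into a bar chart, which is one way to spot how unevenly the hop varieties are distributed across regions before any modeling.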

Running the notebook generates the images directory, which holds all the PNG output files it produces.

Execute region_classifier.ipynb to build two tree-based ensemble models, using the XGBoost and Random Forest classification algorithms, that classify hop region.

In this notebook, we attempt to classify geographical regions based on various hop characteristics. The processed data from clean_data undergoes further feature engineering before it is fed into the models. The notebook is self-contained and includes the confusion matrix resulting from the model predictions.
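
For readers unfamiliar with confusion matrices, the sketch below shows how one is tallied from true and predicted labels. It is a plain-Python illustration, not the notebook's code, and the region names and predictions are hypothetical. Rows correspond to actual classes and columns to predicted classes.

```python
from collections import Counter
from typing import List, Sequence


def confusion_matrix(
    actual: Sequence[str], predicted: Sequence[str], labels: Sequence[str]
) -> List[List[int]]:
    """Build a matrix where cell [i][j] counts actual label i predicted as j."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pair_counts = Counter(zip(actual, predicted))
    return [[pair_counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix: List[List[int]]) -> float:
    """Share of predictions on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels and predictions, for illustration only.
labels = ["Americas", "Europe", "Oceania"]
actual = ["Americas", "Europe", "Europe", "Oceania"]
predicted = ["Americas", "Europe", "Americas", "Oceania"]

matrix = confusion_matrix(actual, predicted, labels)
print(matrix)            # [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
print(accuracy(matrix))  # 0.75
```

Off-diagonal cells show which regions the model confuses with one another; here one European hop is mistaken for an American one.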

Additionally, a tool was built that takes various hop characteristics as input from the user and runs the classifier model to predict the region to which the hop belongs.

Execute purpose_classifier.ipynb to build two tree-based ensemble models, using the XGBoost and Random Forest classification algorithms, that classify hop purpose.

In this notebook, we attempt to classify hop purpose (Dual vs Aroma vs Bittering) based on various hop characteristics. The processed data from clean_data undergoes further feature engineering before it is fed into the models. The notebook is self-contained and includes the confusion matrix resulting from the model predictions.

Additionally, a tool was built that takes various hop characteristics as input from the user and runs the classifier model to predict the purpose of the hop.

(back to top)

Acknowledgments

Data source: https://beermaverick.com/hops/

(back to top)
