This repository contains a small collection of self-contained Python examples that demonstrate core probabilistic models using the pomegranate library (version 1.1.2):
- A Markov chain over weather states (sun/rain)
- A Hidden Markov Model (HMM) for inferring hidden weather from umbrella observations
- A Bayesian network for reasoning about rain, track maintenance, train delays, and missed appointments
The code is written as minimal, readable scripts intended for teaching and experimentation, and has been migrated to the newer pomegranate 1.1.x API with integer-encoded categorical variables.
Working through this project you can learn how to:
- Represent simple discrete probabilistic models (Markov chains, HMMs, Bayesian networks)
- Encode categorical variables as integer indices compatible with pomegranate 1.1.x
- Compute likelihoods and perform inference with pomegranate
- Sample from probabilistic models and interpret the results in human-readable form
This project targets Python 3.11+ (the examples were developed against a recent CPython and pomegranate 1.1.2).
Core dependencies (see requirements.txt):
```
pomegranate==1.1.2
torch
numpy==2.3.5
```
To set up a virtual environment and install dependencies:
```
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Note: pomegranate 1.1.x uses PyTorch under the hood, so installing `torch` first (or via `requirements.txt`) is required.
```
.
├── bayesnet/             # Bayesian network model and related examples
│   ├── model.py          # Defines the BN structure and conditional distributions
│   ├── likelihood.py     # Computes likelihood of an observation under the BN
│   ├── inference.py      # Performs inference with partial evidence using MaskedTensor
│   └── sample.py         # Draws samples and performs simple rejection sampling
├── chain/                # Markov chain (non-hidden) example
│   └── model.py          # Simple 2-state weather Markov chain and sampling
├── hmm/                  # Hidden Markov Model example
│   ├── model.py          # Defines a 2-state HMM and emission/transition structure
│   └── sequence.py       # Encodes observations and decodes most likely state sequence
├── requirements.txt
└── lecture.pdf           # Slide deck or notes (not required to run the code)
```
All scripts are designed to be run directly from the repository root with Python.
A simple 2-state Markov chain over weather states (sun/rain):
```
python -m chain.model
```

This will:

- Construct a Markov chain with start distribution `P(X_0)` and transition matrix `P(X_t | X_{t-1})`
- Sample 50 states from the chain and print them as integer indices (0 for `"sun"`, 1 for `"rain"`)

You can map indices back to labels using `chain.model.STATES` if you modify or extend the example.
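For intuition, the sampling loop behind this script can be sketched in plain Python. The probabilities below are illustrative placeholders, not the values defined in `chain/model.py`:

```python
import random

STATES = ["sun", "rain"]

# Illustrative start distribution P(X_0) and transition matrix P(X_t | X_{t-1});
# rows are indexed by the previous state, columns by the next state.
start = [0.5, 0.5]
transitions = [
    [0.8, 0.2],  # from "sun"
    [0.3, 0.7],  # from "rain"
]

def sample_chain(n, rng=random):
    """Sample n integer-encoded states from the Markov chain."""
    states = [rng.choices([0, 1], weights=start)[0]]
    while len(states) < n:
        prev = states[-1]
        states.append(rng.choices([0, 1], weights=transitions[prev])[0])
    return states

sample = sample_chain(50)
print([STATES[i] for i in sample])
```

The key Markov property is visible in the loop: each new state is drawn from a row of the transition matrix selected only by the previous state.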
Hidden Markov Model example
A 2-state HMM where the hidden weather (sun/rain) generates observable umbrella usage:
```
python -m hmm.sequence
```

This will:

- Define a fixed observed sequence of umbrella usage (strings)
- Encode observations to integer categories using `hmm.model.OBSERVATION_STATES`
- Use a `DenseHMM` to predict the most likely sequence of hidden weather states
- Decode and print human-readable weather labels using `hmm.model.WEATHER_STATES`
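Under the hood, decoding the most likely hidden sequence is Viterbi-style dynamic programming. A minimal pure-Python sketch of the idea, using illustrative probabilities rather than the values in `hmm/model.py`:

```python
WEATHER_STATES = ["sun", "rain"]
OBSERVATION_STATES = ["no umbrella", "umbrella"]

# Illustrative parameters: start, transition, and emission probabilities.
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.3, 0.7]]  # trans[prev][next]
emit = [[0.9, 0.1], [0.2, 0.8]]   # emit[state][observation]

def viterbi(obs):
    """Return the most likely hidden state sequence for integer-encoded obs."""
    # best[s] = probability of the best path ending in hidden state s
    best = [start[s] * emit[s][obs[0]] for s in range(2)]
    back = []
    for o in obs[1:]:
        prev_best, step, best = best, [], []
        for s in range(2):
            p, arg = max((prev_best[r] * trans[r][s], r) for r in range(2))
            best.append(p * emit[s][o])
            step.append(arg)
        back.append(step)
    # Backtrack from the best final state.
    path = [max(range(2), key=lambda s: best[s])]
    for step in reversed(back):
        path.append(step[path[-1]])
    return path[::-1]

obs = [OBSERVATION_STATES.index(o) for o in
       ["umbrella", "umbrella", "no umbrella"]]
print([WEATHER_STATES[s] for s in viterbi(obs)])
```

`DenseHMM.predict` performs the same computation (in log space, over batches) without you writing the recursion by hand.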
The Bayesian network models the relationships:
`rain → maintenance → train → appointment`, with an additional edge `rain → train`.

All variables are discrete and integer-encoded:

```
RAIN_STATES = ["none", "light", "heavy"]
MAINTENANCE_STATES = ["yes", "no"]
TRAIN_STATES = ["on time", "delayed"]
APPOINTMENT_STATES = ["attend", "miss"]
```
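Encoding and decoding are just index lookups into these lists. For example:

```python
RAIN_STATES = ["none", "light", "heavy"]
MAINTENANCE_STATES = ["yes", "no"]
TRAIN_STATES = ["on time", "delayed"]
APPOINTMENT_STATES = ["attend", "miss"]

STATE_LISTS = [RAIN_STATES, MAINTENANCE_STATES, TRAIN_STATES, APPOINTMENT_STATES]

def encode(observation):
    """Map string labels to the integer indices pomegranate expects."""
    return [states.index(label) for states, label in zip(STATE_LISTS, observation)]

def decode(indices):
    """Map integer indices back to human-readable labels."""
    return [states[i] for states, i in zip(STATE_LISTS, indices)]

print(encode(["none", "no", "on time", "attend"]))  # [0, 1, 0, 0]
```

(`encode`/`decode` are hypothetical helper names for illustration; the repository scripts inline this logic.)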
Model definition
```
python -m bayesnet.model
```

`bayesnet/model.py` primarily defines the BN and does not print output by itself, but it is imported by the other scripts.
Likelihood of an observation
```
python -m bayesnet.likelihood
```

This script:

- Encodes an observation `["none", "no", "on time", "attend"]` into integer indices
- Computes `P(rain=none, maintenance=no, train=on time, appointment=attend)` using `model.probability`
- Prints the scalar probability value
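What `model.probability` returns is the chain-rule product of the relevant CPT entries. A sketch with made-up numbers (the real tables live in `bayesnet/model.py`):

```python
# Hypothetical CPT entries for the observation
# (rain=none, maintenance=no, train=on time, appointment=attend).
p_rain_none = 0.7                    # P(rain=none)
p_maint_no_given_rain = 0.6          # P(maintenance=no | rain=none)
p_train_on_time_given_parents = 0.9  # P(train=on time | rain=none, maintenance=no)
p_attend_given_train = 0.9           # P(appointment=attend | train=on time)

joint = (p_rain_none
         * p_maint_no_given_rain
         * p_train_on_time_given_parents
         * p_attend_given_train)
print(joint)  # the joint probability of the full assignment
```

Each factor conditions only on a variable's parents in the DAG, which is exactly what makes the factorization compact.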
Inference with partial evidence
```
python -m bayesnet.inference
```

This script:

- Builds a `torch.masked.MaskedTensor` representing a single sample
- Observes `train = "delayed"` and leaves other variables unobserved
- Calls `model.predict_proba` to get marginal distributions for each variable
- Prints a readable distribution over the states of `rain`, `maintenance`, `train`, and `appointment`
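Conceptually, `predict_proba` with partial evidence computes posteriors such as P(appointment | train = delayed) by summing out the unobserved variables. A brute-force enumeration sketch over an illustrative `rain → train → appointment` slice (the numbers are invented, not the repository's CPTs):

```python
from itertools import product

# Illustrative CPTs for a rain -> train -> appointment slice of the network.
p_rain = [0.7, 0.2, 0.1]                  # none, light, heavy
p_train_given_rain = [[0.9, 0.1],
                      [0.6, 0.4],
                      [0.3, 0.7]]         # on time, delayed
p_appt_given_train = [[0.9, 0.1],
                      [0.6, 0.4]]         # attend, miss

def posterior(evidence_train):
    """P(appointment | train=evidence_train), summing out rain and normalizing."""
    weights = [0.0, 0.0]
    for rain, appt in product(range(3), range(2)):
        weights[appt] += (p_rain[rain]
                          * p_train_given_rain[rain][evidence_train]
                          * p_appt_given_train[evidence_train][appt])
    total = sum(weights)
    return [w / total for w in weights]

print(posterior(1))  # distribution over ["attend", "miss"] given train = "delayed"
```

Exact enumeration is exponential in the number of unobserved variables, which is why the masked-tensor API exists: it lets the library run efficient message passing instead.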
Sampling and rejection sampling
```
python -m bayesnet.sample
```

This script:

- Draws many samples from the BN using `model.sample`
- Decodes integer samples back to string labels
- Uses rejection sampling to approximate the distribution of `appointment` given that `train` is `"delayed"`
- Prints the resulting counts as a `collections.Counter`
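The rejection-sampling step amounts to: draw full assignments, discard those that contradict the evidence, and count the query variable among the survivors. A self-contained sketch with illustrative probabilities (not the repository's CPTs):

```python
import random
from collections import Counter

TRAIN_STATES = ["on time", "delayed"]
APPOINTMENT_STATES = ["attend", "miss"]

# Illustrative distributions: P(train) and P(appointment | train).
p_train = [0.8, 0.2]
p_appt_given_train = [[0.9, 0.1], [0.6, 0.4]]

def rejection_sample(n, evidence_train=1, rng=random):
    """Approximate P(appointment | train=evidence_train) by rejection sampling."""
    counts = Counter()
    for _ in range(n):
        train = rng.choices([0, 1], weights=p_train)[0]
        if train != evidence_train:
            continue  # reject samples that contradict the evidence
        appt = rng.choices([0, 1], weights=p_appt_given_train[train])[0]
        counts[APPOINTMENT_STATES[appt]] += 1
    return counts

print(rejection_sample(10_000))
```

Note the inefficiency this illustrates: when the evidence is unlikely (here, `train = "delayed"` has probability 0.2), most samples are thrown away.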
- Markov chain (in `chain/`): models a sequence of states where the next state depends only on the current state via a transition matrix.
- Hidden Markov Model (HMM) (in `hmm/`): models a sequence of hidden states (weather) that emit observable symbols (umbrella usage). Inference recovers the most likely hidden sequence.
- Bayesian network (in `bayesnet/`): a directed acyclic graph over discrete variables that supports computing joint probabilities, conditional probabilities, and posterior distributions given evidence.
All examples use integer-encoded categorical variables to match the pomegranate 1.1.x API.
These examples were migrated from older pomegranate APIs to pomegranate 1.1.2 and include several important patterns:
- Use `Categorical` and `ConditionalCategorical` from `pomegranate.distributions` instead of the older `DiscreteDistribution` and `ConditionalProbabilityTable` style.
- Use `MarkovChain` from `pomegranate.markov_chain` and `BayesianNetwork` from `pomegranate.bayesian_network`.
- Use `DenseHMM` from `pomegranate.hmm` for hidden Markov models.
- Represent observations and states as integer indices, not strings, when calling `probability`, `sample`, `predict`, and `predict_proba`.
- For Bayesian network inference with partially observed data, use `torch.masked.MaskedTensor` as demonstrated in `bayesnet/inference.py`.
If you run into API or shape errors, check that:
- You are using `pomegranate==1.1.2` (or a compatible 1.1.x release)
- Your inputs are integer-encoded with shapes matching what the models expect (see the example scripts)
No explicit license is included in this repository. If you intend to reuse or redistribute this code beyond personal or educational purposes, please add an appropriate license file (for example, MIT, BSD, or Apache-2.0) according to your needs.