High-performance probabilistic data structures for Elixir
Fate provides concurrent, space-efficient implementations of probabilistic data structures backed by :atomics for thread-safe operations. Perfect for membership testing, cardinality estimation, and frequency counting at scale.
- Features
- Installation
- Quick Start
- Data Structures
- Performance
- When to Use What
- Documentation
- Testing
- Contributing
- License
- References
- 🚀 High Performance: Optimized implementations matching or exceeding Erlang reference libraries
- 🔒 Concurrent: Thread-safe operations using
:atomicswith lock-free reads - 🎯 Pluggable Hashing: Support for multiple hash functions (xxHash, Murmur3, XXH3, FNV1a)
- 📊 Accurate Cardinality: HyperLogLog delivers lock-free distinct counting at sub-millisecond speeds
- 📦 Zero Dependencies: Core functionality requires no external dependencies
- 🧪 Well Tested: Comprehensive test coverage (~87%) with 61+ tests
- 🔄 Production Ready: Battle-tested algorithms with comprehensive error handling
Add fate to your list of dependencies in mix.exs:
def deps do
[
{:fate, "~> 0.1.0"}
]
endThen run mix deps.get to install.
For better performance, install one or more hash function libraries:
def deps do
[
{:fate, "~> 0.1.0"},
# Choose one or more:
{:xxh3, "~> 0.3"}, # Fastest for most workloads
{:xxhash, "~> 0.3"}, # Fast general-purpose hash
{:murmur, "~> 2.0"} # Good distribution
]
endFate will automatically select the best available hash function, or you can specify one explicitly.
# Bloom Filter - Fast membership testing
bloom = Fate.Filter.Bloom.new(1000, false_positive_probability: 0.01)
Fate.Filter.Bloom.put(bloom, "user:123")
Fate.Filter.Bloom.member?(bloom, "user:123") # => true
# Cuckoo Filter - Membership testing with deletion
cuckoo = Fate.Filter.Cuckoo.new(1000)
Fate.Filter.Cuckoo.put(cuckoo, "session:abc")
Fate.Filter.Cuckoo.delete(cuckoo, "session:abc") # => :ok
Fate.Filter.Cuckoo.member?(cuckoo, "session:abc") # => falseSpace-efficient probabilistic set membership testing with configurable false-positive rates.
Key Features:
- Configurable false-positive probability
- Cardinality estimation
- Serialization/deserialization
- Set operations (merge, intersection)
- No deletion support
Example:
alias Fate.Filter.Bloom
# Create a filter for ~1000 items with 1% false positive rate
bloom = Bloom.new(1000, false_positive_probability: 0.01)
# Insert items
Bloom.put(bloom, "user:123")
Bloom.put(bloom, "user:456")
# Check membership
Bloom.member?(bloom, "user:123") # => true
Bloom.member?(bloom, "user:789") # => false (probably)
# Get statistics
Bloom.cardinality(bloom) # => ~2
Bloom.false_positive_probability(bloom) # => ~0.01
Bloom.bits_info(bloom) # => %{total_bits: ..., set_bits_count: ..., set_ratio: ...}
# Serialize for storage
binary = Bloom.serialize(bloom)
restored = Bloom.deserialize(binary)
# Merge multiple filters
merged = Bloom.merge([bloom1, bloom2, bloom3])
intersected = Bloom.intersection([bloom1, bloom2])Performance: ~2x faster than existing Elixir implementations
Compact filter with deletion support, making it more versatile than Bloom filters.
Key Features:
- Item deletion support (unlike Bloom filters)
- Dynamic insertion with bounded relocation
- Exact item count tracking
- Serialization/deserialization
- Set operations (merge, intersection)
- Statistics and analytics
Example:
alias Fate.Filter.Cuckoo
# Create a filter for ~1000 items
cuckoo = Cuckoo.new(1000)
# Insert items
:ok = Cuckoo.put(cuckoo, "session:abc")
:ok = Cuckoo.put(cuckoo, "session:def")
# Check membership
Cuckoo.member?(cuckoo, "session:abc") # => true
Cuckoo.member?(cuckoo, "session:xyz") # => false
# Delete items (unique to Cuckoo filters!)
:ok = Cuckoo.delete(cuckoo, "session:abc")
Cuckoo.member?(cuckoo, "session:abc") # => false
# Check capacity and load
Cuckoo.size(cuckoo) # => 1
Cuckoo.capacity(cuckoo) # => 1000
Cuckoo.load_factor(cuckoo) # => 0.0002...
# Get statistics
Cuckoo.bits_info(cuckoo) # => %{total_slots: ..., occupied_slots: ..., ...}
Cuckoo.cardinality(cuckoo) # => 1 (same as size for Cuckoo)
Cuckoo.false_positive_probability(cuckoo) # => ~0.0001
# Serialize for storage
binary = Cuckoo.serialize(cuckoo)
restored = Cuckoo.deserialize(binary)
# Merge multiple filters
merged = Cuckoo.merge([cuckoo1, cuckoo2, cuckoo3])
intersected = Cuckoo.intersection([cuckoo1, cuckoo2])
# Handle full filter
case Cuckoo.put(cuckoo, item) do
:ok -> :inserted
{:error, :full} -> :filter_full
endPerformance: On par with Erlang reference implementation when using the same hash function
Approximate distinct counter with configurable precision and lock-free updates, ideal for large-scale analytics.
Key Features:
- Precision range 4–18 with <1% typical relative error at default settings
- Lock-free concurrent updates backed by
:atomics - Supports
add_hashed/2for workloads that already have 64-bit hashes - Mergeable sketches with serialization support
Example:
alias Fate.Cardinality.HyperLogLog
# Create a sketch with default precision (14)
hll = HyperLogLog.new()
# Add raw items using the selected hash module
Enum.each(1..1_000, &HyperLogLog.add(hll, &1))
# Or skip hashing if you already have 64-bit hashes
Enum.each(1..1_000, fn value ->
hash = :erlang.phash2({value, 0})
HyperLogLog.add_hashed(hll, hash)
end)
# Estimate distinct count
HyperLogLog.cardinality(hll) # => ~1000
# Merge sketches
merged = HyperLogLog.merge([hll, HyperLogLog.new()])Performance: Outruns Erlang HLL and Elixir Hypex when fed pre-hashed values (see Performance)
# Specify a hash function explicitly
bloom = Fate.Filter.Bloom.new(1000, hash_module: Fate.Hash.XXH3)
cuckoo = Fate.Filter.Cuckoo.new(1000, hash_module: Fate.Hash.Murmur3)
# Or let Fate choose the best available
hash_module = Fate.Hash.module() # Auto-selects best available# Bloom filter with custom parameters
bloom = Fate.Filter.Bloom.new(10_000,
false_positive_probability: 0.001, # 0.1% FPP
hash_count: 10, # Override optimal k
hash_module: Fate.Hash.XXH3
)
# Cuckoo filter with custom parameters
cuckoo = Fate.Filter.Cuckoo.new(10_000,
bucket_size: 4, # Slots per bucket (default: 4)
fingerprint_bits: 16, # Bits per fingerprint (default: 16)
load_factor: 0.95, # Target load before failures (default: 0.95)
max_kicks: 100, # Max relocation attempts (default: 100)
hash_module: Fate.Hash.Default
)Benchmarks on Intel Xeon E5-2695 v4 @ 2.10GHz:
# Insert 10k items
Fate.Bloom put/2 12.82 ips (78.00 ms)
Talan.BloomFilter put/2 5.74 ips (174.21 ms) - 2.23x slower
# Lookup 10k items
Fate.Bloom member?/2 12.89 ips (77.59 ms)
Talan.BloomFilter 6.09 ips (164.29 ms) - 2.12x slower
# Insert 10k items (with phash2)
Fate.Cuckoo put/2 ~50 ips (~20 ms)
:cuckoo_filter.add/2 ~55 ips (~18 ms) - 1.1x faster
# Lookup 10k items (with phash2)
Fate.Cuckoo member?/2 121 ips (8.24 ms)
:cuckoo_filter.contains 133 ips (7.53 ms) - 1.09x faster
Note: Performance is on par when using the same hash function. Differences are within measurement variance.
Run benchmarks locally:
mix run bench/bloom_cmp.exs
mix run bench/cuckoo_cmp.exs# Insert 1M hashed values (phash2)
Fate.HyperLogLog add_hashed/2 5.22 ips (191.75 ms)
HLL.add/2 3.19 ips (313.88 ms) - 1.64x slower
Hypex.update/2 2.68 ips (373.43 ms) - 1.95x slower
# Insert 1k raw values
Fate.HyperLogLog add/2 1.92 K ips (520.71 µs)
HLL.add/2 2.86 K ips (349.85 µs) - 1.49x faster
Hypex.update/2 1.15 K ips (872.18 µs) - 1.67x slower
# Cardinality (1M values)
Fate.HyperLogLog cardinality 1.24 K ips (805.81 µs)
HLL.cardinality 0.72 K ips (1397.38 µs) - 1.73x slower
Hypex.cardinality 1.15 K ips (867.12 µs) - 1.08x slower
- ✅ You only need membership testing (no deletions)
- ✅ You want to minimize memory usage
- ✅ You need set operations (merge/intersection)
- ✅ False positives are acceptable
- ✅ You need cardinality estimation
- ✅ You need to delete items
- ✅ You want bounded false-positive rates
- ✅ You need exact item counts
- ✅ You need set operations with deletion support
- ✅ Slightly higher memory usage than Bloom is acceptable
- ✅ You need approximate distinct counts with fixed memory
- ✅ Lock-free concurrent updates are important
- ✅ Sub-millisecond cardinality queries matter
- ✅ You can tolerate small relative error (≈1% by default)
- ✅ You already hash keys and want to reuse those 64-bit hashes (via
add_hashed/2)
Full documentation is available on HexDocs (when published).
Generate local documentation:
mix docsThe project includes comprehensive test coverage:
# Run tests
mix test
# Run tests with coverage
mix coveralls
# Generate HTML coverage report
mix coveralls.htmlCurrent Coverage: ~87% overall
Fate.Filter.Bloom: 97.3%Fate.Filter.Cuckoo: 89.4%Fate.Hash: 52.3% (expected - many hash modules are optional)
- Bloom Filter: Uses double hashing with configurable hash functions, stores bits in
:atomicswords with lock-free CAS operations. Optimized with direct recursion to avoid list allocations. - Cuckoo Filter: Follows the Erlang reference implementation with bitpacked bucket storage, eviction caching, and bounded relocation. Uses fast bit-mixing for alternate index calculation.
- Hash Functions: Pluggable via
Fate.Hashbehaviour, with runtime availability checking. Supports XXH3, XXHash, Murmur3, FNV1a, and Erlang'sphash2.
Future data structures planned:
- Count-Min Sketch (frequency estimation)
- Quotient Filter (compact alternative to Cuckoo)
Contributions are welcome! Please feel free to submit a Pull Request.
# Clone the repository
git clone https://github.com/YOUR_USERNAME/fate.git
cd fate
# Install dependencies
mix deps.get
# Run tests
mix test
# Run benchmarks
mix run bench/bloom_cmp.exs
mix run bench/cuckoo_cmp.exs
# Format code
mix format
# Check formatting
mix format --check-formatted- Fork the repository
- Create your feature branch (
git checkout -b feature/amazing-feature) - Make sure tests pass (
mix test) - Ensure code is formatted (
mix format) - Add tests for new functionality
- Update documentation as needed
- Commit your changes (
git commit -m 'Add some amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
- Follow Elixir style guide
- Use
mix formatbefore committing - Write descriptive commit messages
- Add tests for new features
- Update documentation for API changes
Copyright (c) 2025 Gustavo Gawryszeski
Licensed under the MIT License. See LICENSE for details.
- Inspired by the Erlang
cuckoo_filterimplementation - Built with performance and concurrency in mind
