raphaelreiss/dp-db

Table of Contents
  1. About The Project
  2. Getting Started
  3. License
  4. Contact
  5. Acknowledgements

About The Project

Building an anonymized dataset is notoriously hard. Pseudonymizing identifiers is not sufficient, because the pseudonymized records can be cross-referenced with publicly available data and the pseudonyms can eventually be re-identified.

It is nevertheless important to be able to learn useful information from a database, but not at the cost of the privacy of the people registered in it.

This is where a differential-privacy-preserving database becomes useful.

The database client only supports count aggregation queries such as:

SELECT COUNT(*)
FROM db
WHERE movie = [queried movie name]
      AND rating >= [queried rating level]
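
As an illustration only, a count query of this shape can be made differentially private with the Laplace mechanism. The sketch below is not taken from this repository: the function name `noisy_count`, the `(movie, rating)` row layout, and the assumption that each person contributes at most one row per movie (so the count has sensitivity 1) are all hypothetical.

```python
import numpy as np


def noisy_count(rows, movie, rating_threshold, epsilon, rng=None):
    """Return a Laplace-noised count of rows matching the query.

    `rows` is assumed to be an iterable of (movie, rating) tuples;
    this layout is hypothetical, not the repository's actual format.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng if rng is not None else np.random.default_rng()

    # Exact answer to: SELECT COUNT(*) WHERE movie = ... AND rating >= ...
    true_count = sum(
        1 for m, r in rows if m == movie and r >= rating_threshold
    )

    # Assumption: adding or removing one person changes the count by at
    # most 1, so the sensitivity is 1 and the noise scale is 1 / epsilon.
    sensitivity = 1.0
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise


rows = [("Seven Samurai", 5), ("Seven Samurai", 2), ("Seven Samurai", 4), ("Ran", 5)]
print(noisy_count(rows, "Seven Samurai", rating_threshold=3, epsilon=0.25))
```

A smaller `epsilon` gives a larger noise scale, hence stronger privacy and a less accurate count.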

Getting Started

The following example shows how to query the database:

querier = DpQuerier("imdb-dp.csv", privacy_budget=1)
count = querier.get_count("Seven Samurai", rating_threshold=3, epsilon=0.25)
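
The example above passes both a total `privacy_budget` and a per-query `epsilon`. How `DpQuerier` accounts for the budget internally is not documented here; the sketch below only illustrates one common approach, sequential composition, in which the `epsilon` of each query is subtracted from the total and further queries are refused once the budget is spent. The class `PrivacyBudget` and its methods are hypothetical names, not part of this project's API.

```python
class PrivacyBudget:
    """Hypothetical budget tracker using sequential composition."""

    def __init__(self, total):
        if total <= 0:
            raise ValueError("total budget must be positive")
        self.total = float(total)
        self.spent = 0.0

    @property
    def remaining(self):
        return self.total - self.spent

    def spend(self, epsilon):
        """Deduct `epsilon` from the budget, or refuse if it does not fit."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so floating-point rounding does not reject a
        # query that exactly uses up the remaining budget.
        if epsilon > self.remaining + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


budget = PrivacyBudget(total=1)
for _ in range(4):
    budget.spend(0.25)  # four queries at epsilon=0.25 use the whole budget
print(budget.remaining)
```

Under this assumption, a budget of 1 allows four queries at `epsilon=0.25`; a fifth call to `spend` raises `RuntimeError`.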

Prerequisites

The following libraries are required:

  • numpy
  • termcolor
  • scipy
  • attr

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Raphaël Reis Nunes - raphael.reisnunes@epfl.ch

Project Link

Acknowledgements

This project is a homework assignment for the EPFL course COM-402.

About

Basic differential privacy database client.
