Approximate Query Processing Engine written in Scala

raphaelreiss/DBest

Scala DBEst


A Scala implementation of DBEst Approximate Query Processing

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact
  6. Acknowledgements

About The Project

This project was carried out as a semester project at EPFL, in collaboration with the Laboratory for Data Intensive Applications & Systems of Prof. Anastasia Ailamaki and under the supervision of PhD student Viktor Sanca.

This project aimed to build a Scala implementation of the DBEst Approximate Query Processing (AQP) engine on top of the well-known Apache Spark library. Traditional AQP engines rely on data sampling to approximate a query answer. DBEst is an innovative AQP engine that approximates the answer using machine learning models instead. This brings advantages in query response time, database portability, and data transfer. Indeed, within a given (and controllable) error tolerance, the original data is no longer required to retrieve information from a database.
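To make the model-based idea concrete, here is a minimal, dependency-free sketch; it is not this project's code. It assumes the formulation in which a density model D(x) and a regression model R(x) are fitted on a sample, so that AVG(y) over a range of x is approximated by ∫D(x)R(x)dx / ∫D(x)dx and COUNT by N·∫D(x)dx. The object name, the function names, the Gaussian kernel density estimate, and the ordinary least-squares line (a stand-in for a stronger regressor) are all illustrative assumptions.

```scala
// Illustrative sketch of model-based AQP; all names here are hypothetical.
object ModelBasedAqpSketch {

  // Gaussian kernel density estimate of x, built from a small sample.
  def kde(sample: Seq[Double], bandwidth: Double): Double => Double = {
    val norm = 1.0 / (sample.size * bandwidth * math.sqrt(2 * math.Pi))
    x => norm * sample.map { s =>
      val u = (x - s) / bandwidth
      math.exp(-0.5 * u * u)
    }.sum
  }

  // Ordinary least-squares line y = a + b * x (assumed stand-in regressor).
  def linearFit(xs: Seq[Double], ys: Seq[Double]): Double => Double = {
    val n = xs.size.toDouble
    val mx = xs.sum / n
    val my = ys.sum / n
    val sxy = xs.zip(ys).map { case (x, y) => (x - mx) * (y - my) }.sum
    val sxx = xs.map(x => (x - mx) * (x - mx)).sum
    val b = sxy / sxx
    val a = my - b * mx
    x => a + b * x
  }

  // Trapezoidal rule on [lo, hi] with `steps` intervals.
  def integrate(f: Double => Double, lo: Double, hi: Double, steps: Int = 1000): Double = {
    val h = (hi - lo) / steps
    val inner = (1 until steps).map(i => f(lo + i * h)).sum
    h * (0.5 * f(lo) + inner + 0.5 * f(hi))
  }

  // AVG(y) WHERE lo <= x <= hi, answered from the two models only.
  def approxAvg(density: Double => Double, regression: Double => Double,
                lo: Double, hi: Double): Double =
    integrate(x => density(x) * regression(x), lo, hi) / integrate(density, lo, hi)

  // COUNT(*) WHERE lo <= x <= hi, scaled by the original table size.
  def approxCount(density: Double => Double, tableSize: Long,
                  lo: Double, hi: Double): Double =
    tableSize * integrate(density, lo, hi)

  def main(args: Array[String]): Unit = {
    // Synthetic data: x roughly uniform on [0, 10], y close to 2x + 1.
    val rng = new scala.util.Random(42)
    val xs = Seq.fill(500)(rng.nextDouble() * 10.0)
    val ys = xs.map(x => 2.0 * x + 1.0 + rng.nextGaussian() * 0.1)

    val density = kde(xs, bandwidth = 0.5)
    val regression = linearFit(xs, ys)

    println(f"approx AVG(y) for x in [2, 4]: ${approxAvg(density, regression, 2.0, 4.0)}%.3f")
    println(f"approx COUNT  for x in [2, 4]: ${approxCount(density, 500L, 2.0, 4.0)}%.1f")
  }
}
```

Once `density` and `regression` are fitted, the queries are answered without touching `xs` or `ys` again, which is the property that makes the models portable and cheap to ship compared with the data itself.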

We decided to write a Spark-based implementation of DBEst as a first step toward extending the original DBEst implementation. It can be used to analyse the prospects of model-based querying in situations where there are constraints on query responsiveness, network data flow, or data storage.

Please find here the report related to my work.

Built With

As mentioned above, the implementation relies on the Apache Spark library (2.4.6) and Scala (2.11.12). For the other libraries, please check the build.sbt file.

sbt (1.0.0) is also required to build the project.
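As a rough illustration of how the versions above map onto an sbt build definition, the fragment below shows one plausible shape. It is an assumption rather than a copy of the repository's build.sbt: the project name is hypothetical, and the choice of Spark modules (core, SQL, MLlib) is a guess at what a Spark-based AQP engine would need.

```scala
// Illustrative build.sbt fragment; the repository's actual file may differ.
name := "dbest-sketch" // hypothetical project name

scalaVersion := "2.11.12"

// Spark 2.4.6 artifacts built for Scala 2.11; the module selection is an assumption.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.4.6",
  "org.apache.spark" %% "spark-sql"   % "2.4.6",
  "org.apache.spark" %% "spark-mllib" % "2.4.6"
)
```

The `%%` operator appends the Scala binary version (here `_2.11`) to the artifact name, so the Spark and Scala versions stay consistent.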

Getting Started

The steps below build the project and run the analysis experiments.

  • First, download the code or clone it with the command git clone https://github.com/raphaelreis/DBest.git.

  • Then edit the conf/configuration.conf file. The main setting to adjust is your base directory path (the path to the project directory).

  • To set up the directory paths for the results, run the script scripts/setup.sh from your working directory.

  • Build the project from the working directory with the command sbt package.

  • To run all the experiments, run the command scripts/runexp_sample.sh 1 2 3.

  • To train the models beforehand, run the script scripts/train_models.sh.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Raphaël Reis Nunes - @LinkedIn - email: raphael.reisnunes at epfl dot ch

Acknowledgements
