A local semantic search engine for places, with:
- Sentence-Transformers for embeddings
- FAISS for fast similarity search
- FastAPI for backend
- JavaScript + HTML/CSS for frontend
- Link to a copy of the Jupyter Notebook: https://colab.research.google.com/drive/1818b_G8VTFzyJA23X3lbZPFQpvIfbPpr?usp=sharing
This project was developed as part of the 2025 Corner-DSC-BAC Datathon.
We gratefully acknowledge the support and data provided by our sponsor, corner.
Built collaboratively by:
- Tomas Gutierrez
- Yarden Morad
- Robin Chen
The project should look like this:
- backend/
  - app/
    - __init__.py
    - main.py: App setup
    - core/: Logic helper functions
      - config.py: Paths/settings
      - embeddings.py: Load model, embeddings
      - faiss_io.py: FAISS indices
      - text_utils.py: Preprocess
    - api/
      - __init__.py
      - routes/
        - search.py: Search
- data/
  - raw/: Source data, copy from given dataset 2
    - media.csv
    - places.csv
    - reviews.csv
    - AGREEMENTS + LICENSE/
      - data-license.md
      - usage-agreement.md
  - processed/: Merged data
    - merged.csv
  - indices/: FAISS indices
    - metadata.index
    - reviews.index
- frontend/
  - index.html
  - static/
    - api.js
    - ui.js
    - style.css
- scripts/
  - build_index.py: Build merged CSV and FAISS indices
  - run_server.sh: Launch FastAPI server
- .gitignore
- README.md

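The layout lists three generated files (data/processed/merged.csv and the two .index files under data/indices/) that scripts/build_index.py produces. As a rough sketch, a check like the following could report which of them are still missing. The function name `missing_artifacts` is hypothetical and not part of the project; only the file paths come from the layout.

```python
from pathlib import Path


def missing_artifacts(project_root: Path) -> list[str]:
    """Return the generated files that are absent under project_root.

    Hypothetical helper: only the paths are taken from the README layout.
    """
    data_dir = project_root / "data"
    missing = []

    # Merged CSV listed under data/processed/ in the layout.
    if not (data_dir / "processed" / "merged.csv").is_file():
        missing.append("data/processed/merged.csv")

    # The two FAISS index files listed under data/indices/.
    for name in ("metadata.index", "reviews.index"):
        if not (data_dir / "indices" / name).is_file():
            missing.append(f"data/indices/{name}")

    return missing


if __name__ == "__main__":
    # Assumes the script is started from the project root.
    absent = missing_artifacts(Path.cwd())
    if absent:
        print("Missing, run scripts/build_index.py:", ", ".join(absent))
    else:
        print("All generated files are present.")
```

If the list comes back non-empty, running scripts/build_index.py should regenerate the files, provided the raw CSVs are in data/raw.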
To set up the backend, run

    cd backend
    python -m venv .venv
    source .venv/bin/activate # or .venv\Scripts\activate on Windows
    pip install -r requirements.txt

Extract the given data into data/raw. If data/processed/merged.csv or the data/indices/*.index files don't exist, run

    python ../scripts/build_index.py

From the project root, run

    ./scripts/run_server.sh
    # If permission is denied, make the script executable first:
    # chmod +x ./scripts/run_server.sh
    # ./scripts/run_server.sh

This starts the FastAPI server (uvicorn app.main:app) on http://127.0.0.1:8000. To serve the frontend, run

    cd frontend
    python -m http.server 5500

Visit http://127.0.0.1:5500 in a browser.

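Once the backend is running, it can also be queried without the browser, which is handy as a quick smoke test. The sketch below uses only the Python standard library and targets the /search endpoint with its q parameter on http://127.0.0.1:8000. The function names are hypothetical, and it assumes the endpoint replies with a JSON body, which this README does not confirm.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BACKEND_URL = "http://127.0.0.1:8000"  # address from the setup steps


def build_search_url(query: str, base_url: str = BACKEND_URL) -> str:
    """Build the /search URL with the query percent-encoded in `q`."""
    return f"{base_url}/search?{urlencode({'q': query})}"


def search(query: str, timeout: float = 10.0):
    """Send the query to the running backend and decode the reply.

    Assumption: the endpoint answers with a JSON body.
    """
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds the URL; call search(...) once the server is up.
    print(build_search_url("quiet coffee shop"))
```

Encoding the query with `urlencode` keeps spaces and characters such as `&` from breaking the query string.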
- User opens the frontend in a browser (http://127.0.0.1:5500)
- User types a query and hits "Search"
- Frontend sends an HTTP request to the backend (http://127.0.0.1:8000/search?q=...)
- Backend:
  - Encodes the query using the model
  - Searches the FAISS index
  - Returns the 5 closest matches
- Frontend displays the matching places
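
The backend steps above amount to a nearest-neighbour lookup over embedding vectors. The sketch below is a brute-force stand-in in plain Python, not the project's code. It assumes cosine similarity as the metric (this README does not say which metric the FAISS indices use) and uses made-up 2-D vectors where the real system would use Sentence-Transformers embeddings. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, place_vecs, k=5):
    """Return up to k (name, score) pairs, highest similarity first.

    Scores every stored vector against the query; an index such as
    FAISS exists to make this lookup fast on large collections.
    """
    scored = (
        (name, cosine_similarity(query_vec, vec))
        for name, vec in place_vecs.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy vectors for illustration only, not real embeddings.
    places = {"cafe": [1.0, 0.0], "gym": [0.0, 1.0], "bakery": [1.0, 1.0]}
    for name, score in top_k([1.0, 0.0], places, k=2):
        print(name, round(score, 3))
```

With k=5, as in the flow above, the function returns at most five places. It returns fewer if the collection is smaller.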