GraphFMD is a temporal graph learning benchmark for financial misconduct detection in the Bitcoin transaction network.
Participants must classify transactions as illicit (fraudulent) or licit (legitimate).
This repository is designed for a Human vs. LLM task.
View the real-time rankings here: https://faranbutt.github.io/GraphFMD/
To ensure the secrecy of the test labels and participant data, we use a Secure Submission Portal.
You must prepare two files:
predictions.csv: Must contain exactly two columns: id and y_pred. Valid values for y_pred are:
- 1: Illicit (Fraudulent)
- 2: Licit (Legal)
metadata.json: A short description of your approach.
{
"team": "Your_Team_Name",
"run_id": "run_01/run_02.... etc",
"author_type": "human / llm / hybrid",
"model": "GCN / GraphSAGE / etc.",
"notes": "Briefly describe your layers/hyperparameters"
}
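As a rough illustration of preparing the two files, the sketch below writes both with Python's standard library. The helper name `write_submission`, the output folder `my_submission`, and the metadata values are placeholders made up for this example; only the file names, the two CSV columns, and the JSON keys come from the description above.

```python
import csv
import json
from pathlib import Path


def write_submission(out_dir, predictions, metadata):
    """Write predictions.csv and metadata.json into out_dir.

    predictions: dict mapping transaction id -> label (1 = illicit, 2 = licit).
    metadata: dict with the keys shown in the metadata.json template.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Reject anything other than the two allowed class labels.
    bad = {k: v for k, v in predictions.items() if v not in (1, 2)}
    if bad:
        raise ValueError(f"labels must be 1 or 2, got: {bad}")

    # Exactly two columns: id and y_pred.
    with open(out / "predictions.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "y_pred"])
        for node_id, label in predictions.items():
            writer.writerow([node_id, label])

    with open(out / "metadata.json", "w") as f:
        json.dump(metadata, f, indent=2)


# Placeholder values; replace with your own team and model details.
metadata = {
    "team": "Your_Team_Name",
    "run_id": "run_01",
    "author_type": "human",
    "model": "GCN",
    "notes": "2 layers, hidden size 64 (placeholder values)",
}
write_submission("my_submission", {6418: 1, 7952: 2}, metadata)
```

In practice the `predictions` dict would hold one entry per test transaction rather than the two sample ids used here.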
Submit your files via the official Google Form:
Official Submission Form
Once you submit the form:
- A GitHub Action is triggered automatically.
- Your model is scored against the Hidden Ground Truth.
- The Leaderboard is updated instantly.
- Task: Temporal Inductive Node Classification (Licit vs. Illicit).
- Domain: Cryptocurrency (Bitcoin) Forensics.
- Target: Predict the class label of each transaction (Illicit = 1, Licit = 2).
- Metric: Macro-F1 across both classes (Illicit and Licit).
- Nodes (node feature matrix X): Bitcoin transactions, each with 165 local and aggregate features (Train = 16,658; Test = 8,896).
- Edges (adjacency matrix A): The flow of BTC between transactions.
- Feature Noise: Gaussian noise was added to the features to simulate noisy real-world data.
- Temporal Shifting: Time-based split (Train: time steps 1-34, Test: 35+).
- Class Imbalance & Graph Sparsity: All illicit transactions are preserved, while only 50% of licit transactions are retained (unknown nodes are removed).
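To make the Macro-F1 metric listed above concrete, here is a minimal pure-Python sketch. It is not the repository's `metrics.py`, whose contents are not shown here; the function names and the toy labels are assumptions for illustration. Macro-F1 is the unweighted mean of the per-class F1 scores, so the illicit class counts as much as the licit class regardless of how many transactions each contains.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when the class never appears in either list.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(1, 2)):
    """Unweighted mean of per-class F1 (1 = illicit, 2 = licit)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example: 2 illicit and 4 licit transactions.
y_true = [1, 2, 2, 2, 1, 2]
y_pred = [1, 2, 2, 1, 2, 2]
score = macro_f1(y_true, y_pred)
print(score)  # illicit F1 = 0.5, licit F1 = 0.75, macro = 0.625
```

Because of the class imbalance, a model that predicts licit for every transaction gets an illicit F1 of 0 and therefore a low Macro-F1, even though its plain accuracy would look high.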
To keep the competition fair:
- A one-submission policy is enforced: each participant may submit the form only once.
To enter the competition, you must submit a CSV file named exactly prediction.csv inside the submissions/ folder.
submissions/participant1/prediction.csv
id,y_pred
6418,1
7952,2
.....
.....
id: Transaction ID (must match test_nodes.csv).
y_pred: The predicted class label:
- 1: Illicit (Fraudulent)
- 2: Licit (Legal)
When a Pull Request is opened, the bot will:
- Check identity (Verify if you have already submitted)
- Check Formats (Ensure your JSON and CSV files are structured properly)
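Since only one submission is allowed, it may be worth running a similar format check locally before submitting. The sketch below is an assumption about what such a check could look like, not the bot's actual logic: the function name `check_predictions` and the specific rules (exact header, labels restricted to 1 and 2, no duplicate ids, ids matching the test set) are inferred from the format described in this README.

```python
import csv
import io

ALLOWED_LABELS = {"1", "2"}  # 1 = illicit, 2 = licit


def check_predictions(csv_text, expected_ids):
    """Return a list of problems found in a predictions CSV (empty if none).

    csv_text: full contents of the predictions file as a string.
    expected_ids: set of transaction ids (as strings) from test_nodes.csv.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ["id", "y_pred"]:
        return [f"header must be id,y_pred but got {reader.fieldnames}"]

    seen = set()
    for line_no, row in enumerate(reader, start=2):
        node_id, label = row["id"], row["y_pred"]
        if label not in ALLOWED_LABELS:
            problems.append(f"line {line_no}: y_pred must be 1 or 2, got {label!r}")
        if node_id in seen:
            problems.append(f"line {line_no}: duplicate id {node_id}")
        seen.add(node_id)

    # Every test id needs exactly one prediction, and no others are allowed.
    missing = set(expected_ids) - seen
    extra = seen - set(expected_ids)
    if missing:
        problems.append(f"{len(missing)} test ids have no prediction")
    if extra:
        problems.append(f"{len(extra)} ids are not in the test set")
    return problems


good = "id,y_pred\n6418,1\n7952,2\n"
bad = "id,y_pred\n6418,3\n"
print(check_predictions(good, {"6418", "7952"}))
print(check_predictions(bad, {"6418", "7952"}))
```

To check a real file, read it with `open(path).read()` and build `expected_ids` from the id column of `test_nodes.csv`; the name of that column is not documented here, so inspect the file first.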
.
├── data/
│   ├── public/
│   │   ├── train_nodes.csv
│   │   ├── train_labels.csv
│   │   ├── test_nodes.csv
│   │   └── edgelist.csv
├── competition/
│   ├── baseline.py # Starter GCN model
│   ├── evaluate.py # Scoring logic
│   ├── metrics.py # F1-Score calculation
│   └── update_leaderboard.py
├── submissions/ # Submission directory
│   ├── participant1
│   │   └── predictions.csv
├── leaderboard/ # CSV/Markdown rankings
├── docs/ # Interactive Leaderboard
└── images/
If you use this challenge, dataset, or repository in your research, please cite:
@dataset{graphfmd_2026,
title={GraphFMD: Graph-based Financial Misconduct Detection Benchmark},
author={Faran Taimoor Butt},
year={2026},
url = {https://github.com/faranbutt/GraphFMD}
}
Faran Taimoor Butt: Software Engineer and Researcher in Computer Vision, NLP & Graph ML.
- Email: faranbutt789@gmail.com
- GitHub: @faranbutt
For questions regarding the competition setup, data preprocessing or automated scoring issues, please open an Issue in this repository or contact me directly.
- [Basira Lab] Deep Graph Learning Playlist: Essential video tutorials for GNN fundamentals.
- [Basira Lab] Deep Graph Learning GitHub: Codebase and implementations for graph-based models.
- [1] Elliptic, www.elliptic.co.
- [2] M. Weber, G. Domeniconi, J. Chen, D. K. I. Weidele, C. Bellei, T. Robinson, C. E. Leiserson, "Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics", KDD '19 Workshop on Anomaly Detection in Finance, August 2019, Anchorage, AK, USA.
