📚 Research Papers Collection

A curated collection of research papers I'm reading, have read, or plan to read.

📑 Content by Topic

Tools

Papers and Blogs

🧠 Reasoning
🤖 Agent
💬 Large Language Model
📊 Physiological Signals
🔒 Privacy
🔬 Multimodal

Knowledge Base

📖 AI/ML Knowledge

Courses

Tools

ComfyAI - Collection of LLM techniques and workflows
verl - Volcano Engine Reinforcement Learning for LLMs (RLHF framework supporting FSDP, vLLM, SGLang)
FeatureDB - Pattern recognition methods for ECG feature extraction (expert features including HRV, morphologic variability, frequency domain, QRS axis)
HeartPy - Python Heart Rate Analysis Toolkit for PPG and ECG signals (time-domain & frequency-domain measures)
Braindecode - Deep learning toolbox for decoding EEG, ECoG, and MEG signals (PyTorch-based, includes datasets, preprocessing, models)

Papers and Blogs

🧠 Reasoning

Proximal Policy Optimization Algorithms - John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov (2017)
Tree of Thoughts: Deliberate Problem Solving with Large Language Models - Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan (2023)
Understanding the Math Behind GRPO — DeepSeek-R1-Zero - Soumanta Das, Yugen.ai (2025)
DeepSeek-V3 Explained 1: Multi-head Latent Attention - Shirley Li (2025)
Mixture-of-Experts (MoE) LLMs - Cameron R. Wolfe (2025)
DeepSeek-V3 — Advances in MoE Load Balancing and Multi-Token Prediction Training - Soumanta Das, Yugen.ai (2025)
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models - Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, et al. (2024)
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning - DeepSeek-AI (2025)
QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training - Wei Dai, Peilin Chen, Chanakya Ekbote, Paul Pu Liang (2025)
MedCritical: Enhancing Medical Reasoning in Small Language Models via Self-Collaborative Correction (2025)
OpenTSLM: Time-Series Language Models for Reasoning over Multivariate Medical Text- and Time-Series Data - Patrick Langer, Thomas Kaar, Max Rosenblattl, Maxwell A. Xu, Winnie Chow, et al. (2025)

🤖 Agent

The Anatomy of a Personal Health Agent - A. Ali Heydari, Ken Gu, Vidya Srinivas, Hong Yu, et al. (2025)

💬 Large Language Model

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer - Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean (2017)
Deep contextualized word representations (ELMo) - Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer (2018)
LoRA: Low-Rank Adaptation of Large Language Models - Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, et al. (2021)
The Llama 3 Herd of Models - Meta AI (2024)
Qwen2.5 Technical Report - Qwen Team, Alibaba (2024)
Gemma 3 Technical Report - Gemma Team, Google DeepMind (2025)

📊 Physiological Signals

ENCASE: an ENsemble ClASsifiEr for ECG Classification Using Expert Features and Deep Neural Networks - Shenda Hong, Meng Wu, Yuxi Zhou, Qingyun Wang, Junyuan Shang, Hongyan Li, Junqing Xie (2017)
ECG-QA: A Comprehensive Question Answering Dataset Combined With Electrocardiogram - Jungwoo Oh, Gyubok Lee, Seongsu Bae, Joon-myoung Kwon, Edward Choi (2023)
Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data - Yubin Kim, Xuhai Xu, Daniel McDuff, Cynthia Breazeal, Hae Won Park (2024)
A lightweight deep neural network for personalized detecting ventricular arrhythmias from a single-lead ECG device - Zhejun Sun, Wenrui Zhang, Yuxi Zhou, Shijia Geng, et al. (2025)
ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis - Yubao Zhao, Jiaju Kang, Tian Zhang, Puyu Han, Tong Chen (2024)
ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling - William Han, Chaojing Duan, Michael A. Rosenberg, Emerson Liu, Ding Zhao (2024)
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images - Xiang Lan, Feng Wu, Kai He, Qinghao Zhao, Shenda Hong, Mengling Feng (2025)
Signal, Image, or Symbolic: Exploring the Best Input Representation for Electrocardiogram-Language Models Through a Unified Framework - William Han, Chaojing Duan, Zhepeng Cen, Yihang Yao, Xiaoyu Song, Atharva Mhaskar, Dylan Leong, Michael A. Rosenberg, Emerson Liu, Ding Zhao (2025)
Retrieval-Augmented Generation for Electrocardiogram-Language Models - Xiaoyu Song, William Han, Tony Chen, Chaojing Duan, Michael A. Rosenberg, Emerson Liu, Ding Zhao (2025)
SensorLM: Learning the Language of Wearable Sensors - Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, et al. (2025)
LSM-2: Learning from Incomplete Wearable Sensor Data - Maxwell A. Xu, Girish Narayanswamy, Kumar Ayush, Dimitris Spathis, et al. (2025)
PPGFlowECG: Latent Rectified Flow with Cross-Modal Encoding for PPG-Guided ECG Generation and Cardiovascular Disease Detection - Xiaocheng Fang, Jiarui Jin, Haoyu Wang, Che Liu, Jieyi Cai, Guangkun Nie, Jun Li, Hongyan Li, Shenda Hong (2025)
MEETI: A Multimodal ECG Dataset from MIMIC-IV-ECG with Signals, Images, Features and Interpretations - Deyun Zhang, Xiang Lan, Shijia Geng, Qinghao Zhao, Sumei Fan, Mengling Feng, Shenda Hong (2025)

🔒 Privacy

Communication-Efficient Learning of Deep Networks from Decentralized Data (Federated Learning) - H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas (2016)
Privacy and Security Challenges in Large Language Models - Vishal Rathod, Seyedsina Nabavirazavi, Samira Zad, Sundararaja Sitharama Iyengar (2025)
SoK: Security and Privacy Risks of Healthcare AI - Yuanhaur Chang, Han Liu, Chenyang Lu, Ning Zhang (2024)

🔬 Multimodal

Zero-Shot Text-to-Image Generation (DALL-E) - Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever (2021)
Learning Transferable Visual Models From Natural Language Supervision (CLIP) - Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, et al. (2021)
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation - Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi (2022)
Sigmoid Loss for Language Image Pre-Training - Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, Lucas Beyer (2023)
Visual Instruction Tuning (LLaVA) - Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee (2023)
Med-Flamingo: a Multimodal Medical Few-shot Learner - Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec (2023)
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond - Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, Jingren Zhou (2023)
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models - Wei Dai, Peilin Chen, Malinda Lu, Daniel Li, Haowen Wei, Hejie Cui, Paul Pu Liang (2025)
Qwen2.5-VL Technical Report - Qwen Team, Alibaba (2025)

📖 AI/ML Knowledge

Courses

Introduction to Deep Learning - CMU 11-785, Fall 2025
Large Language Models: Methods, Analysis, and Applications - CMU 11-667/11-867
Advanced Natural Language Processing - CMU 11-711, Spring 2025
Multimodal Machine Learning (YouTube) - CMU 11-777
How To AI (Almost) Anything - MIT MAS.S60, Spring 2025
Affective Computing and Multimodal Interaction - MIT MAS.S63, Fall 2025

License

This project is licensed under the MIT License - see the LICENSE file for details.

