Welcome to Avi's reading-list!
Hello, I'm a part of the 2025-26 cohort of the MS in Machine Learning program at CMU. From 2020-25 I was a dual-degree student at IIT Kharagpur's ECE department. Here, I will occasionally log interesting things I read or want to read. Some of it will pertain to my research (which falls under the broad purview of representation learning). Check out my website to learn more. I would put my Goodreads here but I have made embarrassingly poor progress on my leisure reading.
About This:
I've decided to start this thing afresh, so it's currently almost empty.
I mostly intend to use this as a live, remote personal agenda that I can access from any device anywhere. If you're here and you're not me, I can't guarantee that you'll find anything useful. But you are welcome to try. (I hope gold.md is a good starting point for that.)
Use the sidebar to navigate. Happy reading!
Gem Alert
Here is a list of manuscripts (papers/articles/monographs/tutorials etc.) that I find myself referring to in an unexpectedly diverse variety of contexts.
Probabilistic Modelling and Variational Inference
- Albergo, Boffi, Vanden-Eijnden: Stochastic Interpolants: A Unifying Framework for Flows and Diffusions. (official impl) rigorous, unifying analysis and generalization of continuous-time stochastic interpolation
- McAllester, Stratos: Formal Limitations on the Measurement of Mutual Information (official impl) fundamental KL and MI estimation bounds, interesting proof technique
- Krueger et al.: Bayesian Hypernetworks very flexible and underexplored Bayesian model
- Touchette: The large deviation approach to statistical mechanics digestible bridge to statistical mechanics for people with ml bg
- Luo: Understanding Diffusion Models: A Unified Perspective have only skimmed through this, but recommended reading imo
- Zhi-Han: Training Latent Variable Models with Auto-encoding Variational Bayes: A Tutorial intuitive exposition of AEVB as approximate EM
- Marion et al.: Implicit Diffusion: Efficient Optimization through Stochastic Sampling (official impl) optimize implicit models from samples/without going through likelihood estimation
Nonparametric and Semiparametric Inference
- Chernozhukov et al.: Automatic Debiased Machine Learning via Riesz Regression (official impl for a follow-up work) foundational for AutoDML
- Foster and Syrgkanis: Orthogonal Statistical Learning (official impl) analysis like AutoDML outside the standard M/Z-estimand form
- Tibshirani: Conformal Prediction gentle and balanced introduction to conformal prediction
- Han et al.: Optimal rates of entropy estimation over Lipschitz balls synthesis of various ways of using simple polynomial approximations for estimating functionals
Representation Learning
- Zhai et al.: Contextures: Representations from Contexts fundamental work in our understanding of representation learning + scaling laws, math is a little terse though
- Tsai, Yeh, Ravikumar: Faith-Shap: The Faithful Shapley Interaction Index (official impl) feature importance w/o unhealthy assumptions
Misc.
- Lin: Bayesian Epistemology scratches a certain itch
Misc
Index of random stuff (most of which I haven't read yet but suspect might be useful at some point in the future). Mostly for my own reference.
Generic deep learning stuff
Systems
I still enjoy messing with computer systems as a hobby. Sadly, owing to the allure of AI-assisted code generation, I find myself thinking about it less and less. (Of course, only to whatever extent system design and systems-level considerations are inapplicable to the high-level architecture I embed in my prompts to begin with.) Perhaps more saliently: unlike in undergrad, my graduate coursework (let alone my research) is very far removed from systems. As a result there is a huge and growing backlog of stuff here.
- Software Transactional Memory blogpost
- Algorithmica/HPC (parallelism)
- Everything You Never Wanted To Know About Linker Script
- Learn R*st With Entirely Too Many Linked Lists
- A From-Scratch Tour of Bitcoin in Python
- What every systems programmer should know about concurrency
- What Every Programmer Should Know About Memory
- The Path of a Packet through the Linux Kernel
- Writing a Simple Garbage Collector in C
- How Nginx Handles Thousands of Concurrent Requests
- PicoCTF 2021 - Binary Exploitation Challenge Writeups
(I won't ever remove anything from here, so I might have read some of this. But I might not, so don't blame me if something is terribly wrong with some of these.)