Reading

A running list of books, papers, and articles I'm currently reading or have recently read.

Recent

MeMo's memory model lets teams upgrade their LLM without retraining it — and performance jumps 26%

Article · 2026-06-09

MeMo demonstrates the shift toward pre-processing data with a small language model (SLM) to reduce reliance on RAG as the sole retrieval mechanism. This is analogous to the Two-Brains-In-A-Skull architectural enhancements that we proposed and implemented in the Syncratic platform.
Monkey Business: Who owns the copyright?

Article · 2026-04-16

Interesting article on the boundaries of copyright. Original title: “This monkey selfie will protect you from AI slop.”
Open source Mamba 3 arrives to surpass Transformer architecture with nearly 4% improved language modeling, reduced latency

Article · 2026-03-20

An improvement to the underlying model architecture. “The most significant leap in inference efficiency comes from the transition from Single-Input, Single-Output (SISO) to Multi-Input, Multi-Output (MIMO) SSMs.”
On-policy Distillation

Blog · 2026-03-08

A technique for training smaller models from larger ones.
Memory Crystal Breakthrough

Article · 2026-02-24
Optimal Architecture for SLM

Blog · 2026-01-03

It is becoming clear that small language models differ only marginally from one another when their parameter counts are about the same. The results reported here claim that the models generally deviate from each other by about 1%.
Understanding RPO for SaaS Companies

Blog · 2025-12-31
Titans: Learning to Memorize at Test Time

Paper · 2025-12-24

Bringing long-term memory to inference
Stanford CS230 Lectures

Video · 2025-12-20

Great lectures by Andrew Ng and the Stanford team. Thank you to Stanford for making this available to the public.
Defeating Nondeterminism in LLM Inference

Paper · 2025-11-28

There is a general notion that LLMs are nondeterministic because of concurrency and floating-point roundoff on the GPU. This paper revisits that notion and makes the case that LLM inference can be made deterministic.
Fine-tuning with RapidFire AI

Blog · 2025-11-25
Mixture of Experts Explained

Blog · 2025-11-25