-

A technical overview and some benchmarks
10 min read -

Why Custom Inference in DeepStream?
10 min read
Latest
-

What I thought was a scheduling problem turned out to be a portability problem first
8 min read -

Parse Scanned PDFs for RAG with EasyOCR: Free OCR Gives You Words, Not a Document
Large Language Models
Enterprise Document Intelligence [Vol.1 #5quinquies] – Same 1974 scanned PDF, two engines. EasyOCR recovers text.…
15 min read -

GPU-Resident Top-K for Agentic RAG: I Built a CUDA Kernel So My Retrieval Step Would Stop Bouncing Off the GPU
Agentic AI
The PCIe transfer latency is silently bottlenecking your agentic inference. Here is how building a…
31 min read -

Structured Outputs with LLMs: JSON Mode, Function Calling, and When to Use Each
Large Language Models
Getting reliable, readable responses out of your LLM, and knowing which tool to reach for
13 min read -

Learn about the upsides and downsides of Claude Fable 5
9 min read -

For decades, the existence of the hydrophobic core, a region in the 3D structure of…
12 min read -

Dispatching the Parsed RAG Question: Chunk Strategy, Model Tier, Activations, Audit
Large Language Models
Enterprise Document Intelligence [Vol.1 #6c] – The decisions the parser makes on top of the…
28 min read -

How unit economics should set your classification cutoff, and why they rarely do.
15 min read
Editor’s Picks
-

Most LLM applications need a clear workflow, not an autonomous agent. Here’s how to build…
19 min read -

Budgets for AI tokens can’t be infinite, no matter how much hyperscalers wish they were
8 min read -

A single model hands you a single answer and no sense of how much it…
11 min read -

Let’s practice data science thinking through a probability problem
9 min read -

Claude can now write its own harness on the fly, custom-built for the task at…
28 min read -

Why “average utilization” lies about how full your GPUs really are
13 min read -

A structured methodology for comparing candidate models, testing stability, and selecting a robust final score
18 min read -

A quick guide to separating Physical AI from world models, embodied AI, physics AI, and…
9 min read
The Variable Newsletter
-

Sorting through the good, bad, and ambiguous aspects of vibe coding
4 min read
Deep Dives
-

The Secret to Reproducible and Portable Optimization: ORPilot’s Intermediate Representation (IR)
Agentic AI
Why a production-level AI optimization modeling agent needs reproducibility and portability, and how IR helps achieve…
15 min read -

The System Always Knows: Why Local Efficiency and System Performance Are Not the Same Problem
Data Science
How local optimization in last-mile delivery can quietly break the system
15 min read -

A systems-level deep dive into the hidden microarchitectural costs of Kubernetes GPU time-slicing, and what…
22 min read -

Increasing context size in RAG systems doesn’t improve accuracy for aggregation tasks—it makes errors harder…
15 min read -

Why Decade-Old Residual Connections Still Power All of AI (And Why That’s a Problem)
Large Language Models
For nearly a decade, this part of neural networks barely changed. DeepSeek is trying to…
16 min read -

Enterprise Document Intelligence [Vol.1 #5B] – One PDF in, a relational set of DataFrames out:…
29 min read

