Latest stories published on Towards Data Science

Parse Scanned PDFs for RAG with EasyOCR: Free OCR Gives You Words, Not a Document

Large Language Models

Enterprise Document Intelligence [Vol.1 #5quinquies] – Same 1974 scanned PDF, two engines. EasyOCR recovers text.…

Kezhan Shi

June 19, 2026

15 min read

GPU-Resident Top-K for Agentic RAG: I Built a CUDA Kernel So My Retrieval Step Would Stop Bouncing Off the GPU

Agentic AI

The PCIe transfer latency is silently bottlenecking your agentic inference. Here is how building a…

Anubhab Banerjee

June 19, 2026

31 min read

Structured Outputs with LLMs: JSON Mode, Function Calling, and When to Use Each

Large Language Models

Getting reliable, readable responses out of your LLM, and knowing which tool to reach for

Maria Mouschoutzi

June 18, 2026

11 min read

How Powerful is Claude Fable (Mythos) 5 for Coding?

Large Language Models

Learn about the upsides and downsides of Claude Fable 5

Eivind Kjosbakken

June 18, 2026

9 min read

Proteins: A Mosaic Pattern to Rule Them All?

Machine Learning

For decades, the existence of the hydrophobic core, a region in the 3D structure of…

Francisco Javier Lobo-Cabrera

June 18, 2026

12 min read

Dispatching the Parsed RAG Question: Chunk Strategy, Model Tier, Activations, Audit

Large Language Models

Enterprise Document Intelligence [Vol.1 #6c] – The decisions the parser makes on top of the…

angela shi

June 18, 2026

28 min read

The Power and Pitfalls of Vector-Based Image Search

Artificial Intelligence

A hands-on guide to setting up image similarity search in Milvus, and why visual replication…

Soner Yıldırım

June 18, 2026

8 min read

Your Churn Threshold Is a Pricing Decision

Data Science

How unit economics should set your classification cutoff, and why they rarely do.

Fabio Oliveira

June 17, 2026

15 min read

The Secret to Reproducible and Portable Optimization: ORPilot’s Intermediate Representation (IR)

Agentic AI

Why a production-level AI optimization modeling agent needs reproducibility and portability, and how IR helps achieve…

Guangrui Xie

June 17, 2026

15 min read

You Probably Don’t Need an Agent Framework

Large Language Models

Most LLM applications need a clear workflow, not an autonomous agent. Here’s how to build…

Shuai Guo

June 17, 2026

19 min read