The first rule of machine learning: Start without machine learning
Eugene Yan
4,774 posts
MTS @AnthropicAI. Prev: Principal Applied Scientist @Amazon, led ML @ Alibaba, Lazada, Healthtech startup.
- I'm excited to share something I've been working on for a while—ApplyingML.com! It collects the tacit, tribal, ghost knowledge on how to apply machine learning from papers, guides, and interviews with ML practitioners.
- Ran a simple benchmark (Mandelbrot sets) between Mojo & Python. The speedup is impressive, and it benefits from Python's libraries. • Python: 1,184ms • Mojo: 27ms 🤯 • Python (vectorized): 240ms • Mojo (vectorized): 2ms
- Replying to @sinclanichnp.array people keep reaching for much fancier things way too fast these days
- Started a list of open-source LLMs with commercial licenses so you can fine-tune your own applications. Contributions welcome! 🙏
- “HuggingFace’s leaderboards show how truly blind they are because they actively hurting the open source movement by tricking it into creating a bunch of models that are useless for real usage.” Ouch.
- How do you build an LLM-evaluator / LLM-as-Judge? The book for "AI Evals for PMs and Engineers" has a chapter devoted to it (35% discount: maven.com/parlance-labs/…) First, we need to define the right metrics. For example, we can start by listing the failure modes from our error
- This is why CS fundamentals continue to be crucial: LLaMA 30B only needs 4gb of memory if we use mmap(). Not sure why this works but one reason could be that 30B weights are sparse. Thus, lazy loading the fraction of needed weights reduces memory usage. github.com/ggerganov/llam…
- agent ≈ model + tools, within a for-loop + environmentAnthropic's AI Engineer source code is fully public / there is no server there is no separate backend. they just use the same api.anthropic.com/v1/messages api in a loop with tool use all packaged into a single file: gist.githubusercontent.com/1rgs/e4e13ac9a… tools available: 1. dispatch_agent -
- You're a great engineer if you know: • Simple is beautiful • To focus on the problem, not the tech • How to reuse what already exists • Launching is the start, not the end • How to step back & let others lead • How to get different views & change your mind • Your customerYou’re a great engineer if you know the definition of: - idempotent - monoid - decoupled - dependency injection - unit - functional programming - asynchronous vs parallel programming - thread locking - eventual consistency - exactly-once semantics - lambda vs kappa
- A love letter to @claude_code: Over the past two days, I built a stock analysis web app for loved ones that includes auth, charting tools, stock data/llm apis, db persistance, and more. (see 1 min demo) The velocity was only possible with claude code—please try it if you haven't
00:00 - Wrote abt patterns for LLM systems/products • Evals: Track performance • RAG: Add external knowledge • Finetuning: Improve specific tasks • Caching: Reduce latency & cost • Guardrails: Ensure output quality • Defensive UX: Anticipate & manage errors
- Starting a machine learning project? • How to frame the problem? • What methodology to adopt? • How to design the system? • Estimating potential ROI? Learn how other companies did it via their papers & tech blogs.
- Wrote an intro to evals for long-context Q&A systems: • How it differs from basic Q&A • What dimensions & metrics to eval on • How to build llm-evaluators • How to build eval datasets • Benchmarks: narratives, technical docs, multi-docs














