Showing posts with label Python. Show all posts

Monday, June 1, 2026

Building Agentic AI Pipelines for Document Analysis

In this video, I show how to build a local agentic AI pipeline using Sparrow to extract and analyze data from financial documents. 

 The agent runs two steps: 

- Extract structured data from a bonds table image using the Sparrow Parse pipeline and the Ministral 3 14B model
- Analyze portfolio risk using Sparrow Instructor pipeline and Gemma 4 31B model — classifying each position as low, medium, or high risk
 
Both steps run as Prefect tasks inside a single flow, fully locally — no data leaves your machine.
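The two-step flow can be sketched in plain Python. This is a minimal sketch and not Sparrow's actual code: `extract_bonds`, `analyze_risk`, the sample rows, and the rating-based rule are hypothetical stand-ins for the Sparrow Parse and Sparrow Instructor calls. In a Prefect setup, the two step functions would typically be decorated with `@task` and the pipeline function with `@flow`.

```python
from typing import Dict, List

# Hypothetical sample rows standing in for the output of the extraction step.
SAMPLE_ROWS = [
    {"instrument": "Bond A", "rating": "AAA"},
    {"instrument": "Bond B", "rating": "BBB"},
    {"instrument": "Bond C", "rating": "CCC"},
]


def extract_bonds(image_path: str) -> List[Dict[str, str]]:
    """Step 1 stand-in: a vision model would parse the table image here."""
    return [dict(row) for row in SAMPLE_ROWS]


def classify(rating: str) -> str:
    """Toy rule used only for illustration; the real step asks an LLM."""
    if rating.startswith("A"):
        return "low"
    if rating.startswith("B"):
        return "medium"
    return "high"


def analyze_risk(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Step 2 stand-in: attach a risk label to every extracted position."""
    return [{**row, "risk": classify(row["rating"])} for row in rows]


def document_analysis_flow(image_path: str) -> List[Dict[str, str]]:
    """Run both steps in order, passing the extracted rows to the analysis."""
    rows = extract_bonds(image_path)
    return analyze_risk(rows)


result = document_analysis_flow("bonds_table.png")
for position in result:
    print(position["instrument"], position["risk"])
```

Keeping each step as its own function is what makes it easy to wrap them as separate tasks, so a failure in the analysis step does not force the extraction to run again.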

 

Thursday, April 2, 2026

Running Multiple Models on One GPU with vLLM and GPU Memory Utilization

In this video I show how to run multiple vLLM model instances on the same GPU (Nvidia) in parallel by adjusting the --gpu-memory-utilization flag.

You'll see: 

- How to launch separate vLLM servers for different models 

- How to split GPU memory between them without running out of VRAM

This approach works when you want to serve several smaller models concurrently on limited hardware.
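As a rough illustration, the launch commands could be generated like this. It is a sketch under stated assumptions: the model names are placeholders, the helper `build_vllm_commands` is hypothetical, and the 0.95 ceiling on the combined fraction is my own safety margin rather than a vLLM rule. The `vllm serve`, `--port`, and `--gpu-memory-utilization` parts are the actual CLI.

```python
from typing import List, Tuple


def build_vllm_commands(
    models: List[Tuple[str, float]],
    base_port: int = 8000,
    max_total: float = 0.95,  # assumed headroom for other GPU memory users
) -> List[List[str]]:
    """Build one `vllm serve` command per model, each on its own port."""
    total = sum(fraction for _, fraction in models)
    if total > max_total:
        raise ValueError(
            f"GPU memory fractions add up to {total:.2f}, above {max_total:.2f}"
        )
    commands = []
    for index, (model, fraction) in enumerate(models):
        commands.append([
            "vllm", "serve", model,
            "--port", str(base_port + index),
            "--gpu-memory-utilization", str(fraction),
        ])
    return commands


# Placeholder model names; replace with the models you actually serve.
commands = build_vllm_commands([("model-a", 0.45), ("model-b", 0.45)])
for command in commands:
    print(" ".join(command))
```

Each command can then be started in its own terminal or passed to `subprocess.Popen`. Each server needs a distinct port, and the fractions have to leave room for both models' weights and KV cache.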

 

Saturday, December 27, 2025

DeepSeek OCR Review

I'm testing structured data extraction with DeepSeek OCR. It works well, with data accuracy and performance good enough to challenge traditional cloud-based document processing solutions.

 

Monday, December 15, 2025

New Ministral 3 14B vs Mistral Small 3.2 24B Review

I review data retrieval accuracy and inference speed for the new Ministral 3 14B model versus the older Mistral Small 3.2 24B. The older and larger 24B model wins this time.

 

Wednesday, December 3, 2025

Structured Data Retrieval with Sparrow using OCR and Vision LLM [Improved Accuracy]

I explain the improvements I'm adding to Sparrow to achieve better accuracy for structured data. The method runs an OCR step first, then constructs an advanced prompt with the OCR data injected. This prompt is sent along with the image to a Vision LLM for structured data retrieval. All of this happens as part of a single pipeline.
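The prompt construction step might look roughly like this. It is a minimal sketch, not Sparrow's prompt: the function `build_prompt`, the prompt wording, and the sample schema and OCR lines are all assumptions made for illustration.

```python
import json
from typing import Dict, List


def build_prompt(ocr_lines: List[str], schema: Dict[str, str]) -> str:
    """Combine the requested fields and the OCR text into one prompt string."""
    ocr_block = "\n".join(ocr_lines)
    return (
        "Extract the following fields from the attached image:\n"
        f"{json.dumps(schema, indent=2)}\n\n"
        "OCR text from the same image is given below. "
        "Use it as the reference for exact characters and numbers:\n"
        f"{ocr_block}\n\n"
        "Return only valid JSON."
    )


# Hypothetical inputs: field names with types, and lines from an OCR step.
schema = {"instrument_name": "str", "valuation": "str"}
ocr_lines = ["Instrument Name  Valuation", "Bond A  56,000"]

prompt = build_prompt(ocr_lines, schema)
print(prompt)
```

The resulting string would be sent to the Vision LLM together with the image, so the model has both the visual layout and the exact OCR characters.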

 

Wednesday, September 10, 2025

Advanced Structured Data Processing in Sparrow

I added instruction and validation functionality to Sparrow. This makes it possible to apply business logic to document data directly through a Sparrow query. For example, you can check whether given fields are present in the document.
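The field presence check can be illustrated on already extracted JSON. This is a sketch of the idea only: `missing_fields` and the sample document are hypothetical, and Sparrow performs this kind of validation through its own query interface rather than through this function.

```python
from typing import Any, Dict, List


def missing_fields(document: Dict[str, Any], required: List[str]) -> List[str]:
    """Return the required field names that are absent or empty."""
    return [
        name for name in required
        if name not in document or document[name] in (None, "")
    ]


# Hypothetical extraction result.
document = {"invoice_number": "INV-001", "total": "56,000", "due_date": ""}

missing = missing_fields(document, ["invoice_number", "total", "due_date", "iban"])
print(missing)
```

An empty result means every required field was found with a non-empty value, which is the kind of yes/no answer a downstream workflow can act on.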

 

Wednesday, August 27, 2025

Financial Table Structure Analysis with Computer Vision

I explain new functionality I'm implementing in Sparrow to pre-process tables with a grid structure. This greatly improves table data extraction by Vision LLMs.

 

Monday, June 16, 2025

Boost Vision LLM Accuracy with OCR Text Integration

I show an interesting approach where I send both an image and OCR text to a Vision LLM. The prompt is constructed to instruct the Vision LLM to prioritize the OCR text. This allows the use of a Vision LLM for structured output construction while relying on external OCR text, giving you more control over the results.

 

Tuesday, June 10, 2025

Solving Vision LLM Number Formatting Issues Using PaddleOCR and Sparrow

Discover how to fix number formatting errors in vision LLMs like Mistral! In this video, I show how Mistral misreads "56,000" as "56000" and how combining PaddleOCR’s text extraction with Sparrow’s spatial data processing solves this hallucination issue.
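One simple way to picture the fix is to look up the model's value among the OCR tokens and take the formatting from OCR. This is a simplified stand-in, not Sparrow's spatial data processing: `restore_formatting` is a hypothetical helper that matches on digits only and ignores where the text sits on the page.

```python
import re
from typing import List


def digits_only(text: str) -> str:
    """Strip everything except digits, so '56,000' and '56000' compare equal."""
    return re.sub(r"\D", "", text)


def restore_formatting(llm_value: str, ocr_tokens: List[str]) -> str:
    """Return the OCR token whose digits match the LLM value, if there is one."""
    target = digits_only(llm_value)
    if not target:
        return llm_value  # nothing numeric to match on
    for token in ocr_tokens:
        if digits_only(token) == target:
            return token
    return llm_value  # no match found, keep the model's output


ocr_tokens = ["Total", "56,000", "USD"]
fixed = restore_formatting("56000", ocr_tokens)
print(fixed)
```

Matching on digits alone cannot tell "5.60" from "560", which is one reason position-aware matching is the safer choice on real documents.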

 

Tuesday, June 3, 2025

PaddleOCR 3.0: Supercharge Your AI

I upgraded to PaddleOCR 3.0 and explain the new PaddleOCR API integration. My goal is to integrate OCR result output with Vision LLM processing to enhance large-scale, structured table data output. 

 

Monday, May 26, 2025

Box Annotations in Sparrow for Structured Data Extraction

Check out my video on Box Annotations in Sparrow for Structured Data Extraction! I’ll show you how the Qwen2.5 vision model pulls bounding box annotations from images based on what you need. I also show how to create simple descriptions and confidence score boxes.

 

Monday, May 19, 2025

Structured Data Annotation with Qwen2.5 VL and MLX-VLM

Qwen2.5 VL can provide bounding box coordinates and confidence values for extracted structured data. This is useful for visual data review and reporting. I will explain with a practical example what prompt should be used to ensure Qwen2.5 returns this data. 

 

Tuesday, May 13, 2025

LLM Microservice with Instruction Calling

I describe the idea of interacting with an LLM through a microservice with instruction calling. This works great for enterprise application use cases, such as data validation and workflow decisions.

 

Monday, May 5, 2025

Local LLM Instruction Processing with Sparrow

I explain how to execute instructions with a payload using a local LLM. This is useful when you want to process your data with an LLM and provide contextual instructions, specifying the desired outcome of what needs to be achieved. 

 

Monday, April 28, 2025

Vision LLM on Mac Mini M4 Pro: Real-World MLX Performance

I discuss the real-world MLX performance of Sparrow for structured data extraction with public access. The current Sparrow online instance runs on a Mac Mini M4 Pro with 64GB of memory. On average, it processes one page in 100 seconds. I explain why tokens-per-second measurements can be misleading when evaluating structured data extraction. 
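A small calculation shows what 100 seconds per page means for throughput, and one possible reason a tokens-per-second figure can mislead. The second part is my assumption, not necessarily the argument made in the video: time per page includes work that happens before any output token is generated. All numbers except the 100 seconds are illustrative, not measurements.

```python
def pages_per_hour(seconds_per_page: float) -> float:
    """Convert an average per-page time into hourly throughput."""
    return 3600.0 / seconds_per_page


def estimate_page_seconds(
    overhead_seconds: float, output_tokens: int, tokens_per_second: float
) -> float:
    """Total page time = fixed overhead (image and prompt processing) + decoding."""
    return overhead_seconds + output_tokens / tokens_per_second


# Figure from the post: one page in about 100 seconds.
print(pages_per_hour(100.0))

# Illustrative only: the setup with the higher tokens-per-second is still
# slower per page, because its overhead before decoding is larger.
fast_decode = estimate_page_seconds(80.0, 400, 40.0)
slow_decode = estimate_page_seconds(20.0, 400, 20.0)
print(fast_decode, slow_decode)
```

For structured data extraction, seconds per page is the number that maps directly to how many documents can be processed.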

 

Tuesday, April 15, 2025

Dashboard with Gradio Python

This video showcases the Sparrow dashboard, where you can view statistics on document data extraction events processed by Sparrow. This elegant dashboard is built with Python using Gradio, a server-side web UI framework.

 

Tuesday, March 25, 2025

Oracle DB 23ai Free Connection Pool in Python

I describe how to connect to Oracle DB from Python. I explain why a DB connection pool is important for better performance. The connection uses oracledb thin mode, without installing Oracle Client.
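A minimal pool setup with the `oracledb` package could look like the sketch below. The pool sizes are arbitrary example values, and the helper names `create_pool` and `fetch_one` are my own. Thin mode is the package default, so no Oracle Client libraries are required.

```python
from typing import Any, Optional, Tuple


def create_pool(user: str, password: str, dsn: str) -> Any:
    """Create a small connection pool (thin mode is the default)."""
    # Imported here so this module still loads where oracledb is not installed.
    import oracledb

    return oracledb.create_pool(
        user=user,
        password=password,
        dsn=dsn,
        min=1,        # connections opened up front (example value)
        max=4,        # upper bound on open connections (example value)
        increment=1,  # how many to add when the pool grows
    )


def fetch_one(pool: Any, sql: str) -> Optional[Tuple]:
    """Borrow a connection from the pool, run a query, return the first row."""
    with pool.acquire() as connection:
        with connection.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetchone()
```

Usage would be along the lines of `pool = create_pool("app_user", "secret", "localhost:1521/FREEPDB1")` followed by `fetch_one(pool, "select 1 from dual")`. The credentials and DSN here are placeholders for your own setup. Reusing pooled connections avoids paying the connection setup cost on every request.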

 

Monday, March 17, 2025

Temporary Files Cleaner for Gradio Web App

Learn how to implement an automatic temporary file cleanup solution for Gradio web applications. This tutorial shows you how to prevent disk space issues by periodically removing old upload files and folders that Gradio leaves behind. Perfect for developers who deploy Gradio apps in production environments or run memory-intensive applications. 
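The core of such a cleaner fits in a short standard-library function. This is a sketch rather than the code from the video: `clean_old_entries` is a hypothetical name, and the directory and age threshold are whatever suits your deployment.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def clean_old_entries(
    root: str, max_age_seconds: float, now: Optional[float] = None
) -> List[str]:
    """Delete top-level files and folders in `root` older than the threshold."""
    now = time.time() if now is None else now
    removed: List[str] = []
    root_path = Path(root)
    if not root_path.is_dir():
        return removed
    for entry in root_path.iterdir():
        # lstat() avoids following symlinks, so broken links do not raise.
        age = now - entry.lstat().st_mtime
        if age <= max_age_seconds:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry, ignore_errors=True)
        else:
            entry.unlink(missing_ok=True)
        removed.append(entry.name)
    return sorted(removed)
```

To run it periodically, call it from a background thread or a scheduled job, for example once an hour with a threshold that is safely longer than any request your app handles.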

 

Wednesday, March 12, 2025

Building AI Agent for Local Structured JSON Output

I explain the key steps of building an AI agent to process a document and extract structured JSON data locally. I'm running it with Sparrow and using a Qwen VL model as the vision processing backend and for OCR. The steps are explained with a Sparrow code walkthrough.

 

Monday, March 3, 2025

Querying Non Existing Fields with Qwen2.5 Vision LLM

I describe how Sparrow helps to query non-existing fields with the Qwen2.5 Vision LLM. I'm running it locally with MLX and MLX-VLM.