python

Building a Production Vision RAG System with ColPali and Light-ColPali

We took ColPali (vision-language embeddings for documents) and Light-ColPali (token merging via hierarchical clustering) and built the production infrastructure around them. The system uses PostgreSQL + pgvector as a unified store, a lease-based job queue for resilient ingestion, and a two-stage retrieval pipeline that retrieves at patch granularity but ranks at page level.

The key insight: text extraction is lossy. For documents with complex layouts, charts, and tables, embedding the rendered page as an image solves problems that text-based RAG can’t touch.
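To make the "retrieve at patch granularity, rank at page level" idea concrete, here is a minimal in-memory sketch. It is not the post's actual implementation. It assumes stage 1 is a per-query-token nearest-neighbour search over patch embeddings (in the real system this would be a pgvector query, replaced here by a brute-force scan), and that stage 2 regroups the candidate patches by page and scores each page with a ColBERT-style MaxSim sum. The names `retrieve_patches`, `rank_pages`, and the `(page_id, embedding)` tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Patch = Tuple[str, Vector]  # (page_id, patch_embedding); hypothetical layout


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product; assumes both vectors have the same length."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_patches(query_tokens: List[Vector], patches: List[Patch], k: int) -> List[int]:
    """Stage 1: for each query token, keep the indices of its k most similar patches.

    Brute-force stand-in for an approximate nearest-neighbour index.
    """
    candidates = set()
    for q in query_tokens:
        ranked = sorted(
            range(len(patches)),
            key=lambda i: dot(q, patches[i][1]),
            reverse=True,
        )
        candidates.update(ranked[:k])
    return sorted(candidates)


def rank_pages(
    query_tokens: List[Vector], patches: List[Patch], candidate_ids: List[int]
) -> List[Tuple[str, float]]:
    """Stage 2: group candidate patches by page and score each page with MaxSim.

    Page score = sum over query tokens of the best-matching patch similarity.
    """
    by_page: Dict[str, List[Vector]] = defaultdict(list)
    for i in candidate_ids:
        page_id, emb = patches[i]
        by_page[page_id].append(emb)

    scores = {
        page: sum(max(dot(q, p) for p in embs) for q in query_tokens)
        for page, embs in by_page.items()
    }
    # Highest-scoring page first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: two pages with two 2-d patch embeddings each.
patches = [
    ("page-a", [1.0, 0.0]),
    ("page-a", [0.0, 1.0]),
    ("page-b", [1.0, 0.0]),
    ("page-b", [0.5, 0.5]),
]
query = [[1.0, 0.0], [0.0, 1.0]]

candidate_ids = retrieve_patches(query, patches, k=2)
ranking = rank_pages(query, patches, candidate_ids)
```

In this toy example both pages match the first query token equally well, but only `page-a` has a patch that matches the second token exactly, so it ranks first. Scoring only the candidate patches, rather than every patch of every page, is one plausible way to keep stage 2 cheap; a real system might instead fetch all patches of the candidate pages before scoring.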

  • March 8, 2026

Transcribly

GitHub Repository | 2023 – Present | Full-stack development | Technologies: Python, Flask, React, OpenAI API, FFmpeg, pydub, MoviePy,...

  • December 16, 2023

MOJO: SUPER PYTHON

Introducing Mojo: The Next Evolution in Programming Speed. In the dynamic realm of programming languages, a new contender...

  • December 10, 2023