What I deliver

Four AI capability areas — pick one or combine them into a full pipeline.

Local LLM Integration

llama.cpp, Ollama, Qwen, and Llama 3, running on CPU with no API key required and no data leaving the machine. Well suited to desktop apps and air-gapped environments.

llama.cpp
Ollama
Qwen
Llama 3

Document OCR & Extraction

PDF and image to structured JSON. Invoice parsing, contract analysis, receipt extraction — with field validation and confidence scoring.

pdfjs
Tesseract
JSON schema
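To make "field validation and confidence scoring" concrete, here is a minimal sketch of how extracted fields might be checked before they are trusted. The field names (`invoiceNumber`, `date`, `total`), the 0 to 1 confidence scale, and the 0.8 threshold are all assumptions for illustration, not a description of any specific project.

```typescript
// Hypothetical shape of one extracted field: a raw string value plus
// a 0..1 confidence reported by the extraction step (an assumption).
interface ExtractedField {
  value: string;
  confidence: number;
}

type Invoice = Record<string, ExtractedField>;

interface FieldIssue {
  field: string;
  reason: string;
}

// Illustrative rules only; real invoices need locale-aware checks.
const validators: Record<string, (v: string) => boolean> = {
  invoiceNumber: (v) => v.trim().length > 0,
  date: (v) => /^\d{4}-\d{2}-\d{2}$/.test(v) && !Number.isNaN(Date.parse(v)),
  total: (v) => /^\d+(\.\d{1,2})?$/.test(v),
};

// Return every expected field that is missing, malformed, or below the
// confidence threshold, so it can be routed to manual review.
function reviewInvoice(invoice: Invoice, minConfidence = 0.8): FieldIssue[] {
  const issues: FieldIssue[] = [];
  for (const [field, isValid] of Object.entries(validators)) {
    const extracted = invoice[field];
    if (extracted === undefined) {
      issues.push({ field, reason: "missing" });
    } else if (!isValid(extracted.value)) {
      issues.push({ field, reason: "invalid format" });
    } else if (extracted.confidence < minConfidence) {
      issues.push({ field, reason: "low confidence" });
    }
  }
  return issues;
}
```

An empty result means every expected field passed; anything else is a list of fields worth showing to a human before the record is saved.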

RAG Pipelines

Vector embeddings, semantic search, chat over your documents. Built with pgvector or SQLite-vec for fully local or cloud-hosted retrieval.

pgvector
SQLite-vec
LangChain
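At its core, semantic search ranks stored text chunks by how close their embeddings are to the query's embedding. The sketch below does that in memory with cosine similarity. It assumes the embeddings have already been computed elsewhere, and the `Chunk` type and `topK` helper are illustrative names. In a real deployment a vector store such as pgvector or SQLite-vec would typically do this ranking inside the database.

```typescript
// A stored document chunk with a precomputed embedding (assumed shape).
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force retrieval: score every chunk, then keep the k best.
function topK(
  query: number[],
  chunks: Chunk[],
  k: number,
): Array<Chunk & { score: number }> {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The retrieved chunks are then placed into the model's prompt as context, which is the "chat over your documents" part.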

Claude / OpenAI API Integration

Tool use, function calling, streaming responses, structured output, and careful prompt engineering. When you need cloud AI with full control.

Claude API
OpenAI
Streaming

Case Study

Invoice OCR + local chat — jaklens.ai

What was built

An Electron desktop app that extracts invoice data from PDFs and images using a locally running Qwen2.5 1.5B model via llama.cpp. No internet connection is required, and no data leaves the user's machine.

The AI pipeline: pdfjs renders the document → base64 image is passed to the GGUF model → a structured JSON response is parsed and written to SQLite. The user can then chat with the AI about their invoice history.
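The "structured JSON response is parsed" step is where small local models tend to need the most care, because they often wrap the JSON in extra prose or a code fence. The following is a minimal sketch of that parsing step, not the app's actual code: the `InvoiceRecord` fields and both function names are hypothetical, and the SQLite write is left out because it depends on the driver in use.

```typescript
// Hypothetical invoice record; the field names are assumptions.
interface InvoiceRecord {
  vendor: string;
  invoiceNumber: string;
  total: number;
}

// Parse the text between the first "{" and the last "}" in raw model
// output, ignoring any surrounding prose or code-fence markers.
function extractJsonObject(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in model output");
  }
  return JSON.parse(raw.slice(start, end + 1));
}

// Check types and normalise the total so a bad response fails loudly
// here instead of reaching the database.
function toInvoiceRecord(raw: string): InvoiceRecord {
  const parsed = extractJsonObject(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("model output is not a JSON object");
  }
  const { vendor, invoiceNumber, total } = parsed as Record<string, unknown>;
  if (typeof vendor !== "string" || typeof invoiceNumber !== "string") {
    throw new Error("vendor and invoiceNumber must be strings");
  }
  // Models sometimes return numbers as strings; accept both forms.
  const amount =
    typeof total === "number"
      ? total
      : typeof total === "string"
        ? Number.parseFloat(total)
        : Number.NaN;
  if (!Number.isFinite(amount)) {
    throw new Error("total is missing or not numeric");
  }
  return { vendor, invoiceNumber, total: amount };
}
```

A record that passes these checks can be written to SQLite as one row; a response that throws can be retried or flagged for manual entry.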

Local AI model: Qwen2.5
Cloud dependencies: 0
Installer with model: 418 MB

Read the full case study

Technologies

The tools I use to deliver production AI features.

llama.cpp
Qwen2.5
LangChain
pgvector
SQLite-vec
pdfjs
Claude API
OpenAI
Whisper
Node.js

Have an AI feature you need built?

Whether it's a local LLM, a document parser, or a full RAG pipeline — let's talk about what you need.

Start a conversation

AI Workflow Integration Services