Four AI capability areas — pick one or combine them into a full pipeline.
llama.cpp, Ollama, Qwen, Llama 3. Inference runs on CPU with no API key, and prompts and documents never leave the machine. Well suited to desktop apps and air-gapped environments.
PDF and image to structured JSON. Invoice parsing, contract analysis, receipt extraction — with field validation and confidence scoring.
Vector embeddings, semantic search, chat over your documents. Built with pgvector or SQLite-vec for fully local or cloud-hosted retrieval.
Tool use, function calling, streaming responses, structured output, and careful prompt engineering. When you need cloud AI with full control.
An Electron desktop app that extracts invoice data from PDFs and images using a locally running Qwen2.5 1.5B model via llama.cpp. No internet connection is required, and no data leaves the user's machine.
The AI pipeline: PDF.js renders the document → the base64-encoded image is passed to the GGUF model → the model's structured JSON response is parsed and written to SQLite. The user can then chat with the AI about their invoice history.
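As a rough illustration of the "parse the structured JSON response" step, here is a minimal TypeScript sketch. It is not the app's actual code: the `Invoice` fields, `extractJsonObject`, and `parseInvoiceResponse` are hypothetical names chosen for this example, and it assumes the model may wrap its JSON in surrounding prose, so the response is validated before anything would be stored.

```typescript
// Hypothetical invoice shape; the real schema is not described in the text.
interface Invoice {
  vendor: string;
  invoiceNumber: string;
  total: number;
  currency: string;
}

type ParseResult =
  | { ok: true; invoice: Invoice }
  | { ok: false; errors: string[] };

// Return the text from the first "{" to the last "}" in the raw model output,
// or null if no such span exists. Small models often add prose around the JSON.
function extractJsonObject(raw: string): string | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  return start !== -1 && end > start ? raw.slice(start, end + 1) : null;
}

// Parse and validate a model response; collect every field error instead of
// stopping at the first one, so the caller can decide whether to retry.
function parseInvoiceResponse(raw: string): ParseResult {
  const json = extractJsonObject(raw);
  if (json === null) return { ok: false, errors: ["no JSON object found"] };

  let record: Record<string, unknown>;
  try {
    record = JSON.parse(json) as Record<string, unknown>;
  } catch {
    return { ok: false, errors: ["response is not valid JSON"] };
  }

  const errors: string[] = [];
  const readString = (key: string): string => {
    const value = record[key];
    if (typeof value === "string" && value.trim() !== "") return value.trim();
    errors.push(`${key}: expected a non-empty string`);
    return "";
  };

  const vendor = readString("vendor");
  const invoiceNumber = readString("invoiceNumber");
  const currency = readString("currency");

  const total = record["total"];
  if (typeof total !== "number" || !Number.isFinite(total) || total < 0) {
    errors.push("total: expected a non-negative number");
  }

  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    invoice: { vendor, invoiceNumber, currency, total: total as number },
  };
}
```

In a pipeline like the one described, a successful result could be handed to the SQLite write step, while a failed one could trigger a retry or flag the document for manual review.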
The tools I use to deliver production AI features.
Whether it's a local LLM, a document parser, or a full RAG pipeline — let's talk about what you need.
Start a conversation