AI Workflow Engineering

AI Workflow Integration Services

Embed AI into your product. Privately.

On-device or cloud AI with no black-box SaaS. With local models there are no surprise API bills and no user data leaving your server. I design and ship AI features that are fast, private, and maintainable.

What I deliver

Four AI capability areas — pick one or combine them into a full pipeline.

Local LLM Integration

llama.cpp, Ollama, Qwen, Llama 3: models run on CPU with no API key required, and prompts and documents never leave the machine. A good fit for desktop apps and air-gapped environments.

  • llama.cpp
  • Ollama
  • Qwen
  • Llama 3

Document OCR & Extraction

PDF and image to structured JSON. Invoice parsing, contract analysis, receipt extraction — with field validation and confidence scoring.

  • pdfjs
  • Tesseract
  • JSON schema
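
To make "field validation and confidence scoring" concrete, here is a minimal Python sketch. It assumes the extraction step has already produced a dictionary of field values and a matching dictionary of per-field confidences. The field names, the 0.8 threshold, and the `validate_invoice` helper are all hypothetical, chosen for illustration rather than taken from a specific project.

```python
from dataclasses import dataclass
from datetime import date
import re

# Hypothetical required fields; a real project would define its own schema.
REQUIRED_FIELDS = ("invoice_number", "issue_date", "total")


@dataclass
class FieldResult:
    name: str
    value: object
    valid: bool
    confidence: float  # 0.0 to 1.0, as reported by the extraction step


def _is_valid(name: str, value: object) -> bool:
    """Apply a simple per-field check; other fields pass if non-empty."""
    if value is None or value == "":
        return False
    if name == "issue_date":
        try:
            date.fromisoformat(str(value))
            return True
        except ValueError:
            return False
    if name == "total":
        # Digits with an optional one- or two-digit decimal part.
        return re.fullmatch(r"\d+(\.\d{1,2})?", str(value)) is not None
    return True


def validate_invoice(fields: dict, confidences: dict, threshold: float = 0.8):
    """Return per-field results plus the names that need human review."""
    results = []
    for name in REQUIRED_FIELDS:
        value = fields.get(name)
        conf = float(confidences.get(name, 0.0))
        results.append(FieldResult(name, value, _is_valid(name, value), conf))
    # A field is flagged when it fails validation or falls below the threshold.
    needs_review = [r.name for r in results if not r.valid or r.confidence < threshold]
    return results, needs_review
```

Fields that fail a check or score below the threshold can then be routed to a manual review queue instead of being written straight to the database.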

RAG Pipelines

Vector embeddings, semantic search, chat over your documents. Built with pgvector or SQLite-vec for fully local or cloud-hosted retrieval.

  • pgvector
  • SQLite-vec
  • LangChain
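
The core idea behind semantic search is ranking documents by how close their embedding vectors are to the query's vector. The sketch below shows that ranking step in plain Python with cosine similarity. The three-dimensional vectors in `DOCS` are hand-made placeholders, not output from a real embedding model, and in a real pipeline the comparison would typically run inside pgvector or SQLite-vec rather than in application code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def top_k(query_vec, documents, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Placeholder corpus: (text, toy embedding). Real embeddings have hundreds of dimensions.
DOCS = [
    ("Invoice from ACME for office chairs", [0.9, 0.1, 0.0]),
    ("Employment contract, notice period", [0.1, 0.9, 0.1]),
    ("Receipt for taxi ride", [0.7, 0.0, 0.6]),
]
```

The retrieved texts are then placed into the prompt so the model answers from your documents rather than from memory.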

Claude / OpenAI API Integration

Tool use, function calling, streaming responses, structured output, and careful prompt engineering. When you need cloud AI with full control.

  • Claude API
  • OpenAI
  • Streaming

Case Study

Invoice OCR + local chat — jaklens.ai

What was built

An Electron desktop app that extracts invoice data from PDFs and images using a locally running Qwen2.5 1.5B model via llama.cpp. No internet connection is required, and no data ever leaves the user's machine.

The AI pipeline: pdfjs renders the document → base64 image is passed to the GGUF model → a structured JSON response is parsed and written to SQLite. The user can then chat with the AI about their invoice history.
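
The last stage of that pipeline, parsing the model's JSON and writing it to SQLite, can be sketched as follows. This is an illustration in Python using only the standard library, not the app's actual code. The `invoices` table layout, the `vendor` and `total` keys, and both helper functions are assumptions made for the example.

```python
import json
import sqlite3


def extract_json(raw: str) -> dict:
    """Parse the span from the first '{' to the last '}' in the model output.

    Local models sometimes wrap JSON in extra prose, so the surrounding
    text is discarded before parsing.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(raw[start:end + 1])


def save_invoice(conn: sqlite3.Connection, raw_response: str) -> int:
    """Store one parsed invoice and return its row id (hypothetical schema)."""
    data = extract_json(raw_response)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices ("
        "id INTEGER PRIMARY KEY, vendor TEXT, total REAL, raw_json TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO invoices (vendor, total, raw_json) VALUES (?, ?, ?)",
        (data.get("vendor"), float(data["total"]), json.dumps(data)),
    )
    conn.commit()
    return cur.lastrowid
```

Keeping the raw JSON alongside the typed columns makes it possible to re-derive fields later without re-running the model.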

  • Local AI model: Qwen2.5
  • Cloud dependencies: 0
  • Installer with model: 418 MB
Read the full case study

Technologies

The tools I use to deliver production AI features.

  • llama.cpp
  • Qwen2.5
  • LangChain
  • pgvector
  • SQLite-vec
  • pdfjs
  • Claude API
  • OpenAI
  • Whisper
  • Node.js

Have an AI feature you need built?

Whether it's a local LLM, a document parser, or a full RAG pipeline — let's talk about what you need.

Start a conversation