Prompt Versioning in Production: Managing LLM Prompts Like Code
How EasyCommerce manages LLM prompts across Claude, OpenAI, and DeepSeek with a versioned registry, structured evals in CI, and a rollback path that does not require a deployment.
RAG Without a Framework: A Minimal pgvector Pipeline with Claude
How to build a complete RAG pipeline in Python without LangChain or LlamaIndex — chunking, embedding with text-embedding-3-large, pgvector retrieval, and grounded generation with Claude — with the tradeoffs that matter at production scale.
Building a Claude Code MCP Server in Python: Lessons from codebase-research-agent
How codebase-research-agent exposes a tool-using AI agent as a Claude Code MCP server — JSON-RPC over Streamable HTTP, tool registration, streaming SSE, and the four things that broke in production before it worked reliably.
How I Cut Edit Distance from 168 to 43 in a Legal Document RAG Pipeline
Three architectural changes cut average edit distance from 168 to 43: section-scoped chunking, section-scoped exemplar prompting, and a switch from a general embedding model to a legal-domain one. A CI eval suite made the improvement trajectory visible and reproducible.
Designing a ReAct-Style Codebase Research Agent with Tool Use
How codebase-research-agent uses a ReAct loop to orchestrate semantic search, AST navigation, symbol lookup, grep, and git blame as callable tools — the hybrid retrieval substrate it operates on, and why framing RAG as a tool rather than a pipeline changes the architecture fundamentally.
Building a 12-Source Job Discovery Pipeline with RAG and pgvector
How JobPulse aggregates job listings across 12 sources in three tiers, embeds your resume into pgvector, and generates grounded cover letters via Claude — including the async adapter protocol, composite scoring model, and the three retrieval failure modes that shaped the final architecture.
Scaling a Laravel MCP Tool Registry to 46 Tools
The May 12 post covered the MCP server architecture. This one goes inside the tool registry: class-per-tool structure, Sanctum ability scoping, JSON Schema validation before dispatch, tag-based auto-discovery, and the three things that broke when the registry hit 46 tools.
Stop Using Fixed-Size Chunks for Technical Documentation
Why naive fixed-size chunking breaks on code-heavy documentation, and the heading-aware approach with paragraph fallback and overlap that replaced it in this portfolio's RAG pipeline.
Voyage AI vs OpenAI Embeddings for Technical RAG in PHP
Why I switched from OpenAI text-embedding-3-small to Voyage AI voyage-code-3 for this portfolio's RAG layer — model comparison, the input_type asymmetry Voyage requires, measured retrieval improvement, and the migration path that keeps search live throughout.
A Claude Tool-Calling Loop in Laravel: From First Request to Final Answer
The exact pattern for registering tools, dispatching multi-turn Claude conversations, and processing tool results in a Laravel service — including the three production failure modes the happy path does not cover.
Embedding LLMs in a WordPress Plugin: EasyCommerce's Async Architecture
How EasyCommerce wires LLM calls — product description generation, fraud detection, and inventory forecasting — into a WordPress plugin using async dispatch and a provider abstraction that survives outages and scales across catalogues.
MCP Servers for Laravel: A Production Pattern
How I built the MCP server that powers AI access to this portfolio — JSON-RPC over Streamable HTTP, Sanctum bearer auth, an ability-scoped tool registry, and the gotchas worth knowing before you ship.