Voyage AI vs OpenAI Embeddings for Technical RAG in PHP
For most RAG tutorials, the embedding model is an afterthought — pick whatever the quickest example uses and move on. That works until you try to retrieve code snippets, function names, or technical prose, and your similarity search starts returning documentation from the wrong language or the wrong abstraction level.
I use Voyage AI's voyage-code-3 for the RAG layer in this portfolio. I switched from OpenAI's text-embedding-3-small three months in. Here is what changed, what it cost, and how to swap providers without rewriting your retrieval layer.
The abstraction you need first
Before comparing providers, abstract the embedding call behind an interface. Switching models later — whether for cost, quality, or compliance — must not require changing retrieval code.
interface EmbeddingProvider
{
/**
* Generate an embedding vector for a single text input.
*
* @param string $text Text to embed.
* @return list<float> Embedding vector.
* @throws EmbeddingException On API failure.
*/
public function embed(string $text): array;
/**
* Generate embedding vectors for multiple texts in a single API call.
*
* @param list<string> $texts Texts to embed.
* @return list<list<float>> Vectors in the same order as input.
* @throws EmbeddingException On API failure.
*/
public function embedBatch(array $texts): array;
/**
* Number of dimensions in this model's output vectors.
*/
public function dimensions(): int;
}

Store the model name and dimensions in config, not in the implementation. When you switch providers, only the bound class and the pgvector column size change.
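One detail the interface's docblocks reference but the post never defines is EmbeddingException. A minimal sketch of what they assume (only the class name comes from the docblocks; the rest is illustrative):

class EmbeddingException extends \RuntimeException
{
    // Wrap the underlying HTTP or client exception so callers of the
    // interface only ever have to catch one error type.
    public static function fromApiFailure(\Throwable $previous): self
    {
        return new self('Embedding API call failed: ' . $previous->getMessage(), 0, $previous);
    }
}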
OpenAI implementation
class OpenAiEmbeddingProvider implements EmbeddingProvider
{
private const MODEL = 'text-embedding-3-small';
private const DIMENSIONS = 1536;
public function __construct(private readonly \OpenAI\Client $client) {}
public function embed(string $text): array
{
$response = $this->client->embeddings()->create([
'model' => self::MODEL,
'input' => $text,
]);
return $response->embeddings[0]->embedding;
}
public function embedBatch(array $texts): array
{
$response = $this->client->embeddings()->create([
'model' => self::MODEL,
'input' => $texts,
]);
return array_map(fn($e) => $e->embedding, $response->embeddings);
}
public function dimensions(): int
{
return self::DIMENSIONS;
}
}

Voyage AI implementation
Voyage has no official PHP client, so this hits the REST API directly via Guzzle.
class VoyageEmbeddingProvider implements EmbeddingProvider
{
private const BASE_URL = 'https://api.voyageai.com/v1';
private const MODEL = 'voyage-code-3';
private const DIMENSIONS = 1024;
public function __construct(
private readonly \GuzzleHttp\Client $http,
private readonly string $apiKey,
private readonly string $inputType = 'document',
) {}
public function embed(string $text): array
{
return $this->embedBatch([$text])[0];
}
/**
* @param list<string> $texts
* @return list<list<float>>
* @throws EmbeddingException
*/
public function embedBatch(array $texts): array
{
$response = $this->http->post(self::BASE_URL . '/embeddings', [
'headers' => [
'Authorization' => "Bearer {$this->apiKey}",
'Content-Type' => 'application/json',
],
'json' => [
'model' => self::MODEL,
'input' => $texts,
'input_type' => $this->inputType,
],
]);
$body = json_decode(
$response->getBody()->getContents(),
true,
512,
JSON_THROW_ON_ERROR
);
usort($body['data'], fn($a, $b) => $a['index'] <=> $b['index']);
return array_map(fn($item) => $item['embedding'], $body['data']);
}
public function dimensions(): int
{
return self::DIMENSIONS;
}
}

The usort on index is not optional. The Voyage API does not guarantee batched results return in input order — the documentation notes this explicitly. OpenAI's batch endpoint maintains order, but sort defensively there too.
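For the OpenAI provider that defensive sort is a two-line change in embedBatch, assuming the client exposes the API's index field on each embedding object (openai-php does):

public function embedBatch(array $texts): array
{
    $response = $this->client->embeddings()->create([
        'model' => self::MODEL,
        'input' => $texts,
    ]);

    // Copy to a plain array and sort by the index the API reports
    // instead of trusting the response order.
    $embeddings = $response->embeddings;
    usort($embeddings, fn($a, $b) => $a->index <=> $b->index);

    return array_map(fn($e) => $e->embedding, $embeddings);
}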
The input_type asymmetry
Voyage supports an input_type parameter that OpenAI's API does not have. Set it to 'document' when indexing chunks and 'query' when embedding the user's search term. The model was trained to align query vectors with document vectors across this asymmetry — using 'document' for both degrades retrieval precision measurably.
This means the provider you bind for indexing and the one you bind for retrieval need different configuration. I register two instances in the service container:
// AppServiceProvider
$this->app->bind(EmbeddingProvider::class, fn() => new VoyageEmbeddingProvider(
app(\GuzzleHttp\Client::class),
config('services.voyage.key'),
inputType: 'document', // default for indexing
));
$this->app->bind('embedding.query', fn() => new VoyageEmbeddingProvider(
app(\GuzzleHttp\Client::class),
config('services.voyage.key'),
inputType: 'query', // for retrieval
));

Retrieval code resolves 'embedding.query' explicitly. Everything else gets the document provider. OpenAI users can ignore this — both directions use the same model call.
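In practice the retrieval side looks something like this; the SearchService name and method are illustrative, not from the post, and the column name assumes the pre-migration schema:

class SearchService
{
    public function search(string $query, int $limit = 5): array
    {
        // Resolve the query-tuned provider; indexing code keeps using the
        // default EmbeddingProvider binding with input_type 'document'.
        /** @var EmbeddingProvider $provider */
        $provider = app('embedding.query');

        $vector = $provider->embed($query);

        return DB::select(
            'SELECT id, content, embedding <=> ? AS distance
             FROM document_chunks
             ORDER BY distance
             LIMIT ?',
            [json_encode($vector), $limit]
        );
    }
}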
Model comparison
| Model | Dimensions | Cost / 1M tokens | Best for |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02 | General text |
| text-embedding-3-large | 3072 | $0.13 | General text, high precision |
| voyage-3-lite | 512 | $0.02 | Latency-sensitive, general |
| voyage-3 | 1024 | $0.06 | General text, strong baseline |
| voyage-code-3 | 1024 | $0.18 | Code, technical prose |
voyage-code-3 is the outlier: same price tier as text-embedding-3-large but trained specifically on code and technical documentation. For a corpus of posts mixing PHP, SQL, and shell code, it retrieves the right chunk more often than any general model at comparable cost.
On this portfolio's corpus — around 180 chunked documents — switching from text-embedding-3-small to voyage-code-3 improved top-3 retrieval precision by 22 percentage points on a set of 40 manually evaluated queries. The sample size is too small for a publishable benchmark, but the directional signal was consistent across every query category I tested: function name lookup, error message matching, and architectural concept retrieval all improved; general prose retrieval stayed flat.
If your corpus is predominantly prose with occasional code references, voyage-3 at $0.06/1M is the better starting point. voyage-code-3 earns its premium only when code retrieval is the primary concern.
Migration without downtime
Changing providers means the existing pgvector column is full of incompatible vectors. You cannot mix OpenAI and Voyage embeddings in the same column and get meaningful similarity scores — the spaces are different. The migration adds a new column, back-fills it, and cuts over the retrieval query.
// 1. Add the new column for voyage-code-3's 1024-dim vectors
Schema::table('document_chunks', function (Blueprint $table): void {
$table->vector('embedding_v2', 1024)->nullable();
});

// 2. Back-fill via a queued job to avoid timeouts
class ReEmbedChunkJob implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable;
public function __construct(private readonly int $chunkId) {}
public function handle(EmbeddingProvider $provider): void
{
$chunk = DocumentChunk::findOrFail($this->chunkId);
$chunk->update(['embedding_v2' => $provider->embed($chunk->content)]);
}
}
// Dispatch staggered to respect Voyage's rate limits. The delay counter
// lives outside the closure so it keeps growing across chunks instead of
// resetting to zero every 100 jobs.
$delay = 0;
DocumentChunk::query()
    ->whereNull('embedding_v2')
    ->chunkById(100, function ($chunks) use (&$delay): void {
        foreach ($chunks as $chunk) {
            ReEmbedChunkJob::dispatch($chunk->id)
                ->delay(now()->addSeconds($delay += 2));
        }
    });

// 3. Once embedding_v2 is fully populated, point retrieval at the new column
$results = DB::select(
'SELECT id, content, embedding_v2 <=> ? AS distance
FROM document_chunks
ORDER BY distance
LIMIT ?',
[json_encode($queryVector), $limit]
);

Keeping both columns live during the migration means your search endpoint stays up throughout. Only drop embedding after you have validated retrieval quality on your eval set against the new column.
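The cleanup step is implied by the post rather than shown in it; once the eval set passes against the new column, it is a one-line migration:

// 4. Drop the old 1536-dim column after validating retrieval on embedding_v2
Schema::table('document_chunks', function (Blueprint $table): void {
    $table->dropColumn('embedding');
});

Renaming embedding_v2 back to embedding afterwards is optional, but saves future readers of the schema a question.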
What I would change
Build the evaluation harness before picking a model, not after. I chose text-embedding-3-small first because it was the path of least resistance, noticed retrieval quality issues in production, switched to voyage-code-3, and only then built the 40-query eval set to confirm the improvement. Building the eval set first would have made the right choice obvious from day one and avoided the migration entirely. Twenty manually written query-answer pairs from your actual corpus, run against each candidate model, is enough signal to decide.
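A minimal version of that harness, assuming a hand-written array of query-to-expected-chunk pairs and the provider interface from earlier (the function and key names are illustrative):

// Top-3 precision for one candidate provider over a small eval set.
function evaluateProvider(EmbeddingProvider $provider, array $evalSet): float
{
    $hits = 0;

    foreach ($evalSet as $case) {
        // $case = ['query' => '...', 'expected_chunk_id' => 123]
        $vector = $provider->embed($case['query']);

        $top = DB::select(
            'SELECT id FROM document_chunks
             ORDER BY embedding <=> ?
             LIMIT 3',
            [json_encode($vector)]
        );

        $topIds = array_map(fn($row) => (int) $row->id, $top);

        if (in_array($case['expected_chunk_id'], $topIds, true)) {
            $hits++;
        }
    }

    return $hits / count($evalSet);
}

The catch is that each candidate model needs its own embedded copy of the corpus before you can score it, so for anything beyond a couple of models it is worth scripting the back-fill from the start.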