Vector Search in Production: Patterns for AI-Native Applications

If you’re building a Retrieval-Augmented Generation (RAG) application on Oracle 23ai, here are the architectural patterns that work best in production — based on real implementations.

The basic RAG pipeline with Oracle:

-- 1. Store documents and embeddings
INSERT INTO knowledge_base (doc_id, content, embedding)
VALUES (:id, :text, TO_VECTOR(:embedding_array));

-- 2. At query time: find the most relevant documents
SELECT content, VECTOR_DISTANCE(embedding, :query_embed, COSINE) AS score
FROM   knowledge_base
WHERE  category = :filter_category    -- combine with relational filter
ORDER BY score
FETCH FIRST 5 ROWS ONLY;

Chunking strategy matters:

Large documents should be split into chunks (paragraphs, sentences, fixed token windows) before embedding. Each chunk becomes a separate row. The optimal chunk size depends on your embedding model — typically 256-512 tokens for OpenAI models.

-- Store chunks with parent document reference
CREATE TABLE knowledge_chunks (
    chunk_id   NUMBER GENERATED ALWAYS AS IDENTITY,
    doc_id     NUMBER REFERENCES knowledge_base(doc_id),
    chunk_seq  NUMBER,          -- position in document
    chunk_text CLOB,
    embedding  VECTOR(1536, FLOAT32),
    CONSTRAINT pk_chunk PRIMARY KEY (chunk_id)
);

Hybrid search — vector + keyword:

Pure semantic search misses exact matches. Combine vector similarity with Oracle Text full-text search:

SELECT k.doc_id, k.chunk_text,
       VECTOR_DISTANCE(k.embedding, :q_vec, COSINE) AS semantic_score,
       SCORE(1) AS keyword_score
FROM   knowledge_chunks k
WHERE  CONTAINS(k.chunk_text, :keyword, 1) > 0     -- keyword match
ORDER BY (0.6 * semantic_score + 0.4 * (SCORE(1)/100))
FETCH FIRST 10 ROWS ONLY;

Metadata filtering before vector search:

Always apply relational filters before the vector index scan to reduce the candidate set:

SELECT content, VECTOR_DISTANCE(embedding, :q_vec, COSINE) AS score
FROM   knowledge_chunks
WHERE  doc_category = 'LEGAL'           -- narrow the search space first
AND    created_date > DATE '2024-01-01'
ORDER BY score
FETCH FIRST 5 ROWS ONLY;

Oracle’s vector index (HNSW) performs much better when the candidate pool is pre-filtered to relevant documents.

Discover more from grepOra

Subscribe now to keep reading and get access to the full archive.

Continue reading