Oracle AI Vector Search: A New Data Type for the AI Era

The most AI-forward feature in Oracle 23ai is AI Vector Search: a new VECTOR data type and a set of functions for storing, indexing, and querying vector embeddings directly in Oracle Database.

Vector embeddings are numerical representations of unstructured content (text, images, audio) generated by AI models. They’re the foundation of semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation) architectures.

The VECTOR data type:

CREATE TABLE documents (
    doc_id      NUMBER,
    content     CLOB,
    embedding   VECTOR(1536, FLOAT32)  -- 1536 dimensions, the output size of OpenAI text-embedding-ada-002
);

Inserting a vector:

INSERT INTO documents (doc_id, content, embedding)
VALUES (1, 'Oracle 23ai brings AI features to the database.',
        TO_VECTOR('[0.023, -0.145, 0.871, ...]'));
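In practice the embedding comes from a model, not from typing numbers by hand. Here is a minimal sketch of turning a list of floats into the bracketed text form used in the INSERT above. The helper name to_vector_literal is hypothetical, and the three-element vector is only for illustration; a VECTOR(1536, FLOAT32) column needs all 1536 values.

```python
def to_vector_literal(values):
    """Format a sequence of numbers as a '[v1,v2,...]' string for TO_VECTOR.

    Hypothetical helper: it only builds the text, it does not talk to the database.
    """
    if not values:
        raise ValueError("an embedding must have at least one dimension")
    # repr(float(v)) keeps the shortest text that round-trips the float exactly.
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


literal = to_vector_literal([0.023, -0.145, 0.871])
print(literal)  # [0.023,-0.145,0.871]
```

The resulting string would typically be passed as a bind variable, for example TO_VECTOR(:embedding_text), rather than concatenated into the SQL statement.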

Semantic similarity search:

SELECT doc_id, content,
       VECTOR_DISTANCE(embedding, :query_vector, COSINE) AS distance  -- smaller = more similar
FROM   documents
ORDER BY distance
FETCH FIRST 5 ROWS ONLY;
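To see why the query sorts in ascending order, it helps to spell out what a cosine distance is. The sketch below assumes the usual definition (one minus cosine similarity) and does an exact, brute-force top-k over a small in-memory list. The function names cosine_distance and top_k are illustrative, not Oracle APIs.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 for same direction, 2 for opposite."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k):
    """Exact search: score every (doc_id, embedding) row, keep the k closest."""
    scored = sorted((cosine_distance(emb, query), doc_id) for doc_id, emb in rows)
    return [doc_id for _, doc_id in scored[:k]]


rows = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [0.8, 0.6])]
print(top_k([1.0, 0.0], rows, 2))  # [1, 3]
```

Scoring every row like this is what an unindexed similarity query amounts to, which is fine for small tables and gets expensive as the table grows.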

Creating a vector index for fast approximate nearest neighbor (ANN) search:

CREATE VECTOR INDEX docs_vector_idx
ON documents (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;
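An approximate index trades a little accuracy for speed. One common way to read a target like 95 is as recall: out of the true top-k neighbors, roughly 95% should show up in the approximate result. This sketch shows how you might measure that for a single query. The function name recall_at_k and the sample IDs are made up for illustration.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k results that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


exact_top5 = [11, 42, 7, 93, 58]   # from an exact (full scan) search
approx_top5 = [11, 42, 7, 93, 64]  # from an index-backed search
print(recall_at_k(exact_top5, approx_top5))  # 0.8
```

Averaging this number over a representative set of queries gives a rough check of whether the index meets the accuracy you asked for.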

The bigger picture: RAG in the database

The reason this matters is the architecture it enables. Instead of maintaining a separate vector database (Pinecone, Weaviate, Chroma), you can store embeddings alongside the source data in Oracle. Your SQL queries can combine vector similarity search with traditional relational filters:

SELECT content
FROM   documents
WHERE  created_date > DATE '2024-01-01'   -- relational filter; assumes a created_date column not shown in the table above
ORDER BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH FIRST 10 ROWS ONLY;
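Conceptually, that query applies the relational predicate first and ranks only the surviving rows by distance. Here is a small in-memory sketch of the same filter-then-rank logic. The dictionary keys mirror the column names in the query, while filtered_search and the sample rows are hypothetical.

```python
import heapq
import math
from datetime import date


def cosine_distance(a, b):
    """1 - cosine similarity; assumes equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def filtered_search(rows, query_vec, cutoff, k):
    """Keep rows newer than cutoff, then return the k closest by cosine distance."""
    candidates = (r for r in rows if r["created_date"] > cutoff)
    return heapq.nsmallest(
        k, candidates, key=lambda r: cosine_distance(r["embedding"], query_vec)
    )


rows = [
    {"content": "old but close", "created_date": date(2023, 6, 1), "embedding": [1.0, 0.0]},
    {"content": "new and close", "created_date": date(2024, 3, 1), "embedding": [0.9, 0.1]},
    {"content": "new but far", "created_date": date(2024, 5, 1), "embedding": [0.0, 1.0]},
]
hits = filtered_search(rows, [1.0, 0.0], date(2024, 1, 1), 10)
print([r["content"] for r in hits])  # ['new and close', 'new but far']
```

With a separate vector store, this filter and the ranking would live in two systems and have to be reconciled in application code; in a single SQL statement the database handles both.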

No data movement, no synchronization, full ACID guarantees. This is Oracle’s AI play, and it’s well-designed.
