Oracle AI Vector Search: A New Data Type for the AI Era

The most AI-forward feature in Oracle 23ai is AI Vector Search — a new VECTOR data type and a set of functions for storing, indexing, and querying vector embeddings directly in Oracle Database.

Vector embeddings are numerical representations of unstructured content (text, images, audio) generated by AI models. They’re the foundation of semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation) architectures.

The VECTOR data type:

CREATE TABLE documents (
    doc_id      NUMBER,
    content     CLOB,
    embedding   VECTOR(1536, FLOAT32)  -- 1536 dimensions, OpenAI ada-002 format
);

Inserting a vector:

INSERT INTO documents (doc_id, content, embedding)
VALUES (1, 'Oracle 23ai brings AI features to the database.',
        TO_VECTOR('[0.023, -0.145, 0.871, ...]'));

Semantic similarity search:

SELECT doc_id, content,
       VECTOR_DISTANCE(embedding, :query_vector, COSINE) AS similarity
FROM   documents
ORDER BY similarity
FETCH FIRST 5 ROWS ONLY;

Creating a vector index for fast ANN search:

CREATE VECTOR INDEX docs_vector_idx
ON documents (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;

The bigger picture: RAG in the database

The reason this matters is the architecture it enables. Instead of maintaining a separate vector database (Pinecone, Weaviate, Chroma), you can store embeddings alongside the source data in Oracle. Your SQL queries can combine vector similarity search with traditional relational filters:

SELECT content
FROM   documents
WHERE  created_date > DATE '2024-01-01'   -- relational filter
ORDER BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH FIRST 10 ROWS ONLY;

No data movement, no synchronization, full ACID guarantees. This is Oracle’s AI play, and it’s well-designed.

SQL/PGQ: Querying Property Graphs with Standard SQL

Property graphs model data as nodes and edges — useful for social networks, supply chains, fraud detection, recommendation engines, and any domain where relationships are as important as the data itself. Oracle has had graph support for years (via Oracle Graph Server), but Oracle 23ai brings graph queries directly into SQL using the new ANSI SQL/PGQ standard.

Creating a property graph:

CREATE PROPERTY GRAPH company_graph
    VERTEX TABLES (employees KEY (employee_id))
    EDGE TABLES   (
        reports_to
            SOURCE KEY (employee_id) REFERENCES employees
            DESTINATION KEY (manager_id) REFERENCES employees
    );

Querying the graph with SQL/PGQ:

SELECT *
FROM GRAPH_TABLE (
    company_graph
    MATCH (e IS employees) -[r IS reports_to]-> (m IS employees)
    WHERE e.department_id = 10
    COLUMNS (e.first_name AS employee, m.first_name AS manager)
);

Finding paths of arbitrary depth:

-- Find all employees up to 3 levels below a given manager
SELECT *
FROM GRAPH_TABLE (
    company_graph
    MATCH (top) -[IS reports_to]->+ (sub)  -- + means one or more hops
    WHERE top.employee_id = 100
    COLUMNS (sub.employee_id, sub.first_name)
);

Why in SQL, not a separate graph query language?

This is the key insight. By embedding graph traversal in SQL, you can combine graph patterns with relational predicates, aggregations, and joins — in a single query. No need to move data to a dedicated graph database.

Oracle is the first major commercial RDBMS to implement the SQL/PGQ standard. This positions it well for use cases in fraud detection, network analysis, and knowledge graphs — all of which are growing rapidly in enterprise settings.

JSON Relational Duality Views: Bridging Document and Relational Models

One of Oracle 23ai’s headline features is JSON Relational Duality Views. The concept is elegant: define a view once, and it automatically presents the same data as both structured relational tables and JSON documents. No duplication, no synchronization lag, no ETL.

Creating a duality view:

CREATE OR REPLACE JSON RELATIONAL DUALITY VIEW order_dv AS
SELECT JSON {
    '_id'       : o.order_id,
    'status'    : o.status,
    'customer'  : o.customer_name,
    'items'     : [
        SELECT JSON {
            'product'  : oi.product_name,
            'qty'      : oi.quantity,
            'price'    : oi.unit_price
        }
        FROM order_items oi
        WHERE oi.order_id = o.order_id
    ]
}
FROM orders o WITH INSERT UPDATE DELETE;

Reading as JSON:

SELECT data FROM order_dv WHERE data.status = 'PENDING';

Reading as relational (same data):

SELECT * FROM orders WHERE status = 'PENDING';

Both queries hit the same underlying tables. No duplication.

Why this is architecturally significant

Modern applications often have mixed workloads. A REST API might need JSON documents. Analytics queries need relational structure. Traditional solutions involve maintaining both a relational database and a document store (like MongoDB) with synchronization middleware.

Duality Views eliminate that. The same Oracle database, the same tables, the same transaction — but surfaced as JSON for your Node.js app and as SQL for your BI tool.

ORDS integration: Oracle REST Data Services (ORDS) can automatically expose duality views as REST endpoints, letting you build MongoDB-style document APIs on top of relational data with minimal code.

This is a genuinely novel architecture that deserves its own blog series. For now, the key point: if you have applications that need document APIs and relational reporting, 23ai’s duality views are worth evaluating seriously.

Schema-Level Privileges: Simplifying Access Control in Oracle 23ai

Managing object-level privileges in Oracle has always been granular — sometimes too granular. If you wanted a user to have SELECT on every table in a schema, you had to grant it table by table. Add a new table later? Grant again. Automate it? Write a script.

Oracle 23ai introduces schema-level privileges — the ability to grant a privilege on an entire schema with a single statement that automatically applies to all current and future objects in that schema.

Granting schema-level SELECT:

GRANT SELECT ANY TABLE ON SCHEMA hr TO reporting_user;

reporting_user can now SELECT from any table (current or future) in the hr schema. No more individual grants per table.

Other supported schema-level privileges:

GRANT INSERT ANY TABLE   ON SCHEMA app_data TO etl_user;
GRANT UPDATE ANY TABLE   ON SCHEMA app_data TO etl_user;
GRANT DELETE ANY TABLE   ON SCHEMA app_data TO etl_user;
GRANT EXECUTE ANY PROCEDURE ON SCHEMA hr   TO app_user;
GRANT SELECT ANY SEQUENCE   ON SCHEMA hr   TO app_user;

Revoking schema-level privileges:

REVOKE SELECT ANY TABLE ON SCHEMA hr FROM reporting_user;

Key behavior: Schema privileges are dynamic. When a new table is created in hr, reporting_user immediately has SELECT on it — no additional grant required.

Checking existing schema-level grants:

SELECT grantee, privilege, schema
FROM   dba_schema_privs
ORDER BY schema, grantee;

Real-world impact

This dramatically simplifies privilege management for read-only reporting users, ETL processes, and microservice accounts that need access to a full schema rather than individual tables. It’s a long-requested feature that reduces the DBA overhead of schema management and eliminates entire categories of “who forgot to grant this?” support tickets.

New Aggregate Functions in Oracle 23ai

Oracle 23ai adds a meaningful set of new aggregate functions to SQL. These fill gaps in analytical and statistical capabilities that previously required PL/SQL, external tools, or verbose workarounds.

CHECKSUM

Returns a deterministic checksum value for a set of rows. Extremely useful for data validation and change detection.

SELECT CHECKSUM(*) AS row_fingerprint
FROM   employees
WHERE  department_id = 10;

If any row in the result set changes, the checksum changes — making it ideal for comparing snapshots.

BIT_AND_AGG / BIT_OR_AGG / BIT_XOR_AGG

Bitwise aggregate functions — apply bitwise AND, OR, or XOR across all values in a group.

-- Which permissions are held by ALL members of a group?
SELECT role_id, BIT_AND_AGG(permission_flags) AS common_permissions
FROM   role_members
GROUP BY role_id;

-- Which permissions are held by AT LEAST ONE member?
SELECT role_id, BIT_OR_AGG(permission_flags) AS any_permission
FROM   role_members
GROUP BY role_id;

SKEWNESS_POP / SKEWNESS_SAMP

Measures statistical skewness of a distribution — how asymmetric the data is around the mean.

SELECT SKEWNESS_POP(salary) AS salary_skew
FROM   employees;

A positive value means the distribution tails right (a few very high salaries); negative means it tails left.

KURTOSIS_POP / KURTOSIS_SAMP

Measures the “tailedness” of a distribution — whether outliers are common or rare.

SELECT KURTOSIS_SAMP(response_time_ms) AS response_time_kurtosis
FROM   api_logs;

Why these matter

Before 23ai, these calculations required either loading data into R/Python, implementing them manually as complex SQL expressions, or using the DBMS_STAT_FUNCS package. Now they’re first-class aggregate functions available anywhere SQL is valid — in reports, views, materialized views, and analytics queries.

EXCEPT and EXCEPT ALL: New Set Operators in Oracle 23ai

If you’ve ever had to port SQL from PostgreSQL or SQL Server to Oracle, you’ve probably hit this wall: Oracle uses MINUS where every other database uses EXCEPT. They do the same thing, but the name difference forces you to rewrite queries or maintain separate versions.

Oracle 23ai adds EXCEPT and EXCEPT ALL as aliases and extensions to the existing set operator vocabulary.

EXCEPT — equivalent to MINUS (distinct rows only):

-- Customers who placed orders in 2024 but NOT in 2025
SELECT customer_id FROM orders WHERE order_year = 2024
EXCEPT
SELECT customer_id FROM orders WHERE order_year = 2025;

This is identical to using MINUS. Both are now valid.

EXCEPT ALL — the new addition:

-- Returns all rows from the first query not in the second,
-- including duplicates
SELECT product_id FROM inventory_snapshot_jan
EXCEPT ALL
SELECT product_id FROM inventory_snapshot_feb;

EXCEPT ALL does not eliminate duplicates from either side before comparing. If a value appears 3 times in the first set and 1 time in the second, the result contains 2 occurrences. This is the behavior defined in ANSI SQL and already present in PostgreSQL.

INTERSECT ALL is also new:

Oracle 23ai also adds INTERSECT ALL for symmetric completeness — it returns matching rows including duplicates, preserving the lower count from either side.

Complete set operator list in Oracle 23ai:

Operator Behavior
UNION All distinct rows from both
UNION ALL All rows including duplicates
INTERSECT Distinct rows in both
INTERSECT ALL Rows in both, with duplicates
MINUS / EXCEPT Distinct rows in first but not second
EXCEPT ALL Rows in first not in second, with duplicates

For cross-database SQL portability alone, EXCEPT is a welcome addition.

SQL Annotations: Adding Metadata to Your Database Objects

Oracle 23ai introduces Annotations — a way to attach custom metadata to database objects and columns directly in the database, not in a separate spreadsheet or wiki page that goes out of date.

An annotation is a name-value pair (or just a name) attached to a table, column, index, view, or other object. Think of it like tags or labels on cloud resources, but for your schema objects.

Adding annotations when creating a table:

CREATE TABLE employees (
    employee_id  NUMBER         ANNOTATIONS (PrimaryIdentifier, DisplayOrder '1'),
    first_name   VARCHAR2(50)   ANNOTATIONS (UserFacing 'true', MaxLength '50'),
    salary       NUMBER(10,2)   ANNOTATIONS (Sensitive 'true', Audited 'true'),
    dept_code    VARCHAR2(10)   ANNOTATIONS (ForeignKey 'departments.dept_code')
)
ANNOTATIONS (BusinessDomain 'HumanResources', Owner 'HR Team');

Adding annotations to an existing object:

ALTER TABLE employees
    ANNOTATIONS (ADD Deprecated 'true', ADD MigratedFrom 'LEGACY_EMP');

Querying annotations:

SELECT object_name, column_name, annotation_name, annotation_value
FROM   user_annotations_usage
ORDER BY object_name, column_name;

Practical use cases:

  • Mark columns as Sensitive for data classification tools
  • Tag tables with BusinessDomain for impact analysis
  • Add Owner metadata for accountability
  • Flag deprecated objects with Deprecated so tooling can warn users
  • Store application-specific metadata like UILabel or DisplayOrder

Before annotations, metadata like this lived in comments (fragile, hard to query), separate tables (manual sync), or external tools. Annotations make it a first-class citizen of the schema.

This feature also has a personal story for me — one I’ll be sharing in detail in September. Stay tuned.

GROUP BY ALL: The Shorthand That Changes Everything

Here’s one of my favorite SQL improvements in Oracle 23ai, and one with a personal history I’ll share more about in September: GROUP BY ALL.

Every developer who has written aggregate queries knows the pain. You write a SELECT with some aggregate functions (SUM, COUNT, AVG), and Oracle forces you to list every non-aggregated column in the GROUP BY clause. Add a column to the SELECT? Add it to GROUP BY too. Forget one? ORA-00979: not a GROUP BY expression.

GROUP BY ALL solves this automatically. It groups by every non-aggregate expression in the SELECT list.

Before (Oracle ≤ 19c / 21c):

SELECT department_id, department_name, job_id, location_id,
       COUNT(*) AS headcount,
       AVG(salary) AS avg_salary
FROM   employees
GROUP BY department_id, department_name, job_id, location_id;

After (Oracle 23ai):

SELECT department_id, department_name, job_id, location_id,
       COUNT(*) AS headcount,
       AVG(salary) AS avg_salary
FROM   employees
GROUP BY ALL;

Oracle automatically groups by department_id, department_name, job_id, and location_id — every column that isn’t inside an aggregate function.

Adding a new column? No GROUP BY change needed:

SELECT department_id, department_name, job_id, location_id, country_id,
       COUNT(*) AS headcount,
       AVG(salary) AS avg_salary
FROM   employees
GROUP BY ALL;  -- country_id is automatically included

Works with ROLLUP, CUBE, and GROUPING SETS too:

SELECT department_id, job_id,
       SUM(salary)
FROM   employees
GROUP BY ROLLUP ALL;

This feature is one of those “why didn’t this always exist?” improvements. Less boilerplate, fewer bugs from missing a column in GROUP BY, and queries that are easier to extend. I’ll have more to say about the origins of this idea in September’s special series.

SQL Domains: Define Once, Reuse Everywhere

One of the more architecturally significant features in Oracle 23ai is SQL Domains. A domain is essentially a reusable column specification — a named type that bundles a data type, constraints, default values, display formatting, and ordering rules. Define it once, apply it to as many columns as you want across your schema.

Creating a domain:

CREATE DOMAIN email_address AS VARCHAR2(255)
    CONSTRAINT email_fmt CHECK (REGEXP_LIKE(VALUE, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'))
    DISPLAY UPPER(VALUE)
    ORDER LOWER(VALUE);

Using the domain in a table:

CREATE TABLE customers (
    customer_id   NUMBER,
    email         email_address,    -- domain applied here
    support_email email_address     -- same domain, second column
);

The constraint from the domain is automatically enforced on every column that uses it. If you later update the domain’s constraint, the change propagates.

More domain examples:

-- Positive monetary amounts
CREATE DOMAIN money_usd AS NUMBER(15,2)
    DEFAULT 0.00
    CONSTRAINT positive_amount CHECK (VALUE >= 0);

-- Status codes
CREATE DOMAIN order_status AS VARCHAR2(20)
    CONSTRAINT valid_status CHECK (VALUE IN ('PENDING','PROCESSING','SHIPPED','DELIVERED','CANCELLED'));

Querying domain metadata:

SELECT domain_name, data_type, char_length
FROM   user_domains;

Why domains matter

Without domains, every table that stores an email address has its own REGEXP constraint — or more likely, no constraint at all. Domains enforce consistency at the schema level. They’re the database’s way of saying “this concept means the same thing everywhere.”

This is a feature that benefits large teams and long-lived schemas the most. If you’re building from scratch in 2025, start using domains — your future self will thank you.

DELETE with Direct Joins in Oracle 23ai

Alongside the UPDATE improvement covered last week, Oracle 23ai also brings direct JOIN support to DELETE statements. The same ergonomic benefit applies — no more correlated subqueries just to filter a delete based on another table.

The old way:

DELETE FROM order_items oi
WHERE oi.order_id IN (
    SELECT o.order_id
    FROM   orders o
    WHERE  o.status = 'CANCELLED'
    AND    o.order_date < DATE '2024-01-01'
);

The new way (Oracle 23ai):

DELETE FROM order_items oi
JOIN   orders o ON o.order_id = oi.order_id
WHERE  o.status = 'CANCELLED'
AND    o.order_date < DATE '2024-01-01';

Same result, but the relationship between the two tables is explicit and readable.

Cascade delete pattern:

-- Delete all line items for expired promotions
DELETE FROM promotion_items pi
JOIN   promotions p ON p.promo_id = pi.promo_id
WHERE  p.expiry_date < SYSDATE;

An important caveat: The join conditions determine which rows are deleted from the target table. You cannot delete from the joined table — only from the primary table in the DELETE FROM clause. If you try to delete from both sides, Oracle will raise an error.

Why this matters

In data warehousing and ETL scenarios, conditional deletes based on staging table content are extremely common. Before 23ai, you had to choose between a readable EXISTS subquery or a sometimes-faster IN subquery, both of which hide the join relationship.

The direct JOIN syntax is not only cleaner to write — it’s also easier to review in code reviews and easier to optimize because the join relationship is explicit in the query structure.