SQL Domains: When to Use Them (And When Not To)

SQL Domains are powerful, but like any abstraction layer, they work better in some situations than others. Here’s a practical guide to the decision.

Use SQL Domains when:

The same semantic type appears in 3+ columns across your schema. Email addresses, phone numbers, status codes, country codes, monetary amounts — these are domain candidates.

The constraint logic is non-trivial. A REGEXP_LIKE email validation or a multi-value IN list is worth centralizing. A simple NUMBER column without constraints is not.

You want schema-level documentation to drive tooling. Domains are queryable — your data catalog, code generators, or validation tools can introspect them.

You’re building a new schema. Retrofitting domains is harder than designing with them from the start.
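
For example, a new schema can centralize email validation in one domain and reuse it wherever an address appears. A minimal sketch (the domain, constraint, and table names are illustrative):

-- One definition, reused by every email column in the schema
CREATE DOMAIN d_email AS VARCHAR2(320)
    CONSTRAINT d_email_chk CHECK (REGEXP_LIKE(VALUE, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'));

CREATE TABLE customers (
    customer_id   NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    contact_email VARCHAR2(320) DOMAIN d_email
);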

Be cautious when:

The constraint might need to vary per table. Domains enforce the same constraint everywhere they’re used. If one table needs CHECK (VALUE IN ('A','B')) and another needs CHECK (VALUE IN ('A','B','C')), that’s two domains, not one.

You have existing applications that don’t expect domain-level errors. Domain constraint violations raise the same ORA-02290 CHECK constraint error as inline constraints, but the constraint name includes the domain name — applications parsing constraint names may need updating.

You’re mid-migration on a legacy schema. Adding domains to existing columns requires an ALTER TABLE MODIFY that adds the domain. This can be done incrementally, but plan it carefully.
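
A sketch of that incremental approach, assuming the 23ai ALTER TABLE ... MODIFY ... DOMAIN syntax and reusing the d_email domain from the earlier sketch (legacy_contacts is an illustrative table):

-- Existing rows must already satisfy the domain constraint, or the ALTER fails
ALTER TABLE legacy_contacts MODIFY (email DOMAIN d_email);

-- The association can be removed later if needed
ALTER TABLE legacy_contacts MODIFY (email DROP DOMAIN);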

Creating domain variations correctly:

-- Don't force one domain to cover all cases
CREATE DOMAIN d_status_basic    AS VARCHAR2(10) CONSTRAINT d_status_basic_chk    CHECK (VALUE IN ('ACTIVE','INACTIVE'));
CREATE DOMAIN d_status_extended AS VARCHAR2(10) CONSTRAINT d_status_extended_chk CHECK (VALUE IN ('ACTIVE','INACTIVE','PENDING','ARCHIVED'));

Two focused domains are better than one overly permissive domain or one domain with workaround constraints.

GROUP BY ALL: Advanced Analytics Patterns

GROUP BY ALL simplifies the common case, but it also works with advanced grouping operations. Let’s look at patterns where it really pays off.

Complex analytical queries with many dimensions:

-- Before 23ai: manually listing 6 non-aggregated columns in GROUP BY
SELECT
    region, country, product_line, product_family, channel, quarter,
    SUM(revenue)       AS total_revenue,
    SUM(units_sold)    AS total_units,
    AVG(unit_price)    AS avg_price
FROM sales_facts
GROUP BY region, country, product_line, product_family, channel, quarter;

-- With GROUP BY ALL:
SELECT
    region, country, product_line, product_family, channel, quarter,
    SUM(revenue), SUM(units_sold), AVG(unit_price)
FROM sales_facts
GROUP BY ALL;

Adding a new dimension (say, sales_rep_id) to the analysis: before 23ai you had to add it to both SELECT and GROUP BY; with GROUP BY ALL, you only add it to SELECT.
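
A quick sketch of the GROUP BY ALL version with the new column added:

-- sales_rep_id added to SELECT only; GROUP BY ALL picks it up automatically
SELECT
    region, country, product_line, product_family, channel, quarter, sales_rep_id,
    SUM(revenue), SUM(units_sold), AVG(unit_price)
FROM sales_facts
GROUP BY ALL;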

GROUP BY ROLLUP ALL — subtotals without listing columns:

SELECT region, country, SUM(revenue) AS total
FROM   sales_facts
GROUP BY ROLLUP ALL;
-- Equivalent to: GROUP BY ROLLUP(region, country)
-- Produces: region+country groups, region subtotals, and a grand total

GROUP BY CUBE ALL — cross-dimensional analysis:

SELECT channel, quarter, SUM(revenue)
FROM   sales_facts
GROUP BY CUBE ALL;
-- All combinations: channel+quarter, channel only, quarter only, grand total

Interaction with window functions:

GROUP BY ALL does not apply to window functions — they’re not aggregate expressions in the GROUP BY sense. Window functions work the same as always:

SELECT region, product_line, SUM(revenue) AS group_total,
       SUM(SUM(revenue)) OVER (PARTITION BY region) AS region_total
FROM   sales_facts
GROUP BY ALL;
-- GROUP BY ALL includes: region, product_line
-- Window function operates on the grouped results

GROUP BY ALL is at its best in analytical queries with 4+ dimensions where GROUP BY maintenance becomes error-prone.

Vector Search in Production: Patterns for AI-Native Applications

If you’re building a Retrieval-Augmented Generation (RAG) application on Oracle 23ai, here are the architectural patterns that work best in production — based on real implementations.

The basic RAG pipeline with Oracle:

-- 1. Store documents and embeddings
INSERT INTO knowledge_base (doc_id, content, embedding)
VALUES (:id, :text, TO_VECTOR(:embedding_array));

-- 2. At query time: find the most relevant documents
SELECT content, VECTOR_DISTANCE(embedding, :query_embed, COSINE) AS score
FROM   knowledge_base
WHERE  category = :filter_category    -- combine with relational filter
ORDER BY score
FETCH FIRST 5 ROWS ONLY;

Chunking strategy matters:

Large documents should be split into chunks (paragraphs, sentences, fixed token windows) before embedding. Each chunk becomes a separate row. The optimal chunk size depends on your embedding model — typically 256-512 tokens for OpenAI models.

-- Store chunks with parent document reference
CREATE TABLE knowledge_chunks (
    chunk_id   NUMBER GENERATED ALWAYS AS IDENTITY,
    doc_id     NUMBER REFERENCES knowledge_base(doc_id),
    chunk_seq  NUMBER,          -- position in document
    chunk_text CLOB,
    embedding  VECTOR(1536, FLOAT32),
    CONSTRAINT pk_chunk PRIMARY KEY (chunk_id)
);

Hybrid search — vector + keyword:

Pure semantic search misses exact matches. Combine vector similarity with Oracle Text full-text search:

SELECT k.doc_id, k.chunk_text,
       VECTOR_DISTANCE(k.embedding, :q_vec, COSINE) AS semantic_score,
       SCORE(1) AS keyword_score
FROM   knowledge_chunks k
WHERE  CONTAINS(k.chunk_text, :keyword, 1) > 0     -- keyword match
-- Convert distance (lower = closer) into a similarity so both terms reward relevance
ORDER BY 0.6 * (1 - VECTOR_DISTANCE(k.embedding, :q_vec, COSINE))
       + 0.4 * (SCORE(1) / 100) DESC
FETCH FIRST 10 ROWS ONLY;

Metadata filtering before vector search:

Always apply relational filters before the vector index scan to reduce the candidate set:

SELECT content, VECTOR_DISTANCE(embedding, :q_vec, COSINE) AS score
FROM   knowledge_chunks
WHERE  doc_category = 'LEGAL'           -- narrow the search space first
AND    created_date > DATE '2024-01-01'
ORDER BY score
FETCH FIRST 5 ROWS ONLY;

Oracle’s vector index (HNSW) performs much better when the candidate pool is pre-filtered to relevant documents.

Oracle 23ai in Production: What We’ve Learned After a Year

Oracle Database 23ai has been generally available since May 2024. After roughly 18 months of real-world deployments, a picture is emerging: where the new features deliver and where organizations run into friction.

What’s working well:

IF [NOT] EXISTS for DDL has been universally praised. It’s the kind of feature that pays dividends every single sprint. Migration scripts are cleaner, deployments are more predictable, and the volume of “deployment script failed on object already exists” incidents has dropped for teams that adopted 23ai.

SQL Domains are seeing strong adoption in organizations with mature data governance programs. Teams building new schemas from scratch in 23ai are embracing domains as a standard practice. Teams migrating existing schemas are more cautious — and rightly so. Retrofitting domains onto an existing schema requires careful planning.

The VECTOR data type and AI Vector Search have attracted significant interest from teams building RAG applications. The ability to store embeddings alongside relational data and combine vector similarity with SQL predicates is a genuine architectural advantage over separate vector stores.

Where organizations are moving cautiously:

JSON Relational Duality Views require careful design. Teams that tried to map complex, highly normalized schemas (10+ tables per “document”) found the duality view definition difficult to maintain. The sweet spot is 2-5 tables per document.

True Cache is still new enough that most organizations haven’t deployed it in production. Documentation and operational tooling (monitoring, sizing guidance) are still maturing.

Upgrade advice for 2026:

If you’re on 19c, evaluate 23ai for development environments now. Most organizations targeting 23ai for production are planning 2026-2027 timelines. That’s the right approach — test, validate, and migrate deliberately.

Predictions for Oracle Database in 2026

As 2025 closes, let’s make some educated guesses about where Oracle is heading in 2026.

More AI, deeper in the database: Oracle’s AI Vector Search is a foundation. In 2026, expect Oracle to ship tighter integrations with popular AI frameworks (LangChain, LlamaIndex), more built-in embedding models that run inside the database, and potentially an on-premises Select AI option that doesn’t require an external LLM API call.

Oracle Database 23ai on-premises maturity: The cloud version of 23ai has been available since May 2024. The on-premises version followed, but enterprise adoption takes time. 2026 will be the year most large Oracle shops begin serious evaluation for on-premises 23ai upgrades. Expect more migration tooling and upgrade advisory content from Oracle.

SQL/PGQ expansion: The ANSI SQL/PGQ standard is still young. Oracle was early to implement it, but the standard itself will evolve. Expect Oracle to track those changes, potentially adding recursive graph pattern matching and more sophisticated path expressions.

True Cache GA and wider adoption: True Cache, introduced in 23ai, will mature in 2026 with better monitoring, automatic cache sizing, and potentially multi-node True Cache configurations for horizontal read scaling.

Kafka API evolution: The Kafka-compatible TxEventQ API will likely gain Kafka Schema Registry compatibility and better tooling for migrating existing Kafka topics to Oracle TxEventQ.

On this blog: We’ll keep the weekly cadence going. Week by week, feature by feature. Oracle has given us a lot to work with. 2026 should be a good year.

See you in January.

Oracle 23ai Performance Tips for SQL Developers

As we close out the year, here’s a practical performance guide focused specifically on Oracle 23ai SQL features and how to get the most out of them.

GROUP BY ALL performance: GROUP BY ALL generates the same execution plan as an explicit GROUP BY list. There’s no performance penalty; the optimizer resolves the column list at parse time. Use it freely.

Vector index accuracy vs. speed tradeoff:

CREATE VECTOR INDEX docs_idx ON docs(embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH   -- HNSW, in-memory graph index
DISTANCE COSINE
WITH TARGET ACCURACY 90;               -- faster, approximate results
-- vs. the same index defined WITH TARGET ACCURACY 99 (slower, more precise)

For most RAG use cases, 90-95% accuracy is acceptable and significantly faster.
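
Accuracy can also be overridden per query rather than per index. A sketch, assuming the FETCH APPROX FIRST ... WITH TARGET ACCURACY syntax for approximate similarity search:

SELECT doc_id, chunk_text
FROM   knowledge_chunks
ORDER BY VECTOR_DISTANCE(embedding, :q_vec, COSINE)
FETCH APPROX FIRST 5 ROWS ONLY WITH TARGET ACCURACY 95;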

Schema privileges and parse overhead: Schema-level privileges (GRANT SELECT ANY TABLE ON SCHEMA) don’t affect query parse time. The privilege check at parse time is O(1) — same as object-level privileges.
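
For reference, the schema-level grant looks like this (hr and app_reader are illustrative):

GRANT SELECT ANY TABLE ON SCHEMA hr TO app_reader;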

Domains and constraint checking: Domain constraints are enforced at INSERT/UPDATE time, exactly like column-level CHECK constraints. There’s no additional overhead for using domains vs. inline constraints.

JSON Duality Views: Duality views over wide joins (5+ tables) can have higher write overhead because each JSON mutation must be decomposed into relational DML across multiple tables. Profile with realistic payloads before deploying to production.

MEMOPTIMIZE FOR WRITE throughput: To maximize Fast Ingest throughput, ensure the LARGE_POOL_SIZE is sufficient. Oracle recommends allocating at least 3x the expected peak buffer size:

ALTER SYSTEM SET LARGE_POOL_SIZE = 1G;
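
Also make sure the table and the inserts actually opt in to Fast Ingest. A minimal sketch (sensor_events is illustrative; the sketch assumes the MEMOPTIMIZE_WRITE hint, which routes the insert through the fast-ingest buffers):

CREATE TABLE sensor_events (
    sensor_id  NUMBER,
    reading    NUMBER,
    read_time  TIMESTAMP
) MEMOPTIMIZE FOR WRITE;

INSERT /*+ MEMOPTIMIZE_WRITE */ INTO sensor_events VALUES (:id, :val, SYSTIMESTAMP);
-- Buffered writes are flushed asynchronously; they are not immediately durable or visible to queries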

SQL Firewall learning period: Run the learning phase for at least 2 weeks to capture all application code paths (batch jobs, end-of-month processes, etc.) before enabling enforcement. An incomplete learning phase leads to false positives.
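
A sketch of the capture-then-enforce flow. The procedure names below come from the DBMS_SQL_FIREWALL package; verify the exact parameter list against your release:

BEGIN
    -- Start recording SQL and connection paths for the application user
    DBMS_SQL_FIREWALL.CREATE_CAPTURE(username => 'APP_USER', start_capture => TRUE);
END;
/

-- ...after the learning window (including batch and month-end jobs)...

BEGIN
    DBMS_SQL_FIREWALL.STOP_CAPTURE(username => 'APP_USER');
    DBMS_SQL_FIREWALL.GENERATE_ALLOW_LIST(username => 'APP_USER');
    DBMS_SQL_FIREWALL.ENABLE_ALLOW_LIST(username => 'APP_USER');
END;
/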

Oracle 23ai: The SQL Year in Review

What a year for Oracle SQL. Looking back at what we’ve covered in 2025, Oracle 23ai represents one of the most significant leaps in Oracle’s SQL language in a decade. Here’s the condensed highlight reel.

Developer ergonomics (finally addressed): IF [NOT] EXISTS for DDL, GROUP BY ALL, SELECT without FROM, EXCEPT/EXCEPT ALL, table aliases in DML, the ALIAS clause, and multirow VALUES syntax. These aren’t flashy features — they’re things developers have wanted for years, implemented at last.

Data model modernization: SQL Domains, the Boolean data type, Annotations, and enhanced JSON functions (JSON_TRANSFORM, JSON_MERGEPATCH, JSON Schema validation). Oracle’s data model is now closer to a modern standard than at any point in its history.

AI integration: The VECTOR data type and AI Vector Search are the most strategically significant additions. Combined with Select AI (natural language to SQL) and the integration with LLM providers via DBMS_CLOUD_AI, Oracle has made a credible bid to be the primary data store for AI workloads.

Graph and streaming: SQL/PGQ property graphs and GRAPH_TABLE bring graph analytics into the SQL layer. Kafka-compatible APIs for Transactional Event Queues connect Oracle to the streaming ecosystem.

Concurrency and performance: Lock-Free Reservations, Priority Transactions, True Cache, and Memoptimized Rowstore enhancements address real scalability pain points at the architecture level.

Security: SQL Firewall is a meaningful addition to Oracle’s security stack — kernel-level, transparent, and zero-code-change for applications.

If you’re on Oracle 19c or 21c and haven’t evaluated 23ai yet, 2026 is the year to start that assessment. The feature set justifies the upgrade conversation.

MLE: Running JavaScript Inside Oracle Database

Oracle 23ai ships MLE (Multilingual Engine) as a fully supported feature. MLE allows you to write stored procedures, functions, and triggers in JavaScript (using the GraalVM runtime) and run them inside the Oracle Database process.

Creating a JavaScript function:

CREATE OR REPLACE MLE MODULE formatting_utils
LANGUAGE JAVASCRIPT AS
$$
export function formatCurrency(amount, currency = 'USD') {
    return new Intl.NumberFormat('en-US', {
        style: 'currency',
        currency: currency
    }).format(amount);
}

export function slugify(text) {
    return text.toLowerCase()
               .replace(/[^a-z0-9]+/g, '-')
               .replace(/^-|-$/g, '');
}
$$;

Calling JavaScript from SQL:

CREATE OR REPLACE FUNCTION format_currency(p_amount NUMBER, p_currency VARCHAR2 DEFAULT 'USD')
RETURN VARCHAR2
AS MLE MODULE formatting_utils SIGNATURE 'formatCurrency(number, string)';
/

SELECT product_name,
       format_currency(price) AS formatted_price
FROM   products;

Accessing the database from JavaScript:

-- JavaScript can issue SQL within the same session
CREATE OR REPLACE MLE MODULE order_processor
LANGUAGE JAVASCRIPT AS
$$
export function processOrder(orderId) {
    const result = session.execute(
        'SELECT status FROM orders WHERE order_id = :id',
        { id: orderId }
    );
    if (result.rows[0].STATUS === 'PENDING') {
        session.execute(
            'UPDATE orders SET status = :s WHERE order_id = :id',
            { s: 'PROCESSING', id: orderId }
        );
    }
}
$$;

Why this matters: Teams with JavaScript expertise can now contribute to database-layer logic without learning PL/SQL. Complex string manipulation, JSON processing, and business rules that are awkward in PL/SQL are natural in JavaScript.

The runtime is GraalVM — a secure, sandboxed JavaScript engine. It does not have access to the filesystem or network outside of Oracle’s SQL interface.

JSON Duality Views in Production: Patterns and Pitfalls

JSON Relational Duality Views were introduced in April. Having had more time to work with them, let’s talk about what works well in production and where to be careful.

What works well:

Simple document APIs: If your documents map to a small number of normalized tables (2-4 tables), duality views are excellent. The REST API you get via ORDS is clean, consistent, and requires almost no backend code.

Write consistency: Because duality views write through to relational tables in a single transaction, you get ACID guarantees that no external MongoDB + Oracle sync can provide. Write to the duality view; both your JSON API and your SQL reports see the same data, immediately.
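
As a simplified example, a two-table order document (orders and order_items are illustrative tables; double-check the exact SQL/JSON duality syntax against the 23ai docs):

CREATE OR REPLACE JSON RELATIONAL DUALITY VIEW order_dv AS
SELECT JSON {'_id'       : o.order_id,
             'status'    : o.status,
             'lineItems' : [SELECT JSON {'itemId'   : i.item_id,
                                         'quantity' : i.quantity}
                            FROM order_items i WITH INSERT UPDATE DELETE
                            WHERE i.order_id = o.order_id]}
FROM   orders o WITH INSERT UPDATE DELETE;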

-- ORDS exposes this as a REST endpoint automatically
-- GET /api/orders/42 → returns the JSON document
-- PUT /api/orders/42 → updates through to relational tables

Where to be careful:

Complex joins: Duality views over many tables with complex join conditions become difficult to reason about. If your document model doesn’t align reasonably with your relational model, the duality view definition becomes a maintenance burden.

Aggregations: Duality views represent individual documents (rows/graphs), not aggregations. You cannot expose aggregated data (counts, sums) as a duality view — that’s still a job for regular views or materialized views.

Caching behavior: Applications that cache JSON responses aggressively may see stale data if they don’t respect the ETags that duality views generate for optimistic locking.

Design rule of thumb: Use duality views when your document model has a clear primary entity (Order, Customer, Product) that owns related data. Avoid them for cross-cutting concerns or heavily aggregated responses.

Annotations in Enterprise Environments: Advanced Use Cases

Annotations (introduced in March’s post) become particularly powerful in enterprise environments where data governance, self-service tooling, and schema documentation are critical. Let’s look at advanced patterns.

Data classification for security:

ALTER TABLE customers MODIFY (
    ssn         ANNOTATIONS (ADD DataClass 'PII', ADD Sensitivity 'Critical', ADD EncryptionRequired 'true'),
    email       ANNOTATIONS (ADD DataClass 'PII', ADD Sensitivity 'High'),
    birth_date  ANNOTATIONS (ADD DataClass 'PII', ADD Sensitivity 'Medium'),
    customer_id ANNOTATIONS (ADD DataClass 'Internal')
);

A data catalog tool can query USER_ANNOTATIONS_USAGE to automatically generate a data classification report without anyone maintaining a separate spreadsheet.

Deprecation management:

ALTER TABLE legacy_orders ANNOTATIONS (ADD Deprecated 'true', ADD DeprecatedSince '2024-01-01', ADD MigratedTo 'orders_v2');

CI/CD pipelines can query for deprecated objects and fail builds that reference them.
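
For example, a build step can list everything still flagged as deprecated. A minimal sketch of the lookup, using the annotation names from above:

SELECT object_name, object_type
FROM   user_annotations_usage
WHERE  annotation_name = 'Deprecated'
AND    annotation_value = 'true';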

UI metadata for low-code applications:

ALTER TABLE employees MODIFY (
    first_name ANNOTATIONS (ADD UILabel 'First Name', ADD DisplayOrder '1', ADD UIRequired 'true'),
    last_name  ANNOTATIONS (ADD UILabel 'Last Name',  ADD DisplayOrder '2', ADD UIRequired 'true'),
    salary     ANNOTATIONS (ADD UILabel 'Annual Salary', ADD UIReadOnly 'true', ADD UIFormat 'currency')
);

Low-code platforms like Oracle APEX can read these annotations to auto-generate forms with the correct labels, ordering, and validation.

Queryable catalog:

-- Find all columns marked as PII across the schema
SELECT owner, object_name, column_name, annotation_value AS data_class
FROM   dba_annotations_usage
WHERE  annotation_name = 'DataClass'
AND    annotation_value = 'PII'
ORDER BY owner, object_name;

Annotations turn the Oracle data dictionary into a live, queryable governance layer — no external catalog required.