The SQL Features I Suggested (And Oracle Implemented) — Part 1: Introduction

THE SAGA: “The Features I Suggested, Oracle Implemented — A Personal History”

This is a five-part series about two Oracle SQL features introduced in Oracle Database 23ai that are strikingly similar — in concept, motivation, and syntax — to suggestions I posted in the Oracle Community forums in July 2016, roughly eight years before they were implemented.

Back in 2016, I was deep in the Oracle ACE program — eventually ACE Director for four years — and spending a lot of time in the Oracle Community forums. At the time, Oracle had a space called “Database Ideas” where community members could suggest improvements to the database.

I posted several ideas. Two of them, I believe, were implemented in Oracle Database 23ai, roughly eight years later. The features are GROUP BY ALL and SQL Annotations.

I want to be careful about how I frame this. I’m not claiming Oracle copied my ideas. Large engineering organizations like Oracle receive countless community suggestions, conduct internal research, and implement features based on their own roadmap priorities. It’s entirely possible — likely, even — that these features were conceived independently by Oracle’s SQL language team.

What I am saying is this: the similarities are striking. Not just in concept, but in the specific syntax and motivation I described. And I was never contacted, credited in the documentation, or mentioned in any release notes.

Over the next four weeks, I’ll walk through each feature:

  • Sep 8: GROUP BY ALL — my 2016 suggestion vs. the 23ai implementation
  • Sep 15: Annotations — my “Oracle Tagging for Objects” idea vs. the Annotations clause
  • Sep 22: CTAS with Autodrop — a related idea that may have seeded ADO
  • Sep 29: Reflections on being an uncredited contributor

I’m writing this series not to be bitter — I’m genuinely proud that ideas I had are now real Oracle features — but because I think it’s worth talking about the relationship between community feedback and product development. Transparency matters.

Kafka-Compatible APIs: Oracle 23ai Meets the Streaming World

Oracle 23ai introduces Kafka-compatible APIs for Transactional Event Queues (TxEventQ) — Oracle’s messaging and event streaming infrastructure. This means your existing Kafka producer and consumer code can talk to Oracle Database without any changes.

What are Transactional Event Queues?

TxEventQ is Oracle’s built-in messaging system — stored inside the Oracle Database, with full ACID guarantees. Unlike standalone Kafka, messages in TxEventQ participate in database transactions. If a transaction rolls back, the message is also rolled back.

The Kafka bridge:

Oracle 23ai exposes TxEventQ over the Kafka protocol. Kafka clients (using the standard Kafka Java client, Confluent libraries, or any Kafka-compatible client) connect to Oracle’s Kafka endpoint and produce/consume messages transparently.

# Standard Kafka client configuration
bootstrap.servers=oracle-db-host:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
# (Oracle credentials used for SASL auth)

Creating a Kafka-compatible topic (TxEventQ):

BEGIN
    DBMS_AQADM.CREATE_TRANSACTIONAL_EVENT_QUEUE(
        queue_name         => 'order_events',
        queue_payload_type => DBMS_AQADM.JMS_TYPE
    );
    DBMS_AQADM.START_QUEUE('order_events');
END;
/

Why this matters:

  • Transactional safety: A Kafka consumer acknowledging a message and updating a database row are one atomic operation — no dual-write problem.
  • Simplified architecture: For use cases that require exactly-once semantics between messaging and data storage, a single database replaces a separate Kafka cluster plus a database.
  • Migration path: Gradually migrate Kafka-based microservices to Oracle without rewriting clients.

This bridges two worlds that have traditionally required complex integration middleware. For Oracle-centric architectures, it’s a compelling alternative to external Kafka clusters.
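To make the "no dual-write problem" point concrete, here is a toy Python simulation using SQLite in place of Oracle: the message acknowledgement and the business-row update share one transaction, so a failure rolls back both. This is an illustration of the concept only — TxEventQ does this natively inside the Oracle transaction; the tables and names here are invented for the sketch.

```python
# Toy model: dequeue + business update in ONE transaction (no dual write).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE queue  (id INTEGER PRIMARY KEY, payload TEXT, consumed INTEGER DEFAULT 0);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO queue (payload) VALUES ('order:1');
    INSERT INTO orders VALUES (1, 'NEW');
""")

def consume_and_update(fail: bool) -> None:
    try:
        with conn:  # one atomic transaction: commit on success, rollback on error
            conn.execute("UPDATE queue SET consumed = 1 WHERE id = 1")
            conn.execute("UPDATE orders SET status = 'PAID' WHERE id = 1")
            if fail:
                raise RuntimeError("crash before commit")
    except RuntimeError:
        pass  # both updates were rolled back together

consume_and_update(fail=True)   # crash: neither table changes
consume_and_update(fail=False)  # success: both change atomically
```

With a standalone Kafka cluster, the acknowledgement and the row update live in two systems and can diverge on a crash; here they cannot.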

Priority Transactions in Oracle 23ai

Oracle 23ai introduces Priority Transactions — a mechanism to assign priorities to DML transactions so that higher-priority work can preempt lower-priority work when lock contention occurs.

In a busy OLTP system, a long-running batch job holding a lock can block critical customer-facing transactions. Until now, the only options were to kill the batch session manually or accept the latency. Priority Transactions automate this.

Setting a transaction priority:

-- High priority (e.g., customer checkout)
ALTER SESSION SET txn_priority = 'HIGH';

-- Low priority (e.g., background batch)
ALTER SESSION SET txn_priority = 'LOW';

How preemption works:

When a HIGH priority transaction attempts to acquire a lock held by a LOW priority transaction, Oracle evaluates the conflict. If the high-priority transaction has waited longer than a configurable threshold, Oracle can:

  1. Abort the low-priority transaction (rollback and release locks)
  2. Notify the low-priority session so it can self-abort cleanly
  3. Defer the decision based on age and priority differential
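The decision logic above can be sketched in a few lines. This is a hypothetical model of the behavior described, not Oracle internals — the function name, rank table, and threshold are all illustrative.

```python
# Toy model of the preemption decision: a higher-priority waiter can have
# a lower-priority lock holder rolled back once a wait target elapses.
from dataclasses import dataclass

@dataclass
class Txn:
    priority: str          # 'LOW', 'MEDIUM', or 'HIGH'
    waited_s: float = 0.0  # time the transaction has spent waiting on the lock

WAIT_TARGET_S = 5.0        # the configurable threshold from the text

def resolve_conflict(waiter: Txn, holder: Txn) -> str:
    """Decide what happens when `waiter` wants a lock that `holder` owns."""
    ranks = {'LOW': 0, 'MEDIUM': 1, 'HIGH': 2}
    if ranks[waiter.priority] <= ranks[holder.priority]:
        return 'WAIT'             # no priority advantage: queue normally
    if waiter.waited_s >= WAIT_TARGET_S:
        return 'ROLLBACK_HOLDER'  # abort the low-priority transaction
    return 'DEFER'                # keep waiting until the target elapses
```

The key property: a LOW transaction can never preempt a HIGH one, and even a HIGH waiter only triggers a rollback after the wait target, so short lock waits behave exactly as before.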

Configuration:

ALTER SYSTEM SET txn_auto_rollback_high_priority_wait_target = 5;    -- seconds
ALTER SYSTEM SET txn_auto_rollback_medium_priority_wait_target = 30; -- seconds

Monitoring:

SELECT sid, serial#, blocking_session, event, seconds_in_wait
FROM   v$session
WHERE  blocking_session IS NOT NULL
ORDER BY seconds_in_wait DESC;

Real-world impact: In e-commerce and financial systems where a payment transaction must complete quickly, Priority Transactions ensure that a poorly-timed batch report doesn’t cause a customer-facing timeout. This is the kind of feature that previously required complex application-layer retry logic.

Lock-Free Reservations: High-Concurrency DML Without Conflicts

One of Oracle 23ai’s most interesting concurrency features is Lock-Free Reservations. It solves a classic high-concurrency problem: multiple sessions competing to update the same row — like decrementing inventory counts in a flash sale.

The classic problem:

-- Session 1 and Session 2 both try to decrement inventory
UPDATE products SET stock_qty = stock_qty - 1 WHERE product_id = 42;

With thousands of concurrent orders, Sessions 2 through N all block waiting for Session 1’s lock. This creates a hot row contention bottleneck.

Lock-Free Reservation solution:

-- Define the column as a reservation target
ALTER TABLE products MODIFY (
    stock_qty NUMBER RESERVABLE CONSTRAINT stock_non_negative CHECK (stock_qty >= 0)
);

Once a column is declared RESERVABLE, Oracle handles concurrent updates to it using a reservation mechanism rather than row-level locking. Each session “reserves” a portion of the value without blocking others.

How it works under the hood:

Oracle maintains a pending reservation table alongside the main table. Each concurrent update is stored as a reservation. The actual column value is the base value plus the sum of all pending reservations. When a session commits, its reservation is rolled into the base value.
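The mechanism described above — a base value plus a journal of pending per-transaction deltas, with the CHECK constraint validated against their sum — can be modeled in a short sketch. This is a conceptual simulation, not Oracle's actual implementation; the class and method names are invented.

```python
# Toy model of a RESERVABLE column: concurrent decrements reserve deltas
# in a journal instead of locking the row; commit folds a delta into base.
import threading

class ReservableCounter:
    def __init__(self, base: int):
        self.base = base
        self.pending: dict[int, int] = {}  # txn_id -> reserved delta
        self._mu = threading.Lock()        # protects the journal, not the "row"

    def reserve(self, txn_id: int, delta: int) -> bool:
        with self._mu:
            # CHECK (value >= 0) enforced against base + ALL pending reservations
            if self.base + sum(self.pending.values()) + delta < 0:
                return False               # reservation would violate the constraint
            self.pending[txn_id] = self.pending.get(txn_id, 0) + delta
            return True

    def commit(self, txn_id: int) -> None:
        with self._mu:                     # fold the reservation into the base value
            self.base += self.pending.pop(txn_id, 0)

    def rollback(self, txn_id: int) -> None:
        with self._mu:                     # discard the reservation entirely
            self.pending.pop(txn_id, 0)
```

Note that `reserve` never waits: a transaction either gets its portion of the value immediately or is told the constraint would be violated — which is exactly why hot-row blocking disappears.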

The net effect:

  • Multiple sessions can decrement stock_qty simultaneously without blocking each other
  • The CHECK constraint (stock_qty >= 0) is still enforced — the sum of all reservations cannot drive the value below zero
  • ACID guarantees are maintained

Ideal use cases: Inventory management, ticket booking, seat reservations, resource quotas — any scenario where many sessions update the same numeric column concurrently.

This is a sophisticated concurrency improvement that eliminates a well-known Oracle scalability bottleneck without requiring application-level workarounds.

SQL Firewall: Protecting Your Database with SQL Whitelisting

Oracle SQL Firewall is one of the most important security additions in Oracle 23ai. It’s a kernel-level feature (not an add-on product) that inspects every SQL statement sent to the database and compares it against a learned whitelist of approved statements.

How it works:

  1. Learning phase: SQL Firewall observes your application’s normal SQL traffic and builds a whitelist of allowed statements, users, and connection paths.
  2. Enforcement phase: Any SQL statement not in the whitelist is either blocked, logged, or both.
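The two phases can be sketched as a toy allow-list engine. This is a simulation of the idea only — the class, the crude normalization, and the return values are invented for illustration; the real feature matches on Oracle's own statement normalization, users, and connection paths.

```python
# Toy model of SQL Firewall's two phases: capture builds an allow-list of
# normalized statements; enforcement blocks anything not on the list.
class SqlFirewall:
    def __init__(self):
        self.allow_list: set[str] = set()
        self.capturing = False
        self.enforcing = False

    @staticmethod
    def normalize(sql: str) -> str:
        # crude normalization: lowercase and collapse whitespace, so
        # cosmetic variations of the same statement match one entry
        return " ".join(sql.lower().split())

    def execute(self, sql: str) -> str:
        key = self.normalize(sql)
        if self.capturing:
            self.allow_list.add(key)   # learning phase: record, never block
            return "ALLOWED"
        if self.enforcing and key not in self.allow_list:
            return "BLOCKED"           # enforcement phase: unknown SQL rejected
        return "ALLOWED"
```

The important consequence: an injected statement like `SELECT * FROM users` is blocked not because it matches a bad-pattern signature, but simply because the application never issued it during the learning phase.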

Setting up SQL Firewall:

-- Enable SQL Firewall
EXEC DBMS_SQL_FIREWALL.ENABLE;

-- Start capturing SQL for a specific user
EXEC DBMS_SQL_FIREWALL.CREATE_CAPTURE('APP_USER');

-- (Let the application run normally for the learning period)

-- Stop capturing and generate the allow-list from what was observed
EXEC DBMS_SQL_FIREWALL.STOP_CAPTURE('APP_USER');
EXEC DBMS_SQL_FIREWALL.GENERATE_ALLOW_LIST('APP_USER');

-- Enable enforcement
EXEC DBMS_SQL_FIREWALL.ENABLE_ALLOW_LIST(
    username     => 'APP_USER',
    enforce      => DBMS_SQL_FIREWALL.ENFORCE_SQL,
    block        => TRUE
);

What it catches:

  • SQL injection attempts that generate SQL not in the whitelist
  • Unauthorized queries run by rogue application code or DBA accounts
  • Unexpected data exfiltration queries (SELECT * FROM users when the app only does parameterized lookups)
  • Suspicious connection paths (e.g., a user who normally connects from the app server suddenly connecting from a different IP)

Monitoring violations:

SELECT current_user, sql_text, firewall_action, occurred_at
FROM   dba_sql_firewall_violations
ORDER BY occurred_at DESC;

SQL Firewall is especially valuable for organizations subject to PCI-DSS, HIPAA, or SOX compliance. It provides a last line of defense at the database level — even if the application layer is compromised.

True Cache: Oracle’s New In-Memory Caching Architecture

Oracle 23ai introduces True Cache — a new read-only, in-memory cache tier that sits between your application and the primary Oracle Database. Think of it as a lightweight, always-consistent cache that speaks SQL.

What True Cache is NOT:

  • Not Redis or Memcached (those are key-value stores, not SQL-aware)
  • Not Oracle In-Memory Column Store (that’s inside the primary database)
  • Not a full read replica or standby (it keeps no copy of the data on disk and cannot be failed over to)

What True Cache IS:

A lightweight Oracle instance that maintains a hot subset of the primary database in memory, synchronized via redo log shipping. Applications connect to True Cache and run full SQL queries — with full Oracle SQL semantics — but reads are served from memory without hitting primary storage.

Architecture:

App ──► True Cache (in-memory, read-only SQL)
               │
               │ redo sync
               ▼
        Primary Database (read-write)

Key characteristics:

  • Consistent reads: True Cache uses the same SCN (System Change Number) mechanism as the primary. Reads are always consistent with the primary database.
  • Automatic sync: Changes in the primary are shipped via redo and applied to True Cache — similar to Active Data Guard, but lighter weight.
  • No storage footprint: True Cache doesn’t persist data to disk. It’s purely an in-memory acceleration layer.
  • Transparent to applications: Applications connect using standard Oracle drivers; no code changes needed.
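The redo-shipping relationship in the diagram can be modeled in a short sketch. This is an illustrative simulation only — the class names, the redo tuple format, and the apply loop are invented; the real mechanism is Oracle redo transport and apply.

```python
# Toy model: the primary logs every change with an SCN; the cache applies
# the redo stream in SCN order and serves reads purely from memory.
class Primary:
    def __init__(self):
        self.data: dict[str, str] = {}
        self.redo: list[tuple[int, str, str]] = []  # (scn, key, value)
        self.scn = 0

    def write(self, key: str, value: str) -> None:
        self.scn += 1
        self.data[key] = value
        self.redo.append((self.scn, key, value))    # change shipped via redo

class TrueCacheSim:
    def __init__(self):
        self.data: dict[str, str] = {}
        self.applied_scn = 0

    def apply_redo(self, redo) -> None:
        for scn, key, value in redo:
            if scn > self.applied_scn:              # apply strictly in SCN order
                self.data[key] = value
                self.applied_scn = scn

    def read(self, key: str):
        return self.data.get(key)                   # served from memory only
```

Because the cache tracks the SCN up to which it has applied redo, a read can always be answered at a known consistent point — which is the property Redis-style caches give up.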

Best use case: Read-heavy OLTP workloads where the working set fits in memory and read latency matters — product catalogs, session data, reference lookups.

True Cache is Oracle’s answer to the “put everything in Redis” trend, with the critical advantage of full SQL consistency guarantees.

HAIP Subnet Flipping Upon Link Down/Up Event

HAIP (High Availability IP) is Oracle’s private network redundancy feature in Grid Infrastructure. When a cluster has multiple private network interfaces, HAIP places highly available virtual IPs across them and stripes interconnect traffic over all links, failing the VIPs over if one link goes down. Bug 29379299 reveals a critical flaw in how HAIP handles link recovery.

What happens:

  1. Both private network links are UP. HAIP is healthy.
  2. Link A goes DOWN. HAIP correctly fails over to Link B. All traffic continues on Link B.
  3. Link A comes back UP.
  4. Bug: Instead of staying on Link B (the current active link), HAIP “flips” — it attempts to reassign traffic to Link A, triggering a brief HAIP reconfiguration. During this reconfiguration, cluster interconnect traffic is interrupted, which can cause:
    • False node evictions (CSS thinks a node is dead due to missed heartbeats)
    • ORA-29740 (evicted by member) in database alert logs
    • Brief application outages even though both links are UP

Affected versions:

Oracle Grid Infrastructure 12.1.0.2 through 18c. Fixed in 19c starting with 19.4 RU.

Workarounds for older releases:

Option 1 — Disable HAIP and use OS-level bonding (bonding/teaming at the OS layer is more stable in most environments anyway):

# In GI home, set in crsconfig_params or run during install
-haip_no_config

Option 2 — Apply the one-off patch for Bug 29379299 (available for 12.1.0.2 and 12.2.0.1 through MOS).

Recommendation: If you’re on an affected version and your cluster has had mysterious node evictions following network maintenance or link flaps, check the CRS diagnostic logs ($GRID_HOME/log/<hostname>/cssd/ocssd.log) for HAIP reconfiguration events correlated with the eviction timestamps. This bug is the likely culprit.

+ASM1 Fails to Join +ASM2 During Rolling Upgrade from 12c to 19c

Bug 30452852 is a particularly nasty corner case in rolling Grid Infrastructure upgrades: after upgrading node 1 to 19c, the upgraded +ASM1 instance (running 19c) cannot join the cluster because +ASM2 is still running 12c — and the two versions cannot communicate properly in certain network configurations.

Scenario:

Node 1: Grid Infrastructure 19.3.0, +ASM1 — upgraded
Node 2: Grid Infrastructure 12.1.0.2, +ASM2 — still on 12c

During the rolling upgrade window, both ASM instances must briefly coexist. In a properly configured cluster, 19c ASM is backward-compatible with 12c ASM for this transition. Bug 30452852 causes that compatibility to break.

Symptoms:

  • +ASM1 crashes immediately after starting, citing I/O errors on voting disks
  • Alert log on node 1: GMON: detected ASM instance eviction — cannot reach quorum
  • crsctl stat res ora.asm -t shows +ASM1 OFFLINE, +ASM2 ONLINE

Technical cause:

The bug is in the inter-instance GMON (the ASM disk group monitor process) communication protocol. A version mismatch in the GMON handshake message format causes +ASM1 to interpret a valid 12c heartbeat as a corrupted packet and self-evict as a safety measure.

Workaround:

Before starting the upgrade:

# Set this parameter in ASM spfile or init.ora
_asm_compatibility_override = '12.1.0.2'

This tells 19c ASM to use the 12c-compatible message format during the transition. Remove it after both nodes are on 19c.

The fix is available in the 19.8+ patch set. If you’re running 19.3 or 19.5, this parameter workaround is essential for rolling upgrades in multi-node configurations.

ASM Fails to Start When Upgrading from 12.1.0.2 to 19.3.0

Continuing the ASM upgrade bug series from last week: Bug 30265357 is one of the most impactful RAC upgrade bugs you can encounter during a 12.1.0.2 to 19.3.0 Grid Infrastructure upgrade. It’s distinct from Doc ID 2606735.1 and deserves its own analysis.

What happens:

During the upgrade, after the 19c CRS stack starts on node 1, ora.asm (the ASM resource managed by CRS) fails to start. The ASM instance itself may be running as an OS process, but CRS cannot register or control it properly.

Error signatures in CRS logs:

CRS-2674: Start of 'ora.asm' on 'node1' failed
CRS-2632: There are no more servers to try to place resource 'ora.asm' on that is not already attempting to start
ORA-15077: could not locate ASM instance serving a required diskgroup

Root cause:

The 19c version of ocssd.bin (Cluster Synchronization Services) changes how ASM instance discovery works. During the mixed-version window (19c on node 1, 12c on remaining nodes), the CSS heartbeat mechanism and voting disk access patterns change in a way that causes the ASM resource to time out before completing its startup sequence.

Resolution:

  1. Apply the patch bundle that includes the fix for Bug 30265357 to all nodes before beginning the upgrade. The fix is included in 19.7+ RU bundles.
  2. If you’re already mid-upgrade: manually start ASM using srvctl start asm -n node1 and then attempt crsctl start res ora.asm — if successful, proceed with the rootupgrade.sh on remaining nodes.
  3. As a last resort: bounce the entire cluster and restart the upgrade from a clean state.

Always confirm the patch status with opatch lspatches | grep 30265357 before your maintenance window.

ASM Fails to Start After rootupgrade.sh on the First Node

If you’ve ever performed a rolling upgrade from Oracle Grid Infrastructure 12c to 19c in a RAC environment, you may have hit this one: after running rootupgrade.sh on the first node, the ora.asm resource fails to come online, and the upgrade stalls.

Symptoms:

  • rootupgrade.sh completes on node 1 but ora.asm fails to start
  • CRS alert log shows errors like: CRS-2674: Start of 'ora.asm' on '<node1>' failed
  • crsctl stat res ora.asm -t shows OFFLINE on node 1
  • ASM alert log may show mount errors or listener registration failures

Root cause

This issue is documented under Doc ID 2606735.1 and is related to the ASM SPFILE location and CRS resource configuration misalignment during the upgrade transition window. The 19c CRS stack is running on node 1 while the 12c stack is still active on remaining nodes — creating a brief but critical incompatibility in how ora.asm is registered and started.

Workaround during upgrade:

  1. Before running rootupgrade.sh on node 1, ensure ASM is using a PFILE instead of an SPFILE stored on ASM diskgroups that require the instance to already be mounted.
  2. Run the upgrade with the ASM SPFILE temporarily backed up to a local filesystem location.
  3. After the upgrade on all nodes, restore the SPFILE to ASM.

Related bugs:

This issue is closely related to Bug 30265357 and Bug 30452852, which I’ll cover in the next two weeks. If you’re planning a 12c-to-19c Grid Infrastructure upgrade, read all three before you start the maintenance window.

Always test your upgrade procedure in a non-production environment first, and keep My Oracle Support (MOS) Doc ID 2606735.1 bookmarked.