---
url: /blog/posts/2026-03-31-subqueries-deep-dive.md
description: >-
  A deep dive into how we rebuilt Electric's subquery support — DNF
  decomposition, splice-point move handling, reverse-indexed stream routing, and
  oracle testing for correctness.
---

[Subqueries](/docs/guides/shapes#subqueries) are a key feature of Electric's [Postgres Sync](/primitives/postgres-sync), enabling cross-table filtering when syncing [subsets of data](/docs/guides/shapes) into your apps.

With [v1.X](#) we've shipped a major upgrade to subqueries that makes them more expressive and powerful. Supporting more real-world data loading patterns and solving edge cases where data moving into and out of shapes caused unnecessary re-syncing.

This post dives into the engineering details and shows how we approached the problem with techniques including [DNF decomposition](#arbitrary-expressive-subqueries-via-dnf-decomposition) and [multi-timeline reverse indexes](#replication-stream-routing-with-reverse-indexes). And how we used our [Postgres oracle tests](#oracle-testing-for-correctness) to verify correctness and consistency.

> \[!Warning] ✨  Try it now
> [Read the docs](/docs/guides/shapes#subqueries), the [release notes](#)  and the [move-in move-out visualiser](#) demo app.

## Understanding subqueries

* Electric syncs subsets of Postgres into local apps using shapes — table +
  where clause + optional columns
* Real-world apps have relational data; you often need to filter what you sync
  based on related tables
* Subqueries solve this:
  `WHERE user_id IN (SELECT user_id FROM memberships WHERE org_id = $1)`
* This is how you get "sync the users in my org" or "sync tasks for active
  projects" — the bread and butter of app data loading
* When the underlying data changes — a user joins an org, a project gets
  archived — rows need to move in and out of shapes dynamically

### Previous limitations

* Our previous subquery support handled the common case well but was
  constrained: single subquery per shape, limited boolean logic
* The really hard problem: when related data changes and rows move in/out, how
  do you update the shape incrementally without resending everything?
* You need to know exactly which rows are newly included or excluded, at a
  precise point in the replication stream, without race conditions or duplicates
* Real-world apps with complex filtering logic and dynamic relational data were
  hitting these limitations, causing unnecessary resyncs and data reloading

## Arbitrary expressive subqueries via DNF decomposition

* Previously limited to a single subquery per shape with constrained boolean
  logic
* Now supports arbitrary combinations: `WHERE x IN sq1 OR y IN sq2`, `AND`,
  `NOT IN`, nested expressions
* Show before/after examples of what you can now express

### Why DNF?

* The core insight: a single subquery move maps cleanly to one "what newly
  entered?" query
* With `WHERE x IN sq1 OR y IN sq2`, a move in sq1 should only fetch rows
  newly included by sq1 that weren't already present via sq2
* DNF gives the right planning unit — each disjunct is one independent reason
  a row can be in a shape
* A move only affects the disjuncts that reference the changed dependency
* Move-in queries can be scoped to just those disjuncts

### DNF compilation

* Where clause is normalised to positive DNF:
  `(term AND term) OR (term AND term) OR ...`
* Each term is either a plain row predicate or a positive `IN (SELECT ...)`
  subquery predicate
* The shape keeps compiled metadata: disjuncts, position count,
  dependency-to-position mapping

### The NOT problem

* NOT with subqueries is genuinely hard — negation breaks the clean
  disjunct-scoping model

## Efficient move-in/move-out without resync

* When related data changes, rows move in and out of shapes — a user joins an
  org, a project gets archived
* Previously this triggered a full resync — client gets a 409, reloads
  everything
* Now Electric computes exactly which rows are newly included or excluded and
  streams just those changes

### The splice model

* Core idea: buffer replication stream changes while running a precise move-in
  query, then splice the results into the stream at exactly the right point

### Move-in planning with DNF

* A move in dependency D with new values V: identify impacted disjuncts, build
  a candidate predicate scoped to just those disjuncts, exclude rows already
  present via other disjuncts
* The query is narrow — only fetches rows that are genuinely new to the shape

### Move-out handling

* Move-outs are simpler — emit position-aware broadcasts, clients re-evaluate
  inclusion locally
* No query needed; the client already has the row and just needs updated
  active\_conditions

### Tags and active\_conditions

* Rows carry per-disjunct tags and per-position active\_conditions booleans
* Clients evaluate inclusion: for each tag, AND the active\_conditions at its
  positions, OR the results across tags
* Move broadcasts update active\_conditions for rows already on the client
  without resending the row data

## Replication stream routing with reverse indexes

* When a change arrives from Postgres, Electric needs to figure out which
  shapes it's relevant to
* With many shapes using subqueries, naively evaluating every shape's where
  clause for every change doesn't scale
* We use a reverse index backed by ETS that maps typed values to shape
  handles — a single lookup finds candidate shapes instead of iterating

### How the reverse index works

* For positive predicates (`x IN sq`): look up the value, get shapes whose
  membership contains it
* For negated predicates (`x NOT IN sq`): complement at read time — all
  negated shapes minus those containing the value
* Candidates are verified against the full where clause to handle non-subquery
  branches

### Consistency during moves

* During a move-in, the index needs to be broad enough to capture changes
  relevant to both pre-splice and post-splice views
* For positive dependencies: store the union of before and after membership
* For negative dependencies: store the intersection
* This may over-route changes, but correctness is enforced downstream by
  `convert_change` using the right subquery view for the change's position
  relative to the splice boundary

## Oracle testing for correctness

* Incremental view maintenance with splice points and multi-timeline routing
  is hard to get right
* Edge cases are combinatorial — boolean logic × move timing × concurrent
  changes × multiple dependencies
* Unit tests can't cover the state space; you need a fundamentally different
  testing approach

### Postgres as oracle

* The core idea: Postgres already knows the right answer — run the full query
  and compare
* For any sequence of operations, the oracle runs the equivalent `SELECT`
  against the current database state
* Our incremental system must produce exactly the same result set at every
  point

### What we found

## Next steps

* Try it now: subqueries work with any where clause — see the
  [shapes guide](/docs/guides/shapes#subqueries-experimental) for syntax and
  examples
* Subquery support ships in Electric vX.X — upgrade and start using richer
  cross-table filtering in your shapes
* Works with [TanStack DB 0.6](/blog/2026/03/25/tanstack-db-0.6-app-ready-with-persistence-and-includes)'s
  query-driven sync for progressive data loading with relational filtering
* Join the conversation on [Discord](https://discord.electric-sql.com) — we'd
  love to hear what data loading patterns you're building

***