Materialize for Streaming SQL

Materialize is a streaming database: you write SQL against Kafka, Postgres, and other event sources, and Materialize maintains the results incrementally as data arrives. For typical workloads, results trail the input by less than a second. It speaks the PostgreSQL wire protocol, which makes it the closest thing to 'a real-time database that you query like Postgres'.


Standard SQL, streaming semantics

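-- Define a Kafka source. Syntax varies by release: recent versions declare the
-- broker with CREATE CONNECTION and reference it here, and a FORMAT clause
-- (for example Avro with a schema registry) exposes typed columns like ts and amount.
CREATE SOURCE orders FROM KAFKA BROKER 'kafka:9092' TOPIC 'orders';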

-- Materialized view continuously updated
CREATE MATERIALIZED VIEW hourly_revenue AS
SELECT date_trunc('hour', ts) AS hour,
       SUM(amount) AS total
FROM orders GROUP BY 1;

-- Query the view — always reflects latest data, sub-second
SELECT * FROM hourly_revenue ORDER BY hour DESC LIMIT 10;

Incremental maintenance

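Materialize doesn't re-run the query on every update. It computes the delta that each new input record causes and applies it to the stored result, so the work per update scales with the size of the change rather than the size of the stream. This is what keeps views fresh with sub-second lag even on very large inputs.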
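
To make the delta idea concrete, here is a minimal Python sketch of an incrementally maintained SUM(amount) GROUP BY hour. It is a toy model, not Materialize's implementation: the class name HourlyRevenue, the apply method, and the (ts, amount, diff) change format are all assumptions made for illustration, with amounts in integer cents.

```python
from collections import defaultdict
from datetime import datetime


class HourlyRevenue:
    """Toy incrementally maintained SUM(amount) GROUP BY hour (hypothetical names)."""

    def __init__(self):
        self.totals = defaultdict(int)  # hour -> running total in cents
        self.counts = defaultdict(int)  # hour -> number of live input rows

    def apply(self, ts, amount, diff):
        """Apply one change: diff=+1 inserts a row, diff=-1 retracts it."""
        hour = ts.replace(minute=0, second=0, microsecond=0)
        self.totals[hour] += diff * amount
        self.counts[hour] += diff
        if self.counts[hour] == 0:
            # No rows left in this group, so the group leaves the result.
            del self.totals[hour]
            del self.counts[hour]


view = HourlyRevenue()
view.apply(datetime(2024, 1, 1, 9, 15), 2000, +1)
view.apply(datetime(2024, 1, 1, 9, 40), 500, +1)
view.apply(datetime(2024, 1, 1, 10, 5), 750, +1)
view.apply(datetime(2024, 1, 1, 9, 15), 2000, -1)  # retract the first order
result = dict(view.totals)
```

Each call touches exactly one group, no matter how many rows have already been processed. That is the property the paragraph above describes: the cost follows the size of the change, not the size of the history.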


Joins on streams

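Kafka Streams and Flink make arbitrary joins awkward: for many of them you have to think about windows and late events yourself. Materialize handles them with SQL semantics: any join, any aggregation, no manual windowing. The trade-off is state. Without a window to expire old rows, the join inputs must be retained, so state cost grows with input cardinality.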
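
The state cost of an unwindowed join is easier to see in a small model. The sketch below is a toy symmetric hash join in Python over two insert-only streams. It is an illustration under assumptions, not how Materialize is built: StreamingJoin, on_left, on_right, and state_size are hypothetical names.

```python
from collections import defaultdict


class StreamingJoin:
    """Toy inner equi-join over two insert-only streams (hypothetical names)."""

    def __init__(self):
        self.left = defaultdict(list)   # key -> rows seen so far on the left input
        self.right = defaultdict(list)  # key -> rows seen so far on the right input

    def on_left(self, key, row):
        """Store a left row and emit its matches against the stored right rows."""
        self.left[key].append(row)
        return [(row, r) for r in self.right.get(key, [])]

    def on_right(self, key, row):
        """Store a right row and emit its matches against the stored left rows."""
        self.right[key].append(row)
        return [(l, row) for l in self.left.get(key, [])]

    def state_size(self):
        """Number of input rows retained across both sides."""
        return (sum(len(rows) for rows in self.left.values())
                + sum(len(rows) for rows in self.right.values()))


join = StreamingJoin()
out = []
out += join.on_left("c1", {"order_id": 1})   # no customer yet, nothing emitted
out += join.on_right("c1", {"name": "Ada"})  # matches order 1
out += join.on_left("c1", {"order_id": 2})   # matches Ada immediately
```

Because there is no window, a row can find a match at any later time, so neither side can be discarded. Here state_size() equals the number of rows ever received, which is the growth with input cardinality noted above.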

vs Flink for stream processing

Flink: lower-level, more control, proven at very large scale (on the order of PB/day). Materialize: higher abstraction, SQL-native, faster to prototype. Use Materialize when your team knows SQL but not Flink; use Flink when scale or fine-grained control matters.

Operational model

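Single binary or managed cloud. Working state is kept primarily in memory, and recent releases can spill to disk. Replication for HA. Cost grows mainly with state size, not throughput, so keep your views' state bounded with retention windows (in Materialize, typically a temporal filter on mz_now()).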
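
As a rough illustration of why a retention window bounds state, here is a short Python sketch that keeps hourly totals and drops buckets older than a cutoff. It models the idea only. BoundedHourlyTotals, add, and expire are hypothetical names, and in Materialize itself the window would be declared in SQL rather than coded by hand.

```python
from datetime import datetime, timedelta


class BoundedHourlyTotals:
    """Toy hourly totals with a retention window (hypothetical names)."""

    def __init__(self, retention):
        self.retention = retention  # how far back buckets are kept
        self.totals = {}            # hour -> total in cents

    def add(self, ts, amount):
        hour = ts.replace(minute=0, second=0, microsecond=0)
        self.totals[hour] = self.totals.get(hour, 0) + amount

    def expire(self, now):
        """Drop every bucket whose hour falls before now - retention."""
        cutoff = now - self.retention
        for hour in [h for h in self.totals if h < cutoff]:
            del self.totals[hour]


state = BoundedHourlyTotals(retention=timedelta(hours=24))
state.add(datetime(2024, 1, 1, 9, 15), 2000)
state.add(datetime(2024, 1, 2, 8, 30), 500)
state.add(datetime(2024, 1, 2, 10, 10), 750)
state.expire(now=datetime(2024, 1, 2, 10, 30))  # cutoff is 2024-01-01 10:30
kept = sorted(state.totals)
```

With hourly buckets and a 24-hour retention, the number of buckets stays capped at roughly the window length however long the stream runs, and cost follows it.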

Materialize = streaming SQL with sub-second freshness. Right when SQL is the team's vocabulary and Kafka is the source.