The Heatmap Query
Every second, the App Server polls StarRocks with a spatial aggregation query against the Iceberg table. The query buckets the raw canvas events into a 160×120 grid and sums the pressure per cell:
SELECT
    FLOOR(x / 4.0) AS grid_col,   -- 4 pixels per grid column
    FLOOR(y / 3.0) AS grid_row,   -- 3 pixels per grid row
    SUM(pressure)  AS intensity,
    COUNT(*)       AS event_count
FROM canvas_events
WHERE sessionId = 'session-123'
GROUP BY grid_col, grid_row
ORDER BY intensity DESC;
StarRocks reads the latest Parquet files from Iceberg (committed by Pulsar every ~3 seconds), scans only the columns it needs, filters by session, and computes the aggregation using its vectorized MPP engine. On this demo workload it completes in ~50ms.
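The wiring that makes this possible is a StarRocks external catalog pointing at the Iceberg table. The statement below is a minimal sketch rather than the demo's actual configuration: the catalog and database names, the Hive metastore URI, and the S3 credentials are placeholders, and the exact properties depend on your Iceberg catalog type and StarRocks version.

-- Sketch only: names, URI, and credentials are placeholders.
CREATE EXTERNAL CATALOG iceberg_catalog
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "hive",
    "hive.metastore.uris" = "thrift://metastore:9083",
    "aws.s3.region" = "us-east-1",
    "aws.s3.access_key" = "<access-key>",
    "aws.s3.secret_key" = "<secret-key>"
);
-- The heatmap query then addresses the table as
-- iceberg_catalog.canvas_db.canvas_events.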
Scalability at Each Layer
⚡ Apache Pulsar
Partitioned topics with horizontal broker scaling, plus tiered storage that offloads older segments to object storage. Production clusters handle 1M+ messages/sec per topic, 100K+ topics, and petabytes of offloaded data. Adding producers or consumers requires no reconfiguration; just add brokers.
🧊 Iceberg + S3
S3 scales to exabytes with no capacity provisioning. Iceberg's metadata layer enables partition pruning: with time and session partitions, a query over 10 billion rows touches only the relevant data files. Schema evolution and snapshot isolation come for free.
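To make "time and session partitions" concrete, here is one possible partition spec, written as Spark SQL DDL for Iceberg. It is a sketch with assumed column names and types (event_time in particular does not appear in the demo query), not the demo's actual schema.

-- Hypothetical schema; columns and types are assumptions.
CREATE TABLE canvas_db.canvas_events (
    sessionId   STRING,
    x           INT,
    y           INT,
    pressure    DOUBLE,
    event_time  TIMESTAMP
)
USING iceberg
PARTITIONED BY (days(event_time), sessionId);

With this layout, a query filtered on sessionId and a recent time range lets Iceberg's metadata skip every file outside that session and day.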
⭐ StarRocks
Vectorized MPP engine. A 3-node cluster handles 10K+ queries/sec on structured data, and throughput scales roughly linearly: double the BE nodes, double the throughput. The external catalog prunes Iceberg files before scanning, keeping latency under 100ms even at TB scale.
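To check that pruning actually happens, you can run EXPLAIN on the heatmap query addressed through the external catalog sketched above; the catalog and database names and the event_time filter below are the same placeholder assumptions as in the earlier sketches.

EXPLAIN
SELECT
    FLOOR(x / 4.0) AS grid_col,
    FLOOR(y / 3.0) AS grid_row,
    SUM(pressure) AS intensity
FROM iceberg_catalog.canvas_db.canvas_events
WHERE sessionId = 'session-123'
  AND event_time >= DATE_SUB(NOW(), INTERVAL 1 MINUTE)
GROUP BY grid_col, grid_row;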
📊 End-to-End Latency
In this demo: ~3s event-to-heatmap, dominated by the Pulsar commit interval. In production, with Pulsar's sub-10ms publish latency, Iceberg 1-second micro-batches, and StarRocks sub-100ms queries, the pipeline itself adds roughly 1.1 seconds; even with the App Server's 1-second poll interval on top, end-to-end latency can stay under 2 seconds at scale.