
QueryFlux — Architecture Overview

QueryFlux is a universal SQL query proxy and router. It accepts queries from clients over multiple protocols (Trino HTTP, PostgreSQL wire, MySQL wire, Arrow Flight SQL), routes them to the appropriate backend engine, optionally translates the SQL dialect, and streams results back in the client's native format.

More documentation: the architecture documentation overview indexes deeper topics:

  • motivation-and-goals.md — why the project exists
  • query-translation.md — sqlglot and dialects
  • routing-and-clusters.md — routers, groups, load balancing
  • observability.md — Prometheus, Grafana, Studio, Admin API
  • adding-engine-support.md — new engines, Studio, and client protocols


High-Level Flow

Client (Trino CLI / psql / mysql / DBI)
           │ native protocol
           ▼
┌─────────────────────┐
│  Frontend Listener  │ ← speaks the client's wire protocol
└──────────┬──────────┘
           │ SQL + SessionContext
           ▼
┌─────────────────────┐
│    Router Chain     │ ← selects target cluster group
└──────────┬──────────┘
           │ ClusterGroupName
           ▼
┌─────────────────────┐
│ ClusterGroupManager │ ← load-balances across clusters; queues if at capacity
└──────────┬──────────┘
           │ ClusterName
           ▼
┌─────────────────────┐
│ Translation Service │ ← sqlglot via PyO3; skipped when dialects match
└──────────┬──────────┘
           │ translated SQL
           ▼
┌─────────────────────┐
│   Engine Adapter    │ ← speaks the backend engine's native protocol
└──────────┬──────────┘
           │ QueryExecution (Async | Sync)
           ▼
┌─────────────────────┐
│     Persistence     │ ← stores in-flight state for async engines
└─────────────────────┘

The frontend never knows which engine it's talking to. The engine adapter never knows which client protocol was used. The dispatch layer in the middle is the only place that bridges them.
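The bridging idea can be sketched in a few lines of Rust. Everything here is an illustrative simplification, not the real crates' API: the actual traits are async, carry `SessionContext`, and return `Result` types; error handling and routing are elided.

```rust
// Hypothetical, pared-down sketch of the frontend/adapter bridge.

trait Frontend {
    /// Decode the client's wire protocol into plain SQL.
    fn receive_sql(&self) -> String;
}

trait EngineAdapter {
    /// Run SQL using the backend engine's native protocol.
    fn execute(&self, sql: &str) -> Vec<String>;
}

/// The dispatch layer is the only code that sees both sides.
fn dispatch(frontend: &dyn Frontend, adapter: &dyn EngineAdapter) -> Vec<String> {
    let sql = frontend.receive_sql(); // frontend knows nothing about the engine
    adapter.execute(&sql)             // adapter knows nothing about the client protocol
}

struct FakeTrinoFrontend;
impl Frontend for FakeTrinoFrontend {
    fn receive_sql(&self) -> String { "SELECT 42".to_string() }
}

struct FakeDuckDbAdapter;
impl EngineAdapter for FakeDuckDbAdapter {
    fn execute(&self, sql: &str) -> Vec<String> { vec![format!("ran: {sql}")] }
}

fn main() {
    let rows = dispatch(&FakeTrinoFrontend, &FakeDuckDbAdapter);
    println!("{rows:?}");
}
```

Because both sides are trait objects, any frontend can be paired with any adapter without either knowing the other exists.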


Workspace Layout

queryflux/
├── crates/
│   ├── queryflux/                  # main binary — wires everything together
│   ├── queryflux-core/             # shared types: ProxyQueryId, SessionContext, QueryPollResult, …
│   ├── queryflux-config/           # ConfigProvider trait + YamlFileConfigProvider
│   ├── queryflux-frontend/         # FrontendListenerTrait + protocol implementations
│   ├── queryflux-engine-adapters/  # EngineAdapterTrait + per-engine implementations
│   ├── queryflux-routing/          # RouterTrait + RouterChain + all router implementations
│   ├── queryflux-cluster-manager/  # ClusterGroupManager: load balancing + queueing
│   ├── queryflux-persistence/      # Persistence + MetricsStore + ClusterConfigStore traits + impls
│   ├── queryflux-metrics/          # PrometheusMetrics, BufferedMetricsStore, MultiMetricsStore
│   ├── queryflux-translation/      # TranslatorTrait + SqlglotTranslator (PyO3)
│   ├── queryflux-auth/             # Authentication providers, authorization, identity resolution
│   ├── queryflux-bench/            # Proxy overhead benchmarks (mock backends)
│   └── queryflux-e2e-tests/        # Integration tests
├── ui/queryflux-studio/            # Next.js management UI (cluster monitoring, query history)
├── prometheus/                     # Prometheus scrape config
├── grafana/                        # Grafana provisioning + dashboards
├── docker/                         # Docker Compose files
│   ├── docker-compose.yml          # Local dev: Trino + Postgres + Prometheus + Grafana
│   └── docker-compose.test.yml     # E2E test stack (isolated ports)
├── config.local.yaml               # Example config for local development
└── Makefile                        # build / run / test shortcuts

Core Abstractions

SessionContext (queryflux-core)

Carries protocol-specific metadata that travels with a query from frontend through routing and into the engine adapter. Each variant holds what that protocol actually provides.

pub enum SessionContext {
    TrinoHttp { headers: HashMap<String, String> },
    PostgresWire { user: Option<String>, database: Option<String>, session_params: HashMap<String, String> },
    MySqlWire { user: Option<String>, schema: Option<String>, session_vars: HashMap<String, String> },
    ClickHouseHttp { headers: HashMap<String, String>, query_params: HashMap<String, String> },
    FlightSql { headers: HashMap<String, String> },
}

QueryExecution (queryflux-core)

Engines fall into two models. The adapter declares which model it uses; dispatch handles both uniformly.

QueryExecution::Async { backend_query_id, next_uri, initial_body }
    → dispatcher stores handle in Persistence
    → client polls proxy until complete

QueryExecution::Sync { result: QueryPollResult }
    → dispatcher returns result immediately
    → no Persistence needed
| Engine | Model | Notes |
|---|---|---|
| Trino | Async | Submit → poll nextUri until done |
| DuckDB | Sync | Runs on spawn_blocking, result available immediately |
| StarRocks | Sync | MySQL protocol, single round-trip |
| ClickHouse | Planned | |
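The dispatcher's uniform handling of the two models can be sketched as below. Types and field names are simplified stand-ins (the real QueryExecution carries more fields, and persistence is a trait, not a HashMap).

```rust
// Simplified sketch: one match arm per execution model.
use std::collections::HashMap;

enum QueryExecution {
    // Async engines (e.g. Trino): store a handle; the client polls the proxy.
    Async { backend_query_id: String, next_uri: String },
    // Sync engines (e.g. DuckDB): the result is already in hand.
    Sync { result: Vec<String> },
}

enum DispatchOutcome {
    Polling { proxy_query_id: String },
    Finished { result: Vec<String> },
}

fn handle(
    proxy_query_id: String,
    exec: QueryExecution,
    persistence: &mut HashMap<String, (String, String)>,
) -> DispatchOutcome {
    match exec {
        QueryExecution::Async { backend_query_id, next_uri } => {
            // Store the in-flight handle so later polls can find the backend query.
            persistence.insert(proxy_query_id.clone(), (backend_query_id, next_uri));
            DispatchOutcome::Polling { proxy_query_id }
        }
        QueryExecution::Sync { result } => {
            // Nothing to persist: the round-trip is already complete.
            DispatchOutcome::Finished { result }
        }
    }
}

fn main() {
    let mut store = HashMap::new();
    let sync = handle("q1".into(), QueryExecution::Sync { result: vec!["42".into()] }, &mut store);
    let asy = handle(
        "q2".into(),
        QueryExecution::Async { backend_query_id: "b7".into(), next_uri: "/v1/next".into() },
        &mut store,
    );
    assert!(matches!(sync, DispatchOutcome::Finished { .. }));
    assert!(matches!(asy, DispatchOutcome::Polling { .. }));
    assert!(store.contains_key("q2") && !store.contains_key("q1"));
    println!("ok");
}
```

Note how only the Async arm touches persistence, which is why sync-only deployments can run with the in-memory store.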

EngineAdapterTrait (queryflux-engine-adapters)

pub trait EngineAdapterTrait: Send + Sync {
    async fn submit_query(&self, sql: &str, session: &SessionContext) -> Result<QueryExecution>;
    async fn poll_query(&self, backend_id: &BackendQueryId, next_uri: Option<&str>) -> Result<QueryPollResult>;
    async fn cancel_query(&self, backend_id: &BackendQueryId) -> Result<()>;
    async fn health_check(&self) -> bool;
    fn engine_type(&self) -> EngineType;

    // Catalog discovery — feeds schema context for translation
    async fn list_catalogs(&self) -> Result<Vec<String>>;
    async fn list_databases(&self, catalog: &str) -> Result<Vec<String>>;
    async fn list_tables(&self, catalog: &str, db: &str) -> Result<Vec<String>>;
    async fn describe_table(&self, catalog: &str, db: &str, table: &str) -> Result<Option<TableSchema>>;
}

RouterTrait (queryflux-routing)

pub trait RouterTrait: Send + Sync {
    fn type_name(&self) -> &'static str;
    async fn route(
        &self,
        sql: &str,
        session: &SessionContext,
        frontend_protocol: &FrontendProtocol,
    ) -> Result<Option<ClusterGroupName>>;
}

RouterChain evaluates routers in config order. First Ok(Some(group)) wins. Falls back to routingFallback if every router returns Ok(None). route_with_trace builds a RoutingTrace for debugging and observability.
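The first-match-wins evaluation can be sketched as follows. Here a router is reduced to a closure returning `Option<String>`; the real trait is async and returns `Result<Option<ClusterGroupName>>`, so this is an illustration of the semantics, not the crate's API.

```rust
// First Some(group) wins; fallback when every router declines.
type Router = Box<dyn Fn(&str) -> Option<String>>;

fn route(routers: &[Router], fallback: &str, sql: &str) -> String {
    for r in routers {
        if let Some(group) = r(sql) {
            return group; // first router to name a group wins
        }
    }
    fallback.to_string() // every router returned None → routingFallback
}

fn main() {
    let routers: Vec<Router> = vec![
        // A queryRegex-style router, as a closure.
        Box::new(|sql| sql.contains("big_table").then(|| "trino-default".to_string())),
        // A router that never matches.
        Box::new(|_| None),
    ];
    assert_eq!(route(&routers, "duckdb-local", "SELECT * FROM big_table"), "trino-default");
    assert_eq!(route(&routers, "duckdb-local", "SELECT 1"), "duckdb-local");
    println!("ok");
}
```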


Implemented Components

Frontends

| Protocol | Status | Port |
|---|---|---|
| Trino HTTP | Done | 8080 |
| PostgreSQL wire | Done | 5432 |
| MySQL wire | Done | 3306 |
| Arrow Flight SQL | Done (query execution) | |
| Admin / Prometheus metrics | Done | 9000 |
| ClickHouse HTTP | Planned | 8123 |

Trino HTTP routes:

| Method | Path | Description |
|---|---|---|
| POST | /v1/statement | Submit a new query |
| GET | /v1/statement/qf/queued/{id}/{seq} | Poll a queued query (with backoff) |
| GET | /v1/statement/qf/executing/{id} | Poll an executing query |
| DELETE | /v1/statement/qf/executing/{id} | Cancel a running query |

Engine Adapters

| Engine | Status | Execution model |
|---|---|---|
| Trino | Done | Async HTTP — transparent nextUri proxying |
| DuckDB | Done | Sync embedded — spawn_blocking + Arrow result set |
| StarRocks | Done | MySQL protocol — sync Arrow path via execute_as_arrow |
| Athena | Done | Async AWS SDK — StartQueryExecution → poll → GetQueryResults |
| ClickHouse | Planned | |

Routers

| Router | Matching criteria |
|---|---|
| protocolBased | Which frontend protocol the client used |
| header | HTTP header value (Trino HTTP only) |
| queryRegex | Regex patterns against SQL text |
| clientTags | Trino X-Trino-Client-Tags header |
| pythonScript | Custom Python function (`def route(query, ctx) -> str`) |
| compound | Multiple conditions combined with all (AND) or any (OR) logic |

Persistence

| Store | Status | Use case |
|---|---|---|
| In-memory (DashMap) | Done | Single-instance dev |
| PostgreSQL (JSONB) | Done | Production / HA |
| Redis | Planned | Distributed |

Metrics

| Store | Status | Purpose |
|---|---|---|
| PrometheusMetrics | Done | Real-time operational metrics at /metrics |
| NoopMetricsStore | Done | Default — zero overhead |
| PostgresStore (MetricsStore) | Done | Historical query records for the management UI |
| BufferedMetricsStore | Done | Async write buffer wrapping any MetricsStore |
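The async-write-buffer idea behind BufferedMetricsStore can be sketched with a channel and a background thread: the hot path enqueues and returns immediately, while a writer drains into the wrapped store. All names and shapes below are illustrative, not the crate's real API, and the wrapped store is stubbed as a counter.

```rust
// Hypothetical sketch of a buffered metrics writer.
use std::sync::mpsc;
use std::thread;

struct QueryRecord {
    query_id: String,
    duration_ms: u64,
}

/// Enqueue `n` records through the buffer, then flush by closing the
/// channel; returns how many records the background writer persisted.
fn buffer_and_flush(n: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<QueryRecord>();

    // Background writer: stands in for the wrapped MetricsStore.
    let writer = thread::spawn(move || {
        let mut written = 0u64;
        for rec in rx {
            let _ = (rec.query_id, rec.duration_ms); // a real store would INSERT here
            written += 1;
        }
        written
    });

    // Hot path: sends never block on the database.
    for i in 0..n {
        tx.send(QueryRecord { query_id: format!("q{i}"), duration_ms: i }).unwrap();
    }
    drop(tx); // closing the channel lets the writer drain and exit

    writer.join().unwrap()
}

fn main() {
    println!("flushed {} records", buffer_and_flush(100));
}
```

The design choice here is that query latency never pays for a metrics INSERT; durability of the last few records is traded for throughput.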

Prometheus metrics exposed:

| Metric | Type | Labels |
|---|---|---|
| queryflux_queries_total | Counter | engine_type, cluster_group, status, protocol |
| queryflux_query_duration_seconds | Histogram | engine_type, cluster_group |
| queryflux_translated_queries_total | Counter | src_dialect, tgt_dialect |
| queryflux_running_queries | Gauge | cluster_group, cluster_name |
| queryflux_queued_queries | Gauge | cluster_group |

SQL Translation

Translation is handled by sqlglot (Python, 31+ dialects) called via PyO3.

When translation runs: only when the incoming client dialect differs from the target engine's dialect. Trino client → Trino cluster = zero overhead passthrough.

Two translation modes (both implemented in queryflux-translation; see query-translation.md):

  1. Dialect-only (empty SchemaContext): sqlglot.transpile(sql, read=src, write=tgt) — this is what the main dispatch path uses today (SchemaContext::default()).
  2. Schema-aware (non-empty SchemaContext): parse → sqlglot.optimizer.optimize with MappingSchema → emit in target dialect, with fallback to dialect-only if optimization fails.

Source dialect is inferred from the frontend protocol (TrinoHttp → Trino, PostgresWire → Postgres, etc.). Target dialect comes from the selected cluster’s engine type (via the adapter).

Translation gracefully degrades: if sqlglot is unavailable at startup, the service disables itself and SQL passes through untranslated.
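The "translate only when dialects differ" gate can be sketched as below. The protocol-to-dialect mapping follows the text above; function names are illustrative, and the actual sqlglot call (made via PyO3) is stubbed with a marker comment.

```rust
// Hypothetical sketch of the translation gate.
#[derive(PartialEq, Debug, Clone, Copy)]
enum Dialect { Trino, Postgres, MySql, DuckDb }

enum FrontendProtocol { TrinoHttp, PostgresWire, MySqlWire }

/// Source dialect is inferred from the frontend protocol.
fn source_dialect(p: &FrontendProtocol) -> Dialect {
    match p {
        FrontendProtocol::TrinoHttp => Dialect::Trino,
        FrontendProtocol::PostgresWire => Dialect::Postgres,
        FrontendProtocol::MySqlWire => Dialect::MySql,
    }
}

fn maybe_translate(sql: &str, src: Dialect, tgt: Dialect) -> String {
    if src == tgt {
        return sql.to_string(); // zero-overhead passthrough
    }
    // Stand-in for the real sqlglot.transpile(sql, read=src, write=tgt) call.
    format!("/* {:?}->{:?} */ {}", src, tgt, sql)
}

fn main() {
    let src = source_dialect(&FrontendProtocol::TrinoHttp);
    // Trino client → Trino cluster: SQL is untouched.
    assert_eq!(maybe_translate("SELECT 1", src, Dialect::Trino), "SELECT 1");
    // Trino client → DuckDB cluster: the translation path runs.
    println!("{}", maybe_translate("SELECT 1", src, Dialect::DuckDb));
}
```

In the real service, the target dialect comes from the selected cluster's engine type, so this check happens after routing.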


Configuration

queryflux:
  externalAddress: http://localhost:8080
  frontends:
    trinoHttp: { enabled: true, port: 8080 }
    postgresWire: { enabled: false, port: 5432 }
    mysqlWire: { enabled: false, port: 3306 }
  persistence:
    inMemory: {} # or: postgres: { databaseUrl: "postgres://..." }
  adminApi:
    port: 9000

clusters:
  trino-1:
    engine: trino
    endpoint: http://trino:8080
    enabled: true
  duckdb-1:
    engine: duckDb
    enabled: true
    databasePath: /data/analytics.duckdb # omit for in-memory

clusterGroups:
  trino-default:
    enabled: true
    maxRunningQueries: 100
    members: [trino-1]

  duckdb-local:
    enabled: true
    maxRunningQueries: 4
    members: [duckdb-1]

translation:
  errorOnUnsupported: false

routers:
  - type: protocolBased
    trinoHttp: trino-default

  - type: header
    headerName: X-Target-Engine
    headerValueToGroup:
      duckdb: duckdb-local

  - type: pythonScript
    script: |
      def route(query, ctx):
          if "big_table" in query:
              return "trino-default"
          return None

routingFallback: duckdb-local

Local Development

Prerequisites

  • Rust (stable)
  • Docker + Docker Compose
  • Python 3.10+

Setup

# Install Python dependencies (sqlglot)
make setup

# Export Python path for PyO3
export PYO3_PYTHON=$(pwd)/.venv/bin/python3

# Start Trino + Postgres + Prometheus + Grafana, then run the proxy
make dev

Services

| Service | URL | Credentials |
|---|---|---|
| QueryFlux (Trino HTTP) | http://localhost:8080 | |
| Prometheus metrics | http://localhost:9000/metrics | |
| Trino (direct) | http://localhost:8081 | |
| Prometheus | http://localhost:9090 | |
| Grafana | http://localhost:3000 | admin / admin |
| PostgreSQL | localhost:5433 | queryflux / queryflux |

Send a query

# Via Trino CLI
trino --server http://localhost:8080 --execute "SELECT 42"

# Via curl
curl -s -X POST http://localhost:8080/v1/statement \
  -H "X-Trino-User: dev" \
  -d "SELECT current_date"

Roadmap

| Phase | Feature | Status |
|---|---|---|
| P1 | Trino HTTP frontend + DuckDB/Trino backends | Done |
| P1 | sqlglot translation (dialect-only) | Done |
| P1 | Prometheus metrics | Done |
| P1 | Postgres persistence + query history | Done |
| P1 | PostgreSQL wire frontend | Done |
| P1 | MySQL wire frontend + StarRocks backend | Done |
| P1 | Arrow Flight SQL frontend | Done |
| P1 | QueryFlux Studio — management UI | Done |
| P1 | Athena backend | Done |
| P1 | Authentication / authorization (queryflux-auth) | Done |
| P2 | Wire SchemaContext from catalog into dispatch | Planned |
| P3 | ClickHouse HTTP backend + frontend | Planned |