Roadmap

What shipped as OpenQueue's worlds architecture in 1.0, and the transport and control-plane work still ahead.

This page began as the design pitch for a broker-neutral runtime boundary. Most of it shipped in 1.0 — under the name worlds — and what's left is the real forward roadmap below.

Shipped in 1.0: the worlds architecture

The "runtime adapter" proposal landed as worlds: a delivery backend is a transport × store composition, selected with a world: field (not runtime:).

Transport × store composition. worldBullmq({ url, storage }) pairs a Redis/BullMQ transport with any durable store; @openqueue/world-postgres is a self-migrating SELECT … FOR UPDATE SKIP LOCKED transport with a Drizzle store and zero Redis. A world owns its own durable state.
The redis: sugar is exactly what this page asked for: redis: { url } resolves to worldBullmq({ url }) — no new imports for the common path.
Capabilities are part of the contract. Each transport declares what it supports, and an unsupported operation (e.g. flows on a queue-less world) returns a typed 501 unsupported_capability instead of failing silently.
A frozen contract. QueueTransport, QueueStorage, and OpenQueueWorld are frozen at WORLD_SPEC_VERSION = 1, so a third-party world has a stable target.
The /openqueue/v1 control plane the old "Workbench changes" section wished for now ships — mounted by the worker, consumed by @openqueue/client, and the basis for the two-plane control/execution split.
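
As a rough illustration of the capabilities contract, the sketch below shows one way a declared-capability check could produce the typed 501 result. The capability names, the result types, and the checkCapability helper are assumptions made for this example, not the shipped OpenQueue API; only the 501 status and the unsupported_capability code come from the description above.

```typescript
// Hypothetical capability names; the real set is defined by the world contract.
type WorldCapability = "flows" | "schedules" | "delay";

// Typed failure carrying the 501 / unsupported_capability pair.
interface UnsupportedCapability {
  ok: false;
  status: 501;
  code: "unsupported_capability";
  capability: WorldCapability;
}

type CapabilityCheck = { ok: true } | UnsupportedCapability;

// Compares the requested operation against what the transport declared.
function checkCapability(
  declared: WorldCapability[],
  needed: WorldCapability
): CapabilityCheck {
  if (declared.indexOf(needed) !== -1) {
    return { ok: true };
  }
  return {
    ok: false,
    status: 501,
    code: "unsupported_capability",
    capability: needed,
  };
}

// A world that declares schedules but not flows rejects a flow request.
const queueLessWorld: WorldCapability[] = ["schedules"];
const flowCheck = checkCapability(queueLessWorld, "flows");
```

Returning a discriminated result keeps the failure explicit at the call site, which is the opposite of the silent failure the contract is meant to rule out.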

See Worlds, Two-plane, and Upgrading to 1.0. Everything below is what remains.

1.0.x follow-ups

Near-term work that doesn't break the frozen surface:

Unpin h3 to stable v2. The workbench and worker pin h3 to its release-candidate line; the unpin lands when a stable h3 v2 ships. It's a patch bump, not a breaking change.
NDJSON run streaming. A streaming variant of the run-events endpoint so the dashboard and @openqueue/client can tail a run's lifecycle without polling.
withOpenqueue helper. A small wrapper to mount the worker's control API and Workbench into an existing app with one call, alongside the current h3 and Next adapters.
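
For a sense of what consuming an NDJSON stream involves, here is a minimal line-buffering parser. The RunEvent shape and the createNdjsonParser name are hypothetical; the streaming endpoint has not shipped, so its real event schema may differ.

```typescript
// Hypothetical event shape; the real run-event schema is not specified here.
interface RunEvent {
  runId: string;
  type: string;
}

// Returns a function that accepts raw text chunks and yields complete events.
// Partial lines are buffered until the newline that terminates them arrives.
function createNdjsonParser(): (chunk: string) => RunEvent[] {
  let buffer = "";
  return function pushChunk(chunk: string): RunEvent[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    // The last element is either "" or an incomplete line; keep it for later.
    buffer = lines.pop() as string;
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as RunEvent);
  };
}

// A chunk boundary may fall in the middle of a JSON line.
const pushChunk = createNdjsonParser();
const firstBatch = pushChunk('{"runId":"r1","type":"started"}\n{"runId":"r1","ty');
const secondBatch = pushChunk('pe":"completed"}\n');
```

Buffering by newline rather than parsing each chunk directly matters because network chunk boundaries rarely line up with event boundaries.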

New worlds: Kafka and RabbitMQ

BullMQ/Redis and Postgres are the two shipped transports. The next ones target teams whose operational center of gravity is a broker they already run:

Transport	Why teams want it
`worldBullmq` (Redis)	Simple local dev, delayed jobs, retries, flows, good all-in-one semantics. Shipped.
`worldPostgres`	One database for jobs and history, no Redis. Shipped.
Kafka	Existing event backbone, high throughput, durable ordered logs, replay, large worker fleets.
RabbitMQ	Mature work queues, routing keys, exchanges, acknowledgements, prefetch, familiar operations.
SQS / Pub/Sub / NATS	Managed cloud queues, simpler operations, or lower-latency messaging in specific environments.

The goal is not to pretend every broker has the same semantics. It's to expose the same task API where possible, make transport differences explicit through capabilities, and keep operational state portable through the store.

A new broker ships as a @openqueue/world-<broker> package exporting a world<Broker>(...) factory, consistent with world-bullmq/world-postgres:

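// Proposed shape: worldKafka is not shipped yet.
export default defineConfig({
  namespace: 'my-app',
  dirs: ['./worker'],
  world: worldKafka({
    // Comma-separated broker list from the environment.
    brokers: process.env.KAFKA_BROKERS!.split(','),
    // Consumer group shared by this app's worker pool.
    groupId: 'my-app-workers',
    // Durable store that the Kafka transport is composed with.
    storage: postgresAdapter({ db, schema: queueSchema }),
  }),
});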

The transport implements the frozen QueueTransport contract and declares its capabilities; core owns task discovery, validation, lifecycle events, OpenTelemetry, store writes, and the task API. A new transport does not re-specify the contract — see Worlds for QueueTransport, QueueStorage, and OpenQueueWorld.

Brokers that lack BullMQ-style inspection lean harder on the store, so the store grows a few transport-neutral tables as these worlds land: an outbox for transactional enqueue, flow_nodes for a portable flow graph, dead_letters for terminal failures with enough broker metadata to diagnose poison messages, and optional offsets for transports that need explicit cursor bookkeeping.

Kafka

Kafka is a strong fit for high-throughput background work when a team already runs Kafka as its event backbone. It is not a drop-in for BullMQ because Kafka is a partitioned log, not a job queue.

Recommended mapping:

OpenQueue concept	Kafka mapping
namespace	topic prefix or message header
queue	Kafka topic
run id	message key or header
task id	message header and payload field
payload	message value
metadata	message headers plus value fields
worker pool	consumer group
concurrency	partitions, consumers, and per-consumer execution limits
retry	store-backed retry topic or scheduler
dead letter	dead-letter topic
progress/logs	store events
schedule	store scheduler publishing to Kafka

Design details:

Topics should usually be per queue, such as my-app.default and my-app.email.
Message keys should be chosen deliberately: a run id spreads work evenly; an entity id preserves per-entity ordering.
Commit offsets only after the task succeeds or after the retry/failure state has been durably recorded.
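Long-running jobs require careful max-poll-interval and heartbeat handling: a consumer that goes too long without polling or heartbeating is evicted from the group, and its partitions are rebalanced to other consumers.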
Retrying by not committing offsets can block a partition behind one bad message, so prefer recording retry state, committing the original offset, and republishing later.
Kafka has no native per-message delay like BullMQ delayed jobs. Use retry topics for coarse delays or store-backed scheduling for exact delays.
Workbench should show consumer-group lag and partition assignment when the client exposes it.
Use idempotent producers and deterministic run ids for safer publish retries.
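
The mapping table and the topic and key guidance above can be sketched as a pure function. This is an illustration under assumptions: the RunEnvelope shape, the toKafkaRecord name, the orderingKey field, and the header names are all hypothetical, and no Kafka client is involved.

```typescript
// Hypothetical run shape used only for this sketch.
interface RunEnvelope {
  namespace: string;
  queue: string;
  runId: string;
  taskId: string;
  payload: unknown;
  // Optional entity id; when set it becomes the key to preserve per-entity ordering.
  orderingKey?: string;
}

interface KafkaRecordSketch {
  topic: string;
  key: string;
  value: string;
  headers: Record<string, string>;
}

function toKafkaRecord(run: RunEnvelope): KafkaRecordSketch {
  return {
    // One topic per queue, prefixed by namespace, e.g. my-app.email.
    topic: run.namespace + "." + run.queue,
    // Run id spreads work evenly; an entity id keeps related runs on one partition.
    key: run.orderingKey !== undefined ? run.orderingKey : run.runId,
    // Task id travels in the value as well as in a header.
    value: JSON.stringify({ taskId: run.taskId, payload: run.payload }),
    headers: {
      "openqueue-run-id": run.runId, // assumed header name
      "openqueue-task-id": run.taskId, // assumed header name
    },
  };
}
```

Making the key an explicit choice per enqueue keeps the ordering versus throughput trade-off visible to the task author instead of hiding it in the transport.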

Operational caveats:

Partition count is the hard ceiling on active consumers inside one consumer group; consumers beyond that count sit idle.
Ordering and concurrency trade off against each other.
Poison messages need dead-letter handling or they can stall progress.
Replays are powerful but must not accidentally re-run side-effecting tasks without an explicit replay mode.
Large payloads should be stored externally, with the message carrying a pointer.
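
To make the offset-commit guidance and the poison-message caveat concrete, the sketch below walks one partition in order, records each failure as a retry or a dead letter, and only then advances the offset. Everything here is an in-memory stand-in: the types, drainPartition, and the arrays that play the role of store writes are assumptions for illustration.

```typescript
interface PartitionMessage {
  offset: number;
  runId: string;
  attempt: number; // 1-based number of the attempt being executed
}

interface SettleResult {
  // Offset of the last settled message. Real clients typically commit the
  // next offset to read, which is this value plus one.
  lastSettledOffset: number;
  retries: string[];
  deadLetters: string[];
}

function drainPartition(
  messages: PartitionMessage[],
  handler: (message: PartitionMessage) => void,
  maxAttempts: number
): SettleResult {
  const result: SettleResult = { lastSettledOffset: -1, retries: [], deadLetters: [] };
  for (const message of messages) {
    try {
      handler(message);
    } catch (err) {
      // Record the outcome first; the arrays stand in for durable store writes.
      if (message.attempt >= maxAttempts) {
        result.deadLetters.push(message.runId);
      } else {
        result.retries.push(message.runId);
      }
    }
    // The outcome is recorded either way, so the offset can advance and one
    // bad message cannot stall the messages behind it.
    result.lastSettledOffset = message.offset;
  }
  return result;
}
```

The retry itself is republished later as a new message, which is what lets the original offset be committed without losing the run.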

Kafka Share Groups are an emerging model that behaves more like a queue: multiple consumers can share records without strict partition ownership. This is closer to OpenQueue's worker-pool semantics than classic consumer groups.

Potential benefits: better work sharing for queue-like tasks, less partition-bound parallelism, more natural acknowledgement semantics, and easier scaling for heterogeneous worker fleets.

Stance:

Support classic consumer groups first.
Keep the transport boundary flexible enough for Share Groups later.
Treat Share Groups as an optional Kafka mode, not the baseline.
Document client and provider maturity before recommending it in production.

RabbitMQ

RabbitMQ is a natural fit for background jobs because it already models queues, acknowledgements, routing, and prefetch. Its worker semantics are closer to BullMQ's than Kafka's are, but it still lacks BullMQ's full job-state model.

Recommended mapping:

OpenQueue concept	RabbitMQ mapping
namespace	exchange prefix or vhost
queue	RabbitMQ queue
task id	routing key or message header
run id	message id or header
payload	message body
metadata	message headers
worker concurrency	consumer prefetch and local execution limit
retry	delayed exchange, DLX, or store scheduler
dead letter	DLX and dead-letter queue
progress/logs	store events
schedule	store scheduler publishing to exchange

Design details:

Use durable exchanges and durable queues in production, and persistent messages for jobs that must survive a broker restart.
Use publisher confirms for enqueue success, and manual acknowledgements (never auto-ack).
Set prefetch to match worker concurrency, and ack only after task success or after retry/failure state is durable.
Reject or dead-letter poison messages intentionally; do not requeue forever.
Use quorum queues for stronger durability where appropriate.
Keep long delays in the store unless the deployment intentionally uses a delayed-message strategy.
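
The RabbitMQ mapping and the durability and prefetch guidance can be sketched the same way. The option names persistent, messageId, headers, and noAck follow amqplib's publish and consume options; the RabbitRun shape, the function names, the exchange naming, and the header name are assumptions for this example, and nothing here connects to a broker.

```typescript
// Hypothetical run shape used only for this sketch.
interface RabbitRun {
  namespace: string;
  queue: string;
  runId: string;
  taskId: string;
  payload: unknown;
}

interface RabbitPublishSketch {
  exchange: string;
  routingKey: string;
  content: string;
  options: { persistent: boolean; messageId: string; headers: Record<string, string> };
}

function toRabbitPublish(run: RabbitRun): RabbitPublishSketch {
  return {
    exchange: run.namespace + ".tasks", // assumed naming: namespace as exchange prefix
    routingKey: run.queue + "." + run.taskId, // queue.task key for a direct exchange
    content: JSON.stringify(run.payload),
    options: {
      persistent: true, // lets the message survive a broker restart on a durable queue
      messageId: run.runId,
      headers: { "openqueue-task-id": run.taskId }, // assumed header name
    },
  };
}

// Prefetch mirrors worker concurrency; noAck: false means manual acknowledgements.
function consumerSettings(concurrency: number): { prefetch: number; noAck: boolean } {
  return { prefetch: concurrency, noAck: false };
}
```

Keeping prefetch equal to concurrency means the broker never hands a worker more unacknowledged messages than it can execute at once.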

Routing options:

A direct exchange for simple queue.task keys.
A topic exchange to route by queue, task id, tenant, or priority class.
A separate queue per OpenQueue queue for the easiest Workbench and scaling model.
Priority queues as an explicit opt-in, since they affect broker behavior.

Retry options:

Strategy	Use case	Tradeoff
Store-backed retry scheduler	Default portable behavior.	Requires a store-backed world.
Message expiry plus dead-letter exchange	Simple broker-native retry lanes.	Coarse delays and more queue topology.
Delayed-message exchange	Convenient per-message delay.	Plugin availability and operational support vary.
Immediate requeue	Very short transient failures.	Can hot-loop and starve other work.
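
The "message expiry plus dead-letter exchange" row is the least obvious of these strategies, so here is a sketch of the queue topology it implies. The queue arguments x-message-ttl, x-dead-letter-exchange, and x-dead-letter-routing-key are standard RabbitMQ queue arguments; the retryLanes helper and the lane naming are assumptions for illustration.

```typescript
interface RetryLaneQueue {
  name: string;
  arguments: {
    "x-message-ttl": number;
    "x-dead-letter-exchange": string;
    "x-dead-letter-routing-key": string;
  };
}

// Builds one consumer-less "waiting" queue per delay. A message parked in a
// lane expires after the TTL and is dead-lettered back to the work exchange.
function retryLanes(
  workQueue: string,
  workExchange: string,
  delaysMs: number[]
): RetryLaneQueue[] {
  return delaysMs.map((delayMs) => ({
    name: workQueue + ".retry." + delayMs, // assumed naming scheme
    arguments: {
      "x-message-ttl": delayMs,
      "x-dead-letter-exchange": workExchange,
      "x-dead-letter-routing-key": workQueue,
    },
  }));
}

// Two lanes: a 5 second retry and a 1 minute retry for the email queue.
const emailLanes = retryLanes("my-app.email", "my-app.tasks", [5000, 60000]);
```

Each distinct delay needs its own queue, which is where the "coarse delays and more queue topology" tradeoff in the table comes from.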

Retry and delay across transports

Attempts and exponential backoff are already transport-neutral. What varies is delivery: a broker-native mechanism is fine for short delays, but long or important delays are safer when backed by the store. The store-backed path runs as follows:

The worker records a retry event and the next attempt time.
The worker acks or rejects the broker message per the transport's semantics.
A scheduler scans due retries from the store.
The scheduler republishes a new message for the same run.
The worker receives the retry and continues with incremented attempt state.

The store-backed path keeps retry behavior consistent even when transports differ, and it is the default for transports without native delay.
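
The store-backed retry steps can be sketched with an in-memory array standing in for the store. The PendingRetry shape, the function names, and the 1 second base and 60 second cap are assumptions chosen for the example, not OpenQueue defaults.

```typescript
interface PendingRetry {
  runId: string;
  attempt: number; // the attempt that will run next
  nextAttemptAt: number; // epoch milliseconds
}

// Exponential backoff: base, 2x base, 4x base, and so on, capped at maxMs.
function backoffDelayMs(failedAttempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * Math.pow(2, failedAttempt - 1));
}

// The worker records the retry and its due time before settling the broker message.
function recordRetry(
  store: PendingRetry[],
  runId: string,
  failedAttempt: number,
  now: number
): void {
  store.push({
    runId,
    attempt: failedAttempt + 1,
    nextAttemptAt: now + backoffDelayMs(failedAttempt, 1000, 60000),
  });
}

// The scheduler scans for due retries, republishes them, and returns what is left.
function republishDue(
  store: PendingRetry[],
  now: number,
  publish: (retry: PendingRetry) => void
): PendingRetry[] {
  const due = store.filter((retry) => retry.nextAttemptAt <= now);
  due.forEach((retry) => publish(retry));
  return store.filter((retry) => retry.nextAttemptAt > now);
}
```

Because the due time lives in the store rather than in the broker, the same scheduler works unchanged whether the transport has native delay or not.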

Transactional outbox

Transports should support an outbox so application state and enqueueing don't drift:

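// One database transaction covers both the invoice row and the outbox row,
// so the enqueue rolls back if the invoice insert fails.
await db.transaction(async (tx) => {
  await tx.insert(invoices).values(invoice);

  // Writes an outbox row through tx instead of publishing to the broker directly.
  await queue.outbox(tx).enqueue(sendInvoice, {
    invoiceId: invoice.id,
  });
});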

The dispatcher then publishes to the transport and marks the outbox row delivered. This matters most for Kafka and RabbitMQ, where the database commit and the broker publish happen in separate systems. The requirements: outbox rows carry a deterministic id and an optional dedupeKey, dispatch is idempotent, publish confirmation is recorded, failed publishes retry with backoff, and Workbench surfaces stuck rows as operational errors. Plain enqueue still exists for simple cases, documented as at-least-once without transactional coupling.
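
A dispatcher pass over the outbox might look roughly like the sketch below. The OutboxRow shape and both function names are hypothetical, the publish callback stands in for a broker publish that returns whether it was confirmed, and backoff timing is left out to keep the example short.

```typescript
interface OutboxRow {
  id: string; // deterministic id, so a repeated publish can be deduplicated
  dedupeKey?: string;
  delivered: boolean;
  attempts: number;
}

// One dispatcher pass. Delivered rows are skipped, so running the pass again
// after a crash or a timeout does not publish them a second time.
function dispatchOutbox(
  rows: OutboxRow[],
  publish: (row: OutboxRow) => boolean
): OutboxRow[] {
  return rows.map((row) => {
    if (row.delivered) {
      return row;
    }
    const confirmed = publish(row);
    // Record the confirmation (or the failed attempt) on the row itself.
    return { ...row, delivered: confirmed, attempts: row.attempts + 1 };
  });
}

// Rows that keep failing are what a dashboard would surface as stuck.
function stuckRows(rows: OutboxRow[], maxAttempts: number): OutboxRow[] {
  return rows.filter((row) => !row.delivered && row.attempts >= maxAttempts);
}
```

A row that was published but crashed before being marked delivered would be published again on the next pass, which is why the deterministic id and dedupeKey matter on the consuming side.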

Portable flows

enqueueFlow() ships today on the BullMQ world's native flow support. Kafka and RabbitMQ have no equivalent parent/child job graph, so a portable flow engine moves the graph into the store, gated by the transport's flows capability.

Flow state tracks the flow id, node id, task id, queue, payload, parent node, dependency node ids, node status, output or error, and timestamps. Execution:

Insert all flow nodes in the store.
Publish only nodes whose dependencies are satisfied.
On node completion, mark it and check dependents.
Publish newly unblocked children through the transport.
On a terminal node failure, mark dependents blocked or failed per the flow policy.
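
The execution steps above can be sketched as a small in-memory engine. This is an assumption-laden illustration: FlowNodeRow keeps only a subset of the flow state fields, the function names are invented, and failure handling hard-codes a "block dependents" policy where a real engine would consult the flow policy.

```typescript
type FlowNodeStatus = "pending" | "published" | "completed" | "failed" | "blocked";

interface FlowNodeRow {
  nodeId: string;
  dependsOn: string[];
  status: FlowNodeStatus;
}

// Keyed by node id; stands in for the flow_nodes table.
type FlowGraph = Record<string, FlowNodeRow>;

// Publishes every pending node whose dependencies have all completed.
function publishReady(graph: FlowGraph, publish: (nodeId: string) => void): void {
  for (const id of Object.keys(graph)) {
    const node = graph[id];
    const ready = node.dependsOn.every((dep) => graph[dep].status === "completed");
    if (node.status === "pending" && ready) {
      node.status = "published";
      publish(id);
    }
  }
}

// Marks a node completed, then publishes any children it unblocked.
function completeNode(graph: FlowGraph, nodeId: string, publish: (nodeId: string) => void): void {
  graph[nodeId].status = "completed";
  publishReady(graph, publish);
}

// Marks a node failed and blocks every transitive dependent.
function failNode(graph: FlowGraph, nodeId: string): void {
  graph[nodeId].status = "failed";
  let changed = true;
  while (changed) {
    changed = false;
    for (const id of Object.keys(graph)) {
      const node = graph[id];
      const upstreamDead = node.dependsOn.some(
        (dep) => graph[dep].status === "failed" || graph[dep].status === "blocked"
      );
      if (node.status === "pending" && upstreamDead) {
        node.status = "blocked";
        changed = true;
      }
    }
  }
}
```

Because the graph lives in the store and the transport only ever sees individual publishes, the same engine would work on any world that can deliver a single message.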

BullMQ can keep using native flows internally; the long-term goal is the same Workbench graph and the same flow state regardless of transport.

Workbench runs from the store

The /openqueue/v1 control plane shipped, but the dashboard's Runs and Errors pages still read BullMQ directly — so on a non-BullMQ world they render empty (see Workbench). The roadmap item is to read runs, logs, progress, and the event timeline from the store through the control plane, so every world shows the same run history in the dashboard, not just BullMQ.

Replay tooling

Store-backed run history makes replay tractable: re-enqueue a run or a filtered set from the dashboard or the CLI. Because tasks have side effects, replay must be an explicit mode with a loud warning — never an accidental re-run — and should record that a run originated from a replay.
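
One way to make replay explicit is to refuse it unless the caller opts in, and to stamp the new run with its origin. The StoredRun shape, the replayOf field, the confirmSideEffects option, and the replayRun function are all hypothetical names for this sketch.

```typescript
interface StoredRun {
  runId: string;
  taskId: string;
  payload: unknown;
  // Set only on runs created by a replay; points at the original run.
  replayOf?: string;
}

// Builds the run to re-enqueue. Throws unless the caller explicitly confirms,
// so a replay can never happen as a side effect of some other operation.
function replayRun(
  original: StoredRun,
  newRunId: string,
  options: { confirmSideEffects: boolean }
): StoredRun {
  if (!options.confirmSideEffects) {
    throw new Error(
      "Replay re-executes task side effects; pass confirmSideEffects: true to proceed."
    );
  }
  return {
    runId: newRunId,
    taskId: original.taskId,
    payload: original.payload,
    replayOf: original.runId,
  };
}
```

Recording the origin on the new run lets the dashboard distinguish replayed runs from first-time runs when someone later audits what happened.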

Conformance

Every world passes the same conformance suite before it ships: enqueue and execute a task, validate the payload before publish where possible, preserve trace context, record lifecycle events, complete and fail with a serialized error, retry with exponential backoff, stop on NonRetryableError, honor the run timeout, run at the configured concurrency, shut down gracefully, and expose queue and run state to the control plane. Schedules and flows are exercised according to each transport's declared capabilities.
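
Capability-gated conformance could be as simple as filtering the case list by what the transport declares. The case list below is abbreviated, and the types and the casesFor helper are assumptions for illustration rather than the actual suite.

```typescript
type ConformanceCapability = "schedules" | "flows";

interface ConformanceCase {
  name: string;
  // Cases without a requirement run against every world.
  requires?: ConformanceCapability;
}

// Abbreviated list; the full suite covers every behavior named above.
const conformanceCases: ConformanceCase[] = [
  { name: "enqueue and execute" },
  { name: "retry with exponential backoff" },
  { name: "stop on NonRetryableError" },
  { name: "graceful shutdown" },
  { name: "schedules", requires: "schedules" },
  { name: "flows", requires: "flows" },
];

// Selects the cases a world must pass given its declared capabilities.
function casesFor(declared: ConformanceCapability[]): string[] {
  return conformanceCases
    .filter((c) => c.requires === undefined || declared.indexOf(c.requires) !== -1)
    .map((c) => c.name);
}
```

A stricter suite might go further and assert that each undeclared capability returns the typed 501, rather than skipping the case entirely.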

New transports add a broker-specific suite on top:

Kafka: offset commits, partition assignment, consumer-group rebalance, idempotent publish, retry topics or store retries, dead-letter topics, lag reporting, and replay behavior.
RabbitMQ: publisher confirms, manual ack, nack/reject behavior, prefetch, durable-queue recovery, dead-letter routing, quorum queues, and delay strategy.

Open questions

1.0 answered several of the original ones — capability degradation is decided (typed errors plus 501 unsupported_capability), and per-world storage is decided (a world owns its own store). Still open:

Should task priority be portable across transports, or declared transport-specific?
Should the outbox API live on the queue client, the store, or both?
Should large-payload externalization be a core feature before Kafka support?
Should replay be a Workbench feature, a CLI feature, or both?
For Share Groups vs classic consumer groups, which becomes the recommended Kafka default once client and provider support matures?

Where this goes next

The worlds architecture is in place. Kafka is the first new transport to target if the goal is high-throughput event infrastructure; RabbitMQ is first if the goal is queue semantics familiar to teams already running AMQP. In both cases, a store-backed world is what keeps OpenQueue feeling like OpenQueue instead of exposing every broker's raw edges to task authors.
