A user signs up and you need to send a welcome email. The email provider takes 800 milliseconds to respond, and the user is watching a loading spinner. An image gets uploaded and needs to be resized into six formats. A payment webhook arrives and you need to update inventory, generate an invoice, and notify a warehouse. The work is real, but none of it belongs in the HTTP response path.
Every major language ecosystem has solved this problem. Ruby has Sidekiq. Elixir has Oban. Node has BullMQ. Python has Celery. These are good tools built by experienced teams and tested in production at scale. When we started designing ntnt's concurrency story, the question was not whether we needed background jobs, but whether jobs should be a library or a language feature, and whether that distinction would make a meaningful difference in practice.
We think it does. This post explains what we built, how we built it, and where the design differs from existing systems.
What Jobs Are For
Before getting into architecture, it's worth being concrete about what background jobs actually do, because the use cases shape the design.
Deferring slow work is the most common case. Sending emails, calling third-party APIs, generating PDFs, anything that takes more than a few hundred milliseconds. The web request enqueues a job and returns immediately. A worker picks it up whenever capacity is available.
Scheduled tasks add a time component. Send a reminder 24 hours after signup. Generate a report every morning at 9am. Check for expired trials hourly. The job exists, but it doesn't execute until a specified future time.
Reliable processing is fundamentally different from fire-and-forget. A payment webhook arrives from Stripe. If your server crashes mid-processing, the job needs to survive and retry. Job systems provide durability (the job is persisted to a store) and retry logic (failed jobs are re-attempted with configurable backoff). The job must eventually succeed or be explicitly marked as dead.
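The retry half of this is simple bookkeeping. As a rough illustration, here is a Python sketch of a failure handler with exponential backoff and jitter; the field names and the specific backoff policy are assumptions for the sketch, not ntnt's actual schedule:

```python
import random

def next_attempt_delay(attempt, base=2.0, cap=300.0):
    """Exponential backoff with jitter: 2s, 4s, 8s, ... capped at 5 minutes.

    `attempt` is 1-based. Jitter spreads retries out so that jobs which
    failed together do not all hammer the provider at the same instant.
    """
    delay = min(cap, base * (2 ** (attempt - 1)))
    return delay + random.uniform(0, delay * 0.1)

def on_failure(job):
    """Retry while attempts remain; otherwise mark the job dead."""
    job["attempts"] += 1
    if job["attempts"] >= job["max_attempts"]:
        job["status"] = "dead"      # terminal: needs explicit intervention
    else:
        job["status"] = "retrying"
        job["run_at"] = job["failed_at"] + next_attempt_delay(job["attempts"])
    return job
```

The key property is that a failure never silently drops the job: it either gets a future run_at or an explicit dead status.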
Rate-limited work meters requests to stay within API constraints. You have 10,000 items to process and the provider allows 100 requests per minute. Without throttling, most of those requests fail with 429s, retry, and create a thundering herd that makes things worse.
Resource-constrained processing limits how many instances of a heavy job type run simultaneously. A video transcoding job that uses 4GB of RAM shouldn't have 20 instances running at once, regardless of how many workers exist.
Every production web application ends up needing some combination of these. The question is how much ceremony is involved in getting there.
How Jobs Work in Other Ecosystems
In most languages, background jobs are a library concern. Here's a Sidekiq job in Ruby:
class SendWelcomeEmail
  include Sidekiq::Job
  sidekiq_options queue: 'emails', retry: 3

  def perform(user_id)
    user = User.find(user_id)
    UserMailer.welcome(user).deliver_now
  end
end

SendWelcomeEmail.perform_async(user.id)
This is clean. Sidekiq is well-designed and Ruby's class system makes it feel natural. In Oban (Elixir), the shape is similar:
defmodule MyApp.Workers.SendWelcomeEmail do
  use Oban.Worker, queue: :emails, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"user_id" => user_id}}) do
    user = Repo.get!(User, user_id)
    Mailer.deliver(WelcomeEmail.new(user))
    :ok
  end
end

%{user_id: user.id}
|> MyApp.Workers.SendWelcomeEmail.new()
|> Oban.insert()
And in BullMQ (Node):
const queue = new Queue('emails');

const worker = new Worker('emails', async job => {
  const user = await User.findById(job.data.userId);
  await sendWelcomeEmail(user);
}, { concurrency: 5 });

await queue.add('send-welcome', { userId: user.id });
These all work and they're all production-proven. But they have something in common: the job system is foreign to the language. The framework provides abstractions that the developer wires together. The parser, type checker, and CLI tools have no idea that background jobs exist. Job definition, registration, and invocation are three separate concerns that happen to be colocated in the same file.
What Changes When the Parser Knows About Jobs
In ntnt, a job is a language construct:
job SendWelcomeEmail on emails (retry: 3) {
  perform(user_id: String) {
    let user = find_user(user_id)
    send_email(user["email"], "Welcome!", welcome_body(user))
  }
}
The parser understands job as a keyword. Registration happens at parse time, so there's no separate step, no mixin, no service container wiring. The queue name, retry policy, and perform body are all part of the syntax. Enqueuing is a function call:
enqueue("SendWelcomeEmail", map { "user_id": user_id })
The syntactic difference is small. The consequences are not.
Because the parser knows about jobs, the rest of the toolchain does too. ntnt lint can verify that arguments passed to enqueue match the perform signature. The documentation system can enumerate every job in a project. The test framework can reason about job behavior. When you run ntnt jobs status, the CLI knows which job types exist because the parser told it. None of these things require a plugin, an adapter, or a separate index. The information is already in the parse tree.
The worker model benefits in a different way. When an ntnt worker starts, it evaluates the entire application source file, including all imports, functions, constants, and job definitions. A perform block has access to everything the rest of the application has access to. There is no separate "job environment" to configure or "worker context" to set up.
import { fetch } from "std/http"

let API_BASE = "https://api.example.com"

fn build_headers(token) {
  return map { "Authorization": "Bearer " + token }
}

fn notify_slack(message) {
  fetch("/slack/post", map {
    "method": "POST",
    "headers": build_headers(env("SLACK_TOKEN")),
    "body": to_json(map { "text": message })
  })
}
job ProcessOrder on orders (retry: 3, timeout: 120) {
  perform(order_id: String) {
    let order = fetch("/orders/")
    // ... process order ...
    notify_slack("Order processed")
  }
}
Everything defined in that file (fetch, API_BASE, build_headers, notify_slack) is available inside the perform block without re-importing or re-declaring anything. The worker loaded the full application, so the full application is available.
Choosing the Backing Store
Most job systems are tied to a specific database. Sidekiq requires Redis. Oban requires PostgreSQL. BullMQ requires Redis. This is a reasonable design choice when the library needs to guarantee specific transactional semantics, but it means you sometimes choose your job system based on which database you already run rather than which system best fits your needs.
ntnt's job system is built on std/kv, the language's key-value module. std/kv supports Redis, Valkey, DragonflyDB, and SQLite through the same API:
// Redis in production
configure_queue(map { "store": "redis://localhost:6379" })
// SQLite for development or single-node deploys
configure_queue(map { "store": "sqlite://./jobs.db" })
// In-memory with no configuration at all
// Jobs work out of the box with zero setup
The same application code runs against any backend. A developer can prototype with the in-memory store (no setup, no external dependencies), test against SQLite (durable but local), and deploy with Redis (distributed). Switching backends is changing one configuration string.
This design means the job system doesn't impose infrastructure requirements. If an application already runs Redis for caching, the job system can share that instance. If it's a small application on a single server where running Redis feels like overkill, SQLite works. The job system adapts to what's available rather than requiring something new.
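The "same code, any backend" property comes from keeping the queue logic behind a small store interface. Here is a Python sketch of the idea with two interchangeable stores; the class names and the put/get API are invented for illustration and do not mirror std/kv's actual surface:

```python
import sqlite3

class MemoryStore:
    """Zero-setup in-memory backend (analogous to the out-of-the-box default)."""
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data.get(key)

class SqliteStore:
    """Durable single-file backend exposing the same interface."""
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    def put(self, key, value):
        self.db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))
    def get(self, key):
        row = self.db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

def enqueue(store, job_id, payload):
    # Identical application code regardless of which backend `store` is.
    store.put("job:" + job_id, payload)
```

The enqueue function never knows which backend it is talking to, which is what makes switching a one-line configuration change.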
How Concurrency Works Under the Hood
ntnt's interpreter is single-threaded by design. It uses Rc<RefCell<...>> internally, and because Rc is not thread-safe (it isn't Send), shared mutable state between threads is architecturally impossible. This is guaranteed not by discipline or convention but by the type system of the implementation language, Rust. Two tasks physically cannot share memory.
Concurrency uses a thread-per-task model with serialized value passing:
let task = spawn(fn() { heavy_computation() })
let result = await_task(task)
Each spawned task gets a fresh interpreter instance with captured bindings injected. Cross-task communication goes through channels backed by crossbeam-channel:
let [tx, rx] = channel()
spawn(fn() { send(tx, compute_something()) })
let result = recv(rx)
This gives you true parallelism on OS threads, zero shared mutable state by construction, and panic isolation via catch_unwind per task. There is no async/await and no function coloring. A function that does I/O has the same signature as a function that doesn't.
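The spawn/channel pair maps onto familiar primitives. Here is a rough Python analogue, with the caveat that Python threads do share memory, so this mimics only the message-passing discipline, not ntnt's isolation guarantee:

```python
import threading
import queue

def spawn(fn, *args):
    """Run `fn` on a fresh OS thread, communicating only via channels."""
    t = threading.Thread(target=fn, args=args)
    t.start()
    return t

def channel():
    q = queue.Queue()
    return q, q  # (tx, rx): the same queue viewed as both ends

tx, rx = channel()
task = spawn(lambda: tx.put(sum(range(1000))))
result = rx.get()   # blocks until the spawned task sends
task.join()
```

The calling code never touches the spawned task's state directly; the only value that crosses the boundary goes through the channel.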
Job workers build directly on these primitives. Each worker is a thread running its own interpreter instance. When a worker claims a job, it evaluates the perform block in an isolated child scope. If the job panics, the worker catches it, marks the job as failed, and continues to the next one. A single bad job cannot take down the worker process.
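The claim-execute-isolate loop can be sketched in a few lines of Python; the job shape and the sentinel-based shutdown are assumptions for the sketch, and a caught exception stands in for ntnt's per-task catch_unwind:

```python
import queue

def worker_loop(jobs, results):
    """Claim jobs one at a time; a raising job is marked failed and the
    loop continues, so one bad job cannot take the worker down."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut down cleanly
            return
        try:
            value = job["fn"]()
            results.append((job["id"], "completed", value))
        except Exception as e:   # analogue of catch_unwind per task
            results.append((job["id"], "failed", str(e)))

jobs = queue.Queue()
jobs.put({"id": "a", "fn": lambda: 42})
jobs.put({"id": "b", "fn": lambda: 1 / 0})   # this one crashes
jobs.put({"id": "c", "fn": lambda: "still running"})
jobs.put(None)

results = []
worker_loop(jobs, results)
```

Job "b" fails, but "c" still runs: the failure is recorded, not propagated.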
Priorities and Worker Bands
A password reset email should process before a weekly analytics digest. ntnt handles this with named priorities:
job ResetPassword on auth (priority: "critical") {
perform(user_id: String) { ... }
}
job WeeklyDigest on notifications (priority: "low") {
perform(user_id: String) { ... }
}
Four named levels are available: critical, high, normal (the default), and low. These map to numeric values in a 0 to 99 range, and raw numbers are available when the named tiers don't fit.
Each priority band gets its own thread pool. Critical jobs don't compete with low-priority jobs for worker threads. If the low-priority queue backs up with thousands of items, critical jobs continue processing at full speed. Workers can be scaled per band at runtime without restarting the application:
$ ntnt workers scale critical 4
$ ntnt workers status
Band      Workers  Pending  Active
critical  4        0        2
high      2        15       2
normal    2        847      2
low       1        3201     1
The control socket (.ntnt.sock) makes this possible without a redeploy. The running application adjusts in place.
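The band model reduces to one queue plus one thread pool per priority. A minimal Python sketch with invented names, showing that a backlog in one band cannot starve another because the pools never share threads:

```python
import threading
import queue

class Band:
    """One priority band: its own queue and its own pool of worker threads."""
    def __init__(self, name, workers):
        self.name = name
        self.jobs = queue.Queue()
        self.done = []
        self.threads = []
        self.scale(workers)

    def _run(self):
        while True:
            job = self.jobs.get()
            if job is None:      # sentinel: this thread exits
                return
            self.done.append(job())

    def scale(self, n):
        # Scaling up just starts more threads against the band's shared queue.
        while len(self.threads) < n:
            t = threading.Thread(target=self._run, daemon=True)
            t.start()
            self.threads.append(t)

bands = {"critical": Band("critical", 2), "low": Band("low", 1)}

# A 100-job backlog on `low` cannot delay `critical`: separate pools.
for _ in range(100):
    bands["low"].jobs.put(lambda: "low")
bands["critical"].jobs.put(lambda: "critical")

# Drain: one sentinel per thread, then wait for everything to finish.
for band in bands.values():
    for _ in band.threads:
        band.jobs.put(None)
for band in bands.values():
    for t in band.threads:
        t.join()
```

Runtime scaling in this model is just starting more threads on an existing queue, which is why ntnt can do it without a restart.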
Rate Limiting and Concurrency Limits
Both are job-level options, declared in the same syntax as retry and timeout:
job SendEmail on emails (rate: "100/minute", retry: 3) {
perform(to: String, body: String) {
email_provider.send(to, body)
}
}
job TranscodeVideo on media (concurrency: 3, timeout: 300) {
perform(video_id: String) {
transcode(video_id)
}
}
Rate limiting uses a sliding window counter backed by the same KV store as the job queue. When a job exceeds the rate limit, it gets re-enqueued (not dropped) and the worker sleeps until the next window opens. Concurrency limiting uses an atomic counter semaphore, where each running instance holds a slot that expires automatically if the worker crashes.
The two compose naturally. A job with rate: "100/minute", concurrency: 5 will run at most 5 instances simultaneously and at most 100 per minute. Both checks happen independently before execution begins.
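A sliding-window check of the kind described can be sketched in Python. This in-process version stands in for the KV-backed counter and uses invented names; a refusal corresponds to the re-enqueue-and-sleep path above:

```python
class SlidingWindowLimiter:
    """Approximate sliding window: count events in the trailing window."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.stamps = []

    def try_acquire(self, now):
        # Drop timestamps that have slid out of the window.
        self.stamps = [t for t in self.stamps if now - t < self.window]
        if len(self.stamps) >= self.limit:
            return False          # caller re-enqueues and sleeps
        self.stamps.append(now)
        return True

limiter = SlidingWindowLimiter(limit=3, window_seconds=60)
```

Composing a concurrency cap on top is an independent pre-execution check (a counting semaphore), which is why the two limits do not interfere with each other.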
Deduplication and Expiration
Duplicate jobs are a real problem in production. A user clicks "send" twice. A webhook fires three times. A retry loop enqueues the same work repeatedly. ntnt handles deduplication at the job definition level:
job ProcessPayment on payments (unique: 3600, retry: 3) {
perform(payment_id: String) {
charge(payment_id)
}
}
unique: 3600 means that for the next 3600 seconds, enqueuing a job with the same arguments is silently deduplicated. The dedup key is a SHA-256 hash of the job type and arguments, stored atomically via SET NX (Redis) or INSERT OR IGNORE (SQLite). There is no race window. Two concurrent enqueue calls for the same arguments produce exactly one job.
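The dedup mechanics are easy to sketch against SQLite. The helper names below are invented, but the SHA-256-over-type-plus-arguments key and the INSERT OR IGNORE atomicity follow the description above:

```python
import hashlib
import json
import sqlite3

def dedup_key(job_type, args):
    """SHA-256 over the job type plus canonicalized arguments."""
    payload = job_type + ":" + json.dumps(args, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE dedup (key TEXT PRIMARY KEY, expires_at REAL)")

def enqueue_unique(job_type, args, unique_seconds, now):
    key = dedup_key(job_type, args)
    # INSERT OR IGNORE is atomic: of two concurrent enqueues with the
    # same key, exactly one inserts a row. (Expiry sweep omitted here.)
    cur = db.execute(
        "INSERT OR IGNORE INTO dedup VALUES (?, ?)",
        (key, now + unique_seconds),
    )
    return cur.rowcount == 1   # True only for the call that actually inserted
```

Canonicalizing the arguments (sort_keys=True) matters: two maps with the same entries in different order must hash to the same key.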
Expiration handles the opposite problem: jobs that were valid when enqueued but are no longer relevant.
job NotifyFlashSale on notifications (expires: 3600) {
perform(user_id: String, sale_id: String) {
notify(user_id, "Flash sale ending soon!")
}
}
If the job sits in the queue longer than an hour, the worker skips it and marks it expired. No wasted compute, no stale notifications.
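The worker-side check amounts to comparing queue age against the expires budget at claim time. A Python sketch with an assumed job shape:

```python
def claim(job, now):
    """Worker-side expiry check: skip jobs that outlived their relevance."""
    expires = job.get("expires")
    if expires is not None and now - job["enqueued_at"] > expires:
        job["status"] = "expired"
        return None          # skipped: no compute spent on stale work
    job["status"] = "active"
    return job
```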
Composition
Higher-level concurrency patterns build on the primitives. parallel runs N functions concurrently and returns all results in input order. If any task fails, the rest are cancelled:
import { parallel, race } from "std/concurrent"
let [users, orders, inventory] = parallel([
fn() { fetch(users_url) },
fn() { fetch(orders_url) },
fn() { fetch(inventory_url) }
])
race returns the first successful result and cancels the others. Tasks that fail or return errors are skipped:
let data = race([
fn() { fetch(primary_api) },
fn() { fetch(fallback_api) }
])
Both are useful inside perform blocks for fan-out work like fetching data from multiple services or trying a primary provider with a fallback.
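For readers outside ntnt, both combinators have close analogues on top of a thread pool. A Python sketch; note that Python cannot actually cancel a running thread, so this version surfaces or skips failures rather than cancelling peers:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def parallel(fns):
    """Run all functions concurrently; results come back in input order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn) for fn in fns]
        return [f.result() for f in futures]   # raises if any task failed

def race(fns):
    """Return the first successful result; failed tasks are skipped."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn) for fn in fns]
        for f in as_completed(futures):
            if f.exception() is None:
                return f.result()
        raise RuntimeError("all tasks failed")

def flaky():
    raise ValueError("primary down")
```

parallel preserves input order by collecting futures in submission order, while race consumes them in completion order.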
Chaining Without a Workflow DSL
Many job systems provide a chain or workflow abstraction for multi-step processes. ntnt doesn't. Job chaining is just code:
job ProcessOrder on orders {
perform(order_id: String) {
let order = validate_order(order_id)
enqueue("ChargePayment", map {
"order_id": order_id,
"amount": order["amount"]
})
}
}
job ChargePayment on payments {
perform(order_id: String, amount: Float) {
let charge = process_charge(amount)
enqueue("SendConfirmation", map {
"order_id": order_id,
"charge_id": charge["id"]
})
}
}
When a job completes, it enqueues the next one. The dependency is explicit, visible in the code, and debuggable with standard tools. If you need to know what happens after ProcessOrder, you read the perform block. There is no hidden dependency graph.
Workflow DSLs help when you have genuinely complex dependency graphs with fan-out and fan-in. For the common case of a linear sequence of steps, a function call is clearer than a framework concept.
Observability
ntnt ships a CLI for job management:
$ ntnt jobs status
Queue    Pending  Active  Completed  Failed  Dead
emails   42       3       1,847      12      2
orders   0        1       523        0       0
media    156      3       89         3       1
$ ntnt jobs inspect job-abc-123
ID:        job-abc-123
Type:      SendEmail
Queue:     emails
Status:    retrying
Attempts:  2/3
Error:     Connection timeout
$ ntnt jobs retry job-abc-123
✓ Job job-abc-123 re-queued (attempts reset)
$ ntnt jobs list --status=dead --queue=emails
ID           Type       Failed At            Error
job-def-456  SendEmail  2026-03-29T07:30:00  SMTP connection refused
job-ghi-789  SendEmail  2026-03-29T07:31:00  Rate limit exceeded
Jobs emit structured JSON to stderr as they move through their lifecycle: enqueued, started, completed, failed, dead, rate-limited. These events are machine-readable and work with any log aggregator.
Stuck job recovery is automatic. Each active job has a heartbeat key with a TTL. If a worker crashes, the heartbeat expires and the job is re-queued. No manual intervention, no admin dashboard, no cron job sweeping for orphans.
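The heartbeat scheme is a TTL per active job plus a periodic sweep. A Python sketch with invented names; in the real system the TTL lives in the KV store rather than in process memory:

```python
class Heartbeats:
    """Active jobs hold a TTL'd heartbeat; a sweep re-queues any job whose
    heartbeat has expired, since the worker holding it is presumed dead."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.beats = {}              # job_id -> last heartbeat timestamp

    def beat(self, job_id, now):
        self.beats[job_id] = now     # worker refreshes this while running

    def sweep(self, now):
        orphaned = [j for j, t in self.beats.items() if now - t > self.ttl]
        for job_id in orphaned:
            del self.beats[job_id]   # re-enqueue would happen here
        return orphaned

hb = Heartbeats(ttl=30)
hb.beat("job-1", now=0)
hb.beat("job-2", now=0)
hb.beat("job-2", now=25)             # job-2's worker is still alive
```

With a store-backed TTL (Redis key expiry, for example), the sweep degenerates to noticing that the heartbeat key is gone.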
Queues can be paused and resumed at runtime, which is useful when a downstream service is having an outage. Pausing is durable (persisted to KV, survives restarts), and workers skip paused queues entirely rather than polling them.
Testing
ntnt provides a testing mode that captures enqueued jobs instead of executing them:
configure_queue(map { "mode": "testing" })
handle_signup(map { "email": "[email protected]" })
assert_enqueued("SendWelcomeEmail", map { "email": "[email protected]" })
assert_not_enqueued("SendAdminAlert")
// Or execute them synchronously for integration testing
drain_jobs()
clear_jobs()
Testing mode is the default when running ntnt intent check, ntnt's verification tool. Jobs can be tested the same way as HTTP routes, with the same tool and the same workflow.
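The capture-then-assert pattern is straightforward to model. A Python sketch of a testing-mode queue, with invented names standing in for ntnt's actual helpers:

```python
class TestQueue:
    """Testing mode: capture enqueues instead of executing them."""
    def __init__(self):
        self.captured = []
        self.handlers = {}           # job_type -> callable, used by drain

    def enqueue(self, job_type, args):
        self.captured.append((job_type, args))

    def assert_enqueued(self, job_type, args=None):
        for t, a in self.captured:
            if t == job_type and (args is None or a == args):
                return
        raise AssertionError(job_type + " was not enqueued")

    def drain(self):
        # Execute captured jobs synchronously for integration tests.
        while self.captured:
            job_type, args = self.captured.pop(0)
            self.handlers[job_type](args)

q = TestQueue()

def handle_signup(params):
    # Application code under test: enqueues instead of sending inline.
    q.enqueue("SendWelcomeEmail", {"email": params["email"]})

handle_signup({"email": "[email protected]"})
```

The application code is unchanged between modes; only the queue's behavior on enqueue differs.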
A Few Design Choices Worth Explaining
Free functions, not methods. You write enqueue("SendEmail", args), not SendEmail.enqueue(args). This is how the rest of ntnt works: len(s) not s.len(), trim(s) not s.trim(). The job system follows the same convention.
String-based enqueue. The job name is a string rather than a type reference. This allows enqueuing from contexts where the job definition isn't in scope (a migration script, a REPL session, a different service) and avoids circular import issues. The registry validates the name at enqueue time.
No async/await. ntnt is synchronous. A function that queries a database looks exactly like a function that adds two numbers. Concurrency comes from spawn and channels rather than from an async runtime that permeates the call stack.
Cooperative cancellation. Cancelled jobs aren't killed mid-execution. A cancellation flag is set, and the job checks it at yield points: I/O boundaries, loop iterations, sleep calls. This is simpler and safer than preemptive cancellation because the job can clean up resources before exiting.
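In code, cooperative cancellation is just a flag checked at yield points. A Python sketch using threading.Event as the flag, with each loop iteration as the yield point; the function and chunk names are invented:

```python
import threading

def transcode(cancel, chunks):
    """Check the cancellation flag at each iteration (a yield point)
    and clean up before exiting instead of being killed mid-chunk."""
    log = []
    for chunk in chunks:
        if cancel.is_set():
            log.append("cleanup")    # job gets to release its resources
            break
        log.append("processed " + str(chunk))
    return log

cancel = threading.Event()
cancel.set()                          # cancellation requested before work
```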
Thread-per-task. Each worker is an OS thread with its own interpreter instance. This gives true parallelism, straightforward debugging (each thread has a clear identity), and panic isolation. It won't scale to tens of thousands of concurrent lightweight tasks the way a green thread runtime would, but background jobs typically have dozens to hundreds of workers, not thousands. For that range, OS threads are the simpler and more predictable choice.
What's Coming Next
The v0.4.6 job system covers the core use cases: enqueue, retry, schedule, prioritize, rate-limit, deduplicate, and observe. Several features are in active development.
Batches will let you group N jobs and receive a callback when all complete, when all reach a terminal state, or when the first one dies. This is the most common missing piece for bulk import workflows and fan-out processing.
Event dispatch is a pub/sub layer built on top of the job system. subscribe("user.signed_up", "SendWelcomeEmail") decouples the event emitter from its consumers. Publishing an event enqueues all subscribed jobs, with durability, retry, and observability inherited from the job infrastructure underneath.
Job contracts extend ntnt's existing design-by-contract system. requires validates arguments before execution and ensures validates results after. Contract violations fail the job with a clear error rather than allowing it to produce silently wrong output.
Simulation mode enables dry-run execution where side-effect blocks (email sends, API calls, database writes) are skipped but everything else runs normally. This is useful for testing job logic against production-shaped data without triggering real side effects.
Where This Fits
ntnt's job system is designed for web applications that need reliable background processing without provisioning separate infrastructure. If your application sends emails, processes uploads, calls external APIs, or does any work that doesn't belong in the request cycle, the job system handles it with minimal configuration.
It's well-suited for situations where simplicity matters: small teams, single-server deployments, projects where adding a Redis instance just for job processing feels disproportionate. The in-memory and SQLite backends mean you can have durable, retryable background jobs with no external dependencies.
For high-scale systems processing millions of jobs per hour across distributed clusters, dedicated infrastructure like Sidekiq with Redis Sentinel or Oban with PostgreSQL has a longer track record and more operational tooling. ntnt is a young language, and the job system is new. The architecture is sound and we're running it in production, but the honest answer is that we haven't been through a decade of edge cases yet.
The source is on GitHub. The Learn page covers getting started and the stdlib reference documents the full jobs API. If you build something with it, or find rough edges, we'd like to hear about it.