SQLite with Litestream Replication for Production Applications

Litestream replication has changed how developers think about SQLite as a production database. For years, the advice was clear: use PostgreSQL or MySQL for production, SQLite for development. But with Litestream providing continuous streaming replication to S3, and modern single-server deployments handling millions of requests, SQLite has become a legitimate production database choice for a growing category of applications. The shift is less about SQLite suddenly becoming faster and more about the operational gap closing: the one feature client-server databases offered that SQLite could not match, durable off-box backups with point-in-time recovery, is exactly what Litestream supplies.

This guide covers the practical aspects of running SQLite in production: setting up Litestream for continuous backup and disaster recovery, configuring WAL mode for concurrent reads, handling write contention, and understanding when SQLite is the right choice versus when you should stick with a client-server database. Throughout, the emphasis is on the operational details that turn a working prototype into something you can safely be paged about at 3am.

Why SQLite for Production

SQLite eliminates an entire category of infrastructure complexity. There is no database server to manage, no connection pooling to configure, no network latency between your application and database, and no separate backup system to maintain. Your database is a single file that sits next to your application, and reads are served in microseconds instead of milliseconds. Because every query is an in-process function call rather than a network round trip, the tail latency that plagues chatty ORMs largely disappears — there is no socket to saturate and no pool to exhaust under load.

Modern hardware makes this approach viable for far more use cases than you might expect. A single server with NVMe storage can handle on the order of 100,000 read queries per second and 10,000 write queries per second with SQLite in WAL mode, though the exact figures depend on query shape, transaction size, and hardware. That is sufficient for the vast majority of web applications. Equally important, the absence of a network hop removes a whole class of failure modes: connection storms, DNS hiccups, and pool starvation simply cannot occur when the database lives inside your process.

Figure: SQLite Litestream replication architecture, showing continuous streaming replication from SQLite to S3 via Litestream.

How WAL Mode Changes the Concurrency Model

Understanding why SQLite works in production starts with the write-ahead log. In the older rollback-journal mode, a writer takes an exclusive lock over the whole database, so readers and the writer block each other. WAL mode flips this: writes append to a separate -wal file, and readers continue reading the last committed version of the database concurrently. As a result, a long-running read no longer stalls writes, and a write no longer freezes every reader. The single remaining constraint is that there is still only one writer at a time — readers scale freely, writers serialize.

The WAL is periodically “checkpointed” back into the main database file. By default SQLite checkpoints automatically once the WAL grows past about 1000 pages, but under heavy read load a checkpoint can be deferred because it needs a moment when no reader is pinning an old snapshot. This is worth knowing because a WAL file that grows without bound is almost always a symptom of a long-lived read transaction holding back the checkpoint. Litestream interacts with this mechanism deliberately: it manages its own checkpointing so it can capture every WAL frame before it is overwritten, which is why you should let Litestream, not your application, own checkpoint timing in production.
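The snapshot behavior described above can be observed directly. The following is a minimal sketch using Python's standard sqlite3 module against a throwaway temporary file; the table name t and the file name demo.db are placeholders, not part of any real schema.

```python
import os
import sqlite3
import tempfile

# Throwaway database file; WAL mode needs a real file, not :memory:
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the connection in autocommit mode,
# so transactions start only where BEGIN is written explicitly
writer = sqlite3.connect(path, isolation_level=None)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE t (v INTEGER)")
writer.execute("INSERT INTO t VALUES (1)")

reader = sqlite3.connect(path, isolation_level=None)
reader.execute("BEGIN")
# The first read inside the transaction pins the reader's snapshot
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# The writer commits while the read transaction is still open
writer.execute("INSERT INTO t VALUES (2)")

# Same transaction, same snapshot: the new row is not visible yet
during = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
reader.execute("COMMIT")

# A fresh read outside the transaction sees the committed row
after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print(mode, before, during, after)
reader.close()
writer.close()
```

The writer's second INSERT succeeds even though a read transaction is open, and the reader keeps seeing one row until it ends its transaction. That pinned snapshot is also what prevents a checkpoint from completing, which is why long-lived read transactions make the WAL grow.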

Setting Up Litestream

Litestream runs as a sidecar process that monitors your SQLite database and continuously streams WAL changes to an S3-compatible storage backend. Setup is straightforward:

# Install Litestream
curl -fsSL https://github.com/benbjohnson/litestream/releases/download/v0.3.13/litestream-v0.3.13-linux-amd64.tar.gz | \
  tar xz -C /usr/local/bin/

# Verify installation
litestream version

Create a Litestream configuration file:

# /etc/litestream.yml
dbs:
  - path: /data/app.db
    replicas:
      - type: s3
        bucket: myapp-db-backups
        path: production/app.db
        region: us-east-1
        retention: 168h          # Keep 7 days of WAL segments
        sync-interval: 1s        # Replicate every second
        snapshot-interval: 4h    # Full snapshot every 4 hours

      # Optional: second replica for geographic redundancy.
      # Replica names default to the type, so give the second s3 replica its own name.
      - type: s3
        name: s3-eu
        bucket: myapp-db-backups-eu
        path: production/app.db
        region: eu-west-1
        retention: 72h
        sync-interval: 10s

# Start Litestream replication
litestream replicate -config /etc/litestream.yml

# Or run your application with Litestream wrapping it
litestream replicate -config /etc/litestream.yml -exec "node server.js"

The -exec flag is the recommended approach. Litestream starts replication, then launches your application as a child process. When the application exits, Litestream shuts down as well and attempts a final sync of outstanding WAL changes to S3 before exiting. The two tuning knobs that matter most are sync-interval and snapshot-interval. The sync interval is effectively your recovery point objective: a one-second interval means you can lose at most about a second of writes if the machine vanishes. A more frequent snapshot interval shortens restore time but increases storage and S3 request costs, so most teams settle on a snapshot every few hours with WAL syncs every second or so in between.
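To make the trade-off concrete, here is a rough back-of-the-envelope estimate in Python. It assumes that snapshots are kept for the whole retention window and that each snapshot is about the size of the database; compression and Litestream's actual retention enforcement will shift the numbers. The function name estimate_replica and the 2 GB database size are made up for illustration.

```python
def estimate_replica(retention_h: float, snapshot_interval_h: float,
                     sync_interval_s: float, db_size_gb: float) -> dict:
    """Rough planning numbers for one replica; not Litestream's own accounting."""
    snapshots_kept = int(retention_h // snapshot_interval_h)
    return {
        # How many snapshots fall inside the retention window
        "snapshots_kept": snapshots_kept,
        # Upper bound on snapshot storage, ignoring compression
        "snapshot_storage_gb": snapshots_kept * db_size_gb,
        # Writes newer than the last sync are at risk if the machine is lost
        "worst_case_loss_s": sync_interval_s,
        # A restore replays at most this much WAL on top of a snapshot
        "max_wal_replay_h": snapshot_interval_h,
    }

# Values from the first replica in the configuration above
primary = estimate_replica(retention_h=168, snapshot_interval_h=4,
                           sync_interval_s=1, db_size_gb=2.0)
print(primary)
```

Halving the snapshot interval halves the WAL that a restore has to replay but roughly doubles the snapshot storage, which is the cost side of the knob.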

Configuring SQLite for Production

SQLite’s default settings are conservative. For production workloads, you need WAL mode, appropriate cache sizes, and busy timeout configuration. Apply these PRAGMAs at connection startup:

# Python with sqlite3
import sqlite3

def get_connection():
    conn = sqlite3.connect('/data/app.db')

    # WAL mode: allows concurrent reads during writes
    conn.execute('PRAGMA journal_mode=WAL')

    # Synchronous NORMAL: good balance of safety and speed
    # Litestream provides the durability guarantee
    conn.execute('PRAGMA synchronous=NORMAL')

    # 64MB cache size (default is only 2MB)
    conn.execute('PRAGMA cache_size=-65536')

    # Enable foreign keys (off by default in SQLite!)
    conn.execute('PRAGMA foreign_keys=ON')

    # Busy timeout: wait up to 5 seconds for write lock
    conn.execute('PRAGMA busy_timeout=5000')

    # Memory-mapped I/O for faster reads (256MB)
    conn.execute('PRAGMA mmap_size=268435456')

    return conn

// Node.js with better-sqlite3 (synchronous, fastest option)
const Database = require('better-sqlite3');

const db = new Database('/data/app.db', {
  // Create the file if it is missing (this is the default)
  fileMustExist: false,
});

db.pragma('journal_mode = WAL');
db.pragma('synchronous = NORMAL');
db.pragma('cache_size = -65536');
db.pragma('foreign_keys = ON');
db.pragma('busy_timeout = 5000');
db.pragma('mmap_size = 268435456');

// Prepared statements for performance
const getUser = db.prepare('SELECT * FROM users WHERE id = ?');
const createUser = db.prepare(
  'INSERT INTO users (name, email) VALUES (@name, @email)'
);

// Transactions for batch writes
const insertMany = db.transaction((users) => {
  for (const user of users) {
    createUser.run(user);
  }
});

The interaction between synchronous=NORMAL and Litestream deserves a note. In WAL mode, NORMAL still fsyncs the WAL before each checkpoint but not on every commit, which is what makes writes fast. The risk is that a power loss or OS-level crash could lose the last few committed transactions from the local file, although the database itself stays consistent. Litestream narrows that window rather than closing it: frames it has already shipped to S3 survive the loss of the machine, while anything committed since the last sync (up to sync-interval) can still be lost. In other words, durability comes mostly from the replica rather than from forcing an fsync on the hot path. This is the key trade that lets SQLite stay fast without giving up recoverability.
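Because a mistyped or unsupported PRAGMA is silently ignored, it can be worth reading the settings back once at startup. The sketch below does that with Python's sqlite3 module against a temporary file; the helper names open_configured and read_back are illustrative, and the expected values simply mirror the settings used above.

```python
import os
import sqlite3
import tempfile

EXPECTED = {
    "journal_mode": "wal",
    "synchronous": 1,        # 1 is the numeric value of NORMAL
    "foreign_keys": 1,
    "busy_timeout": 5000,
    "cache_size": -65536,    # negative means KiB, so 64 MiB
}

def open_configured(path: str) -> sqlite3.Connection:
    """Open a connection and apply the production PRAGMAs."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA cache_size=-65536")
    conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn

def read_back(conn: sqlite3.Connection) -> dict:
    """Query each pragma so a silently ignored setting shows up."""
    return {name: conn.execute(f"PRAGMA {name}").fetchone()[0]
            for name in EXPECTED}

demo_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = open_configured(demo_path)
actual = read_back(conn)
mismatches = {k: (v, actual[k]) for k, v in EXPECTED.items() if actual[k] != v}
print(mismatches or "all pragmas applied")
conn.close()
```

Note that journal_mode=WAL is stored in the database file and persists, while the other settings apply only to the connection that set them, so every new connection has to set them again.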

Disaster Recovery

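Restoring from Litestream replicas is fast, typically under 30 seconds for databases of modest size. For multi-gigabyte databases the time depends on snapshot size, the number of WAL segments to replay, and bandwidth to the bucket, so measure it for your own data. The restore process downloads the latest snapshot and replays WAL segments to bring the database to the most recent state.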

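# Restore from S3 replica (restore refuses to overwrite an existing file,
# so move a damaged /data/app.db aside first or pass -o to write elsewhere)
litestream restore -config /etc/litestream.yml /data/app.db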

# Restore to a specific point in time
litestream restore -config /etc/litestream.yml \
  -timestamp "2026-03-26T10:30:00Z" \
  /data/app.db

# Restore from a specific replica
litestream restore -replica s3 -config /etc/litestream.yml /data/app.db

# Verify restored database
sqlite3 /data/app.db "PRAGMA integrity_check;"

A backup you have never restored is a hypothesis, not a backup. The point-in-time restore shown above is also your best defense against logical corruption: if a bad migration or an accidental DELETE ships at 10:35, you can restore the database as of 10:30 and lose only the writes made after that point rather than everything. Because restores are cheap and fast, a healthy practice in production teams is to periodically restore into a scratch path and run PRAGMA integrity_check against it, turning recovery from an untested assumption into a routine that is known to work.
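A restore drill along those lines might look like the following Python sketch. It assumes the litestream binary is on PATH and reuses the config and database paths from the earlier examples; the function names and the scratch file name are hypothetical. The -o flag writes the restored copy to a separate path so the live file is never touched.

```python
import os
import sqlite3
import subprocess
import tempfile

def restore_to_scratch(config: str, db_path: str, scratch_path: str) -> None:
    """Restore the replica of db_path into scratch_path, leaving the live file alone."""
    subprocess.run(
        ["litestream", "restore", "-config", config, "-o", scratch_path, db_path],
        check=True,  # raise if the restore fails
    )

def integrity_ok(path: str) -> bool:
    """Return True when PRAGMA integrity_check reports a single 'ok' row."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return rows == [("ok",)]

def run_drill(config: str = "/etc/litestream.yml",
              db_path: str = "/data/app.db") -> bool:
    """Restore into a fresh temporary directory and verify the result."""
    scratch = os.path.join(tempfile.mkdtemp(), "restore-drill.db")
    restore_to_scratch(config, db_path, scratch)
    return integrity_ok(scratch)
```

Running run_drill() from a scheduled job and alerting when it returns False or raises is one way to keep the restore path exercised; recording how long it takes also gives you a measured restore time instead of an estimate.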

Handling Write Contention

SQLite allows only one writer at a time. In WAL mode, readers are not blocked by writers, but concurrent writes must be serialized. For most web applications, this is not a bottleneck, because a small write transaction typically completes in well under a millisecond. However, if you have high write throughput, use a write queue:

// Write serialization with a queue
class WriteQueue {
  constructor(db) {
    this.db = db;
    this.queue = [];
    this.processing = false;
  }

  async write(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      this.processNext();
    });
  }

  async processNext() {
    if (this.processing || this.queue.length === 0) return;
    this.processing = true;

    const { fn, resolve, reject } = this.queue.shift();
    try {
      // Await the job so an async fn finishes before the next one starts
      const result = await fn(this.db);
      resolve(result);
    } catch (err) {
      reject(err);
    } finally {
      this.processing = false;
      this.processNext();
    }
  }
}

// Usage (inside an async function; userId and total come from the caller)
const writeQueue = new WriteQueue(db);
await writeQueue.write((db) => {
  return db.prepare('INSERT INTO orders (user_id, total) VALUES (?, ?)')
    .run(userId, total);
});

Beyond a queue, the single highest-leverage technique for write contention is batching. Wrapping many inserts in one transaction turns dozens of separate commits, each with its own lock acquisition and WAL append, into a single one, which is why the insertMany transaction shown earlier can be many times faster than looping individual run calls. The most common production symptom of contention is a SQLITE_BUSY error, and the usual cause is not raw throughput but a transaction that took the write lock and then stayed open. Long-lived read transactions cause a different problem in WAL mode: they do not block writers, but they do hold back checkpoints. Set a generous busy_timeout so transient lock waits retry automatically, and audit your code for transactions that stay open across slow operations such as network calls.
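The same batching idea in Python, as a minimal sketch: the orders table matches the earlier usage example, while the file path and the helper name insert_batched are placeholders. Opening the connection with isolation_level="IMMEDIATE" is an optional choice made here so that implicit transactions take the write lock up front, where the busy timeout can wait for it, instead of failing later when a read transaction tries to upgrade.

```python
import os
import sqlite3
import tempfile

def insert_batched(conn: sqlite3.Connection, rows) -> None:
    """Insert all rows in one transaction: one commit instead of len(rows)."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO orders (user_id, total) VALUES (?, ?)", rows
        )

path = os.path.join(tempfile.mkdtemp(), "orders.db")
# timeout is the busy timeout in seconds; IMMEDIATE makes implicit
# transactions start with BEGIN IMMEDIATE
conn = sqlite3.connect(path, timeout=5.0, isolation_level="IMMEDIATE")
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("CREATE TABLE orders (user_id INTEGER, total REAL)")

insert_batched(conn, [(i, 9.99) for i in range(1000)])
count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)
conn.close()
```

The with block also keeps the transaction short by construction: there is no place inside it for a network call to sneak in while the write lock is held.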

Docker Deployment with Litestream

FROM node:20-slim

# Install Litestream
ADD https://github.com/benbjohnson/litestream/releases/download/v0.3.13/litestream-v0.3.13-linux-amd64.tar.gz /tmp/
RUN tar -xzf /tmp/litestream-v0.3.13-linux-amd64.tar.gz -C /usr/local/bin/

WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .

COPY litestream.yml /etc/litestream.yml

# Litestream starts replication, then runs the app as a child process.
# It does not restore a missing database; run litestream restore first.
CMD ["litestream", "replicate", "-config", "/etc/litestream.yml", "-exec", "node server.js"]

One subtlety with containers is the ephemeral filesystem. If /data/app.db lives only inside the container and you redeploy, the local file vanishes — which is fine, provided your entrypoint runs litestream restore first when the file is missing. A robust startup script restores from S3 if no local database exists, then begins replication; combined with a mounted volume for the live file, this gives you both fast local reads and a durable off-box copy. This restore-then-replicate pattern is what makes SQLite genuinely portable across the immutable-infrastructure deployments common in 2026.
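The restore-then-replicate startup can be sketched as a small entrypoint. It is written in Python here to match the earlier examples, which assumes the image has a Python interpreter; the same logic ports to a few lines of shell. The constants and function names are illustrative. The -if-replica-exists flag is used so that a first-ever deploy with an empty bucket does not fail.

```python
import os
import subprocess

CONFIG = "/etc/litestream.yml"
DB_PATH = "/data/app.db"

def restore_command(db_path: str = DB_PATH, config: str = CONFIG):
    """Return the restore command, or None when a local database already exists."""
    if os.path.exists(db_path):
        return None
    # -if-replica-exists: exit cleanly when there is nothing to restore yet
    return ["litestream", "restore", "-if-replica-exists",
            "-config", config, db_path]

def replicate_command(app_cmd: str = "node server.js", config: str = CONFIG):
    """Return the command that starts replication and runs the app under it."""
    return ["litestream", "replicate", "-config", config, "-exec", app_cmd]

def main() -> None:
    cmd = restore_command()
    if cmd is not None:
        subprocess.run(cmd, check=True)  # abort startup if the restore fails
    # Replace this process so Litestream receives container signals directly
    argv = replicate_command()
    os.execvp(argv[0], argv)
```

The container's CMD would then invoke main() instead of calling litestream replicate directly. Failing hard when the restore errors is deliberate: starting with an empty database and replicating it over a good backup is the outcome to avoid.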

When NOT to Use SQLite in Production

SQLite is fundamentally a single-node database. If your application requires horizontal write scaling across multiple servers, you need PostgreSQL, CockroachDB, or a similar distributed database. Additionally, if multiple services need to access the same database concurrently over the network, SQLite is not appropriate — it requires file-system access.

High write-volume applications (more than 10,000 writes per second sustained) will hit SQLite’s single-writer bottleneck. Real-time analytics workloads that need concurrent heavy reads and writes also perform better with PostgreSQL or ClickHouse. Finally, if your compliance requirements mandate a database with built-in audit logging, row-level security, or role-based access control, SQLite lacks these enterprise features. There is also a more practical constraint worth naming: because the database is bound to one machine’s disk, the SQLite model fits naturally with a single primary process. The moment your architecture demands several application servers writing to shared state, you have outgrown what a single-file embedded database can offer, and forcing it leads to the very network-database complexity you were trying to avoid.

Key Takeaways

Litestream provides continuous streaming backup of SQLite to S3, making SQLite viable for production with an RPO of roughly the sync interval (about one second as configured here)
WAL mode enables concurrent reads during writes — most web apps never hit SQLite’s write throughput ceiling
Configure PRAGMAs at startup: WAL mode, NORMAL synchronous, large cache, busy timeout, and mmap for optimal performance
Batch writes in transactions and use a generous busy_timeout to tame write contention before reaching for a queue
Disaster recovery with Litestream restore is fast and supports point-in-time recovery, but measure restore time for your own database and test it regularly
SQLite eliminates database infrastructure entirely — no server, no connection pooling, no network latency
