Redis: From Single Instance to Distributed Scale

What Is Redis?

Redis (REmote DIctionary Server) is an in-memory data structure store. At its core, it holds all data in RAM and processes every command on a single thread — which makes it fundamentally different from what most people think of as a "database."

Redis vs Traditional Databases

A traditional database (MySQL, PostgreSQL, MongoDB) is designed around durability first: data goes to disk, indexes are built for query flexibility, and the system is optimized for complex reads across large datasets. The cost is latency — every operation involves disk I/O, query planning, locking, and transaction coordination.

Redis inverts this priority. Speed first, durability second.

| | Traditional Database | Redis |
|---|---|---|
| Where data lives | Disk (with memory caching) | Memory (with optional disk persistence) |
| Access time | Milliseconds (disk seek + query plan) | Microseconds (direct memory access) |
| Data model | Tables/documents with rich query languages (SQL, MQL) | Key-value with typed data structures (strings, lists, sets, hashes, sorted sets, streams) |
| Concurrency model | Multi-threaded with locks, MVCC, or optimistic concurrency | Single-threaded — commands execute sequentially, no locking needed |
| Durability | Writes are durable by default (WAL, fsync) | Writes are in-memory by default; persistence is opt-in and lossy |
| Query capability | Joins, aggregations, subqueries, indexes on any column | Get by key, operate on a data structure (no joins, no ad-hoc queries) |
| Dataset size | Limited by disk (terabytes+) | Limited by RAM (typically gigabytes) |
| Failure cost | Data survives crashes (designed for it) | Data can be lost on crash (depends on persistence config) |

Why Not Just Use a Database for Everything?

Because some operations don't need what a database provides — and pay a heavy price for it.

Consider a session store. You write a session on login, read it on every request, and delete it on logout. You don't need joins, transactions, or complex queries. You don't need the data to survive a server reboot (the user just logs in again). What you need is speed — sub-millisecond reads on every single HTTP request.

A database handles this at ~1-5ms per read (network + query plan + disk/cache). Redis handles it at ~0.1ms. At 10,000 requests per second, that's 10-50 seconds of cumulative latency per second of traffic versus 1 second.

The same logic applies to:

  • Caching: store computed results to avoid re-querying the database
  • Rate limiting: count requests per user per time window
  • Leaderboards: sorted sets give you rank operations in O(log n)
  • Queues: lists with blocking pop give you a job queue with no polling
  • Pub/Sub: real-time message fanout without a message broker
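The rate-limiting pattern above maps onto two Redis commands, INCR and EXPIRE: increment a per-user counter keyed by the current time window, and let the key expire on its own. A minimal fixed-window sketch in Python, using a tiny in-memory stand-in for those two commands so it runs without a server (`FakeRedis`, `allow_request`, and the key format are illustrative, not any library's API):

```python
import time

class FakeRedis:
    """Tiny in-memory stand-in for Redis INCR/EXPIRE (illustration only)."""
    def __init__(self):
        self.data = {}    # key -> integer counter
        self.expiry = {}  # key -> unix timestamp after which the key is gone

    def _purge(self, key):
        if key in self.expiry and time.time() >= self.expiry[key]:
            self.data.pop(key, None)
            self.expiry.pop(key, None)

    def incr(self, key):
        # In real Redis, INCR is atomic because commands run single-threaded.
        self._purge(key)
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def expire(self, key, seconds):
        self.expiry[key] = time.time() + seconds

def allow_request(r, user_id, limit=100, window=60, now=None):
    """Fixed-window limiter: at most `limit` requests per `window` seconds."""
    now = time.time() if now is None else now
    key = f"rate:{user_id}:{int(now // window)}"  # one key per time window
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)  # the window key cleans itself up
    return count <= limit

r = FakeRedis()
results = [allow_request(r, "alice", limit=3, now=1000.0) for _ in range(5)]
# first 3 requests allowed, the next 2 rejected
```

Against real Redis the same logic is two round trips (or one, wrapped in a pipeline or Lua script), and the atomicity of INCR means concurrent app servers can share one counter safely.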

When Redis Is the Wrong Choice

Redis is not a replacement for your database. It's a complement.

Redis Is Not a Database Replacement

  • Your source of truth should be a database. Redis can lose data on crash (even with persistence, the default appendfsync everysec has a ~1 second data loss window).
  • If your data doesn't fit in RAM, Redis is not the right tool. A 500GB dataset needs 500GB of RAM — expensive and impractical for most use cases.
  • If you need complex queries, Redis can't help. There is no SELECT * FROM users WHERE age > 30 AND city = 'Tokyo'. Redis retrieves by key, not by query.
  • If you need strong durability guarantees (financial transactions, legal records), Redis's persistence model is too weak. Use a database with WAL and synchronous replication.
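The durability window mentioned above is controlled by two directives in redis.conf. A minimal fragment for reference (values illustrative, not a recommendation):

```
appendonly yes          # enable the AOF (append-only file)
appendfsync everysec    # fsync once per second: up to ~1s of writes lost on crash
# appendfsync always    # fsync every write: durable, but far slower
# appendfsync no        # let the OS decide when to flush: fastest, largest loss window
```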

The Mental Model

Think of Redis as your application's working memory — fast, limited in size, and optimized for data you need right now. Your database is long-term storage — slower, vast, and designed to never lose anything. The two work together:

graph LR
    App["Application"] -->|"fast path<br/><i>~0.1ms</i>"| Redis["Redis<br/><i>working memory</i>"]
    App -->|"slow path<br/><i>~1-5ms</i>"| DB["Database<br/><i>long-term storage</i>"]
    Redis -.->|"cache miss"| DB
    DB -.->|"populate cache"| Redis
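The cache-aside loop in the diagram can be sketched in a few lines. Plain dicts stand in for Redis and the database here so the example runs without either; `get_user` and the TTL value are illustrative:

```python
import time

cache = {}                                 # stand-in for Redis: key -> (value, expires_at)
database = {"user:1": {"name": "Alice"}}   # stand-in for the source of truth

def get_user(key, ttl=300, now=None):
    """Cache-aside read: try the fast path, fall back to the DB, populate cache."""
    now = time.time() if now is None else now
    hit = cache.get(key)
    if hit is not None and hit[1] > now:
        return hit[0]                      # fast path (~0.1ms against real Redis)
    value = database.get(key)              # slow path (~1-5ms against a real DB)
    if value is not None:
        cache[key] = (value, now + ttl)    # populate cache for subsequent reads
    return value

first = get_user("user:1", now=0.0)        # miss: reads the DB, fills the cache
second = get_user("user:1", now=1.0)       # hit: served from the cache
```

The TTL is what keeps the two stores loosely consistent: a stale cached value can live at most `ttl` seconds before the next read falls through to the database again.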

The rest of this guide dives into how Redis achieves this speed, what happens when you outgrow a single instance, and the hard distributed systems problems that emerge at scale.

Why Not Just Use In-Memory Data Structures?

Every language has hashmaps, dictionaries, and in-process caches. A Python dict or a Node.js Map lives in the same process, requires zero network calls, and is faster than Redis for raw access time (~50ns vs ~100μs). So why add an external system?

Because in-process memory has fundamental limitations that appear the moment your application grows beyond a single process.

graph TD
    subgraph "In-Process Cache"
        P1["Process 1<br/><i>cache: {user:1 → Alice}</i>"]
        P2["Process 2<br/><i>cache: {user:1 → ???}</i>"]
        P3["Process 3<br/><i>cache: {user:1 → ???}</i>"]
    end

    subgraph "Redis (Shared)"
        R["Redis<br/><i>{user:1 → Alice}</i>"]
        P4["Process 1"] --> R
        P5["Process 2"] --> R
        P6["Process 3"] --> R
    end
| Concern | In-Process (Map, dict) | Redis |
|---|---|---|
| Shared across processes | No — each process has its own copy. Update in one, others are stale. | Yes — single source of truth for all processes, servers, and services. |
| Survives process restart | No — process dies, cache is gone. Cold start on every deploy. | Yes — data persists across restarts (with AOF/RDB). Deploys don't flush your cache. |
| Shared across servers | No — Server A's cache is invisible to Server B. | Yes — any server can read/write the same keys. |
| Memory limit | Bounded by the process's heap. Competes with your application for RAM. Large caches cause GC pressure (Java, Node.js, Go). | Dedicated memory. Doesn't affect application GC. Can be on a separate machine with more RAM. |
| Eviction policies | You build your own (or use a library). | Built-in LRU, LFU, TTL-based eviction with tunable sampling. |
| Atomic operations | Thread-safety is your problem (mutexes, locks, concurrent maps). | Single-threaded — every command is atomic. No race conditions by design. |
| Data structures | Basic (maps, lists, sets). Sorted sets, HyperLogLog, streams — you'd build these yourself. | Native sorted sets, streams, bitmaps, HyperLogLog, pub/sub, geospatial indexes. |
| Expiration | Manual — you track timestamps and purge stale entries. | Built-in TTL per key, with lazy + periodic expiration. |
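The "atomic operations" row is the one that bites first. In-process, a plain counter increment is read-modify-write, so correctness under concurrency is your problem; against Redis, a single INCR does the same job with no client-side lock, because the server executes commands one at a time. A small sketch of the in-process side (illustrative only):

```python
import threading

# In-process, you own thread-safety: without the lock, `counter += 1`
# is read-modify-write and can lose updates under contention.
counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:          # Redis needs no such lock: INCR is atomic by design
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter == 40_000 only because every increment went through the lock
```

And the lock only protects one process. The moment a second server runs the same code, the shared state has to live somewhere both can reach, which is exactly the gap Redis fills.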

When In-Process Cache Is the Right Choice

In-process caching is not wrong — it's a different tool for a different problem:

  • Single-process application that will never scale horizontally — a Map is simpler and faster
  • Immutable reference data (country codes, config, feature flags) that rarely changes — load once, use forever
  • Hot-path microsecond optimization where even Redis's ~100μs RTT is too slow — use an in-process L1 cache in front of Redis

BullMQ: Why a HashMap Can't Replace Redis

Consider implementing a job queue with in-process data structures. You'd need:

  • A list for the queue (easy)
  • Atomic pop-and-push across processes (impossible without shared state)
  • Blocking wait for new jobs without polling (requires OS-level primitives)
  • Job locking across multiple workers on different machines (requires distributed state)
  • Persistence so jobs survive restarts (requires your own serialization/storage)

BullMQ uses Redis because every one of these requirements demands shared, persistent, atomic state that no in-process data structure can provide. The BRPOPLPUSH command alone — atomically pop from one list and push to another, blocking until data is available — has no equivalent in a Map. (Since Redis 6.2, BLMOVE supersedes it with the same semantics.)
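The pop-and-push guarantee can be sketched within a single process to show why the atomicity matters. The `ReliableQueue` class below is a toy stand-in under a condition variable, not BullMQ's implementation; Redis is what extends this guarantee across processes and machines:

```python
import threading
from collections import deque

class ReliableQueue:
    """Sketch of BRPOPLPUSH semantics inside one process: atomically move a
    job from `waiting` to `active`, blocking until one is available."""
    def __init__(self):
        self.waiting = deque()
        self.active = deque()
        self.cond = threading.Condition()

    def push(self, job):
        with self.cond:
            self.waiting.appendleft(job)
            self.cond.notify()             # wake one blocked worker, no polling

    def brpoplpush(self, timeout=None):
        with self.cond:
            while not self.waiting:
                if not self.cond.wait(timeout=timeout):
                    return None            # timed out with no job
            job = self.waiting.pop()       # pop and push happen under one lock:
            self.active.appendleft(job)    # no worker can observe the gap,
            return job                     # so a crash never loses the job silently

q = ReliableQueue()
q.push("job-1")
job = q.brpoplpush(timeout=1)              # returns "job-1", now on `active`
```

The `active` list is what makes the queue reliable: a job is never in limbo between "queued" and "claimed," and a reaper can re-queue anything left on `active` by a dead worker.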

The Hybrid Approach: L1 + Redis

In high-throughput systems, the best architecture often combines both:

graph LR
    App["Application"] -->|"~50ns"| L1["L1: In-Process Cache<br/><i>small, hot data</i>"]
    L1 -->|"miss → ~100μs"| Redis["L2: Redis<br/><i>shared, larger</i>"]
    Redis -->|"miss → ~1-5ms"| DB["L3: Database<br/><i>source of truth</i>"]

The in-process L1 cache holds a small set of frequently accessed keys (hundreds, not millions). Redis serves as the shared L2. The database is the durable source of truth. Each tier is 10-100x slower than the one above, but holds progressively more data.

The challenge with L1 is invalidation: when a key changes in Redis, all processes with a stale L1 copy need to know. Redis 6.0+ provides server-assisted client-side caching with invalidation messages — Redis tracks which clients cached which keys and pushes invalidation notices when those keys change.


The Journey

This guide follows the lifecycle of a growing application. At each stage, we teach the Redis internals that matter, the failure modes that appear, and — critically — why you shouldn't jump ahead. Over-engineering your Redis setup is just as dangerous as under-engineering it.

graph LR
    A["<b>Single Instance</b><br/>Event loop, data structures,<br/>persistence, memory model"] --> B["<b>Growing Pains</b><br/>Replication, background jobs,<br/>Sentinel, cache stampede"]
    B --> C["<b>Large Scale</b><br/>Cluster, CAP trade-offs,<br/>race conditions, Redlock"]

    style A fill:#4CAF50,color:#fff,stroke:#388E3C
    style B fill:#FF9800,color:#fff,stroke:#F57C00
    style C fill:#F44336,color:#fff,stroke:#D32F2F

How to Read This Guide

Each stage explains three things:

  1. What's appropriate at this scale
  2. What's overkill (and why adding it hurts more than it helps)
  3. The internals behind why things work — or break

BullMQ is used as a running example throughout, showing which Redis primitives power real-world job queue infrastructure.

  • Single Instance


    A single Redis instance is surprisingly powerful. Learn the event loop, internal data structures, memory model, and persistence trade-offs that make it work.


  • Growing Pains


    Your app is scaling. Replication, background jobs with BullMQ, Sentinel for HA — and the failure modes that come with each.


  • Large Scale


    Distributed Redis. Cluster architecture, the CAP theorem, race conditions, and why hardware clocks break distributed locks.
