Stage 1: Single Instance

A single Redis instance can handle 100,000+ operations per second on commodity hardware. For most small-to-medium applications — caching, session storage, simple queues, rate limiting — this is more than enough. Before adding replicas, Sentinel, or Cluster, understand why one instance is so fast and what it guarantees.

The Anti-Over-Engineering Rule

If your app has one or two servers, your Redis fits in memory, and you can tolerate losing up to one second of data on crash — stay here. Everything in the next two stages adds operational complexity that you don't need yet.


The Event Loop

Redis processes every command on a single thread. This sounds like a bottleneck, but it's actually a performance advantage.

How It Works

Redis implements its own event library called ae, which wraps the fastest available OS-level I/O multiplexer: epoll on Linux, kqueue on macOS/BSD, or select as a fallback. The core loop is minimal:

  1. Register all client sockets with the multiplexer
  2. Wait for any socket to have data ready (non-blocking)
  3. Process ready connections one at a time, to completion
  4. Repeat
graph TD
    A[Wait for ready sockets<br/><i>epoll/kqueue</i>] --> B{Any socket<br/>has data?}
    B -->|Yes| C[Read request<br/><i>network I/O</i>]
    C --> D[Parse command<br/><i>CPU</i>]
    D --> E[Execute against<br/>data structures<br/><i>CPU + memory</i>]
    E --> F[Write response<br/><i>network I/O</i>]
    F --> A
    B -->|No| G[Process timer events<br/><i>expiry, background tasks</i>]
    G --> A

Each command runs to completion before the next one starts. There is no interleaving, no preemption, no concurrent access to data structures.
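The loop can be sketched with Python's selectors module, which wraps the same OS multiplexers (epoll/kqueue) that ae does. Everything here is a toy stand-in, not Redis code: the socket pair plays the client, and the PING handling plays command execution.

```python
import selectors
import socket

# Toy version of the ae loop: register sockets, wait for readiness,
# serve each ready connection to completion, repeat.
sel = selectors.DefaultSelector()   # epoll on Linux, kqueue on macOS/BSD

def serve(conn: socket.socket) -> None:
    request = conn.recv(4096)        # read request (network I/O)
    if request.startswith(b"PING"):  # parse + execute (CPU)
        conn.sendall(b"+PONG\r\n")   # write response (network I/O)

def event_loop(iterations: int) -> None:
    for _ in range(iterations):
        for key, _events in sel.select(timeout=1):  # wait for ready sockets
            serve(key.fileobj)       # one connection at a time, to completion

# Demo: an in-process socket pair stands in for a real client connection
server_end, client_end = socket.socketpair()
sel.register(server_end, selectors.EVENT_READ)      # step 1: register
client_end.sendall(b"PING\r\n")
event_loop(1)
resp = client_end.recv(64)           # b"+PONG\r\n"
```

The important property is inside event_loop: each ready connection is served to completion before the next is touched, which is exactly why no locking is ever needed.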

Why Single-Threaded Is Fast

| Factor | Multi-threaded cost | How Redis avoids it |
| --- | --- | --- |
| Context switching | Thousands of CPU cycles per switch (kernel transition, register save/restore, cache invalidation) | Zero — one thread, no switching |
| Cache locality | Cores fight over cache lines ("ping-ponging"), adding hundreds of cycles per operation | Hot data stays in L1/L2 cache (~1ns access vs ~100ns for main memory) |
| Synchronization | Mutexes, spin locks, CAS loops — all add latency and complexity | No locks anywhere; impossible to deadlock |
| Correctness | Race conditions, data corruption, hard-to-reproduce bugs | Every operation is inherently atomic |

The key insight: since Redis operations are microsecond-scale (data is in memory, not on disk), the bottleneck is network I/O, not CPU. A single core can process commands far faster than the network can deliver them.

Redis 6.0+: Threaded I/O

Redis 6.0 introduced multi-threaded I/O — but with a critical distinction:

| What | Threaded? |
| --- | --- |
| Reading requests from sockets | Yes (I/O threads) |
| Writing responses to sockets | Yes (I/O threads) |
| Parsing commands | No (main thread) |
| Executing commands | No (main thread) |
| Data structure manipulation | No (main thread) |

The I/O threads do not run concurrently with command execution. The main thread dispatches read/write work to I/O threads, waits for them to finish, then processes commands. This preserves the single-threaded execution guarantee while parallelizing the socket reads/writes that were the actual bottleneck.

Impact

Threaded I/O is disabled by default. When enabled with 8 I/O threads, benchmarks show 37–112% throughput improvement depending on workload. The gains come from parallelizing network serialization/deserialization, not from executing commands faster.


Data Structures Under the Hood

When you use a Redis Hash, List, or Sorted Set, you're not directly using a hashtable, linked list, or tree. Redis chooses an internal encoding based on the data's size and shape, silently switching encodings as data grows.

SDS (Simple Dynamic Strings)

Redis never uses raw C strings. Every string value is an SDS (Simple Dynamic String):

[Header: len | alloc | flags] [string data...] [\0]

| Property | C string | SDS |
| --- | --- | --- |
| Get length | O(n) — scan for \0 | O(1) — read the len field |
| Binary safe | No — \0 terminates | Yes — length-tracked, can contain \0 |
| Buffer overflow protection | None | Checked against alloc |
| Resize cost | Always a full realloc | Pre-allocated space reduces reallocs |

SDS uses multiple header types (sdshdr8, sdshdr16, sdshdr32, sdshdr64) — the smallest header that can represent the string's length. A 200-byte string uses a 3-byte header; a 1GB string uses a 9-byte header.

Growth strategy: strings under 1MB double their allocation; strings over 1MB grow by 1MB at a time. This amortizes reallocation cost.
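The growth rule can be written down directly. This is a sketch of the logic in sds.c; the function name sds_new_alloc is invented here, but the 1MB threshold is the real SDS_MAX_PREALLOC constant.

```python
SDS_MAX_PREALLOC = 1024 * 1024  # 1MB threshold used by sds.c

def sds_new_alloc(needed: int) -> int:
    """Allocation size chosen when an SDS must grow to hold `needed`
    bytes: double below 1MB, add a flat 1MB above it."""
    if needed < SDS_MAX_PREALLOC:
        return needed * 2
    return needed + SDS_MAX_PREALLOC

# A 100-byte string grows to a 200-byte allocation...
print(sds_new_alloc(100))                                  # 200
# ...but a 10MB string grows by just 1MB, not to 20MB.
print(sds_new_alloc(10 * 1024 * 1024) // (1024 * 1024))    # 11
```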

Listpack (Replacing Ziplist)

For small hashes, lists, and sorted sets, Redis uses a listpack — a single contiguous block of memory with all entries stored sequentially.

graph LR
    subgraph "Listpack (one contiguous allocation)"
        A["Total<br/>bytes"] --> B["Entry 1<br/><i>encoding + data + backlen</i>"]
        B --> C["Entry 2<br/><i>encoding + data + backlen</i>"]
        C --> D["Entry 3<br/><i>encoding + data + backlen</i>"]
        D --> E["EOF<br/>marker"]
    end

Why contiguous memory matters: a hashtable with 10 entries requires ~10 separate allocations (node structs, key/value pointers). A listpack stores all 10 entries in one allocation. This means:

  • ~2 bytes overhead per entry vs ~21 bytes for a hashtable node
  • Sequential memory access (cache-friendly) vs pointer chasing (cache-hostile)
  • Zero pointer overhead

The ziplist problem: Redis 7 replaced ziplists with listpacks because of a design flaw. Ziplists stored each entry's previous entry's length. If you inserted a large entry, the next entry's "previous length" field might need to grow from 1 byte to 5 bytes, which could cascade through the entire structure — an O(n) chain reaction from a single insert. Listpacks store each entry's own length instead, eliminating cascading updates entirely.
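The cascade is easy to demonstrate with a simplified model: assume every entry starts with a 1-byte prevlen field, which must grow to 5 bytes whenever the previous entry's total size reaches 254. The function cascade_updates and the entry sizes are illustrative, not ziplist.c code, though 254 is the real cutoff.

```python
def cascade_updates(payloads: list[int], inserted_payload: int) -> int:
    """Simplified ziplist: each entry stores the previous entry's total
    length in 1 byte if it is < 254, else 5 bytes. All entries start
    with 1-byte prevlen fields; count how many must grow after a new
    entry is inserted at the head."""
    grown = 0
    prev_total = 1 + inserted_payload        # the new head entry's size
    for payload in payloads:
        need = 1 if prev_total < 254 else 5  # bytes needed for prevlen
        prev_total = need + payload          # this entry's new total size
        if need == 5:
            grown += 1                       # field grew: entry got bigger
        else:
            break                            # prevlen still fits: cascade stops
    return grown

# A 300-byte insert in front of a run of 250-byte entries: every entry's
# new total (5 + 250 = 255) stays >= 254, so the update cascades through all.
print(cascade_updates([250] * 8, 300))   # 8
# Small entries absorb the change after a single update.
print(cascade_updates([10, 10, 10], 300))   # 1
```

A listpack entry records its own length instead, so inserting at the head never touches any neighbor: the equivalent cascade count is always zero.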

Skiplist

Used for sorted sets when they outgrow the listpack threshold. A skiplist is a probabilistic data structure — a linked list with multiple "express lanes":

graph LR
    subgraph "Level 3"
        L3_1["1"] --> L3_9["9"]
    end
    subgraph "Level 2"
        L2_1["1"] --> L2_5["5"] --> L2_9["9"]
    end
    subgraph "Level 1"
        L1_1["1"] --> L1_3["3"] --> L1_5["5"] --> L1_7["7"] --> L1_9["9"]
    end

| Property | Skiplist | Balanced tree (e.g., AVL, red-black) |
| --- | --- | --- |
| Search | O(log n) | O(log n) |
| Insert/delete | O(log n), simpler | O(log n), complex rotations |
| Range queries | Very efficient (follow forward pointers) | Requires in-order traversal |
| Memory per node | ~1.33 pointers average (p=1/4) | 2 pointers (left/right) |
| Implementation | ~100 lines | ~300+ lines |

Redis chose skiplists over balanced trees because they're simpler to implement, naturally efficient for range queries (ZRANGEBYSCORE), and have comparable performance with slightly less memory.
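A minimal score-only skiplist shows the "express lane" mechanics. Redis's zskiplist in t_zset.c also stores members, span counts, and backward pointers; this sketch keeps only what a ZRANGEBYSCORE-style query needs.

```python
import random

P = 0.25        # level promotion probability, as in Redis
MAX_LEVEL = 8   # plenty for a toy example

class Node:
    def __init__(self, score: float, level: int):
        self.score = score
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    def __init__(self):
        self.head = Node(float("-inf"), MAX_LEVEL)
        self.level = 1

    @staticmethod
    def random_level() -> int:
        level = 1
        while random.random() < P and level < MAX_LEVEL:
            level += 1                   # each extra level is 4x rarer
        return level

    def insert(self, score: float) -> None:
        update = [self.head] * MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):   # top lane downwards
            while node.forward[i] and node.forward[i].score < score:
                node = node.forward[i]
            update[i] = node             # rightmost node before `score`
        level = self.random_level()
        self.level = max(self.level, level)
        new = Node(score, level)
        for i in range(level):           # splice into each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def range_by_score(self, lo: float, hi: float) -> list:
        """Locate the first match via the express lanes, then walk
        level-0 forward pointers."""
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].score < lo:
                node = node.forward[i]
        node = node.forward[0]
        out = []
        while node and node.score <= hi:
            out.append(node.score)
            node = node.forward[0]
        return out

sl = SkipList()
for s in [5, 1, 9, 3, 7]:
    sl.insert(s)
print(sl.range_by_score(3, 8))   # [3, 5, 7]
```

Note how the range query does the log-time search once, then streams results by following forward pointers, which is exactly why skiplists suit ZRANGEBYSCORE.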

Dict (Hashtable) with Incremental Rehashing

The core key-value store uses a dict — a hash table with a critical design choice: incremental rehashing.

A naive hash table doubles in size when the load factor gets too high. This means allocating a new table and moving all entries — a sudden, blocking operation that causes a latency spike proportional to the table size.

Redis avoids this by maintaining two hash tables simultaneously:

graph TD
    subgraph "During Rehashing"
        HT0["ht[0] — old table<br/><i>gradually emptying</i>"]
        HT1["ht[1] — new table (2x size)<br/><i>gradually filling</i>"]
    end

    OP["Every dict operation<br/>(lookup, insert, delete)"] --> M["Migrate 1 bucket<br/>from ht[0] → ht[1]"]
    M --> HT0
    M --> HT1

  1. When a resize triggers, ht[1] is allocated at double the size
  2. A rehashidx counter starts at 0
  3. On every normal operation (GET, SET, DEL), Redis migrates one bucket from ht[0] to ht[1] (visiting up to 10 empty buckets to bound work per step)
  4. Lookups check both tables; inserts go only to ht[1]
  5. When migration completes, ht[1] becomes ht[0]

The result: rehashing is invisible to clients. There is no sudden latency spike, just a tiny constant overhead spread across millions of operations. The trade-off is temporarily using ~2x the hash table memory during the migration window.
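The two-table dance can be sketched in a few dozen lines. This is a toy model with chaining and unique keys; the real dict.c additionally bounds how many empty buckets one step may visit, as noted above.

```python
class IncrementalDict:
    """Toy Redis-style dict: during a resize, every operation migrates
    one bucket from ht[0] to ht[1]."""

    def __init__(self, size: int = 4):
        self.ht = [[[] for _ in range(size)], None]   # ht[0], ht[1]
        self.rehashidx = -1       # -1 means "not rehashing"
        self.count = 0

    def _rehash_step(self) -> None:
        if self.rehashidx == -1:
            return
        ht0, ht1 = self.ht
        while self.rehashidx < len(ht0) and not ht0[self.rehashidx]:
            self.rehashidx += 1   # skip empty buckets
        if self.rehashidx < len(ht0):
            for k, v in ht0[self.rehashidx]:          # migrate one bucket
                ht1[hash(k) % len(ht1)].append((k, v))
            ht0[self.rehashidx] = []
            self.rehashidx += 1
        if self.rehashidx >= len(ht0):                # migration complete:
            self.ht = [ht1, None]                     # ht[1] becomes ht[0]
            self.rehashidx = -1

    def set(self, key, value) -> None:
        self._rehash_step()
        if self.rehashidx == -1 and self.count >= len(self.ht[0]):
            self.ht[1] = [[] for _ in range(len(self.ht[0]) * 2)]
            self.rehashidx = 0    # resize triggered: allocate at 2x size
        table = self.ht[1] if self.rehashidx != -1 else self.ht[0]
        table[hash(key) % len(table)].append((key, value))  # inserts -> ht[1]
        self.count += 1

    def get(self, key):
        self._rehash_step()
        for table in self.ht:     # lookups check both tables
            if table is None:
                continue
            for k, v in table[hash(key) % len(table)]:
                if k == key:
                    return v
        return None

d = IncrementalDict()
for i in range(100):
    d.set(f"key:{i}", i)          # resizes happen invisibly along the way
```

The caller never sees a resize: no single set or get ever pays more than one extra bucket migration.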

Encoding Thresholds

Redis automatically switches encodings when data outgrows compact representations:

| Type | Compact encoding | Condition to stay compact | Full encoding |
| --- | --- | --- | --- |
| Hash | Listpack | ≤ 128 entries AND all values ≤ 64 bytes | Hashtable |
| Sorted Set | Listpack | ≤ 128 entries AND all values ≤ 64 bytes | Skiplist + hashtable |
| Set | Intset | All integers AND ≤ 512 entries | Hashtable |
| List | Quicklist (a linked list of listpacks) | Always | — |

These thresholds are configurable (hash-max-listpack-entries, hash-max-listpack-value, etc.). The defaults are tuned for a balance between memory efficiency and CPU cost of linear scans in compact encodings.
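As a sketch, the hash-encoding decision under the default thresholds looks like this. The function hash_encoding is a name invented here (the real check lives inside Redis); note the length limit applies to field names as well as values.

```python
# Defaults from redis.conf
HASH_MAX_LISTPACK_ENTRIES = 128
HASH_MAX_LISTPACK_VALUE = 64

def hash_encoding(fields: dict) -> str:
    """Which encoding a Redis hash would get under the default
    thresholds: listpack while small, hashtable once any limit breaks."""
    compact = (len(fields) <= HASH_MAX_LISTPACK_ENTRIES and
               all(len(s) <= HASH_MAX_LISTPACK_VALUE
                   for pair in fields.items() for s in pair))
    return "listpack" if compact else "hashtable"

print(hash_encoding({"name": "alice", "age": "30"}))       # listpack
print(hash_encoding({"blob": "x" * 100}))                  # hashtable
print(hash_encoding({f"f{i}": "v" for i in range(200)}))   # hashtable
```

To check a live key rather than predict, run OBJECT ENCODING against it in redis-cli.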

BullMQ: How Data Structures Map

BullMQ stores each job as a Redis Hash (bull:<queue>:<jobId>) with fields like data, opts, progress, returnvalue. For a queue with thousands of small jobs, each job hash likely uses the compact listpack encoding (few fields, small values).

The queue itself is a List (bull:<queue>:wait) — FIFO order via LPUSH/BRPOPLPUSH. Delayed jobs sit in a Sorted Set (bull:<queue>:delayed) scored by their target timestamp, leveraging the skiplist's efficient ZRANGEBYSCORE to find jobs ready to execute.


Memory Model

Jemalloc and Fragmentation

Redis uses jemalloc as its memory allocator. Jemalloc requests large chunks from the OS and subdivides them into size classes for smaller allocations — which is efficient, but creates a fundamental tension.

When Redis creates and deletes keys of varying sizes, the freed slots in jemalloc's arenas can't always be reused for differently-sized allocations. An arena can only be returned to the OS when all allocations within it are freed. If even one small allocation remains, the entire arena stays resident.

Fragmentation ratio:

\[ \text{fragmentation\_ratio} = \frac{\text{used\_memory\_rss}}{\text{used\_memory}} \]

| Ratio | Meaning |
| --- | --- |
| < 1.0 | Redis is swapping to disk — very bad, immediate action needed |
| 1.0 – 1.5 | Normal, healthy |
| 1.5 – 2.0 | Moderate fragmentation — monitor |
| > 2.0 | High fragmentation — consider active defrag |
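Mapped to code, the ratio and its bands read as follows. The helper fragmentation_status is invented here; the two inputs come straight from INFO memory.

```python
def fragmentation_status(used_memory_rss: int, used_memory: int) -> str:
    """Classify mem_fragmentation_ratio = rss / used into the bands."""
    ratio = used_memory_rss / used_memory
    if ratio < 1.0:
        return "swapping"           # RSS below logical usage: paged out
    if ratio <= 1.5:
        return "healthy"
    if ratio <= 2.0:
        return "monitor"
    return "consider active defrag"

print(fragmentation_status(1_200_000, 1_000_000))   # healthy
print(fragmentation_status(2_500_000, 1_000_000))   # consider active defrag
print(fragmentation_status(900_000, 1_000_000))     # swapping
```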

Active defragmentation (Redis 4.0+): Redis can proactively move allocations to consolidate fragmented memory. It scans key-value pairs and reallocates them to contiguous regions. This trades CPU for reduced memory waste — configurable via activedefrag yes with tunable thresholds for when to start and how much CPU to spend.

Eviction Policies

When Redis hits maxmemory, it must evict keys to make room. The choice of eviction policy has significant implications:

| Policy | Scope | Algorithm |
| --- | --- | --- |
| noeviction | — | Return errors on writes (safe but disruptive) |
| allkeys-lru | All keys | Approximated LRU |
| allkeys-lfu | All keys | Approximated LFU |
| allkeys-random | All keys | Random eviction |
| volatile-lru | Keys with TTL only | Approximated LRU |
| volatile-lfu | Keys with TTL only | Approximated LFU |
| volatile-ttl | Keys with TTL only | Shortest TTL first |
| volatile-random | Keys with TTL only | Random eviction |

Redis LRU Is Approximated, Not Exact

Redis does not maintain a true LRU linked list — that would require per-key pointer overhead (16 bytes per key on 64-bit systems). Instead:

  1. Each key stores a 24-bit LRU clock (last access timestamp)
  2. On eviction, Redis samples maxmemory-samples (default 5) random keys
  3. Sampled keys populate a 16-entry eviction pool sorted by idle time
  4. The key with the longest idle time in the pool is evicted

With 5 samples, the approximation is surprisingly close to true LRU. With 10 samples, it's nearly indistinguishable. This is a deliberate engineering trade-off: exact LRU would cost two extra pointers on every key, while the approximation costs only a fixed-size pool, with accuracy tunable via maxmemory-samples.
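The sampling procedure can be simulated in a few lines. This sketch skips the 16-entry eviction pool and uses plain integers in place of the 24-bit clock; pick_victim is an invented name.

```python
import random

def pick_victim(lru_clock: dict, samples: int = 5) -> str:
    """Approximated LRU: sample a few random keys and evict the one
    with the oldest access time."""
    candidates = random.sample(list(lru_clock), samples)
    return min(candidates, key=lru_clock.get)

random.seed(42)
clock = {f"key:{i}": i for i in range(1000)}   # lower = accessed longer ago
victims = [pick_victim(clock) for _ in range(200)]
avg_age = sum(clock[v] for v in victims) / len(victims)
# Taking the oldest of 5 uniform samples puts the expected victim at
# roughly 1/6 of the timestamp range, a strong skew toward cold keys.
```

Raising samples from 5 to 10 pushes the expected victim even closer to the true oldest key, which is the knob maxmemory-samples exposes.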

LFU (Least Frequently Used) — added in Redis 4.0, reuses the same 24 bits differently:

  • 16 bits: decay timestamp (when the counter was last decremented)
  • 8 bits: a Morris counter — a logarithmic probabilistic counter

The Morris counter's increment probability decreases as the count grows: p = 1 / (counter * lfu_log_factor + 1). This means frequently accessed keys quickly reach a high count and stay there, while the decay mechanism periodically reduces counts so old popular keys don't persist forever.
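The counter's behavior is easy to see in simulation. This is a simplified model of the increment rule quoted above (the real code also offsets the counter by an initial value before computing the probability); both function names are invented here.

```python
import random

LFU_LOG_FACTOR = 10   # redis.conf default lfu-log-factor

def lfu_increment(counter: int) -> int:
    """One access: bump the 8-bit counter with probability
    1 / (counter * lfu_log_factor + 1)."""
    if counter >= 255:
        return counter              # 8-bit counter saturates at 255
    p = 1.0 / (counter * LFU_LOG_FACTOR + 1)
    return counter + 1 if random.random() < p else counter

def accesses_to_reach(target: int) -> int:
    """Accesses needed to push a fresh counter up to `target`."""
    counter = hits = 0
    while counter < target:
        counter = lfu_increment(counter)
        hits += 1
    return hits

random.seed(7)
hits_needed = accesses_to_reach(5)   # on the order of 100 accesses, not 5:
                                     # each further increment is rarer
```

Because each increment at count c succeeds only about once per 10c + 1 accesses, hot keys climb quickly at first and then plateau, exactly the behavior an 8-bit frequency counter needs.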


Persistence: What Can You Lose?

Redis is an in-memory store. Without persistence, a crash means total data loss. Redis offers two persistence mechanisms, each with distinct trade-offs.

RDB (Snapshots)

Redis periodically saves the entire dataset to a binary file (.rdb):

  1. Redis calls fork() to create a child process
  2. Parent continues serving clients
  3. Child writes the entire dataset to a temp file
  4. Child atomically replaces the old .rdb file

Copy-on-Write (CoW): after fork(), parent and child share the same memory pages. Only pages modified by the parent (from new writes during the snapshot) are physically copied. The memory overhead is proportional to write volume during the snapshot, not dataset size.

fork() Is Redis's Achilles' Heel

fork() must duplicate the page table — a kernel data structure that maps virtual addresses to physical memory. With a 25GB dataset, this can take 100ms+, during which Redis is completely blocked.

Transparent Huge Pages (THP) make this dramatically worse. THP increases the CoW granularity from 4KB to 2MB pages. A single byte write during a snapshot now copies 2MB instead of 4KB. Redis's official documentation explicitly recommends disabling THP:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

AOF (Append-Only File)

Every write command is appended to a log file. Three fsync policies control the durability-performance trade-off:

| appendfsync | Data loss window | How it works |
| --- | --- | --- |
| always | None | fsync after every write — safest, slowest |
| everysec (default) | ~1 second | fsync once per second in a background thread |
| no | ~30 seconds (OS-dependent) | Let the OS flush when it wants |

AOF Rewrite: the AOF grows indefinitely because it logs every write. Periodically, Redis rewrites it to the minimal set of commands that recreate the current dataset. Redis 7.0+ uses Multi-Part AOF — a base file (in RDB format) plus incremental files plus a manifest — which eliminates the old double-write penalty and memory spike during rewrites.

AOF rewrite also uses fork()

The same CoW memory concerns from RDB apply here. The rewrite happens less frequently, but the fork() latency hit is the same.

What You Can Lose

| Persistence mode | Data loss on crash | Recovery speed | Best for |
| --- | --- | --- | --- |
| None | Everything | N/A (no recovery) | Pure cache |
| RDB only | Minutes (since last snapshot) | Fast (binary load) | Backups, disaster recovery |
| AOF only (everysec) | ~1 second | Slower (replay commands) | Durability-sensitive apps |
| RDB + AOF (recommended) | ~1 second (loads AOF) | Medium | Production — best of both |
| AOF (always) | Nothing (theoretically) | Slowest | Extremely rare — significant perf cost |

Talking to Redis: The Wire Protocol

Before understanding why pipelining and connection pooling matter, you need to understand how clients actually communicate with Redis.

RESP (REdis Serialization Protocol)

Every Redis client — whether it's redis-cli, ioredis, Jedis, or Lettuce — talks to the server over TCP using RESP, a simple text-based protocol designed to be human-readable and machine-parseable.

A client sends a command as a RESP Array of Bulk Strings:

*3\r\n        ← Array of 3 elements
$3\r\n        ← Bulk String, 3 bytes
SET\r\n       ← "SET"
$4\r\n        ← Bulk String, 4 bytes
name\r\n      ← "name"
$5\r\n        ← Bulk String, 5 bytes
alice\r\n     ← "alice"

The server responds with a type-prefixed reply:

| Prefix | Type | Example |
| --- | --- | --- |
| + | Simple String | +OK\r\n |
| - | Error | -ERR unknown command\r\n |
| : | Integer | :1\r\n |
| $ | Bulk String | $5\r\nalice\r\n |
| * | Array | *2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n |

The protocol is synchronous by default: send one command, wait for one reply. This is the bottleneck that pipelining solves.
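Both directions are simple enough to implement in a few lines: a sketch of the request encoding shown above, plus a decoder for the three simplest reply types. Both function names are invented here.

```python
def encode_command(*args: str) -> bytes:
    """Client side: a command is a RESP Array of Bulk Strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)

def decode_simple_reply(buf: bytes):
    """Decode the three simplest reply types by their prefix byte."""
    body = buf[1:].split(b"\r\n", 1)[0]
    if buf.startswith(b"+"):
        return body.decode()                  # Simple String
    if buf.startswith(b":"):
        return int(body)                      # Integer
    if buf.startswith(b"-"):
        raise RuntimeError(body.decode())     # Error reply
    raise ValueError("bulk strings and arrays are not handled here")

print(encode_command("SET", "name", "alice"))
# b'*3\r\n$3\r\nSET\r\n$4\r\nname\r\n$5\r\nalice\r\n'
print(decode_simple_reply(b"+OK\r\n"))   # OK
```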

Why This Matters

Understanding RESP explains why Redis is fast at the protocol level: there is no schema negotiation, no handshake overhead per command, no binary encoding/decoding complexity. It's plain text over TCP. The simplicity is the speed.

Connection Pooling

Every Redis command travels over a TCP connection. Creating a new TCP connection involves a three-way handshake (~1 RTT), and if TLS is enabled, a TLS handshake on top (~1-2 more RTTs). Paying that cost per request is wasteful.

graph LR
    subgraph "Without Pool"
        A1["Request 1"] -->|"new TCP conn"| R1["Redis"]
        A2["Request 2"] -->|"new TCP conn"| R1
        A3["Request 3"] -->|"new TCP conn"| R1
    end
graph LR
    subgraph "With Pool"
        A1["Request 1"] --> P["Connection Pool<br/><i>N persistent connections</i>"]
        A2["Request 2"] --> P
        A3["Request 3"] --> P
        P -->|"reuse conn A"| R1["Redis"]
        P -->|"reuse conn B"| R1
    end

How it works:

  1. On startup, the client library creates N persistent TCP connections to Redis
  2. When a thread needs to send a command, it borrows a connection from the pool
  3. After the command completes, the connection is returned to the pool
  4. Connections are kept alive and reused across thousands of requests
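The borrow/return cycle above can be sketched with a thread-safe queue and stand-in connection objects. No real sockets are involved: connect is any factory callable, and the fake connections just echo their command.

```python
import queue

class ConnectionPool:
    """Minimal borrow/return connection pool."""

    def __init__(self, size: int, connect):
        self._idle = queue.Queue()
        for _ in range(size):           # step 1: N persistent connections
            self._idle.put(connect())

    def execute(self, command: str):
        conn = self._idle.get()         # step 2: borrow (blocks if exhausted)
        try:
            return conn(command)        # run the command on a reused conn
        finally:
            self._idle.put(conn)        # step 3: always return to the pool

connections_made = 0
def fake_connect():
    global connections_made
    connections_made += 1               # count the handshakes we'd have paid
    return lambda cmd: f"+OK ({cmd})"

pool = ConnectionPool(size=3, connect=fake_connect)
replies = [pool.execute("PING") for _ in range(100)]
# 100 commands, but only 3 connections were ever created.
```

The try/finally matters: a connection must go back to the pool even when a command fails, or the pool slowly leaks capacity.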

BullMQ: Connection Pooling via ioredis

BullMQ uses ioredis as its Redis client. A BullMQ Worker maintains its own dedicated connection (separate from the Queue producer's connection) because the worker uses BRPOPLPUSH/BLMOVE — blocking commands that monopolize a connection until a job arrives. If the worker shared a connection with non-blocking operations, they would all stall behind the block.

This is why BullMQ requires separate Redis connections for the Queue (producer), Worker (consumer), and QueueEvents (event listener) — each has a fundamentally different connection usage pattern.

Pool Sizing

Too few connections: threads wait for a free connection, adding latency. Too many connections: each connection consumes memory on both client and server (Redis tracks each connection's output buffer, query buffer, and state). A good starting point is number of application threads that concurrently access Redis, plus a small buffer.

Check connected_clients in the INFO clients section to see how many connections your application is actually using.

Pipelining

In the default request-response model, each command pays a full network round-trip (RTT). If RTT is 0.5ms and you need to run 100 commands, that's 50ms of pure network waiting — even though Redis can process those 100 commands in under 1ms.

sequenceDiagram
    participant Client
    participant Redis

    rect rgb(255, 230, 230)
        Note over Client,Redis: Without pipelining (3 × RTT)
        Client->>Redis: SET key1 "a"
        Redis->>Client: +OK
        Client->>Redis: SET key2 "b"
        Redis->>Client: +OK
        Client->>Redis: SET key3 "c"
        Redis->>Client: +OK
    end

    rect rgb(230, 255, 230)
        Note over Client,Redis: With pipelining (1 × RTT)
        Client->>Redis: SET key1 "a"<br/>SET key2 "b"<br/>SET key3 "c"
        Redis->>Client: +OK<br/>+OK<br/>+OK
    end

How it works: The client sends multiple commands without waiting for replies, then reads all replies in one batch. This is possible because RESP is a simple stream protocol — Redis processes commands in order and writes replies in order. The client can match replies to commands by position.

Impact: Pipelining can improve throughput by 5-10x for bulk operations. The improvement is proportional to RTT — the higher the latency between client and server, the bigger the gain.

| Scenario | RTT | 100 commands without pipeline | 100 commands with pipeline |
| --- | --- | --- | --- |
| Same machine | ~0.05ms | ~5ms | ~0.1ms |
| Same datacenter | ~0.5ms | ~50ms | ~1ms |
| Cross-region | ~50ms | ~5,000ms | ~50ms |
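Those numbers fall out of a simple cost model: without pipelining every command pays a full RTT, with pipelining the whole batch pays one RTT plus server processing time. This is back-of-envelope arithmetic, not a benchmark; per_cmd_ms is an assumed ~5µs of server-side work per command.

```python
def total_time_ms(n: int, rtt_ms: float, per_cmd_ms: float = 0.005,
                  pipelined: bool = False) -> float:
    """Rough cost of n commands: per-command RTTs unpipelined,
    one shared RTT plus n processing slices when pipelined."""
    if pipelined:
        return rtt_ms + n * per_cmd_ms
    return n * (rtt_ms + per_cmd_ms)

# Same datacenter (0.5ms RTT), 100 commands:
print(total_time_ms(100, 0.5))                   # ~50ms unpipelined
print(total_time_ms(100, 0.5, pipelined=True))   # ~1ms pipelined
```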

Pipelining Is Not Atomic

Unlike MULTI/EXEC, pipelined commands are not transactional. Other clients' commands can interleave between your pipelined commands. Pipelining is a throughput optimization, not a consistency mechanism. If you need atomicity, use MULTI/EXEC or Lua scripts.

BullMQ: Move and Fetch as Implicit Pipelining

BullMQ's moveToCompleted Lua script is a form of server-side pipelining: it completes the current job and fetches the next job in a single round-trip. Without this optimization, every job completion would cost 2 RTTs (one to complete, one to fetch). Under high throughput, this nearly doubles the effective processing rate.


Production Tuning

A default Redis installation works well for development. Production workloads need deliberate tuning of both Redis configuration and the operating system.

Redis Configuration

| Directive | Default | Production guidance |
| --- | --- | --- |
| maxmemory | 0 (no limit) | Always set this. Without it, Redis will consume all available RAM and the OS will OOM-kill it. Leave ~25% overhead for fork() CoW, replication buffers, and fragmentation. |
| maxclients | 10,000 | Increase if you have many application instances or microservices. Past this limit, Redis rejects new connections with an error. |
| tcp-backlog | 511 | The queue size for incoming connections waiting to be accepted. Under burst traffic, increase this alongside the OS somaxconn and tcp_max_syn_backlog. |
| timeout | 0 (disabled) | Seconds before idle connections are closed. Setting this (e.g., 300) prevents connection leaks from crashed clients. |
| maxmemory-policy | noeviction | Choose an eviction policy (see Eviction Policies above). noeviction returns errors on writes when memory is full — safe but disruptive. |

Operating System Tuning

Transparent Huge Pages (THP)

THP increases CoW page size from 4KB to 2MB, causing massive memory amplification during fork(). Redis will log a warning at startup if THP is enabled:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

Make this persistent across reboots by adding it to /etc/rc.local or a systemd unit.

File descriptor limits: each Redis client connection uses one file descriptor. The default limit (often 1024) is too low for production. Redis will log errors like "Redis can't set maximum open files".

# /etc/security/limits.conf (or systemd override)
redis soft nofile 65536
redis hard nofile 65536

Network stack tuning for high-connection environments:

| Kernel parameter | What it does | Suggested value |
| --- | --- | --- |
| net.core.somaxconn | Max queued connections | 65535 |
| net.ipv4.tcp_max_syn_backlog | SYN queue size | 65535 |
| vm.overcommit_memory | Allow fork() to succeed even if memory looks tight | 1 |

vm.overcommit_memory = 1

Redis's fork() for persistence creates a child process that theoretically could use as much memory as the parent (due to CoW). With the default overcommit setting (0), Linux may refuse the fork if it thinks there isn't enough memory — even though CoW means the child will only use a fraction. Setting overcommit_memory = 1 tells Linux to always allow the fork. Redis logs a warning if this is not set.

CPU affinity and RPS: on multi-core machines, prevent Redis from competing with network interrupt handlers for the same CPU cores:

  1. Enable RPS (Receive Packet Steering) on network interfaces, pinned to cores 0-1
  2. Set Redis's CPU affinity to cores 2+

This ensures the network stack and Redis's event loop don't contend for the same L1/L2 cache.


Observability

You can't fix what you can't measure. A single Redis instance is simple to monitor — and you should monitor it from day one, not after the first incident.

The INFO Command

INFO is the single most important diagnostic command. It returns metrics across multiple sections:

redis-cli INFO [section]

| Section | Key metrics | What to watch |
| --- | --- | --- |
| server | redis_version, uptime_in_seconds | Unexpected restarts (uptime < 300 means something crashed it) |
| clients | connected_clients, blocked_clients | Connection leaks (growing connected_clients); workers stuck on blocking commands (blocked_clients) |
| memory | used_memory, used_memory_rss, mem_fragmentation_ratio | RSS growing faster than used_memory = fragmentation; ratio < 1.0 = swapping |
| stats | keyspace_hits, keyspace_misses, evicted_keys, instantaneous_ops_per_sec | Hit ratio (hits / (hits + misses)), eviction pressure, throughput |
| persistence | rdb_last_save_time, aof_rewrite_in_progress | Stale snapshots; concurrent rewrite during high load |
| replication | connected_slaves, master_link_down_since | Replication broken if master_link_down_since > 0 |
| keyspace | keys, expires, avg_ttl per database | Key count trends, TTL distribution |

Cache hit ratio is the single most important application-level metric:

\[ \text{hit\_ratio} = \frac{\text{keyspace\_hits}}{\text{keyspace\_hits} + \text{keyspace\_misses}} \]

A healthy cache typically has a hit ratio above 0.90 (90%). If it drops, either your TTLs are too short, your cache is too small, or your access patterns have shifted.
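A few lines turn raw INFO stats output into the ratio. The helper hit_ratio is invented here, but keyspace_hits and keyspace_misses are the real field names INFO emits.

```python
def hit_ratio(info_stats: str) -> float:
    """Compute the cache hit ratio from INFO stats text."""
    fields = dict(line.split(":", 1)
                  for line in info_stats.strip().splitlines() if ":" in line)
    hits = int(fields["keyspace_hits"])
    misses = int(fields["keyspace_misses"])
    return hits / (hits + misses)

# A fragment of INFO stats output, as returned over RESP:
sample = """\
keyspace_hits:9500
keyspace_misses:500
"""
print(hit_ratio(sample))   # 0.95
```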

Latency Diagnostics

Quick latency check — measures round-trip time to Redis from the client machine:

redis-cli --latency              # continuous, 1-second intervals
redis-cli --latency-history      # rolling 15-second windows
redis-cli --latency-dist         # visual latency distribution

Latency monitoring framework — tracks latency spikes inside Redis (command processing time, not network):

CONFIG SET latency-monitor-threshold 5    # log events > 5ms
LATENCY LATEST                            # most recent spikes
LATENCY HISTORY command                   # history for a specific event
LATENCY DOCTOR                            # automated diagnosis

| Event type | What it measures |
| --- | --- |
| command | Regular command execution time |
| fast-command | O(1) and O(log n) commands |
| fork | Time spent in fork() for persistence |
| aof-write | AOF write latency |
| expire-cycle | Key expiration processing |
| eviction-cycle | Key eviction processing |

Slow Log

Captures commands whose execution time (excluding I/O) exceeds a threshold:

# redis.conf
slowlog-log-slower-than 10000    # microseconds (10ms)
slowlog-max-len 256              # keep last 256 entries
SLOWLOG LEN                      # number of entries
SLOWLOG GET 10                   # last 10 slow commands
SLOWLOG RESET                    # clear the log

Set the threshold low initially

Start with slowlog-log-slower-than 10000 (10ms). In a healthy Redis instance, almost nothing should take 10ms. If the slow log fills up, you're either running O(n) commands on large keys (KEYS *, SMEMBERS on a huge set) or experiencing fork() latency from persistence.

Finding Problematic Keys

redis-cli --bigkeys              # scan for largest keys by type
redis-cli --memkeys              # scan with memory usage per key
redis-cli --hotkeys              # most frequently accessed keys (requires LFU policy)

These use SCAN internally — safe to run in production (non-blocking, cursor-based).

MONITOR: Use With Extreme Caution

redis-cli MONITOR streams every command processed by Redis in real time. It is invaluable for debugging but can reduce throughput by up to 50%. Never leave it running in production. Use it briefly to diagnose issues, then disconnect.

Alert Thresholds to Start With

| Metric | Condition | Severity |
| --- | --- | --- |
| uptime_in_seconds | < 300 | Critical — Redis restarted unexpectedly |
| connected_clients | < expected minimum | Warning — application connection issue |
| mem_fragmentation_ratio | > 2.0 or < 1.0 | Warning — high fragmentation or swapping |
| evicted_keys (rate) | > 0 when unexpected | Warning — memory pressure causing data loss |
| instantaneous_ops_per_sec | > 80% of benchmark baseline | Warning — approaching throughput limit |
| rdb_last_save_time | > max acceptable interval | Warning — persistence falling behind |
| keyspace_misses / (hits + misses) | > 0.20 | Warning — cache effectiveness degrading |

When to Stay Here

You don't need replication, Sentinel, or Cluster if:

  • [x] Your application runs on a single server (or a small number of servers all in one region)
  • [x] Redis fits comfortably in memory on one machine
  • [x] You can tolerate ~1 second of data loss (AOF everysec)
  • [x] Your throughput is well under 100K ops/sec
  • [x] You don't need high availability (brief downtime on crash/restart is acceptable)
  • [x] You're not doing multi-key atomic operations that span multiple services

The Over-Engineering Trap

Adding replication "just in case" means managing replication lag, monitoring replica health, handling failover (manual or Sentinel), and debugging split-brain scenarios. If your app doesn't need HA yet, that complexity will create more incidents than it prevents.