Stage 1: Single Instance¶
A single Redis instance can handle 100,000+ operations per second on commodity hardware. For most small-to-medium applications — caching, session storage, simple queues, rate limiting — this is more than enough. Before adding replicas, Sentinel, or Cluster, understand why one instance is so fast and what it guarantees.
The Anti-Over-Engineering Rule
If your app has one or two servers, your Redis fits in memory, and you can tolerate losing up to one second of data on crash — stay here. Everything in the next two stages adds operational complexity that you don't need yet.
The Event Loop¶
Redis processes every command on a single thread. This sounds like a bottleneck, but it's actually a performance advantage.
How It Works¶
Redis implements its own event library called ae, which wraps the fastest available OS-level I/O multiplexer: epoll on Linux, kqueue on macOS/BSD, or select as a fallback. The core loop is minimal:
- Register all client sockets with the multiplexer
- Wait for any socket to have data ready (non-blocking)
- Process ready connections one at a time, to completion
- Repeat
graph TD
A[Wait for ready sockets<br/><i>epoll/kqueue</i>] --> B{Any socket<br/>has data?}
B -->|Yes| C[Read request<br/><i>network I/O</i>]
C --> D[Parse command<br/><i>CPU</i>]
D --> E[Execute against<br/>data structures<br/><i>CPU + memory</i>]
E --> F[Write response<br/><i>network I/O</i>]
F --> A
B -->|No| G[Process timer events<br/><i>expiry, background tasks</i>]
G --> A
Each command runs to completion before the next one starts. There is no interleaving, no preemption, no concurrent access to data structures.
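The loop above can be sketched in a few lines. This is illustrative only — not Redis's actual C code — using Python's `selectors` module (which wraps epoll/kqueue just like ae does) and a local socket pair standing in for a client connection; the "command" here simply upper-cases the request.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def serve_ready(conn: socket.socket) -> None:
    data = conn.recv(4096)          # read request (network I/O)
    if data:
        conn.sendall(data.upper())  # "execute" and write response

def event_loop(stop_after: int) -> None:
    handled = 0
    while handled < stop_after:
        for key, _ in sel.select(timeout=1):  # wait for ready sockets
            serve_ready(key.fileobj)          # run to completion, one at a time
            handled += 1

# A local socket pair stands in for a real client TCP connection.
client, server = socket.socketpair()
sel.register(server, selectors.EVENT_READ)
client.sendall(b"ping")
event_loop(stop_after=1)
reply = client.recv(4096)
print(reply)  # → b'PING'
```

There is exactly one thread: multiplexing happens in `select()`, but each ready connection is then handled sequentially, which is why no locks are needed.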
Why Single-Threaded Is Fast¶
| Factor | Multi-threaded cost | Redis avoids it |
|---|---|---|
| Context switching | Thousands of CPU cycles per switch (kernel transition, register save/restore, cache invalidation) | Zero — one thread, no switching |
| Cache locality | Cores fight over cache lines ("ping-ponging"), adding hundreds of cycles per operation | Hot data stays in L1/L2 cache (~1ns access vs ~100ns for main memory) |
| Synchronization | Mutexes, spin locks, CAS loops — all add latency and complexity | No locks anywhere. Impossible to deadlock. |
| Correctness | Race conditions, data corruption, hard-to-reproduce bugs | Every operation is inherently atomic |
The key insight: since Redis operations are microsecond-scale (data is in memory, not on disk), the bottleneck is network I/O, not CPU. A single core can process commands far faster than the network can deliver them.
Redis 6.0+: Threaded I/O¶
Redis 6.0 introduced multi-threaded I/O — but with a critical distinction:
| What | Threaded? |
|---|---|
| Reading requests from sockets | Yes (I/O threads) |
| Writing responses to sockets | Yes (I/O threads) |
| Parsing commands | No (main thread) |
| Executing commands | No (main thread) |
| Data structure manipulation | No (main thread) |
The I/O threads do not run concurrently with command execution. The main thread dispatches read/write work to I/O threads, waits for them to finish, then processes commands. This preserves the single-threaded execution guarantee while parallelizing the socket reads/writes that were the actual bottleneck.
Impact
Threaded I/O is disabled by default. When enabled with 8 I/O threads, benchmarks show 37–112% throughput improvement depending on workload. The gains come from parallelizing network serialization/deserialization, not from executing commands faster.
Data Structures Under the Hood¶
When you use a Redis Hash, List, or Sorted Set, you're not directly using a hashtable, linked list, or tree. Redis chooses an internal encoding based on the data's size and shape, silently switching encodings as data grows.
SDS (Simple Dynamic Strings)¶
Redis never uses raw C strings. Every string value is an SDS (Simple Dynamic String):
| Property | C string | SDS |
|---|---|---|
| Get length | O(n) — scan for `\0` | O(1) — read `len` field |
| Binary safe | No — `\0` terminates | Yes — length-tracked, can contain `\0` |
| Buffer overflow protection | None | Checked against `alloc` |
| Resize cost | Always full realloc | Pre-allocated space reduces reallocs |
SDS uses multiple header types (sdshdr8, sdshdr16, sdshdr32, sdshdr64) — the smallest header that can represent the string's length. A 200-byte string uses a 3-byte header; a 1GB string uses a 9-byte header.
Growth strategy: strings under 1MB double their allocation; strings over 1MB grow by 1MB at a time. This amortizes reallocation cost.
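That growth rule can be written out directly (a simplified sketch — the real `sdsMakeRoomFor` also accounts for header and terminator bytes):

```python
SDS_MAX_PREALLOC = 1024 * 1024  # 1MB threshold

def sds_next_alloc(needed: int) -> int:
    """Bytes to allocate for a string needing `needed` bytes (sketch)."""
    if needed < SDS_MAX_PREALLOC:
        return needed * 2                 # small strings: double the allocation
    return needed + SDS_MAX_PREALLOC      # large strings: grow by 1MB at a time

print(sds_next_alloc(100))               # → 200
print(sds_next_alloc(4 * 1024 * 1024))   # → 5242880 (4MB + 1MB)
```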
Listpack (Replacing Ziplist)¶
For small hashes, lists, and sorted sets, Redis uses a listpack — a single contiguous block of memory with all entries stored sequentially.
graph LR
subgraph "Listpack (one contiguous allocation)"
A["Total<br/>bytes"] --> B["Entry 1<br/><i>encoding + data + backlen</i>"]
B --> C["Entry 2<br/><i>encoding + data + backlen</i>"]
C --> D["Entry 3<br/><i>encoding + data + backlen</i>"]
D --> E["EOF<br/>marker"]
end
Why contiguous memory matters: a hashtable with 10 entries requires ~10 separate allocations (node structs, key/value pointers). A listpack stores all 10 entries in one allocation. This means:
- ~2 bytes overhead per entry vs ~21 bytes for a hashtable node
- Sequential memory access (cache-friendly) vs pointer chasing (cache-hostile)
- Zero pointer overhead
The ziplist problem: Redis 7 replaced ziplists with listpacks because of a design flaw. Ziplists stored each entry's previous entry's length. If you inserted a large entry, the next entry's "previous length" field might need to grow from 1 byte to 5 bytes, which could cascade through the entire structure — an O(n) chain reaction from a single insert. Listpacks store each entry's own length instead, eliminating cascading updates entirely.
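The cascade is easy to see in a toy model. The 254-byte threshold below is the actual ziplist prevlen encoding rule; the rest is illustrative:

```python
def prevlen_bytes(prev_size: int) -> int:
    # Ziplist rule: a previous-entry length < 254 fits in 1 byte, else 5.
    return 1 if prev_size < 254 else 5

def entry_size(payload: int, prev_size: int) -> int:
    return payload + prevlen_bytes(prev_size)

def cascade_count(payloads, head_before, head_after):
    """How many downstream entries must be rewritten when the head entry
    grows from `head_before` to `head_after` bytes."""
    rewrites = 0
    prev_old, prev_new = head_before, head_after
    for p in payloads:
        old, new = entry_size(p, prev_old), entry_size(p, prev_new)
        if new != old:
            rewrites += 1
        prev_old, prev_new = old, new
    return rewrites

# Entries sized just under the threshold: one insert cascades through all.
print(cascade_count([251] * 5, head_before=10, head_after=300))  # → 5
# Comfortably small entries: the update stops after one rewrite.
print(cascade_count([100] * 5, head_before=10, head_after=300))  # → 1
```

Because a listpack entry records its *own* length, `head_after` never changes any neighbor's stored size — the equivalent count is always zero.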
Skiplist¶
Used for sorted sets when they outgrow the listpack threshold. A skiplist is a probabilistic data structure — a linked list with multiple "express lanes":
graph LR
subgraph "Level 3"
L3_1["1"] --> L3_9["9"]
end
subgraph "Level 2"
L2_1["1"] --> L2_5["5"] --> L2_9["9"]
end
subgraph "Level 1"
L1_1["1"] --> L1_3["3"] --> L1_5["5"] --> L1_7["7"] --> L1_9["9"]
end
| Property | Skiplist | Balanced tree (e.g., AVL, Red-Black) |
|---|---|---|
| Search | O(log n) | O(log n) |
| Insert/Delete | O(log n), simpler | O(log n), complex rotations |
| Range queries | Very efficient (follow forward pointers) | Requires in-order traversal |
| Memory per node | ~1.33 pointers average (p=1/4) | 2 pointers (left/right) |
| Implementation | ~100 lines | ~300+ lines |
Redis chose skiplists over balanced trees because they're simpler to implement, naturally efficient for range queries (ZRANGEBYSCORE), and have comparable performance with slightly less memory.
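A toy skiplist makes the "express lanes" concrete. This is a score-only sketch — the real zskiplist also stores members, backward pointers, and span counts for rank lookups:

```python
import random

class Node:
    def __init__(self, score: float, level: int):
        self.score = score
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    P, MAX_LEVEL = 0.25, 16             # p=1/4, as in Redis

    def __init__(self):
        self.head = Node(float("-inf"), self.MAX_LEVEL)
        self.level = 1

    def _random_level(self) -> int:
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, score: float) -> None:
        update, x = [self.head] * self.MAX_LEVEL, self.head
        for i in range(self.level - 1, -1, -1):   # descend the express lanes
            while x.forward[i] and x.forward[i].score < score:
                x = x.forward[i]
            update[i] = x
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        node = Node(score, lvl)
        for i in range(lvl):                      # splice in at each level
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def range(self, lo: float, hi: float):
        """ZRANGEBYSCORE-style query: descend once, then walk level 0."""
        x = self.head
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] and x.forward[i].score < lo:
                x = x.forward[i]
        x, out = x.forward[0], []
        while x and x.score <= hi:
            out.append(x.score)
            x = x.forward[0]
        return out

sl = SkipList()
for s in [5, 1, 9, 3, 7]:
    sl.insert(s)
print(sl.range(3, 8))  # → [3, 5, 7]
```

The range query is the payoff: one O(log n) descent to find the lower bound, then plain forward-pointer walking — exactly the access pattern ZRANGEBYSCORE relies on.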
Dict (Hashtable) with Incremental Rehashing¶
The core key-value store uses a dict — a hash table with a critical design choice: incremental rehashing.
A naive hash table doubles in size when the load factor gets too high. This means allocating a new table and moving all entries — a sudden, blocking operation that causes a latency spike proportional to the table size.
Redis avoids this by maintaining two hash tables simultaneously:
graph TD
subgraph "During Rehashing"
HT0["ht[0] — old table<br/><i>gradually emptying</i>"]
HT1["ht[1] — new table (2x size)<br/><i>gradually filling</i>"]
end
OP["Every dict operation<br/>(lookup, insert, delete)"] --> M["Migrate 1 bucket<br/>from ht[0] → ht[1]"]
M --> HT0
M --> HT1
- When a resize triggers, `ht[1]` is allocated at double the size
- A `rehashidx` counter starts at 0
- On every normal operation (GET, SET, DEL), Redis migrates one bucket from `ht[0]` to `ht[1]` (visiting up to 10 empty buckets to bound work per step)
- Lookups check both tables; inserts go only to `ht[1]`
- When migration completes, `ht[1]` becomes `ht[0]`
The result: rehashing is invisible to clients. There is no sudden latency spike, just a tiny constant overhead spread across millions of operations. The trade-off is temporarily using ~2x the hash table memory during the migration window.
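The steps above can be sketched as a toy two-table dict — illustrative only (the real dict also caps visits to empty buckets and handles updates of not-yet-migrated keys):

```python
class IncrementalDict:
    def __init__(self, size: int = 4):
        self.ht0 = [dict() for _ in range(size)]   # old table (buckets)
        self.ht1 = None                            # new table, during rehash
        self.rehashidx = -1                        # -1 = not rehashing

    def _start_rehash(self) -> None:
        self.ht1 = [dict() for _ in range(len(self.ht0) * 2)]  # 2x size
        self.rehashidx = 0

    def _rehash_step(self) -> None:
        if self.rehashidx < 0:
            return
        for k, v in self.ht0[self.rehashidx].items():  # migrate one bucket
            self.ht1[hash(k) % len(self.ht1)][k] = v
        self.ht0[self.rehashidx].clear()
        self.rehashidx += 1
        if self.rehashidx == len(self.ht0):            # done: swap tables
            self.ht0, self.ht1, self.rehashidx = self.ht1, None, -1

    def set(self, k, v) -> None:
        self._rehash_step()
        table = self.ht1 if self.rehashidx >= 0 else self.ht0
        table[hash(k) % len(table)][k] = v             # inserts go to ht1

    def get(self, k):
        self._rehash_step()
        tables = [self.ht0] + ([self.ht1] if self.rehashidx >= 0 else [])
        for t in tables:                               # lookups check both
            if k in t[hash(k) % len(t)]:
                return t[hash(k) % len(t)][k]
        return None

d = IncrementalDict()
for i in range(8):
    d.set(f"k{i}", i)
d._start_rehash()          # resize triggered (manually here, for the demo)
print(d.get("k3"))         # → 3, answered mid-migration with no pause
```

Every `get`/`set` pays for one bucket migration; no single call ever pays for the whole table.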
Encoding Thresholds¶
Redis automatically switches encodings when data outgrows compact representations:
| Type | Compact encoding | Condition to stay compact | Full encoding |
|---|---|---|---|
| Hash | Listpack | Entries ≤ 128 AND all values ≤ 64 bytes | Hashtable |
| Sorted Set | Listpack | Entries ≤ 128 AND all values ≤ 64 bytes | Skiplist + Hashtable |
| Set | Intset | All integers AND count ≤ 512 | Hashtable |
| List | Quicklist (linked list of listpacks) | Always uses quicklist | — |
These thresholds are configurable (hash-max-listpack-entries, hash-max-listpack-value, etc.). The defaults are tuned for a balance between memory efficiency and CPU cost of linear scans in compact encodings.
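The hash rule from the table, written out as a predicate (a sketch; the thresholds mirror the defaults, and counting bytes via UTF-8 encoding is a simplification):

```python
def hash_encoding(fields: dict,
                  max_entries: int = 128,      # hash-max-listpack-entries
                  max_value: int = 64) -> str: # hash-max-listpack-value
    """Which encoding Redis would pick for a hash with these fields."""
    compact = len(fields) <= max_entries and all(
        len(str(x).encode()) <= max_value
        for kv in fields.items() for x in kv
    )
    return "listpack" if compact else "hashtable"

print(hash_encoding({"data": "{}", "progress": "0"}))  # → listpack
print(hash_encoding({"data": "x" * 1000}))             # → hashtable
```

Note the switch is one-way: once a value exceeds the threshold, the hash converts to a hashtable and stays there even if the large value is later removed.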
BullMQ: How Data Structures Map
BullMQ stores each job as a Redis Hash (bull:<queue>:<jobId>) with fields like data, opts, progress, returnvalue. For a queue with thousands of small jobs, each job hash likely uses the compact listpack encoding (few fields, small values).
The queue itself is a List (bull:<queue>:wait) — FIFO order via LPUSH/BRPOPLPUSH. Delayed jobs sit in a Sorted Set (bull:<queue>:delayed) scored by their target timestamp, leveraging the skiplist's efficient ZRANGEBYSCORE to find jobs ready to execute.
Memory Model¶
Jemalloc and Fragmentation¶
Redis uses jemalloc as its memory allocator. Jemalloc requests large chunks from the OS and subdivides them into size classes for smaller allocations — which is efficient, but creates a fundamental tension.
When Redis creates and deletes keys of varying sizes, the freed slots in jemalloc's arenas can't always be reused for differently-sized allocations. An arena can only be returned to the OS when all allocations within it are freed. If even one small allocation remains, the entire arena stays resident.
Fragmentation ratio:
| Ratio | Meaning |
|---|---|
| < 1.0 | Redis is swapping to disk — very bad, immediate action needed |
| 1.0 – 1.5 | Normal, healthy |
| 1.5 – 2.0 | Moderate fragmentation — monitor |
| > 2.0 | High fragmentation — consider active defrag |
Active defragmentation (Redis 4.0+): Redis can proactively move allocations to consolidate fragmented memory. It scans key-value pairs and reallocates them to contiguous regions. This trades CPU for reduced memory waste — configurable via activedefrag yes with tunable thresholds for when to start and how much CPU to spend.
Eviction Policies¶
When Redis hits maxmemory, it must evict keys to make room. The choice of eviction policy has significant implications:
| Policy | Scope | Algorithm |
|---|---|---|
| `noeviction` | — | Return errors on writes (safe but disruptive) |
| `allkeys-lru` | All keys | Approximated LRU |
| `allkeys-lfu` | All keys | Approximated LFU |
| `allkeys-random` | All keys | Random eviction |
| `volatile-lru` | Keys with TTL only | Approximated LRU |
| `volatile-lfu` | Keys with TTL only | Approximated LFU |
| `volatile-ttl` | Keys with TTL only | Shortest TTL first |
| `volatile-random` | Keys with TTL only | Random eviction |
Redis LRU Is Approximated, Not Exact
Redis does not maintain a true LRU linked list — that would require per-key pointer overhead (16 bytes per key on 64-bit systems). Instead:
- Each key stores a 24-bit LRU clock (last access timestamp)
- On eviction, Redis samples `maxmemory-samples` (default 5) random keys
- Sampled keys populate a 16-entry eviction pool sorted by idle time
- The key with the longest idle time in the pool is evicted
With 5 samples, the approximation is surprisingly close to true LRU. With 10 samples, it's nearly indistinguishable. This is a deliberate engineering trade-off: exact LRU costs O(n) memory, approximated LRU costs O(1) with tunable accuracy.
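The sampling step can be sketched like so (simplified: the real eviction pool persists across eviction cycles, while here it is rebuilt per call):

```python
import random

def evict_approx_lru(idle_by_key: dict, samples: int = 5,
                     rng=random) -> str:
    """Pick a victim by sampling `samples` keys and evicting the idlest."""
    sampled = rng.sample(list(idle_by_key), samples)
    pool = sorted(sampled, key=idle_by_key.get, reverse=True)  # idlest first
    return pool[0]

idle = {f"key:{i}": i for i in range(1000)}   # idle seconds per key
victim = evict_approx_lru(idle, rng=random.Random(42))
print(victim)   # the idlest key in the sample — not the global maximum
```

The victim is only guaranteed to be the idlest *of the sample*, which is exactly the approximation the text describes: O(1) memory, accuracy tunable via the sample size.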
LFU (Least Frequently Used) — added in Redis 4.0, reuses the same 24 bits differently:
- 16 bits: decay timestamp (when the counter was last decremented)
- 8 bits: a Morris counter — a logarithmic probabilistic counter
The Morris counter's increment probability decreases as the count grows: p = 1 / (counter * lfu_log_factor + 1). This means frequently accessed keys quickly reach a high count and stay there, while the decay mechanism periodically reduces counts so old popular keys don't persist forever.
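A simulation of that increment rule, using the simplified formula from the text (the real counter also subtracts an initial value, `LFU_INIT_VAL`, before computing the probability):

```python
import random

def lfu_incr(counter: int, lfu_log_factor: int = 10, rng=random) -> int:
    """One probabilistic increment of the 8-bit Morris counter."""
    if counter >= 255:
        return counter                           # saturates at 8 bits
    p = 1.0 / (counter * lfu_log_factor + 1)     # shrinks as counter grows
    return counter + 1 if rng.random() < p else counter

rng = random.Random(1)
c = 0
for _ in range(10_000):      # 10,000 accesses to one key
    c = lfu_incr(c, rng=rng)
print(c)                     # counter stays small despite 10k accesses
```

With `lfu_log_factor=10`, reaching a count of *n* takes roughly 5n² accesses, so 10,000 accesses land the counter in the mid-40s — logarithmic growth packed into 8 bits.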
Persistence: What Can You Lose?¶
Redis is an in-memory store. Without persistence, a crash means total data loss. Redis offers two persistence mechanisms, each with distinct trade-offs.
RDB (Snapshots)¶
Redis periodically saves the entire dataset to a binary file (.rdb):
- Redis calls `fork()` to create a child process
- Parent continues serving clients
- Child writes the entire dataset to a temp file
- Child atomically replaces the old `.rdb` file
Copy-on-Write (CoW): after fork(), parent and child share the same memory pages. Only pages modified by the parent (from new writes during the snapshot) are physically copied. The memory overhead is proportional to write volume during the snapshot, not dataset size.
fork() Is Redis's Achilles' Heel
fork() must duplicate the page table — a kernel data structure that maps virtual addresses to physical memory. With a 25GB dataset, this can take 100ms+, during which Redis is completely blocked.
Transparent Huge Pages (THP) make this dramatically worse. THP increases the CoW granularity from 4KB to 2MB pages. A single byte write during a snapshot now copies 2MB instead of 4KB. Redis's official documentation explicitly recommends disabling THP:
AOF (Append-Only File)¶
Every write command is appended to a log file. Three fsync policies control the durability-performance trade-off:
| `appendfsync` | Data loss window | How it works |
|---|---|---|
| `always` | None | fsync after every write — safest, slowest |
| `everysec` (default) | ~1 second | fsync every second in a background thread |
| `no` | ~30 seconds (OS-dependent) | Let the OS flush when it wants |
AOF Rewrite: the AOF grows indefinitely because it logs every write. Periodically, Redis rewrites it to the minimal set of commands that recreate the current dataset. Redis 7.0+ uses Multi-Part AOF — a base file (in RDB format) plus incremental files plus a manifest — which eliminates the old double-write penalty and memory spike during rewrites.
AOF rewrite also uses fork()
The same CoW memory concerns from RDB apply here. The rewrite happens less frequently, but the fork() latency hit is the same.
What You Can Lose¶
| Persistence mode | Data loss on crash | Recovery speed | Best for |
|---|---|---|---|
| None | Everything | N/A (no recovery) | Pure cache |
| RDB only | Minutes (since last snapshot) | Fast (binary load) | Backups, disaster recovery |
| AOF only (everysec) | ~1 second | Slower (replay commands) | Durability-sensitive apps |
| RDB + AOF (recommended) | ~1 second (loads AOF) | Medium | Production — best of both |
| AOF (always) | Nothing (theoretically) | Slowest | Extremely rare — significant perf cost |
Talking to Redis: The Wire Protocol¶
Before understanding why pipelining and connection pooling matter, you need to understand how clients actually communicate with Redis.
RESP (REdis Serialization Protocol)¶
Every Redis client — whether it's redis-cli, ioredis, Jedis, or Lettuce — talks to the server over TCP using RESP, a simple text-based protocol designed to be human-readable and machine-parseable.
A client sends a command as a RESP Array of Bulk Strings:
*3\r\n ← Array of 3 elements
$3\r\n ← Bulk String, 3 bytes
SET\r\n ← "SET"
$4\r\n ← Bulk String, 4 bytes
name\r\n ← "name"
$5\r\n ← Bulk String, 5 bytes
alice\r\n ← "alice"
The server responds with a type-prefixed reply:
| Prefix | Type | Example |
|---|---|---|
| `+` | Simple String | `+OK\r\n` |
| `-` | Error | `-ERR unknown command\r\n` |
| `:` | Integer | `:1\r\n` |
| `$` | Bulk String | `$5\r\nalice\r\n` |
| `*` | Array | `*2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n` |
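Both directions are simple enough to implement in a few lines — a sketch handling exactly these reply types, assuming one complete reply is present in the buffer (no streaming):

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP Array of Bulk Strings, as shown above."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)

def decode_reply(buf: bytes):
    """Parse one reply; returns (value, remaining_bytes)."""
    line, rest = buf.split(b"\r\n", 1)
    prefix, payload = line[:1], line[1:]
    if prefix == b"+":                        # Simple String
        return payload.decode(), rest
    if prefix == b"-":                        # Error
        raise RuntimeError(payload.decode())
    if prefix == b":":                        # Integer
        return int(payload), rest
    if prefix == b"$":                        # Bulk String
        n = int(payload)
        if n == -1:
            return None, rest                 # null bulk string
        return rest[:n].decode(), rest[n + 2:]
    if prefix == b"*":                        # Array
        items = []
        for _ in range(int(payload)):
            item, rest = decode_reply(rest)
            items.append(item)
        return items, rest
    raise ValueError(f"unknown RESP type: {prefix!r}")

print(encode_command("SET", "name", "alice"))
# → b'*3\r\n$3\r\nSET\r\n$4\r\nname\r\n$5\r\nalice\r\n'
print(decode_reply(b"$5\r\nalice\r\n")[0])  # → alice
```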
The protocol is synchronous by default: send one command, wait for one reply. This is the bottleneck that pipelining solves.
Why This Matters
Understanding RESP explains why Redis is fast at the protocol level: there is no schema negotiation, no handshake overhead per command, no binary encoding/decoding complexity. It's plain text over TCP. The simplicity is the speed.
Connection Pooling¶
Every Redis command requires a TCP connection. Creating a new TCP connection involves a three-way handshake (~1 RTT), and if TLS is enabled, a TLS handshake on top (~1-2 more RTTs). Doing this per-request is wasteful.
graph LR
subgraph "Without Pool"
A1["Request 1"] -->|"new TCP conn"| R1["Redis"]
A2["Request 2"] -->|"new TCP conn"| R1
A3["Request 3"] -->|"new TCP conn"| R1
end
graph LR
subgraph "With Pool"
A1["Request 1"] --> P["Connection Pool<br/><i>N persistent connections</i>"]
A2["Request 2"] --> P
A3["Request 3"] --> P
P -->|"reuse conn A"| R1["Redis"]
P -->|"reuse conn B"| R1
end
How it works:
- On startup, the client library creates N persistent TCP connections to Redis
- When a thread needs to send a command, it borrows a connection from the pool
- After the command completes, the connection is returned to the pool
- Connections are kept alive and reused across thousands of requests
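The borrow/return pattern above, as a minimal sketch — the `fake_connect` factory and callable "connections" are stand-ins for real RESP-speaking TCP sockets:

```python
import queue

class ConnectionPool:
    def __init__(self, size: int, connect):
        self._free = queue.Queue()
        for _ in range(size):          # N persistent connections, made once
            self._free.put(connect())

    def execute(self, command: str):
        conn = self._free.get()        # borrow (blocks if pool is exhausted)
        try:
            return conn(command)       # send on an already-open connection
        finally:
            self._free.put(conn)       # return to the pool for reuse

def fake_connect():
    return lambda cmd: f"+OK ({cmd})"  # pretend every command succeeds

pool = ConnectionPool(size=4, connect=fake_connect)
print(pool.execute("SET name alice"))  # → +OK (SET name alice)
```

The `try/finally` is the important part: a connection must return to the pool even when a command raises, or the pool slowly leaks until every caller blocks.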
BullMQ: Connection Pooling via ioredis
BullMQ uses ioredis as its Redis client, keeping long-lived connections rather than opening one per command. A BullMQ Worker maintains its own connection (separate from the Queue producer connection) because the worker uses BRPOPLPUSH/BLMOVE — a blocking command that monopolizes the connection until a job arrives. If the worker shared a connection with non-blocking operations, they'd all stall.
This is why BullMQ requires separate Redis connections for the Queue (producer), Worker (consumer), and QueueEvents (event listener) — each has a fundamentally different connection usage pattern.
Pool Sizing
Too few connections: threads wait for a free connection, adding latency. Too many connections: each connection consumes memory on both client and server (Redis tracks each connection's output buffer, query buffer, and state). A good starting point is number of application threads that concurrently access Redis, plus a small buffer.
Check INFO clients → connected_clients to see how many connections your application is actually using.
Pipelining¶
In the default request-response model, each command pays a full network round-trip (RTT). If RTT is 0.5ms and you need to run 100 commands, that's 50ms of pure network waiting — even though Redis can process those 100 commands in under 1ms.
sequenceDiagram
participant Client
participant Redis
rect rgb(255, 230, 230)
Note over Client,Redis: Without pipelining (3 × RTT)
Client->>Redis: SET key1 "a"
Redis->>Client: +OK
Client->>Redis: SET key2 "b"
Redis->>Client: +OK
Client->>Redis: SET key3 "c"
Redis->>Client: +OK
end
rect rgb(230, 255, 230)
Note over Client,Redis: With pipelining (1 × RTT)
Client->>Redis: SET key1 "a"<br/>SET key2 "b"<br/>SET key3 "c"
Redis->>Client: +OK<br/>+OK<br/>+OK
end
How it works: The client sends multiple commands without waiting for replies, then reads all replies in one batch. This is possible because RESP is a simple stream protocol — Redis processes commands in order and writes replies in order. The client can match replies to commands by position.
Impact: Pipelining can improve throughput by 5-10x for bulk operations. The improvement is proportional to RTT — the higher the latency between client and server, the bigger the gain.
| Scenario | RTT | 100 commands without pipeline | 100 commands with pipeline |
|---|---|---|---|
| Same machine | ~0.05ms | ~5ms | ~0.1ms |
| Same datacenter | ~0.5ms | ~50ms | ~1ms |
| Cross-region | ~50ms | ~5,000ms | ~50ms |
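The table's numbers fall out of a one-line cost model (`per_command_us` is an assumed server-side processing cost, not a measurement):

```python
def batch_time_ms(n_commands: int, rtt_ms: float,
                  per_command_us: float = 5, pipelined: bool = False) -> float:
    """Rough wall-clock time for a batch of commands (toy model)."""
    server_ms = n_commands * per_command_us / 1000
    if pipelined:
        return rtt_ms + server_ms             # one round-trip for the batch
    return n_commands * rtt_ms + server_ms    # one round-trip per command

print(batch_time_ms(100, rtt_ms=0.5))                   # → 50.5
print(batch_time_ms(100, rtt_ms=0.5, pipelined=True))   # → 1.0
```

The asymmetry explains the table: the unpipelined cost scales with `n_commands × RTT`, while the pipelined cost is dominated by a single RTT.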
Pipelining Is Not Atomic
Unlike MULTI/EXEC, pipelined commands are not transactional. Other clients' commands can interleave between your pipelined commands. Pipelining is a throughput optimization, not a consistency mechanism. If you need atomicity, use MULTI/EXEC or Lua scripts.
BullMQ: Move and Fetch as Implicit Pipelining
BullMQ's moveToCompleted Lua script is a form of server-side pipelining: it completes the current job and fetches the next job in a single round-trip. Without this optimization, every job completion would cost 2 RTTs (one to complete, one to fetch). Under high throughput, this nearly doubles the effective processing rate.
Production Tuning¶
A default Redis installation works well for development. Production workloads need deliberate tuning of both Redis configuration and the operating system.
Redis Configuration¶
| Directive | Default | Production guidance |
|---|---|---|
| `maxmemory` | 0 (no limit) | Always set this. Without it, Redis will consume all available RAM and the OS will OOM-kill it. Leave ~25% overhead for fork() CoW, replication buffers, and fragmentation. |
| `maxclients` | 10,000 | Increase if you have many application instances or microservices. After this limit, Redis responds with errors to new connections. |
| `tcp-backlog` | 511 | The queue size for incoming connections waiting to be accepted. Under burst traffic, increase this alongside the OS `somaxconn` and `tcp_max_syn_backlog`. |
| `timeout` | 0 (disabled) | Seconds before idle connections are closed. Setting this (e.g., 300) prevents connection leaks from crashed clients. |
| `maxmemory-policy` | `noeviction` | Choose an eviction policy (see Eviction Policies above). `noeviction` returns errors on writes when memory is full — safe but disruptive. |
Operating System Tuning¶
Transparent Huge Pages (THP)
THP increases CoW page size from 4KB to 2MB, causing massive memory amplification during fork(). Redis will log a warning at startup if THP is enabled. Disable it with:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

Make this persistent across reboots by adding it to /etc/rc.local or a systemd unit.
File descriptor limits: each Redis client connection uses one file descriptor. The default limit (often 1024) is too low for production — raise it (e.g., `ulimit -n 65535`, or `LimitNOFILE=65535` in the systemd unit) or Redis will log errors like "Redis can't set maximum open files".
Network stack tuning for high-connection environments:
| Kernel parameter | What it does | Suggested value |
|---|---|---|
| `net.core.somaxconn` | Max queued connections | 65535 |
| `net.ipv4.tcp_max_syn_backlog` | SYN queue size | 65535 |
| `vm.overcommit_memory` | Allow fork() to succeed even if memory looks tight | 1 |
vm.overcommit_memory = 1
Redis's fork() for persistence creates a child process that theoretically could use as much memory as the parent (due to CoW). With the default overcommit setting (0), Linux may refuse the fork if it thinks there isn't enough memory — even though CoW means the child will only use a fraction. Setting overcommit_memory = 1 tells Linux to always allow the fork. Redis logs a warning if this is not set.
CPU affinity and RPS: on multi-core machines, prevent Redis from competing with network interrupt handlers for the same CPU cores:
- Enable RPS (Receive Packet Steering) on network interfaces, pinned to cores 0-1
- Set Redis's CPU affinity to cores 2+
This ensures the network stack and Redis's event loop don't contend for the same L1/L2 cache.
Observability¶
You can't fix what you can't measure. A single Redis instance is simple to monitor — and you should monitor it from day one, not after the first incident.
The INFO Command¶
INFO is the single most important diagnostic command. It returns metrics across multiple sections:
| Section | Key metrics | What to watch |
|---|---|---|
| server | `redis_version`, `uptime_in_seconds` | Unexpected restarts (uptime < 300 means something crashed it) |
| clients | `connected_clients`, `blocked_clients` | Connection leaks (growing `connected_clients`), workers stuck on blocking commands (`blocked_clients`) |
| memory | `used_memory`, `used_memory_rss`, `mem_fragmentation_ratio` | RSS growing faster than `used_memory` = fragmentation; ratio < 1.0 = swapping |
| stats | `keyspace_hits`, `keyspace_misses`, `evicted_keys`, `instantaneous_ops_per_sec` | Hit ratio (hits / (hits + misses)), eviction pressure, throughput |
| persistence | `rdb_last_save_time`, `aof_rewrite_in_progress` | Stale snapshots, concurrent rewrite during high load |
| replication | `connected_slaves`, `master_link_down_since` | Replication broken if `master_link_down_since` > 0 |
| keyspace | `keys`, `expires`, `avg_ttl` per database | Key count trends, TTL distribution |
Cache hit ratio is the single most important application-level metric:
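It is computed from the `keyspace_hits` and `keyspace_misses` counters in INFO stats (the example numbers here are made up):

```python
def hit_ratio(keyspace_hits: int, keyspace_misses: int) -> float:
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else 0.0

print(hit_ratio(9_450_000, 550_000))  # → 0.945
```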
A healthy cache typically has a hit ratio above 0.90 (90%). If it drops, either your TTLs are too short, your cache is too small, or your access patterns have shifted.
Latency Diagnostics¶
Quick latency check — measures round-trip time to Redis from the client machine:
redis-cli --latency # continuous, 1-second intervals
redis-cli --latency-history # rolling 15-second windows
redis-cli --latency-dist # visual latency distribution
Latency monitoring framework — tracks latency spikes inside Redis (command processing time, not network):
CONFIG SET latency-monitor-threshold 5 # log events > 5ms
LATENCY LATEST # most recent spikes
LATENCY HISTORY command # history for a specific event
LATENCY DOCTOR # automated diagnosis
| Event type | What it measures |
|---|---|
| `command` | Regular command execution time |
| `fast-command` | O(1) and O(log n) commands |
| `fork` | Time spent in fork() for persistence |
| `aof-write` | AOF write latency |
| `expire-cycle` | Key expiration processing |
| `eviction-cycle` | Key eviction processing |
Slow Log¶
Captures commands whose execution time (excluding I/O) exceeds a threshold:
# redis.conf
slowlog-log-slower-than 10000 # microseconds (10ms)
slowlog-max-len 256 # keep last 256 entries
SLOWLOG LEN # number of entries
SLOWLOG GET 10 # last 10 slow commands
SLOWLOG RESET # clear the log
Set the threshold low initially
Start with slowlog-log-slower-than 10000 (10ms). In a healthy Redis instance, almost nothing should take 10ms. If the slow log fills up, you're either running O(n) commands on large keys (KEYS *, SMEMBERS on a huge set) or experiencing fork() latency from persistence.
Finding Problematic Keys¶
redis-cli --bigkeys # scan for largest keys by type
redis-cli --memkeys # scan with memory usage per key
redis-cli --hotkeys # most frequently accessed keys (requires LFU policy)
These use SCAN internally — safe to run in production (non-blocking, cursor-based).
MONITOR: Use With Extreme Caution
redis-cli MONITOR streams every command processed by Redis in real time. It is invaluable for debugging but can reduce throughput by up to 50%. Never leave it running in production. Use it briefly to diagnose issues, then disconnect.
Alert Thresholds to Start With¶
| Metric | Condition | Severity |
|---|---|---|
| `uptime_in_seconds` | < 300 | Critical — Redis restarted unexpectedly |
| `connected_clients` | < expected minimum | Warning — application connection issue |
| `mem_fragmentation_ratio` | > 2.0 or < 1.0 | Warning — high fragmentation or swapping |
| `evicted_keys` (rate) | > 0 when unexpected | Warning — memory pressure causing data loss |
| `instantaneous_ops_per_sec` | > 80% of benchmark baseline | Warning — approaching throughput limit |
| `rdb_last_save_time` | > max acceptable interval | Warning — persistence falling behind |
| `keyspace_misses / (hits + misses)` | > 0.20 | Warning — cache effectiveness degrading |
When to Stay Here¶
You don't need replication, Sentinel, or Cluster if:
- [x] Your application runs on a single server (or a small number of servers all in one region)
- [x] Redis fits comfortably in memory on one machine
- [x] You can tolerate ~1 second of data loss (AOF everysec)
- [x] Your throughput is well under 100K ops/sec
- [x] You don't need high availability (brief downtime on crash/restart is acceptable)
- [x] You're not doing multi-key atomic operations that span multiple services
The Over-Engineering Trap
Adding replication "just in case" means managing replication lag, monitoring replica health, handling failover (manual or Sentinel), and debugging split-brain scenarios. If your app doesn't need HA yet, that complexity will create more incidents than it prevents.