2.1 Persistence options in Redis

If a Redis server that only stores data in RAM is restarted, all data is lost. To prevent such data loss, there needs to be some mechanism for persisting the data to disk; Redis provides two of them: snapshotting and anappend-only file, or AOF. You can configure your Redis instances to use either of the two, or a combination of both.

When a snapshot is created, the entire point-in-time view of the dataset is written to persistent storage in a compact .rdb file. You can set up recurring backups, for example every 1, 12, or 24 hours and use these backups to easily restore different versions of the data set in case of disasters. You can also use these snapshots to create a clone of the server, or simply leave them in place for a future restart.

Creating a .rdb file requires a lot of disk I/O. If performed in the main Redis process, this would reduce the server’s performance. That’s why this work is done by a forked child process. But even forking can be time-consuming if the dataset is large. This may result in decreased performance or in Redis failing to serve clients for a few milliseconds or even up to a second for very large datasets. Understanding this should help you decide whether this solution makes sense for your requirements.

You can configure the name and location of the .rdb file with the dbfilename and dir configuration directives, either through the redis.conf file, or through the redis-cli as explained in Section 1 Unit 2. And of course you can configure how often you want to create a snapshot. Here’s an excerpt from the redis.conf file showing the default values.

As an example, this configuration will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed in that period. While snapshotting is a great strategy for the use cases explained above, it leaves a huge possibility for data loss. You can configure snapshots to run every few minutes, or after X writes against the database, but if the server crashes you lose all the writes since the last snapshot was taken. In many use cases, that kind of data loss can be acceptable, but in many others it is absolutely not. For all of those other use cases Redis offers the AOF persistence option.

AOF, or append-only file works by logging every incoming write command to disk as it happens. These commands can then be replayed at server startup, to reconstruct the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. The AOF approach provides greater durability than snapshotting, and allows you to configure how often file syncs happen.

Depending on your durability requirements (or how much data you can afford to lose), you can choose which fsync policy is the best for your use case:

fsync every write: The safest policy: The write is acknowledged to the client only after it has been written to the AOF file and flushed to disk. Since in this approach we are writing to disk synchronously, we can expect a much higher latency than usual.
fsync every second: The default policy. Fsync is performed asynchronously, in a background thread, so write performance is still high. Choose this option if you need high performance and can afford to lose up to one second worth of writes.
no fsync: In this case Redis will log the command to the file descriptor, but will not force the OS to flush the data to disk. If the OS crashes we can lose a few seconds of data (Normally Linux will flush data every 30 seconds with this configuration, but it's up to the kernel’s exact tuning.).

The relevant configuration directives for AOF are shown on the screen. AOF contains a log of all the operations that modified the database in a format that’s easy to understand and parse. When the file gets too big, Redis can automatically rewrite it in the background, compacting it in a way that only the latest state of the data is preserved.