Most of you know me as a blog writer for the Altaro Hyper-V blog, but I began my relationship with Altaro as a customer. One of the features that impressed me right from the start was Reverse Delta. “Delta” in backup jargon just means “difference”. We don’t simply say “difference” because that would invite confusion with the “differential” backup method. Reverse Delta technology allows you to reduce the size of your backups and perform restores more quickly.
What is Reverse Delta?
To understand Reverse Delta, you first need to understand delta. When a file is backed up, it requires as much space on backup media as it does on live media. Keeping a unique backup of a file each time it changes can consume a great deal of media space, especially for frequently-changing files. Compression algorithms help reduce the space utilization, but they have never lived up to their hype. Several alternative techniques have been introduced over the years, but one of the most effective is “delta”. When a file changes, rather than back up the entire file again, only the changed bits are kept. If a restore is ever necessary, the original file is recovered and then all of the changes are applied to it in order (usually with some calculations to skip right to the final version of each bit) until the file is restored to the condition it was in at the time the desired backup was taken. Due to the overhead of working with individual bits, deltas are typically handled in blocks.
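To make the idea concrete, here is a minimal Python sketch of a block-level forward delta. It is an illustration only, not how any particular backup product stores data. The block size, the function names, and the assumption that every version of the file has the same length (as with a fixed-size virtual disk) are all hypothetical choices made for the example.

```python
BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to read


def split_blocks(data: bytes) -> list[bytes]:
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def make_delta(old: bytes, new: bytes) -> dict[int, bytes]:
    """Return {block index: new content} for every block that changed."""
    if len(old) != len(new):
        raise ValueError("this sketch assumes both versions have the same length")
    old_blocks, new_blocks = split_blocks(old), split_blocks(new)
    return {i: blk for i, blk in enumerate(new_blocks) if blk != old_blocks[i]}


def apply_delta(base: bytes, delta: dict[int, bytes]) -> bytes:
    """Overlay the changed blocks onto a copy of the base version."""
    blocks = split_blocks(base)
    for index, block in delta.items():
        blocks[index] = block
    return b"".join(blocks)


# One full backup followed by two delta backups.
full = b"AAAABBBBCCCCDDDD"
monday = b"AAAAbbbbCCCCDDDD"
tuesday = b"AAAAbbbbCCCCdddd"
deltas = [make_delta(full, monday), make_delta(monday, tuesday)]

# Restore Tuesday: start from the full backup and apply every delta in order.
restored = full
for delta in deltas:
    restored = apply_delta(restored, delta)
print(restored == tuesday)
```

Each delta here holds a single four-byte block, while the full backup holds all sixteen bytes. The restore still has to start from the full copy and walk forward through every delta.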
This is a visual of one possible delta implementation:
The big thing to notice is how much less space is consumed by the delta backups than by the full backup. Delta is a tried-and-true solution that does a good job of preserving space on backup media. The full backup file is still necessary, of course, and the intermediate backups might be required as well, depending upon the delta technique in use.
The folks at Altaro looked at delta, and they looked at how things go in typical restores, and realized something: most restores are trying to get to the most recent version of a file. The older a backup gets, the less likely it will be used in a restore. With the traditional delta method, that means it’s likely that you’ll need to use multiple backup media to restore any given file. So, they came up with Reverse Delta as an answer. Reverse Delta works by keeping the full copy at the most recent backup, rather than at the oldest end of the chain.
Before we proceed, I need to make it clear that I do not work directly for Altaro software. I am on a completely different continent than the brilliant people that develop this software. I do not drive into the office and talk shop with them. Just like you, I only know as much about Reverse Delta as Altaro has made public. Fortunately for us, they aren’t hiding anything. If you want to read what they have to say on the matter, this is the official page, which includes a link to a PDF that diagrams the entire process. Since they’ve already done such a good job documenting it there, I’m only going to do the short form here for the sake of continuity. You already understand deltas, so you don’t really need a further in-depth explanation.
The Altaro Reverse Delta Process
This is how Reverse Delta operates:
- The first backup is taken. This is a full backup, so the entire VHDX is captured and saved to backup media.
- The second backup knows about the first backup. So, it captures only the parts that have changed. In older versions of Altaro VM Backup, that meant a manual scan of the VHDX. In newer versions, it performs changed block tracking (CBT), so the scan time is significantly reduced. The effect is the same, though. Only changed blocks are captured and sent to the Altaro VM Backup server.
- As the second backup is written to disk, the changed blocks are combined with the data saved from the first backup, and the result is saved as backup number two. Backup 1 then keeps only the original versions of the blocks that changed, which is all that is needed to rebuild it from backup 2.
This process then continues each day so that the latest backup is always a full backup being built by combining the previous full backup with that cycle’s changed blocks.
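The steps above can be sketched in a few lines of Python. This is a simplified model of the general reverse delta idea, not Altaro’s implementation. The class name, the block size, and the in-memory storage are hypothetical, and the changed blocks are passed in directly rather than detected by a scan or by CBT.

```python
BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to read


def split_blocks(data: bytes) -> list[bytes]:
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


class ReverseDeltaChain:
    """The newest backup is always complete; older ones shrink to reverse deltas."""

    def __init__(self, first_full: bytes) -> None:
        self.latest = split_blocks(first_full)
        # Oldest first. Each entry maps block index -> content before the change.
        self.reverse_deltas: list[dict[int, bytes]] = []

    def add_backup(self, changed: dict[int, bytes]) -> None:
        """Merge changed blocks into the full copy, keeping the old blocks."""
        undo = {index: self.latest[index] for index in changed}
        for index, block in changed.items():
            self.latest[index] = block
        self.reverse_deltas.append(undo)

    def restore(self, steps_back: int = 0) -> bytes:
        """steps_back=0 reads the latest full copy directly."""
        if not 0 <= steps_back <= len(self.reverse_deltas):
            raise ValueError("no backup that far back")
        image = list(self.latest)
        start = len(self.reverse_deltas) - steps_back
        # Undo the newest change first, then work backwards in time.
        for undo in reversed(self.reverse_deltas[start:]):
            for index, block in undo.items():
                image[index] = block
        return b"".join(image)


chain = ReverseDeltaChain(b"AAAABBBBCCCC")  # backup 1, a full backup
chain.add_backup({1: b"bbbb"})               # backup 2 changes block 1
chain.add_backup({2: b"cccc"})               # backup 3 changes block 2
print(chain.restore())   # latest backup, no delta processing needed
print(chain.restore(2))  # backup 1, rebuilt by undoing two changes
```

Restoring the latest backup touches no deltas at all. The cost moves to older restore points, which need one reverse delta applied per step back.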
Benefits of Reverse Delta
Like standard delta, Reverse Delta is intended to save space on backup media. As you can see from the above, it uses storage in the same fashion, with the only difference being where the full backup is stored. Just like manufacturers that implement standard delta practices, Altaro recommends periodically taking full backups to reduce restore processing times.
What Reverse Delta does is shift when data combination processing occurs. Let’s say that you’re using a traditional delta backup application. You come to work on a Friday morning and are immediately greeted with an emergency e-mail from the accountant: “I accidentally deleted our payroll spreadsheet this morning and I need to upload my figures to the payroll processor today!” That’s your paycheck on the line! So, you fire up your application and start to restore the spreadsheet from last night’s backup. The first thing that your application needs to do is dig back to the first full copy of the file. Let’s say you take full backups every Sunday. So, it will need to go back five days (depending on your cycle) to retrieve that backup. Then, it will need to scan Monday’s backup, Tuesday’s backup, Wednesday’s backup, and Thursday’s backup for changes to that file. It will integrate all of these changes into the latest version, and then restore your file. Crisis averted!
The question is, how long does that take? That will depend on several factors, of course. What’s certain, though, is that you needed more than one day’s backup to retrieve data that was recorded less than a day ago, and the system is doing all of its calculating and file crunching while you’re sweating over whether or not you’re going to be able to pay your mortgage on time this month. With Reverse Delta, Altaro VM Backup only needs to perform a direct read from last night’s backup. All of the difficult file processing was done while you slept.
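A rough way to compare the two approaches is to count how many backup sets a restore has to read. The sketch below assumes the simplest case, where a forward delta restore needs the full backup plus every delta taken since, and a reverse delta restore needs the latest full plus one reverse delta per step back. The function names are hypothetical, and real products may shortcut some of this work.

```python
def sets_read_forward(deltas_since_full: int) -> int:
    """Forward delta: the full backup plus every delta taken after it."""
    return 1 + deltas_since_full


def sets_read_reverse(backups_before_latest: int) -> int:
    """Reverse delta: the latest full plus one reverse delta per step back."""
    return 1 + backups_before_latest


# Full backup on Sunday night, then one backup each night through Thursday.
nights = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday"]
for age, night in enumerate(reversed(nights)):
    since_full = len(nights) - 1 - age
    print(f"{night:<9} forward={sets_read_forward(since_full)} "
          f"reverse={sets_read_reverse(age)}")
```

For the Friday morning scenario, restoring Thursday night’s backup reads five sets with forward delta and one with reverse delta. The numbers flip for Sunday’s backup, which is the trade that Reverse Delta makes on purpose.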
How to Set Reverse Delta
By default, Reverse Delta runs for a maximum of thirty days. After that, the next backup is not replaced by deltas. Instead, it is kept as a full backup. It will then be used as the reference point if you wish to restore any data from backups prior to it. Reverse Delta is configured per virtual machine. To access its settings:
- Open Altaro VM Backup and connect to the backup system.
- Under the Setup tree, click Advanced Settings.
- In the main pane, locate the VM that you wish to modify.
- Click the number that appears in the Reverse Delta column. It will become editable; change it to your desired number.
- Change other virtual machines as desired. Alternatively, you can use the Modify All link to change the global policy or Modify for host to change all virtual machines on a specific host.
- Click Save Changes in the lower right when you are satisfied.
This is a “going forward” modification. Old backups are not changed.
The number that you set specifies the maximum number of Reverse Delta backups that can be taken before a full backup will be saved in its entirety and not replaced on the next cycle by deltas. The smaller the number that you use, the more frequently full backups will be kept.
How to Set Retention Policy
This article is not specifically about retention policy, but it is highly related to the Reverse Delta setting. To modify retention policy, just go to the Retention Policy page under the Setup heading.
Drag each virtual machine into the slot for the policy that you wish to apply to it. If you don’t see a policy that you like, use the Add New buttons to build policies that suit you.
Reverse Delta Strategies
Now that you know what Reverse Delta is, you can start thinking about how to apply it optimally within your environment. A quick overview:
- There isn’t a perfect one-size-fits-all approach, but no one says that you must be perfect.
- Think on the VHD/X scale, not the individual file scale.
- The two factors that most influence how you design your Reverse Delta scheme are data churn rates and available backup media space.
- Consider your retention plan when adjusting Reverse Delta settings.
- Watch your usage meter.
Altaro VM Backup Looks at VHD/X Files
Remember not to treat Altaro VM Backup the same way that you would a traditional in-operating system backup program. When we talk about changes and data churn, we’re talking about the blocks of a VHD/X file. If there are a few dozen files that are changing all of the time and they cumulatively consume 100 megabytes on a 100 GB VHDX, that’s not something that you need to worry about. If there are 60GB worth of files changing on that same 100GB VHDX, that’s worth taking the time to architect a Reverse Delta strategy around.
The Effect of Data Churn on Reverse Delta
Deltas/Reverse Deltas are only captured at the moment that the backup is taken. If a large portion of a file is being changed often, then its deltas can easily wind up being at or near the same size as the original. That means that delta/Reverse Delta won’t save you very much space. Worse, restoring to a particular point in time requires all of those deltas to be processed in order from the nearest full backup. For Altaro VM Backup, that won’t be too bad as long as the restore target is fairly recent. For standard delta backup applications, it won’t be too bad if the target restore date is fairly close to a full backup. This leads to our first recommendation:
For data that changes frequently, use full backups more often.
Frequent full backups are especially recommended for database servers that see meaningful amounts of writes. Due to the way that SQL works, there will be data changing in the database’s file and in logs for every modification action (CREATE, INSERT, UPDATE, and DELETE statements). Since virtual machine backup software is examining the .VHD/X file, even transient files like .TRNs will cause deltas to be generated even though they may not be of much use. I would caution you not to automatically lump all SQL databases into any “high churn” category, though. Many SQL databases do very little work. Some are very read-heavy. SQL statistics is a large topic and one that I am not especially well-versed in, but this might get you started (Microsoft SQL Server):
SELECT * FROM sys.dm_io_virtual_file_stats(NULL,NULL);
The Effect of Available Media Space on Reverse Delta
Before you can worry much about your media’s space, make sure you spend some time on the previous section regarding data churn. That will determine how Reverse Delta is going to make use of your space. For VHD/X changes that generate only a few deltas, your space utilization will be dominated by how frequently full backups are taken. If your system has a great deal of data churn, then your deltas may not be significantly smaller than your full backups. There are three recommendations:
- Because large deltas are still smaller than full backups, consider using more delta for high-churn VHD/Xs when backup media space is at a premium. This will result in lower overall space utilization but with higher-than-average times required for restore operations.
- Reducing your retention policy length will be the best way for you to conserve space used by high-churn VHD/X files.
- Use compression. It may or may not save a lot of space, but it will certainly work better if it’s on.
With both Reverse Delta and compression, you are trading computational power for consumed space. If compression is off and you don’t use Reverse Delta, there is no space savings but no calculation time. If you use compression and very long Reverse Delta settings, you will use the least amount of space, but you will spend more CPU cycles calculating compression and deltas.
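To put rough numbers on that trade, the following Python sketch estimates backup media usage from a handful of inputs. Everything in it is an assumption for illustration: the function name, the 100 GB full size, the 2 GB delta size, and the compression ratio of 0.5 are made-up figures, not measurements from Altaro VM Backup.

```python
def estimated_space_gb(full_gb: float, delta_gb: float, fulls: int,
                       deltas: int, compression_ratio: float = 1.0) -> float:
    """Estimate media usage.

    compression_ratio is stored size divided by original size,
    so 1.0 means compression is off.
    """
    return (fulls * full_gb + deltas * delta_gb) * compression_ratio


# Thirty retained backups of a hypothetical 100 GB VHDX with 2 GB of daily change.
no_savings = estimated_space_gb(100, 2, fulls=30, deltas=0)
with_delta = estimated_space_gb(100, 2, fulls=1, deltas=29)
with_both = estimated_space_gb(100, 2, fulls=1, deltas=29, compression_ratio=0.5)

print(f"all full backups, no compression: {no_savings:.0f} GB")
print(f"one full plus deltas:             {with_delta:.0f} GB")
print(f"deltas plus compression:          {with_both:.0f} GB")
```

Try raising the delta size toward the full size to see how a high-churn VHD/X erodes the savings.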
The Effect of Retention Policy on Reverse Delta
Retention policies determine how long data is kept. In conjunction with your Reverse Delta policy, they determine the total number of full backups and delta captures. If I set a Retention Policy of six months and take full backups every 30 days, then, depending on how those line up in any given time frame, I’ll have as many as six full backups and somewhere around 175 separate Reverse Delta backups. If I were to reduce the Reverse Delta policy to every fifteen days, I’d have twelve full backups and around 170 Reverse Delta backups. A longer Reverse Delta policy results in reduced space consumption at the expense of longer restore times. The related recommendations are:
To conserve maximum space, use a longer Reverse Delta policy with a shorter retention policy.
To balance space usage and restore speed, use a longer Reverse Delta policy with a longer retention policy.
To ensure that restores occur quickly, use a shorter Reverse Delta policy.
Balancing a smart retention policy against a smart Reverse Delta policy is the best way to control your data usage.
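The counts in the example above are simple arithmetic, and a short Python sketch makes it easy to try other combinations. It assumes one backup per day and treats six months as 180 days. The function name is hypothetical, and the real totals will shift slightly depending on how the full backups line up with the retention window.

```python
def backup_counts(retention_days: int, full_every_days: int) -> tuple[int, int]:
    """Return (full backups, Reverse Delta backups) kept within the window."""
    if retention_days < 1 or full_every_days < 1:
        raise ValueError("both values must be at least 1")
    # The newest backup is always a full copy, so there is never fewer than one.
    fulls = max(1, retention_days // full_every_days)
    deltas = retention_days - fulls
    return fulls, deltas


for interval in (30, 15):
    fulls, deltas = backup_counts(180, interval)
    print(f"full every {interval} days: {fulls} full, {deltas} Reverse Delta")
```

With these assumptions, a 30-day interval gives 6 full and 174 Reverse Delta backups, and a 15-day interval gives 12 and 168.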
Watch Your Usage Meters
Nobody is perfect. You may not build the best policy your first time out. Don’t worry about that. Hindsight is always easier to work with, and it helps when you have nice charts to look at.
You’ve probably already seen the charts on the Altaro VM Backup dashboard. If you haven’t already, spend some time going through them to see what they can offer.
The best way to determine how a virtual machine should be dealt with is by seeing how it is currently using your data. In the dashboard, on the top right graph, click the Data Backed Up / Day button. It’s the middle button at the left of that particular chart. On the right, choose a cluster or host, then choose a virtual machine to look at.
The tooltips are extremely helpful as they show the compressed and uncompressed statistics for any given day.
I can then switch over to the Total Backup Size / Day graph for the same virtual machine. What this graph is showing me is the total amount of space consumed by this virtual machine’s backup on any given day.
What I see on this virtual machine is that my deltas are working very well. My high-water mark for compressed size is 520MB. This virtual machine only has a single full backup. That plus all of the deltas is only about 7 GB. If I were to set it with another full, it would jump to around 12 GB of consumed space. If backup media space is a concern for me, I would not want to shorten the Reverse Delta length. However, I also need to understand that if I wish to restore to the July 07 date, I will need to step through every single data point on the graph, which would certainly take quite a bit of time.
1 thoughts on "Hyper-V Backup Strategies: Full vs. Reverse Delta"
Information seems to be obsolete for 7.6. Force full Backup seems to be no longer possible (unless all Backups are removed with Free Up Disk Space) to start from scratch