At meetups, I am frequently asked this question by Azure developers who have created hundreds or thousands of containers in an Azure storage account: how can they back up the complete storage account?
It is a common question, and many people have asked it.
The short answer is that, practically speaking, it is not possible to back up an entire Azure storage account in one step. What we need to do instead is take snapshots of the blobs in a container and download them as a point-in-time backup.
Fig: Azure Blob Hierarchy
What is a blob snapshot?
According to Microsoft, a snapshot is a read-only version of a blob that is taken at a point in time. Snapshots are useful for backing up blobs. After you create a snapshot, you can read, copy, or delete it, but you cannot modify it.
A snapshot of a blob is identical to its base blob, except that its URI has a DateTime value appended to indicate the time at which the snapshot was taken.
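To make the URI difference concrete, here is a minimal Python sketch that builds a snapshot URI from a base blob URI. It assumes the DateTime value is carried in a `snapshot` query parameter, which is how Azure Blob Storage addresses snapshots. The account, container, blob name, and timestamp are made-up placeholders, and `snapshot_uri` is a hypothetical helper written for this post, not part of any SDK.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def snapshot_uri(base_blob_uri: str, snapshot_time: str) -> str:
    """Return the URI of a snapshot, given the base blob URI and the
    DateTime value reported when the snapshot was created."""
    parts = urlsplit(base_blob_uri)
    # Keep any existing query parameters, but drop an older snapshot value.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "snapshot"]
    query.append(("snapshot", snapshot_time))
    # safe=":" leaves the colons in the timestamp unescaped for readability.
    return urlunsplit(parts._replace(query=urlencode(query, safe=":")))


# Placeholder values, for illustration only.
base = "https://myaccount.blob.core.windows.net/mycontainer/disk.vhd"
print(snapshot_uri(base, "2024-01-15T10:30:00.0000000Z"))
```

The base blob and the snapshot share everything up to the query string, which is why a snapshot can be read with the same tools as the blob itself.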
Most common use case of snapshots
The most common use case for blob snapshots is a snapshot of a VHD file. A VHD stores the current state of a VM disk. If you have taken a snapshot of a VHD file, you can later create a VM from that snapshot. I am not going to show how to do that in this article, because it is already covered in many videos and blogs. Instead, I will try to explain the underlying mathematics of Azure blob snapshots so that you can understand how they are billed.
We can also back up Azure VM disks using snapshots. This is a common practice, and Azure administrators generally schedule such backups at regular intervals.
Why it’s a cause for concern
Lately I have seen many Azure admins surprised by the bills for their storage accounts. Over time, multiple snapshots had accumulated in those accounts and incurred a large cost. There is math behind every snapshot, and snapshots need to be deleted from time to time to save cost, so it is important to understand how snapshot billing works in Azure.
Understanding Snapshot Billing
To understand this better, let’s take a simple example of identical twins studying in the same school. In my example, a class contains several pairs of identical twins, and the school has a special rule: it charges only a single fee for each pair of identical twins.
In the figure below, there are three students in a class (left side), and each has an identical twin (right side) in the same class. As per the school’s rule, fees are charged only for the three students on the left. The students on the left represent the three blocks of a base blob, and the students on the right represent the same blocks in its snapshot. So if the fee for each student is USD 1000, the total fee to be paid in this case is USD 3000.
Fig: Base blob (left) and snapshot (right)
In technical terms, the figure above shows a base blob with three blocks on the left-hand side and, on the right-hand side, a snapshot holding those same three blocks as they were when it was taken. Because the base blob has not changed since then, charges are incurred only for the three unique blocks.
Now let’s say the 3rd student on the left changes the color of his uniform to green. In this case the school charges fees for four students instead of three, because it treats the student who changed his uniform as another unique student.
Fig: Base blob (left) and snapshot (right)
In technical terms, the base blob has been updated: the 3rd block on the left-hand side has changed, but no new snapshot has been taken. Because the 3rd block has changed, Azure charges for the three blocks held by the snapshot plus the updated 3rd block in the base blob, four blocks in total.
In this scenario, the 3rd student on the left-hand side is completely replaced by a new student, while the identical twins on the right-hand side do not change. The school again charges fees for four students instead of three, because it treats the replacement as another unique student.
In technical terms, the base blob has been updated but the snapshot has not. Block 3 on the left-hand side was replaced with a new block in the base blob, while the snapshot still holds the original block 3. As a result, the account is charged for four blocks.
In this scenario, all the students on the left-hand side have been replaced by new students, while the identical twins on the right-hand side are unchanged. The school therefore treats all six students as unique and charges fees for all of them.
In technical terms, the base blob has been completely updated with a new set of blocks, so none of its original blocks remain, while the snapshot on the right-hand side is unchanged. Azure therefore charges for all six blocks.
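The four scenarios above boil down to counting unique blocks across the base blob and its snapshot. The following Python sketch is a toy model of that rule, not Azure's actual billing code: `billable_blocks` is a hypothetical helper, and the labels such as `b3-updated` are illustrative names for block versions rather than real Azure block IDs.

```python
def billable_blocks(base_blob, snapshots):
    """Count the unique blocks across a base blob and all of its snapshots.

    Each block is represented by a label; two blocks are billed once only
    if they carry the same label (the identical twins in the school example).
    """
    unique = set(base_blob)
    for snapshot in snapshots:
        unique |= set(snapshot)
    return len(unique)


# The snapshot never changes in the four scenarios described above.
snapshot = ["b1", "b2", "b3"]

scenarios = {
    "no change since snapshot": ["b1", "b2", "b3"],
    "block 3 updated": ["b1", "b2", "b3-updated"],
    "block 3 replaced": ["b1", "b2", "b4"],
    "all blocks replaced": ["b4", "b5", "b6"],
}

for name, base_blob in scenarios.items():
    print(f"{name}: {billable_blocks(base_blob, [snapshot])} blocks charged")
```

Passing more than one snapshot in the list extends the same count to a blob with several snapshots, which is where forgotten snapshots start to add up.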
Can we copy the snapshot to a different storage account?
Yes, a snapshot created in one storage account can be copied to a different storage account as a blob. When a snapshot is copied from one storage account to another, the copy has the same size as the base blob and incurs the same storage cost.
What is an incremental snapshot, and why is it considered the best practice at present?
An incremental snapshot is similar to an incremental backup of a database. Once a snapshot has been created from the base blob, an API called GetPageRange can be used to retrieve only the changes made since the previous snapshot was taken. Copying one complete snapshot from one storage account to another can be very slow and can consume a lot of storage space, which increases the storage cost. With incremental snapshot backup, each successive copy of the data contains only the portion that has changed since the preceding snapshot copy was made.
This way, both the time to copy and the space needed to store backups are reduced.
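The idea of copying only what changed can be illustrated with a small local simulation. The Python sketch below does not call Azure at all: it compares two in-memory versions of a disk image page by page and reports the byte ranges that differ, which is roughly the kind of answer a page-range diff against a previous snapshot gives you. `changed_page_ranges` is a hypothetical helper, and the 512-byte page size reflects how page blobs are laid out.

```python
PAGE_SIZE = 512  # page blobs are made up of 512-byte pages


def changed_page_ranges(previous: bytes, current: bytes, page_size: int = PAGE_SIZE):
    """Return (start, end) byte ranges, end inclusive, of pages that differ
    between two versions of the same data. Adjacent changed pages are merged."""
    ranges = []
    length = max(len(previous), len(current))
    for start in range(0, length, page_size):
        end = min(start + page_size, length)
        if previous[start:end] != current[start:end]:
            if ranges and ranges[-1][1] + 1 == start:
                # Extend the previous range instead of starting a new one.
                ranges[-1] = (ranges[-1][0], end - 1)
            else:
                ranges.append((start, end - 1))
    return ranges


# Simulated "previous snapshot" and "current snapshot" of a tiny 4-page disk.
old = bytes(4 * PAGE_SIZE)
new = bytearray(old)
new[600] = 0xFF   # touches page 1
new[1100] = 0xFF  # touches page 2

ranges = changed_page_ranges(old, bytes(new))
to_copy = sum(end - start + 1 for start, end in ranges)
print(ranges, f"copy {to_copy} of {len(new)} bytes")
```

Only the changed ranges need to travel to the backup storage account; the unchanged pages are already present there from the preceding copy.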
If you are building a customized backup solution for Azure blobs, snapshots are the best available option at the moment. Incremental snapshots reduce the amount of data stored and help you manage the storage cost effectively.