8th May 2017 days
Backup and Restore are critical concepts to understand when working with databases in SQL.
Whether you are new to SQL or an experienced database analyst you never know when a disaster might occur. It could be a hardware failure, a data issue or something else which causes your system/application to crash. If it does so you need to be confident that you can recover your data. This is especially true if you are still learning SQL and so could make errors.
As a database administrator, you should routinely be taking a backup of your database.
That may be every month, every week, daily or hourly depending on the importance of your data and amount of work being carried out. This then allows you to restore a backup if disaster strikes.
For example, if you are managing a finance related database, this is high-value data and so a backup should be taken at least every 15 minutes and probably more frequently than that.
The backup process itself does not need to be onerous as it can be automated by scheduling them in a SQL agent job. We discuss choosing the optimal backup strategy at the end of this article as well as looking at a couple of useful tools to help you automate this process. SQL backups are simple to automate but, as ever, it is important to understand a process before you automate it.
SQL backups are simple to automate but, as ever, it is important to understand a process before you automate it. It is also vital to test your backups so that you know that your system works.
There are a number of different types of backups available in SQL server.:
Full, Differential and Transaction Log backups all perform critical roles in a disaster recovery scenario. Let’s look at each in turn.
A full database backup backs up the whole database as it says! This includes part of the transaction log so that the full database can be recovered when it is restored.
A transaction log is a sequential record of all changes made to the database while the actual data is contained in a separate file.
Each database has at least one physical transaction log file (.LDF) and one data file (.MDF). A full database backup backs up both files: transaction log and data file. In other words, full database backups represent the complete database at the time the backup finished.
With the SIMPLE recovery model, the transaction log will be automatically truncated or deleted when it runs a checkpoint runs or is execute. With a SIMPLE recovery
With the SIMPLE recovery model, the transaction log will be automatically truncated or deleted when it runs a checkpoint runs or is execute. With a SIMPLE recovery model a full transaction log is not maintained. This allows you to restore your data to its most recent state but because the transaction log is lost you can not restore to an earlier point in time.
FULL or BULK LOGGED:
With the FULL or BULK LOGGED recovery model, a transaction log will be maintained and log backups will be taken along with a full database backup. Though we can manually truncate transaction logs using the database shrink option if we have a space issue, although this is not recommended.
Following steps are performed by MS SQL server while taking a full database backup. It
A full database backup can be performed in two different ways i.e using the GUI or through a T-SQL script.
You can run the below script to take a full backup of the “AdventureWorks” database and sent the backup file to the specified path (‘C:\Backups\AdventureWorks.BAK’).
Using the GUI you use the following steps to take a full backup of “AdventureWorks” database and send the backup file to the specified path (‘C:\Backups\AdventureWorks.BAK’).
A differential backup is based on the most recent, previous full database backup. A differential backup captures only the data that has changed since the last full database backup.
This means that if a previous full database backup was taken two days before, then you will find only the data that has changed in a differential backup. A previous full database backup is needed if you want to restore a differential backup.
The primary purposes for creating a differential backup is to reduce the number of log backups. This means that differential backups take up less space than full backups. For this reason it is standard practice to take more differential backups than full backups.
Again a differential backup can also be performed with any of the three standard recovery models: SIMPLE, FULL or BULK LOGGED. They can be managed through a GUI or by using a T-SQL script.
The file extension of a differential backup file is .DIF
The below script will take a differential backup of the “AdventureWorks” database.
The below steps in a GUI will carry out a differential backup of the “AdventureWorks” database.
A transaction log is a sequential record of all changes made to the database while the actual data is contained in a separate file. Every database has at least one physical transaction log file (.LDF) and one data file (.MDF).
A transaction log backup backs up ONLY the transaction log file (.LDF). Like a differential backup, a previous full database backup is needed for restoring transaction log backups. Assuming that restoration is possible it is possible to do ‘point in time recovery’ with a transaction log backup.
From this, it is logical that a transaction log backup can not be carried out if the database is in SIMPLE recovery mode. It only works with FULL or BULK LOGGED recovery model.
Whenever a change is we made to a database, SQL Server logs that entry into transaction log. Each entry has unique number called a LSN (Log Sequence Number). Using this LSN, we can rewind the changes made to a database so that it is exactly as it was at that point of time. It is critical that the LSN sequence isn’t broken. f it breaks, we will need to take a full backup before going back to transaction log backups, otherwise recovery will be impossible. LSN numbers start from 1 and keep increase with each change.
A transaction log backup occupies less space than a full or differential backup and so they can be taken more frequently. Transaction log backups are often carried out at very short intervals (Eg every 15 minutes).
The file extension of a transaction log backup file is .TRN
As mentioned above you can not carry out transaction log backups on a database in SIMPLE recovery mode. If you need to change the recovery mode of your database, you can run below script to change a recovery model from SIMPLE to FULL.
You can check the recovery model of a database from database properties.
Now you can perform a transaction log backup.
You can run below script to take a transaction log backup of “AdventureWorks” database.
You can perform following steps to take a transactional backup of “AdventureWorks” database.
A filegroup backup allows you to take a backup of all data files within a SQL server filegroup individually. The key difference between a full database backup and a filegroup backup is that you can take a backup of individual data files using a filegroup backup. With a full database backup you can only take a backup of a whole database, there is no option to take a backup of individual data files. See here for more on filegroup backups.
There are two primary reasons for creating multiple filegroups for data files: performance and recovery. Performance improves when you spread files across multiple disks because you have multiple heads reading and writing your data, rather than one doing all the work and often acting as a bottleneck.
Filegroups can be backed up and restored separately as well. This enables faster object recovery in the case of a disaster. It can also help the administration of large databases.
A filegroup is a logical structure to group objects in a database. For example, If you have a database with multiple tables, you can put these tables under separate filegroups (based on data within that tables) like all employee related tables under one filegroup, order related tables under another filegroup and so on.
By default PRIMARY filegroup is created when you create a new database and all data files are kept under that filegroup. You need to create additional filegroups if you want to split data files/tables. You can run below script to add a new filegroup (Employees_FileGroup) to an existing database.
You can find a newly added filegroup under Database properties.
Right click on a database name and select “Database Properties” to open the window below.
Now the next step is to create a new data file for the Employees_FileGroup Filegroup and assign the newly created data file to Employees_FileGroup. The script below will add a new data file (EmployeeData.mdf) to Employees_FileGroup
You can find a newly added data file under Database Properties > Files.
Now the next step is to create a new employee table and move it into the Employees_FileGroup. Note that the table must have a cluster index on it.
Now that we have created a filegroup let’s look at how we can take a backup of an individual filegroup as well as all the filegroups in a database. Again we’ll look at doing this using the GUI or T-SQL directly.
You can take a backup of Employees_FileGroup using below script.
Note: You can take a backup of all filegroups by not specifying FILEGROUP in the above query
You can perform following steps to take a backup of Employees_FileGroup. Note you can select multiple filegroups or all filegroups from the list (shown in image below).
6. Select “Disk” as the destination
7. Click on “Add…” to add a backup file and path
8.Click “OK” again to create the backup and repeat for other filegroups
Partial backups are similar to file and filegroup backups. They allow us to back up only a subset of the data files that comprise a database.
However, whereas file and filegroup backups allow us to backup specific, individual files and filegroups, partial backups will make a backup copy of the primary filegroup and all read/write filegroups, omitting by default any filegroups designated as READONLY.
This means that partial backups are only relevant for databases that contain read-only filegroups; otherwise a partial backup will capture exactly the same data and objects as an equivalent full database backup.
Partial backups were designed to reduce the backup footprint of large databases that contain a high proportion of read-only data. There is no need to back up the same data each night if we know that it can not have been changed since it is read-only.
Partial backups can be either FULL ( with file extension .BAK) or DIFFERENTIAL (with file extension .DIF) as discussed above. They can not be used for a transaction log backup for obvious reasons.
The partial backup can only be carried out using T-SQL. There is no GUI option available for partial backup. To see more on partial backups see this article.
Choosing an optimal backup strategy is crucial for disaster recovery. The three key questions when defining a database backup strategy are:
Once we have these answers it is easy to define a strategy.
Full Backup Only
Full + Differential Backup
Full + Differential + Transaction Log Backup
Always remember that you must follow the following backup sequence.
Full Backup → Differential Backup → Transaction Log Backup
For more detailed coverage of testing your backup strategy see this article.