Introduction
Any data that's of value needs some sort of disaster recovery plan. The trick is to anticipate the type of failure, determine how much "loss of data" (if any) you are willing to accept, and then develop your recovery plan accordingly.
A philosophical note: There is no value in performing backups (Yes, that's what I said!), the value comes from performing a restore. That might seem a bit corny, but it's a good thing to keep in mind.
One more philosophical note: A disaster recovery plan that has never been practiced is almost worthless... practice makes perfect.
Background
What types of disaster are you planning for?
There are a lot of things that can go wrong... and your disaster recovery plan needs to address several basic types of disasters
- Physical loss: The data center burned down, somebody stole the servers, etc.
- Physical failure: The server or disks have died
- Malicious: A virus or malicious user has erased your database content
- Software failure: A patch has failed and the server won't boot
- User errors: A user accidently clobbered a gallizion rows in the table
The reason for pointing all this out is that the recovery plan is different for each type of disaster. For example, you plan may have to include installing SQL Server onto a new hardware platform.
How much data loss are you willing to accept?
You manage the amount of risk for data loss by the SQL Server's "Recovery Model" and the frequency and type of backup.
- None: Requires a 2nd fail over cluster and Full recovery model
- 5-10 minutes: Use full recovery model and transaction log backups
- 4 hours: Use bulk logged recovery model and transaction log backups
- 1 day: Use simple recovery model and database backups
Note: Fail-over clustering and other "hot standby" techniques are not covered by this article
Database Recovery Models
Each database operates in one of three "Recovery Models"... which are designed to allow you some flexibility on configuring the amount of disk space required to keep the database safe from disaster.
- Simple - doesn't use a log file for transactions
- Bulk-logged - records most transactions individually, but others in "bulk" fashion
- Full (the default) - records every transaction in a log file
When to use Simple Recovery model:
Let's say you have a database where you are only concerned about the "current" version of the data. This scenario is similar to a file-system (NTBACKUP) plan... where your major concern is to quickly get the database back to a usable state... and you're not too concerned about losing the data that has occurred since the last backup.In this scenario, you'd rarely (if ever) be asked to restore the database to a particular "point in time"... so the full backups are sufficient, and you probably won't need transaction logs at all. This scenario is perfect for a developmental "play" database.
When to use Full/Bulk-logged Recovery model:
The use of the Simple Recovery model in a production database is extremely rare. The more likely scenario is that you'd be asked to restore the database to a particular point in time.For example, the Server and the Database itself are running normally, but somebody accidently deleted a table from inside a database application. That's different from a "failure recovery"... because nothing has failed! This Full recovery model requires transaction logs, so that you could restore the database to the exact instant before the "user accident" occurred. The Bulk-logged Recovery model is similar but only supports a restore to the end of the log backup (not a "point in time" between log backups)
Recovery Model summary:
Recovery Model | Requires a large transaction log file? | Restores to |
---|---|---|
Simple | No | Last database backup |
Bulk-logged | Yes* | Last log backup |
Full | Yes | Point in time |
Backup Scenarios
Let's start with an example backup plan that has the following components:
- Full database backups daily at midnight, retain for 4 weeks
- Differential database backups every 6 hours, retain for 2 weeks
- Log backups every 15 minutes, retain for 1 week
- External backup (NTBACKUP, etc.) of SQL backup files
This plan allows for an archive of the data for 4 weeks (for security and auditing purposes) and has the ability to perform a "point in time" restore for practically any time in the last week. The use of an external backup means the SQL backup files (the *.bak and *.trn files) are themselves backed up and placed off site.
A note about retention times
Some folks will look at that example and say "Geez, those retention times are very long, I don't think I've got enough disk space for that".
The problem is that some disasters go unnoticed for a very long time. For example, if a malicious user damaged your database in a subtle way, it might take you several days (perhaps several weeks) to discover it. You need to retain your backups to cover the time it takes to discover and audit those kinds of problems.
Note : Log files are required for a forensic audit of database activity.
Recovery Operations
As I mentioned as a philosophical note, it's the recovery operation that really matter. So let's look at how you would perform a recovery using the sample backup plan above.
Consider this scenario: Today is Friday, and you must restore the database exactly what it looked like at 7:30 PM on Thursday. The following are the steps required:
- Restore the full backup from Thursday at midnight
- Restore the 6 AM, 12 Noon, and 6 PM differential backups
- Restore the 6:15 PM to 7:30 PM log files
So, that's 1 full, 3 differentials, and 5 log files.... a total of 9 operations
You should always consider the number of operations that are required for a restore. Remember, during a disaster, there will be chaos... the phones will be ringing, people will be complaining, and that's not the time to have to stop and remember "Oh wait, did I restore log file #22 or was it #23?" or "I forgot to use NORECOVERY and I have to start all over again". Trust me... simpler is better.
Point in Time Restore
An the name implies, a "Point in Time" restoration allows you to recovery a database to an exact instance in time... by either by specifying the time of day or a "named transaction". This type of recovery requires that the database use the Full recovery model.
SQL Server Maintenance Plans
You'd typically perform this step from inside a Maintenance Plan: