The other day I ran into a situation where my differential backups suddenly started to fail. This was on a SQL 2005 cluster. It sounded quite unusual, and reading the actual error message made me even more curious. We use SQL LiteSpeed for all SQL instances earlier than SQL 2008 because of its compression feature and the disk space savings that it offers. I started to see errors similar to the ones below.
Msg 61700, Level 16, State 1, Line 0
VDI open failed due to requested abort.
BACKUP DATABASE is terminating abnormally.
Cannot perform a differential backup for database “XXX”, because a current database backup does not exist. Perform a full database backup by reissuing BACKUP DATABASE, omitting the WITH DIFFERENTIAL option.
At first I thought this was a strange one, especially as I knew that the full backup had just completed. What’s also odd is that whenever I ran a full backup by hand and kicked off a differential straight afterwards, it always worked, but whenever the differential ran on its schedule, it always failed. Hmmm, how weird…
Something was clearly wrong somewhere, so I took the unusual step of re-installing the SQL LiteSpeed binaries. This didn’t help. I then checked the SQL Agent job list to make sure that there weren’t any other jobs that could be running backups. I didn’t see any, so I decided to query the backup history in the msdb database to check whether any other backups were being taken.
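A minimal sketch of that kind of msdb check, in Python. The T-SQL string reads the standard backup history tables msdb.dbo.backupset and msdb.dbo.backupmediafamily and can be pasted into any SQL client. It is not the exact query I ran, and the helper names and the sample rows are made up for illustration.

```python
# T-SQL for recent backup history of one database ('XXX' is a placeholder name).
BACKUP_HISTORY_SQL = """
SELECT bs.database_name,
       bs.backup_start_date,
       bs.type,              -- D = full, I = differential, L = log
       bs.is_snapshot,       -- 1 = taken as a snapshot backup
       bs.is_copy_only,
       bmf.device_type,      -- 2 = disk, 5 = tape, 7 = virtual device
       bmf.physical_device_name
FROM msdb.dbo.backupset AS bs
JOIN msdb.dbo.backupmediafamily AS bmf
  ON bmf.media_set_id = bs.media_set_id
WHERE bs.database_name = 'XXX'
ORDER BY bs.backup_start_date DESC;
"""

# Common device_type codes from backupmediafamily.
DEVICE_TYPES = {2: "disk", 5: "tape", 7: "virtual device"}


def device_name(device_type):
    """Translate a device_type code into a readable label."""
    return DEVICE_TYPES.get(device_type, "other ({})".format(device_type))


def snapshot_fulls(rows):
    """Return full, non-copy-only backups that are flagged as snapshots."""
    return [
        r for r in rows
        if r["type"] == "D" and r["is_snapshot"] and not r["is_copy_only"]
    ]


# Made-up rows in the shape the query returns, for illustration only.
sample_rows = [
    {"type": "D", "is_snapshot": 0, "is_copy_only": 0, "device_type": 2},
    {"type": "D", "is_snapshot": 1, "is_copy_only": 0, "device_type": 7},
    {"type": "I", "is_snapshot": 0, "is_copy_only": 0, "device_type": 2},
]

for row in snapshot_fulls(sample_rows):
    print("unexpected full backup on", device_name(row["device_type"]))
```

Any full backup that shows up here and that you did not schedule yourself is worth asking about.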
What the backup history showed was that some of the backups were being written to the file system, but others were being written to what looked like a tape device. That was rather strange, because our backup policy is to write to the file system and let the tape infrastructure pick up the files from there.
I got in touch with the backup administrators to make enquiries, as I thought they might have set up a direct tape backup of the database, which was then causing the LiteSpeed full backup not to be recognised when the differential kicked off.
Then I also noticed the following in the SQL error log.
I/O was resumed on database DBA. No user action is required.
Hmmm, this hit me like a ton of bricks. I had last seen this message a few years ago at a different company, where we took SAN snapshots that use VSS technology on the database disk volumes. This effectively freezes the I/O on the databases and then takes snapshots of the disk volumes. SQL Server records such a snapshot as a backup in its own right, which would explain why a differential scheduled after the tape job no longer lined up with the LiteSpeed full backup. Armed with this information, I asked the tape backup administrators to check the tape backup config again to make sure that VSS wasn’t enabled, and voilà, that was the problem. I got the administrator to disable the VSS option and back up the drive again. He did, and this time I didn’t see any record logged in the msdb database, nor any message in the SQL error log.
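If you want to check an error log for these snapshot messages quickly, here is a small Python sketch. It assumes you have already read the log into a list of text lines. The function name is made up, and the pattern is a simplification that won't cope with database names containing spaces or dots.

```python
import re

# Matches the freeze/resume messages logged around a snapshot backup.
SNAPSHOT_IO = re.compile(r"I/O (is frozen|was resumed) on database (\S+?)\.")


def snapshot_events(log_lines):
    """Return (database, state) pairs for each freeze/resume message found."""
    events = []
    for line in log_lines:
        match = SNAPSHOT_IO.search(line)
        if match:
            state = "frozen" if match.group(1) == "is frozen" else "resumed"
            events.append((match.group(2), state))
    return events


# The message from my own error log, as quoted above.
print(snapshot_events(["I/O was resumed on database DBA. No user action is required."]))
```

Freeze/resume pairs at times when none of your own jobs are running are a good hint that something outside SQL Server is taking snapshot backups.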
I also stumbled upon this related KB article from MS: http://support.microsoft.com/kb/903643