T-SQL += started in 2008?!

A few days ago, The Impatient DBA wrote that in Microsoft SQL Server 2008, Microsoft added an old C operator: +=.  Simply put, it adds a value to a variable.  You can run the following code that he provided:

DECLARE @shamwow int = 0;
SET @shamwow += 1;
SET @shamwow += 1;
SELECT @shamwow AS ShamWow;

And @shamwow is now 2.  Without the += operator, we'd be writing SET @shamwow = @shamwow + 1.  Is += better?  It is certainly shorter, but I think that's mainly a matter of personal preference.  If you code in other languages that support it, it will probably feel natural.
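As an aside, += isn't limited to numbers; it also concatenates strings.  A quick sketch:

DECLARE @greeting varchar(20) = 'Sham';
SET @greeting += 'Wow';
SELECT @greeting AS Greeting;  -- returns ShamWow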

But there's more than just adding: there are also operators for the other basic math functions: -=, *=, /=.  Let's continue playing with @shamwow with the following code:

DECLARE @shamwow FLOAT = 3;

SET @shamwow *= 3;
SELECT @shamwow;

SET @shamwow /= 2;
SELECT @shamwow;

SET @shamwow -= 3;
SELECT @shamwow;

SSMS will return 9, 4.5, and 1.5.

Pretty cool, eh?

But there are two problems.  First, this is supported in T-SQL, but not in Access 2013 VBA.  The other is that there's another pair of C operators, ++ and -- (increment and decrement, that second one being minus minus, not a dash).  They add or subtract one from the preceding variable, so you'd write SET @shamwow++ if it were supported.  And these latter two are not supported in either T-SQL or VBA.  I'm sure they're alive and well in C# and probably several other .NET languages.

I’m frankly amazed that this happened EIGHT YEARS AGO and I haven’t seen it in common usage.  It’s been in C for probably as long as there has been C, and now we have it in T-SQL.  Something about new tricks and old dogs comes to mind.

Unused(?) Indexes

Another post on SQL Server Central has a user asking about shrinking a database that apparently has 100 GB of free space and a large transaction log file.  There are several issues to consider, as Jeff Moden pointed out, including the size of the largest index in the database.

Yesterday a person replied that you might look at dropping unused indexes.  But that brings up the question: what determines whether an index is unused?

The problem is that any statistics you pull from sys.dm_db_index_usage_stats only show activity since SQL Server was last restarted, not since the dawn of time.  Brent Ozar explains it well in today's newsletter:

Unused indexes are tricky. When you are analyzing this data, you have to keep in mind that this data is only available since the last restart. If you rebooted the server yesterday and are viewing the data today, you might see a lot of unused indexes in the list. But are they really unused? Or have the indexes not just been used YET? This is a very important point when deciding to disable or drop an index based on this list. If you reboot your servers monthly due to Microsoft security patches, consider reviewing the list the day prior to the reboot. I once dropped an index 3 weeks after the server was rebooted, thinking that the entire application workload must have been run by now. A few days later, I got a call on the weekend that the database server was pegged at 100% CPU utilization. I reviewed which queries were using the most CPU and found that the top query’s WHERE clause matched the index I had dropped. That query only ran once a month, which is why it hadn’t recorded any reads yet. We later moved that monthly process to another server that was refreshed regularly with production data.
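Given that caveat, it's worth checking how long the instance has actually been up before trusting the numbers.  One quick check: tempdb is rebuilt at every startup, so its create_date is effectively the last restart time:

SELECT create_date AS LastRestart,
       DATEDIFF(DAY, create_date, GETDATE()) AS DaysOfUsageStats
FROM sys.databases
WHERE name = 'tempdb';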

Here's some code that can show you which indexes are unused or empty.  An empty index just means that there's no data in that table right now; it may well be populated later, so I would not drop an empty index.  Besides, how much space does an empty index take?

My personal preference is to order the output by table name then index name, and to put a u.* at the end of the SELECT list so the more interesting usage-stat columns can be seen.

--Jonathan Fahey, 23 Sept 2011
-- http://www.sqlservercentral.com/Forums/Topic1068035-2908-2.aspx#bm1164093
SELECT  LEFT(OBJECT_NAME(i.id), 50)         AS TableName,
        LEFT(ISNULL(i.name, 'HEAP'), 50)    AS IndexName,
        CASE WHEN u.object_id IS NULL THEN 'No Data' ELSE 'No Usage' END AS Determination,
        i.dpages                            AS PagesUsed,
        CONVERT(DECIMAL(8,2), i.dpages / 128.0) AS MBUsed
FROM    sys.objects o
        JOIN sys.sysindexes i
            ON i.id = o.object_id
        LEFT JOIN sys.dm_db_index_usage_stats u
            ON  i.indid = u.index_id
            AND u.object_id = i.id
            AND u.database_id = DB_ID()     -- The DMV is server-wide; match only this database.
WHERE   o.type = 'U'                        -- Exclude system tables.
    AND i.name IS NOT NULL                  -- Exclude heaps.
    AND i.name NOT LIKE '_WA_Sys%'          -- Exclude statistics.
    AND (   u.object_id IS NULL             -- Either no usage stats, or...
        OR  (   u.user_seeks = 0            -- ...all usage stats are zero.
            AND u.user_scans = 0
            AND u.user_lookups = 0
            )
        )
ORDER BY i.dpages DESC;
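One more thought: if you do decide to act on the list, consider disabling an index first rather than dropping it outright.  A disabled index stops being maintained, but its definition stays in the catalog, so bringing it back is a one-line rebuild instead of hunting down the original CREATE INDEX statement.  A sketch, with hypothetical names:

-- Stop maintaining the index but keep its definition in the catalog.
ALTER INDEX IX_Suspect ON dbo.SomeTable DISABLE;

-- If a monthly job screams, re-enable it with a rebuild.
ALTER INDEX IX_Suspect ON dbo.SomeTable REBUILD;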

How often do you do transaction log backups?

A post appeared on SQL Server Central two days ago wherein the person asked why the transaction log backups were suddenly taking a long time, having gone from under 12 minutes to over two hours.

Subsequent questioning revealed that they do one transaction log backup daily.

Most experienced DBAs would be kind of aghast at the concept of doing transaction log backups once a day.  If the database is the least bit transactionally active, this represents a potentially devastating data loss if the database falls into the bit bucket.

Obviously, any database will have different recovery objectives for time to recover and acceptable data loss, depending on business needs and the size of the database.  Jeff Moden talks about his system where old data, which must be kept online, is kept in its own filegroups so that only active data is backed up.  That greatly reduces the size and time required for data backups and makes for much faster recovery in the event of an emergency restore: restore the active data, they're back in business, and they can take their time recovering the archived data.  The only scenario I can envision off-hand where a once-daily t-log backup would be acceptable is a database that does an ETL load and is transactionally quiet thereafter.  But that's my observation within my realm of experience.
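For what it's worth, the T-SQL mechanism behind that filegroup strategy is a piecemeal restore.  A rough sketch with hypothetical names, assuming the simple recovery model and a read-only archive filegroup (under the full recovery model you'd also restore log backups before recovering):

-- Stage 1: bring PRIMARY and the active filegroup online first.
RESTORE DATABASE BigDb
    FILEGROUP = 'PRIMARY', FILEGROUP = 'Active'
    FROM DISK = N'X:\Backups\BigDb_active.bak'
    WITH PARTIAL, RECOVERY;

-- Users are back in business; the archive comes back at leisure.
RESTORE DATABASE BigDb
    FILEGROUP = 'Archive'
    FROM DISK = N'X:\Backups\BigDb_archive.bak'
    WITH RECOVERY;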

Brent Ozar had a post two years ago advocating transaction log backups ONCE A MINUTE.  The reasoning is quite interesting, and can be summed up like this: suppose a database produces 60 GB of transaction log changes every hour.  Would you rather back up 60 GB in one big chunk every hour, or 1 GB every minute?  Which would be faster?  How big a log file would you need for one versus the other?  Hopefully you'd have your log pre-sized to handle the load rather than having issues with it growing as needed.
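If you go the frequent route, you'd typically let a SQL Agent job build a timestamped file name so every backup lands in its own file.  A minimal sketch (the path and database name are placeholders):

DECLARE @file nvarchar(260) =
    N'X:\LogBackups\MyDb_log_'
    + REPLACE(REPLACE(REPLACE(CONVERT(nvarchar(19), GETDATE(), 120),
        '-', ''), ':', ''), ' ', '_')
    + N'.trn';  -- e.g. MyDb_log_20160512_140301.trn

BACKUP LOG MyDb TO DISK = @file
WITH COMPRESSION, CHECKSUM;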

Good things to keep in mind when scheduling log backups.  Me, personally, my databases are typically fairly small.  At the end of the business day I would do a final log backup with append, which would then be caught by the overnight backup system.  First thing in the morning I'd do a log backup that initializes the file, and we're back and running.
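In case the append/initialize terminology is unfamiliar: NOINIT appends another backup set to the existing file, while INIT overwrites it and starts fresh.  Roughly, with a placeholder device name:

-- During the day: append each log backup to the existing device.
BACKUP LOG MyDb TO [MyDb_log_bkup] WITH NOINIT;

-- First thing in the morning: overwrite and start a fresh file.
BACKUP LOG MyDb TO [MyDb_log_bkup] WITH INIT;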

I know of a city government that installed an ERP system.  Pretty big install: a couple hundred or more users on at any given time doing all sorts of different jobs, with utility billing runs and lots of other activity going on.  This would be a good installation for a t-log backup every minute.  Let's say you do log backups every 15 minutes and your database crashes 14 minutes after the last backup.  The users have to remember what they did in those 14 minutes and double-check their work to make sure everything is current.  At one-minute intervals, there's an excellent chance that almost no work was lost.

The backup interval is something that has to be negotiated with management as to how much loss of time and data is acceptable and how much they’re willing to spend to minimize those.  But once a day log backups?  I don’t think so.

Excellent post from Brent Ozar on what to do when a database gets corrupted

You are running DBCCs on your systems at proper intervals, right?  If not, you are keeping your resume up to date, right?  Having a database get corrupted and not being able to recover it can definitely be an RGE (resume generating event), i.e. your job just ended.
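If you're not sure what that looks like, the basic check is a one-liner; NO_INFOMSGS keeps the output down to actual problems:

DBCC CHECKDB (N'MyDb') WITH NO_INFOMSGS, ALL_ERRORMSGS;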

Brent goes over an excellent plan for how to address what happens if a DBCC reports an error, stressing the importance of backups and regularly running DBCCs.  If there's one weakness in his plan, it's that he does not say never to use the REPAIR_ALLOW_DATA_LOSS option.  That should always be the absolute last resort, as your data is going to fall into the bit bucket, never to be seen again.

An excellent defense against database corruption is to use the CHECKSUM option (WITH CHECKSUM) on your backups.  My job for full backups looks like this:

BACKUP DATABASE [BlahBlahBlahBlah] TO [BlahBlahBlahBlah_data_bkup]
    WITH NOFORMAT, INIT, NAME = N'BlahBlahBlahBlah-Full Database Backup',
    SKIP, NOREWIND, NOUNLOAD, COMPRESSION, STATS = 10,
    CHECKSUM, CONTINUE_AFTER_ERROR;
GO

DECLARE @backupSetId AS INT;

SELECT @backupSetId = position
FROM msdb..backupset
WHERE database_name = N'BlahBlahBlahBlah'
  AND backup_set_id = (SELECT MAX(backup_set_id) FROM msdb..backupset
                       WHERE database_name = N'BlahBlahBlahBlah');

IF @backupSetId IS NULL
BEGIN
    RAISERROR(N'Verify failed. Backup information for database ''BlahBlahBlahBlah'' not found.', 16, 1);
END;

RESTORE VERIFYONLY FROM [BlahBlahBlahBlah_data_bkup] WITH FILE = @backupSetId, NOUNLOAD, NOREWIND;
GO

You see the CHECKSUM on the backup along with the RESTORE VERIFYONLY.  The code was generated by right-clicking the database, selecting Tasks, then Back Up, plugging in the parameters, and selecting Script.  I put it in a new query window as I may back up several databases in the same job; sometimes I'll just do a find/replace on the database name for the others, since my backup devices follow the database names.  The RESTORE VERIFYONLY gives you some confidence that your backup is recoverable, but NEVER assume that just because your backup ran the database is restorable!  The ONLY way to know is to actually restore it, to another database: you don't want to accidentally clobber your production database, which probably has newer data in it.

(I love backing up with compression!  If your CPU is not under undue strain, it will actually speed up your backups: the CPU can compress faster than your disks can take the writes, resulting in shorter backup times and smaller backup sets!)

Of course, the best backup methodology is to do a full restore to another database after the backup, and then run DBCC CHECKDB on that copy.
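A rough sketch of that restore-and-check pass, restoring under a different name so production is never touched (the paths and logical file names are placeholders; check yours with RESTORE FILELISTONLY):

RESTORE DATABASE BlahBlahBlahBlah_Verify
FROM [BlahBlahBlahBlah_data_bkup]
WITH MOVE 'BlahBlahBlahBlah' TO N'X:\Verify\BlahBlahBlahBlah_Verify.mdf',
     MOVE 'BlahBlahBlahBlah_log' TO N'X:\Verify\BlahBlahBlahBlah_Verify.ldf',
     RECOVERY;

DBCC CHECKDB (N'BlahBlahBlahBlah_Verify') WITH NO_INFOMSGS;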

There is absolutely nothing wrong with belts and suspenders.  It may look silly, but if it preserves your job then it’s a good thing.

Another good thing to do is to look at the contents of the msdb.dbo.suspect_pages table; hopefully it will always have zero records.  With older versions of SQL Server I've seen records appear there before DBCC noticed major problems.  And yet another is to make sure that you receive emails when alerts for errors 823, 824, and 825 fire.  Those can give you an early indication that something bad is looming over the horizon.
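Setting up those alerts is a few calls to msdb stored procedures.  A sketch for error 823, assuming an operator named 'DBA Team' already exists with Database Mail working; repeat for 824 and 825:

EXEC msdb.dbo.sp_add_alert
    @name = N'Error 823 - I/O error',
    @message_id = 823,
    @severity = 0,                       -- must be 0 when a message_id is given
    @include_event_description_in = 1;   -- 1 = include the error text in the email

EXEC msdb.dbo.sp_add_notification
    @alert_name = N'Error 823 - I/O error',
    @operator_name = N'DBA Team',
    @notification_method = 1;            -- 1 = email

-- And the suspect_pages check; zero rows is the answer you want.
SELECT * FROM msdb.dbo.suspect_pages;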

Corruption errors reported by DBCC can be caused by a bug in SQL Server, which is one reason it's a good idea to keep your server fully patched with Service Packs and Cumulative Updates; they can also be caused by errors in your storage system.  If you're suddenly seeing corruption across multiple databases, then your storage has a serious problem and you're in for a long day.

Create backup devices for all databases via T-SQL

Dating back to the formative days of SQL Server, backups were referred to as dumps.  Thus the stored procedure is sp_addumpdevice, even though when you back up a database you use the BACKUP command.  Whatever.

I have always enjoyed looking at the information stored in SQL Server's metadata, and one such source is sys.databases, a repository of all databases defined on the server instance.  By wrapping strings around each name and filtering for database_id > 4 (skipping the system databases), it's easy to build dump devices for every database.  I use one dump device to back up the system databases: if you want an individual dump device for each system DB, remove the WHERE clause.

--CreateDumpDevicesFromSysdatabases.sql
/*
 Creates backup devices for all databases.
 The code from Tim Ford reads the registry to see where backups are stored, which is then used to create the dump devices.
 Run Part 1 to create the function, then run Part 2 to create the dump devices.
 Change to text output (Ctrl-T), then run script. Copy the output to a new query window and run it.
 Found a link to Tim Ford's code at https://www.mssqltips.com/sqlservertip/1966/function-to-return-default-sql-server-backup-folder/,
 then found the actual code at http://thesqlagentman.com/2010/04/turn-management-studio-into-a-web-browser/.

 WW, 17 Oct 07
 WW, 16 May 15
 WW, 12 May 16: Added Tim Ford's registry reader code
*/

--****************************************************--
--Code for fn_SQLServerBackupDir():
--Author: Timothy Ford (sqlagentman@yahoo.com)
-- returns nvarchar(4000) = dbo.fn_SQLServerBackupDir()
--****************************************************--
--PART 1:
USE master;

IF OBJECT_ID('dbo.fn_SQLServerBackupDir') IS NOT NULL
 DROP FUNCTION dbo.fn_SQLServerBackupDir
GO

CREATE FUNCTION dbo.fn_SQLServerBackupDir()
RETURNS NVARCHAR(4000)
AS
BEGIN
   DECLARE @path NVARCHAR(4000)

   EXEC master.dbo.xp_instance_regread
       N'HKEY_LOCAL_MACHINE',
       N'Software\Microsoft\MSSQLServer\MSSQLServer',N'BackupDirectory',
       @path OUTPUT,
       'no_output'
   RETURN @path
END;
GO

--SELECT fn_SQLServerBackupDir = dbo.fn_SQLServerBackupDir();


--PART 2: SET OUTPUT TO TEXT! CTRL-T!
DECLARE @BackupLocation NVARCHAR(4000);
SELECT @BackupLocation = master.dbo.fn_SQLServerBackupDir() + '\';
SET NOCOUNT ON;

SELECT 'USE master;';

--Each DB gets two dump devices: one for data, one for log.
SELECT 'EXEC sp_addumpdevice ''disk'', ''' + name + '_data_bkup'', ''' 
    + @BackupLocation + name + '_data_bkup.bak''' + ';'
    + CHAR(13) + CHAR(10)
    + 'EXEC sp_addumpdevice ''disk'', ''' + name + '_log_bkup'', ''' 
    + @BackupLocation + name + '_log_bkup.bak''' + ';'
FROM master.sys.databases
WHERE database_id > 4
UNION ALL
--I back up all system databases to the same file via append.
SELECT 'EXEC sp_addumpdevice ''disk'', ''SystemDB_data_bkup'', ''' 
    + @BackupLocation + 'SystemDB_data_bkup.bak''' + ';'
    + CHAR(13) + CHAR(10)
    + 'EXEC sp_addumpdevice ''disk'', ''SystemDB_log_bkup'', ''' 
    + @BackupLocation + 'SystemDB_log_bkup.bak''' + ';';

SET NOCOUNT OFF;
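For a hypothetical database named Sales (and a default backup directory of X:\Backups), the generated output looks something like this, ready to paste into a new query window and run:

USE master;
EXEC sp_addumpdevice 'disk', 'Sales_data_bkup', 'X:\Backups\Sales_data_bkup.bak';
EXEC sp_addumpdevice 'disk', 'Sales_log_bkup', 'X:\Backups\Sales_log_bkup.bak';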

A couple of interesting posts from other bloggers

First, from the illustrious Phil Factor at Simple Talk, a post on using T-SQL to generate UML code to create database relational diagrams.  The diagram generator is PlantUML and does a fair job, but the diagram can become very wide and complex.

I did have one problem with Phil's script that I haven't been able to correct yet: some rows are returned as NULL and have to be deleted from the output before you feed it to PlantUML, lest it crash.  Easy enough to clean up.

https://www.simple-talk.com/sql/sql-tools/automatically-creating-uml-database-diagrams-for-sql-server/

The second is from Kendra Little, formerly a business associate of Brent Ozar and also illustrious, with a script for generating the code to recreate all of the indexes in a database.  This is cool for a couple of reasons.  First, it includes a note showing whether a table is a heap.  Second, it shows you the usage statistics, which tell you whether an index is getting hit; if it isn't, it might not be worth the overhead of keeping the index.

Personally, I'm probably going to make it part of my model database.  You can turn it into a view if you remove the OPTION (RECOMPILE) line at the end of the code.  You're probably better off not looking at the execution plan; just be satisfied that this is system-management code, not production code.

How to Script Out Indexes from SQL Server