Create backup devices for all databases via T-SQL

Dating back to the formative days of SQL Server, backups were referred to as dumps.  Thus the stored procedure is named sp_addumpdevice, even though you use the BACKUP command to back up a database.  Whatever.

I have always enjoyed looking at the metadata stored in SQL Server, and one such source is sys.databases, a catalog view listing every database defined on the server instance.  By wrapping strings around each name and filtering for database_id > 4 (which skips the system databases), it's easy to build dump devices for each database.  I use one dump device to back up the system databases; if you want an individual dump device for each system DB, then remove the WHERE clause.
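A quick way to see what the catalog view holds (database_id values 1 through 4 are master, tempdb, model, and msdb):

```sql
--List the user databases; IDs 1-4 are the system databases
SELECT database_id, name
FROM master.sys.databases
WHERE database_id > 4
ORDER BY name;
```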

--CreateDumpDevicesFromSysdatabases.sql
/*
 Creates backup devices for all databases.
 Tim Ford's code reads the registry to find where backups are stored; that path is then used to create the dump devices.
 Run Part 1 to create the function, then run Part 2 to create the dump devices.
 Change to text output (Ctrl-T), then run script. Copy the output to a new query window and run it.
 Found a link to Tim Ford's code at https://www.mssqltips.com/sqlservertip/1966/function-to-return-default-sql-server-backup-folder/,
 then found the actual code at http://thesqlagentman.com/2010/04/turn-management-studio-into-a-web-browser/.

 WW, 17 Oct 07
 WW, 16 May 15
 WW, 12 May 16: Added Tim Ford's registry reader code
*/

--****************************************************--
--Code for fn_SQLServerBackupDir():
--Author: Timothy Ford (sqlagentman@yahoo.com)
-- returns nvarchar(4000) = dbo.fn_SQLServerBackupDir()
--****************************************************--
--PART 1:
USE master;

IF OBJECT_ID('dbo.fn_SQLServerBackupDir') IS NOT NULL
 DROP FUNCTION dbo.fn_SQLServerBackupDir
GO

CREATE FUNCTION dbo.fn_SQLServerBackupDir()
RETURNS NVARCHAR(4000)
AS
BEGIN
   DECLARE @path NVARCHAR(4000)

   EXEC master.dbo.xp_instance_regread
       N'HKEY_LOCAL_MACHINE',
       N'Software\Microsoft\MSSQLServer\MSSQLServer',N'BackupDirectory',
       @path OUTPUT,
       'no_output'
   RETURN @path
END;

--SELECT fn_SQLServerBackupDir = dbo.fn_SQLServerBackupDir();


--PART 2: SET OUTPUT TO TEXT! CTRL-T!
DECLARE @BackupLocation NVARCHAR(4000);
SELECT @BackupLocation = master.dbo.fn_SQLServerBackupDir() + '\';
SET NOCOUNT ON;

SELECT 'USE master;';

--Each DB gets two dump devices: one for data, one for log.
SELECT 'EXEC sp_addumpdevice ''disk'', ''' + name + '_data_bkup'', ''' 
    + @BackupLocation + name + '_data_bkup.bak''' + ';'
    + CHAR(13) + CHAR(10)
    + 'EXEC sp_addumpdevice ''disk'', ''' + name + '_log_bkup'', ''' 
    + @BackupLocation + name + '_log_bkup.bak''' + ';'
FROM master.sys.databases
WHERE database_id > 4
UNION ALL
--I back up all system databases to the same file via append.
SELECT 'EXEC sp_addumpdevice ''disk'', ''SystemDB_data_bkup'', ''' 
    + @BackupLocation + 'SystemDB_data_bkup.bak''' + ';'
    + CHAR(13) + CHAR(10)
    + 'EXEC sp_addumpdevice ''disk'', ''SystemDB_log_bkup'', ''' 
    + @BackupLocation + 'SystemDB_log_bkup.bak''' + ';';

SET NOCOUNT OFF;
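For illustration, on a hypothetical instance with a single user database named Sales and a default backup path of C:\Backups, the generated script would look roughly like this:

```sql
USE master;
EXEC sp_addumpdevice 'disk', 'Sales_data_bkup', 'C:\Backups\Sales_data_bkup.bak';
EXEC sp_addumpdevice 'disk', 'Sales_log_bkup', 'C:\Backups\Sales_log_bkup.bak';
EXEC sp_addumpdevice 'disk', 'SystemDB_data_bkup', 'C:\Backups\SystemDB_data_bkup.bak';
EXEC sp_addumpdevice 'disk', 'SystemDB_log_bkup', 'C:\Backups\SystemDB_log_bkup.bak';
```

Once the devices exist, backups can reference them by logical name, e.g. BACKUP DATABASE Sales TO Sales_data_bkup;.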

A couple of interesting posts from other bloggers

First, from the illustrious Phil Factor at Simple Talk, a post on using T-SQL to generate UML code to create database relational diagrams.  The diagram generator is PlantUML and does a fair job, but the diagram can become very wide and complex.

I did have one problem with Phil’s script that I haven’t been able to correct yet: some rows are returned as Null and have to be deleted from the output that you feed PlantUML lest it crash.  Easy enough to clean up.

https://www.simple-talk.com/sql/sql-tools/automatically-creating-uml-database-diagrams-for-sql-server/

 

The second is from Kendra Little, formerly a business associate of Brent Ozar, also illustrious, with a script for generating the code to recreate all of the indexes in a database.  This is cool for a couple of reasons.  First, it includes a note showing whether or not a table is a heap.  Second, it also shows you the usage statistics, which tell you whether an index is getting hit.  If it's not getting hits, it might not be worth the overhead of keeping the index.

Personally I'm probably going to make it part of my Model database.  You can turn it into a view if you remove the Option(Recompile) line at the end of the code.  You're probably better off not looking at the execution plan of the code; just be satisfied that this is system management code, not production code.

How to Script Out Indexes from SQL Server

Concatenating text fields from multiple records in to one string

We have a Caseload system where a teacher can assign certain specific kids to their caseload, so with one set of menu buttons they can access any kid in the district, with another they access only a filtered list.  These are the kids that they are supposed to pay special attention to.  To complicate matters, because they can never be simple, it was decided that it was OK that one kid could be on multiple caseloads.  Okay, not a difficult change, except that I displayed the caseload teacher’s name on the bottom of the student edit screen.  Now I had multiple teachers and needed to build a string so they could all be shown together.

This is something that I’ve done before, but I couldn’t find the code that I needed.  I remembered the basics and eventually found a sample that let me build out the code that worked.

This is the basic code:

DECLARE @s AS VARCHAR(40); --MUST be varchar!  Doesn't work if it's char.
SET @s = '';
SELECT @s = @s + t.TeacherName + ', '
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
WHERE cl.StudentID = 1
ORDER BY t.TeacherName;
SELECT SUBSTRING(@s, 1, LEN( @s) - 1);
SELECT @s;

It builds up the @s variable, holding all the teacher names, for StudentID 1.  Then the substring removes the last comma and space.

First, our test rig.  Two tables, Teachers and Caseload.  We’ll keep it basic with a half dozen teachers and we’ll just use three students.  No student table because we’re just concerned with an integer student ID number.

/****** Object:  Table [dbo].[Teachers] ******/
CREATE TABLE [dbo].[Teachers](
       [TeacherID] [int] NOT NULL,
       [TeacherName] [varchar](10) NOT NULL,
 CONSTRAINT [PK_Teachers] PRIMARY KEY CLUSTERED
(      [TeacherID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
/****** Object:  Table [dbo].[Caseload] ******/
CREATE TABLE [dbo].[Caseload](
       [TeacherID] [int] NOT NULL,
       [StudentID] [int] NOT NULL,
 CONSTRAINT [PK_Caseload] PRIMARY KEY CLUSTERED
(      [TeacherID] ASC, [StudentID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
--Populate the two tables:
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (1, N'Bob');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (2, N'Carol');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (3, N'Ted');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (4, N'Alice');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (5, N'Woody');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (6, N'Buzz');
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (1, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (1, 2);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (2, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (3, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (4, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (5, 2);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (6, 3);
--Show the joined caseloads so we can more easily see what the UDF will produce:
SELECT cl.StudentID, cl.TeacherID, t.TeacherName
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
ORDER BY cl.StudentID, cl.TeacherID;
--same code as above demonstrating the concatenation
DECLARE @s AS VARCHAR(40);
SET @s = '';
SELECT @s = @s + t.TeacherName + ', '
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
WHERE cl.StudentID = 1
ORDER BY t.TeacherName;
SELECT SUBSTRING(@s, 1, LEN( @s) - 1);
SELECT @s;
GO
--The UDF that does everything needed and returns the populated string
CREATE FUNCTION dbo.udf_GetCaseloads (@SeekerID INT)
RETURNS VARCHAR(80)
AS
BEGIN
       DECLARE @s AS VARCHAR(80);
       SET @s = '';
       SELECT @s = @s + t.TeacherName + ', '
       FROM Caseload cl
       JOIN Teachers t
              ON cl.TeacherID = t.TeacherID
       WHERE cl.StudentID = @SeekerID
       ORDER BY t.TeacherName;
       --make sure there's data in the var before trying substrings!
       IF LEN(RTRIM(@s)) > 0
              SET @s = (SUBSTRING(@s, 1, LEN(@s) - 1));
       RETURN @s;
END;
GO
--test harness for the above:
SELECT dbo.udf_GetCaseloads(1) AS CaseloadList; --test UDF
SELECT dbo.udf_GetCaseloads(2) AS CaseloadList; --test UDF
SELECT dbo.udf_GetCaseloads(3) AS CaseloadList; --test UDF
SELECT dbo.udf_GetCaseloads(4) AS CaseloadList; --test UDF
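As an aside, the SELECT @s = @s + ... trick relies on behavior that Microsoft has never guaranteed, particularly where ORDER BY is involved.  On SQL Server 2017 and later, STRING_AGG does the same job declaratively; a sketch against the same test rig:

```sql
--Same result as udf_GetCaseloads(1), using STRING_AGG (SQL Server 2017+)
SELECT STRING_AGG(t.TeacherName, ', ')
       WITHIN GROUP (ORDER BY t.TeacherName) AS CaseloadList
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
WHERE cl.StudentID = 1;
```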

Unfortunately I wasn't able to get this code working in Access VBA.  My final solution was to wrap it into a stored procedure that updates a record that's always available.  I'll show that tomorrow.

Invalid character value for cast specification (#0)

I ran into this rather odious SQL Server Native Client 11.0 error last week, getting a message box saying "ODBC–update on a linked table failed."  Actually it was an insert, but that's OK; I never expect software to have particularly good error messages.  (But why, oh why, can't Microsoft make their error messages so that we can copy the text out with the mouse to make it easier to look up?)  I've since been banging my head against the wall trying to solve it.  It didn't seem like anything complicated was going on: I had a table that I was accessing directly through a subform and adding a record to.  The insert was consistently blowing up.

Well, I found my problem.  Your mileage may vary.

First, the SQL Server table:

CREATE TABLE [dbo].[NotesStudents](
    [NoteID] [int] IDENTITY(1,1) NOT NULL,
    [StudentID] [int] NOT NULL,
    [NoteDate] [date] NOT NULL CONSTRAINT [DF_NotesStudents_NoteDate] DEFAULT (getdate()),
    [NoteTime] [time](3) NOT NULL CONSTRAINT [DF_NotesStudents_NoteTime] DEFAULT (getdate()),
    [UserID] [int] NOT NULL,
    [NoteText] [varchar](1000) NOT NULL,
    [RowVer] [timestamp] NOT NULL);

I have the date and time as two separate fields because I want my form to display the records sorted by date descending, time ascending.  I needed to populate the date, time, student ID, and user ID, and the following code in the form's BeforeInsert event did that:

Private Sub Form_BeforeInsert(Cancel As Integer)
    Me.txtNoteLen.Visible = True
    Forms!frmStudentsSPEDs!fsubStudentsNotes.Form!SeekerID = Forms!frmStudentsSPEDs.StudentID
    Forms!frmStudentsSPEDs!fsubStudentsNotes.Form!UserID = Forms!frmMenuSPEDs.UserID
    Forms!frmStudentsSPEDs!fsubStudentsNotes.Form!NoteDate = Date
    Forms!frmStudentsSPEDs!fsubStudentsNotes.Form!NoteTime = CStr(Time)
End Sub

(The txtNoteLen field counts how many characters are in the note, since notes are limited to 1,000 characters.)

So the above code produced the #0 error message.  I could add records through SSMS, so I knew the table was functional.  Today I get back from lunch and decide to manually add a record to the table via Access, using Ctrl-; to add the date and Ctrl-: to add the time.

And it blew up.

Turns out that when you use Ctrl-: to enter the time, you get 12:57:41 PM.  The PM was the problem.  SQL Server was expecting just a time value, so I changed the NoteTime line to say Format(Time, "h:m:s"): the "h:m:s" format string converts the current time to 24-hour format, so 4:49 PM becomes 16:49.

So if you're getting this error and you're dealing with VBA and a time(3) field on SQL Server, this might be the cause.

Working with Execution Plans

Interesting stuff.  In the student information system that I'm developing, there's one query that's central to all users: vwStudentsFiltered.  Since I need to enforce row filtering based on which district(s) a user has access to, this view became the core.  The original view looked something like this:

SELECT st.*
FROM Students AS st
JOIN SeekerUsers AS u
   ON u.SeekerUserDBLogin = SYSTEM_USER
   AND st.StatusCode IN ('0', '1', '9')
   AND (u.SeekerUserRole = 1 --superuser sees all, otherwise filter
      OR st.DistrictNum IN (SELECT td.SeekerUserDistrict
         FROM SeekerUsersDistricts AS td
         WHERE u.SeekerUserid = td.SeekerUserid)
);

Obviously I wasn’t doing a Select *, but it’s a long field list and irrelevant to the discussion.

Student status codes of 0, 1, and 9 indicate an active student.  District users were only allowed to see students that were not assigned to a district, or students assigned to their own district.  And SeekerUserRole 1 is a superuser who is allowed to see everyone, regardless of status code.  The problem is that if you looked at the view through SSMS Tools Pack by Mladen Prajdić, it would report that the query had an estimated cost of 200%.  (SSMS Tools Pack is pretty neat; I recommend checking it out.)  The SeekerUsersDistricts table consists of two fields: SeekerUserID and a district number.  A district user has one record in this table; teachers could have more than one district number.

The problem is that the u.SeekerUserRole = 1 OR st.DistrictNum IN predicate confuses the optimizer: either path could execute, and you'll only know which at runtime.  That's hard to optimize for.  Turning on STATISTICS IO gives us the following:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0 (all other reads are zero)
Table 'Students'. Scan count 1, logical reads 23, physical reads 0 ...
Table 'SeekerUsers'. Scan count 1, logical reads 2, physical reads 0 ...

The optimizer added a worktable.  That can’t be good for performance.
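Those read counts can be reproduced for any query with SET STATISTICS IO (the query here is just a stand-in for whatever you're testing):

```sql
SET STATISTICS IO ON;
SELECT * FROM vwStudentsFiltered;  --or any query under investigation
SET STATISTICS IO OFF;
```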

There are three types of users: superusers, district users, and teachers.  The front-end app gives them different menus to control what they can do and what data they can see.  In this post I’ll be looking at how to improve the view but only for the superuser.

Here’s the code that I came up with:

--Variation 1
SELECT st.*
FROM Students AS st
WHERE 1 = (SELECT 1
   FROM SeekerUsers AS u
   WHERE u.SeekerUserDBLogin = SYSTEM_USER
     AND u.SeekerUserRole = 1);
--Variation 2
SELECT st.*
FROM Students AS st
JOIN SeekerUsers u
   ON u.SeekerUserDBLogin = SYSTEM_USER
   AND u.SeekerUserRole = 1;
--Variation 3
SELECT st.*
FROM Students AS st
WHERE EXISTS (SELECT 1
   FROM SeekerUsers AS u
   WHERE u.SeekerUserDBLogin = SYSTEM_USER
      AND u.SeekerUserRole = 1);

Running all three in the same query window shows each result set having the same number of rows returned, 939 students.  So far, so good.

Variation 3 made sense to me.  There are no common fields between the Student table and the SeekerUsers table, so a join, on the surface, doesn’t make sense.  So why not a Where Exists?  The devil, as they say, is in the details.  The most glaring thing to show in the graphic execution plan was an operator that I had never seen before: a Row Count Spool (Lazy Spool) with a cost of 78%.  With all three queries running together, Variation 3 had a cost relative to the batch of 70%, the other two each being 15%.  The Row Count Spool was doing rebinds and rewinds, which are not good.  The estimated number of rows and the estimated row size on that operator were farcical: it estimated one row and returned 939 rows (no, updating statistics and reindexing did nothing to improve that mismatch).  On top of that, the entire estimated subtree cost was 0.12, the other two variations were each 0.026.

So variation 3 is out.  What's the difference between 1 and 2?  The graphic plans are identical, as are the numbers on every operator, but there's a subtle difference: the cached plan size for #2 is 96 KB, while #1 is 88 KB.  Kind of trivial, but this plan is going to be called regularly, so it'll live in the cache pretty much forever.  At this point there's no way to forecast how much memory pressure the plan cache will be under, but 8 KB is 8 KB, so since every other statistic between the two variations is identical, I'll go with #1.  Aesthetically I prefer #2, but I can live with #1.
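The cached plan sizes come from the plan cache DMVs; something like the following shows size and reuse counts for plans touching SeekerUsers (the LIKE filter is just an illustrative way to find them):

```sql
--Cached plan size and reuse counts for plans referencing SeekerUsers
SELECT cp.size_in_bytes / 1024 AS SizeKB, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%SeekerUsers%';
```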

Here are the execution plans:

[Screenshot: graphical execution plans for the three variations]

Yes, I'm getting scans on the Students base table.  Right now the queries return 99% of the students, so a seek wouldn't improve things, since a covering index would require every field in the table.  As the number of students grows and status codes diverge, indexing might come into play.
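If active students ever do become a small fraction of the table, one option (a sketch, with a hypothetical index name and key column) would be a filtered index limited to the active status codes:

```sql
--Hypothetical: only worthwhile once '0', '1', '9' rows are a minority
CREATE NONCLUSTERED INDEX IX_Students_ActiveByDistrict
ON Students (DistrictNum)
WHERE StatusCode IN ('0', '1', '9');
```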