Reasons Why I Don’t Like “The Cloud”

Part X of an ongoing series….

First off, the word itself.  The Cloud.  What is The Cloud?  It’s a server that you don’t own.  You can’t touch it, it’s in someone else’s data center.  It may or may not be virtual.  Amazon’s Cloud or Microsoft’s or Google’s are several data centers with racks and racks of servers.  They are physical, just not at your location.  And they’re accessed across the Internet.  This is something that we’ve been doing for 30 years, it’s called a Wide-Area Network, just scaled up bigger.  We had bi-coastal WANs before the World Wide Web came along.

So you’re paying for a server that you have no physical relation to.  Now, on the one hand, you’re also not responsible if something breaks.  You don’t pay the electricity bill for power and cooling or to keep the lights on, or off, as is more common in a lot of data centers.  The concept of backups becomes much more worrying for me because I have to trust them utterly that my machine is being backed up, and in the case of the server that I’m working on now, I don’t know when that VM backup happens.  I perform my due diligence with my SQL Server backups, but if I don’t know WHEN that VM backup takes place, then I don’t know what my recovery window is.  My full backups go off, along with the rest of my maintenance, at about 23:50.  If they back up the VM at 21:00 and I have to restore from that previous backup, I have to know that the previous DBCCs and index maintenance and whatever didn’t run.  Ultimately I’m going to set up a form of log shipping where backups will be compressed, encrypted, and emailed to a repository, but I don’t have that in place right now.

But my big gripe is downtime.  We’ve had Microsoft working on our server trying to resolve a problem.  Yesterday my boss comes in and tells me to sign off that box so they could reboot it, so I do.  Then shortly after he says he can’t connect to it, so he’s wondering if they might have accidentally done a shutdown rather than a restart, so I open up the control panel for the VM.  And it’s unresponsive.

The entire data center was offline.  Our one server plus who knows how many others, all gone.  It was down for a good half an hour.

Let’s relate that to local hardware.  If my server crashes, I reboot it, put the database in to administrator mode, run DBCCs to make sure all is well, then I open it up for the users.  If there’s a larger outage like a disk failure or the entire box goes up in flame, then we’ve got something that’s going to take longer to address.  If a backhoe eats our internet connection, then external users can’t access my system but internal users are fine, and the line will be fixed in a couple of days.

In ALL of these cases, I KNOW WHAT’S GOING ON and I can tell users and management what’s up and give them a SWAG (semi-wild assed guess) as to when normal operations will resume.  When a Cloud goes down?  No idea.  You get an email like this:

We are currently investigating reports of an incident affecting the Network  in the WKRP data center.  This incident will impact your ability to manage current assets, will impact your ability to generate new assets, and does impact availability of current assets.  We are in the process of engaging the appropriate teams to quickly mitigate and resolve this incident and will provide additional information as soon as it is available. …

Isn’t that nice and reassuring?

No, I’m not a fan.  EVERY CLOUD PROVIDER HAS OUTAGES.  Microsoft has had them, Amazon has had them, Google has had them.  Some have had serious security breeches (looking at you, AWS) where it was pretty easy to commandeer someone else’s virtual hosts.  Not good.

It’s hard to do security right.  We’d like to think that people like AWS has ‘Top People’ doing it, but they make mistakes just like us mere mortals.  There’s no easy answers: if you have local servers, you’re going to have problems and outages.  If you throw everything in to The Cloud, you’re still going to have problems and outages, and there won’t be a blessed thing that you can do about it.

So which is better?  Flip a coin, I don’t know.  But for my $0.002, I’d prefer a server that I can touch.

Microsoft now recommends that ALL SQL Server Cumulative Updates be installed

Formerly the language around the CUs were fairly clear: CUs addressed specific issues, and if you weren’t experiencing that issue, then you shouldn’t install that CU.

Well, that was then and this is now.

Microsoft is now saying that you should install all CUs when they become available, and says that they undergo the same testing as complete Service Packs (SPs).  So now instead of waiting for an SP, even if you weren’t having problems, you should also keep current on CUs.

https://blogs.msdn.microsoft.com/sqlreleaseservices/announcing-updates-to-the-sql-server-incremental-servicing-model-ism/

One good way of keeping up with the release of both CUs and SPs is Brent Ozar’s web site, SQL Server Updates., which tells me that SQL Server 2014 is up to CU5 for SP1.  On this site you can sign up for notification emails when new updates are available.  It lists versions going back to 2005, I expect that might fall off the list in three years when 2008 begins falling out of support.  (Microsoft says that when CU6 for 2014 SP1 comes out that CU5 will no longer be available for download)

So if you’ll excuse me, I have a couple of servers to patch.

Concatenating text fields from multiple records in to one string

We have a Caseload system where a teacher can assign certain specific kids to their caseload, so with one set of menu buttons they can access any kid in the district, with another they access only a filtered list.  These are the kids that they are supposed to pay special attention to.  To complicate matters, because they can never be simple, it was decided that it was OK that one kid could be on multiple caseloads.  Okay, not a difficult change, except that I displayed the caseload teacher’s name on the bottom of the student edit screen.  Now I had multiple teachers and needed to build a string so they could all be shown together.

This is something that I’ve done before, but I couldn’t find the code that I needed.  I remembered the basics and eventually found a sample that let me build out the code that worked.

This is the basic code:

DECLARE @s AS VARCHAR(40); --MUST be varchar!  Doesn’t work if it’s char.
SET @s = '';
SELECT @s = @s + t.TeacherName + ', '
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
WHERE cl.StudentID = 1
ORDER BY t.TeacherName;
SELECT SUBSTRING(@s, 1, LEN( @s) - 1);
SELECT @s;

It builds up the @s variable, holding all the teacher names, for StudentID 1.  Then the substring removes the last comma and space.

First, our test rig.  Two tables, Teachers and Caseload.  We’ll keep it basic with a half dozen teachers and we’ll just use three students.  No student table because we’re just concerned with an integer student ID number.

/****** Object:  Table [dbo].[Teachers] ******/
CREATE TABLE [dbo].[Teachers](
       [TeacherID] [int] NOT NULL,
       [TeacherName] [varchar](10) NOT NULL,
 CONSTRAINT [PK_Teachers] PRIMARY KEY CLUSTERED
(      [TeacherID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
/****** Object:  Table [dbo].[Caseload] ******/
CREATE TABLE [dbo].[Caseload](
       [TeacherID] [int] NOT NULL,
       [StudentID] [int] NOT NULL,
 CONSTRAINT [PK_Caseload] PRIMARY KEY CLUSTERED
(      [TeacherID] ASC, [StudentID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
--Populate the two tables:
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (1, N'Bob');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (2, N'Carol');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (3, N'Ted');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (4, N'Alice');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (5, N'Woody');
INSERT [dbo].[Teachers] ([TeacherID], [TeacherName]) VALUES (6, N'Buzz');
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (1, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (1, 2);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (2, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (3, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (4, 1);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (5, 2);
INSERT [dbo].[Caseload] ([TeacherID], [StudentID]) VALUES (6, 3);
--Show the joined caseloads so we can more easily see what the UDF will produce:
SELECT cl.StudentID, cl.TeacherID, t.TeacherName
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
ORDER BY cl.StudentID, cl.TeacherID;
--same code as above demonstrating the concatenation
DECLARE @s AS VARCHAR(40);
SET @s = '';
SELECT @s = @s + t.TeacherName + ', '
FROM Caseload cl
JOIN Teachers t
       ON cl.TeacherID = t.TeacherID
WHERE cl.StudentID = 1
ORDER BY t.TeacherName;
SELECT SUBSTRING(@s, 1, LEN( @s) - 1);
SELECT @s;
*/
--The UDF that does everything needed and returns the populated string
ALTER FUNCTION dbo.udf_GetCaseloads (@SeekerID INT)
RETURNS VARCHAR(80)
AS
BEGIN
       DECLARE @s AS VARCHAR(80);
       SET @s = '';
       SELECT @s = @s + t.TeacherName + ', '
       FROM Caseload cl
       JOIN Teachers t
              ON cl.TeacherID = t.TeacherID
       WHERE cl.StudentID = @SeekerID
       ORDER BY t.TeacherName;
       --make sure there's data in the var before trying substrings!
       IF LEN(RTRIM(@s)) > 0
              SET @s = (SUBSTRING(@s, 1, LEN(@s) - 1));
       RETURN @s;
END;
--test harness for the above:
SELECT dbo.udf_GetCaseloads(1) AS CaseloadList; --test UDF
SELECT dbo.udf_GetCaseloads(2) AS CaseloadList; --test UDF
SELECT dbo.udf_GetCaseloads(3) AS CaseloadList; --test UDF
SELECT dbo.udf_GetCaseloads(4) AS CaseloadList; --test UDF

Unfortunately I wasn’t able to get this code working in Access VBA.  My final solution was to wrap it in to a stored procedure which updated a record that’s always available.  I’ll show that tomorrow.

Looking for some OLD SQL Server sample databases?

Back in the day, I’m talking up to SQL Server 2000, SQL Server installed two databases by default: Pubs and Northwind.  Then they went away.  Well, you can still download them from Microsoft’s Technet website.  You get a tiny little 1.5 meg MSI package, when you run it a directory called C:\SQL Server 2000 Sample Databases will be created that contains two MDF and LDF files for the two databases, an HTML ReadMe, and two .SQL files that are complete build scripts including populating the tables with data: the scripts have a copyright date of 1994-2000!

And they run under SQL Server 2014: I know because I just ran them and created the two databases.  I thought it would be more interesting to run the installer scripts rather than trying to attach old databases, though I may try it tomorrow to see if SQL 2014 can attach older databases like these.

It’s kind of amusing seeing things like the use of quoted identifiers and views with spaces in the name, I’d be include to take a ClueBy4 to the person who created such a monstrosity.  But they are valuable for study just to see how other people design databases.  Pubs shows creating user data types before creating tables so they can use the UDT in the table definition, which is interesting even if it isn’t really a data dictionary.  It also shows some bad coding techniques, such as having a varchar(30) field for Country rather than normalizing it in to a lookup table: that was a normalization violation even back in the mid ’90s.

The design of Northwind does not escape criticism.  Again, Country is stored directly in the Suppliers table rather than a lookup, but here’s the interesting bit: Country and the Phone and Fax numbers are stored in nvarchar rather than varchar.  As far as I’ve seen, country names and phone/fax numbers don’t contain non-Latin character set characters.  My guess is that they went with nvarchar over varchar because the rest of the table uses nvarchar.  As far as I can see, the only accented character is in Québec, which you can store in regular varchar.  I don’t know if it is an international postal regulation that country/city names be in Latinate character sets, so it’s just a little oddity.

Their views are interesting.  Overall, they tend to be fairly basic selects, but two things pop out.  First, the occasional use of SELECT *.  That’s a definite no-no.  Also, they don’t use table aliases in views!  I couldn’t believe the following example from the Northwind script:

create view Invoices AS
 SELECT Orders.ShipName, Orders.ShipAddress, Orders.ShipCity, Orders.ShipRegion, Orders.ShipPostalCode,
 Orders.ShipCountry, Orders.CustomerID, Customers.CompanyName AS CustomerName, Customers.Address, Customers.City,
 Customers.Region, Customers.PostalCode, Customers.Country,
 (FirstName + ' ' + LastName) AS Salesperson,
 Orders.OrderID, Orders.OrderDate, Orders.RequiredDate, Orders.ShippedDate, Shippers.CompanyName As ShipperName,
 "Order Details".ProductID, Products.ProductName, "Order Details".UnitPrice, "Order Details".Quantity,
 "Order Details".Discount,
 (CONVERT(money,("Order Details".UnitPrice*Quantity*(1-Discount)/100))*100) AS ExtendedPrice, Orders.Freight
 FROM Shippers INNER JOIN
 (Products INNER JOIN
 (
 (Employees INNER JOIN
 (Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID)
 ON Employees.EmployeeID = Orders.EmployeeID)
 INNER JOIN "Order Details" ON Orders.OrderID = "Order Details".OrderID)
 ON Products.ProductID = "Order Details".ProductID)
 ON Shippers.ShipperID = Orders.ShipVia
 GO

It was indented, that was lost and I’m not inclined to work through that mess.  So you have both quoted identifiers (and double-quotes, rather than single-quotes), and spaces in object names — both tables and views!  The lack of aliasing greatly increases the reading difficulty.

My brain hurts just looking at that!  I mean, I’ve seen some pretty horrible stuff perpetrated in Access in my days, but oddly I haven’t had to deal with a lot of T-SQL code written by other people.  But this?  Wow.

The Pubs script had this gem of a view:

CREATE VIEW titleview
AS
select title, au_ord, au_lname, price, ytd_sales, pub_id
from authors, titles, titleauthor
where authors.au_id = titleauthor.au_id
 AND titles.title_id = titleauthor.title_id
GO

So not only do we have no table aliasing, we have an example of deprecated join syntax.  In the first days of T-SQL, we used *= and =* for LEFT and RIGHT JOINs.  At least the Northwind example was using JOIN syntax, even if it wasn’t aliasing.  To be fair, the only sin was not using table aliasing, but I’m not going to give them a pass on that because the book A Guide To Sybase and SQL Server by D. McGoveran with C.J. Date, as in Chris Date, one of the foundational minds of relational database, was published in 1992 and demonstrates alias syntax!  There’s no excuse for them not using table aliases, this book was fundamental and one of the very first good books on learning SQL Server when it first appeared under the Microsoft label.

The big question is why — why would I load sample databases that are so old?  It’s easy enough to load Adventureworks, and I have.  The reason is simple: I’m using Access 2010 to do development, and the Help VBA samples are all using Pubs.  Loading that old database will make it easier to use the sample code to try and figure out some things that I’m weak on.

Truncating a table with an Identity column and DBCC CHECKIDENT

I did not know this behavior and came across it recently while resetting audit tables in my development database for a test deployment.

Given:

A table that contains an identity field that you run the following command against:

DBCC checkident ([Audit_Logins]);
The result is:
Checking identity information: current identity value '86', current column value '86'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Assuming the table has no relationships to prevent it from being truncated, you run the following command:

TRUNCATE TABLE [dbo].[Audit_Logins];

What is the result when you run DBCC CHECKIDENT against it again?

  • 87
  • 86
  • 0
  • Null

The correct answer is Null.

The result of the DBCC CHECKIDENT is

Checking identity information: current identity value 'NULL', current column value 'NULL'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

From BOL 2014:

If the table contains an identity column, the counter for that column is reset to the seed value defined for the column. If no seed was defined, the default value 1 is used. To retain the identity counter, use DELETE instead.

https://msdn.microsoft.com/en-us/library/ms177570.aspx

 

I sent it in to SQL Server Central and it was accepted as last Thursday’s Question of the Day!  I was happy.  I should have provided a table and data load script, more importantly I should have included a statement that the table was created with an IDENTITY(1,1).  Something to remember if I ever submit another question.  Still, there was good discussion.  63% (as of this post) got the answer wrong.

Based on discussions, I think the reason behind the null is that after a truncate, there are null records in the table, so it reports the identity value as null rather than Seed – 1.

I thought it was interesting.  Then again, I’m easily amused.

SQL Server for Linux? Coming 2017!

Very interesting.  Ars Technica reported on it yesterday, I expect more information after the SQL Server 2016 launch event which is apparently on March 10 at 10am EST.

While I don’t have much interest in SQL 2016 (which has some excellent features, but we just started running SQL 2014 and it’ll be fine for quite a while), the Linux edition intrigues me.  Formerly you only had DB2 and Oracle for major database software on non-Windows platforms, so this will be quite interesting to see it play out.  Yes, Linux already had MySQL, Maria, PostgreSQL, etc., and they are perfectly fine databases.  I’m just glad to see something that I know pretty well get on to *nix.

The actual Microsoft announcement also has a link to sign up for the SQL Linux newsletter.  What I want to know is what distros it will run on: I would expect Red Hat and Ubuntu, we shall see.  Microsoft has ODBC drivers for those two distros, plus SUSE, so those three seem the most likely candidates.  Plus there are drivers for node.js, Ruby, and Python in addition to Java ODBC libraries, so lots of approaches can be taken.  According to the newsletter signup, it’s currently available as Ubuntu and Docker images, I would expect Red Hat at launch as it’s perhaps the biggest server *nix going.