Aaron Bertrand

Deprecated features to take out of your toolbox – Part 3

SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Paul White is an independent SQL Server consultant specializing in performance tuning, execution plans, and the query optimizer.

Paul’s Posts

I've recently discussed a few features that Microsoft advises against using, and that I think you should forget exist, too. There was the case where a colleague constantly promoted the deprecated backward compatibility view sys.sysprocesses instead of newer dynamic management views (DMVs), and another case where a different colleague took down a production server using SQL Server Profiler.

My latest run-in with things best forgotten is a new stored procedure with an ntext parameter. I checked and, sure enough, the data type matches the schema for the underlying table. My mind started racing about these older data types to explain why we really shouldn't be using them anymore:

  • image
  • ntext
  • text

These types are on the deprecated list for many reasons, and have held a permanent spot on that list since being replaced by the max types way back in SQL Server 2005. Some of these pain points include:

  • you can't use many string functions, like LEFT(), RTRIM(), UPPER(), and most comparison operators;
  • you need to use functions like TEXTPTR, WRITETEXT, and UPDATETEXT for modifications;
  • you can't use the types as local variables;
  • you can't reference the columns in DISTINCT, GROUP BY, ORDER BY, or as an included column (not that you should want to do any of these);
  • smaller values that could fit in-row can only do so with the text in row option.

That is not an exhaustive list; there are other differences that you might consider more or less important. The most pressing reason for me is that you can't rebuild the clustered index online if the table contains one of these data types.

Let's create a simple database with a few tables:

CREATE DATABASE BadIdeas;
GO

USE BadIdeas;
GO

CREATE TABLE dbo.t1(id bigint IDENTITY PRIMARY KEY, msg nvarchar(max));
CREATE TABLE dbo.t2(id bigint IDENTITY PRIMARY KEY, msg ntext);

Now, let's try to perform online operations on the tables:

ALTER TABLE dbo.t1 REBUILD WITH (ONLINE = ON);
GO
ALTER TABLE dbo.t2 REBUILD WITH (ONLINE = ON);

While the first statement succeeds, the second yields an error message like this:

Msg 2725, Level 16, State 2
An online operation cannot be performed for index 'PK__t2__3213E83FEEA1E0AD' because the index contains column 'msg' of data type text, ntext, image or FILESTREAM. For a non-clustered index, the column could be an include column of the index. For a clustered index, the column could be any column of the table. If DROP_EXISTING is used, the column could be part of a new or old index. The operation must be performed offline.

In a one-off scenario, the error message is easy enough to deal with: you either skip the table or, if the table absolutely must be rebuilt, you work with your teams to schedule an outage. When you're automating an index maintenance solution or deploying new compression settings across your environment, it is a bit more of a pain in the behind to make your solution handle present or future state.

What to do?

You can start replacing these columns with their more modern counterparts. Here is a query to help you track them down using the sys.columns catalog view, but you're on your own for any explicit references that might exist in your application code:

SELECT [Schema]    = s.name, 
       [Object]    = o.name,
       [Column]    = c.name,
       [Data Type] = TYPE_NAME(c.user_type_id) + CASE 
         WHEN c.system_type_id <> c.user_type_id 
         THEN N' (' + TYPE_NAME(c.system_type_id) + N')' 
         ELSE N'' END
  FROM sys.columns AS c
  INNER JOIN sys.objects AS o
    ON c.[object_id] = o.[object_id]
  INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
  WHERE c.system_type_id IN (34, 35, 99)
  ORDER BY [Schema], [Object], [Column];

Output:

Output showing our bad columns

It may be tempting to go into SSMS and manually change the data types of these columns, but there may be other implications as well. For example, the columns might have default constraints associated with them. And you may have stored procedures with parameters that should be updated in tandem:

CREATE PROCEDURE dbo.sp1 @p1 ntext AS PRINT 1;
GO

To find all of these cases, you can adapt the above query to search against the sys.parameters catalog view instead:

SELECT [Schema]  = s.name, 
       [Object]   = o.name, 
       [Parameter] = p.name, 
       [Data Type] = TYPE_NAME(p.user_type_id) + CASE 
         WHEN p.system_type_id <> p.user_type_id 
         THEN N' (' + TYPE_NAME(p.system_type_id) + N')' 
         ELSE N'' END
  FROM sys.objects AS o
  INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
  INNER JOIN sys.parameters AS p
    ON p.[object_id] = o.[object_id]
  WHERE p.system_type_id IN (34, 35, 99)
  ORDER BY [Schema], [Object], [Parameter];

Output:

Output showing our bad parameters

If you need to return this data this across all databases, you can grab sp_ineachdb, a procedure I wrote (and documented here and here) to overcome several of the limitations in the buggy, undocumented, and unsupported sp_MSforeachdb. Then you can do this:

EXEC master.dbo.sp_ineachdb @command = N'SELECT [Database]  = DB_NAME(), 
       [Schema]    = s.name, 
       [Object]    = o.name,
       [Column]    = c.name,
       [Data Type] = TYPE_NAME(c.user_type_id) + CASE 
         WHEN c.system_type_id <> c.user_type_id 
         THEN N'' ('' + TYPE_NAME(c.system_type_id) + N'')'' 
         ELSE N'''' END
  FROM sys.columns AS c
  INNER JOIN sys.objects AS o
    ON c.[object_id] = o.[object_id]
  INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
  WHERE c.system_type_id IN (34, 35, 99)
  ORDER BY [Schema], [Object], [Column];

SELECT [Database]  = DB_NAME(),
       [Schema]    = s.name, 
       [Object]    = o.name, 
       [Parameter] = p.name, 
       [Data Type] = TYPE_NAME(p.user_type_id) + CASE 
         WHEN p.system_type_id <> p.user_type_id 
         THEN N'' ('' + TYPE_NAME(p.system_type_id) + N'')''
         ELSE N'''' END
  FROM sys.objects AS o
  INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
  INNER JOIN sys.parameters AS p
    ON p.[object_id] = o.[object_id]
  WHERE p.system_type_id IN (34, 35, 99)
  ORDER BY [Schema], [Object], [Parameter];';

An interesting side note here: if you do run that against all databases, you will discover that, even in SQL Server 2019, Microsoft is still using some of these old types.

You could further automate that by running it from PowerShell or whatever automation tool you use to manage multiple instances of SQL Server.

Of course, that is just the beginning – it only produces a list. You could further extend it to generate a draft version of the ALTER TABLE commands you would need to update all the tables, but those commands would need to be reviewed before you execute them, and you'd still need to modify the procedures yourself (generating ALTER PROCEDURE commands that only have those parameter type names replaced correctly is not an easy exercise by any means). Here is an example that generates ALTER TABLE commands, taking into account nullability but no other complications like default constraints:

SELECT N'ALTER TABLE ' + QUOTENAME(s.name)
  + N'.' + QUOTENAME(o.name)
  + N' ALTER COLUMN ' + QUOTENAME(c.name) + N' '
  + CASE c.system_type_id
      WHEN 34 THEN N'varbinary'
      WHEN 35 THEN N'varchar'
      WHEN 99 THEN N'nvarchar'
    END + N'(max)' 
  + CASE c.is_nullable 
      WHEN 0 THEN N' NOT' 
      ELSE N'' END + N' NULL;'
FROM sys.columns AS c
INNER JOIN sys.objects AS o
  ON c.[object_id] = o.[object_id]
INNER JOIN sys.schemas AS s
  ON o.[schema_id] = s.[schema_id]
  WHERE c.system_type_id IN (34, 35, 99);

Output:

ALTER TABLE [dbo].[t2] ALTER COLUMN [msg] nvarchar(max) NULL;

And in case you're wondering, no, you can't do this one-time operation with an explicit WITH (ONLINE = ON) option:

Msg 11427, Level 16, State 1
The online ALTER COLUMN operation cannot be performed for table 't2' because column 'msg' currently has or is getting altered into an unsupported datatype: text, ntext, image, CLR type or FILESTREAM. The operation must be performed offline.

Conclusion

Hopefully this provides some good background on why you want to eliminate these deprecated types, and a starting point to actually make the changes. Microsoft learned the hard way that there isn't much functionality they can just rip out of the product, so I'm not concerned that these will ever actually cease to exist in my lifetime. But fear of removal shouldn't be your only motivator.