I've recently discussed a few features that Microsoft advises against using, and that I think you should forget exist, too. There was the case where a colleague constantly promoted the deprecated backward compatibility view sys.sysprocesses
instead of newer dynamic management views (DMVs), and another case where a different colleague took down a production server using SQL Server Profiler.
My latest run-in with things best forgotten is a new stored procedure with an ntext
parameter. I checked and, sure enough, the data type matches the schema for the underlying table. My mind started racing about these older data types to explain why we really shouldn't be using them anymore:
- image
- ntext
- text
These types are on the deprecated list for many reasons, and have held a permanent spot on that list since being replaced by the max
types way back in SQL Server 2005. Some of these pain points include:
- you can't use many string functions, like
LEFT()
,RTRIM()
,UPPER()
, and most comparison operators; - you need to use functions like
TEXTPTR
,WRITETEXT
, andUPDATETEXT
for modifications; - you can't use the types as local variables;
- you can't reference the columns in
DISTINCT
,GROUP BY
,ORDER BY
, or as an included column (not that you should want to do any of these); - smaller values that could fit in-row can only do so with the
text in row
option.
That is not an exhaustive list; there are other differences that you might consider more or less important. The most pressing reason for me is that you can't rebuild the clustered index online if the table contains one of these data types.
Let's create a simple database with a few tables:
CREATE DATABASE BadIdeas;
GO
USE BadIdeas;
GO
CREATE TABLE dbo.t1(id bigint IDENTITY PRIMARY KEY, msg nvarchar(max));
CREATE TABLE dbo.t2(id bigint IDENTITY PRIMARY KEY, msg ntext);
Now, let's try to perform online operations on the tables:
ALTER TABLE dbo.t1 REBUILD WITH (ONLINE = ON);
GO
ALTER TABLE dbo.t2 REBUILD WITH (ONLINE = ON);
While the first statement succeeds, the second yields an error message like this:
An online operation cannot be performed for index 'PK__t2__3213E83FEEA1E0AD' because the index contains column 'msg' of data type text, ntext, image or FILESTREAM. For a non-clustered index, the column could be an include column of the index. For a clustered index, the column could be any column of the table. If DROP_EXISTING is used, the column could be part of a new or old index. The operation must be performed offline.
In a one-off scenario, the error message is easy enough to deal with: you either skip the table or, if the table absolutely must be rebuilt, you work with your teams to schedule an outage. When you're automating an index maintenance solution or deploying new compression settings across your environment, it is a bit more of a pain in the behind to make your solution handle present or future state.
What to do?
You can start replacing these columns with their more modern counterparts. Here is a query to help you track them down using the sys.columns
catalog view, but you're on your own for any explicit references that might exist in your application code:
SELECT [Schema] = s.name,
[Object] = o.name,
[Column] = c.name,
[Data Type] = TYPE_NAME(c.user_type_id) + CASE
WHEN c.system_type_id <> c.user_type_id
THEN N' (' + TYPE_NAME(c.system_type_id) + N')'
ELSE N'' END
FROM sys.columns AS c
INNER JOIN sys.objects AS o
ON c.[object_id] = o.[object_id]
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
WHERE c.system_type_id IN (34, 35, 99)
ORDER BY [Schema], [Object], [Column];
Output:
It may be tempting to go into SSMS and manually change the data types of these columns, but there may be other implications as well. For example, the columns might have default constraints associated with them. And you may have stored procedures with parameters that should be updated in tandem:
CREATE PROCEDURE dbo.sp1 @p1 ntext AS PRINT 1;
GO
To find all of these cases, you can adapt the above query to search against the sys.parameters
catalog view instead:
SELECT [Schema] = s.name,
[Object] = o.name,
[Parameter] = p.name,
[Data Type] = TYPE_NAME(p.user_type_id) + CASE
WHEN p.system_type_id <> p.user_type_id
THEN N' (' + TYPE_NAME(p.system_type_id) + N')'
ELSE N'' END
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.parameters AS p
ON p.[object_id] = o.[object_id]
WHERE p.system_type_id IN (34, 35, 99)
ORDER BY [Schema], [Object], [Parameter];
Output:
If you need to return this data this across all databases, you can grab sp_ineachdb
, a procedure I wrote (and documented here and here) to overcome several of the limitations in the buggy, undocumented, and unsupported sp_MSforeachdb
. Then you can do this:
EXEC master.dbo.sp_ineachdb @command = N'SELECT [Database] = DB_NAME(),
[Schema] = s.name,
[Object] = o.name,
[Column] = c.name,
[Data Type] = TYPE_NAME(c.user_type_id) + CASE
WHEN c.system_type_id <> c.user_type_id
THEN N'' ('' + TYPE_NAME(c.system_type_id) + N'')''
ELSE N'''' END
FROM sys.columns AS c
INNER JOIN sys.objects AS o
ON c.[object_id] = o.[object_id]
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
WHERE c.system_type_id IN (34, 35, 99)
ORDER BY [Schema], [Object], [Column];
SELECT [Database] = DB_NAME(),
[Schema] = s.name,
[Object] = o.name,
[Parameter] = p.name,
[Data Type] = TYPE_NAME(p.user_type_id) + CASE
WHEN p.system_type_id <> p.user_type_id
THEN N'' ('' + TYPE_NAME(p.system_type_id) + N'')''
ELSE N'''' END
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.parameters AS p
ON p.[object_id] = o.[object_id]
WHERE p.system_type_id IN (34, 35, 99)
ORDER BY [Schema], [Object], [Parameter];';
An interesting side note here: if you do run that against all databases, you will discover that, even in SQL Server 2019, Microsoft is still using some of these old types.
You could further automate that by running it from PowerShell or whatever automation tool you use to manage multiple instances of SQL Server.
Of course, that is just the beginning – it only produces a list. You could further extend it to generate a draft version of the ALTER TABLE
commands you would need to update all the tables, but those commands would need to be reviewed before you execute them, and you'd still need to modify the procedures yourself (generating ALTER PROCEDURE
commands that only have those parameter type names replaced correctly is not an easy exercise by any means). Here is an example that generates ALTER TABLE
commands, taking into account nullability but no other complications like default constraints:
SELECT N'ALTER TABLE ' + QUOTENAME(s.name)
+ N'.' + QUOTENAME(o.name)
+ N' ALTER COLUMN ' + QUOTENAME(c.name) + N' '
+ CASE c.system_type_id
WHEN 34 THEN N'varbinary'
WHEN 35 THEN N'varchar'
WHEN 99 THEN N'nvarchar'
END + N'(max)'
+ CASE c.is_nullable
WHEN 0 THEN N' NOT'
ELSE N'' END + N' NULL;'
FROM sys.columns AS c
INNER JOIN sys.objects AS o
ON c.[object_id] = o.[object_id]
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
WHERE c.system_type_id IN (34, 35, 99);
Output:
And in case you're wondering, no, you can't do this one-time operation with an explicit WITH (ONLINE = ON)
option:
The online ALTER COLUMN operation cannot be performed for table 't2' because column 'msg' currently has or is getting altered into an unsupported datatype: text, ntext, image, CLR type or FILESTREAM. The operation must be performed offline.
Conclusion
Hopefully this provides some good background on why you want to eliminate these deprecated types, and a starting point to actually make the changes. Microsoft learned the hard way that there isn't much functionality they can just rip out of the product, so I'm not concerned that these will ever actually cease to exist in my lifetime. But fear of removal shouldn't be your only motivator.