Itzik Ben-Gan

Fundamentals of table expressions, Part 10 – Views, SELECT *, and DDL changes

July 14, 2021 by in T-SQL Queries | No Comments
SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Erin Stellato is a Principal Consultant with SQLskills and a Microsoft Data Platform MVP.

Erin’s Posts

As part of the series on table expressions, last month I started the coverage of views. Specifically, I started the coverage of logical aspects of views, and compared their design with that of derived tables and CTEs. This month I’m going to continue the coverage of logical aspects of views, focusing my attention on SELECT * and DDL changes.

The code that I’ll be using in this article can be executed in any database, but in my demos, I’ll be using TSQLV5—the same sample database that I used in previous articles. You can find the script that creates and populates TSQLV5 here, and its ER diagram here.

Using SELECT * in the view’s inner query is a bad idea

In the conclusion section of last month’s article I posed a question as food for thought. I explained that earlier in the series I made a case in favor of using SELECT * in the inner table expressions used with derived tables and CTEs. See Part 3 in the series for details if you need to refresh your memory. I then asked you to think whether the same recommendation would still be valid for the inner table expression used to define view. Perhaps the title of this section was already a spoiler, but I’ll say right of the bat that with views it’s actually a very bad idea.

I’ll start with views that are not defined with the SCHEMABINDING attribute, which prevents relevant DDL changes to dependent objects, and then explain how things change when you do use this attribute.

I’ll jump straight to an example since this will be the easiest way to present my argument.

Use the following code to create a table called dbo.T1, and a view called dbo.V1 based on a query with SELECT * against the table:

USE TSQLV5;

DROP VIEW IF EXISTS dbo.V1;
DROP TABLE IF EXISTS dbo.T1;
GO

CREATE TABLE dbo.T1
(
  keycol INT NOT NULL IDENTITY
    CONSTRAINT PK_T1 PRIMARY KEY,
  intcol INT NOT NULL,
  charcol VARCHAR(10) NOT NULL
);

INSERT INTO dbo.T1(intcol, charcol) VALUES
  (10, 'A'),
  (20, 'B');
GO

CREATE OR ALTER VIEW dbo.V1
AS
  SELECT *
  FROM dbo.T1;
GO

Observe that the table currently has the columns keycol, intcol and charcol.

Use the following code to query the view:

SELECT * FROM dbo.V1;

You get the following output:

keycol      intcol      charcol
----------- ----------- ----------
1           10          A
2           20          B

Nothing too special here.

When you create a view, SQL Server records metadata information in a number of catalog objects. It records some general information, which you can query via sys.views, the view definition which you can query via sys.sql_modules, column information which you can query via sys.columns, and more information is available through other objects. What’s also relevant to our discussion is that SQL Server lets you control access permissions against views. What I want to warn you about when using SELECT * in the view’s inner table expression, is what can happen when DDL changes are applied to underlying dependent objects.

Use the following code to create a user called user1 and grant the user with permissions to select the columns keycol and intcol from the view, but not charcol:

DROP USER IF EXISTS user1;

CREATE USER user1 WITHOUT LOGIN; 

GRANT SELECT ON dbo.V1(keycol, intcol) TO user1;

At this point, let’s inspect some of the recorded metadata related to our view. Use the following code to return the entry representing the view from sys.views:

SELECT SCHEMA_NAME(schema_id) AS schemaname, name, object_id, type_desc
FROM sys.views
WHERE object_id = OBJECT_ID(N'dbo.V1');

This code generates the following output:

schemaname  name  object_id   type_desc
----------- ----- ----------- ----------
dbo         V1    130099504   VIEW

Use the following code to get the view definition from sys.modules:

SELECT definition 
FROM sys.sql_modules
WHERE object_id = OBJECT_ID(N'dbo.V1');

Another option is to use the OBJECT_DEFINITION function like so:

SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.V1'));

You get the following output:

CREATE   VIEW dbo.V1
AS
  SELECT *
  FROM dbo.T1;

Use the following code to query the view’s column definitions from sys.columns:

SELECT name AS column_name, column_id, TYPE_NAME(system_type_id) AS data_type
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.V1');

As expected, you get information on the view’s three columns keycol, intcol and charcol:

column_name  column_id   data_type
------------ ----------- ----------
keycol       1           int
intcol       2           int
charcol      3           varchar

Observe the column IDs (ordinal positions) that are associated with the columns.

You can get similar information by querying the standard information schema view INFORMATION_SCHEMA.COLUMNS, like so:

SELECT COLUMN_NAME, ORDINAL_POSITION, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = N'dbo'
  AND TABLE_NAME = N'V1';

To get the view’s dependency information (objects it refers to), you can query sys.dm_sql_referenced_entities, like so:

SELECT
  OBJECT_NAME(referenced_id) AS referenced_object,
  referenced_minor_id,
  COL_NAME(referenced_id, referenced_minor_id) AS column_name
FROM sys.dm_sql_referenced_entities(N'dbo.V1', N'OBJECT');

You will find the dependency on the table T1 and on its three columns:

referenced_object  referenced_minor_id column_name
------------------ ------------------- -----------
T1                 0                   NULL
T1                 1                   keycol
T1                 2                   intcol
T1                 3                   charcol

As you could probably guess, the reference_minor_id value for columns is the column ID you saw earlier.

If you want to get the permissions of user1 against V1, you can query sys.database_permissions, like so:

SELECT 
  OBJECT_NAME(major_id) AS referenced_object,
  minor_id,
  COL_NAME(major_id, minor_id) AS column_name, 
  permission_name
FROM sys.database_permissions
WHERE major_id = OBJECT_ID(N'dbo.V1')
  AND grantee_principal_id = USER_ID(N'user1');

This code generates the following output, affirming that indeed user1 has select permissions only against keycol and intcol, but not against charcol:

referenced_object  minor_id    column_name  permission_name
------------------ ----------- ------------ ----------------
V1                 1           keycol       SELECT
V1                 2           intcol       SELECT

Again, the minor_id value is the column ID that you saw earlier. Our user, user1, has permissions to the columns whose IDs are 1 and 2.

Next, run the following code to impersonate user1 and to try and query all of V1’s columns:

EXECUTE AS USER = N'user1';

SELECT * FROM dbo.V1;

As you would expect, you get a permission error due to the lack of permission to query charcol:

Msg 230, Level 14, State 1, Line 141
The SELECT permission was denied on the column 'charcol' of the object 'V1', database 'TSQLV5', schema 'dbo'.

Try querying only keycol and intcol:

SELECT keycol, intcol FROM dbo.V1;

This time the query runs successfully, generating the following output:

keycol      intcol
----------- -----------
1           10
2           20

No surprises so far.

Run the following code to revert to your original user:

REVERT;

Now let’s apply a few structural changes to the underlying table dbo.T1. Run the following code to first add two columns called datecol and binarycol, and then to drop the column intcol:

ALTER TABLE dbo.T1
  ADD datecol DATE NOT NULL DEFAULT('99991231'),
      binarycol VARBINARY(3) NOT NULL DEFAULT(0x112233);
      
ALTER TABLE dbo.T1
  DROP COLUMN intcol;

SQL Server did not reject the structural changes to columns that are referenced by the view since the view was not created with the SCHEMABINDING attribute. Now, for the catch. At this point, SQL Server did not yet refresh the view’s metadata information in the different catalog objects.

Use the following code to query the view, still with your original user (not user1 yet):

SELECT * FROM dbo.V1;

You get the following output:

keycol      intcol     charcol
----------- ---------- ----------
1           A          9999-12-31
2           B          9999-12-31

Notice that intcol actually returns charcol's content and charcol returns datecol's content. Remember, there's no intcol in the table anymore but there is datecol. Also, you don't get back the new column binarycol.

To try and figure out what’s going on, use the following code to query the view’s column metadata:

SELECT name AS column_name, column_id, TYPE_NAME(system_type_id) AS data_type
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.V1');

This code generates the following output:

column_name  column_id   data_type
------------ ----------- ----------
keycol       1           int
intcol       2           int
charcol      3           varchar

As you can see, the view’s metadata is still not refreshed. You can see intcol as column ID 2 and charcol as column ID 3. In practice, intcol doesn't exist anymore, charcol is supposed to be column 2, and datecol is supposed to be column 3.

Let’s check if there’s any change with permission information:

SELECT 
  OBJECT_NAME(major_id) AS referenced_object,
  minor_id,
  COL_NAME(major_id, minor_id) AS column_name, 
  permission_name
FROM sys.database_permissions
WHERE major_id = OBJECT_ID(N'dbo.V1')
  AND grantee_principal_id = USER_ID(N'user1');

You get the following output:

referenced_object  minor_id    column_name  permission_name
------------------ ----------- ------------ ----------------
V1                 1           keycol       SELECT
V1                 2           intcol       SELECT

Permission info shows that user1 has permissions to columns 1 and 2 in the view. However, even though metadata thinks that column 2 is called intcol, it's actually mapped to charcol in T1 in practice. That’s dangerous since user1 is not supposed to have access to charcol. What if in real life this column holds sensitive information like passwords.

Let’s impersonate user1 again, and query all view columns:

EXECUTE AS USER = 'user1';

SELECT * FROM dbo.V1;

You do get a permission error saying you don’t have access to charcol:

Msg 230, Level 14, State 1, Line 211
The SELECT permission was denied on the column 'charcol' of the object 'V1', database 'TSQLV5', schema 'dbo'.

However, see what happens when you explicitly ask for keycol and intcol:

SELECT keycol, intcol FROM dbo.V1;

You get the following output:

keycol      intcol
----------- ----------
1           A
2           B

This query succeeds, only it returns the content of charcol under intcol. Our user, user1, is not supposed to have access to this information. Oops!

At this point, revert back to the original user by running the following code:

REVERT;

Refresh SQL module

You can clearly see that using SELECT * in the view’s inner table expression is a bad idea. But it’s not only that. Generally, it’s a good idea to refresh the view’s metadata after every DDL change to referenced objects and columns. You can do so using sp_refreshview or the more general sp_refreshmodule, like so:

EXEC sys.sp_refreshsqlmodule N'dbo.V1';

Query the view again, now that its metadata has been refreshed:

SELECT * FROM dbo.V1;

This time you do get the expected output:

keycol      charcol    datecol    binarycol
----------- ---------- ---------- ---------
1           A          9999-12-31 0x112233
2           B          9999-12-31 0x112233

The column charcol is named correctly and shows the correct data; you don’t see intcol, and you do see the new columns datecol and binarycol.

Query the view’s column metadata:

SELECT name AS column_name, column_id, TYPE_NAME(system_type_id) AS data_type
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.V1');

The output now shows correct column metadata information:

column_name  column_id   data_type
------------ ----------- ----------
keycol       1           int
charcol      2           varchar
datecol      3           date
binarycol    4           varbinary

Query user1’s permissions against the view:

SELECT 
  OBJECT_NAME(major_id) AS referenced_object,
  minor_id,
  COL_NAME(major_id, minor_id) AS column_name, 
  permission_name
FROM sys.database_permissions
WHERE major_id = OBJECT_ID(N'dbo.V1')
  AND grantee_principal_id = USER_ID(N'user1');

You get the following output:

referenced_object  minor_id    column_name  permission_name
------------------ ----------- ------------ ----------------
V1                 1           keycol       SELECT

Permission information is now correct. Our user, user1, has permissions only to select keycol, and permission information for intcol has been removed.

To be sure all is good, let’s test this by impersonating user1 and querying the view:

EXECUTE AS USER = 'user1';

SELECT * FROM dbo.V1;

You get two permission errors due to the lack of permissions against datecol and binarycol:

Msg 230, Level 14, State 1, Line 281
The SELECT permission was denied on the column 'datecol' of the object 'V1', database 'TSQLV5', schema 'dbo'.

Msg 230, Level 14, State 1, Line 281
The SELECT permission was denied on the column 'binarycol' of the object 'V1', database 'TSQLV5', schema 'dbo'.

Try to query keycol and intcol:

SELECT keycol, intcol FROM dbo.V1;

This time the error correctly says that there’s no column called intcol:

Msg 207, Level 16, State 1, Line 279

Invalid column name 'intcol'.

Query only intcol:

SELECT keycol FROM dbo.V1;

This query runs successfully, generating the following output:

keycol
-----------
1
2

At this point revert back to your original user by running the following code:

REVERT;

Is it enough to avoid SELECT * and use explicit column names?

If you follow a practice that says no SELECT * in the view’s inner table expression, would this be enough to keep you out of trouble? Well, let’s see…

Use the following code to recreate the table and view, only this time list the columns explicitly in the view’s inner query:

DROP VIEW IF EXISTS dbo.V1;
DROP TABLE IF EXISTS dbo.T1;
GO

CREATE TABLE dbo.T1
(
  keycol INT NOT NULL IDENTITY
    CONSTRAINT PK_T1 PRIMARY KEY,
  intcol INT NOT NULL,
  charcol VARCHAR(10) NOT NULL
);

INSERT INTO dbo.T1(intcol, charcol) VALUES
  (10, 'A'),
  (20, 'B');
GO

CREATE OR ALTER VIEW dbo.V1
AS
  SELECT keycol, intcol, charcol
  FROM dbo.T1;
GO

Query the view:

SELECT * FROM dbo.V1;

You get the following output:

keycol      intcol      charcol
----------- ----------- ----------
1           10          A
2           20          B

Again, grant user1 permissions to select keycol and intcol:

GRANT SELECT ON dbo.V1(keycol, intcol) TO user1;

Next, apply the same structural changes like you did before:

ALTER TABLE dbo.T1
  ADD datecol DATE NOT NULL DEFAULT('99991231'),
      binarycol VARBINARY(3) NOT NULL DEFAULT(0x112233);

ALTER TABLE dbo.T1
  DROP COLUMN intcol;

Observe that SQL Server accepted these changes, even though the view has an explicit reference to intcol. Again, that’s because the view was created without the SCHEMABINDING option.

Query the view:

SELECT * FROM dbo.V1;

At this point SQL Server generates the following error:

Msg 207, Level 16, State 1, Procedure V1, Line 5 [Batch Start Line 344]
Invalid column name 'intcol'.

Msg 4413, Level 16, State 1, Line 345
Could not use view or function 'dbo.V1' because of binding errors.

SQL Server attempted to resolve the intcol reference in the view, and of course was unsuccessful.

But what if your original plan was to drop intcol and later add it back? Use the following code to add it back, and then query the view:

ALTER TABLE dbo.T1
  ADD intcol INT NOT NULL DEFAULT(0);

SELECT * FROM dbo.V1;

This code generates the following output:

keycol      intcol      charcol
----------- ----------- ----------
1           0           A
2           0           B

The result seems correct.

How about querying the view as user1? Let’s try it:

EXECUTE AS USER = 'user1';

SELECT * FROM dbo.V1;

When querying all columns, you get the expected error due to the lack of permissions against charcol:

Msg 230, Level 14, State 1, Line 367
The SELECT permission was denied on the column 'charcol' of the object 'V1', database 'TSQLV5', schema 'dbo'.

Query keycol and intcol explicitly:

SELECT keycol, intcol FROM dbo.V1;

You get the following output:

keycol      intcol
----------- -----------
1           0
2           0

It seems like everything is in order thanks to the fact that you didn’t use SELECT * in the view’s inner query, even though you did not refresh the view’s metadata. Still, it could be a good practice to refresh the view’s metadata after DDL changes to referenced objects and columns to be on the safe side.

At this point, revert back to your original user by running the following code:

REVERT;

SCHEMABINDING

Using the SCHEMABINDING view attribute you can save yourself a lot of the aforementioned trouble. One of the keys to avoiding the trouble you saw earlier is to not use SELECT * in the view’s inner query. But there’s also the issue of structural changes against dependent objects, like dropping referenced columns, that could still result in errors when querying the view. Using the SCHEMABINDING view attribute, you won’t be allowed to use SELECT * in the inner query. Moreover, SQL Server will reject attempts to apply relevant DDL changes to dependent objects and columns. In relevant, I mean changes like dropping a referenced table or column. Adding a column to a referenced table is obviously not an issue, so SCHEMABINDING does not prevent such a change.

To demonstrate this, use the following code to recreate the table and view, with SCHEMABINDING in the view definition:

DROP VIEW IF EXISTS dbo.V1;
DROP TABLE IF EXISTS dbo.T1;
GO

CREATE TABLE dbo.T1
(
  keycol INT NOT NULL IDENTITY
    CONSTRAINT PK_T1 PRIMARY KEY,
  intcol INT NOT NULL,
  charcol VARCHAR(10) NOT NULL
);

INSERT INTO dbo.T1(intcol, charcol) VALUES
  (10, 'A'),
  (20, 'B');
GO

CREATE OR ALTER VIEW dbo.V1
  WITH SCHEMABINDING
AS
  SELECT *
  FROM dbo.T1;
GO

You get an error:

Msg 1054, Level 15, State 6, Procedure V1, Line 5 [Batch Start Line 387]
Syntax '*' is not allowed in schema-bound objects.

When using SCHEMABINDING, you’re not allowed to use SELECT * in the view’s inner table expression.

Try to create the view again, only this time with an explicit column list:

CREATE OR ALTER VIEW dbo.V1
  WITH SCHEMABINDING
AS
  SELECT keycol, intcol, charcol
  FROM dbo.T1;
GO

This time the view is created successfully.

Grant user1 permissions on keycol and intcol:

GRANT SELECT ON dbo.V1(keycol, intcol) TO user1;

Next, attempt to apply structural changes to the table. First, add a couple of columns:

ALTER TABLE dbo.T1
  ADD datecol DATE NOT NULL DEFAULT('99991231'),
      binarycol VARBINARY(3) NOT NULL DEFAULT(0x112233);

Adding columns is not a problem because they can't be part of existing schema-bound views, so this code completes successfully.

Attempt to drop the column intcol:

ALTER TABLE dbo.T1
  DROP COLUMN intcol;

You get the following error:

Msg 5074, Level 16, State 1, Line 418
The object 'V1' is dependent on column 'intcol'.

Msg 4922, Level 16, State 9, Line 418
ALTER TABLE DROP COLUMN intcol failed because one or more objects access this column.

Dropping or altering referenced columns is disallowed when schema-bound objects exist.

If you still need to drop intcol, you will have to drop the schema-bound referencing view first, apply the change, and then recreate the view and reassign permissions, like so:

DROP VIEW IF EXISTS dbo.V1;
GO

ALTER TABLE dbo.T1 DROP COLUMN intcol;
GO

CREATE OR ALTER VIEW dbo.V1
  WITH SCHEMABINDING
AS
  SELECT keycol, charcol, datecol, binarycol
  FROM dbo.T1;
GO

GRANT SELECT ON dbo.V1(keycol, datecol, binarycol) TO user1;
GO

Of course at this point there’s no need to refresh the view definition, because you created it anew.

Now that you’re done testing, run the following code for cleanup:

DROP VIEW IF EXISTS dbo.V1;
DROP TABLE IF EXISTS dbo.T1;
DROP USER IF EXISTS user1;

Summary

Using SELECT * in the view’s inner table expression is a very bad idea. After structural changes are applied to referenced objects, you could get incorrect column names, and even allow users to access data that they’re not supposed to have access to. It’s an important practice to explicitly list the referenced column names.

Using SCHEMABINDING in the view definition you’re forced to explicitly list column names, and relevant structural changes to dependent objects are rejected by SQL Server. Therefore, it might seem like creating views with SCHEMBINDING is always a good idea. However, there is a caveat with this option. As you saw, applying structural changes to referenced objects when SCHEMBINDING is used becomes a longer, more elaborate process. It can especially be an issue in systems that have to have very high availability. Imagine that you need to change a column defined as VARCHAR(50) to VARCHAR(60). That’s not an allowed change if there’s a view defined with SCHEMABINDING referencing this column. The implications of dropping a bunch of referencing views, which could be referenced by other views, and so on, could be problematic for the system. In short, it’s not always that trivial for companies to just adopt a policy that says that SCHEMABINDING should be used in all objects that support it. However, adopting a policy to not use SELECT * in views’ inner queries should be more straightforward.

There’s lots more to explore regarding views. To be continued next month…