Aaron Bertrand

Parse parameter default values using PowerShell – Part 2

SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Paul White is an independent SQL Server consultant specializing in performance tuning, execution plans, and the query optimizer.

Paul’s Posts

[ Part 1 | Part 2 | Part 3 ]

In my last post, I showed how to use TSqlParser and TSqlFragmentVisitor to extract important information from a T-SQL script containing stored procedure definitions. With that script, I left out a few things, such as how to parse out the OUTPUT and READONLY keywords for parameters, and how to parse multiple objects together. Today, I wanted to provide a script that handles those things, mention a few other future enhancements, and share a GitHub repository I created for this work.

Previously, I used a simple example like this:

CREATE PROCEDURE dbo.procedure1
  @param1 AS int = /* comment */ -64
AS PRINT 1;
GO

And with the Visitor code I provided, the output to the console was:

==========================
ProcedureReference
==========================

dbo.procedure1

 
==========================
ProcedureParameter
==========================

Param name: @param1
Param type: int
Default: -64

Now, what if the script passed in looked more like this? It combines the intentionally terrible procedure definition from before with a couple of other elements that you might expect to cause problems, like user-defined type names, two different forms of the OUT/OUTPUT keyword, Unicode in parameter values (and in parameter names!), keywords as constants, and ODBC escape literals.

/* AS BEGIN , @a int = 7, comments can appear anywhere */
CREATE PROCEDURE dbo.some_procedure 
  -- AS BEGIN, @a int = 7 'blat' AS =
  /* AS BEGIN, @a int = 7 'blat' AS = -- */
  @a AS /* comment here because -- chaos */ int = 5,
  @b AS varchar(64) = 'AS = /* BEGIN @a, int = 7 */ ''blat''',
  @c AS int = -- 12 
              6
AS
    -- @d int = 72,
    DECLARE @e int = 5;
    SET @e = 6;
GO

CREATE PROCEDURE [dbo].another_procedure
(
  @p1 AS [int] = /* 1 */ 1,
  @p2 datetime = getdate OUTPUT,-- comment,
  @p3 date = {ts '2020-02-01 13:12:49'},
  @p4 dbo.tabletype READONLY,
  @p5 geography OUT, 
  @p6 sysname = N'学中'
)
AS SELECT 5

The previous script doesn’t quite handle multiple objects correctly, and we need to add a few logical elements to account for OUTPUT and READONLY. Specifically, Output and ReadOnly are not token types, but rather they are recognized as an Identifier. So we need some extra logic to find identifiers with those explicit names within any ProcedureParameter fragment. You might spot a few other minor changes:

    Add-Type -Path "Microsoft.SqlServer.TransactSql.ScriptDom.dll";

    $parser = [Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser]($true)::New(); 

    $errors = [System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]]::New();

    $procedure = @"
    /* AS BEGIN , @a int = 7, comments can appear anywhere */
    CREATE PROCEDURE dbo.some_procedure 
      -- AS BEGIN, @a int = 7 'blat' AS =
      /* AS BEGIN, @a int = 7 'blat' AS = -- */
      @a AS /* comment here because -- chaos */ int = 5,
      @b AS varchar(64) = 'AS = /* BEGIN @a, int = 7 */ ''blat''',
      @c AS int = -- 12 
                  6
    AS
        -- @d int = 72,
        DECLARE @e int = 5;
        SET @e = 6;
    GO

    CREATE PROCEDURE [dbo].another_procedure
    (
      @p1 AS [int] = /* 1 */ 1,
      @p2 datetime = getdate OUTPUT,-- comment,
      @p3 date = {ts '2020-02-01 13:12:49'},
      @p4 dbo.tabletype READONLY,
      @p5 geography OUT, 
      @p6 sysname = N'学中'
    )
    AS SELECT 5
"@

    $fragment = $parser.Parse([System.IO.StringReader]::New($procedure), [ref]$errors);

    $visitor = [Visitor]::New();

    $fragment.Accept($visitor);

    class Visitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragmentVisitor 
    {
      [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragment] $fragment)
      {
        $fragmentType = $fragment.GetType().Name;
        if ($fragmentType -in ("ProcedureParameter", "ProcedureReference"))
        {
          if ($fragmentType -eq "ProcedureReference")
          {
            Write-Host "`n==========================";
            Write-Host "  $($fragmentType)";
            Write-Host "==========================";
          }
          $output     = "";
          $param      = ""; 
          $type       = "";
          $default    = "";
          $extra      = "";
          $isReadOnly = $false;
          $isOutput   = $false;
          $seenEquals = $false;

          for ($i = $fragment.FirstTokenIndex; $i -le $fragment.LastTokenIndex; $i++)
          {
            $token = $fragment.ScriptTokenStream[$i];
            if ($token.TokenType -notin ("MultiLineComment", "SingleLineComment", "As"))
            {
              if ($fragmentType -eq "ProcedureParameter")
              {
                if ($token.TokenType -eq "Identifier" -and 
                    ($token.Text.ToUpper -in ("OUT", "OUTPUT", "READONLY"))
                {
                  $extra = $token.Text.ToUpper();
                  if ($extra -eq "READONLY")
                  {
                    $isReadOnly = $true;
                  }
                  else 
                  {
                    $isOutput = $true;
                  }
                }

                if (!$seenEquals)
                {
                  if ($token.TokenType -eq "EqualsSign") 
                  { 
                    $seenEquals = $true; 
                  }
                  else 
                  { 
                    if ($token.TokenType -eq "Variable") 
                    {
                      $param += $token.Text; 
                    }
                    else
                    {
                      if (!$isOutput -and !$isReadOnly)
                      {
                        $type += $token.Text; 
                      }
                    }
                  }
                }
                else
                { 
                  if ($token.TokenType -ne "EqualsSign" -and !$isOutput -and !$isReadOnly)
                  {
                    $default += $token.Text;
                  }
                }
              }
              else 
              {
                $output += $token.Text.Trim(); 
              }
            }
          }

          if ($param.Length   -gt 0) { $output  = "`nParam name: " + $param.Trim(); }
          if ($type.Length    -gt 0) { $type    = "`nParam type: " + $type.Trim(); }
          if ($default.Length -gt 0) { $default = "`nDefault:    " + $default.TrimStart(); }
          if ($isReadOnly) { $extra = "`nRead Only:  yes"; }
          if ($isOutput)   { $extra = "`nOutput:     yes"; }

          Write-Host $output $type $default $extra;
        }
      }
    }

This code is for demonstration purposes only, and there is zero chance it is the most current. Please see details below about downloading a more recent version.

The output in this case:

==========================
ProcedureReference
==========================
dbo.some_procedure

 
Param name: @a
Param type: int
Default: 5

 
Param name: @b
Param type: varchar(64)
Default: 'AS = /* BEGIN @a, int = 7 */ "blat"'

 
Param name: @c
Param type: int
Default: 6

 

==========================
ProcedureReference
==========================
[dbo].another_procedure

 
Param name: @p1
Param type: [int]
Default: 1

 
Param name: @p2
Param type: datetime
Default: getdate
Output: yes

 
Param name: @p3
Param type: date
Default: {ts '2020-02-01 13:12:49'}

 
Param name: @p4
Param type: dbo.tabletype
Read Only: yes

 
Param name: @p5
Param type: geography
Output: yes

 
Param name: @p6
Param type: sysname
Default: N'学中'

That’s some pretty powerful parsing, even though there are some tedious edge cases and a lot of conditional logic. I’d love to see TSqlFragmentVisitor expanded so some of its token types have additional properties (like SchemaObjectName.IsFirstAppearance and ProcedureParameter.DefaultValue), and see new token types added (like FunctionReference). But even now, this is light years beyond a brute force parser you might write in any language, never mind T-SQL.

There are still a couple of limitations I haven’t addressed yet, though:

  • This only addresses stored procedures. The code to handle all three types of user-defined functions is similar, but there is no handy FunctionReference fragment type, so instead you need to identify the first SchemaObjectName fragment (or the first set of Identifier and Dot tokens) and ignore any subsequent instances. Currently the code in this post will return all the information about the parameters to a function, but it will not return the function’s name. Feel free to use it for singletons or batches containing only stored procedures, but you might find the output confusing for multiple mixed object types. The latest version in the repository below does handle functions perfectly fine.
  • This code doesn’t save state. Outputting to the console within each Visit is easy, but collecting the data from multiple visits, to then pipeline elsewhere, is a little more complex, primarily because of the way the Visitor pattern works.
  • The code above can’t accept input directly. To simplify demonstration here, it’s just a raw script where you paste your T-SQL block as a constant. The eventual goal is to support input from a file, an array of files, a folder, an array of folders, or pulling module definitions from a database. And the output can be anywhere: to the console, to a file, to a database… so the sky’s the limit there. Some of that work has happened in the meantime, but none of that has been written in the simple version you see above.
  • There is no error handling. Again, for brevity and ease of consumption, the code here does not worry about handling inevitable exceptions, though the most destructive thing that can happen in its present form is that a batch won’t appear in the output if it can’t be properly parsed (like CREATE STUPID PROCEDURE dbo.whatever). When we start using databases and/or the file system, proper error handling will become that much more important.

ParamParser on GitHub
https://github.com/AaronBertrand/ParamParserYou might wonder, where am I going to maintain ongoing work on this and fix all of these things? Well, I have put it on GitHub, have tentatively called the project ParamParser, and already have contributors helping with improvements. The current version of the code already looks quite different from the sample above, and by the time you read this some of the limitations mentioned here might already be addressed. I only want to maintain the code in one place; this tip is more about showing a minimal sample of how it can work, and highlighting that there is a project out there dedicated to simplifying this task.

In the next segment, I'll talk more about how my friend and colleague, Will White, helped me get from the standalone script you see above to the much more powerful module you'll find on GitHub.

If you have the need to parse default values from parameters in the meantime, feel free to download the code and try it out. And as I suggested before, experiment on your own, because there are lots of other powerful things you can do with these classes and the Visitor pattern.

[ Part 1 | Part 2 | Part 3 ]