An SQL text by Erland Sommarskog, SQL Server MVP. Latest
revision: 2023-07-04.
Copyright applies to this text. See here for font conventions used in this article.
This is an article that is intended to get you started with passing table-valued parameters (TVPs) to SQL Server from .NET. I begin by presenting how you use table-valued parameters in SQL Server itself, and then I give a quick overview of the mechanisms to pass TVPs from ADO .NET to SQL Server.
The main meat of this article is two real-world examples where I use TVPs. The first example is the classical problem of passing a comma-separated list of values to SQL Server, this time through a table-valued parameter. You will be amazed at how simple it is. In the second example I show two ways to load a file with master-detail data into tables in SQL Server. In addition to the examples, there is also some discussion on how you can improve performance when loading large amounts of data.
Despite the appearance of .NET in the title of this article, there is a final chapter that explores the possibilities in other APIs, of which some support TVPs and some do not. In this chapter I briefly discuss workarounds when TVPs are not available to you.
Table of Contents
Table-Valued Parameters in T‑SQL
Passing Table-Valued Parameters from ADO .NET
Sending a Comma-Separated List to SQL Server
Looking Closer at the SqlMetaData Constructors
Loading Data through Table-Valued Parameters
Take One: Reading the File Into a List
Using Table-Valued Parameters from Other APIs
Some of the sample code in this article refers to the Northwind database. Use this script to create it. This is a very small database, less than 10 MB in size.
Note: This article was originally published as Arrays and Lists in SQL Server 2008 as an accompanying article to my other Arrays and Lists article. I have since renamed it, so that the title better reflects the contents.
Let's first look at how to use TVPs in T‑SQL without involving a client. To be able to declare a TVP, you first need to create a table type like this:
CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)
That is, after CREATE TYPE you specify the type name followed by AS TABLE and then comes the table definition, using the same syntax as CREATE TABLE. You cannot use everything you can use with CREATE TABLE, but you can define PRIMARY KEY, UNIQUE and CHECK constraints, you can use IDENTITY and DEFAULT definitions, and you can define computed columns.
Starting with SQL 2014, you can also include index definitions in the type declaration. For instance:
CREATE TYPE tabletype AS TABLE
     (a int NOT NULL PRIMARY KEY,
      b int NOT NULL,
      c int NOT NULL,
      INDEX ix (b, c) WHERE (b > 2)
)
This particular example, though, only runs on SQL 2016 and later; on SQL 2014 you cannot have filtered indexes in a table type.
While it is nothing we will explore in this article, it is worth noting that you can use table types with In-Memory OLTP, in which case you add the MEMORY_OPTIMIZED clause to the type definition:
CREATE TYPE hekatontype AS TABLE(a int NOT NULL PRIMARY KEY NONCLUSTERED) WITH (MEMORY_OPTIMIZED = ON)
This requires SQL 2014 or later.
Once you have this table type, you can use it to declare table variables:
DECLARE @mylist integer_list_tbltype
However, you cannot use the type with CREATE TABLE (it could have been handy with temp tables!), nor can you use it for the declaration of the return table in a multi-statement table function. The raison d'être for table types is to make it possible to declare table-valued parameters for stored procedures or user-defined functions. Here is one example:
CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
   SELECT p.ProductID, p.ProductName
   FROM   Northwind.dbo.Products p
   WHERE  p.ProductID IN (SELECT n FROM @prodids)
The body of the procedure brings no surprises. The code looks just as it would have if @prodids had been a local table variable. The parameter declaration on the other hand includes a keyword hitherto not seen in this context: READONLY. This keyword means what it says: you cannot modify the contents of the table parameter in any way in the procedure. As an aside, this restriction makes TVPs far less useful than they could have been; often you want to pass data between stored procedures as I discuss in my article How to Share Data Between Stored Procedures. However, for the task at hand, passing data from a client, the READONLY restriction is no major obstacle.
Calling this procedure is straightforward:
DECLARE @mylist integer_list_tbltype
INSERT  @mylist(n) VALUES(9),(12),(27),(37)
EXEC    get_product_names @mylist
You can also use TVPs with sp_executesql:
DECLARE @mylist integer_list_tbltype,
        @sql    nvarchar(MAX)
SELECT @sql = N'SELECT p.ProductID, p.ProductName
                FROM   Northwind..Products p
                WHERE  p.ProductID IN (SELECT n FROM @prodids)'
INSERT @mylist VALUES(9),(12),(27),(37)
EXEC sp_executesql @sql, N'@prodids integer_list_tbltype READONLY', @mylist
There are a few peculiarities, though. This does not work:
EXEC get_product_names NULL
but results in this error message:
Msg 206, Level 16, State 2, Procedure get_product_names, Line 0
Operand type clash: void type is incompatible with integer_list_tbltype
It is quite logical when you think of it: NULL is a scalar value, not a table value. But what do you think about this:
EXEC get_product_names
You may expect this to result in an error due to the missing parameter, but instead this runs and produces an empty result set! The same happens with:
EXEC get_product_names DEFAULT
The scoop is that a table-valued parameter always has the implicit default value of an empty table. Whether this is good or bad can be disputed, but if there were to be explicit default values, Microsoft would have to invent a lot of syntax for it. And in most cases, the default value you want is probably the empty table, so it is not entirely unreasonable.
One thing with table types which is not apparent is that you need permission to use a table type. This can be demonstrated in this script:
CREATE USER testuser WITHOUT LOGIN
go
EXECUTE AS USER = 'testuser'
go
DECLARE @p integer_list_tbltype
go
REVERT
go
DROP USER testuser
(What we do here is to create a loginless user, and then impersonate that user. This is a quick way to test permissions. For more details on impersonation, see my article Giving Permissions through Stored Procedures.)
The output from this script is puzzling:
Msg 229, Level 14, State 5, Line 1
The EXECUTE permission was denied on the object 'integer_list_tbltype', database 'tempdb', schema 'dbo'.
But the message is to be taken to the letter. To be able to use a table type, you need to have EXECUTE permission on the type. (This does not apply to normal scalar types, but it does apply to user-defined CLR types.) To grant permission on a type, the syntax is:
GRANT EXECUTE ON TYPE::integer_list_tbltype TO testuser
The TYPE:: prefix is needed to specify the object class.
It is maybe not that surprising that you cannot use table-valued parameters across linked servers, given that there are many restrictions with linked servers. But it does not stop there: you cannot even use table-valued parameters across databases. If you try something like:
USE tempdb
go
CREATE TYPE tobbe AS TABLE (a int NOT NULL PRIMARY KEY)
go
CREATE PROCEDURE tobbe_sp @t tobbe READONLY AS
   SELECT a FROM @t
go
USE otherdb
go
CREATE TYPE tobbe AS TABLE (a int NOT NULL PRIMARY KEY)
go
DECLARE @t tobbe
EXEC tempdb..tobbe_sp @t
The error message is:
Msg 206, Level 16, State 2, Procedure tobbe_sp, Line 0
Operand type clash: tobbe is incompatible with tobbe
And if you try
CREATE PROCEDURE tobbe_sp @t otherdb.dbo.tobbe READONLY AS
   SELECT a FROM @t
You get the message:
Msg 117, Level 15, State 2, Procedure tobbe_sp, Line 1
The type name 'otherdb.dbo.tobbe' contains more than the maximum number of prefixes. The maximum is 1.
This somewhat awkward message informs us that the data type for a parameter cannot be a three-part name with a database component.
You cannot create stored procedures or user-defined functions in the CLR that take a table-valued parameter. But the other way works: you can call a T‑SQL procedure with a TVP from a CLR procedure, using the same mechanisms you use from a client and which is what we will look at next.
Passing values to TVPs from ADO .NET is very straightforward, and requires very little extra code compared to passing data to regular parameters. You need .NET Framework 3.5 SP1 or higher to have support for TVPs. You can only use TVPs with SqlClient; you cannot use TVPs with the classes in the System.Data.OleDb or System.Data.Odbc namespaces.
The specifics can be summarised as follows: you add the parameter with the type SqlDbType.Structured, you set the parameter's TypeName property to the name of the table type in SQL Server, and as the value you pass an object that exposes the data in a suitable form.
Exactly what is suitable then? There is an MSDN topic that suggests that the three choices are List&lt;SqlDataRecord&gt;, a DataTable or a DbDataReader. It turns out that this is not the full story. I have not been able – and nor have I really tried – to find the exact requirements, but it seems that you can pass anything that implements IEnumerable and IDataRecord, and then DataTable is a special case that goes beyond that. Exactly what you can use and not use is not particularly interesting. I would suggest that in practice you will use one of these four: a List&lt;SqlDataRecord&gt;, a custom-written class that implements IEnumerable&lt;SqlDataRecord&gt;, a DataTable or a DbDataReader.
Of these, you would use the first two for general-purpose programming. The only reason to pass a DataTable is that you already have the data in such an object. If you have the data somewhere else – in a file, on a wire etc – there is no reason to fill a DataTable when you can use a List which is more lightweight. By the same token, the only reason you would use a DbDataReader is because you have a DbDataReader anyway. That is, if the data for your TVP comes from an Oracle database, you can pass an OracleDataReader directly – no need to populate a List or a DataTable.
For this reason, in this article I focus on the first two alternatives, and all examples use either a custom-written class or a List<SqlDataRecord>. But just in case you are sitting with your DataTable and wonder how to use it, here is some sample code, assuming that the name of your DataTable is dt:
cmd.Parameters.Add("@tbl", SqlDbType.Structured).Value = dt;
cmd.Parameters["@tbl"].TypeName = "dbo.MyTableType";
It is that simple. Well, almost. There are two caveats:
Note that I have specified the table type in two-part notation. That is, I have included the schema. This is not likely to matter if you call a stored procedure, but it matters if you submit a batch of SQL text. When you send SQL statements from the client, you should specify all objects that can belong to a schema – tables, views, stored procedures, user-defined functions and also user-defined types – in two-part notation, or else there can be cache littering if users have different default schemas. Best practice is to always specify the schema, no matter what you are calling.
Before we move on, I like to give a quick introduction to the demo files that accompany this article. They are compiled in two zip files, one with the demos in C# and one with the same demos in Visual Basic .NET. Use the language that is the most convenient for you. In the text itself, I sometimes show the code in C# and sometimes in VB .NET. If you are more comfortable with the language I'm not using for the moment, please refer to the corresponding file in the other language. For instance, if I refer to TVPDemo.DemoHelper.cs, you can rely on there also being a TVPDemo.DemoHelper.vb file in the zip file with the VB code.
Besides the source code in C# and VB .NET, the zip files also include an SQL script and a file with sample data for one of the demos. There is also a file compile.bat you can use to compile the files. There are however no project/solution files for Visual Studio, as that goes a little above my head.
I will cover the code in the zip files as we arrive at the examples where it belongs. There is however one class I like to highlight here and now, and that is the TVPDemo.DemoHelper class. This class includes some utility routines that are of little interest for the article as such. But you need to pay attention to the connection string. Here is how it looks as shipped:
private const string connstr = "Application Name=TVPdemo;Integrated Security=SSPI;" +
                               "Data Source=.;Initial Catalog=tempdb";
You may have to change it to fit your environment. Particularly, if you only have Express Edition installed, it is likely that you should use .\SQLEXPRESS for Data Source instead of the single dot.
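For instance, assuming a default installation of Express Edition with the instance name SQLEXPRESS (your instance name may differ), the string might read:

private const string connstr = "Application Name=TVPdemo;Integrated Security=SSPI;" +
                               "Data Source=.\\SQLEXPRESS;Initial Catalog=tempdb";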
In the article, the code mainly appears without comments, since I explain the code in the text. However, in the source files, the code is thoroughly commented.
Disclaimer: My expertise is in SQL Server, and I only write .NET code left-handedly. While I have done my best to adhere to what I think is best practice, you may see things which make you think "I would never do something like that". It is not unlikely that you will be right. Please let me know in such case!
A common problem is that you have a comma-separated list that you have gotten from somewhere, for instance from a multi-selection control in .NET. It's common to see questions in the forums on how to handle these comma-separated lists in SQL Server, but the correct solution in my opinion is to parse them client-side and feed them to a table-valued parameter. SQL Server should spend its resources on reading and writing tabular data, not string processing. Not only are the resources better spent this way, the solution is also much simpler and cleaner with the help of the class CSV_splitter that I will introduce.
Using the CSV_splitter class is extremely simple. All your application code sees is a call to the constructor and that's it. Here is an example where we call a stored procedure with a TVP. We use the table type and the stored procedure I used in the T‑SQL section above:
CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)
go
CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
   SELECT p.ProductID, p.ProductName
   FROM   Northwind.dbo.Products p
   WHERE  p.ProductID IN (SELECT n FROM @prodids)
In the set of demo files, you find CSVDemo.vb which includes the procedure CSV_to_SP that calls get_product_names, and it is no more complicated than this.
Private Sub CSV_to_SP()
Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
cmd As SqlCommand = cn.CreateCommand()
cmd.CommandType = CommandType.StoredProcedure
cmd.CommandText = "dbo.get_product_names"
cmd.Parameters.Add("@prodids", SqlDbType.Structured)
cmd.Parameters("@prodids").Direction = ParameterDirection.Input
cmd.Parameters("@prodids").TypeName = "dbo.integer_list_tbltype"
cmd.Parameters("@prodids").Value = _
new TVPDemo.CSV_splitter("9,12,27,37")
Using da As new SqlDataAdapter(cmd), _
ds As new DataSet()
da.Fill(ds)
TVPDemo.DemoHelper.PrintDataSet(ds)
End Using
End Using
End Sub
To a large extent, this is very typical code for calling a stored procedure. We first set up a connection and create a command object. We move on to state which stored procedure to call. The four statements in the middle define the single parameter that get_product_names accepts. Finally, we invoke the procedure, and in this example I have chosen to use DataAdapter.Fill together with a method in my DemoHelper class that prints the result set. In a real-world scenario you may prefer to use ExecuteReader or whatever fits you.
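As a minimal sketch of that alternative, here is what the same call could look like with ExecuteReader in C#. The routine is not in the demo files; it assumes the C# version of DemoHelper from the download, where the connection routine is called setup_connection:

private static void CSV_to_SP_reader() {
   using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
   using (SqlCommand cmd = cn.CreateCommand()) {
      cmd.CommandType = CommandType.StoredProcedure;
      cmd.CommandText = "dbo.get_product_names";

      cmd.Parameters.Add("@prodids", SqlDbType.Structured);
      cmd.Parameters["@prodids"].TypeName = "dbo.integer_list_tbltype";
      cmd.Parameters["@prodids"].Value = new TVPDemo.CSV_splitter("9,12,27,37");

      // Read the result set row by row instead of filling a DataSet.
      using (SqlDataReader reader = cmd.ExecuteReader()) {
         while (reader.Read()) {
            Console.WriteLine("{0}\t{1}", reader.GetInt32(0), reader.GetString(1));
         }
      }
   }
}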
(I assume that most readers are acquainted with Using, but in case you are not: this statement permits you to declare a variable that is accessible in the enclosed block, and when the block exits, any Dispose method of the class will be invoked. This is highly recommendable for SqlConnection and SqlCommand objects. If you just leave it to garbage collection to take care of them, you may swamp SQL Server with a lot of extra connections. Using is available in C# as well, but spelt using.)
The interesting part is the four statements that set up the parameter. The first adds the parameter and defines the type:
cmd.Parameters.Add("@prodids", SqlDbType.Structured)
For a table-valued parameter, you always specify SqlDbType.Structured here.
cmd.Parameters("@prodids").Direction = ParameterDirection.Input
Specifying the direction of the parameter is somewhat superfluous; since TVPs are read-only, Input is the only choice. Nevertheless, I have included it for clarity. Next we introduce the name of the table type in SQL Server, by setting the special parameter property TypeName:
cmd.Parameters("@prodids").TypeName = "dbo.integer_list_tbltype"
Strictly speaking, this is not necessary when calling a stored procedure, since SQL Server knows the type anyway. However, it is definitely best practice to always specify the type. For one thing, this will give you a clearer error message when there is a mismatch between the structure you pass from the client and the table type in SQL Server.
And now – drum roll! – it's time to pass an actual value to the TVP:
cmd.Parameters("@prodids").Value = new TVPDemo.CSV_splitter("9,12,27,37")
You create a new CSV_splitter object which you pass as the parameter value. And as the parameter to the constructor you pass your list of comma-separated integers. Since this is a sample, the list is a constant; in practical code you would of course have a variable here.
All you need to do to get this to work is to put the CSV_splitter class in place. Which is very simple, since the code is included in the download files. You only need to change the namespace to fit your local conventions. The joy is that this class is perfectly reusable, and while I will cover the internals of the class in a second, all you really need to know if you are in a hurry is this: you pass your comma-separated list (and optionally an alternate delimiter) to the constructor, and you assign the resulting object as the value of a Structured parameter whose table type has a single column of an integer type.
You might ask: what if I don't use stored procedures? Can I still use TVPs and the CSV_splitter class? Sure enough. The file CSVDemo.vb also includes this routine:
Private Sub CSV_to_SQL()
   Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
         cmd As SqlCommand = cn.CreateCommand()

      cmd.CommandType = CommandType.Text
      cmd.CommandText = " SELECT p.ProductID, p.ProductName " & _
                        " FROM   Northwind.dbo.Products p " & _
                        " WHERE  p.ProductID IN (SELECT n FROM @prodids)"

      cmd.Parameters.Add("@prodids", SqlDbType.Structured)
      cmd.Parameters("@prodids").Direction = ParameterDirection.Input
      cmd.Parameters("@prodids").TypeName = "dbo.integer_list_tbltype"
      cmd.Parameters("@prodids").Value = _
          New TVPDemo.CSV_splitter("1, 11, 76, 34")

      Using da As new SqlDataAdapter(cmd), _
            ds As new DataSet()
         da.Fill(ds)
         TVPDemo.DemoHelper.PrintDataSet(ds)
      End Using
   End Using
End Sub
It is very similar to CSV_to_SP. The one thing to observe is this line:
cmd.Parameters("@prodids").TypeName = "dbo.integer_list_tbltype"
When you use CommandType.Text, it is compulsory to specify the name of the table type. For stored procedures you can leave it out, but as I noted above, best practice is to always include the type name.
As I discussed above, the object you pass as the value for a TVP must implement IEnumerable&lt;SqlDataRecord&gt;, and the enumerator it returns must implement IEnumerator&lt;SqlDataRecord&gt;; in CSV_splitter, one and the same class implements both interfaces. I guess most .NET programmers understand what this means. In case you don't: an interface consists of a number of members with well-defined signatures, but without any code. To implement an interface, you write a class that includes the members of the interface with exactly those signatures – now with code added. To this you can add other members as you like. You don't have to worry too much about inadvertently leaving something out – the compiler will inform you of any small detail you forget. Some interfaces feature over 20 members, but these two interfaces are quite slender with four methods and one property in total.
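In case interfaces are new to you, here is a tiny, made-up illustration that has nothing to do with TVPs as such; every name in it is invented for the example. The interface only names the members, and the class supplies the code for them:

public interface IGreeter
{
   string Name { get; }              // a property without any code
   string Greet(string whom);        // a method without any code
}

public class EnglishGreeter : IGreeter
{
   // The class provides implementations with exactly the same signatures.
   public string Name { get { return "English"; } }
   public string Greet(string whom) { return "Hello, " + whom + "!"; }
}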
While the main purpose is to feed a table-valued parameter, it is worth noting that since CSV_splitter implements IEnumerable, you can use the class in this way.
foreach (SqlDataRecord rec in new TVPDemo.CSV_splitter("1,2,3,4")) {
   Console.WriteLine (rec.GetInt64(0).ToString());
}
Not that this is particularly useful, but it gives an understanding of what this is all about.
I will now walk you through the inside of the CSV_splitter class, which you find in TVPDemo.CSV_splitter.cs. For reference, here is the using section:
using System;
using System.Collections.Generic;
using Microsoft.SqlServer.Server;
Nothing startling here. (Most people would probably add a few more namespaces, but I refer to some classes in full for clarity.) The class declaration looks like this:
public class CSV_splitter : IEnumerable<SqlDataRecord>, IEnumerator<SqlDataRecord>
That is, this class implements both IEnumerable and IEnumerator. This is possibly disputable; some people may prefer to have one class per interface, but I could not really see the point in this. (I did say that I'm normally not a .NET programmer, didn't I?)
The class has a few private member variables:
string        input;      // The input string.
char          delim;      // The delimiter.
int           start_ix;   // Start position for current list element.
int           end_ix;     // Position for the next list delimiter.
SqlDataRecord outrec;     // The record we use to return data.
input is the comma-separated list itself and delim is the delimiter. start_ix and end_ix keep track of where in the string we are. The most interesting member is outrec. As the snippet above shows, each iteration produces an instance of SqlDataRecord and as we shall see, it comes from this outrec variable.
Here is how the general constructor looks in C#:
public CSV_splitter (string str, char delimiter) {
   this.input  = str;
   this.delim  = delimiter;
   this.outrec = new SqlDataRecord(
         new SqlMetaData("nnnn", System.Data.SqlDbType.BigInt));
   this.Reset();
}
In the VB version, delimiter is an optional parameter. Since older versions of C# do not permit optional parameters, I have instead opted to provide the possibility to instantiate the splitter with a default delimiter by adding a second, so-called chained, constructor that calls the constructor above:
public CSV_splitter (string str) : this(str, ',') {}
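So, for instance, a caller that has a semicolon-separated list would use the two-argument constructor, while the examples above rely on the default comma. The following call is hypothetical, not taken from the demo files:

cmd.Parameters["@prodids"].Value = new TVPDemo.CSV_splitter("9;12;27;37", ';');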
First the constructor saves the input parameters into the private members. Next comes the key part: the constructor creates an instance of SqlDataRecord that matches the table type for the table-valued parameter. The constructor for SqlDataRecord accepts an array of SqlMetaData objects. (Both these classes are in the Microsoft.SqlServer.Server namespace.) These classes are closely related: the raison d'être for SqlMetaData is exactly to describe a single column in an SqlDataRecord and ultimately a column in SQL Server.
SqlMetaData has a whole slew of constructors to accommodate the various data types in SQL Server and I cover some of the variations as we encounter them. For the CSV splitter we use the simplest constructor of them all and pass only the column name and the data type. You may note that the column we define in SqlMetaData differs from the column in integer_list_tbltype on two accounts: the name is different, which does not matter, since SQL Server maps the fields of an SqlDataRecord to the columns of the table type by position, not by name; and the type is bigint rather than int, a choice I made to keep the splitter as reusable as possible. (As we will see later, that choice is not entirely without consequences.)
The last line in the constructor is a call to Reset, which is one of the methods required by the IEnumerator interface. Its task is to initialise start_ix and end_ix, and we set them to values that indicate that we have not started scanning the string yet:
public void Reset() {
   this.start_ix = -1;
   this.end_ix   = -1;
}
Next comes the part of the class that implements IEnumerable. This interface requires the implementation of a single method: GetEnumerator, which should return an object that implements IEnumerator. Since the class implements both interfaces, it returns itself:
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
   return this;
}

public System.Collections.Generic.IEnumerator<SqlDataRecord> GetEnumerator() {
   return this;
}
What is a little tricky is that the interface IEnumerable&lt;T&gt; requires that you also implement the non-generic version (and for some reason, the latter cannot be public).
The interface IEnumerator requires you to implement three methods, MoveNext, Reset and Dispose, and one read-only property, Current. We have already looked at Reset, and Dispose is only there to permit you to explicitly close files or SQL connections without waiting for garbage collection. That leaves MoveNext and Current as the two interesting members.
The purpose of MoveNext is to permit the caller to move to the next value in the iteration, which the caller can then retrieve with Current. MoveNext is a Boolean function and should return true as long as there is a new item to retrieve with Current. If the caller moves past the last item, the method should return false.
Here is what CSV_splitter.MoveNext looks like:
public bool MoveNext() {
   this.start_ix = this.end_ix + 1;
   while (this.start_ix < this.input.Length &&
          this.input[this.start_ix] == this.delim) {
      this.start_ix++;
   }

   if (this.start_ix >= this.input.Length) {
      return false;
   }

   this.end_ix = this.input.IndexOf(this.delim, this.start_ix);
   if (this.end_ix == -1) {
      this.end_ix = this.input.Length;
   }

   return true;
}
The first action is to set start_ix to be one step ahead of end_ix. This is followed by a while loop whose purpose is to skip adjacent delimiters. (Imagine that you have a string like "1,2,,4".) If we at this point find that start_ix is at or beyond the end of the string, we are past the last character, and we return false to the caller to indicate that the iteration is over.
Else start_ix is now at the first character of the next value, and we set end_ix to be at the delimiter following this value. If there is no delimiter after the last value, we pretend that there is one anyway. Since there is at least one more value in this case, we return true.
That is, all we do here is to position start_ix and end_ix. It is in the Current property that we make use of these values. This property should return the same type as IEnumerable&lt;T&gt; was instantiated with, that is, SqlDataRecord. Here is what our Current property looks like:
public SqlDataRecord Current {
   get {
      string str = this.input.Substring(this.start_ix,
                                        this.end_ix - this.start_ix);
      this.outrec.SetInt64(0, Convert.ToInt64(str));
      return this.outrec;
   }
}
We first extract the current list element from the input string, using the positions that MoveNext set up in start_ix and end_ix. We then convert it to Int64, store it in the single column of outrec with SetInt64 and return the record. Note that the property could just as well have created a new SqlDataRecord object on every call, like this:
public SqlDataRecord Current {
   get {
      string str = this.input.Substring(this.start_ix,
                                        this.end_ix - this.start_ix);
      SqlDataRecord outrec = new SqlDataRecord(
            new SqlMetaData("nnnn", System.Data.SqlDbType.BigInt));
      outrec.SetInt64(0, Convert.ToInt64(str));
      return outrec;
   }
}
In that case there would have been no need to have outrec as a variable at class level, but it seemed to me slightly more efficient to create the record once and reuse it.
Just like with GetEnumerator, when you implement IEnumerator&lt;T&gt; you must also implement a non-generic version of Current, and we just let it invoke the generic version:
Object System.Collections.IEnumerator.Current {
   get {
      return this.Current;
   }
}
What you have seen in MoveNext and Current is fairly normal string-parsing code. There is certainly room for all sorts of improvements: multi-character delimiters, alternate delimiters, trim blanks. (A string like "1,2, ,3" will cause a run-time error in Convert.ToInt64.) If you want to handle comma-separated lists of strings, you can easily clone the class – or make the type a parameter or make the class generic. I leave all these ideas as exercises to the reader.
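To give you an idea of how small the changes for a string-list version are, here is a hedged sketch of such a variant. Everything in it that is not in the demo files – the class name CSV_string_splitter, the column name and the suggested table type with a single nvarchar column – is my own invention for the illustration:

// A sketch of a string-list splitter, modelled on CSV_splitter. It assumes a
// table type along the lines of:
//    CREATE TYPE string_list_tbltype AS TABLE (str nvarchar(4000) NOT NULL)
using System;
using System.Collections;
using System.Collections.Generic;
using Microsoft.SqlServer.Server;

public class CSV_string_splitter : IEnumerable<SqlDataRecord>,
                                   IEnumerator<SqlDataRecord>
{
   string        input;      // The input string.
   char          delim;      // The delimiter.
   int           start_ix;   // Start position for the current list element.
   int           end_ix;     // Position of the next delimiter.
   SqlDataRecord outrec;     // The record we use to return data.

   public CSV_string_splitter (string str, char delimiter) {
      this.input  = str;
      this.delim  = delimiter;
      // The structural difference from CSV_splitter: an nvarchar column.
      this.outrec = new SqlDataRecord(
            new SqlMetaData("value", System.Data.SqlDbType.NVarChar, 4000));
      this.Reset();
   }
   public CSV_string_splitter (string str) : this(str, ',') {}

   public void Reset()   { this.start_ix = -1; this.end_ix = -1; }
   public void Dispose() { }

   public bool MoveNext() {
      this.start_ix = this.end_ix + 1;
      if (this.start_ix >= this.input.Length) return false;
      this.end_ix = this.input.IndexOf(this.delim, this.start_ix);
      if (this.end_ix == -1) this.end_ix = this.input.Length;
      return true;
   }

   public SqlDataRecord Current {
      get {
         string str = this.input.Substring(this.start_ix,
                                           this.end_ix - this.start_ix);
         // Values are trimmed; empty elements are passed as empty strings.
         this.outrec.SetString(0, str.Trim());
         return this.outrec;
      }
   }
   Object IEnumerator.Current { get { return this.Current; } }

   public IEnumerator<SqlDataRecord> GetEnumerator() { return this; }
   IEnumerator IEnumerable.GetEnumerator()           { return this; }
}

The only substantial differences from CSV_splitter are the nvarchar column in the SqlMetaData and the SetString call in Current; I have also dropped the loop that skips adjacent delimiters, so empty elements come through as empty strings.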
To conclude, you can see that implementing a custom iterator to feed a TVP is by no means an advanced matter, and we will build on this later in this article.
Before we move on to the next demo, we will learn some more theory. More precisely, we will learn how to deal with IDENTITY columns and how we can specify that the input data is sorted.
The SqlMetaData class has no less than 15 constructors. They control in total 17 read-only properties – i.e., once set you can't change them. To a great extent which constructor to use depends on the data type. For a string or a binary column you use a constructor that includes the maxLength parameter, for a decimal column you need one that exposes scale and precision etc. I am not covering all constructors and properties here, but refer you to the .NET documentation. However, I like to remind you of Dan Guzman's findings that I briefly discussed earlier: for string and binary columns, always set the max length to something > -1, unless you actually have a MAX column.
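To illustrate the point, here are a few of the other constructor forms. The column names, lengths, precision and scale are made up for the example:

SqlMetaData namecol   = new SqlMetaData("name",   SqlDbType.NVarChar, 200);            // nvarchar(200)
SqlMetaData doccol    = new SqlMetaData("doc",    SqlDbType.NVarChar, SqlMetaData.Max); // nvarchar(MAX)
SqlMetaData amountcol = new SqlMetaData("amount", SqlDbType.Decimal, 12, 2);            // decimal(12, 2)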
Here I will discuss four parameters to control special properties for table-valued parameters. They appear in several constructors, and a constructor either has all four or none of these parameters. Here is the C# declaration for the simplest of these constructors:
public SqlMetaData(
   string    name,
   SqlDbType dbType,
   bool      useServerDefault,
   bool      isUniqueKey,
   SortOrder columnSortOrder,
   int       sortOrdinal
)
The first of these parameters, useServerDefault, serves a different purpose than the other three. You may guess from the name what it is all about, but your guess may not be exactly right. When you specify this parameter as true, SQL Server will ignore any value you set for the column and always set the column to its default value. Sounds corny? Here is the scoop: the SqlDataRecord must have exactly as many columns as your table type has. But what if your table type includes an IDENTITY column or a computed column that you cannot assign values to? It is for that sort of column that you specify useServerDefault as true. It's also useful for columns with a default of newid() or NEXT VALUE FOR. (The latter is for sequences, a feature added in SQL 2012.)
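To make this concrete, here is a short C# sketch for a hypothetical table type with an IDENTITY column. The type order_tbltype and all column names are assumptions made up for the example, not something from the demo files:

// Assumed table type, for illustration only:
//    CREATE TYPE order_tbltype AS TABLE
//         (rowid int IDENTITY, custid int NOT NULL, amount decimal(12,2) NOT NULL)
SqlMetaData[] order_tbltype = {
   new SqlMetaData("rowid",  SqlDbType.Int, true, false, SortOrder.Unspecified, -1),
   new SqlMetaData("custid", SqlDbType.Int),
   new SqlMetaData("amount", SqlDbType.Decimal, 12, 2)};

SqlDataRecord order_rec = new SqlDataRecord(order_tbltype);
// Column 0 is the IDENTITY column: with useServerDefault = true we never set it.
order_rec.SetInt32(1, 123);        // custid
order_rec.SetDecimal(2, 99.50m);   // amount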
The other three parameters, isUniqueKey, columnSortOrder and sortOrdinal are related and they exist in order to permit a performance enhancement. But before we can discuss what purpose they serve and how they work, we need to take one step back and look at the declaration for the table type we used with get_product_names.
CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)
The table type has a primary key, and thus it assumes that the values in the TVP are unique. Is this a good thing? To start with, when you design your tables, you should always look for a natural primary key, and this includes table variables and temp tables. One reason is that if you write your code under the assumption that a certain column or a set of columns is unique, you should also state this in the table declaration as an assertion. If your assumption is incorrect, your code will die early and not produce incorrect results.
But there is also a performance aspect. Let's look at the code for get_product_names again:
CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS SELECT p.ProductID, p.ProductName FROM Northwind.dbo.Products p WHERE p.ProductID IN (SELECT n FROM @prodids)
For SQL Server, the query is equivalent to:
SELECT p.ProductID, p.ProductName FROM Northwind.dbo.Products p JOIN @prodids ps ON ps.n = p.ProductID
If the table type would not have a primary key, this would not be true. Instead the equivalent query would be:
SELECT DISTINCT p.ProductID, p.ProductName FROM Northwind.dbo.Products p JOIN @prodids ps ON ps.n = p.ProductID
That is, SQL Server would have to add an operator somewhere to remove duplicate values. This comes with an extra cost. Of course, for four values in the TVP this is entirely negligible, but assume that there are 50 000 values. Now the difference is starting to be measurable.
However, if you test performance as measured from the client side, you will not find any noticeable difference with or without a primary key, from what we have done so far. SQL Server must validate that the incoming data adheres to the primary key constraint, so guess what it does? Yup, it sorts the data. That is, the Sort operator moves from the query to the call. Only if you use the TVP multiple times in the procedure is there an actual gain.
And if you have no knowledge about the data in the list, this is as good as it gets. But if you know that the data is already sorted according to the index, you use the three parameters isUniqueKey, columnSortOrder and sortOrdinal to specify that the data is sorted and how. isUniqueKey should be true in this case. columnSortOrder can take any of the values SortOrder.Unspecified, SortOrder.Ascending and SortOrder.Descending. (The SortOrder enum is in the System.Data.SqlClient namespace.) For sorted data you would use one of the latter two; Unspecified is what you pass when you use this constructor only to be able to specify true for useServerDefault. Finally, sortOrdinal specifies where in the unique key the column appears. Use 0 for the first column in the key, 1 for the second etc. Use ‑1 for SortOrder.Unspecified. (If you want to see examples of this, stay tuned; they are coming.)
You need to use these parameters with care. It goes without saying that you need to ensure that the data you have really is sorted. If you sort the data or create the sort keys yourself, you have control, but it may be precarious to rely on data coming from an outside source to be sorted. If you are mistaken, SQL Server will not let you get away with it, but will produce an error message like this one:
Msg 4819, Severity 16, State: 1, Procedure , Line no: 0
Cannot bulk load. The bulk data stream was incorrectly specified as sorted or the data violates a uniqueness constraint imposed by the target table. Sort order incorrect for the following two rows: primary key of first row: (gamma), primary key of second row: (delta).
There is another thing to watch out for, and in this case SQL Server will stay silent. Inspired by what we have read, we may get the idea to change the constructor for the CSV_splitter class, so that outrec is created in this way:
this.outrec = new SqlDataRecord(
      new SqlMetaData("nnnn", SqlDbType.BigInt, false,
                      true, SortOrder.Ascending, 0));
If you make this change and then run the CSVdemo program, you will find that it runs just fine. But wait! In the procedure CSV_to_SQL there is this line:
cmd.Parameters("@prodids").Value = New TVPDemo.CSV_splitter("1, 11, 76, 34")
Data is out of order, so an error message is to be expected. Still we did not get any. Why? Recall that CSV_splitter uses BigInt to be as reusable as possible, while the table type has an integer column. Because of the data-type mismatch, SQL Server decides to ignore the information that the data is sorted and sorts it anyway. If you change the type to SqlDbType.Int and try again, you will get the error message above.
Thus, to be sure that SQL Server does not decide to sort behind your back, you should make sure that you create the SqlDataRecord object so that it matches your table type exactly. To be precise, you can have a mismatch as long as SQL Server feels that it can trust the conversion to not affect the sort order. If you want to be sure, the simplest way to test is to send data out of order. If you get error message 4819, the plot worked, else it did not. You can also use Profiler: include the event Performance: Showplan XML Statistics Profile and run the application. If you also add SP:StmtCompleted you will see the insertion into the TVP as encrypted text. This helps you to locate the query plan, which should not include a Sort or Hash operator.
Character data is particularly difficult in this context. The first thing to note is that the length must match. That is, if you define the column in .NET as
new SqlMetaData("charcol", SqlDbType.NVarChar, 120, false, true, SortOrder.Ascending, 0);
but the target column is nvarchar(20), SQL Server will ignore your sorting parameters and sort the data. Another complication is that character data can be sorted in many ways, that is, according to different collations. If you look through the constructors for SqlMetaData you will find two parameters localeID and compareOptions which seem like they could be used to specify the collation. I tested this, but I found that they had no effect. From what I can tell, SQL Server assumes that character data is always sorted according to the database collation. If the data you send with the TVP is sorted according to a different collation, you will get an error once there is a deviation. You can of course specify an explicit collation for the column in the table type, and it may save you from error messages about data being non-unique. However, my testing indicates that if a key column has a different collation from the database collation, SQL Server will ignore the sorting parameters and always sort the incoming data stream.
We will look at one more example. This time we will see how we can use table-valued parameters to easily load lots of data to SQL Server. There are several other options for this task: BCP, BULK INSERT, SQL Server Integration Services and the SqlBulkCopy class. But none of these options permit you to send data directly to a stored procedure. We will learn two ways to do this. The plain way where we read the file into memory and a more efficient way where we stream the file to the TVP. I've taken the opportunity to cover some ground beyond the topic of TVPs, so you may learn some other tricks in this chapter as well.
For this example we will look at loading data into these two tables:
CREATE TABLE Albums (AlbumID     int           IDENTITY,
                     Artist      nvarchar(200) NOT NULL,
                     Title       nvarchar(200) NOT NULL,
                     ReleaseDate date          NULL,
                     Length      time(0)       NULL,
                     CONSTRAINT pk_Albums PRIMARY KEY (AlbumID)
)

CREATE TABLE Tracks (AlbumID  int           NOT NULL,
                     TrackNo  tinyint       NOT NULL,
                     Title    nvarchar(200) NOT NULL,
                     Length   time(0)       NULL,
                     CONSTRAINT pk_Tracks PRIMARY KEY (AlbumID, TrackNo),
                     CONSTRAINT fk_Tracks_Albums FOREIGN KEY(AlbumID)
                        REFERENCES Albums(AlbumID)
)
We have a music collection, and Albums includes information about an album, and Tracks details the tracks for the albums. All in all, a fairly typical master-detail scenario. These table definitions, as well as other SQL code in this chapter, are included in the file fileloaddemo.sql which you find among the demo files.
Our task is to load new albums with their tracks into the database, from the file Albums.csv which is also included in the demo files. Here are some sample lines from this file:
A,Adrian Belew,Desire Caught By the Tail,,33:25
T,1,Tango Zebra,458553,
T,2,Laughing Man,332460,
T,3,The Gypsy Zurna,187141,
T,4,Portrait of Margaret,240718,
T,5,Beach Creatures Dancing Like Cranes,197564,
T,6,At the Seaside Cafe,113319,
T,7,Guernica,127216,
T,8,"""Z""",338416,
A,"Al di Meola, John McLaughlin, Paco de Lucia",Friday Night in San Francisco,8/10/1981,42:09
T,1,A. Mediterranean Sundance-B. Rio Ancho,708780,
T,2,Short Tales of the Black Forest,535484,
T,3,Frevo Rasgado,486608,
T,4,Fantasia Suite,541492,
T,5,Guardian Angel (McLaughin),247066,
A,David Bowie,"""Heroes""",14/10/1977,40:56
T,1,Beauty and the Beast,217182,
The first field defines whether the line contains an album (A) or a track (T). On an Album line, the fields are Artist, Album title, Release date and Length in minutes and seconds. On a Track line, the fields are Track number, Track title and Length in milliseconds. That is, the fields are the same as the columns in the tables, except for one thing: there is no AlbumID. It is part of our loading task to assign this id.
As for the format, you can note that some fields are quoted in double quotes, but this happens only when the field includes a comma or a double quote. Some fields include a plethora of double quotes; this occurs when the double quotes are part of the value. (You may recall that the name of David Bowie's classic album from 1977 really is "Heroes" with quotes and all.)
An aside: loading this file with BCP or BULK INSERT would be a bit complicated. To start with, you need to have SQL 2017 or later so that you can use the CSV format option with BULK INSERT (this option is not available with BCP). Furthermore, these tools cannot cope with master-detail formats, so you would have to load the data into a staging table to be able to separate albums and tracks. And this is not as simple as it seems, since it is questionable whether there is a guaranteed way to mirror the line number from the flat file in SQL. And had master and detail rows had a different number of fields, it would have been more or less game over. (As it happens, Excel – which I used to create this file – was kind enough to add an extra comma at the end of the Tracks lines, so this is not an issue here.)
We need two table types and a stored procedure. The table types mirror the file with one addition:
CREATE TYPE Albums_tbltype AS TABLE
     (TempID      int           NOT NULL,
      Artist      nvarchar(200) NOT NULL,
      Title       nvarchar(200) NOT NULL,
      ReleaseDate date          NULL,
      Length      time(0)       NULL,
      PRIMARY KEY (TempID)
)

CREATE TYPE Tracks_tbltype AS TABLE
     (TempID  int           NOT NULL,
      TrackNo tinyint       NOT NULL,
      Title   nvarchar(200) NOT NULL,
      Length  time(0)       NULL,
      PRIMARY KEY (TempID, TrackNo)
)
go
Since there is no album ID in the file, the loading process must assign new ids. As long as we have the file, we know which tracks go with which albums, since the file is ordered. But when we load the data into different tables that order is lost, since tables are unordered objects by definition. For this reason, both table types include a column TempID, which is a temporary ID that uniquely identifies an album during the loading process.
The stored procedure is worth dwelling on for an extra second:
CREATE PROCEDURE LoadAlbums @Albums Albums_tbltype READONLY,
                            @Tracks Tracks_tbltype READONLY AS

   DECLARE @idmap TABLE (TempID  int NOT NULL PRIMARY KEY,
                         AlbumID int NOT NULL UNIQUE)

   SET XACT_ABORT ON

   BEGIN TRANSACTION

   MERGE Albums A
   USING @Albums T ON 1 = 0
   WHEN NOT MATCHED BY TARGET THEN
      INSERT(Artist, Title, ReleaseDate, Length)
         VALUES(T.Artist, T.Title, T.ReleaseDate, T.Length)
   OUTPUT T.TempID, inserted.AlbumID INTO @idmap(TempID, AlbumID)
   ;

   INSERT Tracks(AlbumID, TrackNo, Title, Length)
      SELECT i.AlbumID, T.TrackNo, T.Title, T.Length
      FROM   @Tracks T
      JOIN   @idmap i ON i.TempID = T.TempID

   COMMIT TRANSACTION
go
To start with, the procedure sets up a user-defined transaction so that we don't end up loading only the albums. While I am a strong advocate of error handling, I don't use TRY-CATCH here. In the interest of brevity, I let it suffice with SET XACT_ABORT ON to make sure that any error aborts and rolls back the transaction. (If you want directions for error handling, please see my articles about Error and Transaction Handling in SQL Server.)
What may surprise the reader is the MERGE statement. This is a pure insert operation (for the sake of the example, I am completely ignoring that the album may already be in the database), so why use MERGE? And with that weird condition 1 = 0? By using this condition we make sure that no rows in the source match the target. That is, all rows in @Albums will match the condition WHEN NOT MATCHED BY TARGET, and thus all rows in @Albums will be inserted into Albums. Or in other words, this is a complicated way of saying:
INSERT Albums(Artist, Title, ReleaseDate, Length)
   SELECT Artist, Title, ReleaseDate, Length
   FROM   @Albums
Why all this? The answer lies in the OUTPUT clause. We need to map the TempID in @Albums to the IDENTITY values generated for AlbumID in Albums, so that we can insert the correct AlbumID values into Tracks, and this is the purpose of the table variable @idmap. If you try to make the mapping with INSERT, you will find that this does not work, because in the OUTPUT clause for INSERT you only have access to the columns in the target table. This is different with MERGE; with MERGE you have access to both target and source columns in the OUTPUT clause.
When inserting into Tracks there is no need for extra fireworks, and we can use plain INSERT where we pick up the album IDs from the @idmap table.
There are two example programs to load the file, and we will first look at fileloaddemo1.cs which reads the file into two List&lt;SqlDataRecord&gt;, one for albums and one for tracks. This program starts off with a number of using directives, of which one may be surprising:
using System;
using System.Data;
using System.Data.SqlClient;
using System.Collections.Generic;
using Microsoft.SqlServer.Server;
using Microsoft.VisualBasic.FileIO;
System is needed as always of course, and System.Data includes SqlDbType and more. SqlClient is what this text is all about. We need System.Collections.Generic for the class List&lt;T&gt;, and as noted previously we get SqlDataRecord and SqlMetaData from Microsoft.SqlServer.Server. But staunch fans of C# may be appalled by the appearance of Visual Basic here. As I pointed out above, the format of this file is somewhat complex. While I could have written code to parse the lines on my own, I said to myself: this file has been generated by Excel; there must be code out there that performs this task. So I did a search on Google, and I was quickly pointed to the class TextFieldParser in the namespace Microsoft.VisualBasic.FileIO.
If you want to use this class from C#, you need to add a reference to Microsoft.VisualBasic.dll. VB programmers get this DLL automatically.
Fileloaddemo1 includes two routines of interest, read_file and load_albums. The latter first calls read_file and then calls the stored procedure LoadAlbums. We will look at read_file first. Here is the declaration:
private static void read_file (string filename,
                               out List<SqlDataRecord> albums,
                               out List<SqlDataRecord> tracks) {
It accepts a file name and returns album and track data in the two List parameters. The first few lines in read_file are pretty dull:
System.Globalization.DateTimeFormatInfo no_culture =
    new System.Globalization.DateTimeFormatInfo();
System.Globalization.DateTimeStyles no_datetime_style =
    System.Globalization.DateTimeStyles.None;
int album_no = 0;
The two items from System.Globalization are some jazz needed when we parse the date and time fields, I'll return to them later. The variable album_no is more interesting: this variable will feed the TempID columns in the table parameters.
The next two statements are significantly hotter, because this is where we set up the SqlMetaData definitions that map to our table types:
SqlMetaData[] albums_tbltype =
   { new SqlMetaData("id",       SqlDbType.Int, false, true,
                     SortOrder.Ascending, 0),
     new SqlMetaData("artist",   SqlDbType.NVarChar, 200),
     new SqlMetaData("album",    SqlDbType.NVarChar, 200),
     new SqlMetaData("released", SqlDbType.Date),
     new SqlMetaData("length",   SqlDbType.Time, 0, 0)};

SqlMetaData[] tracks_tbltype =
   { new SqlMetaData("id",      SqlDbType.Int, false, true,
                     SortOrder.Ascending, 0),
     new SqlMetaData("trackno", SqlDbType.TinyInt, false, true,
                     SortOrder.Ascending, 1),
     new SqlMetaData("title",   SqlDbType.NVarChar, 200),
     new SqlMetaData("length",  SqlDbType.Time, 0, 0)};
In contrast to the CSV_splitter class, we don't create any SqlDataRecord at this point; since we are adding to a List, we will need a new SqlDataRecord object each time. Hence, we only create the SqlMetaData arrays in advance.
Here we see some more examples of using the special parameters for the SqlMetaData constructor to specify that the data is sorted. For albums_tbltype there is a single column in the sort key, while for tracks_tbltype there is a composite key, and as you see we pass isUniqueKey as true for both columns and set the parameter sortOrdinal to 0 and 1 respectively. Admittedly, to some extent this contradicts what I said previously about ensuring that the data is sorted. The id columns are no problem; we are generating the id values in our code and we have full control over them. But the track numbers come from the file, and in a real-world scenario we may not be able to rely on the track numbers coming in numeric order.
Here are also examples of SqlMetaData constructors where we specify the length for the string columns. For the time columns we need to use a constructor that exposes scale and precision, even if time only has one of them.
In the C# version we create the lists at this point:
albums = new List<SqlDataRecord>();
tracks = new List<SqlDataRecord>();
(In the VB code this happens in load_albums since VB did not seem to like it when I passed uninitialised variables.)
Next we open the file by creating a TextFieldParser object:
TextFieldParser fp = new TextFieldParser(filename, System.Text.Encoding.Default);
In total, this class offers eight different constructors. For this demo, we use one where we pass the name of the file (there are also constructors that accept a Stream object instead) and the encoding. The default for the TextFieldParser is UTF‑8, but CSV files from Excel appear to always be ANSI files. (And since one of the tracks from "Heroes" is called Neuköln, it matters for the sample file.)
We need to configure our newly created object:
fp.TextFieldType = FieldType.Delimited;
fp.Delimiters = new String[] {","};
fp.HasFieldsEnclosedInQuotes = true;
The TextFieldParser can handle both delimited files and fixed-length formats. Here we set up the file to be comma-delimited. We also specify that there are fields enclosed in quotes. Yet another option, which we don't make use of, is to specify comment tokens.
Once this is done, we have completed the preparations and can read the file.
while (! fp.EndOfData) {
   String[] fields = fp.ReadFields();
The ReadFields method consumes the next set of fields and returns them in a string array. (If the file does not comply with the expected format, the method will throw an exception, but I have not included any error handling to keep the example down in length.) Depending on fields[0] we take different paths:
if (fields[0] == "A") {
   SqlDataRecord album_rec = new SqlDataRecord(albums_tbltype);

   album_rec.SetInt32(0, ++album_no);
   album_rec.SetString(1, fields[1]);
   album_rec.SetString(2, fields[2]);
If we have an A in the first field, we create an SqlDataRecord that aligns with the table type for albums, and then we go on and populate the fields, using various Set methods of the SqlDataRecord class. Above, we save the temporary id (which we first increment), the artist name and the album title. As you see, we refer to the columns by number, starting at 0. If you prefer to access the columns by name, you need to use the GetOrdinal method:
album_rec.SetInt32(album_rec.GetOrdinal("id"), ++album_no);
album_rec.SetString(album_rec.GetOrdinal("artist"), fields[1]);
album_rec.SetString(album_rec.GetOrdinal("album"), fields[2]);
Note here that you need to use the name you specified in the SqlMetaData constructor; you cannot use the names in the table type.
The ReleaseDate column is a little more complex for two reasons: it is permitted to be NULL, and date formats are always problematic. Here is the code:
DateTime releasedate;
bool date_ok = DateTime.TryParseExact(
                  fields[3], "d/MM/yyyy", no_culture,
                  no_datetime_style, out releasedate);
if (date_ok) {
   album_rec.SetDateTime(3, releasedate);
}
else {
   album_rec.SetDBNull(3);
}
We use TryParseExact to see if there is a legit date in the field. It just so happens that the dates in the file are in the format DD/MM/YYYY, because I generated the CSV file with my regional settings set to English (Australia). (Had I used my regular Swedish settings, the CSV file would have had semicolon as delimiter, which would have been less interesting with regards to the double quotes.) When reading dates from text input – be that a file or a text box – you should never assume that all dates are well-formed. There may be a mix of different date formats, and there may be completely bogus dates like 1992-02-30. If the parsing succeeds, we set the date column in album_rec, else we set it to NULL.
When I composed this demo program, the release date proved to be the most difficult thing to get right. It turned out that it is not sufficient to specify an exact date format. When I tested the program, I had switched back to Swedish settings where the date format is YYYY-MM-DD. Eventually I found that I could not leave the third parameter null, but I had to use an explicit value to state that I wanted to ignore regional settings, hence this no_culture. no_datetime_style is the value for an enum parameter which is mandatory with TryParseExact.
The last album field is the length, which is handled similarly:
DateTime length;
bool length_ok = DateTime.TryParseExact(
                    fields[4], new string [] {"h:m:ss", "m:ss"},
                    no_culture, no_datetime_style, out length);
if (length_ok) {
   album_rec.SetTimeSpan(4, length.TimeOfDay);
}
else {
   album_rec.SetDBNull(4);
}
The only thing that is different is that I allow time formats both with and without hours, since an album may exceed one hour in length.
When all this is done, we add the record to the albums list:
albums.Add(album_rec);
The code for dealing with the tracks data is similar with a sanity check added.
else if (fields[0] == "T") {
   if (album_no == 0) {
      throw new Exception("Bad file format: track rows before the first album row!");
   }

   SqlDataRecord track_rec = new SqlDataRecord(tracks_tbltype);

   track_rec.SetInt32(0, album_no);
   track_rec.SetByte(1, Convert.ToByte(fields[1]));
   track_rec.SetString(2, fields[2]);

   if (fields[3] != "") {
      TimeSpan length = TimeSpan.FromMilliseconds(Convert.ToInt32(fields[3]));
      track_rec.SetTimeSpan(3, length);
   }
   else {
      track_rec.SetDBNull(3);
   }

   tracks.Add(track_rec);
}
Here is the code for load_albums:
private static void load_albums() {
   List<SqlDataRecord> albums;
   List<SqlDataRecord> tracks;

   read_file("albums.csv", out albums, out tracks);

   using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
   using (SqlCommand cmd = cn.CreateCommand()) {
      cmd.CommandType = CommandType.StoredProcedure;
      cmd.CommandText = "dbo.LoadAlbums";

      cmd.Parameters.Add("@Albums", SqlDbType.Structured);
      cmd.Parameters["@Albums"].Direction = ParameterDirection.Input;
      cmd.Parameters["@Albums"].TypeName = "dbo.Albums_tbltype";
      cmd.Parameters["@Albums"].Value = albums;

      cmd.Parameters.Add("@Tracks", SqlDbType.Structured);
      cmd.Parameters["@Tracks"].Direction = ParameterDirection.Input;
      cmd.Parameters["@Tracks"].TypeName = "dbo.Tracks_tbltype";
      cmd.Parameters["@Tracks"].Value = tracks;

      cmd.ExecuteNonQuery();
   }
}
It first calls read_file to fill albums and tracks, and then it calls the stored procedure LoadAlbums. The only difference from the code we saw for the comma-separated list is this line:
cmd.Parameters["@Albums"].Value = albums;
That is, we pass a List object and not a custom-written iterator.
While this code may seem trivial, it is worth emphasising the flexibility. Here we pass a List object, but if we would change our mind and want to pass something else, we can do that very easily. In fact, this is exactly what we will do in a second.
The sample file that comes with this article is short; there are only five albums. But imagine that you have a very large file, tens of megabytes in size. With the solution above, you would have to read the entire file into memory before you can start sending the data to SQL Server. Is that really necessary? No, and you might already have guessed how we can approach this. If we can write a custom iterator for a comma-separated list, we should be able to write an iterator that reads the file, so that SqlClient can send a row to SQL Server as soon as we have read it.
Now, the fact that this is a master-detail file makes this a little more complicated. If we have two TVPs, we would need two iterator classes, one for albums and one for tracks. And these classes would both have to read the file, which thus would be read twice. To avoid this, I decided to use a single table type that can accommodate both row types. Depending on how different headers and details are from each other, this can be quite messy. Thankfully, our albums-and-tracks example is quite forgiving in this sense. (At this point I can sense objection from some readers who think that it is possible to have two table types and still only read the file once. Permit me to come back to this idea after I have gone through the streaming example.)
Here is the table type (which you also find in fileloaddemo.sql):
CREATE TYPE AlbumTracks_tbltype AS TABLE
    (TempID      int           NOT NULL,
     TrackNo     tinyint       NOT NULL,
     Artist      nvarchar(200) NULL,
     Title       nvarchar(200) NOT NULL,
     ReleaseDate date          NULL,
     Length      time(0)       NULL,
     PRIMARY KEY (TempID, TrackNo),
     CHECK (TrackNo = 0 AND Artist IS NOT NULL OR
            TrackNo > 0 AND Artist IS NULL AND ReleaseDate IS NULL)
)
I did not add the A/T field to the table type; instead I use TrackNo as the distinguishing column; TrackNo = 0 indicates that this is a header row. I've also added constraints to state rules that are unique for album rows (Artist must be present) and track rows (must not have Artist and ReleaseDate). Such CHECK constraints help to detect errors in the client program.
To use this table type, there is a second stored procedure, similar to the one we looked at previously:
CREATE PROCEDURE LoadAlbums_2 @AlbumTracks AlbumTracks_tbltype READONLY AS

   DECLARE @idmap TABLE (TempID  int NOT NULL PRIMARY KEY,
                         AlbumID int NOT NULL UNIQUE)

   SET XACT_ABORT ON

   BEGIN TRANSACTION

   MERGE Albums A
   USING (SELECT TempID, Artist, Title, ReleaseDate, Length
          FROM   @AlbumTracks
          WHERE  TrackNo = 0) AT ON 1 = 0
   WHEN NOT MATCHED BY TARGET THEN
      INSERT(Artist, Title, ReleaseDate, Length)
         VALUES(AT.Artist, AT.Title, AT.ReleaseDate, AT.Length)
   OUTPUT AT.TempID, inserted.AlbumID INTO @idmap(TempID, AlbumID)
   ;

   INSERT Tracks(AlbumID, TrackNo, Title, Length)
      SELECT i.AlbumID, AT.TrackNo, AT.Title, AT.Length
      FROM   @AlbumTracks AT
      JOIN   @idmap i ON i.TempID = AT.TempID
      WHERE  AT.TrackNo > 0

   COMMIT TRANSACTION
In the demo files, there is a sample program, fileloaddemo2.vb, that calls this procedure. Here is the code that calls LoadAlbums_2:
Private Sub LoadAlbums()
   Using cn As SqlConnection = TVPDemo.DemoHelper.SetupConnection(), _
         cmd As SqlCommand = cn.CreateCommand()

      cmd.CommandType = CommandType.StoredProcedure
      cmd.CommandText = "dbo.LoadAlbums_2"

      cmd.Parameters.Add("@AlbumTracks", SqlDbType.Structured)
      cmd.Parameters("@AlbumTracks").Direction = ParameterDirection.Input
      cmd.Parameters("@AlbumTracks").TypeName = "dbo.AlbumTracks_tbltype"
      cmd.Parameters("@AlbumTracks").Value = _
          New TVPDemo.AlbumReader("Albums.csv")

      cmd.ExecuteNonQuery()
   End Using
End Sub
You have seen this pattern a couple of times now. What is different from fileloaddemo1 is that there is no call to read_file, but instead there is an instantiation of the class TVPDemo.AlbumReader. This is analogous to when we worked with comma-separated strings of integers, and when we look inside TVPDemo.AlbumReader.vb there is a mix of what we saw in CSV_splitter.cs and fileloaddemo1.cs. The most startling difference may be that this time the code I show is in Visual Basic... So I will rush through the code fairly quickly. The important takeaway is that writing a class that streams a file to a TVP is by no means complicated.
Here is the Imports section:
Imports System
Imports System.Data
Imports System.Collections.Generic
Imports Microsoft.SqlServer.Server
Imports Microsoft.VisualBasic.FileIO
Again Microsoft.VisualBasic.FileIO is featured, but I like to remind you that the choice of using the TextFieldParser class is due to the specific file format. While it is likely to be useful for CSV files in general, you may have a file format for which it is less suitable. In particular, you do not need to use this class just because you are streaming to a TVP.
The class declaration:
Public Class AlbumReader
   Implements IEnumerable(Of SqlDataRecord), _
              IEnumerator(Of SqlDataRecord)
is no different from the CSV_splitter. I implement both interfaces in the same class. There are some global members:
Dim AlbumNo As Integer       ' Current album.
Dim fp As TextFieldParser    ' Our file-reading class.
Dim Outrec As SqlDataRecord  ' The record we use to return data.

Dim NoCulture As New System.Globalization.DateTimeFormatInfo
Dim NoDateTimeStyle As System.Globalization.DateTimeStyles = _
    System.Globalization.DateTimeStyles.None
AlbumNo is the TempID for the current album and fp is the object for the file we are reading. Outrec is a single output record that I reuse just like in the CSV_splitter class. Then follows the System.Globalization jazz.
The constructor:
Public Sub New(FileName As String)
   Me.AlbumNo = 0

   Me.Outrec = New SqlDataRecord( _
       New SqlMetaData("id", SqlDbType.Int, False, _
                       True, System.Data.SqlClient.SortOrder.Ascending, 0), _
       New SqlMetaData("trackno", SqlDbType.TinyInt, False, _
                       True, System.Data.SqlClient.SortOrder.Ascending, 1), _
       New SqlMetaData("artist", SqlDbType.NVarChar, 200), _
       New SqlMetaData("title", SqlDbType.NVarChar, 200), _
       New SqlMetaData("released", SqlDbType.Date), _
       New SqlMetaData("length", SqlDbType.Time, 0, 0))

   Me.fp = New TextFieldParser(FileName, System.Text.Encoding.Default)
   fp.TextFieldType = FieldType.Delimited
   fp.Delimiters = New String() {","}
   fp.HasFieldsEnclosedInQuotes = True
End Sub
Again, we take the occasion to state that our TVP is sorted to save SQL Server from sorting when the data arrives. The last few lines set up the TextFieldParser class for reading a CSV file.
Next comes GetEnumerator, and in Visual Basic the two functions must have different names:
Function GetEnumerator_nongeneric() As System.Collections.IEnumerator _
      Implements System.Collections.IEnumerable.GetEnumerator
   Return Me
End Function
Public Function GetEnumerator_generic() As IEnumerator(Of SqlDataRecord) _
      Implements IEnumerable(Of SqlDataRecord).GetEnumerator
   Return Me
End Function
Observe here that this is identical to CSV_splitter.vb, and trust me: the implementation in AlbumReader.cs looks exactly like what I showed you for the CSV_splitter class. That is, as long as you follow the pattern of implementing IEnumerable and IEnumerator in the same class, GetEnumerator will always look the same.
The Reset method is somewhat brutal:
Public Sub Reset() Implements IEnumerator(Of SqlDataRecord).Reset
   Throw New NotImplementedException("AlbumReader.Reset")
End Sub
I could not think of anything to put here. Well, I guess a proper Reset method could restart the file, but I'm not sure I want that to happen. I chanced to see a blog post that used this pattern, which I decided to copy. Since there is no reason why SqlClient would have to call Reset when you pass a TVP, you could always implement Reset this way. (But try to remember to change the class name in the argument to the exception constructor.)
Next we look at the Current property, which is very straightforward here, even if there is some level of noise due to the requirement to have both a generic and a non-generic implementation:
Public ReadOnly Property Current_generic As SqlDataRecord _
      Implements IEnumerator(Of SqlDataRecord).Current
   Get
      Return Me.Outrec
   End Get
End Property

ReadOnly Property Current_nongeneric As Object _
      Implements System.Collections.IEnumerator.Current
   Get
      Return Me.Outrec
   End Get
End Property
In the CSV_splitter class, I put the final extraction in Current, but in this class I have put all work to fill Outrec in MoveNext, and that is probably what you will do most of the time.
Now, if you think of what we have seen so far, there are really only two things you have to craft from scratch when you implement a new custom-iterator for a TVP: the constructor and MoveNext. As for GetEnumerator, Reset and Current, you simply clone them from your previous effort. Oh, I forgot: you need to implement Dispose as well:
Public Sub Dispose() Implements IDisposable.Dispose
   Me.fp.Close()
   Me.fp.Dispose()
End Sub
This time, there is something real to dispose of.
Left to show is the implementation of MoveNext, which is very much a rehash of the loop in read_file above. Therefore, I only include an outline to highlight the one thing that is different: since this is MoveNext, we should return False if we are at end of file, else True.
Public Function MoveNext() As Boolean _
      Implements IEnumerator(Of SqlDataRecord).MoveNext

   If Me.fp.EndOfData Then
      Return False
   End If

   Dim Fields() As String = fp.ReadFields()

   If Fields(0) = "A" Then
      ' ...
   ElseIf Fields(0) = "T" Then
      ' ...
   Else
      Throw New Exception("Illegal record type '" & Fields(0) & "'.")
   End If

   Return True
End Function
We have now looked at two classes that both feed a TVP. While they have a lot in common when we look at the code, there is nevertheless one important distinction. The CSV_splitter class is intended to be a general class that you can reuse in many places. AlbumReader, on the other hand, is specific to a certain problem. You would have to write a new class for every new file or data source you read. And as you have seen, this is no big deal at all. Just remember that if there is a DbDataReader class for your data source, you should pass a data-reader object to your TVP directly; there is no need to write your own class in this case.
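To illustrate that last point, here is a minimal sketch of passing a data reader directly. It uses the integer_list_tbltype from earlier in the article, but the source table, the procedure and the connection strings are only hypothetical, and the same using directives as in the earlier C# listings are assumed.

// A minimal sketch with hypothetical names: a SqlDataReader from a second
// connection is assigned as the TVP value, and SqlClient streams its rows
// to the procedure as they are read. The columns of the reader's result
// set must match the table type.
using (SqlConnection cnSource = new SqlConnection(sourceConnStr))
using (SqlConnection cnTarget = new SqlConnection(targetConnStr))
{
   cnSource.Open();
   cnTarget.Open();

   using (SqlCommand cmdSource = new SqlCommand(
              "SELECT n FROM dbo.SomeSourceTable", cnSource))
   using (SqlDataReader reader = cmdSource.ExecuteReader())
   using (SqlCommand cmdTarget = cnTarget.CreateCommand())
   {
      cmdTarget.CommandType = CommandType.StoredProcedure;
      cmdTarget.CommandText = "dbo.SomeProcedure";
      cmdTarget.Parameters.Add("@Data", SqlDbType.Structured);
      cmdTarget.Parameters["@Data"].TypeName = "dbo.integer_list_tbltype";
      cmdTarget.Parameters["@Data"].Value = reader;   // The reader is streamed.
      cmdTarget.ExecuteNonQuery();
   }
}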
Before you start to stream files all over town, I like to add some words of caution. While the pattern I have shown here is very practical and neat, it is not optimal in every situation. It will serve you well for large files – but not for very large files. I wanted to prove that a streamed file is not buffered in the client, so I wrote a very stupid file reader which just chopped up the file into chunks of 1024 bytes and passed them to a TVP with a binary(1024) column. I was able to load an 80 MB file this way, although it took some time. (But the memory consumption in the client stayed flat, proving that data was indeed streamed.) When I tried a 500 MB file, my reward was a timeout message and a TDS error. I never investigated very closely what the underlying reason was, but I assume that I hit a resource limit. Maybe I triggered an auto-grow of the log file which took too long.
An advantage of TVPs is that they make it simple to implement a polished, well-packaged solution using stored procedures. But keep in mind that the table parameter is intermediate storage. This intermediate storage may be in memory or on disk, depending on how SQL Server decides to handle it, but it is intermediate storage. For this reason, it will always be more efficient if you can load the data directly into the target table through BCP, BULK INSERT or the SqlBulkCopy class. As I noted previously, BCP and BULK INSERT are not able to handle files with formats that require stateful parsing. But since SqlBulkCopy is an API, you have more control, and you could use a class like the TextFieldParser to feed an SqlBulkCopy session and load the data into the target table directly.
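As a rough illustration of that last idea, here is a minimal sketch where the file, the columns and the target table are all hypothetical. For simplicity it collects the parsed rows into a DataTable before calling SqlBulkCopy; for a truly large file you would instead feed SqlBulkCopy through a DbDataReader implementation so that the file is streamed. Using TextFieldParser from C# assumes a reference to Microsoft.VisualBasic.dll.

// A minimal sketch with hypothetical names: parse a delimited file with
// TextFieldParser and load the rows straight into the target table with
// SqlBulkCopy. By default SqlBulkCopy maps columns by ordinal, so the
// DataTable columns must line up with the target table.
DataTable tbl = new DataTable();
tbl.Columns.Add("Artist", typeof(string));
tbl.Columns.Add("Title", typeof(string));

using (TextFieldParser fp = new TextFieldParser("artists.csv"))
{
   fp.TextFieldType = FieldType.Delimited;
   fp.Delimiters = new string[] {","};
   fp.HasFieldsEnclosedInQuotes = true;

   while (!fp.EndOfData)
   {
      string[] fields = fp.ReadFields();
      tbl.Rows.Add(fields[0], fields[1]);
   }
}

using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
using (SqlBulkCopy bulk = new SqlBulkCopy(cn))
{
   bulk.DestinationTableName = "dbo.SomeTargetTable";
   bulk.WriteToServer(tbl);
}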
(In case you are thinking that XML or delimited strings could be an alternative here, permit me to point out that they, too, represent intermediate storage. If you pass a 50 MB XML document, it is very likely that SQL Server will spill it to disk.)
When you insert or update large amounts of data, there is always reason to consider chopping up the operation into batches. This applies no matter whether you are loading data from an outside source like a file, or copying data from one table to another. If for no other reason, it helps to keep the transaction-log size in check. In the context of loading a file through a TVP, this means that you need to call your procedure once per batch. There are two challenges here: you need a way to divide the file into batches, and you need to make the process restartable if a batch fails somewhere in the middle of the file.
I will not go into details here, but let it suffice with a brief discussion. The first point is not too difficult. You could pass the custom-iterator a Stream object and a batch size, and the custom-iterator would read that many lines from the file. Or you could keep it simple: use a moderate batch size and fill a List with one batch at a time, and don't stream at all.
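Here is a minimal sketch of that simple approach, using the integer_list_tbltype from earlier in the article; the procedure dbo.LoadOneBatch, the file numbers.txt (one integer per line) and the batch size are all hypothetical, and the usual using directives (System.IO and System.Collections.Generic in addition to the earlier ones) are assumed.

// A minimal sketch with hypothetical names: read batchSize lines at a time
// into a List<SqlDataRecord> and call the procedure once per batch.
SqlMetaData[] meta = { new SqlMetaData("n", SqlDbType.Int) };

using (StreamReader sr = new StreamReader("numbers.txt"))
using (SqlConnection cn = TVPDemo.DemoHelper.setup_connection())
using (SqlCommand cmd = cn.CreateCommand())
{
   cmd.CommandType = CommandType.StoredProcedure;
   cmd.CommandText = "dbo.LoadOneBatch";
   SqlParameter p = cmd.Parameters.Add("@Batch", SqlDbType.Structured);
   p.TypeName = "dbo.integer_list_tbltype";

   int batchSize = 10000;
   List<SqlDataRecord> batch = new List<SqlDataRecord>();

   string line;
   while ((line = sr.ReadLine()) != null)
   {
      SqlDataRecord rec = new SqlDataRecord(meta);
      rec.SetInt32(0, int.Parse(line));
      batch.Add(rec);

      if (batch.Count == batchSize)
      {
         p.Value = batch;
         cmd.ExecuteNonQuery();          // One call per batch.
         batch = new List<SqlDataRecord>();
      }
   }

   if (batch.Count > 0)                  // The final, partial batch.
   {
      p.Value = batch;
      cmd.ExecuteNonQuery();
   }
}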
Making the process restartable may be more difficult. For a simple MERGE scenario (that is, if-not-exists-insert-else-update) you may accept running part of the file twice. But there are scenarios where re-running part of a file would alter the outcome, for instance when columns are updated incrementally. Or INSERT-only scenarios like the one we have looked at in this article, where a re-run would result in primary-key violations or, even worse, loading of duplicate data. You can add WHERE NOT EXISTS in the stored procedure as a simple way out, but it may prove to have an undesirable performance impact for all loads, not only restarted ones. The best solution is likely to depend on the exact situation.
Finally, let's discuss the specific problem with master-detail files a little more closely. I said previously that with two table types and two iterators, both iterators would have to read the file from start to end. You may object to this statement and suggest that there could be a single class that reads the file and puts the rows into two queues, one for albums and one for tracks. The custom-iterators would read from these queues. But, no, this will not fly. Well, it would fly in the sense that you would be able to load the file. However, you would not achieve the aim of preserving memory in the client process. Why?
Keep in mind that SqlClient sends the data to SQL Server over a single communication line where it has to respect the TDS protocol. And if you look in the TDS specification, you will find that the data for one TVP has to be sent in a single sequence. That is, SqlClient cannot interleave data for the two TVPs, but it will have to read all data for one TVP first. Which means that the data for the other TVP will be buffered in its queue and take up memory, which was exactly what we wanted to avoid. There is simply a law of nature working against us here: the data in the file comes in a different order than we want to process it, and there is no way around it. The best you can do is to stream the detail rows and buffer the header rows (which are likely to be fewer). But this appears to be messy to implement – it may be simpler to use a batchwise implementation with a List<SqlDataRecord>.
It is worth noticing that my solution with a single table type does not overcome the problem of having to reorder the data either. As long as the procedure has not started executing, no reordering has occurred, but all data has been buffered in SQL Server – in memory or in tempdb. Then, when the procedure runs, it scans the table variable twice. Depending on the situation and the hardware configuration, this may be a better – or worse – solution than having the client read the file twice. It goes without saying that if you are facing this scenario, and performance is critical for you, you should benchmark several solutions.
This article has focused on using table-valued parameters with ADO .NET and SqlClient for two reasons: 1) it's a very common environment, and 2) it's very simple to use TVPs from SqlClient. Before I conclude this article, I will give a brief exposé of other APIs and whether they support table-valued parameters. I also discuss what options you have if your API does not support TVPs.
You can use table-valued parameters with ODBC. You need to specify SQL Server Native Client 10.0 or later as your ODBC driver. That is, you need to use the ODBC driver that comes with SQL 2008 or later (and which is freely distributable). You cannot use the old ODBC driver that comes with Windows.
As I have not worked with ODBC myself, I cannot assess how smooth or difficult it is to use TVPs with ODBC. I believe that as with ADO .NET there are two ways to pass a TVP through ODBC: streaming and non-streaming. Just like ADO .NET, ODBC exposes properties to specify that your data is sorted, to avoid sorting in SQL Server when the TVP has a primary key.
Books Online has two examples of using table-valued parameters in the section Table-Valued Parameters (ODBC). There is also a sample on Codeplex.
You can use table-valued parameters with OLE DB, if you use the SQLNCLI10 or SQLNCLI11 provider or later. OLE DB offers two models for passing TVPs. One is the push model, where you create a rowset with the metadata, fill the rowset with your data, and in the regular parameter area you pass the rowset pointer. This is the same basic idea as passing a List with ADO .NET, but you need to write more code. (As always with OLE DB, I'm tempted to say.)
The alternative is the pull model, which essentially is a role reversal: the consumer implements IRowset on its own, whereupon the provider reads from that rowset as if it were a consumer. This model is intended for streaming scenarios, where you get data from an external source and you don't want any intermediate storage in the client.
I can't find anything in Books Online that discusses how to specify that your data source is already sorted, so I don't know if this is possible. I have a suspicion, though, that they rely on general OLE DB functionality. To define a table parameter, you need to use the interface ITableDefinitionWithConstraints, and this interface has a method AddConstraint that permits you to specify a primary key. It is a little embarrassing that I don't know, since I have actually implemented TVPs with OLE DB (see below under Perl).
Whether you can use TVPs if you use the OLE DB Consumer Templates, I don't know. I've only worked with "naked" OLE DB myself, never the consumer templates.
You can find a sample for the pull model on CodePlex. I have not found any sample for the push model, but if you are desperate you can download the source code for my Perl module (see below) where I use it. However, you will find it difficult to see the forest among all the trees there.
Beware that using OLE DB for connecting to SQL Server is deprecated, and the recommendation is that you should use ODBC for accessing SQL Server from native code.
No, you cannot use table-valued parameters with old ADO. Yes, ADO sits on top of OLE DB, and you can use SQLNCLI10 as the OLE DB provider with ADO. But ADO itself has not been updated for the new data types added in SQL 2005 and later, and it cannot work with them.
For a long time, LINQ did not support table-valued parameters. But since some time back you can use TVPs with Entity Framework 6, if you install EntityFrameworkExtras. See the blog post Access Stored Procedure With User Defined Data Type Using Entity Framework by Mitesh Gadhiya for details.
I don't know of any solutions for LINQ to SQL (that is, without EF), but then again I have not looked around.
The newer EF Core does not have support for TVPs as of this writing, but see this GitHub item for possible solutions.
No matter which of the above you use, you can always make a direct call from ADO .NET and bypass that language-integrated thing.
I need to add a disclaimer: since I don't like the very idea of LINQ or Entity Framework (or any other so-called ORM that hides the SQL from the programmer), I don't follow the development in this area, so there is a risk that this section becomes inaccurate over time.
See also the section Further Reading for some useful links in this area.
You can use table-valued parameters with the Microsoft SQL Server JDBC Driver, provided that you use version 6.0 or later. I cannot give any details myself, as all I know about Java is that it is close to Sumatra, but I refer you to Microsoft's documentation.
Microsoft has a PHP driver for SQL Server. The current version as of this writing is 5.6. From what I can understand, it does not support table-valued parameters.
If you use the standard DBI/DBD modules, I doubt that you will find any support for table-valued parameters. However, the best option for connecting to SQL Server – as long as you do not need to support other data sources – is Win32::SqlServer, of which I am the author myself. And, yes, it supports table-valued parameters. However, I found in my performance tests that the performance for passing TVPs is very poor. It took two seconds to pass a TVP with 50 000 values. Compare this with 50-150 ms for ADO .NET. I cannot say whether this is due to OLE DB or my own miserable programming.
As you have realised from this small summary, if you are using VB6, VBA, Access, PHP – and probably a few more environments that I did not list here – you cannot use TVPs directly from your API. If you need to pass a list of values, you could consider using any of the methods that I discuss in my article Arrays and Lists in SQL Server.
If you need to call a stored procedure that takes a table-valued parameter, you can always do this by writing a wrapper procedure that takes a comma-separated list (or an XML document for multi-column TVPs) as a parameter, inserts the data into a table variable and then calls the inner procedure. Say for instance that you need to call get_product_names from VB6:
CREATE PROCEDURE get_product_names_wrapper @prodids nvarchar(MAX) AS
   DECLARE @prodid_table integer_list_tbltype
   INSERT @prodid_table(n)
      SELECT number FROM intlist_to_tbl(@prodids)
   EXEC get_product_names @prodid_table
If you think creating a procedure is too much, you can submit a parameterised command batch:
DECLARE @prodid_table integer_list_tbltype
INSERT @prodid_table(n)
   SELECT number FROM iter_intlist_to_tbl(?)
EXEC get_product_names @prodid_table
If you've never seen a parameterised command before, see the section Running Dynamic SQL From Client Code in my article The Curse and Blessings of Dynamic SQL for a brief introduction.
I like to thank my MVP colleagues who helped me by reviewing my demo program and with other research: Bob Beauchemin, Alejandro Mesa, Greg Low, Daniel Joskovski, Lenni Lobel, Adam Machanic, Erik Ejlskov Jensen and Dan Guzman. I would also like to thank John Meyer and Peter Stewart for valuable suggestions.
If you have opinions, additions or just have spotted a language/grammar error, please mail me at esquel@sommarskog.se. If you have questions about using TVPs, I advise you to post your questions to the appropriate public forum. Which forum you should use depends on the exact nature of your question. If you have questions related to C# and VB .NET, you should use a .NET forum. For questions on ADO .NET, the SQL Server Data Access forum may be the best place, while T‑SQL questions go into the T‑SQL forum.
Here are some more blog posts about table-valued parameters:
SQL Server 2008 Table-Valued Parameters and C# Custom Iterators: A Match Made In Heaven! by SQL Server MVP Lenni Lobel, which includes some approaches that I did not include (because they go beyond my C# abilities).
Using Table Valued Parameters in Entity Framework, a blog post from Ritesh Sharma which shows a workaround for using TVPs with EF if you don't want to use EntityFrameworkExtras.
Renamed the article so that it no longer carries the Arrays and Lists moniker. As a consequence of this, I have rewritten the introduction and ripped out material related to that topic or moved it to the long version of Arrays and Lists in SQL Server.
Improved the constructors for the CSV_splitter class after a suggestion from John Meyer.
Updated the section on JDBC to reflect that the Microsoft SQL Server JDBC Driver now supports table-valued parameters, hooray!
Added a link to a post from Ritesh Sharma on how to use TVPs from .NET, on a suggestion from Peter Stewart.
Updates for new capabilities for table types in SQL 2014/16 and other minor modifications.