SQL Wizardry Part Four - Passing Lists of Data to SQL Server

Christian Graus

Rate me:

4.74/5 (19 votes)

12 Jan 2014CPOL8 min read

32.3K

299

Ways to pass an arbitrary list of values to SQL Server

In this article, we will take a look at how to pass a list of values into a stored proc and use it to select or insert data. We will look at three ways of doing this, and discuss the pros and cons of each.

Download source files - 1.5 KB

Introduction

Welcome to the fourth instalment of my series of articles on SQL. Today, I intend to address an issue that I often see asked about in forums, that is, how to pass a list of values into a stored proc and use it to select or insert data. We will look at three ways of doing this and discuss the pros and cons of each.

Creating the Tables

First, we’ll need some sample data. For this exercise, we’ll imagine a table that stores names and birthdays and has an id for a key. Here is the table SQL.

SQL

create table birthdays
(
  id int identity,
  firstName varchar(50),
  lastName varchar(50),
  birthday date
)

And here is the SQL to insert some values:

SQL

insert into birthdays values ('Fred', 'Smith', '1969/02/17'), _
('Sarah', 'Smith', '1971/09/15'), ('Mary', 'Smith', _
'1996/07/20'),('Billy', 'Smith', '2000/07/01'), _
('Sarah', 'Jones', '1945/12/26')

As you can see, just using SQL allows us to write one line to insert many values. More on that, later.

Now, the first thing we might want to do, is find people with the same surname or birthday, for example:

SQL

select * from birthdays where lastName = 'smith'

This is obviously easy to do with a parameter in a stored procedure. But what if we had a reason to select a group of values by id, which could be any length? Here is the SQL: select * from birthdays where id in (1,3). But if you were to write a proc like this:

SQL

create proc findById1 @ids varchar(20)
as
begin

	select * from birthdays where id in (@ids)
	
end

That would create fine. Now do this:

SQL

findById1 '1,3'

Your error is:

SQL

Msg 245, Level 16, State 1, Procedure findById1, Line 5
Conversion failed when converting the varchar value '1,3' to data type int.

SQL server cannot split your csv for you. It is literally doing this:

SQL

Select * from birthdays where id = ‘1,3’

Which, of course, is not what we want.

Passing in a CSL

The most obvious thing to do, is to pass the value in and parse it.

SQL

CREATE proc findById2 (@list nvarchar(MAX))
AS
BEGIN

   DECLARE @pos        int,
           @nextpos    int,
           @valuelen   int

   SELECT @pos = 0, @nextpos = 1

   WHILE @nextpos > 0
   BEGIN
      SELECT @nextpos = charindex(',', @list, @pos + 1)
      SELECT @valuelen = CASE WHEN @nextpos > 0
                              THEN @nextpos
                              ELSE len(@list) + 1
                         END - @pos - 1

   	  select * from birthdays where id =  convert(int, substring(@list, @pos + 1, @valuelen))	
      SELECT @pos = @nextpos
   END
END

And we can run it like this:

SQL

findById2 '1,3'

There are a few issues here. Not least is that you get a table for each row. But the code itself sure is ugly. Let’s get it all into a temp table and see how it looks:

SQL

CREATE proc findById3 (@list nvarchar(MAX))
AS
BEGIN

declare @list nvarchar(MAX) = '1,3'

	create table #values
	(
	   id int,
	   birthday date,
	   firstname nvarchar(50),
	   lastname nvarchar(50)
	)

   DECLARE @pos        int,
           @nextpos    int,
           @valuelen   int

   SELECT @pos = 0, @nextpos = 1

   WHILE @nextpos > 0
   BEGIN
      SELECT @nextpos = charindex(',', @list, @pos + 1)
      SELECT @valuelen = CASE WHEN @nextpos > 0
                              THEN @nextpos
                              ELSE len(@list) + 1
                         END - @pos - 1
      insert into #values
   	  select id, birthday, firstname, _
      lastname from birthdays where id =  convert(int, substring(@list, @pos + 1, @valuelen))	
      SELECT @pos = @nextpos
   END
   
   select * from #values
   
END

And you run it like this:

SQL

findById3 '1,3'

Try changing the values in the list and see what happens. So, this works, but it’s clearly pretty ugly. We’re creating temp tables, stepping over strings, doing all sorts of nasty stuff. This is not something I recommend you’d ever do. I really included it to show you how NOT to do things, and to show a BAD alternative you might encounter ‘in the wild’. My first port of call in situations like this, is to use XML.

XML Processing

XML became a valid type for data in SQL Server 2005. Before that, a function called OpenXML was used to parse XML. If you see code that uses OpenXML and you’re not in SS2000, I recommend converting it. This is a great read on the pros and cons of different ways to use XML. The short version is, it’s better to store your values as attributes, than as text in nodes. So, feel free to read that article, but, we’ll mostly be working with XML that uses attributes. The great advantage of using XML is, of course, that if your calls are coming from procedural code (and they almost certainly are), odds are you’re using a language that knows what XML is and offers all sorts of tools for building it. Even DataTables in .NET can be serialised to XML, as can any serializable class. So, now we are going to create a proc which takes this as an input:

Here is the proc:

SQL

create proc findByIdWithXml @xml xml
as
begin

;with fromXml as
(
SELECT Y.ID.value('(@val)[1]', 'int') as id
FROM @xml.nodes('/ids/id') as Y(ID)
)

select b.id, firstName, lastName, _
             birthday from birthdays b inner join fromXml x on x.id = b.id;

end

Call it like this:

SQL

findByIdWithXml '<ids><id val="1"><id val="3" /></id></ids>'

and it returns:

SQL

id	firstName	lastName	birthday
1	Fred	Smith	1969-02-17
3	Mary	Smith	1996-07-20

Some things to note: we need to pass in the type of the object we want to pull out, as a string. If we were to select /ids in the nodes statement, we’d only get one node out, even though it has a many child id nodes, we’d only select one. @ before the name, means it’s an attribute. We use [1] because I’ve read that this makes the query faster, although in the real world, it’s superfluous (it means, grab the first value in the group). This is definitely much neater than using a csl. But, it’s also more powerful. Now we can define more than one way to get values out. Let’s try this:

SQL

create proc findWithXml @xml xml
as 
begin

;with fromXML as
(
SELECT 
Y.bday.value('(@id)[1]', 'int') as id,
Y.bday.value('(@fname)[1]', 'varchar(255)') as fname
FROM @xml.nodes('/birthdays/bday') as Y(bday)
)
select b.id, firstName, lastName, _
    birthday from birthdays b inner join fromXml x on x.id = b.id or x.fname = b.firstName;

end

Now we’re allowing two different types of search, id and first name. Obviously, we could expand this as much as we wanted. The CTE returns this:

SQL

id	fname
1	NULL
NULL	Sarah
4	NULL

But this is fine, because NULL does not equal NULL in SQL. So even if there’s nulls in the source table (which I allowed), it would still not match on them.

SQL

findWithXml '<birthdays><bday id="1"><bday fname="Sarah"><bday id="4" /></birthdays>'

returns this:

SQL

id	firstName	lastName	birthday
1	Fred	Smith	1969-02-17
2	Sarah	Smith	1971-09-15
5	Sarah	Jones	1945-12-26
4	Billy	Smith	2000-07-01

Note that my use of the name ‘Sarah’ has confirmed that this will return multiple matches. Note also that I’ve not used ‘ORDER BY’ in my proc, so they are not in id order. Of course, this is easily resolved.

INSERTS with XML

Of course, now that we’ve moved to a strongly defined format, we can start to also use XML to do inserts (of course, we could always do deletes, because we’d delete on only one value, same as we select). The concept is the same across the board, use a CTE to create an in table memory, and then use it with our existing tables.

SQL

create proc insertWithXml @xml xml
as
begin

;with fromXML as
(
SELECT 
Y.bday.value('(@fname)[1]', 'varchar(255)') as fname,
Y.bday.value('(@lname)[1]', 'varchar(255)') as lname,
Y.bday.value('(@date)[1]', 'date') as bday
FROM @xml.nodes('/birthdays/bday') as Y(bday)
)

insert into birthdays (firstname, lastname, birthday)
select fname, lname, bday from fromXml;

end

Note that the column names don’t have to be the same, just in the same order.

SQL

insertWithXml '<birthdays><bday fname="Hannah" lname="Jones" _
date="1972/12/31"><bday fname="Calvin" lname="Kaputnik" date="1914/11/11" /></birthdays>'

Now select all and see the result:

SQL

select * from birthdays

Two values have been inserted. Note that this is not as efficient as using a proper bulk insert process if you have TONS of records, but it’s great if you need to insert a small number at a time and want to call a proc once, instead of X times.

Table Value Parameters

Starting in SS2008, it is possible to pass in a table as a parameter. While our code above was quite compact, there was a separate step to turn XML into a table, and although I’ve never had any issue with doing this, I am told there are quirks to watch out for. So, this seems better right? There are a few gotchas to be aware of. First of all, TVPs inside a proc must be marked as read only (which means you also can’t insert into them or delete from them). That seems reasonable. Secondly, TVPs require you to define them first. This means that if you have a proc that takes a TVP, you need to also install the TVP on your SQL Server instance for it to work, and you can easily end up with a DB full of TVPs, which strikes me as more confusing than just using one type that is natively supported. But the real issue that I can see is that, as far as I can tell, TVPs are not supported by entity framework. This is a real blow, if you could create a List of a class and pass it to a proc, that would be very powerful indeed. As it stands, it means that even if you don’t use EF, using TVPs closes the door on using EF for any other projects you need to write against your DB. I know I’ve often had to write small, single use apps, and having EF create my data layer for free has always been a very powerful reason to use it. With those caveats in mind, let’s press on.

SQL

CREATE TYPE BirthdayTableType AS TABLE 
( 
  id int,
  firstName varchar(50),
  lastName varchar(50),
  birthday date
);
GO

This creates the actual TVP type. As you can see, it’s a table declaration, with some small syntax changes. In fact, I copied and pasted the code for our table to create this, and just removed the identity specification (it turns out this is valid, although it’s clearly largely useless). And here is the proc:

SQL

CREATE PROCEDURE dbo.findWithTVP
    @TVP BirthdayTableType READONLY
    AS 
begin
    
    select * from birthdays b
    inner join @TVP t on t.id = b.id or t.birthday = b.birthday _
                      or t.firstname = b.firstname or t.lastname  = b.lastname
end

This is certainly even neater than the XML solution, and it’s a world away from the comma separated list solution, both in usefulness and in readability. While we’re clearly not typically going to create TVPs in SQL, this bit is actually more painful.

SQL

DECLARE @tvp AS BirthdayTableType;
INSERT INTO @tvp (id, firstname, lastname, birthday) values(1, null, null, null)
INSERT INTO @tvp (id, firstname, lastname, birthday) values(null, null, 'kaputnik', null)
INSERT INTO @tvp (id, firstname, lastname, birthday) values(null, 'sarah', null, null)
INSERT INTO @tvp (id, firstname, lastname, birthday) values(null, null, null, '1996-07-20')

exec findWithTVP @tvp

And this will return the records we would expect. And, again, we can easily write insert code using a TVP, and, I believe that it’s possible in normal (non-EF) .NET code, to just pass a datatable through as a parameter. So, this would make for neat code with a traditional data layer, where the result of one proc needs to be passed to another proc.

Conclusion

So, we have discussed a number of ways to pass an arbitrary number of parameters through to a stored proc, so we can operate on a group of values, instead of just one. When it comes to selecting a range of values, this is really the only decent possibility, although for inserts or deletes, it’s more a matter of efficiency that we can make one SQL call instead of many. Personally, I think XML is a universal standard, well understood, easy to work with and supported right out of the box. If there was EF support for TVPs, I’d be inclined to use them on large projects that use EF, because there would be clear benefits. As it stands, I think it’s like so many things in SQL – there’s more than one way to do most things, and which way is best, depends on your circumstance.

History

4^th January, 2014: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Christian Graus

Software Developer (Senior)

Australia

Programming computers ( self taught ) since about 1984 when I bought my first Apple ][. Was working on a GUI library to interface Win32 to Python, and writing graphics filters in my spare time, and then building n-tiered apps using asp, atl and asp.net in my job at Dytech. After 4 years there, I've started working from home, at first for Code Project and now for a vet telemedicine company. I owned part of a company that sells client education software in the vet market, but we sold that and I worked for the owners for five years before leaving to get away from the travel, and spend more time with my family. I now work for a company here in Hobart, doing all sorts of Microsoft based stuff in C++ and C#, with a lot of T-SQL in the mix.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.