Introduction
Updated!
I've not paid much attention to this since I posted it years ago, but based on a couple requests that came through, and new features available in SQL Server, I thought it deserved a quick refresh.
Particularly, a test for Date values and a DATETIME column have been added. Also a version that uses the XML functions in SQL server to parse the string has been added. The XML approach is more efficient and scales better.
Cheers!
Clayton
One of the challenges that always comes up when dealing with data captured from web sites is parsing multi-value fields. Many web forms have "check all that apply" checkbox forms, and the data from questions of this type is often logged as a single delimited field value.
For example, the index values from a block of check boxes may be logged to a text file or returned to the database as a string of delimited numbers: '1|3|6|11'
The challenge is to take these delimited values and get them into a format that is meaningful to a database user. The single field of delimited values has to be broken apart and stored as individual rows in a child table related to the source row in the parent table.
The best way to accomplish this is up front in the web application. If this is not an option then it has to be handled when importing to the database. This SQL Server function gives you a quick way to turn the data trapped in delimited strings into meaningful data.
One of the more intriguing capabilities of SQL Server User Defined Functions (UDF) is the ability to return a table as a result. This UDF uses this feature to return each element in the delimited string as a row in a result table. The result from the function can be used like any other table. It can be included in a multi-table query, or returned directly by the function call.
Code
CREATE FUNCTION [fn_ParseText2Table]
(@p_SourceText VARCHAR(MAX)
,@p_Delimeter VARCHAR(100)=',' --default to comma delimited.
)
RETURNS @retTable
TABLE([Position] INT IDENTITY(1,1)
,[Int_Value] INT
,[Num_Value] NUMERIC(18,3)
,[Txt_Value] VARCHAR(MAX)
,[Date_value] DATETIME
)
AS
BEGIN
DECLARE @w_xml xml;
SET @w_xml = N'<root><i>' + replace(@p_SourceText, @p_Delimeter,'</i><i>') + '</i></root>';
INSERT INTO @retTable
([Int_Value]
, [Num_Value]
, [Txt_Value]
, [Date_value]
)
SELECT CASE
WHEN ISNUMERIC([i].value('.', 'VARCHAR(MAX)')) = 1
THEN CAST(CAST([i].value('.', 'VARCHAR(MAX)') AS NUMERIC) AS INT)
END AS [Int_Value]
, CASE
WHEN ISNUMERIC([i].value('.', 'VARCHAR(MAX)')) = 1
THEN CAST([i].value('.', 'VARCHAR(MAX)') AS NUMERIC(18, 3))
END AS [Num_Value]
, [i].value('.', 'VARCHAR(MAX)') AS [txt_Value]
, CASE
WHEN ISDATE([i].value('.', 'VARCHAR(MAX)')) = 1
THEN CAST([i].value('.', 'VARCHAR(MAX)') AS DATETIME)
END AS [Num_Value]
FROM @w_xml.nodes('//root/i') AS [Items]([i]);
RETURN;
END;
GO
--Old version, with some updates...
CREATE FUNCTION [fn_ParseText2Table_old]
(
@p_SourceText VARCHAR(MAX)
, @p_Delimeter VARCHAR(100) = ',' --default to comma delimited.
)
RETURNS @retTable TABLE
(
[Position] INT IDENTITY(1, 1)
, [Int_Value] INT
, [Num_Value] NUMERIC(18, 3)
, [Txt_Value] VARCHAR(MAX)
, [Date_value] DATETIME
)
AS
BEGIN
DECLARE @w_Continue INT
, @w_StartPos INT
, @w_Length INT
, @w_Delimeter_pos INT
, @w_tmp_int INT
, @w_tmp_num NUMERIC(18, 3)
, @w_tmp_txt VARCHAR(MAX)
, @w_date DATETIME
, @w_Delimeter_Len TINYINT;
IF LEN(@p_SourceText) = 0
BEGIN
SET @w_Continue = 0; -- force early exit
END;
ELSE
BEGIN
-- parse the original @p_SourceText array into a temp table
SET @w_Continue = 1;
SET @w_StartPos = 1;
SET @p_SourceText = RTRIM(LTRIM(@p_SourceText));
SET @w_Length = DATALENGTH(RTRIM(LTRIM(@p_SourceText)));
SET @w_Delimeter_Len = LEN(@p_Delimeter);
END;
WHILE @w_Continue = 1
BEGIN
SET @w_Delimeter_pos = CHARINDEX(@p_Delimeter, (SUBSTRING(@p_SourceText, @w_StartPos, ((@w_Length-@w_StartPos)+@w_Delimeter_Len))));
IF @w_Delimeter_pos > 0 -- delimeter(s) found, get the value
BEGIN
SET @w_tmp_txt = LTRIM(RTRIM(SUBSTRING(@p_SourceText, @w_StartPos, (@w_Delimeter_pos-1))));
IF ISNUMERIC(@w_tmp_txt) = 1
BEGIN
SET @w_tmp_int = CAST(CAST(@w_tmp_txt AS NUMERIC) AS INT);
SET @w_tmp_num = CAST(@w_tmp_txt AS NUMERIC(18, 3));
END;
ELSE
BEGIN
SET @w_tmp_int = NULL;
SET @w_tmp_num = NULL;
END;
IF ISDATE(@w_tmp_txt) = 1
BEGIN
SET @w_date = CAST(@w_tmp_txt AS DATETIME);
END;
ELSE
BEGIN
SET @w_date = NULL;
END;
SET @w_StartPos = @w_Delimeter_pos + @w_StartPos + (@w_Delimeter_Len - 1);
END;
ELSE -- No more delimeters, get last value
BEGIN
SET @w_tmp_txt = LTRIM(RTRIM(SUBSTRING(@p_SourceText, @w_StartPos, ((@w_Length-@w_StartPos)+@w_Delimeter_Len))));
IF ISNUMERIC(@w_tmp_txt) = 1
BEGIN
SET @w_tmp_int = CAST(CAST(@w_tmp_txt AS NUMERIC) AS INT);
SET @w_tmp_num = CAST(@w_tmp_txt AS NUMERIC(18, 3));
END;
ELSE
BEGIN
SET @w_tmp_int = NULL;
SET @w_tmp_num = NULL;
END;
IF ISDATE(@w_tmp_txt) = 1
BEGIN
SET @w_date = CAST(@w_tmp_txt AS DATETIME);
END;
ELSE
BEGIN
SET @w_date = NULL;
END;
SELECT @w_Continue = 0;
END;
INSERT INTO @retTable
VALUES
(@w_tmp_int
, @w_tmp_num
, @w_tmp_txt
, @w_date
);
END;
RETURN;
END;
GO
Usage Examples:
Single Character Delimiter
select * from dbo.fn_ParseText2Table('100|120|130.56|Yes|Cobalt Blue|2016-04-04|2000-06-17','|')
Multi-Character Delimiter
select * from dbo.fn_ParseText2Table('Red, White, and, Blue',', ')
Big Multi-Character Delimiter
select * from dbo.fn_ParseText2Table('Red<Tagname>White<Tagname>Blue','<Tagname>')
Unfortunately, the only way to use this to process multiple rows is using a cursor.
Here is an example of what the code inside the cursor block would look like to insert parsed values from a string as rows in a child table
As a table in an insert statement:
create table #tmp_Child (parent_id int, ColorSelection varchar(30), SelOrder tinyint)
declare @parent_id int
,@ColorSelections varchar(255)
,@delim varchar(100)
set @parent_id = 122
set @ColorSelections = 'Red, White, and, Blue'
set @delim = ', '
insert #tmp_Child (parent_id, ColorSelection, SelOrder)
select @parent_id
,t.txt_value
,t.position
from dbo.fn_ParseText2Table(@ColorSelections, @delim) as t
select * from #tmp_child
drop table #tmp_child
I am a database design consultant with over 20 years experience in consulting.
My primary focus is designing and bulding data driven, highly automated OLTP and datatwarehouse applications that leverage the Microsoft SQL Server platform. I am skilled in SQL Server Analysis Services (MDS and Tabular), MDX, DAX, and all the related technologies. I am experienced with SharePoint Business Intelligence and dashboard capabilities, including installation, configuration, and Kerberos authentication set up.
I have worked with SQL Server since version 4.12 and have experience with Oracle, Sybase, Teradata and DB/2 as well.
I would term myself as a Data Warehouse Architect and programmer DBA, with a good dose of system architect and troubleshooter thrown in.