Click here to Skip to main content
15,881,812 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
I have written T-SQL code to update XML records in a potentially very large table (let's say millions of rows) with each row having an XML column. The code changes the values of all nodes in the XML that match a user-defined XPath expression. So for example, assume the XML is:
XML
<myrec>
    <amt>1</amt>
    <main>
        <amt bonus="11">2</amt>
        <supp>
            <amtort>referrer</amtort>
            <amt>3</amt>
            <rows>
                <row>
                    <amt bonuses="999">4</amt>
                </row>
                <row>
                    <amt>5</amt>
                </row>
                <row>
                    <someotherlevel>
                        <amt bonus="22">6</amt>
                        <keep>optional</keep>
                    </someotherlevel>
                </row>
            </rows>
        </supp>
    </main>
</myrec>

And assume the user-defined XPath is //amt, then the code (as an example) would add 100 to each data element of <amt>. The XML would then become:
XML
<myrec>
   <amt>101</amt>
  <main>
    <amt bonus="11">102</amt>
    <supp>
      <amtort>referrer</amtort>
      <amt>103</amt>
      <rows>
        <row>
          <amt bonuses="999">104</amt>
        </row>
        <row>
          <amt>105</amt>
        </row>
        <row>
          <someotherlevel>
            <amt bonus="22">106</amt>
            <keep>optional</keep>
          </someotherlevel>
        </row>
      </rows>
    </supp>
  </main>
</myrec>

This would get done for all <amt> nodes in the XML as per the XPath //amt.

Note that the code to works for any XML document and any (user entered) XPath. So this is general purpose code.

My code works, but I have a feeling there is a better way to do this since I am not getting the performance I was hoping for (currently takes about 266 seconds to update 1 million rows (single XML node updated per row) on an Azure B2s instance (2 VCPU's, 4GB RAM, SSD).

The code I've written uses iteration and the .modify() function on and XML type variable. I've also resorted to using dynamic SQL so I can build arbitrary XQuery strings (as mentioned, this is general purpose code for any XML / XPath).

Can anyone suggest a better / more performant approach? I'm new to XML in MSSQL, so still feeling my way through it.

My code is below:

What I have tried:

SQL
CREATE PROCEDURE dbo.ModifyXML 
	@var_xpath VARCHAR(max), 
	@var_rec XML,
	@new_xml XML OUTPUT
AS
	SET NOCOUNT ON

	DECLARE @sql_matches NVARCHAR(max);
	DECLARE @sql_modify NVARCHAR(max);
	DECLARE @sql_get_new NVARCHAR(max);
	DECLARE @new_value INTEGER;
	DECLARE @i INTEGER;

	SET @sql_modify = N'';
	SET @new_xml = @var_rec
	SET @i = 1;

	-- Get original data value in the first node
	SET @sql_get_new = N'SET @new_value = dbo.my_transform(@xml.value(N''(' + @var_xpath + '/text())[1]'', ''INTEGER''));';
	EXECUTE sp_executesql @sql_get_new, N'@xml XML, @new_value INTEGER OUTPUT', @xml = @var_rec, @new_value = @new_value OUTPUT;

	WHILE @new_value is not NULL
	BEGIN
		-- Replace the original value with the new value in the XML.
		-- This is a dynamic SQL string and is appended to the previous one, ... to be executed once the loop exits
		SET @sql_modify = @sql_modify + N'SET @new_xml.modify(''replace value of (' + @var_xpath + '/text())[' + CAST(@i AS NVARCHAR(max)) +'] with "' + CAST(@new_value AS NVARCHAR(max)) + '"'');';

		-- Next match
		SET @i = @i + 1;

		-- Create masked value from original
		SET @sql_get_new = N'SET @new_value = dbo.my_transform(@xml.value(N''(' + @var_xpath + '/text())[' + CAST(@i AS NVARCHAR(max)) + ']'', ''INTEGER''));';
		EXECUTE sp_executesql @sql_get_new, N'@xml XML, @new_value INTEGER OUTPUT', @xml = @var_rec, @new_value = @new_value OUTPUT;
	END

	-- Execute the accumulated dynamic SQL modification queries
	IF @sql_modify <> '' EXECUTE sp_executesql @sql_modify, N'@new_xml XML OUTPUT', @new_xml = @new_xml OUTPUT;
GO
Posted
Updated 5-Oct-20 3:51am
v2
Comments
[no name] 3-Oct-20 12:39pm    
226 seconds relative to what? A day? Some "transaction" window? It's "fast enough" or it isn't.
nadimr 3-Oct-20 22:35pm    
Relative to the equivalent operation on a non-XML column.
[no name] 4-Oct-20 9:18am    
And what's that? "More"? How much more? This is something others can only guess at and you want suggestions. Crazy making.
nadimr 4-Oct-20 9:31am    
To be more clear, I am simply asking if there is a more efficient approach to modifying arbitrary XML records. By efficient, I mean runs faster.
[no name] 5-Oct-20 10:01am    
Probably.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900