I'll deal with your question in sections .. .firstly
Quote:
(1), i do not know how what to do about the aggregate function as i have nothing to aggregate
As I said in my comment, if there is nothing to aggregate then I usually use
MAX()
(or
MIN()
). This has the net effect of just returning the value you want. Here is an example:
SELECT [foreign_key], [plant], [clothing], [country], [continent], [vehicle], [season]
FROM
(
SELECT row_id, foreign_key, Field_Name, field_value
FROM my_source_table
) a
PIVOT
(
MAX(field_value)
FOR Field_Name IN ([plant],[clothing],[country],[continent],[vehicle],[season] )
) AS pvt
ORDER BY foreign_key
Note in your attempt you had used
FOR vehicle IN ...
. The column [vehicle] doesn't actually exist until after the PIVOT has been actioned, so you can't refer to it within the body of the PIVOT. Nor do you want to, you are converting the [Field_Name] values to columns so that's what needs to go in that clause.
Unfortunately, the query as it stands produces these unhelpful results...
foreign_key plant clothing country continent vehicle season
1 flower NULL NULL NULL NULL NULL
1 NULL pants NULL NULL NULL NULL
1 NULL NULL usa NULL NULL NULL
1 NULL NULL NULL NULL NULL spring
2 tree NULL NULL NULL NULL NULL
2 NULL NULL NULL europe NULL NULL
2 NULL NULL france NULL NULL NULL
3 flower NULL NULL NULL NULL NULL
3 NULL NULL brazil NULL NULL NULL
3 NULL NULL NULL NULL bike NULL
3 NULL NULL NULL NULL NULL summer
We need to aggregate them in someway … e.g.
SELECT [foreign_key], MAX([plant]) AS [plant], MAX([clothing]) as [clothing], MAX([country]) as [country], MAX([continent]) as [continent], MAX([vehicle]) as [vehicle], MAX([season]) as [season]
FROM
(
SELECT row_id, foreign_key, Field_Name, field_value
FROM my_source_table
) a
PIVOT
(
MAX(field_value)
FOR Field_Name IN ([plant],[clothing],[country],[continent],[vehicle],[season] )
) AS pvt
GROUP BY foreign_key
ORDER BY foreign_key
Note that I'm doing the aggregate on the dataset produced by the PIVOT, "as well as" the aggregate in the pivot. This produces your expected results
foreign_key plant clothing country continent vehicle season
1 flower pants usa NULL NULL spring
2 tree NULL france europe NULL NULL
3 flower NULL brazil NULL bike summer
Quote:
(2), i do not know how to determine the column headers dynamically, based on distinct values in the my_source_table.field_name column
To do this you need to generate some Dynamic SQL. Let's look at the first bit that could change depending on the data...First let's add some extra data
insert into my_source_table (foreign_key, Field_Name, field_value) values (1, 'sense', 'smell')
So we know we're going to have to "construct" something that adds that new value to the bit of the SQL that currently says
SELECT [foreign_key], [plant], [clothing], [country], [continent], [vehicle], [season]
There are a few ways of constructing that list - because I want to add some square brackets this is probably the easiest
DECLARE @liststr NVARCHAR(MAX) = null
;WITH fieldValues AS
(
SELECT DISTINCT Field_Name FROM my_source_table
)
SELECT @liststr = COALESCE(@liststr +'],[','') + Field_Name
FROM fieldValues
SELECT '[' + @liststr + ']'
This will produce the results
[clothing],[continent],[country],[plant],[season],[sense],[vehicle]
Note that I can't control the order that these are returned in unless I ORDER BY the query. That ORDER BY has to be outside the CTE … ie. I can't have
SELECT DISTINCT Field_Name FROM my_source_table ORDER BY Field_Name
I have to use
FROM fieldValues ORDER BY Field_Name
Also note that the variable starts as null and I have to add the missing bracket on at the end. Give it a try the other way to see why.
Now let's address the other bit that needs to change now we've added new data - the original query said
SELECT [foreign_key], MAX([plant]) AS [plant], MAX([clothing]) as [clothing], MAX([country]) as [country], MAX([continent]) as [continent], MAX([vehicle]) as [vehicle], MAX([season]) as [season]
This is a bit more fiddly as I need to refer to the field name twice and include the MAX function. Have a good look at the difference in the way that I do this. Also note that I want to ignore the first two characters when I use the results
DECLARE @liststr2 NVARCHAR(MAX) = null
;WITH MAXfieldValues AS
(
SELECT DISTINCT Field_Name, Field_Name AS FN FROM my_source_table
)
SELECT @liststr2 = COALESCE(@liststr2,'') +'],MAX([' + Field_Name + ']) AS [' + Field_Name
FROM MAXfieldValues ORDER BY Field_Name
SELECT SUBSTRING(@liststr2,3, LEN(@liststr2)) + ']'
This produces the results
MAX([clothing]) AS [clothing],MAX([continent]) AS [continent],MAX([country]) AS [country],MAX([plant]) AS [plant],MAX([season]) AS [season],MAX([sense]) AS [sense],MAX([vehicle]) AS [vehicle]
We already know what the rest of the query needs to look like so lets stitch it all together and run it
DECLARE @sql NVARCHAR(MAX) = N'SELECT [foreign_key], ' + SUBSTRING(@liststr2,3, LEN(@liststr2)) + ']' +
' FROM
(
SELECT row_id, foreign_key, Field_Name, field_value
FROM my_source_table
) a
PIVOT
(
MAX(field_value)
FOR Field_Name IN ([' + @liststr + '] )) AS pvt
GROUP BY foreign_key
ORDER BY foreign_key'
Exec sp_executesql @sql
And the results...
FK clothing continent country plant season sense vehicle
1 pants NULL usa flower spring smell NULL
2 NULL europe france tree NULL NULL NULL
3 NULL NULL brazil flower summer NULL bike
I really encourage you to work through the various sections to make sure you fully understand what is going on. There are links to documentation and other examples in the article link in my comment