Click here to Skip to main content
15,893,668 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I'll preface this question by sharing that I have read the MSDN site related to LINQ (which is geared towards database usage with known column names and not in-memory datasets and datatables) and have spent maybe 20 hours searching various forums for a solution for what I have to solve.

I have an c# Winforms application which will accept a CSV file and load the data into a datatable. The datatable column names are read and populated into another datatable that serves to map the source datatable to the normalized final datatable. The user is presented with a screen where they provide input on which source column matches which destination column.

Problem
How to copy only the columns from the source table needed to the destination table using the mapping table to tell what is supposed to go where. In SQL I would do this via dynamic SQL like:

SQL
DECLARE @sql nvarchar(4000)

-- This creates the dynamic query statement and embeds the concatenated list of source column names (the STUFF function handles the concatenation) mapped to the target columns or puts in an empty string if the column was not mapped to avoid a column count mismatch when inserting
SELECT @sql = 'SELECT ' + STUFF((SELECT ',' + ISNULL(m.src,'') FROM tbl_map m FOR XML PATH('')),1,1,'') + ' FROM tbl_src'

-- This inserts the records from the source table using the dynamic query into the target table
INSERT INTO target_table
EXEC(@sql)


Again, nothing I've read anywhere has dealt with the use of datatable as the source where the column names are not known in advance.

Sample Source:
-----------------------------------------------------------------------
| FName    | LName   | DOB      |  SSN      | Sex |  LicNbr   | State |
| Bob      | Smith   | 19230105 | 123456789 | M   | N4257665  | NJ    |
| Jane     | Doe     | 19670530 | 000000000 | F   | 12457890  | PA    |
-----------------------------------------------------------------------


Sample Mapping:

--------------------------
| Target        | Source |
| first_name    | FName  |
| last_name     | LName  |
| license_state | State  |
| license_id    | LicNbr |
--------------------------


Sample Output Expected:

-------------------------------------------------------
| first_name | last_name | license_state | license_id |
| Bob        | Smith     | NJ            | N4257665   |
| Jane       | Doe       | PA            | 12457890   |
-------------------------------------------------------


Here are the dataset/datatables involved:
C#
//Datatable to hold the mapping between the source datatable and the target datatable
DataTable dt = new DataTable("dt_map");

//Add columns to the mapping table
DataColumn dc = new DataColumn("source", typeof(string));
dc.DefaultValue = string.Empty;
dt.Columns.Add(dc);

dc = new DataColumn("target", typeof(string));
dc.DefaultValue = string.Empty;
dt.Columns.Add(dc);

//Load initial row data populating the target table column names to map to
dt.BeginLoadData();
dt.Rows.Add("first_name", "");
dt.Rows.Add("last_name", "");
dt.Rows.Add("license_state", "");
dt.Rows.Add("license_id", "");
dt.EndLoadData();

//Add the mapping datatable to the dataset
ds_mstr.Tables.Add(dt.Copy());

//Create the final normalized datatable which be used
dt = new DataTable("dt_final");

//Add columns to the final datatable
dc = new DataColumn("first_name", typeof(string));
dc.DefaultValue = string.Empty;
dt.Columns.Add(dc);

dc = new DataColumn("last_name", typeof(string));
dc.DefaultValue = string.Empty;
dt.Columns.Add(dc);

dc = new DataColumn("license_state", typeof(string));
dc.DefaultValue = string.Empty;
dt.Columns.Add(dc);

dc = new DataColumn("license_id", typeof(string));
dc.DefaultValue = string.Empty;
dt.Columns.Add(dc);

//Add the final datatable to the dataset
ds_mstr.Tables.Add(dt.Copy());


Again, the source column names are unknown until imported from the file and will vary in naming and quantity. Assume everything is being treated as a string for the sake of simplicity. Thank you to whoever can provide a solution/guidance and even more thanks if an explanation can come with the answer so I can learn something along the way and be a better developer for it!

What I have tried:

Searching for solutions or anything that might be anywhere close to what I'm looking for (LINQ query using dynamic columns names mapped between two tables using a table that maps the column names between the two tables).
Posted
Updated 26-Aug-16 3:28am
v3
Comments
Maciej Los 26-Aug-16 2:42am    
Well, does "mapping" object is DataTable or Dictionary?
greenhorn_dev 26-Aug-16 9:03am    
It's a datatable.

Try objecttype.GetProperty("variabletype").GetValue(dragRow, null)
you get unknown
object TypedragRow
objecttype is object variable
variabletype is variable the you to get
 
Share this answer
 
Well...

There's few ways to achieve that:

  1. using Dynamic Query Library
    ScottGu's Blog - Dynamic LINQ (Part 1: Using the LINQ Dynamic Query Library)[^]
  2. using Expression Trees
    How to: Use Expression Trees to Build Dynamic Queries (C# and Visual Basic)[^]
    Dynamic LINQ Queries with Expression Trees - Simple Talk[^]
  3. using Func<t,> delegate
    Func(T, TResult) Delegate (System)[^]

    Example:
    C#
    //source datatable
    DataTable src = new DataTable();
    src.Columns.Add(new DataColumn("FName", typeof(string)));
    src.Columns.Add(new DataColumn("LName", typeof(string)));
    src.Columns.Add(new DataColumn("DOB", typeof(string)));
    src.Columns.Add(new DataColumn("SSN", typeof(string)));
    src.Columns.Add(new DataColumn("Sex", typeof(string)));
    src.Columns.Add(new DataColumn("LicNbr", typeof(string)));
    src.Columns.Add(new DataColumn("State", typeof(string)));
    src.Rows.Add(new object[]{"Bob", "Smith", "19230105", "123456789", "M", "N4257665", "NJ"});
    src.Rows.Add(new object[]{"Jane ", "Doe", "19670530", "000000000", "F", "12457890", "PA"});
    //datatable for map fields
    DataTable map = new DataTable();
    map.Columns.Add(new DataColumn("Target", typeof(string)));
    map.Columns.Add(new DataColumn("Source", typeof(string)));
    map.Rows.Add(new object[]{"first_name","FName"});
    map.Rows.Add(new object[]{"last_name","LName"});
    map.Rows.Add(new object[]{"license_id","LicNbr"});
    map.Rows.Add(new object[]{"license_state","State"});
    //Dictionary object to get KeyValuePair<string, string>
    //this object will be used to "translate" one name of column into second
    Dictionary<string, string> col_map = map.AsEnumerable().ToDictionary(x=>x.Field<string>("Source"), x=>x.Field<string>("Target"));
    //delegate to change the names of columns
    Func<DataRow, Dictionary<string, string>, DataRow> mappedRow = delegate(DataRow row, Dictionary<string, string> mapp)
    		{
    			foreach(DataColumn c in row.Table.Columns)
    			{
    				try {
    					c.ColumnName = mapp[c.ColumnName];
    				} catch (KeyNotFoundException ex)
    				{
    					//ignore error and continue for loop
    					continue;
    				}
    			}
    			return row;
    		};
    
    //query
    var result = src.AsEnumerable()
    		.Select(x=> mappedRow(x, col_map));


    Result:

    first_name	last_name	DOB			SSN			Sex	license_id	license_state
    Bob 		Smith 		19230105 	123456789 	M 	N4257665 	NJ 
    Jane  		Doe 		19670530 	000000000 	F 	12457890 	PA 


  4. In case you want to return only those fields which are stored in dictionary, you have to provide a way to modify collection of columns for datatable object.


Other interesting resources:
c# - Referencing DataRows by column name in a LINQ query, then filter via mapping with another dataset - Stack Overflow[^]
sql server - Map column names of sql table with an object property, C#, LINQ - Stack Overflow[^]
 
Share this answer
 
Comments
greenhorn_dev 2-Sep-16 17:29pm    
While this gets me close, it ends up renaming the underlying datatable rather than mapping the src table to the destination table and querying the source to populate the destination.

Let's try a slightly different approach using the dictionary and src table as-is. Is there a LinQ query that will select from the src datatable in the order that the columns are stored in the dictionary as keys and then rename the output result columns to the key value (not the key itself).

src:
-----------------------------------------------------------------------
| FName | LName | DOB | SSN | Sex | LicNbr | State |
| Bob | Smith | 19230105 | 123456789 | M | N4257665 | NJ |
| Jane | Doe | 19670530 | 000000000 | F | 12457890 | PA |
-----------------------------------------------------------------------

Dictionary:
--------------------------
| Key | Value |
| fname | first_name |
| lname | last_name |
| DOB | |
| SSN | |
| Sex | |
| LicNbr | license_id |
| State | license_state |
--------------------------

destination datatable:
-------------------------------------------------------
| first_name | last_name | license_state | license_id |
| Bob | Smith | N4257665 | NJ |
| Jane | Doe | 12457890 | PA |
-------------------------------------------------------

Note that the column ordering has changed as have the column names. I don't want to change the src datatable structure after it has been built. I do want to extract --> transform --> load (ETL) from src --> destination using the dictionary as the blueprint for the transformation, if that makes any sense.
Maciej Los 4-Sep-16 12:39pm    
Well, all you need to do is to change the code to your needs.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900