|
Ok, this isn't so much a "I need help" question as it is a "what do you think of this" question. I'm building a .net desktop app that accesses it's database through a web service. I need to implement a way for my users to upload large data files into the database on demand. The data files are generated by third parties and their format changes quite frequently. When these files are uploaded, only some columns from the files are uploaded into the DB tables. Now, I've looked at several options and here's what I've decided on. I'm going to create a DTS package and job for each import file and I'm going to fire it off using sp_start_job. I'm then going to poll the job's status using sp_help_job. Does this sound reasonable or can anybody think of a simpler way? It's gotta be flexible and easy to update without having to recompile/redistribute code, which is why I'm going the DTS package route.
|
|
|
|
|
Nasty, ugly painful SOB of a thing. I have grown to dislike most of MSs ETL products, Biztalk and SSIS. We have a similar problem which requires the ETL to handle additional columns in a data file that is uploaded daily. SSIS will choke on this and changing a package, redeploying etc AFTER it has choked is not an option.
Our solution.
The package reads the first line of the file where the column headers are and checks against the target table in SQL. All cols are varchar 500. If there is a new column (it always grows) SSIS drops the target table and passes the column names into a stored proc to recreate the table (some column names are duplicated and this is handled by the proc).
The package then bulk copies the data into the target table, it is guaranteed to work because of the previous process. I then have a proc to do the transforms into the production database. I find changing a proc to be easier than a package.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
I agree. For some reason (I believe that the reason is between the chair and the keyboard ) I often find it difficult to implement tasks with "advanced" functionality with SSIS. Usually I use stored procedures and if T-SQL cannot provide enough functionality I typically create an assembly using C# and add that to the database. This way I can get more reusable functionality to the place where it's actually needed (=DB).
I think that the problem with SSIS for me is that logic is easily scattered to different modules and it's more difficult to understand later or maintain. Also SSIS is not so powerful that it could be used as "programming platform".
However, what comes to Analysis Services I find SSIS usable. It's quite easy to load data from OLTP to cubes so I haven't totally discarded it. Also simple data pumping tasks seem to work fine.
What comes to original question, I think your idea is good and if you don't have any difficulties implementing this using SSIS, you're safe.
Mika
|
|
|
|
|
Mika Wendelius wrote: if you don't have any difficulties implementing this using SSIS
You've got to be joking, took me 3 days and more googling than I like to think about to identify the shortcomings and work around them. I still prefer DTS!
We were in the position of needing to move an SSAS database from the default (also want to stripe it across drives) and went to out outsource DBA support (IBM)for help, "sorry we have no experience in SSAS" WTF. SSAS presents it's challenges as well.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
I think I wrote it badly. I was trying to refer to original FyreWyrm's question, not to your case.
|
|
|
|
|
Hi,
I have a column in (say) database 1, and another column in (say) database 2.
How would I go about copying all the values from the database 1 column into the database 2 column?
Cheers,
|
|
|
|
|
Databases do not have columns, they have tables, are you tlaking about copying a table from 1 DB to another?
What database, SQL, Access, MySQL.....
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
Mycroft Holmes wrote: Databases do not have columns, they have tables, are you tlaking about copying a table from 1 DB to another?
Your right, sorry. They are columns in tables within the two databases.
Im using Microsoft SQL Server Management Studio.
Hope thats enough info.
Cheers,
|
|
|
|
|
This will depend on the structure of the 2 tables, if there is only 1 column (your moving the table) try
SELECT *<br />
INTO Database2.dbo.TableName<br />
FROM Table1
If there is an existing table then you will need to do an update and this will be dictated by your data
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
I have been trying to make this work, but so far no luck. Probably an easy one for someone with a bit more experience. Here is a simplified explanation.
I have two tables, the primary table is an inventory table listing items identifed by a unique inventory number and the second is a record of transactions. A record is added to the second table when a transaction occurs that affects an item in the first table. For instance when an item is sold, a transaction is created in the transaction table - "Sold" designated by the letter "S", a new item received is designated with the letter "R" etc. Occasionally an item is listed as "received" that is already in the inventory table or an item is listed as "sold" that is was not entered in the inventory table.
I need to do two things: (1) Find the records in the second table (t_log) that do not have a matching inventory number in the Inventory table and insert appropriate text into a comment field. (2) find the records in the transaction table with 'R' transaction type and insert an appropriate comment in the transaction table. In each case I need to set a boolean to to True (error).
The tables look like this:
----------------
Inventory Table (Inv)
-----------------------
InventoryNo
Name, etc.
-------------------
TransactionLog (T_Log)
-----------------------
InvNo
TransactionType
Comment
Error (Boolean)
------------------
For the records that are in the T_Log but not in the inventory (Table A), the query below displays the records that I want to update, but I don't know how to update just those records.
(select Inv.InventoryNo,Inv.ItemName,T_Log.id, T_Log.Invno,t_log.transactiontype, t_log.commment
FROM Inv
RIGHT OUTER JOIN T_Log on InventoryNo = Invno
WHERE Inv.InventoryNo is NULL )
Thanks for any help
|
|
|
|
|
Your select should woth with a LEFT outer join
Also try this
SELECT *<br />
FROM t_log<br />
WHERE invno NOT IN (SELECT Inventryno from inv)<br />
AND transactiontype = 'R/S'<br />
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
It's not the select that I am having trouble with it is how to update the transaction log (comments and error)
|
|
|
|
|
I always test an update with a select to see what the results are before I commit to the update, saves restoring a databse if I get the filter wrong. So naturally I assume everyone does this, you'll note the R/S needs to be modified as well naturally.
UPDATE t_log SET Comment = 'Thinking is good for you'<br />
--SELECT *<br />
FROM t_log<br />
WHERE invno NOT IN (SELECT Inventryno from inv)<br />
AND transactiontype = 'R/S'
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
OK This did it for part one.
UPDATE T_Log
set Commment='Not in main table'
WHERE invno NOT IN (SELECT Inventoryno from Inv)
AND transactiontype = 'T'
Thanks
|
|
|
|
|
Hi All
Please could someone help me out here I am battling slightly.
I can understand SQl but this is a mind breaker.
I have three tables.
[Customer]
CustomerID - Primary Key
CustomerName
[CustomerLog]
CustomerID
TransTimeStamp
Status
[Statuses]
StatusId - Primary Key
StatusDescription
Example Data
[Customer]
CustomerID CustomerName
1 John
2 Peter
[CustomerLog]
CustomerID TransTimeStamp Status
1 2008-04-12 11:53:01 2
1 2008-04-13 10:01:02 3
1 2008-04-14 08:30:32 2
2 2008-04-12 10:45:23 2
1 2008-04-15 22:23:12 3
2 2008-04-13 08:34:12 3
[Statuses]
StatusId StatusDescription
1 Tea Break
2 Start Work
3 End Work
4 Other Things
I need to create a query that will give me the following
CustomerName
StartWorkTime (First Ocurrence Of Start Work)
EndWorkTime (Last Occurrence Of End Work)
Data Should be something like this
John 2008-04-12 11:53:01 2008-04-13 10:01:02
John 2008-04-14 08:30:32 2008-04-15 22:23:12
Peter 2008-04-12 10:45:23 2008-04-13 08:34:12
Please can someone give me advise on this or where to begin.
Regards
|
|
|
|
|
Lets break this down:
To get the earliest occurrence of a date, grouped by the customer, for the start of work
SELECT CustomerID, MIN(TransTimeStamp) AS StartTime
FROM CustomerLog
WHERE Status = 2
GROUP BY CustomerID
And the latest occurrence
SELECT CustomerID, MAX(TransTimeStamp) AS EndTime
FROM CustomerLog
WHERE Status = 3
GROUP BY CustomerID
Now, that has to be joined up with the customer table
SELECT CustomerName, StartTime, EndTime
FROM Customer AS c
INNER JOIN (SELECT CustomerID, MIN(TransTimeStamp) AS StartTime
FROM CustomerLog
WHERE Status = 2
GROUP BY CustomerID) AS s ON s.CustomerID = c.CustomerID
INNER JOIN (SELECT CustomerID, MAX(TransTimeStamp) AS EndTime
FROM CustomerLog
WHERE Status = 3
GROUP BY CustomerID) AS e ON e.CustomerID = c.CustomerID
|
|
|
|
|
Hi Colin, thanks for the help. I was getting confused with the nested SELECTS.
I have tried your example but it only returns one row, the very MIN and the very MAX.
I need multiple rows retured for each instance of 2 and 3 i.e
John 2008-04-12 11:53:01 2008-04-13 10:01:02
John 2008-04-14 08:30:32 2008-04-15 22:23:12
Peter 2008-04-12 10:45:23 2008-04-13 08:34:12
I think this would have to work with the timestamp. I have tried a ORDER BY, but SQL does't like that to much.
Any ideas, thanks again for getting bck to me
|
|
|
|
|
Do the nested queries work on their own? (i.e. the first two queries I wrote)
|
|
|
|
|
They do work, but only return a single row. Therefore the main SELECT only gets one row.
Really appreciate the help
|
|
|
|
|
Very odd. It should only do that if there is only one customer ID. You should be getting one row per customer ID (that's what the GROUP BY clause does)
|
|
|
|
|
Yip it is doing that, your query works fine, but even if there is one customer i.e John. He has many Starts and Stops, therefore John could have more than one row. Your query is taking the first instance (MIN) and very last instance (MAX). What needs to happen is that it will find the first START and the the very next END and so forth.
John could have Started and Stopped work multiple times as recorded in the log file. The status will change accordingly.
I was trying something like this but it doesn't return a row for a row.
LEFT (INNER) JOIN will return duplicates with the right side.
SELECT CustomerID, StartUp, SwitchOff
FROM Customers
INNER JOIN
(SELECT TOP 100 PERCENT CustomerID AS Cus1, timestamp AS SwitchOff
FROM Log
WHERE status = 2
ORDER BY timestamp) AS Q1 ON Q1.Cus1 = Customers.CustomerID
INNER JOIN
(SELECT TOP 100 PERCENT CustomerID AS Cus2, timestamp AS StartUp
FROM Log
WHERE status = 3
ORDER BY timestamp) AS Q2 ON Q2.Cus2 = Customers.CustomerID
Another problem I may have is the data has inconsistencies i.e. the system may have recorded that he stopped worked twice in a row and then only started again.
Thanks again for your help
<div class="ForumMod">modified on Friday, August 15, 2008 9:57 AM</div>
|
|
|
|
|
Hi,
Another approach to this is to use correlated scalar queries. Grouping would be usable here but by using scalars, it may be easier to read and modify. For example:
SELECT CustomerName,
(SELECT MIN(TransTimeStamp) FROM CustomerLog cl WHERE cl.CustomerID = c.CustomerID) AS StartTime,
(SELECT MAX(TransTimeStamp) FROM CustomerLog cl WHERE cl.CustomerID = c.CustomerID) AS EndTime
FROM Customer c
If you want to find corresponding pairs you could use something like this:
SELECT CustomerName,
(SELECT MIN(TransTimeStamp) --min to find first stop after each start
FROM CustomerLog cl2
WHERE cl2.CustomerID = c.CustomerID
AND cl2.Status = 3
AND cl2.TransTimeStamp > cl.TransTimeStamp) AS EndTime
FROM Customer c,
CustomerLog --starts
WHERE cl.CustomerID = c.CustomerID
AND cl.Status = 2
There may be several typos in this but you get the idea.
Mika
|
|
|
|
|
I've been having trouble when trying to add a User Instance SQL database to a Visual Studio 2008 solution. The error is shown in this[^] image.
Connections to SQL Server files (*.mdf) require SQL Server Express 2005 to function properly. Please verify the installation of the component of download from the URL: http:
Note that SQL Server works quite happily as long as it's not a user instance.
Does anybody have any ideas on how to solve this problem?
I doubt it. If it isn't intuitive then we need to fix it. - Chris Maunder
|
|
|
|
|
Hi,
Never seen this, but few questions:
- is SQL Browser running
- can you connect using Server Explorer
Mika
|
|
|
|
|
Mika Wendelius wrote: is SQL Browser running
Yes
Mika Wendelius wrote: can you connect using Server Explorer
If it's not a user instance then yes. I.e. if I set it up in SQL Management Studio and connect like a "full" server then everything works.
|
|
|
|