Welcome to the Lounge

Mike Winiberg25-Jan-23 20:30

Mike Winiberg

25-Jan-23 20:30

Although it used SQL as the backend, I remember a Customer Relationship Management system called Maximiser that took a similar approach. There were, ISTR just two tables, one to hold all the relatively constant client data itself and one to hold the collection of notes linked to that.

In the Maximiser app there were complex joins on one table producing 'subtables' that held various views on the data. Some columns contained numbers that indicated what other columns actually held! I was given the job of moving all the data held in this system to another SQL based program.

It took ages (in the absence of any database schema documentation) to unravel the various actual combinations of joins required to get what we wanted. Here's just one query to extract a little of the info: All the tables named as a, b, c, d etc duplicate joins used in 'built-in' queries on the maximiser database.

I thought you might find an example of the stuff I had to build mildly amusing 8)

-- Build the View of the Maximiser data that shows what we want and store it

SELECT     
CASE 
	WHEN c.Record_Type = 1 THEN c.Name
	WHEN c.Record_Type = 31 THEN d.Name + ' - ' + c.First_Name + ' ' + c.Name
	WHEN c.Record_Type = 2 AND len(c.Firm) > 0 THEN c.Firm
	WHEN c.Record_Type = 2 AND len(c.Firm) < 1 THEN c.First_Name + ' ' + c.Name
	WHEN c.Record_Type = 32 THEN 
	(
		CASE 
			WHEN len(d.Firm) > 0 THEN d.Firm + ' - ' + c.First_Name + ' ' + c.Name
			WHEN len(d.Firm) < 1 THEN d.First_Name + ' ' + d.Name + ' - ' + c.First_Name + ' ' + c.Name 
		END
	)
	ELSE c.Name
END AS Company,
CASE
	WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.Address_Line_1 
	WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN  g.Address_Line_1
	WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN f.Address_Line_1
	WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN  g.Address_Line_1
	ELSE c.Address_Line_1
END AS Address_1,
CASE 
	WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.Address_Line_2 
	WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN  g.Address_Line_2
	WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN f.Address_Line_2
	WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN  g.Address_Line_2
	ELSE  c.Address_Line_2
END AS Address_2,
CASE 
	WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.City
	WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN  g.City
	WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN f.City
	WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN  g.City
	ELSE  c.City
END AS City,
CASE 
	WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.State_Province
	WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN  g.State_Province
	WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN  f.State_Province
	WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN  g.State_Province
	ELSE  c.State_Province
END AS State,
CASE 
	WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN  e.Zip_Code
	WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN  g.Zip_Code
	WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN  f.Zip_Code
	WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN  g.Zip_Code
	ELSE  c.Zip_Code
END AS Zip,
CASE 
	WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN  e.Country
	WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN  g.Country
	WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN  f.Country
	WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN      g.Country
	ELSE c.Country
END AS Country,
CASE 
	WHEN n.Type = 0 THEN  'Manual Note'
	WHEN n.Type = 1 THEN  'Mail - Out' 
	WHEN n.Type = 2 THEN  'Phone Call'
	WHEN n.Type = 3 THEN  'Timed Note'
	WHEN n.Type = 4 THEN  'Transfer'
	WHEN n.Type = 5 THEN  'Task'
	WHEN n.Type = 6 THEN  'Reserved' 
	WHEN n.Type = 7 THEN  'Reserved'
	WHEN n.Type = 8 THEN  'Opportunity'
	WHEN n.Type = 12 THEN 'Customer Service'
	ELSE  'Unknown'
END AS Activity_Type,
n.DateCol, n.TextCol, n.Owner_Id, n.Client_Id, n.Contact_Number, n.Note_Type,
'    ' AS sndex 
INTO BookerNotes
FROM
	dbo.AMGR_Client_Tbl AS c 
	LEFT OUTER JOIN dbo.AMGR_Client_Tbl AS d ON c.Client_Id = d.Client_Id AND d.Contact_Number = 0 
	LEFT OUTER JOIN dbo.AMGR_Client_Tbl AS e ON c.Client_Id = e.Client_Id AND c.Address_Id = e.Contact_Number 
	LEFT OUTER JOIN dbo.AMGR_Client_Tbl AS f ON c.Client_Id = f.Client_Id AND c.Address_Id = f.Contact_Number 
	LEFT OUTER JOIN dbo.AMGR_Client_Tbl AS g ON c.Client_Id = g.Client_Id AND g.Contact_Number = 0 
	RIGHT OUTER JOIN dbo.AMGR_Notes_Tbl AS n ON c.Client_Id = n.Client_Id AND c.Contact_Number = n.Contact_Number
WHERE c.Record_Type IN (1, 2, 31, 32)
GO

craig white 202126-Jan-23 4:43

craig white 2021

26-Jan-23 4:43

Ooof, that would be nasty to maintain!

englebart28-Jan-23 3:47

englebart

28-Jan-23 3:47

I just saw something similar the other day with additional complications.

The developer that inherited was trying to figure out how to make it a little more maintainable by removing some conditions that were nonsensical and others that were just bad hard codes.

maze326-Jan-23 1:09

maze3

26-Jan-23 1:09

old posting, but reddit had Key Value, then moved to what sounds like a few tables which are thing/data, so basically instead of just 1 key/value table, its many more key/value tables

Southmountain25-Jan-23 18:20

25-Jan-23 18:20

yes, it has limited documentation ....

diligent hands rule....

Southmountain25-Jan-23 18:24

25-Jan-23 18:24

I will double check the table column number and have very limited documents...

diligent hands rule....

jschell25-Jan-23 5:32

jschell

25-Jan-23 5:32

Southmountain wrote:
it has hundreds of fields

Presumably you mean 'columns'...

Southmountain wrote:
what is the best way to get understanding of this table quickly?

It is unlikely there is a way to do it quickly. The number of columns suggest it is probably overloaded so there are multiple uses. Best you might be able to do quickly is determine how the data is created in the first place. And that would only be true if it is just a batch load.

Gerry Schmitz25-Jan-23 6:19

Gerry Schmitz

25-Jan-23 6:19

I found MS Access and Excel, with some SQL management studio, good enough for "data analysis".

Access and Excel can connect to SQL server. You can then tap into their analytics and query ability.

There's also MS Power BI (Desktop), to top it off.

"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I

Southmountain25-Jan-23 18:21

25-Jan-23 18:21

following your ideas, I will try to load it into an Excel pivot table and play around with it...

diligent hands rule....

Mycroft Holmes25-Jan-23 11:18

Mycroft Holmes

25-Jan-23 11:18

Having done this sort of thing in the past (and yes it was for the banking industry) you are going to need someone with domain knowledge, making an incorrect assumption on the relevance/relationship of a column can lead you down some nasty cul de sacs.

Never underestimate the power of human stupidity -
RAH
I'm old. I know stuff - JSOP

yacCarsten25-Jan-23 18:00

yacCarsten

25-Jan-23 18:00

Delete a column, see who complains and then get them to explain what it's for.

~~// TODO: Insert something here~~
Top ten reasons why I'm lazy
1.

lowracer25-Jan-23 20:50

lowracer

25-Jan-23 20:50

see if you can get an input screen (or a few) and some reports and open/run it for a specific record. Next you need to try and see if you can match the data for a specific record to fields on the input screens or reports. That will give you a good understanding of how some of the fields fit together.

MikeCO1026-Jan-23 2:17

MikeCO10

26-Jan-23 2:17

Southmountain wrote:
what is the best way to get understanding of this table quickly?

There isn't one. There are tools and scripts for most DBs that can create some type of analysis for you, but you really don't need to understand the fields, you need to understand the data.

We can all grouse and speculate about the "100s of fields", but let's assume there is a valid reason for them even though I'm hard pressed to come up with one.

What type of understanding are you trying to achieve? Data is data and the question is if and where it is used. I'd suspect there could be a lot of drop columns in your future, but that requires a detailed look at your recordset objects in the code that is using the DB. Honestly, it's a flat table so despite the crazy column count, it should be clear to understand. If it has a bunch of relations, that could take a lot of caffeine or alcohol, or both Smile | :)

What's your scope of work in relation to this monster? Crazy as it seems, the DB could be oddly efficient depending on the use of the data. You know, Select * (perish the thought!) from tablename where id=x is pretty simple, lol. If your task is to clean up and reduce the size of the database, that is one thing. If you're stuck with it, it is what it is and how the data is used is of the utmost importance.

Mark Starr26-Jan-23 4:11

Mark Starr

26-Jan-23 4:11

Understand ‘quickly?’: probably not possible.
Step back. Look at the application and the interfaces that update the table. Depending on the database tech, there’ll be a way to search procedures for the table name. Study these procedures.
Then (or while doing the above) look at a subset of the data, such as the last day’s worth of records.

Good luck. Wink | ;)

Time is the differentiation of eternity devised by man to measure the passage of human events.
- Manly P. Hall

Mark
Just another cog in the wheel

Peter Kelley 202126-Jan-23 4:53

Peter Kelley 2021

26-Jan-23 4:53

This mostly applies if it's a relational table. There a lot of excellent responses here, but one topic that I didn't see explicitely mentioned was "Normalization".
Normalize the data. You might need to have a sample of data to work(play) with to understand.
There are better resources to explain normalization further, and *some* SQL books hit the topic early (the good ones, anyway).

Normalization is at the root of what many here have indicated are problems - like too many columns in individual tables! You likely have key fields and data fields repeating all through your data. Even if you can't change the tables, understanding how it isn't normalized will help you understand the data better. (Maybe a future task will let you fix this)

> You'll need to understand the domain. It would be great if you have some people that can get you pointed in the best area. It sounds like there is a lot more complexity than can be dealt with quickly.
> Maybe there's a program that uses that table that might reveal columns content or purpose?
> Review a small sample of data in a dev/work database.
> For longer-term goals
>try to use the right terms too, especially when consulting your database expert. e.g. It might be habit to say 'fields', but in the relational database world it's a 'column'. Stuff like that will make communication more consistent.
> Maintain a group of documents, references for yourself of what you're finding - and share it.

> Topics and terms to be familiar with, or review:
> the different between a data table versus a reference table
> A Many-to-Many relationship and a "Join Table" (sometimes called a "Pin Table" etc. ...but not by me)
I didn't intend to write a book here! You have too much to read here already! I wish you success!

englebart28-Jan-23 3:53

englebart

28-Jan-23 3:53

Others have mentioned researching input screens/forms and reports.

If you can capture the queries/views/procedures from the database side that are involved that could help out.

Daniel Anderson 20213-Feb-23 5:20

Daniel Anderson 2021

3-Feb-23 5:20

I work for a finance institution, and our tables are highly un-normalized with all the problems this incurs.
we have many tables with hundreds of columns.

But people in finance are used to excel like spreadsheet and they want to see everything at one place.

I came to that place with a background in databases and I was horrified and still am.
But it is hard to change legacy code AND mentalities. So for better or worse I'm stuck with it.

The best advice I can give you: find the queries people/programs make the most often, this often reveal the «subtables» within the big one.
Also if you have self join that often indicate two tables (or more) sharing the same table data.
Once you have that draw a diagram as if those tables really exist.
If you can create views that represent those tables.
This way you will better understand the «schema»

Quote:
«Il faut imaginer Sisyphe heureux»
Albert Camus

based upon Engineer

rnbergren25-Jan-23 2:43

rnbergren

25-Jan-23 2:43

the engineer job description. I loved it.

Also the thought that jumped in my head.

Data Scientist.
Forecasts the future using neither data nor science.
See also, Weather Forecaster

Project Manager
Not a Manager of people nor do they have only one project

To err is human to really elephant it up you need a computer

Southmountain25-Jan-23 2:46

25-Jan-23 2:46

data scientist is a fancy term now...

diligent hands rule....

raddevus25-Jan-23 5:42

raddevus

25-Jan-23 5:42

I bet you really like the titles:

1. DevRel (Developer Relations)
2. Developer Advocate

Here's the meme I created for DevRel (snapshot)[^]. Roll eyes | :rolleyes:

Single Step Debugger25-Jan-23 6:34

Single Step Debugger

25-Jan-23 6:34

Development Manager:
Safety device that operates to provide deadline-overpressure protection of a VP and CEO. If project fails because of the high circle's stupidity, those melts.

Advertise here – minimum three posts per day are guaranteed.