|
Cool article man. The only thing I'd add would be to give credit where it's due if you based some ideas or work on someone else's. Also do a quick search to see whether it already exists and, if so, what makes your article different. The Internet is kinda popular now, so there are tons of articles online already.
For example, check the bottom of one of my articles...
A multithreaded, OpenGL-enabled application.
Anyway, good job man... sharing knowledge feels good and helps you learn even more.
Jeremy Falcon
|
|
|
|
|
Somebody has done this before....
I have a long list of members in a class called c_Client. Data for the class will come from a database, HTML scraping and PDF scraping. It's a mess. I really don't know the extent of the mess.
So I made a c_Field class. c_Client contains a List<c_Field>. c_Field has members something like:
sNameInDatabase, // for select or insert statements
sDataFromDatabase, // some of the data will start out in the database, like the key
sDataFromScrape, // data from parse of screen scrape
sCaption, // for parsing screen scraping ~ like "First Name:"
DataType, // int, string, Date
iMaxLength, // of strings
sError // You guess
Now I'm sure somebody has done stuff like this before... but I have no idea where to start looking or even what search words to use. Any thoughts? T'anks, Mike
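For readers following along, the description above can be written down roughly as the sketch below. It only restates the listed members. The FieldDataType enum and the Fields property name are assumptions for illustration, not the poster's actual code.

```csharp
using System;
using System.Collections.Generic;

// Assumed enum covering the three data types named in the post.
public enum FieldDataType { Int, String, Date }

// Describes one data member of c_Client and carries its values from each source.
public class c_Field
{
    public string sNameInDatabase { get; set; } = "";   // column name for select or insert statements
    public string sDataFromDatabase { get; set; } = ""; // value as read from the database
    public string sDataFromScrape { get; set; } = "";   // value parsed from the screen scrape
    public string sCaption { get; set; } = "";          // caption to look for while scraping, e.g. "First Name:"
    public FieldDataType DataType { get; set; }         // int, string or date
    public int iMaxLength { get; set; }                 // maximum length for string values
    public string sError { get; set; } = "";            // last error recorded for this field
}

// The client is then a list of field descriptors (property name is hypothetical).
public class c_Client
{
    public List<c_Field> Fields { get; } = new List<c_Field>();
}
```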
|
|
|
|
|
|
I second that. Duck or run as fast as you can.
The language is JavaScript. that of Mordor, which I will not utter here
This is Javascript. If you put big wheels and a racing stripe on a golf cart, it's still a f***ing golf cart.
"I don't know, extraterrestrial?"
"You mean like from space?"
"No, from Canada."
If software development were a circus, we would all be the clowns.
|
|
|
|
|
I see this as a good technical discussion, not a specific programming question. I think it is very appropriate for the Lounge.
There are two kinds of people in the world: those who can extrapolate from incomplete data.
There are only 10 types of people in the world, those who understand binary and those who don't.
modified 10-Jan-17 10:29am.
|
|
|
|
|
That's not the problem. If you had seen what terrible things can come from that idea, then you would also run as fast as you can. Someone always comes along and tries to invent it yet again.
|
|
|
|
|
Goose!
|
|
|
|
|
Oof! Keep your hands to yourself young man!
veni bibi saltavi
|
|
|
|
|
You should read... At least the red...
And should move it to the right forum or QA...
And you should do it quickly...
Winter is coming...
Skipper: We'll fix it.
Alex: Fix it? How you gonna fix this?
Skipper: Grit, spit and a whole lotta duct tape.
|
|
|
|
|
1. First you should learn to read,
2. when you can do that please read the very first message in this forum.
3. Next, programming lessons; the names you chose alone tell me you aren't there yet either
(- and no, this is also not a programming lesson forum.)
Quack quack. Too bloody late.
Sin tack ear lol
Pressing the any key may be continuate
|
|
|
|
|
Are you saying you want a generic way to handle any structure? You can look into the dynamic type, but I'm not sure it will help you.
Also, in my opinion, don't start property names with s, i, etc. Usually the name of the variable makes it clear what it is. Caption should be a string. MaxLength is clearly an int, etc. For me, it's easier to read without that first letter.
|
|
|
|
|
I grew up with this: Hungarian notation - Wikipedia ... Still using ... Mostly ...
|
|
|
|
|
Kornfeld Eliyahu Peter wrote: I grew up with this: Hungarian notation - Wikipedia
Me too. But then I started using descriptive names.
|
|
|
|
|
RyanDev wrote: using descriptive names.
|
|
|
|
|
I would make all the data types string, and then create a parser to try and sort things out. If you know the exact data types for the exact fields you are scraping, then use those. Always use a TryParse for your non-string valued fields.
In the future, you may want to post your questions in the Question and Answer section, or the programming forum section.
-- Good luck.
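A minimal sketch of the TryParse advice, assuming the scraped values arrive as strings and that dates look like "01/10/2017" read as MM/dd/yyyy (that format, and the ScrapeParser and TryConvert names, are assumptions made for illustration):

```csharp
using System;
using System.Globalization;

public static class ScrapeParser
{
    // Tries to convert a scraped string to the requested kind ("int", "date", anything else = string).
    // Returns false instead of throwing when the text does not parse.
    public static bool TryConvert(string raw, string kind, out object value)
    {
        value = null;
        string text = (raw ?? "").Trim();
        switch (kind)
        {
            case "int":
                if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out int number))
                {
                    value = number;
                    return true;
                }
                return false;
            case "date":
                // Assumed date format; change it to match the real scrape.
                if (DateTime.TryParseExact(text, "MM/dd/yyyy", CultureInfo.InvariantCulture,
                                           DateTimeStyles.None, out DateTime date))
                {
                    value = date;
                    return true;
                }
                return false;
            default:
                // Strings are kept as they are (trimmed).
                value = text;
                return true;
        }
    }
}
```

A failed conversion can then be recorded in the field's error text rather than being allowed to crash the scrape.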
|
|
|
|
|
Yes, everything starts as strings.
It uses far more sophisticated parsing than just TryParse... including Regex
Put it this way: a while ago I needed to write fault-tolerant code for a WAN. I did it, but later I found something a lot like what I did, just a bit better. That is what I am looking for with this. I have good code, but I bet someone has already done this and has a few good ideas I don't. That is why I thought it OK to ask for keywords or concepts here. My apologies.
|
|
|
|
|
Well, I have to admit, I did not fully understand your request in your original post.
Let me ask you: is your solution working, and working acceptably? If so, why second-guess yourself? This type of stuff is like ETL work. Everyone has a way of doing the work, and everyone's way is the correct way, more or less. It is not an exact science.
|
|
|
|
|
I figure it's best to learn if possible rather than re-invent the wheel. I respect my fellow developers.
Think of the fault-tolerant Retry solution I came up with. It took me a long time and it consisted of a string that indicated Process_RetryCount. When I found someone else's solution it had Process_RetryCount_DelayToNextAttempt. That was a slick bit of improvement I would have used if I had known it up front (the code was too complicated for me to want to change without a compelling reason).
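The idea of pairing a retry count with a delay before the next attempt can be sketched as below. This is not the code from either solution mentioned above. The Retry.Run helper and its parameter names are hypothetical, and it uses a fixed delay as the simplest case.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs the action once, then retries up to retryCount more times,
    // sleeping delayToNextAttempt after each failed attempt.
    // When the last allowed attempt fails, its exception propagates to the caller.
    public static T Run<T>(Func<T> action, int retryCount, TimeSpan delayToNextAttempt)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < retryCount)
            {
                // Still have retries left: wait, then loop around for the next attempt.
                Thread.Sleep(delayToNextAttempt);
            }
        }
    }
}
```

Something like Retry.Run(() => CallTheWan(), 3, TimeSpan.FromSeconds(2)) would then try up to four times in total, where CallTheWan stands in for whatever operation is being protected.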
Right now I'm enhancing my current model of c_Field by adding reflection. That way I can use Newtonsoft to serialize c_Client to JSON. My requirements are a bit odd, so I was just looking for good ideas. ... Oh yah, talk about problems, try PDF scraping. Tabula should do it.
|
|
|
|
|
We do PDF scraping where I work, both forms and OCR.
|
|
|
|
|
Ahh, stringly typed code, once again!
|
|
|
|
|
Strongly typed code is so overrated. Just make everything a string and throw validation to the wind.
|
|
|
|
|
That's the old VB and Variant school!
|
|
|
|
|
Maybe no one has done anything like this before...
I don't care about Ducks, Geese, vultures or pelicans or even feathered dinosaurs. It doesn't matter in the slightest what notation I used.
I'm not looking for a technical discussion or code, I'm looking for a conceptual discussion or perhaps even ideas for search words.
Here, have some codish stuff and think about this... No whining please. Just good, bright, positive, helpful thoughts. Maybe you can learn something... Maybe from someone who has done this before and has some conceptual thoughts on it. Otherwise you are saying I came up with a completely new idea or certainly one you haven't thought of and I am sure you don't want to do that.
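List<c_Field> Lst_c_Field = new List<c_Field>();
DataSet ds = new DataSet(); // filled from the database before the loop runs
foreach (c_Field cField in Lst_c_Field)
    // the row indexer returns object, so convert to string explicitly
    cField.sDataFromDatabase = Convert.ToString(ds.Tables[0].Rows[0][cField.sNameInDatabase]);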
Then I can parse from my screen scraping and add values to other c_Field(s) in Lst_c_Field based on their sCaption. Compare what you got from the database with what you got from the screen scrape and you could then generate an update statement for the database... for that Client.
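The compare-and-update step might look something like the sketch below. It assumes a cut-down c_Field with only the three members it needs, and the UpdateBuilder name, the @p0-style placeholders and the @key parameter are all made up for illustration. Table and column names are assumed to come from your own code, never from scraped text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down field descriptor: only the members this sketch needs.
public class c_Field
{
    public string sNameInDatabase = "";
    public string sDataFromDatabase = "";
    public string sDataFromScrape = "";
}

public static class UpdateBuilder
{
    // Builds a parameterized UPDATE for the fields whose scraped value differs from the database value.
    // Returns null when nothing changed. Placeholder names and values are added to 'parameters';
    // the caller still has to bind @key to the client's key value.
    public static string Build(string table, string keyColumn,
                               IEnumerable<c_Field> fields, IDictionary<string, string> parameters)
    {
        List<c_Field> changed = fields
            .Where(f => f.sNameInDatabase != keyColumn && f.sDataFromScrape != f.sDataFromDatabase)
            .ToList();
        if (changed.Count == 0) return null;

        var sets = new List<string>();
        for (int i = 0; i < changed.Count; i++)
        {
            string placeholder = "@p" + i;
            sets.Add(changed[i].sNameInDatabase + " = " + placeholder);
            parameters[placeholder] = changed[i].sDataFromScrape;
        }
        string setClause = string.Join(", ", sets);
        return "UPDATE " + table + " SET " + setClause + " WHERE " + keyColumn + " = @key";
    }
}
```

Passing the scraped values as parameters rather than pasting them into the SQL text keeps odd characters in a scraped name from breaking the statement.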
|
|
|
|
|
Well, I've seen your construction before.
It's known by many names, such as MUCK (Massively Unified Code-Key), OTLT (One True Lookup Table) and EAV (Entity Attribute Value).
And it's generally not a good idea.
I won't bore you with my own experiences since I saw the problems in time. So read this instead.
<edit>And another one</edit>
|
|
|
|
|
I really like your signature... It is true! Evil must be defeated! (I happen to be writing a book on morality currently.)
Very cute article, but since I am all those roles, it should work out... If only I could get Jack (my personality #3 - handles database code) to quit just repeating "burn them, burn them all" again and again... but he always does that anyway.
I do understand the objection, but my database is pretty well set.... mostly varchar, date and some int keys. It's the conversions between them that are an issue. When parsing from scraped PDF or HTML, I have to have the string literal indicating what I am scraping, such as "First Name:" or "Date of Birth:", and the value scraped, so there are two strings in the c_Field object. Then there is the name in the database, "FirstName" and "DOB", with their values "Chucky" and "01/10/2017". Then the length in the Db and the data type. Add OutputCaption, DataSource and Error and you have a description of a data member that can be put into a database or... use reflection to populate the data members of the class (with or without the same name as in the Db)... since there is reason for me to serialize this to JSON. It is parsed on one machine with no good connection to the machine that consumes it. Oh yah, it becomes part of an HTML page displayed in a WinForms WebBrowser control.
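As a rough illustration of the reflection step, here is one way to copy values onto same-named properties. The Client class and FieldMapper.Populate are hypothetical names. This sketch only handles string properties whose names match the database names exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical target class with properties named after the database columns.
public class Client
{
    public string FirstName { get; set; } = "";
    public string DOB { get; set; } = "";
}

public static class FieldMapper
{
    // For each (name, value) pair, sets the public instance string property with that name, if one exists.
    // Names with no matching writable string property are skipped silently.
    public static void Populate(object target, IDictionary<string, string> valuesByName)
    {
        Type type = target.GetType();
        foreach (KeyValuePair<string, string> pair in valuesByName)
        {
            PropertyInfo property = type.GetProperty(pair.Key, BindingFlags.Public | BindingFlags.Instance);
            if (property != null && property.CanWrite && property.PropertyType == typeof(string))
            {
                property.SetValue(target, pair.Value);
            }
        }
    }
}
```

Once the object is populated this way, a serializer such as Newtonsoft's JsonConvert.SerializeObject can turn it into JSON for the consuming machine.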
Partly it seems sort of cool too. Besides, the same stuff that keeps my MAR Clap at bay holds off the fever. Really, if that isn't enough explanation for why this is difficult, realize it's for my wife.
|
|
|
|