|
Rob Philpott wrote: so is it that you've got these large files to process and you're trying to optimise the switching (state changes) for speed
Indeed.
Rob Philpott wrote: Which approach are you using at the moment
I've set up the parsing using switches just to make sure it works, but it's painfully slow so I'm looking at refactoring it at the moment.
Rob Philpott wrote: Does this mean the switch/state change logic is fixed in advance?
This is where it gets funny.
In theory, yes. But changes might happen every now and then.
These files are supplied by a government entity. And while we're allowed to get the data (which is actually only a subset), we're not allowed access to the documentation (no, really).
And they can't be bothered to make separate documentation for our subset (without us paying an extortion fee, that is).
So I've added logic that tells me when they've added or removed attributes.
Oddly enough, I'm having fun tinkering with these files, mostly.
Multithreading is the next logical step, but I want to get as far as possible without using brute force before that.
|
|
|
|
|
I believe I might have a generic answer to my question.
This is a simple implementation of IDictionary using a singly linked list. It is smaller and faster than a Hashtable if the number of elements is 10 or less. This should not be used if performance is important for large numbers of elements.
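The description quoted above matches the documentation remarks for System.Collections.Specialized.ListDictionary. A minimal sketch of how it might serve as a small keyword-to-handler table follows; the keywords and handlers are hypothetical, since the real attribute names are not given in this thread.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical keyword-to-handler table; the real attribute names are unknown.
var handlers = new ListDictionary();   // non-generic: keys and values are stored as object
int persons = 0, addresses = 0;
handlers.Add("PERSON", (Action)(() => persons++));
handlers.Add("ADDRESS", (Action)(() => addresses++));

foreach (var keyword in new[] { "PERSON", "ADDRESS", "PERSON", "UNKNOWN" })
{
    // The indexer returns null for a missing key, so unknown keywords are skipped.
    if (handlers[keyword] is Action handler)
        handler();
}

Console.WriteLine($"{persons} persons, {addresses} addresses");
```

Whether this beats a generic Dictionary<string, Action> for a given key count is something only a profiler run on the real files can tell.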
|
|
|
|
|
Maybe, that thing is old, predating generics so there might be some boxing overhead depending on what you stick in it, unless they've done a generic version of it.
It's hard to comment from this distance, but if the state machine might change, surely it's better to model it at runtime so you just need to adjust some static data rather than go back to source...
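A rough sketch of what "model it at runtime" could look like: the transitions live in a lookup table rather than in switch statements. The state and token names here are invented for illustration, and in practice the table might be loaded from a file instead of being written inline.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical states and tokens; the table is plain data and could be
// read from a configuration file instead of being compiled in.
var transitions = new Dictionary<(string State, string Token), string>
{
    [("Start", "BEGIN")] = "InRecord",
    [("InRecord", "ATTR")] = "InRecord",
    [("InRecord", "END")] = "Start",
};

string Run(IEnumerable<string> tokens)
{
    var state = "Start";
    foreach (var token in tokens)
    {
        // A missing (state, token) pair means the input no longer matches the table.
        if (!transitions.TryGetValue((state, token), out var next))
            throw new InvalidOperationException($"No transition from {state} on {token}");
        state = next;
    }
    return state;
}

Console.WriteLine(Run(new[] { "BEGIN", "ATTR", "ATTR", "END" }));
```

When the format changes, only the table needs editing; the loop in Run stays as it is.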
Profiling is always a good option, to see where the bottlenecks lie. Anyway, best of luck!
Regards,
Rob Philpott.
|
|
|
|
|
Jörgen Andersson wrote: It's strings.
For a dictionary then you are going to need to compute the hash.
Jörgen Andersson wrote: And as the files
And then you must compute the hash for each of those.
I suspect this really depends on the size and probably the standard deviation of the sizes for each string.
I haven't thought this through and certainly have not profiled it but a tree might be better. The sparse tree is built with each fork having one character. Next level has 26 (or whatever size your set is) characters.
Keep in mind that a hash requires sequencing through each character. So a tree is somewhat similar to that EXCEPT when you reach the end (leaf of tree) you have already reached your delegate. So no further operations to look up.
If each level has the entire character set you can use an array and do a direct look up to the next level (the character is the index into the array.)
Carefully calculate the memory space. You could use a sparse tree but that will slow it down.
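The tree described above could be sketched roughly as follows, assuming the keywords use only the 26 upper-case letters 'A' to 'Z'. Each node is a 26-entry array indexed directly by character, and the handler delegate sits on the node where a keyword ends. The Add and Find names and the sample keywords are made up for this example.

```csharp
using System;
using System.Collections.Generic;

const int Alphabet = 26;                                // assumption: keywords use only 'A'..'Z'
var children = new List<int[]> { new int[Alphabet] };   // node 0 is the root; entry 0 means "no child"
var handlers = new List<Action?> { null };              // one slot per node, null if no keyword ends there

void Add(string keyword, Action handler)
{
    int node = 0;
    foreach (char c in keyword)
    {
        int slot = c - 'A';                 // the character is the index into the array
        if (children[node][slot] == 0)
        {
            children.Add(new int[Alphabet]);
            handlers.Add(null);
            children[node][slot] = children.Count - 1;
        }
        node = children[node][slot];
    }
    handlers[node] = handler;
}

Action? Find(ReadOnlySpan<char> text)
{
    int node = 0;
    foreach (char c in text)
    {
        int slot = c - 'A';
        if ((uint)slot >= Alphabet) return null;   // character outside the assumed set
        node = children[node][slot];
        if (node == 0) return null;                // no keyword starts with this prefix
    }
    return handlers[node];                         // the delegate is already in hand at the last node
}

int hits = 0;
Add("PERSON", () => hits++);
Add("PERSONS", () => hits += 10);
Find("PERSON")?.Invoke();
Console.WriteLine(hits);
```

Every node costs 26 ints here, which is the memory trade-off mentioned above; a larger character set multiplies that cost accordingly.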
And maybe you should look at unmanaged code. Specifically C++.
One advantage to C++ (and C) is that you can force a string to be treated as a numeric. So a value like "ABCD" can be cast directly to a 32 bit unsigned integer. And of course you could use 64 bit also.
The problem with that of course is that you must then deal with 4/8 character size blocks only.
Jörgen Andersson wrote: be worth some optimization.
You should actually profile the application. Not specific code but the entire application. If you want it to be fast then you should find the exact places where it is slow.
|
|
|
|
|
A tree is a really good suggestion!
Thanks!
|
|
|
|
|
jschell wrote: One advantage to C++ (and C) is that you can force a string to be treated as a numeric. So a value like "ABCD" can be cast directly to a 32 bit unsigned integer. And of course you could use 64 bit also.
Ah, yes to do that in C# I would need to use pointer indirection operators.
I hate pointers.
|
|
|
|
|
Not if you're using a recent version of .NET, or have a reference to the System.Memory NuGet package:
ReadOnlySpan<char> input = "ABCD";
ReadOnlySpan<byte> bytes = System.Runtime.InteropServices.MemoryMarshal.AsBytes(input);
int value = System.Runtime.InteropServices.MemoryMarshal.Read<int>(bytes);
Of course, you may still need to take the "endianness" of the system into account.
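One way to take the byte order out of the equation is to state it explicitly with System.Buffers.Binary.BinaryPrimitives. This sketch assumes the text is plain ASCII, so each character occupies exactly one byte.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// ASCII encoding gives one byte per character: 0x41 0x42 0x43 0x44.
byte[] bytes = Encoding.ASCII.GetBytes("ABCD");

// The byte order is named in the call, so the result does not depend on the machine.
int big = BinaryPrimitives.ReadInt32BigEndian(bytes);
int little = BinaryPrimitives.ReadInt32LittleEndian(bytes);

Console.WriteLine($"{big:X8} {little:X8}");   // prints "41424344 44434241"
```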
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Richard Deeming wrote:
ReadOnlySpan<char> input = "ABCD";
ReadOnlySpan<byte> bytes = System.Runtime.InteropServices.MemoryMarshal.AsBytes(input);
int value = System.Runtime.InteropServices.MemoryMarshal.Read<int>(bytes);
Will that compile to a single instruction, as you might see when using C/C++ casting?
If you want to treat 4 chars at a time by treating them as ints, this doesn't look like something that would save CPU cycles. I admit that I haven't tried to compile the code and studied the instructions generated.
Endianness isn't your only concern. Don't forget that UTF-16, the internal character format of C#, can also contain surrogates and other funny elements.
jschell wrote: So a value like "ABCD" can be cast directly to a 32 bit unsigned integer.
That obviously expects a result of 1 094 861 636 (hex 41424344) on big-endian machines and 1 145 258 561 (hex 44434241) on little-endian machines. With the UTF-16 representation, the value 4 325 441 (hex 00420041) encodes only two characters, "AB", not four as the C++ programmer expected. (C# never used an 8-bit char representation, so the C# programmer should not expect four chars to be packed into a 32-bit int.)
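The two-characters-per-int point can be checked with a few lines. This sketch uses Encoding.Unicode (UTF-16, little-endian) and an explicit little-endian read so that the numbers come out the same on any machine; that choice is an assumption made for the example, not what the MemoryMarshal code above does on a big-endian system.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// UTF-16LE uses two bytes per char here: 41 00 42 00 43 00 44 00.
byte[] utf16 = Encoding.Unicode.GetBytes("ABCD");

// The first four bytes cover only 'A' and 'B'.
int first = BinaryPrimitives.ReadInt32LittleEndian(utf16);

Console.WriteLine($"{utf16.Length} bytes, first int = {first:X8}");   // prints "8 bytes, first int = 00420041"
```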
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
For literals, the other alternative would be to use a UTF8 string literal:
ReadOnlySpan<byte> bytes = "ABCD"u8;
int value = System.Runtime.InteropServices.MemoryMarshal.Read<int>(bytes);
It's not going to compile to a single instruction, but it should be fairly well optimized.
|
|
|
|
|
ReadOnlySpan<byte> bytes = "ABCD"u8;
You see, I had no idea you could do that. u8 - when did that arrive?! Can't keep up with it all.
Regards,
Rob Philpott.
|
|
|
|
|
|
I think that is one of the (few) useful extensions to C# in recent years.
|
|
|
|
|
That solves the issue if you restrict yourself to 7-bit ASCII.
Even for West European languages (such as Norwegian), a character may fill more than one octet. So if you step through a string 4 characters at a time, converting them to 32 bits, you will miss 8 bits here and there. Can lead to nasty, hard-to-debug errors when it happens with a customer on the other side of the earth, using a language with lots of multi-octet UTF8 characters.
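A small check of the multi-octet point, using a Norwegian word picked purely as an illustration: the character count and the UTF-8 byte count differ, so fixed 4-byte blocks do not line up with 4 characters.

```csharp
using System;
using System.Text;

// "blåbær" (Norwegian for blueberry), written with escapes to avoid source-encoding surprises.
string word = "bl\u00E5b\u00E6r";
byte[] utf8 = Encoding.UTF8.GetBytes(word);

// 'å' and 'æ' each take two bytes in UTF-8; the other letters take one.
Console.WriteLine($"{word.Length} chars, {utf8.Length} bytes");   // prints "6 chars, 8 bytes"
```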
|
|
|
|
|
П.... не копил Editor3D_Render_Control на 2022 MSVStuio
|
|
|
|
|
Write your question in English and NOT with Cyrillic letters ...
|
|
|
|
|
I'm sorry, your question makes no sense. First of all, this is an English-speaking site, so please post questions in English. Second, when I translated your question, it said:
P.... didn’t save Editor3D_Render_Control for 2022 MS VStuio
What does that even mean?
|
|
|
|
|
What did you use to translate? Google produces "screw up" instead of "save" for me.
|
|
|
|
|
If you mean this control: Editor3D: A Windows.Forms Render Control with interactive 3D Editor in C#, then don't post this under a "generic" forum like C#. If you got the code from an article, there is an "Add a Comment or Question" button at the bottom of that article, which causes an email to be sent to the author. They are then alerted that you wish to speak to them.
Posting this here relies on them "dropping by" and realising it is for them.
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
In VS 2022 I cannot find any templates for WinUI. I installed the Windows SDK and WinUI controls, but no templates show up. Any idea why?
Thanks
|
|
|
|
|
|
|
Did you install the correct workloads by following the link in point 1?
|
|
|
|
|
|
|