|
Very nice, compact, works great and is very fast. Thanks, Jack!
Best wishes,
Hans
|
|
|
|
|
I've seen a few people in these boards complain that I didn't check for null pointers in this function. This is a C function and the last time I checked, passing NULL to strcmp or any other C string function will segfault. I'm not saying this is great, and if you wanted to add a check for null, that would be fine. I just don't think that this is a 'bug' (if you can even call it that) worth flaming an otherwise great function.
-Jack
There are 10 types of people in this world, those that understand binary and those who don't.
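For anyone who does want the check Jack mentions, here is a minimal sketch. The function body is pieced together from the fragments quoted elsewhere in this thread, so treat it as a reconstruction rather than the article's exact listing. The wrapper name wildcmp_checked and the choice to treat a NULL argument as "no match" are assumptions made for illustration.

```c
#include <stddef.h>

/* Reconstructed from fragments quoted in this thread; it may differ from
   the article's listing in details. */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  /* Match literal characters and '?' up to the first '*'. */
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;        /* a trailing '*' matches the rest of string */
      }
      mp = wild;         /* pattern position just after the '*' */
      cp = string + 1;   /* where to resume if this attempt fails */
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;         /* backtrack to just after the last '*' */
      string = cp++;
    }
  }

  /* Any '*' left over at the end of the pattern matches the empty string. */
  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Hypothetical wrapper: treats a NULL argument as "no match". */
int wildcmp_checked(const char *wild, const char *string) {
  if (wild == NULL || string == NULL) {
    return 0;
  }
  return wildcmp(wild, string);
}
```

With this wrapper, wildcmp_checked(NULL, "whatever") returns 0 instead of crashing. Whether "no match" is the right answer for NULL is a design choice; an assert would be just as defensible.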
|
|
|
|
|
I agree. This is not what I meant.
Efrat
|
|
|
|
|
Yeah, I wasn't talking to you, you were respectful. I was talking to the people below in the 'too complicated' thread. Namely 'The C++ Guru'.
-Jack
There are 10 types of people in this world, those that understand binary and those who don't.
|
|
|
|
|
Great code, but if I'm not mistaken,
cp can point beyond the bounds of the string array.
Try: wild="*a", string="xyzab"
correction:
string = cp++;
should be changed to:
string = cp;
if(*cp) cp++;
|
|
|
|
|
I can not reproduce this bug.
wildcmp("*a", "xyzab") returns 0 as it should.
Can you elaborate on what you are doing to cause it to function incorrectly?
Thanks,
Jack
|
|
|
|
|
You're right, it is not a bug in functionality,
but when I debugged it I saw that
for the above input, the line:
string = cp++;
causes cp to point one place beyond the string array bounds.
(When string points to the last char 'b', cp points to \0.
On the next iteration, string will be advanced to point to \0,
and cp will be advanced to point to one place after the \0).
Although it is not critical, I thought it was worth fixing.
regards,
Efrat
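To make the point above concrete, here is a sketch of an instrumented matcher that reports where cp ends up. The function name wildcmp_cp, the guard flag and the cp_off out-parameter are all hypothetical, and the body is reconstructed from the fragments quoted in this thread rather than copied from the article.

```c
#include <stddef.h>

/* Hypothetical instrumented matcher. If guard is nonzero it uses the
   proposed "string = cp; if (*cp) cp++;" instead of "string = cp++;".
   *cp_off receives the last offset of cp from the start of string,
   or -1 if cp was never set. */
int wildcmp_cp(const char *wild, const char *string, int guard, long *cp_off) {
  const char *start = string;
  const char *cp = NULL, *mp = NULL;

  *cp_off = -1;
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
      *cp_off = (long)(cp - start);
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      if (guard) {
        string = cp;
        if (*cp) cp++;     /* never step past the terminator */
      } else {
        string = cp++;     /* original form: may step one past the '\0' */
      }
      *cp_off = (long)(cp - start);
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
```

For wild="*a" and string="xyzab" both forms return 0. The original form leaves cp at offset 6, one past the terminating '\0' at offset 5, while the guarded form leaves it at offset 5.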
|
|
|
|
|
Thanks, I'll have to look into this once I get some free time.
-Jack
There are 10 types of people in this world, those that understand binary and those who don't.
|
|
|
|
|
Actually, it was brought to my attention that this is legal in C:
a pointer may point one element past the end of an array as long as it is not dereferenced, and cp is not used afterwards.
So it looks like it's just a matter of coding practice.
|
|
|
|
|
I saw this one a long time ago, and finally have a use for it. Thank you very much.
Chris Richardson
Programmers find all sorts of ingenious ways to screw ourselves over. - Tim Smith
|
|
|
|
|
No problem. I hope it serves you well.
-Jack
There are 10 types of people in this world, those that understand binary and those who don't.
|
|
|
|
|
I converted this into C# and bingo...
I tried to break it but couldn't.
|
|
|
|
|
You didn't try hard enough:
wildcmp(NULL,whatever)
will break it.
Hector Santos, CTO
Santronics Software, Inc.
http:/www.santronics.com
|
|
|
|
|
The C# code would NOT break, instead the framework would throw an exception in this case. With appropriate exception handling routines, pointer checking in C# is useless overhead.
Cheers anyway,
K. C. Dorner
IBM Billing Solution
|
|
|
|
|
Exception handling in the .NET Framework is slow as hell.
We will see.
|
|
|
|
|
Where can I get the C# version?
- Bruce
BRCKCC
|
|
|
|
|
Hi !
Very useful function; it saved me at least one cigarette's lifetime.
Seriously - great code.
Stanislav.
|
|
|
|
|
Yeah...good stuff. How long did it take you?
|
|
|
|
|
Could anyone explain to me how this code works? I am having trouble figuring out what is going on in a couple of places. I would think a short explanation would help out some other people like me who don't know C. Thanks in advance!
|
|
|
|
|
Ok, I'll try, even though I think it would be a good idea for you to learn C.
The first loop basically goes through both strings step by step until there is a * in the wild string.
Whenever the characters of the two strings don't match and the character in the wild string is not a ?, the function returns 0 (FALSE) = no match.
(I'm not a hundred percent sure, 'cause I don't have time to test it, but I guess this loop is for speed reasons only.)
The second loop does the hard thing:
  if (*wild == '*') {
    if (!*++wild) {
      return 1;
    }
    mp = wild;
    cp = string+1;
This branch runs when *wild is a star. It stores the positions of the two pointers: mp remembers the pattern position just after the *, and cp remembers the next string position to retry from.
(*wild is the character at the position the pointer wild currently points to - an easy explanation, not 100% correct.)
If this * is the last character in the wild string, the function returns 1 (TRUE) = match.
  } else {
    wild = mp;
    string = cp++;
This last branch of the if chain basically solves two things in one.
Firstly, it is responsible for advancing the string pointer.
Secondly, after a failed matching attempt, it sets both pointers back to the positions saved at the last *.
  } else if ((*wild == *string) || (*wild == '?')) {
    wild++;
    string++;
This branch does the same as the first loop, just after the first *. (In the function itself it comes before the final else.)
  while (*wild == '*') {
    wild++;
  }
Well, this loop just ignores any * left at the end of the wild string.
  return !*wild;
And now, that's a nice one.
I like it.
After skipping all the * in the last loop, the wild string can now contain either
- nothing more, which means *wild is '\0' (the terminating null character), or
- anything but nothing.
If *wild is '\0', all the comparisons were successful and the function can return 1.
Or more simply: !*wild is !'\0', which is 1.
If *wild is not '\0' but any other character, !*wild will be 0.
So this
return !*wild;
basically replaces
  if (*wild == '\0') {
    return 1;
  } else {
    return 0;
  }
or something like this.
An example to explain how it really works:
wild is 'bl?h.*g'
string is 'blah.jpgeg'
After the first loop, where 'b' matches 'b', 'l' matches 'l', '?' matches 'a', 'h' matches 'h' and '.' matches '.',
the two pointers wild and string are positioned like this:
*wild |
'bl?h.*g'
'blah.jpgeg'
*string |
Now the second loop starts:
The string pointer is advanced until it points to a character that is the same as the one after the * in wild.
That means it looks for a 'g' in string, starting from the current position.
This increment is done by the last else, explained above under 'firstly'.
So it will look like this:
*wild |
'bl?h.*g'
'blah.jpgeg'
*string |
Now the else-if branch increases both pointers, because 'g' == 'g'.
*wild is now '\0' because the g was the last character in the wild string.
*string is 'e'.
Because '\0' != 'e', it sets the pointers back to the values they had before the comparison rush.
This is done again by the last else, explained above under 'secondly'.
But the difference now is that the string pointer is one character further along than before.
It now looks like this:
*wild |
'bl?h.*g'
'blah.jpgeg'
*string |
This change compared to the first time is done by the cp++ of
  } else {
    wild = mp;
    string = cp++;
where the pointer cp is incremented.
The same game starts all over again, and again it doesn't succeed.
So next time it will look like this:
*wild |
'bl?h.*g'
'blah.jpgeg'
*string |
and:
*wild |
'bl?h.*g'
'blah.jpgeg'
*string |
and finally:
*wild |
'bl?h.*g'
'blah.jpgeg'
*string |
And this will end the loop, as *string will be '\0' after the next run.
Well, it's not an easy explanation, as the problem of wildcard search is not really as easy as C++ Guru wants it to be.
(Maybe for a guru, it's easy.)
Don't hesitate to ask if the answer is not understandable.
And of course please correct me if something is wrong!
Targys
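For anyone who would rather watch the pointers move than follow them on paper, here is a small tracing sketch. It is reconstructed from the fragments quoted in the explanation above, not copied from the article, and the name wildcmp_trace and the output format are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical tracing variant. Prints the offsets of wild and string at
   the top of every iteration of the second loop; '#' stands for the
   terminating '\0' of the pattern. */
int wildcmp_trace(const char *wild, const char *string) {
  const char *w0 = wild, *s0 = string;  /* start positions, for offsets */
  const char *cp = NULL, *mp = NULL;

  /* First loop: plain comparison up to the first '*'. */
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  /* Second loop: match with backtracking to the last '*'. */
  while (*string) {
    printf("wild[%ld]='%c'  string[%ld]='%c'\n",
           (long)(wild - w0), *wild ? *wild : '#',
           (long)(string - s0), *string);
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  /* Third loop: skip any '*' left at the end of the pattern. */
  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
```

Calling wildcmp_trace("bl?h.*g", "blah.jpgeg") prints one line per iteration of the second loop, starting with wild at offset 5 (the '*') and string at offset 5 (the 'j'), and returns 1.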
|
|
|
|
|
why use local variables and so many loops? it can be much easier to match two strings. i don't understand why so many people spend hours to search the web for wildcard matching when they can write it themselves in 5 minutes time??!
the code below could be shorter but it's easier to read like this.
i didn't debug it very much but it will work, though.
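// -------------------------------------------------------------------
int wildcmp(const char* wild, const char* string)
// -------------------------------------------------------------------
{
    // same character (but not a wildcard star): done at the end, else compare the rest
    if(*wild == *string && '*' != *wild)
        return '\0' == *string || wildcmp(++wild, ++string);
    // string exhausted: only stars may remain in the pattern
    if('\0' == *string)
        return '*' == *wild && wildcmp(++wild, string);
    switch(*wild)
    {
        case '?': // matches any single character
            return wildcmp(++wild, ++string);
        case '*':
            wild++;
            if('\0' == *wild) // a trailing star matches everything
                return 1;
            // try the rest of the pattern at every remaining position
            while(*string != '\0')
                if(wildcmp(wild, string++))
                    return 1;
            // nothing matched: fall through and return 0
        default:
            return 0;
    }
}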
yours,
the c++ guru himself.
|
|
|
|
|
Works fine, can't beat!
But how the hell can you make this even shorter???
|
|
|
|
|
no problem, but it looks very ugly. but i think you cannot do it with less characters. i would be glad to find out i was wrong, so try to do it shorter!
int wildcmp(const char* w, const char* s)
{
    if(*w == *s && '*' != *w) return !*s || wildcmp(++w, ++s);
    if(!*s) return '*' == *w && wildcmp(++w, s);
    if('?' == *w) return wildcmp(++w, ++s);
    if('*' == *w) if(!*++w) return 1; else while(*s) if(wildcmp(w, s++)) return 1;
    return 0;
}
|
|
|
|
|
I don't know about the rest of you, but I personally prefer FAST code to SHORT code. The function you posted is considerably slower, do some benchmarks, I have.
This code....
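int wildcmp(const char *wild, const char *string); /* the function under test */

int main(int argc, char **argv) {
  int x;
  /* call the matcher 9999999 times on a fixed pattern and string */
  for (x=0; x<9999999; x++) {
    wildcmp("*t?st?n*this*t*", "testin this sh*t");
  }
  return 0;
}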
using your function...
real 0m13.170s
user 0m13.120s
sys 0m0.020s
using my function...
real 0m6.804s
user 0m6.790s
sys 0m0.000s
Furthermore I don't see why you criticize me for using loops when you are using a recursive function.
Thanks for your reply anyhow,
Jack
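If anyone wants to repeat the comparison without the shell's time command, here is a rough sketch using clock() from <time.h>. Everything in it is an assumption made for illustration: wildcmp_iter is reconstructed from the fragments quoted in this thread, wildcmp_rec is a recursive matcher in the spirit of the one posted above (not a verbatim copy), and time_matcher is a hypothetical helper. The numbers will depend on the machine and compiler.

```c
#include <stddef.h>
#include <time.h>

/* Iterative matcher, reconstructed from fragments quoted in this thread. */
int wildcmp_iter(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) return 0;
    wild++;
    string++;
  }
  while (*string) {
    if (*wild == '*') {
      if (!*++wild) return 1;
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }
  while (*wild == '*') wild++;
  return !*wild;
}

/* Recursive matcher, written for this sketch. */
int wildcmp_rec(const char *w, const char *s) {
  if (*w == '*') {
    if (!*++w) return 1;               /* trailing '*' matches everything */
    for (;; s++) {                     /* try every suffix of s */
      if (wildcmp_rec(w, s)) return 1;
      if (!*s) return 0;
    }
  }
  if (!*s) return !*w;                 /* string exhausted */
  if (*w == *s || *w == '?') return wildcmp_rec(w + 1, s + 1);
  return 0;
}

/* Hypothetical helper: CPU seconds spent calling fn `iterations` times. */
double time_matcher(int (*fn)(const char *, const char *), long iterations) {
  volatile int sink = 0;               /* keeps the calls from being dropped */
  clock_t start = clock();
  long i;
  for (i = 0; i < iterations; i++) {
    sink += fn("*t?st?n*this*t*", "testin this sh*t");
  }
  return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_matcher(wildcmp_iter, 9999999) and time_matcher(wildcmp_rec, 9999999) from a main of your own gives two figures to compare. clock() measures CPU time, so it is closer to the "user" line than to the "real" line in the output above.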
|
|
|
|
|
i didn't criticize you.
i just said that wildcard matching is not a very heavy problem to solve and that i don't understand why people spend hours looking around to find code while it takes a cigarette's lifetime to do it yourself.
first of all i feel happy that my function works at all! secondly, i know that recursion produces noticeable overhead, and as i said it was not my primary objective to write a fast function. i just tried to write it in 5 minutes time.
|
|
|
|
|