Hi all, I need help. I would like to know whether what I am trying to do is possible. I am using PHP to call an API that retrieves company data for a CRM. The API accepts regular expressions for the search. For example, if I search for the literal “CEO”, it matches anything containing the sequence “CEO”, and the match is not case sensitive. I can make it precise by using the regex ^CEO$ to match only the exact term “CEO”, and I can combine several search terms in a single regex.

The goal is to run a search in a SINGLE API call for several different strings, like "CEO" and "OWNER", while skipping any target string that contains one of another list of strings, like "assistant", "junior" or "president".

What I have tried:

For example:
https://xxxxx.com/api?search=|CEO|OWNER|president

Target terms:
"OWNER", "assistant CEO", "OWNER and vice president", "CEO junior", "founder and director", "founder"

Search string list:
"CEO", "OWNER", "founder"

Ignore list:
"assistant", "junior", "president"

The result must match only:
"OWNER", "founder and director", "founder"

Note that I shortened the terms to keep it simple; in reality the target strings are paragraphs of text, and there may be about 10 strings to include and exclude.

Thank you all.
Updated 24-Mar-23 1:17am

I'd use two regexes to make it more obvious and easier to maintain: the first to retrieve all matching strings, and then a second applied to the results which eliminates the "ignore list".

While you could do it in a single regex (I think; I've not had to try), it would be horrible to understand, which makes it hard to maintain and reduces your app's reliability.

Splitting the operation in two makes the job a lot "cleaner" and easier to work with.
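As a rough illustration of the two-pass idea, here is a minimal sketch in Python (the same approach would work with preg_match in PHP). The term lists come from the question with spelling normalized; the function name two_pass_filter is made up for this example.

```python
import re

# Pass 1: terms we want to find (taken from the question's search list).
wanted = re.compile(r"CEO|OWNER|founder", re.IGNORECASE)
# Pass 2: terms that disqualify a result (the question's ignore list).
ignored = re.compile(r"assistant|junior|president", re.IGNORECASE)


def two_pass_filter(texts):
    """Keep texts that contain a wanted term and no ignored term."""
    candidates = [t for t in texts if wanted.search(t)]
    return [t for t in candidates if not ignored.search(t)]


targets = [
    "OWNER",
    "assistant CEO",
    "OWNER and vice president",
    "CEO junior",
    "founder and director",
    "founder",
]
kept = two_pass_filter(targets)
```

Each regex stays short and readable, and either list can be changed without touching the other.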
 
Comments
jerome sergan 24-Mar-23 5:15am    
hi,

I want to use only one API call and can't split it into two. Each API call returns hundreds of results and costs a lot of money, so splitting into two regexes doubles the cost. That is precisely the point of my question.

thanks
Chris Copeland 24-Mar-23 6:38am    
Once you've made the initial API call, couldn't you just use regular expressions on your side of the code (PHP) to do the secondary filtering? I.e., find all matches via the API with "CEO" or "OWNER", then loop through the results and remove any that contain the "junior", "assistant" or "president" terms?
jerome sergan 24-Mar-23 6:46am    
That would be the simplest, but unfortunately it's not possible; it's exactly what I'm trying to avoid. Let me explain:

Each API call returns an unpredictable number of results, anywhere from a single result to 300, and the API charges for each result separately. That's why I have to use the regex in the first query, so that it returns the fewest possible results matching my search strings...

You can use regular expression negative look-aheads to ensure that certain patterns don't match within the string, while still having a matcher to find the terms you are looking for.

Here's an example regular expression.

I can't guarantee that this will be perfect all the time, but it's the best option for the situation you have. Hopefully the API you're calling supports advanced expressions like this. Most of this expression was inspired by a StackOverflow post, which also provides a good example.
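The linked expression is not reproduced in this thread, so the following is only a sketch of the negative look-ahead technique, not the exact regex from the answer. It is written in Python for brevity; the look-ahead syntax is the same in PCRE. The helper name build_pattern is made up for this example, the term lists come from the question with spelling normalized, and the URL is the placeholder from the question. Whether the API's regex engine accepts look-aheads is an assumption that has to be checked against its documentation.

```python
import re
from urllib.parse import quote

include = ["CEO", "OWNER", "founder"]
exclude = ["assistant", "junior", "president"]


def build_pattern(include, exclude):
    """Build one regex: no excluded term anywhere, at least one included term."""
    inc = "|".join(re.escape(t) for t in include)
    exc = "|".join(re.escape(t) for t in exclude)
    # ^(?!.*(?:exc)) rejects the text if any excluded term occurs anywhere;
    # .*(?:inc) then requires at least one wanted term.
    return rf"^(?!.*(?:{exc})).*(?:{inc})"


pattern = build_pattern(include, exclude)

# Local check against the sample strings before spending a paid API call.
targets = [
    "OWNER",
    "assistant CEO",
    "OWNER and vice president",
    "CEO junior",
    "founder and director",
    "founder",
]
flags = re.IGNORECASE | re.DOTALL  # DOTALL lets "." cross line breaks
matches = [t for t in targets if re.search(pattern, t, flags)]

# The pattern contains characters such as ^ ( ? | that must be
# percent-encoded before they go into a query string.
url = "https://xxxxx.com/api?search=" + quote(pattern, safe="")
```

With these lists the pattern comes out as ^(?!.*(?:assistant|junior|president)).*(?:CEO|OWNER|founder). If the target paragraphs contain line breaks and the API offers no equivalent of the DOTALL flag, replacing each "." with [\s\S] is a common workaround.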
 
Comments
jerome sergan 24-Mar-23 7:19am    
hi, thanks

It looks good. I'll try it and let you know soon.
GKP1992 24-Mar-23 7:37am    
That's a nifty solution.

That depends on what the API allows you to do: whether it exposes a method that accepts a list of regexes or strings to include or exclude.
Assuming the API is a black box, the only way to know what it does is through the documentation.
If it does not allow what you want, bad luck.
 
Comments
jerome sergan 24-Mar-23 7:22am    
Hi, no, the API does not support an exclude list itself; otherwise I would not have asked this question. It does accept complex regexes, and logically a regex can include a list of terms to ignore, as Chris Copeland suggested. I will try this regex and post the result.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


