I am trying to use regular expressions to extract individual SQL statements from a file that contains several SQL statements along with alternate delimiters and comments.

I am trying to match the following patterns to isolate each SQL statement; once an individual statement is isolated, I strip its comments:
"delimiter (del) (nonwhitespace sequence) (not (del) or comment with (del)) (del)"
"(not ; ) ;"

The first pattern should allow any sequence of non-whitespace characters to be used as the delimiter.

What I have tried:

I tried the following to match the first pattern:
"/\s*delimiter\s+(?<d>[^\s]+)\s*;?\s*(?<qstr>(((?!--|\g{d}).)*|--[^\R]*\R)+)\g{d}\s*;?/s"

If the first pattern fails, I try the following to match the second pattern:
"/\s*(?<qstr>(((?!--|;).)+|--[^\R]*\R)*);/s"

Then, if either succeeds, I replace matches of the following with an empty string:
"/--[^\n\r]*(?:\n|\r)+/"
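To make the comment-stripping step concrete, here is a minimal sketch in Python rather than PHP. It uses the same pattern as above without the surrounding / delimiters; the function name strip_line_comments is made up for illustration. It assumes that "--" never appears inside a string literal, and, because the pattern requires a line break, a comment on the very last line with no trailing line break is left in place.

```python
import re

# Same pattern as in the question, minus the PHP "/" delimiters.
COMMENT = re.compile(r"--[^\n\r]*(?:\n|\r)+")


def strip_line_comments(sql):
    """Remove "-- ..." comments together with the line break(s) that end them.

    Hypothetical helper: it does not understand string literals, so a "--"
    inside quotes would also be treated as the start of a comment.
    """
    return COMMENT.sub("", sql)


statement = "create table t -- movie table\n(\n  id int -- key\n)\n"
print(strip_line_comments(statement))
```

Note that the line break is removed along with the comment, so the text after a comment is joined onto the same line as the text before it.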

My problem is that Apache crashes on preg_match when I search for either of the first two regular expressions in the following string:
"delimiter $$
create table MovieDetail
(
imdbid varchar(32) primary key not null,
title varchar(512),
year int,
rated varchar(16),
released int,
runtime int,
director varchar(128),
writer varchar(12),
plot varchar(2048),
imageurl varchar(512),
rating float,
ratingcount int,
type varchar(64)
); $$
detect this text as a separate statement"

I tried replacing the escape sequences with // forms, like //s and //g, and it still crashes just the same.

I'm using XAMPP with Apache 2.4.17 and PHP 5.6.23 (VC11 X86 32bit thread safe) + PEAR.

I tested them on debuggex.com and both expressions are valid.

Major update: the problem seems to appear only when I run the expressions on multi-line strings, so I'm going to try comparing the binary data of two strings in which I replace the line break with \n or \r\n.

Update: the problem seems to occur only with multiple whitespace characters.
Updated 17-Aug-16 5:24am

Here are links to tools that help build and debug regular expressions:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
This one shows the regex as a graph, which is really helpful for understanding what a regex is doing:
Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.

Looks like "(?(" is not allowed.
"/\s*delimiter\s+(?[^\s]+)\s*;?\s*(?(((?!--|\g{d}).)*|--[^\R]*\R)+)\g{d}\s*;?/s"
and
"/\s*(?(((?!--|;).)+|--[^\R]*\R)*);/s"
are wrong.
 
Comments
Patrice T 16-Aug-16 3:35am    
Yes, I did it but "(?[" and "(?(" are not allowed
Patrice T 16-Aug-16 3:45am    
Can't help you more
I found out that the source of the crash was the use of nested and repeated captured subpatterns, but even after that I wasn't able to get the expressions to work as desired, so I gave up and resorted to manual character processing.
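The manual processing code itself is not shown in this thread, so the following is only a sketch of what such a scanner might look like, written in Python rather than PHP. The function name split_statements and its default_delimiter parameter are hypothetical. It assumes that a "delimiter X" directive sits on a line of its own, that "--" never appears inside a string literal, and that there are no /* ... */ comments.

```python
def split_statements(script, default_delimiter=";"):
    """Split an SQL script into statements without regular expressions.

    Hypothetical sketch: handles "delimiter X" lines and "--" comments only.
    """
    delimiter = default_delimiter
    statements = []
    buffer = ""

    for line in script.splitlines():
        words = line.split()
        # A "delimiter X" line switches the terminator; it is not a statement.
        if len(words) >= 2 and words[0].lower() == "delimiter":
            delimiter = words[1]
            continue

        # Naive comment removal: cut the line at the first "--".
        # Doing this first means a delimiter inside a comment is ignored.
        code = line.split("--", 1)[0]
        buffer += code + "\n"

        # Emit every complete statement currently held in the buffer.
        while delimiter in buffer:
            statement, buffer = buffer.split(delimiter, 1)
            if statement.strip():
                statements.append(statement.strip())

    # Text after the last delimiter is returned as a final statement.
    if buffer.strip():
        statements.append(buffer.strip())
    return statements


script = "delimiter $$\ncreate table t\n(\n  id int -- key $$\n); $$\ndetect this text"
for statement in split_statements(script):
    print(repr(statement))
```

With the sample above, the "$$" inside the comment is discarded before the delimiter check, the create table statement ends at the "$$" on the following line, and the trailing text comes back as a second statement. Working line by line keeps the amount of work proportional to the input size, so there is no backtracking that could exhaust the stack.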
 
Comments
Patrice T 17-Aug-16 11:35am    
If this is not a problem anymore, close the question by accepting at least one of the solutions.
Use Accept answer to close the question.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


