Introduction
DEELX is a simple regular expression engine coded in pure C++.
The entire source code of DEELX is a single header file (deelx.h). There are no other .cpp or .lib files, so you do not need to create a separate project for DEELX when you want to use it, and you do not need to worry about link problems.
DEELX is highly portable: it can be compiled with Visual C++ 6.0, 7.1, 8.0 (Windows), gcc (Cygwin), gcc (Linux), gcc (FreeBSD), Turbo C++ 3.0 (DOS), C++ Builder (Windows), etc. DEELX is implemented with templates, so char, wchar_t and other simple types can be used as its base type.
The DEELX regular expression engine is designed to be convenient and easy to use.
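To illustrate what a template-based design buys, here is a minimal sketch of a function that works on any simple character type. CountLeadingDigits is a hypothetical helper written for this illustration; it is not part of deelx.h and does not reflect how DEELX is implemented internally.

```cpp
#include <cstddef>

// Hypothetical helper (not part of deelx.h): counts how many leading
// characters of a zero-terminated string are decimal digits.
// CHART can be char, wchar_t or another simple character type.
template <class CHART>
std::size_t CountLeadingDigits(const CHART* text)
{
    std::size_t n = 0;
    // The terminating zero is below '0', so the loop stops there at the latest.
    while (text[n] >= CHART('0') && text[n] <= CHART('9'))
        ++n;
    return n;
}
```

The same source serves both narrow and wide strings: CountLeadingDigits("123abc") and CountLeadingDigits(L"123abc") each instantiate their own version of the function.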
Features
DEELX supports Perl-compatible regular expression syntax. Besides the basic pattern syntax, DEELX implements many syntax extensions:
- Right to left match mode
- Named capture group
- Remark
- Zero-width assertion
- Independent expression
- Conditional expression
- Recursive expression
- Replace operation
Ideas
The most important idea of DEELX is the concept of "Element of Regular Expression". In the source code, I call it "ELX".
I regard every kind of element as an "Abstract Element", represented by ElxInterface. ElxInterface has two methods: Match() and MatchNext(). Match() makes the first attempt to match. If Match() returns true but what it matched is not what you want, calling MatchNext() discards that result and tries to find another successful match. If the result is still not what you want, keep calling MatchNext() until it returns false or you get what you want.
For example, take two elements: (.*)(a)
- Calling the Match() method of the first element (.*) makes it match all of the text. The second element (a) then fails to match, so the result of that Match() call is not what I want.
- The next step is to call the MatchNext() method of the first element (.*). This step is also called "backtracking". The first element (.*) reduces its repeat count, then the second element (a) tries to match again.
- And so on. One possible final result: even when the first element (.*) has been reduced to zero repetitions, the second element still fails to match, so the overall regular expression fails to match.
- The other possible final result: once the first element (.*) has been reduced to a certain number of repetitions, the second element matches, so the overall regular expression succeeds.
The match operations of all kinds of elements can be abstracted into Match() and MatchNext() operations.
That is DEELX's idea.
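The (.*)(a) walk-through above can be sketched in a few lines of C++. This is only an illustration of the idea under simplifying assumptions: the Elx, Context, DotStar, Literal, MatchSequence and MatchDotStarThen names are hypothetical and much simpler than the real ElxInterface in deelx.h, and the match is anchored at the start of the text.

```cpp
#include <cstddef>
#include <string>

// Hypothetical miniature of the idea; not the real classes from deelx.h.
struct Context {
    std::string text;
    std::size_t pos = 0;                       // current position in text
};

struct Elx {
    virtual ~Elx() {}
    virtual bool Match(Context& ctx) = 0;      // first attempt
    virtual bool MatchNext(Context& ctx) = 0;  // discard last result, try another
};

// Greedy ".*": takes everything first, then gives back one character per retry.
struct DotStar : Elx {
    std::size_t start = 0, count = 0;
    bool Match(Context& ctx) override {
        start = ctx.pos;
        count = ctx.text.size() - start;
        ctx.pos = start + count;
        return true;
    }
    bool MatchNext(Context& ctx) override {
        if (count == 0) { ctx.pos = start; return false; }  // nothing left to give back
        --count;
        ctx.pos = start + count;
        return true;
    }
};

// A single literal character: there is only one way to match it.
struct Literal : Elx {
    char ch;
    std::size_t start = 0;
    explicit Literal(char c) : ch(c) {}
    bool Match(Context& ctx) override {
        start = ctx.pos;
        if (ctx.pos < ctx.text.size() && ctx.text[ctx.pos] == ch) { ++ctx.pos; return true; }
        return false;
    }
    bool MatchNext(Context& ctx) override { ctx.pos = start; return false; }
};

// Two elements in sequence: when the second fails, backtrack into the first.
bool MatchSequence(Elx& first, Elx& second, Context& ctx) {
    if (!first.Match(ctx)) return false;
    for (;;) {
        if (second.Match(ctx)) return true;       // overall success
        if (!first.MatchNext(ctx)) return false;  // first element is exhausted
    }
}

// Matches (.*)(c) at the start of text; returns the end position, or -1 on failure.
int MatchDotStarThen(char c, const std::string& text) {
    Context ctx;
    ctx.text = text;
    DotStar first;
    Literal second(c);
    return MatchSequence(first, second, ctx) ? static_cast<int>(ctx.pos) : -1;
}
```

With the text "banana", MatchDotStarThen('a', "banana") backtracks once, giving back the final character so that the literal can match it. With "xyz" the first element is reduced to zero repetitions and the overall match fails.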
Demo in C++
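#include <stdio.h>
#include "deelx.h"

int main(int argc, char * argv[])
{
    // Text to search; const because string literals must not be modified.
    const char * text = "12.5, a1.1, 0.123, 178";

    // Pattern: digits, a dot, then digits, starting at a word boundary.
    static CRegexpT <char> regexp("\\b\\d+\\.\\d+", IGNORECASE | MULTILINE);

    // Find the first match, then continue from the end of the previous match.
    MatchResult result = regexp.Match(text);
    while( result.IsMatched() )
    {
        printf("%.*s\n", result.GetEnd() - result.GetStart(), text + result.GetStart());
        result = regexp.Match(text, result.GetEnd());
    }

    return 0;
}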
Regex flag definition:
enum REGEX_FLAGS
{
    NO_FLAG     = 0,
    SINGLELINE  = 0x01,
    MULTILINE   = 0x02,
    GLOBAL      = 0x04,
    IGNORECASE  = 0x08,
    RIGHTTOLEFT = 0x10,
    EXTENDED    = 0x20,
};
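Each flag occupies its own bit, so flags can be combined with | as in the demo above. The sketch below repeats the enum so that it stands alone. HasFlag is a hypothetical helper added for illustration, not a function from deelx.h.

```cpp
// Flag values copied from the definition above.
enum REGEX_FLAGS
{
    NO_FLAG     = 0,
    SINGLELINE  = 0x01,
    MULTILINE   = 0x02,
    GLOBAL      = 0x04,
    IGNORECASE  = 0x08,
    RIGHTTOLEFT = 0x10,
    EXTENDED    = 0x20,
};

// Hypothetical helper: true when `flag` is set in the combined value `flags`.
inline bool HasFlag(int flags, REGEX_FLAGS flag)
{
    return (flags & flag) != 0;
}
```

For example, IGNORECASE | MULTILINE evaluates to 0x0A, and HasFlag(IGNORECASE | MULTILINE, MULTILINE) is true while HasFlag(IGNORECASE | MULTILINE, GLOBAL) is false.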
Wrap for Delphi (Statically Linked into Delphi Project)
Use Borland C++ Builder to compile DEELX into a .obj file, then link this .obj file into a Delphi Unit: DEELX.dcu.
uses
  DEELX;
var
  result: TMatchResult;
  re: TRegexpA;
begin
  result := TMatchResult.Create();
  re := TRegexpA.Create(Edit1.Text, IGNORECASE + MULTILINE);
  re.Match(Edit2.Text, result);
  if result.IsMatched() then
  begin
    Edit2.SelStart := result.GetStart();
    Edit2.SelLength := result.GetEnd() - result.GetStart();
  end
  else
  begin
    Edit2.SelLength := 0;
  end;
  re.Destroy;
  result.Destroy;
end;
Regex flags definition:
const
  NO_FLAG     = $00;
  SINGLELINE  = $01;
  MULTILINE   = $02;
  GLOBAL      = $04;
  IGNORECASE  = $08;
  RIGHTTOLEFT = $10;
  EXTENDED    = $20;
Wrap to ActiveX for VB
DEELX is wrapped as an ActiveX component, so it can be used in VB or in ASP files.
Private pos As Integer
Private re As New RegExLab.RegExp

Private Sub Command1_Click()
    re.Compile Text1.Text, "igm" ' the 2nd parameter is the flags
    re.Match Text2.Text, pos
    If re.IsMatched Then
        pos = re.End
        Text2.SelStart = re.Begin
        Text2.SelLength = re.End - re.Begin
    Else
        pos = -1
        Text2.SelLength = 0
    End If
End Sub
The flags are the same as JScript.Regexp:
s - SINGLELINE
m - MULTILINE
g - GLOBAL
i - IGNORECASE
r - RIGHTTOLEFT
x - EXTENDED
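The letter-to-flag table above can be sketched as a small conversion function. This is an assumption-laden illustration, not the code used by the ActiveX wrapper: ParseFlagLetters is a hypothetical name, the numeric values are taken from the C++ flag definition earlier on this page, and returning -1 for an unknown letter is a choice made for this sketch.

```cpp
#include <string>

// Flag values as listed in the C++ flag definition.
enum REGEX_FLAGS
{
    NO_FLAG     = 0,
    SINGLELINE  = 0x01,
    MULTILINE   = 0x02,
    GLOBAL      = 0x04,
    IGNORECASE  = 0x08,
    RIGHTTOLEFT = 0x10,
    EXTENDED    = 0x20,
};

// Hypothetical helper: converts a flag string such as "igm" into a bit mask.
// Returns -1 if the string contains a letter that is not in the table.
int ParseFlagLetters(const std::string& letters)
{
    int flags = NO_FLAG;
    for (char c : letters) {
        switch (c) {
        case 's': flags |= SINGLELINE;  break;
        case 'm': flags |= MULTILINE;   break;
        case 'g': flags |= GLOBAL;      break;
        case 'i': flags |= IGNORECASE;  break;
        case 'r': flags |= RIGHTTOLEFT; break;
        case 'x': flags |= EXTENDED;    break;
        default:  return -1;            // unknown flag letter
        }
    }
    return flags;
}
```

Under these assumptions, the "igm" string from the VB example corresponds to IGNORECASE | GLOBAL | MULTILINE, which is 0x0E.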
DLL Version of DEELX
The DLL version of DEELX uses the stdcall calling convention for every function, because Visual Basic can only call stdcall functions.
The demo.zip contains two projects: one is in Visual Basic, the other is in Delphi.
References and Acknowledgements
Homepage - I'm the author, this is the homepage of DEELX.