An entire application framework with XML support for Android, Windows (.NET, MAUI/Razor, or MFC), Linux, iOS, and Unix. ServerCore.cpp contains a ground-up HTTP services framework plus an HTTP file server and proxy server. The server aspects are independent of the XML-to-application-layer functionality, which differs from other XML parsers in its ability to move data directly from the linear XML input memory space to the application layer, without an intermediate step, when the Early Bound OID is used as an attribute.
Welcome to the XMLFoundation August 2024
A Big Finish and a Fresh Start
The XMLFoundation has had a 64-bit build for a long time, but many 32-bit limitations still applied until now, in August 2024. Updates and inserts via XML into StringLists had never been fully supported until now. .NET was largely unsupported until now; supporting it required interface changes anywhere that a null const char * string was allowed to signify "argument not supplied" within XMLFoundation interfaces, because the marshalling of the String type cannot send NULL strings, only empty strings. Winsock2 replaced Winsock in this release. This release made source code changes library-wide, too many to list individually; a file compare utility is the best way to see them all. This marks the beginning of a new era for the XMLFoundation library, which has never been assigned version numbers, only release dates. Welcome to Release Candidate 1 of v1.0. The .NET interfaces are still solidifying.
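The null-versus-empty string issue can be illustrated with a minimal sketch. The helper `argSupplied` and the function `openTag` are hypothetical names invented for this illustration and are not part of the XMLFoundation interfaces; the sketch only assumes that an interface wants to accept both a null pointer (from native C++ callers) and an empty string (from marshalled .NET callers) as "argument not supplied".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an XMLFoundation function): a null pointer from a
// native caller and an empty string from a marshalled .NET caller both mean
// "argument not supplied".
inline bool argSupplied(const char* arg)
{
    return arg != nullptr && arg[0] != '\0';
}

// Hypothetical interface function with one optional argument.
// Builds "<tag>" or "<tag attributes>".
std::string openTag(const char* tag, const char* attributes)
{
    assert(argSupplied(tag));          // the tag name is mandatory
    std::string out = "<";
    out += tag;
    if (argSupplied(attributes)) {     // optional: skipped for null or ""
        out += ' ';
        out += attributes;
    }
    out += '>';
    return out;
}
```

With this convention, `openTag("Order", nullptr)` and `openTag("Order", "")` behave identically, so the same C++ entry point can serve both kinds of caller.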
I hope to enlist feedback regarding the .NET Interop interfaces. In the past, most user-supplied code enhancements originated through the discussion thread following this article on CodeProject.com. This is the first release that is also supported as a GitHub project, which will allow better source code management for user contributions and source code forks. https://github.com/Brian-Aberle/XMLFoundation
OpenSSL has been removed from the XMLFoundation. Cipher algorithms used by the XMLFoundation are included within the XMLFoundation to eliminate dependencies and simplify the build. SHA512 now complements SHA256 (currently there are 2 implementations of SHA256 that will eventually be consolidated), and Rijndael complements TwoFish as symmetric block ciphers built into the class library, so that no additional DLLs or link libraries are necessary. Likewise, HMAC for Proxy Authentication is also one of the algorithms within the XMLFoundation library. Simplification of a complex build is what first inspired putting all of zlib into a single source file [GZip.cpp], which is now updated to the most recent zlib source distro. NOTE that none of the new examples use traditional library linking. Instead, the entire library is #included and compiled inline into the host application or host library by using #include <XMLFoundation.cpp> from within a .cpp file, an unusual practice in C++ development style. This ensures that the library builds with DEBUG symbols for any DEBUG application build and uses the same CRT libraries as the application. The unusual style is not forced upon the application; it's merely an option in addition to the more typical link to a static XMLFoundation.lib or XMLFoundation.a.
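As a rough sketch of the "include the library source" style, a host application file might look like the following. The `__has_include` guard, the `HOSTAPP_XMLF_INLINED` macro, and the helper `xmlFoundationBuildMode` are assumptions added for this illustration only, so that the file still compiles on a machine where XMLFoundation is not on the include path; the examples in the source distro simply use the #include directly.

```cpp
// HostApp.cpp: a sketch of compiling the library inline with the application.
// The guard and macro below are illustrative additions, not XMLFoundation code.
#if defined(__has_include)
#  if __has_include(<XMLFoundation.cpp>)
#    include <XMLFoundation.cpp>      // whole library built into this translation unit
#    define HOSTAPP_XMLF_INLINED 1
#  endif
#endif
#ifndef HOSTAPP_XMLF_INLINED
#  define HOSTAPP_XMLF_INLINED 0      // library source not found on the include path
#endif

// Hypothetical helper that reports which build style this file ended up with.
const char* xmlFoundationBuildMode()
{
    return HOSTAPP_XMLF_INLINED
        ? "compiled inline via #include"
        : "not inlined (link XMLFoundation.lib or XMLFoundation.a instead)";
}
```

Because the library source is compiled with the same compiler options as the rest of the file, a DEBUG build of the application produces a DEBUG build of the library with no separate project settings to keep in sync.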
.NET MAUI-Blazor Integration
The XMLFoundation contains some new example programs in the V1.0 RC1 August 2024 release. The XMLFoundation Library also targets 32- and 64-bit ARM processors (v7 and v8 respectively) in a new MS Dev Studio project titled XMLFoundationLibAndroid, within the project workspace, that uses the Clang compiler. The Microsoft Visual Studio project file and the source tree have been heavily reworked and rearranged. User feedback is welcome. I was never able to properly hook the debugger into the C++ layer DLL, which is loaded by the C# layer Interop DLL in the new MAUI-Blazor example application. From the C++ DLL, I temporarily appended to a log file, hard-coding the output of certain variables and app logic routes using GString::AppendToFile() (as I have done at various other times, in situations and on platforms where the debugger is not available). Additionally, it was easy to develop portions of the code in a minimal .EXE, where the debugger is able to help tremendously, prior to moving the code into the C++ DLL within the .NET MAUI Blazor example. I saved that simple app, which is a console version of the logic found in the .NET MAUI Blazor example. That project is titled "Simple" in the workspace, which currently contains 32 projects. A few projects still remain outside the larger Microsoft Visual Studio workspace, and outdated projects were removed.
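The append-to-a-log-file technique can be sketched with the standard library alone. The function `appendToLog` below is a hypothetical stand-in and does not call GString::AppendToFile(), whose exact signature is not shown in this article; it only demonstrates the same idea of opening a file in append mode, writing one line, and flushing immediately.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical stand-in for debugger-free tracing from inside a DLL.
// Returns true when the line was written and flushed to disk.
bool appendToLog(const std::string& path, const std::string& line)
{
    assert(!path.empty());
    std::ofstream log(path, std::ios::app);   // append mode; creates the file if missing
    if (!log)
        return false;
    log << line << '\n';
    // Flush right away so the line survives a crash later in the caller.
    return static_cast<bool>(log.flush());
}
```

A call such as `appendToLog("trace.log", "route=A value=1")` placed at each branch of interest leaves a readable trail of which logic routes were taken.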
Ubuntu
The XMLFoundation Library and every C++ example program work on Linux and have been tested in this release using the most recent versions of gcc and Ubuntu. Running the example program ExIndexObjects is impressive on Linux. That application allows you to compare execution times for insert, search-find, search-nofind, and full-iterate operations on multiple indexed data structures containing objects created from XML. The execution times in a Linux VM are faster than on the Windows host machine, even though the host had more reserved CPU.
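The kind of comparison described above can be approximated with a small timing harness. This sketch is not taken from ExIndexObjects: it uses std::map and std::unordered_map as stand-ins for the library's indexed containers, and the names `IndexTiming` and `timeIndex` are invented for the illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>

// Timings and sanity counters for one container type (hypothetical names).
struct IndexTiming {
    double insertMs = 0, findMs = 0, noFindMs = 0, iterateMs = 0;
    std::size_t found = 0, notFound = 0, visited = 0;
};

// Times insert, search-find, search-nofind and full-iterate on any map-like Index.
template <typename Index>
IndexTiming timeIndex(std::size_t count)
{
    using Clock = std::chrono::steady_clock;
    const auto elapsedMs = [](Clock::time_point from, Clock::time_point to) {
        return std::chrono::duration<double, std::milli>(to - from).count();
    };
    Index index;
    IndexTiming t;

    const auto t0 = Clock::now();
    for (std::size_t i = 0; i < count; ++i)
        index.emplace("key" + std::to_string(i), i);
    const auto t1 = Clock::now();
    assert(index.size() == count);                 // every key was unique

    const auto t2 = Clock::now();
    for (std::size_t i = 0; i < count; ++i)        // keys that exist
        t.found += index.count("key" + std::to_string(i));
    const auto t3 = Clock::now();
    for (std::size_t i = 0; i < count; ++i)        // keys that do not exist
        t.notFound += (index.count("missing" + std::to_string(i)) == 0) ? 1 : 0;
    const auto t4 = Clock::now();
    for (const auto& entry : index) {              // full iteration
        (void)entry;
        ++t.visited;
    }
    const auto t5 = Clock::now();

    t.insertMs = elapsedMs(t0, t1);
    t.findMs = elapsedMs(t2, t3);
    t.noFindMs = elapsedMs(t3, t4);
    t.iterateMs = elapsedMs(t4, t5);
    return t;
}
```

Calling `timeIndex<std::map<std::string, std::size_t>>(100000)` and the same with `std::unordered_map` gives two sets of timings to compare on a given machine; the actual numbers depend on hardware, compiler, and optimization level.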
Android Studio
"Design Pattern Native"
There is a new, in-progress project for Android Studio using Kotlin in the Gradle build system, under XMLFoundation/Examples/AndroidStudio, titled "MyApp". This project currently builds the entire XMLFoundation C++ library source code, including ServerCore.cpp, into a per-application native library that the Java application layer can use. The "MyApp" project is structured so that the XMLFoundation code base is compiled into an application-specific library, which is built along with the Java user interface code that will use it (see native-lib.cpp in the MyApp example). This is the exact same application design pattern, and the same actual source code, used by the .NET MAUI Blazor example, which implements the same idea for interop between C# and C++ that the Android Studio app does between Java and C++. BOTH the Android Studio example and the .NET MAUI Blazor example share the same C++ example application implementation layer, found in the source code files [XMLFoundationApp.cpp/h] and [ExampleObjectModeling.cpp/h]. The interface from C++ to Java found in [native-lib.cpp] in the Android Studio "MyApp" example is what [DLLExports.cpp/h] is to the .NET MAUI Blazor app. This example app is still a work in progress. I decided to release it prior to completion because the remaining work is mostly basic user interface Java code to be written. This example builds for ARM processors v7 and v8 (32 and 64 bit respectively). The build fails for x86 and x86_64; I would remove those targets until they are working, but I don't know how. If an ARMv7 virtual Android device is running in the AVD Android Emulator, then Android Studio will run the application there.
"Design Pattern Java"
The XMLFoundation source distro has contained an example for Android Studio since 2013. It was developed using Gradle version 2 and supported Eclipse as well as Android Studio. That project will not build using the current version of Gradle. The emphasis of that example was using ServerCore for tunneling and routing of any TCP application, with an example for VNC, which has been removed from the workspace because VNC as the example TCP app no longer worked on modern versions of Android. The original GApp example also showed how to use [JavaXMLFoundation.cpp] with [ObjectModel.java], which is derived from [XMLObject.java]. Both Java source files are located in [XMLFoundation/Examples/Android/XMLFoundationProject/GApp/src/main/java/a777/root/GApp]. This design pattern for handling XML gives the software developer a "Pure Java" development solution based on JavaXMLFoundation.cpp, which requires no C++ code to be written. Although the unsupported Gradle build file prevents the example from building, all of the old source code that is still relevant, and still compiles, remains in this release for reference. A new example project file for the "Design Pattern Pure Java" is on the TODO list.
The New GitHub Project
https://github.com/Brian-Aberle/XMLFoundation
The source distro from GitHub includes this debatably off topic file: http://SyrianRue.org/Soma
Reading it was nearly made mandatory in the license file.
The Index
Important Update to Licensing Agreement
If you incorporate this code into your own, a simple comment anywhere within your own source code giving credit to XMLFoundation fulfills the copyright agreement. You have been advised of the important update to the licensing agreement. This is free and unencumbered software released into the public domain.
Important Stability Fix in September 2023
There was a fix for handling CDATA in the input XML mapped to objects not derived from XMLObject. This unusual, or unexpected, XML can cause application failure; therefore, it is advised to rebuild with the latest source code if the possibility exists that CDATA could appear in your input XML where you do not have it handled. The change is in XMLObjectFactory.cpp. Additionally, empty CDATA is now handled properly with a change to xmlLex.CPP. There is also a small change to SHA256.CPP for g++ -O0 compiler options.
Don't expect more fixes, as this was the first in years, after many people have been using this code base. Support for the latest development tools will continue. Terence McKenna said that there is a link between human technology advancement and consciousness expanding neuromedicine. After decades of research, I have a thesis that details how this can be and how neurodegeneration and cancer proliferation are involved: (PDF) Harmine and the Beta-Carboline nutrient complex: Neurological and Psychological Effects of Peganum Harmala: Neurogenesis, Entheogen, Alzheimer’s, Cancer and Antidepressant (researchgate.net)
This is a 2024 update to this paragraph from 2017.
In 2024, VC6 is finally unsupported, in that the "Unofficial VC6 Service Pack 7" files that had previously been included in the source distro have been removed. There can't be many VC6 application integrations (from 1998) remaining today, so it no longer merits bloating the source download.
This is where we were at in July of 2016: Android Studio 2.2 preview 6 does not allow you to change the NDK directory, previous versions did. In the most recent NDK(r12b), I could not link the most recent stable openssl(1.0.1i). The issue was unresolved at Stackoverflow here. I also found this open issue.
So, since I cannot roll back to an NDK with the required symbols in the platform, I was forced to modify the openssl source code to remove the dependency on the platform support. What I did was the same fix used to cut out shell support on Windows CE builds, by using preprocessor directives in [ui_openssl.c]. This is the beauty of open source, so I built the [modified] most recent stable openssl (1.0.1i) for Android platforms: armeabi, armeabi-v7a, arm64-v8a, x86, and x86_64.
Unbeknownst to many, Android (v6+) no longer uses OpenSSL. https://github.com/android/platform_external_openssl - Since GoogleAPIs are not using OpenSSL, it's likely not an issue that their development teams are facing. The GoogleAPIs are not open source, so if you build your application on them, it is reasonable to expect that your processor, bandwidth, and privacy will be taxed, therefore giving rise to https://microg.org/.
When I last published the XMLFoundation Android examples, I was using Android Studio 1.31 (Build date: Aug 3, 2015). At Google I/O May 2016 Android Studio 2.2 Preview was launched, a large update in many respects. “We are working on a new build system to replace both the build system inside ADT and Ant.” According to http://tools.android.com/tech-docs/new-build-system
XMLFoundation is embracing the new Gradle build system for Android builds. Setting up a proper Gradle build system can be confusing because most Android project build examples use older Gradle plugins. The new Android Studio example (using version 2.2 Preview 6, built on July 18, 2016) uses the new, more organized build structure during a time when a few examples are the best documentation. The NDK examples now include the new examples “MoreTeapots” and “CMake”, which contain helpful direction, just like the new Android example titled “XMLFoundationProject” does, which targets ['armeabi', 'armeabi-v7a', 'arm64-v8a', 'x86', 'x86_64'] and links the openssl libs crypto and ssl accordingly. Inside the “XMLFoundationProject” are several modules: one titled “XMLFLibrary” builds the foundation into a shared dynamic library, and GApp uses all static linking in the "New School" build style.
The example builds in Windows and Linux. See XMLFoundation/Examples/Android/Using Android Studio 2.2.txt.
Each thread has its own heap. When a thread is destroyed, so is its heap. This allows us to disable all memory cleanup in many cases when we plan to destroy a thread and use a new thread for the next transaction. You might not think that such a simple idea would cause the XML parsing performance to double (that is, run twice as fast) or even triple. The memory manager in the operating system must block in the delete() implementation, and blocking is slow. We can't stop calling delete() just because it's slowing things up - or can we? See the new example titled Threading. This design pattern cuts "initialization time" out of "transaction time", as well as cutting out the big time pit that (unnecessary) destruction is. Memory managers optimize new over delete, so we know why the black hole time pit exists. It takes longer to put memory back than to obtain it; the world would really be messed up if it were the other way around. We can take advantage of the situation for a massive performance enhancement. In 2009, I wrote the section Faster than Fast. In 2011, I expanded on the concept with Custom Memory Management. Using heaps efficiently expands on the same concept once again, so this is faster than the old 2015 version.
Here is a comment from the new "Threading" example - expect more worthy commentary in the source file:
int g_TrustThreadHeaps = 1;
Three times faster? That's absurd. The proof is in the numbers, not in the article word count.
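The idea of skipping per-object cleanup because a whole heap is about to be discarded can be sketched with the C++17 standard library. This is an analogous technique, not the implementation in the Threading example: it uses std::pmr::monotonic_buffer_resource as a stand-in for a per-thread heap, and the function name `transactionBody` is invented here. The function is written as the body a worker thread would run for one transaction.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

// Hypothetical body of one "transaction", intended to run on its own thread.
// Returns the total number of bytes stored, so the work is observable.
std::size_t transactionBody(std::size_t itemCount)
{
    // Arena owned by this call: allocation bumps a pointer, and
    // deallocate() on a monotonic_buffer_resource does nothing.
    std::pmr::monotonic_buffer_resource arena(64 * 1024);
    std::size_t totalBytes = 0;
    {
        std::pmr::vector<std::pmr::string> items(&arena);
        for (std::size_t i = 0; i < itemCount; ++i)
            items.emplace_back(i + 1, 'x');        // string of i+1 'x' characters
        assert(items.size() == itemCount);
        for (const auto& item : items)
            totalBytes += item.size();
        // Leaving this scope runs the destructors, but they return no memory
        // to the global heap one block at a time.
    }
    return totalBytes;  // all arena memory is released in one step when `arena` is destroyed
}
```

Running each transaction through `std::thread worker(transactionBody, n)` style code keeps every allocation private to that thread, which is the property the paragraph above relies on; how much time this saves depends on the allocator and workload.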
Beyond the major performance improvement mentioned above, there are smaller performance improvements that count too, and they apply to every object derived from XMLObject regardless of their threading use. There were a few GString * members internal to XMLObject that were being allocated on demand and destroyed in ~XMLObject(). They were replaced with a char[32] array, and their allocation/destruction is now avoided when the contained values are less than 32 bytes, which is generally the case. Any values longer than 32 bytes will parse as slowly as they did in 2015. These little bits add up to more speed in 2016, and keep you out of the black hole time pit.
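The char[32] replacement is a form of small-buffer optimization, which can be sketched as follows. The class `SmallValue` is hypothetical and is not the code inside XMLObject; it only shows the general shape: short values live in an inline array, and the heap is touched only for values that do not fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical small-buffer holder (not the actual XMLObject members).
class SmallValue {
public:
    void assign(const char* text)
    {
        assert(text != nullptr);
        const std::size_t len = std::strlen(text);
        if (len < sizeof(inline_)) {               // fits with its terminating '\0'
            heap_.reset();                         // drop any earlier heap copy
            std::memcpy(inline_, text, len + 1);
        } else {                                   // too long: fall back to the heap
            heap_.reset(new char[len + 1]);
            std::memcpy(heap_.get(), text, len + 1);
        }
    }
    const char* c_str() const { return heap_ ? heap_.get() : inline_; }
    bool usesHeap() const { return static_cast<bool>(heap_); }

private:
    char inline_[32] = {0};                        // no allocation for short values
    std::unique_ptr<char[]> heap_;                 // only set for long values
};
```

For typical element values such as "Gnosis" or "777", `assign` never allocates, so construction and destruction of the holder cost nothing beyond the object itself.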
It always seems like there are NO bugs in the rather massive collection of source code that comprises the XMLFoundation. How could millions of lines of code be bug free? It might be in 2016. It was very close in 2015. A missing check for null that could cause an application crash was fixed in 2023, along with a problem handling an empty CDATA. Long ago, I wrote the section Exceptional Exception Handling and asserted that the implementation of 32 bit stack capturing in GException is the best reference implementation on the Internet.
The perfection of this library is a collaborative effort, and as you see from the 400+ comments following this article on codeproject.com, no issue remains unaddressed or outstanding.
This old spit was new in September 2015.....
This is the glue that holds all software together. The XMLFoundation build dependencies were recently reorganized to build BEFORE Windows. For example, if <Windows.h>, or a kernel file like <winnt.h>, is included anywhere in your application prior to your source code location in the compilation of your application, then untold vast numbers of structures and defines already exist. If they do not exist yet, your source can select structure definition implementations when existing implementations are wrapped in #ifndef _STRUCT_DEFINED. This is accomplished by #define WIN32_LEAN_AND_MEAN, and that alone is not enough: when building BEFORE RPC and TCP definitions, you gain build order precedence by defining RPC_NO_WINDOWS_H, and COM_NO_WINDOWS_H if your app includes <Ole2.h> anywhere. I spit a trick on Android that will become a foundational design option on all platforms and applications that build on XMLFoundation. The new spit is the concept of “The single Source file” build simplification, a design principle re-re-re-implemented throughout this library: including all the includes into a single source file. For example, the entire PThreads implementation is in Pthreads.cpp, all the .c and .h files of zlib are put into the single GZip.cpp, and likewise in TwoFish.cpp. This build organization has many benefits. The concept can be applied to the entire XMLFoundation, being put into a single .cpp file, and it was.
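The build-order precedence described above rests on a plain preprocessor pattern: whichever guarded definition is seen first wins. The sketch below shows that pattern in isolation. The macros WIN32_LEAN_AND_MEAN, RPC_NO_WINDOWS_H, and COM_NO_WINDOWS_H are the ones named in the text; the guard `EXAMPLE_ADDR_DEFINED` and the struct `ExampleAddr` are invented for the illustration, and the second guarded block merely stands in for a platform header that would be included later.

```cpp
// These must be defined before any Windows header is pulled in.
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#ifndef RPC_NO_WINDOWS_H
#define RPC_NO_WINDOWS_H
#endif
#ifndef COM_NO_WINDOWS_H
#define COM_NO_WINDOWS_H
#endif

// Our definition comes first, so it is the one the compiler keeps.
#ifndef EXAMPLE_ADDR_DEFINED            // hypothetical guard name
#define EXAMPLE_ADDR_DEFINED
struct ExampleAddr { unsigned char bytes[16]; };   // the IPv6-sized choice
#endif

// Stand-in for a header included later that carries the same guard.
// Its body is skipped because the guard is already defined.
#ifndef EXAMPLE_ADDR_DEFINED
#define EXAMPLE_ADDR_DEFINED
struct ExampleAddr { unsigned char bytes[4]; };
#endif

static_assert(sizeof(ExampleAddr) == 16, "the earlier definition took precedence");
```

Reversing the order of the two guarded blocks would silently select the other definition, which is why the position of the library source relative to the platform headers matters.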
This spit is especially handy to avoid linking problems, because THERE IS NO LIBRARY TO LINK. Sorry about YELLING IN ALL CAPS, sometimes I spit when I yell. Linking and Libraries go together like smoke and fire. The pearly buildmaster spit that holds the openssl library together bonds at the molecular level. I cannot put openssl into a single source file. I actually tried; all I did was end up learning the deep innards of the library. Eric Andrew Young carved his initials into your application build and I am not taking them out, so I expect them to stand until the end of time. I expect even more people to be linking with openssl in the future. They should all know that the order in which those libraries appear, especially openssl, makes all the difference in the success of the build. XMLFoundation is eternally free from such linking order issues when #included into your application; that feature improves the portability of XMLFoundation as well as simplifying its use in any app. The object code is still linked into the .exe, obviously: when you see the .OBJ or .o intermediary files created in your application directory, one for each .cpp file, one will be very large. That one included the XMLFoundation.
All XMLFoundation applications may now easily replace the link with a #include (see GAppServer.cpp in the GApp example). On Android, “Android Studio” is now the “Official” build tool, based on the new Gradle build process, a nice step forward in the build organization of Android Apps. That’s old news from June 2015.
Android holds 80% of the worldwide smartphone market in 2015 according to IDC.com. This is the future of Android software development according to the droid folk, and the NDK can now be obtained only in the Android Studio download; that's new. Android Studio has some new spit that integrates Java code and C++ code into one build process. Additionally, as the Android Studio IDE indexes the C++ source and the Java source code, it builds the same support for both languages, including automatic dependency knowledge, object method indexing, and the ability to jump to structure definitions using the editor in the IDE, shown here indexing GString methods in the IDE.
I don’t want to sound like a commercial for Android Studio, but the editor is excellent. I am fussy about my editors. The text editor spell checks your comments unobtrusively, has block copy (hold down Alt, click and drag to select block text), has real time compilation to help you see those petty errors like when you forget a semicolon at the end of a statement underlined red in the editor – before you compile. It also has the structure definitions and methods in your source files indexed to take you there with a few clicks in the editor. Also all the hotkeys are what I would call “correct” as they were all natural to me even though I was new to Android Studio. Here, you can see the declarations are indexed by the IDE:
XMLFoundation for Java is an excellent solution on Android where the JVM(Java Virtual Machine) is used to call the entry point of every App. This makes Java THE language on Android, C++ step aside. Well - let it be known that the JVM was written in C and C++, therefore any enhancement to the JVM that makes Java more powerfully employed to be an Object Oriented and functional language would be implemented in C++ - and it was.
XML data is processed by xmlLex.cpp like all XML in the XMLFoundation, it then uses JNI to make instances of Java Objects that come instantiated with all the member variables already assigned from the XML - No code needs to be written to accomplish this - just a little table of information that allows the algorithm to correlate XML Elements to member variables. This problem is solved every day but it’s generally done with many lines of code that parse the XML and get the bits of data into the object manually. It's like magic from the Java side, the objects just appear in their containers. I make some outrageous claims in this article, and so that none of them be proved false – it needs to be known that JavaXMLFoundation.cpp uses a DOMish approach underneath Java, and therefore although it’s still fast, it's not going to have the big speed gain of the pure C++ implementation. No performance data has been collected at this time, and in the future, some of the JNI handles can be cached to be faster according to IBM, and that is logical. The Java implementation delivers primarily a simple maintainable design pattern to Android that replaces volumes of logic to transfer values from the XML into Objects - AND inversely every object can convert itself back into XML with no code as the Java objects inherently know how to do that.
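The "little table of information" idea can be sketched in C++ with pointers to members. Nothing here is XMLFoundation code: the `Customer` struct, the `MemberMap` table, and the `ParsedElements` type (a stand-in for what a parser would emit) are all assumptions made for the illustration, and the sketch skips nesting, attributes, and XML escaping.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Customer {
    std::string firstName;
    std::string customerId;
};

// One table row: XML element name -> member variable (hypothetical layout).
struct MemberMap {
    const char* tag;
    std::string Customer::* member;
};

static const MemberMap kCustomerMap[] = {
    {"FirstName",  &Customer::firstName},
    {"CustomerID", &Customer::customerId},
};

// Stand-in for parser output: (element name, text) pairs.
using ParsedElements = std::vector<std::pair<std::string, std::string>>;

// Assigns every element that has a table row; unknown elements are ignored.
Customer buildCustomer(const ParsedElements& elements)
{
    Customer c;
    for (const auto& element : elements)
        for (const MemberMap& row : kCustomerMap)
            if (element.first == row.tag)
                c.*(row.member) = element.second;
    return c;
}

// The same table drives the inverse direction, object back to XML.
std::string toXML(const Customer& c)
{
    std::string xml = "<Customer>";
    for (const MemberMap& row : kCustomerMap) {
        xml += "<";
        xml += row.tag;
        xml += ">";
        xml += c.*(row.member);     // note: no escaping in this sketch
        xml += "</";
        xml += row.tag;
        xml += ">";
    }
    xml += "</Customer>";
    return xml;
}
```

Adding a member to the object then means adding one row to the table, rather than writing new parsing and serialization code for it.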
JavaXMLFoundation.cpp is not new, although it is new to appear in XMLFoundation/src/Utils; it was in the /XMLFoundation/Tools/JavaFoundation folder along with makefiles for Solaris, NT, and Linux. It was tested on AIX Unix too. The old build process that uses those makefiles is more complex, and therefore the XMLFoundation never had a usage example until now. With the recent advancements in Android Studio, that old file became extremely relevant. JavaXMLFoundation is based on an abstract, language-independent “native object factory” called DynamicXMLObject, found entirely in DynamicXMLObject.h. .NET-Java does not allow you to get underneath WinRT to support C# or .NET Java, so this phenomenal code was rendered next to useless on Windows, even though it had been tested with Oracle's JVM on Windows before .NET. Today is a new day, and JavaXMLFoundation is now relevant not only on Android but also on Embedded Java devices, as well as in any Java application that uses the Oracle JVM.
DynamicXMLObject is an object-oriented data structure that makes a generalized intermediate object structure from the XML data, the same concept as DOM. DOM uses an n-ary tree of structures that contain element or attribute data; DynamicXMLObject indexes the same information in XMLObject-derived objects. There is an ancient example in the XMLFoundation called LexTest that shows how to build a DOM tree using XMLFoundation, so that has always been an option. In that example, there is a clear layer between the lexical analyzer and the DOM tree. DynamicXMLObject is more tightly coupled with xmlLex, as it uses the XMLObjectFactory to build the structure. A “Normal” XMLObject derivative calls MapMember() to build a table of member information, whereas a DynamicXMLObject has no table pre-defined but generates the “member” on the fly, right when the ObjectFactory process needs a memory location to store element and attribute data. That’s why it’s a DOMish design: the data from the linear XML does not map directly to a Java object; it binds to an intermediate structure that is more “Object Oriented” than DOM structures and is far more robust than DOM, as the nodes derive from XMLObject and inherit certain abilities.
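The contrast between a pre-defined member table and members generated on the fly can be sketched with a small node type. `DynamicNode` is a hypothetical class written for this illustration and is not DynamicXMLObject; it only shows storage being created at the moment a factory asks for a place to put a value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical node with no pre-defined member table.
class DynamicNode {
public:
    // Returns storage for a member, creating it the first time it is requested.
    std::string& member(const std::string& tag)
    {
        assert(!tag.empty());
        return members_[tag];
    }

    // Returns a nested node, creating it on first use.
    DynamicNode& child(const std::string& tag)
    {
        assert(!tag.empty());
        std::unique_ptr<DynamicNode>& slot = children_[tag];
        if (!slot)
            slot = std::make_unique<DynamicNode>();
        return *slot;
    }

    bool hasMember(const std::string& tag) const { return members_.count(tag) != 0; }
    std::size_t memberCount() const { return members_.size(); }

private:
    std::map<std::string, std::string> members_;
    std::map<std::string, std::unique_ptr<DynamicNode>> children_;
};
```

A factory walking the XML would call `node.member("FirstName") = "Gnosis"` or `node.child("Order")` as each element arrives, so the structure takes whatever shape the input has.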
XMLFoundation contains the example for Android called GApp. All in one app, it demonstrates TLS-SMTP, the HTTP Proxy (with NTLM Authentication), the HTTP Server, and the Object Oriented XML development for Java on Android. GApp is the App of Apps all over the map and that’s that. It shows how to use XMLFoundation tools like GProfile to build in powerful (registry-like) application configuration with very little code. GApp implements a shellish utility to invoke commands like ipconfig, ls, or netstat. GApp uses ServerCore, so it supports the whole 5Loaves extendable SOA App framework, which is what 5Loaves is. The GApp makefile IS the Gradle C++ integration documentation (and see the Teapot example in the NDK example pack). The new Gradle C++/JNI/Java build meshing functionality is officially still experimental, but I experimented with it and it works. If Android Studio is the future, the Gradle C++ support is the bloody cutting edge. In previous versions of XMLFoundation, like a billion other Android Native Libs, the tools and features of the XMLFoundation were compiled with ndk-build. To make them fully integrate into Android Studio, it took more than buildmaster spit. It would have been impossible to make the changes in only build configuration files. Various small changes in the source code make XMLFoundation beyond compatible: it's designed for Android Studio integration. GApp makes the most out of an unrooted Android device, and makes even more out of rooted devices. Here are a few screenshots from the GApp example; the full source is included in the XMLFoundation source code download.
The bloody cutting edge is changing all the time, so whatever was written about version n.n.n.1 does not apply to n.n.n.2. That's just how it is: the source code has the ultimate say-so in any debate with documentation. Getting the build environment to work is one thing today and another tomorrow, but the general outline for an integration into Android Studio is solid. I also attempted to have a development environment with Android Studio on Mac. The specifics of this chapter are still forming, but here are some trail blazer notes that might help you get your build environment working. The same document is included in the source download in the examples/Android folder. The .APK file created from a working build is here.
This is the Android Studio Java Example on the XML page of the new GApp Example, the big concept to see cannot be conveyed in a screenshot, and therefore the XML tab page was not even included above. Here is the source code that communicates more than seeing the result of the source code. It simplifies development by an order of magnitude. That's what it was designed for.
rootView.findViewById(gapp.R.id.one).setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        TextView tv = (TextView) rootView.findViewById(gapp.R.id.editCommand);
        String strFile = tv.getText().toString();

        Customer1 c1 = new Customer1();
        c1.fromXML( getXML1() );
        c1.ObjDump();
        showResults( c1.toXML() );
        showResults("************************\n");
        c1.fromXML( getXML2() );
        showResults(c1.toXML());
        c1.getOrder().addLineItem(777, "The Spirit", "Always");
        showResults("************************\n");
        GAppGlobal.emailSend(GAppGlobal.configGet("Email", "SMTPServer"),
                             GAppGlobal.configGet("Email", "SMTPPort"),
                             GAppGlobal.configGet("Email", "Protocol"),
                             GAppGlobal.configGet("Email", "LoginUser"),
                             GAppGlobal.configGet("Email", "LoginPass"),
                             GAppGlobal.configGet("Email", "SenderAlias"),
                             GAppGlobal.configGet("Email", "SenderEmail"),
                             GAppGlobal.configGet("Email", "EmailSubject"),
                             GAppGlobal.configGet("Email", "EmailRecipient"),
                             c1.toXML() );

        Customer2 c2 = new Customer2( getXML1() );
        c2.ObjDump();
        c2.ApplyXML(getXML2());
        c2.getOrder().addLineItem(777, "The Spirit", "Indirectly Inherited... Always");
        GAppGlobal.emailSend(GAppGlobal.configGet("Email", "SMTPServer"),
                             GAppGlobal.configGet("Email", "SMTPPort"),
                             GAppGlobal.configGet("Email", "Protocol"),
                             GAppGlobal.configGet("Email", "LoginUser"),
                             GAppGlobal.configGet("Email", "LoginPass"),
                             GAppGlobal.configGet("Email", "SenderAlias"),
                             GAppGlobal.configGet("Email", "SenderEmail"),
                             "Example 2 Indirectly Inherited",
                             GAppGlobal.configGet("Email", "EmailRecipient"),
                             c2.XMLDump() );
    }
});
This is the XML that I get in my inbox:
<Customer>
  <FirstName>Gnosis</FirstName>
  <CustomerID>777</CustomerID>
  <long>4555444333</long>
  <short>2</short>
  <double>3.333</double>
  <byte>54</byte>
  <bool>False</bool>
  <Order TransactionTime="today">
    <SalesPerson>Omega</SalesPerson>
    <OrderNumber>911</OrderNumber>
    <LineItemContainer>
      <LineItem>
        <Quantity>OceansWavesCount</Quantity>
        <Description>Peace beyond understanding</Description>
        <SKU>123</SKU>
      </LineItem>
      <LineItem>
        <Quantity>Infinate</Quantity>
        <Description>Freedom</Description>
        <SKU>456</SKU>
      </LineItem>
      <LineItem>
        <Quantity>Eternal</Quantity>
        <Description>Life</Description>
        <SKU>789</SKU>
      </LineItem>
      <LineItem>
        <Quantity>Always</Quantity>
        <Description>The Spirit</Description>
        <SKU>777</SKU>
      </LineItem>
    </LineItemContainer>
  </Order>
  <StringList>
    <Level2Wrapper>
      <StringItem>Hello</StringItem>
      <StringItem>There</StringItem>
      <StringItem>World</StringItem>
    </Level2Wrapper>
  </StringList>
</Customer>
This is what the Order object looks like in the Object model found in ObjectModel.java:
class MyOrder extends XMLObject
{
    public String salesPerson;
    public String orderDate;
    public int OrderID;
    public Vector vecLI;

    MyOrder()
    {
        super("Order");
    }

    void MapXMLTagsToMembers() {
        MapAttrib(orderDate, "orderDate", "TransactionTime");
        MapMember(salesPerson, "salesPerson", "SalesPerson");
        MapMember(OrderID, "OrderID", "OrderNumber");
        MapMember(vecLI, "vecLI", "LineItem",
                  "gapp/MyLineItem", "LineItemContainer");
    }

    void addLineItem(int nID, String item, String quantity)
    {
        if (vecLI == null)
            vecLI = new Vector();
        vecLI.add(new MyLineItem(nID, item, quantity));
    }
}
This paragraph was in the September 2015 update. Moore's Law, which has held mostly true for a few decades, says that the number of transistors on a chip, and with it processing power, doubles roughly every two years. Times are changing; in fact, Gallium Nitride will replace Silicon. Saying Silicon Valley will be like saying 8 Track Valley. You’ve gotta move with the times. I bought a development machine with a higher end processor about a year ago; recently I swapped the main disk with an SSD to see it boot 3 times faster and to see my VMs start and stop even faster than that. Three times is a lot. It’s like being able to use your brain after opening up the bottleneck. When using the OID object IDs, XMLFoundation does parse XML 3 times faster than other XML parsers, due to its unusual design that never copies memory to an intermediate tree between the linear input buffer and the application layer, which also effectively uses half as much memory while producing the 3X speed difference. It’s no more revolutionary than an SSD.
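The benefit of not copying into an intermediate tree can be illustrated with views into the input buffer. This sketch is not how xmlLex or the OID mechanism works internally; `elementText` is a hypothetical function that handles only a flat `<tag>text</tag>` form without attributes, and it is here just to show a value being handed to the caller while still residing in the linear input memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns a view of the text of the first <tag>...</tag> element in `xml`,
// or an empty view if the tag is not found. No bytes are copied: the
// result points into the caller's input buffer.
std::string_view elementText(std::string_view xml, std::string_view tag)
{
    assert(!tag.empty());
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string_view::npos) {
        const std::size_t nameStart = pos + 1;
        const std::size_t nameEnd = nameStart + tag.size();
        // Match "<tag>" exactly: same name, immediately followed by '>'.
        if (xml.compare(nameStart, tag.size(), tag) == 0 &&
            nameEnd < xml.size() && xml[nameEnd] == '>') {
            const std::size_t textStart = nameEnd + 1;
            const std::size_t textEnd = xml.find('<', textStart);
            if (textEnd == std::string_view::npos)
                return {};
            return xml.substr(textStart, textEnd - textStart);
        }
        ++pos;
    }
    return {};
}
```

The returned view stays valid only as long as the input buffer does, which is the usual trade-off of this approach: the application either consumes the value immediately or copies it exactly once, into its final destination.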
This is what was new in August 2015: A while back, I made a future-reaching decision to include the big fat openssl libraries in the XMLFoundation. They make the download huge, most of the examples do not use the openssl libraries, and most people who use the XMLFoundation do not want openssl, especially a binary for Windows Phone. Despite all that, those libraries need to be there to see the NTLM authentication work on all platforms that XMLFoundation supports. Now the XMLFoundation is reaping the rewards from integrating openssl. Consider the class CSmtp, added into the XMLFoundation during August 2015. That project grew up and matured here on CodeProject. The source code download in that project only contains one platform build of openssl binaries. I integrated CSmtp into the XMLFoundation and added support for all Microsoft compilers, 32 and 64 bit targets, with the correct openssl libs ready to link, as well as a Linux makefile. CSmtp is portable and it was well done; kudos to the builders. The CSmtp source download comes with openssl-0.9.8l. In the XMLFoundation, CSmtp uses newer versions of openssl. The big fat binaries are being shared by the NTLM code and the TLS code now, on many platforms.
Some legacy systems are still maintained with VC6 in 2017. I have a custom VC6 build that creates my 32 bit application that installs 64 bit binaries which exist inside the 32 bit binaries as bound resources. The new TCP connection information provided by GProcess in this release of the XMLFoundation uses newer interfaces and has IPV6 support, so VC6 apps must use the updated Winsock2 libraries and may use the updated RPC libraries (my apps use them). This essentially updates core libraries and header files normally addressed by a "Service pack", however my compiler is unsupported so I had to do it myself. This core library upgrade might be helpful to someone else, so it is included in the XMLFoundation where we support the unsupported and do the un-doable. It's simple enough to use the VC6 update by selecting "Tools" then "Options" then adding the include and library paths so that the files are found in the XMLFoundation\Libraries\VC6 folders and not in the default VC98 folders. By using the black up and down arrows, you can place the VC6 folders above the VC98 folders so that the compiler searches those folders first. The new example, SMTPandTLS, will not compile in VC6 unless you replace the Winsock libraries.
Just because some new class or functionality shows up in the XMLFoundation in August 2015 does not mean it is something new or experimental. I included a portable routine that returns a list of processes. It works for Windows CE, Windows 98, Windows Phone, Win32 and Win64, Android, iOS, and Linux. It obtains all the process information available if running as Administrator, or it obtains a lesser set of process information available when running as non-administrator in Windows. It works everywhere, with all compilers, it is well debugged and it is a worthy contribution to the XMLFoundation. The new code is in GProcess.cpp and the new example program SMTPandTLS shows how to use it.
I updated the VS2012 and VS2013 workspaces. I recently verified every XMLFoundation example program in 64 bit, 32 bit, Debug, and Release under VS2013. A while back, I decided to stop "migrating" the workspace into new Microsoft compilers. Instead, I create a new workspace for each compiler, which works in more environments. In the case of VS2015, keeping the other workspaces reveals major differences in the C-Runtime that the compilers use. When I was a young cocky C++ programmer, several times I thought I found bugs in Microsoft code only to later discover that I was somehow misusing some underdocumented API call. Now I am well "seasoned" (even smoked and marinated - been through fire and flood). I am slower to hurl bug accusations at Microsoft, however I did find a few bugs in the C-Runtime libraries that ship with VS2015. Hopefully, this problem will be corrected in the next Service Pack since that compiler is still supported. The bug that I found causes the link to fail on all example programs that require openssl in VS2015. Look at the first two pages of source code in CSmtp.cpp to see the details on this bug. The VS2012 and VS2013 compilers produce working binaries of the same example. This is why I still use VC6.
The new sample application titled SMTPandTLS gathers up this information and emails it to you like this:
----------------------------Wan IP----------------------------------------
73.3.213.175
----------------------Network Interfaces----------------------------------
Host:MY_COMPUTER_NAME
Type[5]:10.0.0.6
Type[1]:127.0.0.1
Type[1]:192.168.3.1
Type[1]:192.168.121.1
----------------------Network Connections----------------------------------
TO=23.195.144.35 port:47873 FROM=10.0.0.6 port: 39872 C:\VMware Workstation\vmware.exe
TO=23.99.205.208 port:47873 FROM=10.0.0.6 port: 59601 C:\Internet Explorer\IEXPLORE.EXE
TO=65.55.246.20 port:47873 FROM=10.0.0.6 port: 57298 C:\Internet Explorer\IEXPLORE.EXE
TO=204.79.197.210 port:47873 FROM=10.0.0.6 port: 57810 C:\Internet Explorer\IEXPLORE.EXE
------------------------Processes-------------------------------------------
pid:3288 iusb3mon [Intel(R) USB 3.0 Monitor] C:\Program Files (x86)\Intel\Intel USB3
pid:460 IAStorIcon [C:\Program Files\Intel Rapid Storage Technology\IAStorIcon.exe]
pid:980 iexplore#1 [C:\Program Files (x86)\Internet Explorer\IEXPLORE.EXE]
pid:2596 firefox [[XMLFoundation - CodeProject]
pid:32 http://www.codeproject.com/Article] C:\Program Files (x86)\Mozilla Firefox\firefox.exe
pid:6716 SMTPandTLS [C:\XMLFoundation\Examples\C++\SMTPandTLS\Debug\SMTPandTLS.exe]
That's what's new for August 2015. Let me hear from you in the comments below. Gimme a high 5 or something. I work hard to put this code together clean and clearly labeled, and again you will find the new code strategically commented with helpful insights.
I have written multiple kinds of TCP servers and a few kinds of TCP proxies (conversational like VNC and Telnet, and transactional like HTTP), so it helps that I have done this kind of thing before. ServerCore, at the very core of it, is not a server at all. It's a "TCP Socket to Thread of Execution manager", but that might confuse people and I work hard to keep this stuff simple. ServerCore.cpp has served on Solaris and AIX and on EVERY mobile platform.
I was once working next door to a company that had 50 million image files on a disk array. Their own custom software had corrupted the NTFS headers, so the folder of files appeared empty both in the Windows graphical shell explorer and at the command prompt. The file names were completely predictable, so the files could be copied by a simple loop that knows the naming scheme and issues a copy like (c:>copy c:\lost data\File00000000000000000000a.jpg d:\found data\File00000000000000000000a.jpg), but the next crisis was that it was going to take 10 days to run the file copy. ServerCore.cpp was used to execute the file copy on clientThread(). I quickly made a GUI for this (a one time use, throw-away GUI) that allowed me to adjust the number of concurrent copy threads and see the change in throughput. I found that with their hardware configuration the best performance was around 7 concurrent threads to maximize the I/O throughput. I could easily measure the difference between 5 and 50,000 threads on their high end hardware, which allowed me to measure thread context switching overhead and to stress test and optimize the ServerCore threading model in my spare time. The copy was finished in about 50 hours. Is it right to call the copy program a server? I would say no, but I avoid terminology conversations. The threaded copy program was based on ServerCore's threading model, which was first designed and used in Unix and NT.
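To make the "adjust the number of concurrent copy threads" idea concrete, here is a minimal sketch using standard C++ threads. It is not the ServerCore threading model. The name runWorkers and the callback shape are assumptions made up for this illustration, and the real copy program is not shown in this article.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not part of ServerCore.cpp): run work(i) for every
// i in [0, itemCount) using threadCount worker threads.
// Returns the number of items for which work() returned true.
std::size_t runWorkers(std::size_t threadCount, std::size_t itemCount,
                       const std::function<bool(std::size_t)>& work)
{
    std::atomic<std::size_t> next{0};      // next unclaimed item index
    std::atomic<std::size_t> succeeded{0}; // items reported as successful

    auto worker = [&]() {
        for (;;) {
            // Claim one item. fetch_add hands every index to exactly one thread.
            std::size_t i = next.fetch_add(1);
            if (i >= itemCount) break;
            if (work(i)) succeeded.fetch_add(1);
        }
    };

    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < threadCount; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return succeeded.load();
}
```

In a copy job like the one described, work(i) would build the predictable file name for index i and copy that one file. Timing the same call with different threadCount values is one way to look for the throughput sweet spot on a given machine.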
Maximizing the I/O throughput of TCP connections is quite a bit different because generally TCP is an order of magnitude slower than moving data over a disk controller. This means that most of the threads are sleeping most of the time on a TCP server. I applied an experienced approach toward implementing an HTTP Proxy in ServerCore.cpp. It's a complete implementation that runs on all mobile platforms(Android, iOS, Win Phone) and many server platforms. In the past, an HTTP Proxy was generally only the function of a high end server. Nowadays, every phone and every tablet can act as an HTTP Proxy to extend Internet connectivity to devices near them.
We all know that a picture is worth a thousand words, and I will refrain from many more words on this topic because the source code is very well commented, so there are more words down in the code for folks who want more words. This is the basic logic flow. It takes two threads for an active proxy: listenThread() is in a tight loop that is either sleeping or accept()ing a socket that will be handled by a clientThread() in the pool, and clientThread() gets a ProxyHelperThread() from another pool. Inside ServerCore.cpp is also an HTTP server (which has its own instance of a listenThread() on another port). The HTTP Proxy and the HTTP Server are both implemented in clientThread(), but only the HTTP Proxy uses a ProxyHelperThread(). clientThread handles A and B, and the ProxyHelperThread handles C and D in a loop that is mostly sleeping until data arrives.
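The hand-off from a listening thread to a pool of mostly sleeping client threads can be sketched without any sockets. The sketch below is only an illustration of that pattern and assumes nothing about how ServerCore.cpp does it. HandOffQueue and serveFakeConnections are hypothetical names, and plain integers stand in for accepted sockets.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the hand-off between a listen thread and a pool
// of client threads. Plain ints play the role of accepted sockets.
class HandOffQueue {
public:
    void push(int connection) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(connection); }
        cv_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Sleeps until a connection is available.
    // Returns false once the queue is closed and drained.
    bool pop(int& connection) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        connection = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool closed_ = false;
};

// The calling thread plays the "listen" role and produces connectionCount
// fake connections, which poolSize client threads pick up.
// Returns how many connections were handled.
std::size_t serveFakeConnections(std::size_t poolSize, int connectionCount)
{
    HandOffQueue queue;
    std::mutex countLock;
    std::size_t handled = 0;

    std::vector<std::thread> clients;
    for (std::size_t i = 0; i < poolSize; ++i) {
        clients.emplace_back([&] {
            int conn = 0;
            while (queue.pop(conn)) {   // sleeping until work arrives
                std::lock_guard<std::mutex> lock(countLock);
                ++handled;              // a real client thread would service conn here
            }
        });
    }
    for (int c = 0; c < connectionCount; ++c) queue.push(c);
    queue.close();
    for (auto& t : clients) t.join();
    return handled;
}
```

A proxy adds a second hand-off of the same kind, from the client thread to a helper thread taken from another pool.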
The source contains a new project titled HTTP Proxy Server that compiles into a Windows Service. The Windows service itself is a reference implementation, complete with command line arguments to install, uninstall, start, and stop the service and set it to automatic or manual start. The full set of options is documented in the source code. The "HTTP Transactional" logging could be helpful to web developers. When it's turned on, it creates files with names based on the HTTP Transaction. They look like this, when I view this web page:
Inside the .OUT file, we see the data sent out for that HTTP transaction. Being able to sort the individual transactions by name, server, and size is handier than reading a Wireshark log. Other than for reverse engineering someone else's website (or debugging your own), this wouldn't be handy for much.
GET /Articles/37850/XMLFoundation HTTP/1.1
Accept: text/html, application/xhtml+xml, * / *
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Proxy-Connection: Keep-Alive
DNT: 1
Host: www.codeproject.com
Cookie: __utma=40492976.65300.139906.1416767222.1416957829.696
and in the .IN file, you would see a response that starts like this:
HTTP/1.1 200 OK
Cache-Control: private
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Set-Cookie: SessionGUID=7ed7a38c-d059-4f14-86ea-0fef4c82d576; path=/
Set-Cookie: mguid=1fc5162-e564-458c-8180-b42cec55ce5;
domain=.codeproject.com; Date: Tue, 25 Nov 2014 23:24:47 GMT
1072e (followed by the data)
As I said above, ServerCore.cpp also includes an HTTP server that I have running on port 80. I also have my own protocol implemented on port 10777. I have the HTTP Proxy configured on port 8080. Down in the gut of ServerCore is a little function, showActiveThreads(), that returns real-time thread state for every clientThread(). It returns the data in a comma separated list that you could view many ways in your own application. This is a screenshot of the information displayed in a GUI that allows you to sort individual connections by port, IP, connection time, or action. When looking at many connections, the sort on Action helps to group and identify connections.
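As a rough sketch of what "view it many ways in your own application" could look like, the code below parses comma separated rows and sorts them by action. The row layout used here (port,ip,seconds,action, one connection per line) is purely an assumption for illustration. Check the real output of showActiveThreads() before relying on any of it. ConnectionRow, parseRows and sortByAction are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Assumed row layout (hypothetical): port,ip,seconds,action
struct ConnectionRow {
    int port = 0;
    std::string ip;
    long seconds = 0;
    std::string action;
};

// Parse one row per line. Lines without all four fields are skipped.
std::vector<ConnectionRow> parseRows(const std::string& csv)
{
    std::vector<ConnectionRow> rows;
    std::istringstream lines(csv);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        std::string port, ip, seconds, action;
        if (std::getline(fields, port, ',') && std::getline(fields, ip, ',') &&
            std::getline(fields, seconds, ',') && std::getline(fields, action)) {
            ConnectionRow r;
            r.port = std::stoi(port);       // throws on non-numeric input
            r.ip = ip;
            r.seconds = std::stol(seconds);
            r.action = action;
            rows.push_back(r);
        }
    }
    return rows;
}

// Group connections by action. stable_sort keeps the original order
// among rows that share the same action.
void sortByAction(std::vector<ConnectionRow>& rows)
{
    std::stable_sort(rows.begin(), rows.end(),
        [](const ConnectionRow& a, const ConnectionRow& b) { return a.action < b.action; });
}
```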
XMLFoundation now includes openssl. This is a future reaching decision that was made while implementing the authentication in the HTTP Proxy. This better positions XMLFoundation with a wider array of encryption algorithms. In the past, XMLFoundation only included SHA-2 and TwoFish. Now every application that uses the XMLFoundation also has the industry standard openssl interfaces at their disposal. This move is not at all out of line with what XMLFoundation has been doing since the very beginning, which is to gather together all of the algorithms necessary to build portable native applications. openssl even goes a step further toward native with assembly language implementations (where available) that outperform C++.
XMLFoundation included zLib in a way that makes building on XMLFoundation simple by putting all of the zLib source code into a single file, GZip.cpp, and making it work in all the environments the XMLFoundation supports. That approach eliminates the need to link to zLib because you linked to XMLFoundation. The same is true for GSparseHash.h, PThread.cpp, TwoFish.cpp, and GThread.cpp (and the unused BZip.cpp) in the XMLFoundation. Integration of essential components is essential, otherwise loading all the dependent libraries would dramatically complicate the use of XMLFoundation. Consider the current situation of the iconv library in 2014. In Linux, iconv() based on ICU is the standard interface for handling Unicode translations. It has been ported to Android, but it's not part of Android. It is part of iOS, but if XMLFoundation uses it then any app built on XMLFoundation in iOS would need to link iconv as well. The Windows builds use WideCharToMultiByte() so they don't NEED iconv(). I intend to include an iconv() interface inside XMLFoundation because GString needs it and it solves the aforementioned problems on Android and iOS. http://site.icu-project.org/home#TOC-What-is-ICU-
Additionally, this allows for the tightest possible integration with GString and allows for benchmark comparisons to WideCharToMultiByte() because it would become available to Windows as well as Android. The integration of openssl is another story.
Openssl cannot be wrapped up into one big .cpp file. It can be wrapped up into one big binary. The new folder in the source distribution XMLFoundation\Libraries\openssl is 60 MB. As of November 2014, within the XMLFoundation none of the utilities use any openssl calls (yet), so using XMLFoundation will not force you to link to openssl. Additionally, if you #define __NO_OPENSSL prior to including ServerCore.cpp, then there will be no link dependencies to openssl generated. Therefore, it is very easy to NOT use openssl OR to make your own build of libcrypto.a rather than trust the openssl binaries now included in the XMLFoundation distribution.
This move simplifies the use of openssl if you are using XMLFoundation. Every project needs to have the C++ include file search path set up so that standard includes like GString.h and XMLObject.h can be found. Notice where the new folders are in the source tree:
By default, the openssl folder contains the header files for all 3 windows binaries (Win32, Win64, and WinPhone). This is a new comment at the very top of the Read.me.txt file in the root of the source distribution:
======================================================================================================
The openssl directory is configured for Windows builds by default.
======================================================================================================
If you are building on Linux
rename the [ Libraries/XMLFoundation/inc/openssl ] to [ Libraries/XMLFoundation/inc/openssl-Windows ]
rename the [ Libraries/XMLFoundation/inc/openssl-Linux ] to [ Libraries/XMLFoundation/inc/openssl ]
If you are building on iOS
rename the [ Libraries/XMLFoundation/inc/openssl ] to [ Libraries/XMLFoundation/inc/openssl-Windows ]
rename the [ Libraries/XMLFoundation/inc/openssl-ios ] to [ Libraries/XMLFoundation/inc/openssl ]
If you are building on Android
rename the [ Libraries/XMLFoundation/inc/openssl ] to [ Libraries/XMLFoundation/inc/openssl-Windows ]
rename the [ Libraries/XMLFoundation/inc/openssl-Android ] to [ Libraries/XMLFoundation/inc/openssl ]
======================================================================================================
Now any application set up to use XMLFoundation can use openssl like this because Libraries/XMLFoundation/inc is already in the path.
#include <openssl/md5.h>
#include <openssl/x509.h>
On all 3 Windows builds, the binaries are automatically linked (unless you #define NO_PRAGMA_CRYPTO_LINK), so in this way XMLFoundation has included openssl. On the non-Windows builds that do not support the #pragma comment(lib,"libcrypto.a"), you must add -lcrypto to any makefile that generates dependencies to openssl. This has already been done for the 5Loaves example program makefile that builds the HTTP server in the console for Linux. For additional information, read the notes on building the openssl binaries located in the Libraries/openssl folder.
November 26, 2014, this was what I had to say: On January 11, 2014, I published a performance breakthrough that dramatically improved the performance of what was already the fastest XML Parser on earth. The new breakthrough of memory management on top of the breakthrough of the "oid" put XMLFoundation in a league of its own when doing XML performance tests. In search of some official feedback, I "Requested For Comments" at IETF (Internet Engineering Task Force) in [apps-discuss]. I explained the "oid" enhancements made to XML that are required to achieve a threefold performance gain over any SAX or DOM XML Parser. I did get a little feedback, but there was an argument from senior members that the issue before them was not in their jurisdiction, so I was recommended to bring the issue to [xml-dev]. So I did. Talk of XML 2.0 is like talk of changing the Bible to some people. The discussion took over the list servers until some political bug at Oasis brought the list servers offline for almost 5 days. Since [apps-discuss] had been on the topic of SMTP at the time, I started a thread [Help! SMTP has stopped Working] and asked for advice. Here are a few parts of that public conversation at [xml-dev] about how XMLFoundation uses the "oid" as the 1st attribute to key object instances. We can rightly call these folks experts.
http://www.linkedin.com/pub/hans-juergen-rennau/11/6a9/5b2 Hans-Juergen Rennau "The OID pattern to which Brian Aberle introduced us seems to me very interesting, as it throws light on a peculiarity - perhaps weakness - of the XML information model: the XML model is unrelated to the concept of a resource as it underlies the web..... I think the OID may be regarded as a special kind of resource locator, URL, that is."
https://www.linkedin.com/in/peterhunsberger Peter Hunsberger: "I now understand that you are attempting to optimize the management of individual instance data, across the web"
http://www.xml.com/pub/au/206 Len Bullard: Creating objects with markup is a perma-thread from the early days. Tim Bray called such ideas "premature optimization"
Brian Aberle wrote: > [...] If the world was to increment the XML spec...
http://www.w3.org/People/Quin/ Liam R E Quin: "Unlikely, I'm afraid..."
Liam R E Quin: "Although there's certainly interest in making XML faster, it has to work e.g even if attributes are re-ordered or rewritten."
www.linkedin.com/pub/amelia-a-lewis/23/11a/a20 Amelia A Lewis: You do realize that this means "this is not XML"?
http://www.linkedin.com/in/garethoakes Gareth Oakes: "I can think of a few edge cases where an XML caching mechanism may be vaguely useful but I am intrigued about your use case for this."
http://en.wikipedia.org/wiki/Michael_Howard_Kay Michael Kay had The Best Question Asked, it was "What scope is the OID unique within?"
It was the best question because it is an exact question with an exact answer, and it drew out an alternative implementation using XML's Processing Instructions. I had never considered that approach, but once it was brought to light during this conversation I recognized it was possible. The drawbacks are more markup overhead and an added liability: a processing instruction must always be hand copied along with the XML to which it applies if the XML is ever hand copied, which increases the chances of human error due to lack of markup knowledge, whilst the OID as 1st attribute keeps the association with the XML/Object to which it applies. Still, the argument is valid and such an implementation can achieve the bulk of the performance gains.
http://www.noaatech2002.noaa.gov/abstract_21.html Thomas Passin - "As I have been following this thread, it's seemed to me that this is exactly a case for a Processing Instruction."
http://www.linkedin.com/pub/hans-juergen-rennau/11/6a9/5b2 Hans-Juergen Rennau
Gentlemen,
I am not sure if such historical facts and details [about why XML does not address attribute ordering] are really important in the present context. At any rate, what interests me is the relationship between Brian's initiative and current XML. And what strikes me is the following. The XML model defines the information content of a given document; a document is the content which it is, and any glimpse beyond the document is out of scope. In particular, there is no room for distinguishing between a resource and its representation, - resource and representation are always one. But Brian's approach, so it seems to me, would build into the information content of a document a statement establishing a relationship with a seperate instance of information content (the data referenced by the oid), assigning to one (the data containing the oid) the role of being an update of the other - assigning to both the roles of subsequent states of the resource which assumes those states, but is not identical to them. And this is certainly an interesting idea.
Hans-Juergen Rennau
Why XML Fails to Address Attribute Order?
Within the discussion about the OID was the lingering unanswered question about why the XML specification does not address attribute order.
http://www.xml.com/pub/au/82 Simon St. Laurent: "I've always thought that discarding attribute order was one of the unforced errors of the XML spec "
www.linkedin.com/pub/michael-sokolov/4/a4a/235 Michael Sokolov: [ Answers the outstanding question with speculation (that I speculate must be right) ]: " Hashmaps do not preserve insertion order, so it may be that this was a rationale for discarding the order of attributes. I don't know. If so it wasn't a particularly good rationale."
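Why attribute order matters for the "oid" can be shown with a toy example. The sketch below is not the XMLFoundation parser and handles only one fixed record shape with no validation. The Customer type, the element names and applyUpdates are all hypothetical. The point it tries to show is that when the key is guaranteed to be the first attribute, the target object is known before any value is read, so values can go straight from the input buffer into that object.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <unordered_map>

// Hypothetical application object, not an XMLFoundation class.
struct Customer {
    std::string name;
};

// Reads <Customer oid="..."><Name>...</Name></Customer> records from a linear
// buffer. Because the oid is the FIRST attribute, the target object is found
// (or created) before any child element is seen, and each value is written
// straight into it. No intermediate tree is built.
// Returns the number of records applied.
int applyUpdates(const char* xml, std::unordered_map<std::string, Customer>& store)
{
    static const char kOpen[]  = "<Customer oid=\"";
    static const char kName[]  = "<Name>";
    static const char kClose[] = "</Name>";
    int applied = 0;
    const char* p = xml;
    while ((p = std::strstr(p, kOpen)) != nullptr) {
        p += sizeof(kOpen) - 1;
        const char* oidEnd = std::strchr(p, '"');
        if (!oidEnd) break;
        Customer& target = store[std::string(p, oidEnd)]; // find or create by oid

        const char* v = std::strstr(oidEnd, kName);
        if (!v) break;
        v += sizeof(kName) - 1;
        const char* vEnd = std::strstr(v, kClose);
        if (!vEnd) break;
        target.name.assign(v, vEnd);  // input buffer -> object, one copy
        ++applied;
        p = vEnd;
    }
    return applied;
}
```

If the oid could appear after other attributes, or attributes could be reordered in transit, a parser would have to hold the earlier values somewhere until the key turned up, which is the intermediate step this design avoids.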
This is what was new August 2014 - Documentation, Exemplification, and Feature Advancements are continuing despite the police, courts and judges who hate the truth that exposes them.
Exceptions must be handled. An unhandled exception will bring the whole system down. The GException recently enabled the Win32 and Win64 call stack capturing code that allows you to access information about the origin of the exception. Supported Windows builds include VC6 and VS2013. The XMLDialog example allows you to freehand enter the XML that will be parsed by the example, so by simply entering some invalid XML, we can raise an exception intentionally just to see the error and callstack information that is now included with ALL GExceptions system wide. The following screenshot is from the XMLDialog example with callstack details down into the XML Parser.
The new callstack information that is now included with GException includes the Class::Method[SourceFile.cpp] @ Line Number in Source file
This new functionality is deep in the roots of XMLFoundation. xmlLex throws GExceptions. I have multiple products that throw GExceptions, and the implementation was written with a "Mission Critical" mindset. That includes thread locking (because the Windows API in DbgHelp is single-threaded), a 32/64 bit implementation, and compiler support for VC6 and VS2013, including support for _UNICODE builds. This is "new" because the code was disabled sometime PRIOR to 2002. The XMLFoundation was first made public here on CodeProject in July 2002. GenericCallStack::GenericCallStack was mostly developed at that time, however it was not threadsafe or failsafe, so it was disabled by commenting out the intentional divide by 0 in GException::GException() that sets the whole thing off. The old source code and article from the first 2002 XMLFoundation publication is here: http://www.codeproject.com/Articles/2534/XMLFoundation with only 61,753 views as of today. CodeProject hosts other articles about how to obtain the callstack, here is one from 2005, http://www.codeproject.com/Articles/11132/Walking-the-callstack . I used the approach posted in Johannes Passing's blog, which closes the endless loop possibility with a constant stack size limit AND works in a multi-threaded application with a simple CRITICAL_SECTION. Now the implementation in XMLFoundation provides the callstack with every GException (in a DEBUG build). Yes, I have used debug binaries in production environments, so the debug implementation must be "Mission Critical" too. The GCallStack class is a textbook correct implementation for obtaining the callstack for native Windows applications using the Windows API's enumAndLoadModuleSymbols, StackWalk64, SymGetSymFromAddr64, UnDecorateSymbolName, and SymGetLineFromAddr64.
The GException can be foundational in your own applications. It was designed to be used at all levels in the application. The robust functionality of GException includes XML serialization. On the NCIS project, we would catch a GException on the server side of a failed CORBA or RPC call, then serialize the GException to XML and "throw" the XML over IIOP to the Client by returning a String return value with EITHER the results OR the exception. The Client will "re-throw" exceptions from the parser level and the error goes straight to the GUI without coding any special logic to handle it. That is some ancient design history behind this previously undocumented logic in XMLObjectFactory.cpp as it relates to GException. You can see from the following code that during the process of serializing ANY custom objects, you can also serialize in an exception in raw XML, and it will be re-thrown.
void XMLObjectFactory::extractObjects
    ( XMLObject *pRootObject, XMLObject *pSecondaryMapHandler )
{
    m_pProtocolObject = pSecondaryMapHandler;
    GetFirstTokenPastDTD();
    receiveIntoObject( pRootObject, 0, 0, 0);
    // If an exception was serialized in with the XML, re-throw it here.
    if (m_pException)
    {
        if(m_pzExceptionThrower)
        {
            throw GException("XML Object Factory", 1, m_pzExceptionThrower);
        }
        else
            throw GException(*m_pException);
    }
}
You can see from the callstack supplied with the GException in the previous screenshot that XMLObject::FromXML() calls XMLObjectFactory::extractObjects(), and you can see in the code snippet above that XMLObjectFactory::extractObjects() will rethrow an exception if one was serialized in. This is true of every call to XMLObject::FromXML() system-wide, which can handle not only the XML that was mapped to the user defined Objects but also maps to XML Exceptions without writing any additional code. In the XMLDialog example program included with the source, it is easy to paste in an XML-Exception as the source XML to see how the software behaves. Note that the outermost XML Element name, in this case <TransactResultSet>, can be set to any value.
<TransactResultSet>
<Exception>
<Description>Throw this Unexpected CUSTOM XML Exception!</Description>
<ErrorNumber>777</ErrorNumber>
<SubSystem>7</SubSystem>
<UserContext>
<Detail>Ancient functionality with recent documentation</Detail>
</UserContext>
</Exception>
</TransactResultSet>
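The "EITHER the results OR the exception" convention can be sketched on the client side with nothing but standard C++. This is an illustration only and does not use XMLObjectFactory or GException. The helpers textOf and unwrapReply are hypothetical, and the string searching stands in for a real XML parse.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the text between <tag> and </tag>, or "" if absent.
std::string textOf(const std::string& xml, const std::string& tag)
{
    const std::string open = "<" + tag + ">", close = "</" + tag + ">";
    std::size_t b = xml.find(open);
    if (b == std::string::npos) return "";
    b += open.size();
    std::size_t e = xml.find(close, b);
    if (e == std::string::npos) return "";
    return xml.substr(b, e - b);
}

// Hypothetical client-side check: a reply carries EITHER results OR an
// <Exception> element. If an exception was serialized in, re-throw it locally.
std::string unwrapReply(const std::string& replyXml)
{
    if (replyXml.find("<Exception>") != std::string::npos) {
        throw std::runtime_error(textOf(replyXml, "ErrorNumber") + ": " +
                                 textOf(replyXml, "Description"));
    }
    return replyXml; // normal results pass through untouched
}
```

With this shape, the calling code needs only one catch block near the GUI, no matter which server-side layer produced the error.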
The internals of GException include context details that make exceptions more functional since they may represent various layers and sub-systems in a large project. For example:
try
{
    str.ToFile(pzDestFile);
}
catch(GException &e)
{
    e.AddErrorDetail( 777, "Failed to uphold justice in the court");
    throw; // re-throw the same exception object, now carrying the added detail
}
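For readers without the library at hand, the same "add detail at each layer, then re-throw" pattern can be tried with a small stand-alone class. LayeredError below is a hypothetical type written for this sketch. It is not GException and makes no claim about how GException stores its details.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <string>
#include <utility>
#include <vector>

// Hypothetical exception type that accumulates (code, text) details as it
// travels up through the layers of an application. Not the GException class.
class LayeredError : public std::exception {
public:
    LayeredError(int code, const std::string& text) { addDetail(code, text); }

    void addDetail(int code, const std::string& text) {
        details_.emplace_back(code, text);
        if (!what_.empty()) what_ += " | ";
        what_ += std::to_string(code) + ": " + text;
    }
    const char* what() const noexcept override { return what_.c_str(); }
    std::size_t depth() const { return details_.size(); }

private:
    std::vector<std::pair<int, std::string>> details_;
    std::string what_;
};

// Lowest layer: reports the original failure.
void writeFile() { throw LayeredError(5, "disk full"); }

// Middle layer: adds its own context and passes the same exception upward.
void saveReport()
{
    try {
        writeFile();
    } catch (LayeredError& e) {
        e.addDetail(777, "could not save report");
        throw; // re-throw the same object, keeping the added detail
    }
}
```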
In August 2014, for the first time in the history of the XMLFoundation, the entire library can be compiled in Windows with _UNICODE defined. This switches many underlying Windows API calls, for example from MessageBoxA( ) to MessageBoxW( ), requiring "const char *" vs "const wchar_t *" respectively. For this reason, GString has several new methods added for native wchar_t support. GString now supports a default conversion to the ANSI codepage 1252, which defines the Latin character set used for English and some other European languages. The Windows builds use MultiByteToWideChar() and WideCharToMultiByte(). The Unicode support currently uses default conversions in the non-Windows builds too: UTF-8 and ISO-8859-1. This new functionality is based on IBM's open source ICU library (International Components for Unicode) and the standard iconv() interface. Unicode is the newest area of development in the XMLFoundation and it will likely continue to grow in functionality.
The new code now allows GString to be directly assigned by CStrings in MFC that are _UNICODE. These are some of the recent enhancements in support of Unicode:
XMLObject.h
int FromXML(const GString &XML, ...);
int FromXML(const char *pzXML, ...);
int FromXML(const wchar_t *pzXML, ...);
const char * ToXML(...);
const wchar_t *ToXMLUnicode(...);
GString.h
GString(const wchar_t *strWide);
GString & operator=(const wchar_t *);
operator const wchar_t * () const;
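To show what a default wide-to-narrow conversion involves, here is a minimal stand-alone sketch for ISO-8859-1 only. It is not how GString, ICU or iconv() do it. The functions toLatin1 and fromLatin1 are hypothetical names, and Windows codepage 1252 differs from ISO-8859-1 in the 0x80..0x9F range, which this sketch does not handle.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a GString method: narrow a wide string to
// ISO-8859-1. Code points 0..255 map 1:1 to bytes; anything else becomes '?'.
std::string toLatin1(const std::wstring& wide)
{
    std::string out;
    out.reserve(wide.size());
    for (wchar_t wc : wide) {
        unsigned long cp = static_cast<unsigned long>(wc);
        out.push_back(cp <= 0xFF
                          ? static_cast<char>(static_cast<unsigned char>(cp))
                          : '?');
    }
    return out;
}

// The reverse direction is lossless: every ISO-8859-1 byte is a valid code point.
std::wstring fromLatin1(const std::string& narrow)
{
    std::wstring out;
    out.reserve(narrow.size());
    for (char c : narrow)
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    return out;
}
```

Narrowing is lossy for anything outside the target character set, which is why a library has to pick a documented default and a replacement policy.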
This was the May 2014 Update.
Software projects can fail for many reasons. In mission critical software projects, there is a common objective to avoid design-resource dependencies. If one critical resource is run over by a bus, the whole project can become "bus terminated". In a lesser "mission critical" context, when another company hires away your undervalued programmer, management finds out that the new guy can't just pick up where the undervalued guy left off. Some smaller thinking programmers even convolute code, or leave out a few necessary comments just for job security. I am normally here to support anybody using XMLFoundation in some way that uncovers an error or missing feature, however many external events could "bus terminate" this "live" support. XMLFoundation is designed and organized to prevent failure and stand the test of time. There are no outstanding bugs. There are no feature requests awaiting implementation. There are no disputes to the claim that this is the world's fastest XML Parser. XMLFoundation is being used more than ever before and at the same time, the need for me to answer questions has decreased due to the heavily commented and exemplified code. At this point, termination of myself will not be the termination of a project based on XMLFoundation. The next economic gut bubble that vents itself won't blow XMLFoundation away.
I remember when XML 1.0 was disruptive. The sudden appearance of a global standard immediately forced projects worldwide to change course, and not all could. Projects that could not react to the disruptive technology (or the opportunity) are now built on less standard protocols that exclude them from some new use case scenarios or force them to add layers of inefficiency and failure points in retrospect design. It's no wonder why software projects are so well known for failure. Many things can go wrong, and times change fast. If you search the internet for the term "Disruptive Technology" along with "Capital" or "Venture", you will see that "Disruptive Technology" is sought after and valued even though it disrupts culture and/or other businesses. XMLFoundation is Disruptive Technology. It's like what vehicle radar detectors once were to police radar guns. Is that Disruptive Technology or Defensive Driving? Technology defends us from technology at every level. The performance numbers XMLFoundation produces do disrupt some technology business plans. In most technology plans, XMLFoundation merely adds new options.
When referencing this "Article" at [xml-dev], I called it bloggish. I remember reading Stroustrup's first edition of "The C++ Programming Language", which had interesting quotes in it. Here are a few re-quotes from the Fourth edition:
Don't interrupt me while I'm interrupting.
-Winston S. Churchill
Premature optimization is the root of all evil.
-Donald Knuth
On the other hand, we cannot ignore efficiency.
-Jon Bentley
The purpose of computing is insight, not numbers.
-R. W. Hamming
... but for the student, numbers are often the best road to insight.
-A. Ralston
In his book, Stroustrup injects some "bloggish" quotes like this:
"... there is nothing more difficult to carry out, nor more doubtful of success, nor more dangerous to handle, than to initiate a new order of things. For the reformer makes enemies of all those who profit by the old order, and only lukewarm defenders in all those who would profit by the new order..."
-Niccolò Machiavelli
Other Machiavelli Quotes that didn't fit in "The C++ Programming Language"
When I saw that Addison-Wesley allowed Mr. Stroustrup to publish "Programming is like sex..." in "The C++ Programming Language", then I knew that it's OK to have a little bit of fun with this stuff as we teach. So check out the new section in this documentation - The Sexy GString.
There are some heavier areas of exemplification and documentation that still need to be done, specifically in the areas of Distributed Object Designs, XML-DBMS integration, and the Java integration. These areas and topics are as foundational in XMLFoundation as the GProfile, which just now got its first real bit of documentation. If anyone would like to rent my hourly help, we will help each other. Contact me: XMLBoss at live dot com.
Programming is like sex: It may give some concrete results, but that is not why we do it.
-Richard Feynman
XMLFoundation is a cross platform application foundation. GString and xmlLex are 1 and yet they can be separated. The G classes are "designed" for each other - not simply "used" together. The GString now supports Compress and Cipher as member methods. Since both GZip and TwoFish are very portable and since the GString is designed for binary data equally as much as for character string data - the combination is a natural fit that deserves to be in the base class of GString. This comes after synchronizing GZip.cpp in the XMLFoundation with the latest source code from http://zlib.net/, an update that includes a fix for a rare compression bug. Zlib implements compression for most applications built after 1996. The authors recommend that all users of Zlib should upgrade and obtain the fix.
The XMLFoundation library contains many things, from utilities to frameworks (plural to plural). The XMLFoundation implements a use of XML that is about 3 times faster than the best that can be done with SAX when parsing XML for the purpose of an update to some dataset that you already have. That is a very common task in the application layer or in a DBMS that uses XML as a native input protocol. In that case, if the keyword OID is present as the first attribute, we can be triple fast AND use 50% less memory. This is accomplished by eliminating a temporary memory copy made by typical XML parsers as well as eliminating many arguments being pushed and popped from the stack by using xmlLex::getToken(*tok), explained in more detail where performance is discussed.
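The "no temporary copy" idea can be sketched in a few lines. This is not the xmlLex source: the Token struct and the getAttributeValue() helper are hypothetical names made up for illustration, and the scanner is deliberately naive (it assumes double-quoted values and no '>' inside a value). The point is only that the token is a pointer and a length into the caller's input buffer, filled through one out-parameter, so nothing is copied on the way to the application layer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical token: a view into the caller's XML buffer, never a copy.
struct Token {
    const char *pStart = nullptr; // first byte of the value inside the input buffer
    std::size_t nLength = 0;      // number of bytes in the value
};

// Hypothetical helper (not part of XMLFoundation): scans one start tag for
// the attribute pzName and points *pTok at its value inside pzTag.
// Returns false when the attribute is absent or its value is unterminated.
bool getAttributeValue(const char *pzTag, const char *pzName, Token *pTok)
{
    assert(pzTag != nullptr && pzName != nullptr && pTok != nullptr);
    const std::size_t nNameLen = std::strlen(pzName);
    for (const char *p = pzTag; *p != '\0' && *p != '>'; ++p) {
        // An attribute name follows whitespace and is followed by ="
        if ((*p == ' ' || *p == '\t' || *p == '\n') &&
            std::strncmp(p + 1, pzName, nNameLen) == 0 &&
            p[1 + nNameLen] == '=' && p[2 + nNameLen] == '"') {
            const char *pValue = p + 3 + nNameLen;
            const char *pEnd = std::strchr(pValue, '"');
            if (pEnd == nullptr)
                return false; // unterminated value
            pTok->pStart = pValue; // no allocation, no memcpy
            pTok->nLength = static_cast<std::size_t>(pEnd - pValue);
            return true;
        }
    }
    return false;
}
```

Calling getAttributeValue(xml, "OID", &tok) on a tag such as <Customer OID="42" name="Ann"> leaves tok.pStart pointing at the "42" inside the original buffer, which is the kind of value an application could use to look up an existing object before reading the rest of the element.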
The fact that the first 3 letters of XMLFoundation are ALL CAPS understates the C++ FOUNDATIONAL application development tools (which contain an XML Parser, like the C++ framework Qt). Building on GString is like building on a "better" ostream. Opinion has no place amid "better" performance data. Every application that was ever built on ostream could benefit from the XMLFoundation. All the utilities are carefully designed to be a portable platform everywhere that a C++ compiler exists (like the popular C++ library Boost, which XMLFoundation complements nicely). XMLFoundation's GThread is still the only pthread port for Windows Phone that I know of. Windows Phone is now the third-largest OS across Europe with 10 percent of the smart phone market -- more than double its share compared with last year. I support it. Behind the capital XML is a solid application Foundation.
In the past 30 days, there have been several minor "fixes" in the XMLFoundation as well - so all of this deserved being put together into build 211, which is already serving several production applications. Additionally, new documentation now includes example use of the virtual ToXML() method of XMLObject in the new example program titled BigData. Since XMLFoundation is the fastest XML Parser on earth, it stands as the most viable solution for integration of massive XML data sets - most will be to/from an SQL database - but where BigData meets XML, the XMLFoundation is a solution at hand. Despite the focus on XMLFoundation support for the equally important mobile platforms (iOS/Android/Windows Phone), XMLFoundation originated from the high performance needs of large data warehousing and is positioned to be the de facto standard for BigData XML just as zlib became the de facto compression. Zlib is officially de facto according to Wikipedia - which is often known to be incorrect in certain topics, but I believe it is correct in this case.
This is what was happening on March 31, 2014.
Of late, I have sought out 'leaders' in the domain of XML to present a proposal for a futuristic XML that has 2 new small keywords that prefix an attribute list in a 100% syntactically backwards compatible approach so as not to disrupt any XML 1.0/1.1 software. To assemble the chiefs for a pow wow on the subject requires the attendance of the leaders of many technical worlds. IETF is a moderated forum (like codeproject.com is moderated) and OASIS is unmoderated, once famous for hosting [xml-dev] where interfaces like SAX and DOM were created by unmoderated open debate and consensus.
I have been building and perfecting the examples in preparation for presenting this concept to the industry leaders. The concept is a 'new' idea - or as someone pointed out, it's a variation of a perma-thread from the old days. In this article, I had mentioned something about "Object Databases" almost changing the world once but there was a performance problem - so the world is square.
The discussion got hot at [xml-dev] until OASIS experienced malfunctions that had the list servers down for a total of 4 and a half days. OASIS Tech support had no response for some people and a broken hyperlink for problem ticket 2546 for other people. The discussion lasted for 2 days and brought out excellent debate and advancement in the documentation of the concept. Many "Authorities" on XML commented prior to the server failure that stopped the discussion for almost 5 days. I am trying to communicate this architectural design concept to all who can understand. For those seeking to understand the design concepts, I believe it is best to read the questions asked by "Experts" on the subject of XML and read the 2 day discussion. While discussing 'new' concepts like this, the various views and terminology used in other people's questions are helpful to read. The discussion at [xml-dev] mostly accomplished documenting the outline for an official "Internet Draft" based on a summary of this discussion.
Please use the forum following this article to document any additional point to cover in a formal "Internet Draft" document for standards approval - In progress.
Recently, I added a "major" new example program. It's mostly been there all the time but it never had a makefile - see CustomHTTPService. It builds a static web server and implements a simple post handler AND a multi-part-form handler. It's a great server platform based on ServerCore.cpp.
Recently, I also added a new example program called ContainOrInherit, explaining the difference between containment design and inheritance design, and the GProfile now supports XML format since that could be useful. See the (large) comment explaining the reasoning for adding XML support to GProfile 14 years after designing it amid all this XML support.
Earlier in March 2014, the bulk of the work was in the files Console.cpp and 5LoavesSvc.cpp - both are excellent frameworks for any service. Those files are not tied tightly to ServerCore.cpp, they use ServerCore.cpp. Those files are a very useful starting point 'design patterns' for many applications. They are intended to be generic for anyone needing to build a windows service. Those two files are not about XML, they are about Foundation.
This is what was new on Jan 11, 2014
This is the point in the XMLFoundation project at which I will switch my focus to the porting and development of several applications that were designed and built using older versions of the XMLFoundation library. Recently, my focus had been enhancing the XMLFoundation. I have more than one application that has yet to be ported to 64 bit. In the future, updates to this main article will indicate a major update versus the daily or frequent updates of the most recent source at the external link. The current XMLFoundation and the examples are very much a result of user feedback. Even the changelog.txt was started in response to a suggestion that someone gave me. Good idea. The new ChangeLogTail.txt was my idea. If you come up with a new example program, idea (great or small), or feedback that might help others - don't keep it to yourself - share it. Share it with me and the public on the forum at the bottom of this page, or share/discuss it with me directly (Roaring Checkmate At Live dot com). I ask for this feedback despite the fact that recently there has been more "downvoting" than ever before. The old "5 Star" XMLFoundation has new "1 Star" haters.
The very old section in this document titled Faster than Fast, written in 2009, was put together to give some explanation and emphasis to the fact that XMLFoundation is fast. The XMLFoundation is FASTER in 2014 (as much as 3 TIMES FASTER at some operations than any 2013 release). Thank God for numbers, because words can say anything, but numbers divide fact from words. This 2014 speed-leap comes from customized memory management. This is as simple as reducing calls to the generalized global memory manager in the operating system via calls to new() and delete(). The concept is simple, and there are various approaches. The goal is to obtain heap memory for a set of operations that you might define as 1 transaction or complete task.
I have the advantage of having done custom memory management before, so I applied an experienced approach toward accomplishing this. In this release, I abundantly commented the source code changes and additions to document exactly how this custom memory management is accomplished. The numbers prove exactly how important an issue memory management is. Whether your target is an Android phone or a 64 bit Windows server, the same concept applies.
Additionally, as we kick off 2014 with a massive reduction in the reliance on the operating system's memory manager, we raised the roof for 32 bit processing limits that are reached when new() returns NULL. So, if faster is not what you need, perhaps you will enjoy the higher processing limits that you can now achieve in the same memory limitations that you have always had. FASTER execution and HIGHER limits are both achieved by the same upgrade in this release. I expect that this raised the roof for 64 bit limits as well but I have never seen that roof.
The bulk of the memory management documentation is in the code. I will conclude the introduction to the 2014 version with the basic steps that make it all happen.
In XMLObject.h, the new method:
virtual int GetMemberMapCount()
returns the number of MemberMaps() that all objects of any specific type will have mapped to them. Knowing this, XMLObject will preallocate a chunk of memory large enough to hold the details of ALL MemberDescriptors that contain that information at runtime for the object instance - in the primitive days of 2013, each MemberDescriptor was allocated a global heapspace of its own using new().
The code in XMLObject.cpp now looks like this - to manage the memory blocks:
int nArraySize = GetMemberMapCount(0);
m_pMemberDescriptorArray = malloc(sizeof(MemberDescriptor) * nArraySize);
GList made a similar change. Here is a bit from GListNodeCache that applies the same concept a bit differently.
void *pBlock = malloc(sizeof(GList::Node) * NODES_PER_ALLOC);
// Now we access individual nodes like this: The temp variable pVoid is for readability....
void *pVoid = ((char *)pBlock) + ( sizeof(GList::Node)*nBlockIndex);
GList::Node *pNodeInBlock = (GList::Node *)pVoid;
// this is the same code as above with no temp variable. It is a type cast with pointer arithmetic:
pNodeInBlock = (GList::Node *)((void *)(((char *)pBlock)+(sizeof(GList::Node)*nBlockIndex)));
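To see the block allocation idea in one self-contained piece, here is a minimal sketch. It is not the GListNodeCache source: the Node struct and the NodePool class are hypothetical stand-ins, and the block size of 256 is an arbitrary assumption. It shows one malloc() serving many nodes, with the same pointer arithmetic as the excerpt above, and a single free() per block when the pool goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical node type standing in for GList::Node.
struct Node {
    Node *pNext;
    Node *pPrev;
    void *pData;
};

// Hypothetical pool: one malloc() per NODES_PER_ALLOC nodes instead of one
// trip to the global heap per node.
class NodePool {
public:
    static constexpr std::size_t NODES_PER_ALLOC = 256; // assumed block size

    NodePool() : m_nUsedInLast(NODES_PER_ALLOC) {}
    ~NodePool() { for (void *p : m_blocks) std::free(p); }
    NodePool(const NodePool &) = delete;
    NodePool &operator=(const NodePool &) = delete;

    Node *allocate()
    {
        assert(m_nUsedInLast <= NODES_PER_ALLOC);
        if (m_nUsedInLast == NODES_PER_ALLOC) {
            // Current block is full (or there is none yet): get a new one.
            m_blocks.reserve(m_blocks.size() + 1); // so push_back cannot throw
            void *pBlock = std::malloc(sizeof(Node) * NODES_PER_ALLOC);
            if (pBlock == nullptr)
                throw std::bad_alloc();
            m_blocks.push_back(pBlock);
            m_nUsedInLast = 0;
        }
        // Byte offset of the next free node inside the newest block.
        char *pBytes = static_cast<char *>(m_blocks.back()) +
                       sizeof(Node) * m_nUsedInLast;
        ++m_nUsedInLast;
        return new (pBytes) Node{nullptr, nullptr, nullptr};
    }

    std::size_t blockCount() const { return m_blocks.size(); }

private:
    std::vector<void *> m_blocks; // every block obtained from malloc()
    std::size_t m_nUsedInLast;    // nodes already handed out from the newest block
};
```

With this sketch, the first 256 calls to allocate() cost a single malloc(), and consecutive nodes sit next to each other in memory. Individual nodes are never returned to the pool here, which matches the "one transaction or complete task" goal but would need a free list for long-lived containers.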
This is what was new December 21st 2013.
MurmurHash was the latest big advancement in Hashing. It was published in Austin Appleby's personal blog. Google hired him and has taken over MurmurHash, and published a variant of it called CityHash. Strangely, many Google publications failed to recognize SpookyHash - published October 31, Ground Hog Day, and other celebrated days, citing only MurmurHash.
Here are two links to help you quickly catch up on hashing algorithms to get your prerequisites up to date. To cut through all the trending Hash Hype, I recommend this brief overview of hashing:
http://www.homolog.us/blogs/blog/2013/05/06/why-computer-science-professors-dislike-hash-functions/
I wanted to see if CityHash could help me speed up my indexing scheme so I decided to test SpookyHash/CityHash within GHash which is part of XMLFoundation. The test counts CPU cycles and/or microseconds on both Windows and Linux while indexing large datasets for both 32 and 64 bit applications. The source code for this test is included in the example program ExIndexObjects.
The results are very interesting. GHash is so fast that CityHash slows it down. The GHash is a unique algorithm designed to index XMLObjects but it can index anything. In summary, GHash is an array of binary trees. It handles hash collisions so efficiently that it eliminates the need for "low collision" hashing algorithms such as CityHash, which use far more CPU than a simple Rotating Hash, and further research will determine just how low it can go in the simplification of that step.
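The shape of that idea (a cheap hash that only picks a bucket, with an ordered tree inside each bucket to absorb collisions) can be sketched with standard containers. This is not the GHash source: TreeBucketIndex is a hypothetical name, std::map stands in for GBTree, the bucket count is an arbitrary assumption, and the hash is the textbook rotating hash rather than whatever variant GHash actually uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Textbook rotating hash: rotate the running value left by 4 bits,
// then XOR in the next byte. Very little CPU per byte.
inline std::uint32_t rotatingHash(const std::string &key)
{
    std::uint32_t h = static_cast<std::uint32_t>(key.size());
    for (unsigned char c : key)
        h = ((h << 4) | (h >> 28)) ^ c;
    return h;
}

// Hypothetical index: a static array of balanced trees. A collision only
// means two keys share a tree, where lookup is still logarithmic.
template <typename V, std::size_t BUCKETS = 1024>
class TreeBucketIndex {
    static_assert(BUCKETS > 0, "at least one bucket is required");

public:
    void insert(const std::string &key, const V &value)
    {
        m_buckets[rotatingHash(key) % BUCKETS][key] = value;
    }

    // Returns a pointer to the stored value, or nullptr when the key is absent.
    const V *find(const std::string &key) const
    {
        const auto &tree = m_buckets[rotatingHash(key) % BUCKETS];
        const auto it = tree.find(key);
        return it == tree.end() ? nullptr : &it->second;
    }

private:
    std::array<std::map<std::string, V>, BUCKETS> m_buckets;
};
```

Instantiating TreeBucketIndex<int, 1> forces every key into the same bucket, and lookups still work because the tree resolves the collision. That is the reason a low-collision (and more expensive) hash buys little in this design.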
CRC-n is faster than CityHash. For checksumming, or creating a hash function with a perfect distribution (aka avalanche effect), CRC is a better choice. CityHash and SpookyHash are curious works in math that have only 1 possible application: they can destroy data quickly. Google Code Search was a free beta product from Google which debuted in Google Labs on October 5, 2006, allowing web users to search for open-source code on the Internet. Google announced that Code Search was to be shut down along with the Code Search API on January 15, 2012. The service remained online until March 2013, and it now returns a 404. If you have a HUGE amount of sensitive data and want to delete it (which will mark the disk sectors free for the OS to use) and DESTROY it so that it could never be recovered by any disk utility tools, there are many free utilities that accomplish this already, but with these new algorithms, the software can accomplish more block corruption in less time. Why does Google Inc invest in and market algorithms with no purpose in any of their active projects? If I was a share-holder, I would vote for new management.
GHash ought not be confused with a block cruncher. The name GHashTableTreeStack was too long so it's called GHash. The long name would be more proper, like Mr. Hash. GHash is one single "Data Structure Algorithm" that combines multiple algorithms and data structures (see highlighted) as parts of the whole. GHash uses a Rotating Hash to index a Static Array of Binary Trees. A GBTree is a variant of an automatically balancing AVL Tree that contains an alternate index so that beyond traversing Ascending and Descending (like any binary tree via Left/Right) it also has a secondary index via Next/Previous which allows the GBTree to also be traversed in the order that items were inserted. The secondary index can optimize certain bulk commits (aka disk writes) of individual updates that were applied while it was stored in RAM by GBTree. The GHash cannot do that, but it uses a GBTree that can. The XML data updates and application layer updates primarily use only the primary index. Upon a special kind of commit, the fragmented memory structure can be flattened back to its initial state into the same order that they came from the disk. Another important algorithmic component of GHash is the unique (aka one of a kind) GHashIterator which allows SIMPLE and non-blocking THREAD-SAFE iteration of this complex data structure via an internal integer index into the Static Array used in combination with a GBTreeIterator, which maintains two GStacks that act as LIFO queues of state information used to quickly iterate the Tree portion of this structure. Such a task is typically accomplished by a recursive method of the Tree structure. A merely typical approach will force a multithreaded application to block reads of the structure. GHash is non typical and very fast.
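Keeping the traversal state in an iterator object instead of in the call stack of a recursive method looks roughly like this. It is a simplified sketch, not the GBTreeIterator source: TreeNode and InOrderIterator are hypothetical names, and it uses a single std::stack where the article describes two GStacks. Because all of the state lives in the iterator, each reader can own its own iterator and walk the same tree without sharing a cursor with any other reader. The sketch makes no claim about safety while another thread is modifying the tree.

```cpp
#include <cassert>
#include <stack>

// Hypothetical binary tree node.
struct TreeNode {
    int key;
    TreeNode *pLeft;
    TreeNode *pRight;
};

// Non-recursive ascending traversal. The explicit stack holds the nodes
// whose left subtree has been entered but which have not been returned yet.
class InOrderIterator {
public:
    explicit InOrderIterator(const TreeNode *pRoot) { pushLeftSpine(pRoot); }

    // Returns the next node in ascending key order, or nullptr when done.
    const TreeNode *next()
    {
        if (m_stack.empty())
            return nullptr;
        const TreeNode *p = m_stack.top();
        m_stack.pop();
        pushLeftSpine(p->pRight); // the right subtree comes after p itself
        return p;
    }

private:
    void pushLeftSpine(const TreeNode *p)
    {
        for (; p != nullptr; p = p->pLeft)
            m_stack.push(p);
    }

    std::stack<const TreeNode *> m_stack; // LIFO state, private to this iterator
};
```

The caller simply loops on next() until it returns nullptr, and can stop or resume at any point, which a recursive traversal method cannot offer without callbacks.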
Each algorithm has various properties. Building from a Windows 64 bit starting point, I re-structured the Google Sparse Hash project source code. I made major structure changes to the code. This may be the answer to a commented question in the source code about an empty bucket. I added struct support for Windows, fully redesigned the build, and provide a 64 bit target for Windows which the Sparse Hash Project does not. The work is in the file GSparseHash.h. It will be the file of interest in this release, and admittedly it is still a work in progress. It is a starting point. That one file stands as an exception in the XMLFoundation because it does not yet build under every compiler. It's MUCH simpler to incorporate into any build as the inlined implementation in GSparseHash.h (where the compiler supports it) - and I expect only minor changes for GCC to support GSparseHash.h.
Although GHash is not a Distributed Data Structure, it is a far more valuable algorithmic component within a Distributed Data Structure than CityHash. Just as GHash is made of many algorithmic components so must a structure like BigTable designed and used by Google, which is likely where CityHash is used. BigTable is best defined as a sparse, distributed multi-dimensional sorted map.
Boom. Bust. Readjust. Ashes to Ashes. Dust to Dust. In God we Trust. Watch the Bubble sort Bust.
64 bit build results from ExIndexObjects to test XMLObject indexing speeds
----------------------------------------------------
This sample works with HUGE test files.
It is VERY VERY slow under a debugger.
If you dont have enough RAM, or want to speed it up.......
Delete TheWholeTruth.txt as many folks must do in their reality.....
It will use Truth.txt which will obtain evidence you can see.
Note: 777 milliseconds = 777,000 microseconds
Note: The 32 bit build counts cpu clock cycles
[Creating Object Instances]=539 milliseconds
[Create 81 MB XML Document]=1499 milliseconds
[Create 81 MB XML Document Faster]=633 milliseconds
[Save To Disk]=933 milliseconds
[Releasing memory]=1026 milliseconds
-------- GList --------
[InsertObjects]=5472 milliseconds
[Iterate All ]=821160 objects in 5,245 microseconds
[Search Find ]=23,455 microseconds
[Update Object]=3,182 microseconds
[Update Faster]=2,982 microseconds
[Iterate All ]=821160 objs in 4,259 microseconds
[Search NoFind]=33,071 microseconds
[Create XML ]=661 milliseconds
[XML To Disk ]=708 milliseconds
[Free Memory ]=2920 milliseconds
---------------------- Compressed 83,109,379 bytes of XML to 5,131,905
-------- GQSortArray --------
[InsertObjects]=5738 milliseconds
[Iterate All ]=821160 objects in 3,797 microseconds
[Search Find ]=1,429 microseconds
[Update Object]=1,113 microseconds
[Update Faster]=1,091 microseconds
[Iterate All ]=821160 objs in 3,827 microseconds
[Search NoFind]=2,108 microseconds
[Create XML ]=743 milliseconds
[XML To Disk ]=1311 milliseconds
[Free Memory ]=2849 milliseconds
---------------------- Compressed 83,109,379 bytes of XML to 5,131,905
-------- GBTree --------
[InsertObjects]=6814 milliseconds
[Iterate All ]=821160 objects in 19,073 microseconds
[Search Find ]=6 microseconds
[Update Object]=45 microseconds
[Update Faster]=29 microseconds
[Iterate All ]=821160 objs in 18,026 microseconds
[Search NoFind]=6 microseconds
[Create XML ]=644 milliseconds
[XML To Disk ]=676 milliseconds
[Free Memory ]=3844 milliseconds
---------------------- Compressed 83,109,379 bytes of XML to 5,131,905
-------- GHash --------
[InsertObjects]=6061 milliseconds
[Iterate All ]=821160 objects in 101,138 microseconds
[Search Find ]=1 microseconds
[Update Object]=38 microseconds
[Update Faster]=29 microseconds
[Iterate All ]=821160 objs in 111,718 microseconds
[Search NoFind]=1 microseconds
[Create XML ]=1641 milliseconds
[XML To Disk ]=745 milliseconds
[Free Memory ]=4102 milliseconds
---------------------- Compressed 83,109,379 bytes of XML to 6,902,527
-------- GSparseHash --------
[InsertObjects]=6654 milliseconds
[Iterate All ]=821160 objects in 8,931 microseconds
[Search Find ]=1 microseconds
[Update Object]=39 microseconds
[Update Faster]=22 microseconds
[Iterate All ]=821160 objs in 8,705 microseconds
[Search NoFind]=1 microseconds
[Create XML ]=1659 milliseconds
[XML To Disk ]=1332 milliseconds
[Free Memory ]=4352 milliseconds
---------------------- Compressed 83,109,379 bytes of XML to 6,990,074
C:\XMLFoundation\Examples\C++\ExIndexObjects\Release>
The ongoing commentating and documenting of the source code is always improving the usability of this powerful set of tools. The tools are getting even more powerful. Many comments were added into the source code in response to questions that people have asked. As I answer questions, I put those answers in strategic places in the source code in the form of a comment that will prevent others from having the same issue in question.
You can always search through the XMLFoundation library source code on almost any method in any G class, and within the XMLFoundation you will find a usage example to complement the documentation in comments above each method.
If a picture is worth a thousand words, an example is worth ten thousand. "Open Source" projects are frequently unsupported and undocumented, however you will find that the ongoing commentating of source code is continuously being maintained and developed to make the toolkit more useful and productive in the hands of people who are new to using it. The detailed documentation is all in the source code, right where you need it. For example, a comment was just recently added into ListAbstraction.h just above the StringCollectionAbstraction class that explains how that base class is used to store ANY data type into ANY data structure. That comment links to a new class called CDoubleArrayAbstraction in AbstractionsMFC.h that stores the data type "double" into MFC's array implementation called CArray. While the implementation was the point of interest of one person, the comments added will be the point of interest to even more people that need some other data type in some other kind of data structure.
The GString is ancient. Before the GString, xmlLex used ostream. The ONLY reason GString was initially created was to out-perform ostream. This is why GString has two identical methods, Write() and write(). ostream uses a lowercase write(). In our application(s) that used ostream, we simply commented out the ostream instance and replaced it with a GString of the same variable name. If GString did not have a lowercase write(), you would have to hunt through your code and replace all the places you called write() and change it to Write(), making it difficult to change something at such a low level, underneath the XML Parser. If you understand what ostream is, then you understand what GString is. If they were both clothing, they would be found in the same department next to MFC's CString. The GString is way sexier. It exposes parts that ostream and CString keep private.
The big 'hack' in GString that makes it so fast, is that it makes no heap allocation unless your data grows beyond 64 bytes. It may sound like a small thing, but it's the hack that fills the crack. In this document, I already wrote about how important memory management is. GString applies the same concept of heap avoidance to use the faster stack space if possible, or by including some 'data' space in the 'object' space. This means when you instantiate a GString, it eats up an 'extra' 64 bytes just in case you might use it. If you use more than the 64 bytes, you will be punished by waiting for the slower heap allocation. If you put no data in the GString, then you needlessly preallocated memory that you never used and your punishment was that you are holding a lock on unused memory that perhaps is needed elsewhere. It's all about memory vs. speed tradeoffs.
Very recently XMLFoundation added GString0 and GString32. The number represents the bytes of 'extra' space used to avoid the heap allocation. If you expect that a GString value will be empty 99% of the time and you expect many instances of that normally empty GString will exist at the same time, you may choose to use GString0, which makes no initial stack allocation. Your application will then use less memory and run a little slower only when the 1% of the exceptions come around. The GString is a GString64, but we just call it GString.
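The 'data' space inside the 'object' space can be sketched as a class template whose parameter plays the role of the number in GString0, GString32 and GString64. This is an illustration only, not the GString source: SmallBuffer and its members are hypothetical names, and since standard C++ has no zero-length arrays, the N = 0 case here still carries one unused byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical buffer with N bytes of inline storage. No heap allocation
// happens until the data outgrows those N bytes.
template <std::size_t N>
class SmallBuffer {
public:
    SmallBuffer() : m_pData(m_inline), m_nLength(0), m_nCapacity(N) {}
    ~SmallBuffer() { if (onHeap()) std::free(m_pData); }
    SmallBuffer(const SmallBuffer &) = delete;
    SmallBuffer &operator=(const SmallBuffer &) = delete;

    void write(const char *pSource, std::size_t nBytes)
    {
        if (m_nLength + nBytes > m_nCapacity)
            grow(m_nLength + nBytes);
        if (nBytes != 0)
            std::memcpy(m_pData + m_nLength, pSource, nBytes);
        m_nLength += nBytes;
    }

    bool onHeap() const { return m_pData != m_inline; }
    std::size_t length() const { return m_nLength; }
    const char *data() const { return m_pData; }

private:
    // Moves the data to a heap block at least nNeeded bytes large,
    // doubling the capacity until it fits.
    void grow(std::size_t nNeeded)
    {
        std::size_t nNew = m_nCapacity ? m_nCapacity * 2 : 16;
        while (nNew < nNeeded)
            nNew *= 2;
        char *p = static_cast<char *>(std::malloc(nNew));
        if (p == nullptr)
            throw std::bad_alloc();
        if (m_nLength != 0)
            std::memcpy(p, m_pData, m_nLength);
        if (onHeap())
            std::free(m_pData);
        m_pData = p;
        m_nCapacity = nNew;
    }

    char m_inline[N ? N : 1]; // the 'extra' bytes that live inside the object
    char *m_pData;            // points at m_inline until the first grow()
    std::size_t m_nLength;
    std::size_t m_nCapacity;
};
```

A SmallBuffer<64> that receives 5 bytes never touches the heap, while a SmallBuffer<0> pays for a heap allocation on its first byte but costs almost nothing while it stays empty. That is the memory vs. speed tradeoff in miniature.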
Another major performance area of the GString is in the resize() method. The reallocation algorithm is very simple: the size of the buffer is doubled whenever it would otherwise grow beyond its current allocation bounds. Alternatively, the allocation space can grow by a block size, but that approach is generally an order of magnitude slower in a preallocated GString. GString supports both allocation styles, but it defaults to doubling the current buffer size. The worst case scenario for the size doubling is: If you have 4 GB in a GString and only need to fit one more byte, you might have to allocate 8GB for a GString that will never grow beyond 4GB + 1 byte. You can always preallocate 4GB + 1 byte if you know that will be the final size, then you can avoid all this performance oriented guesswork with the fastest implementation possible. Many times, the final size is unknown until it is final.
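A little arithmetic shows why the two growth styles differ so much. The two functions below are hypothetical helpers written for this comparison, not part of GString. They only count how many reallocations each policy needs to reach a given size and say nothing about wall-clock time, which also depends on how much data each reallocation has to copy.

```cpp
#include <cassert>
#include <cstddef>

// Reallocations needed to reach nTotalBytes when capacity doubles on overflow.
inline std::size_t reallocationsWhenDoubling(std::size_t nTotalBytes,
                                             std::size_t nInitialCapacity)
{
    assert(nInitialCapacity > 0); // doubling zero would never grow
    std::size_t nCapacity = nInitialCapacity;
    std::size_t nCount = 0;
    while (nCapacity < nTotalBytes) {
        nCapacity *= 2;
        ++nCount;
    }
    return nCount;
}

// Reallocations needed when capacity grows by a fixed block size instead.
inline std::size_t reallocationsWhenGrowingByBlock(std::size_t nTotalBytes,
                                                   std::size_t nInitialCapacity,
                                                   std::size_t nBlockSize)
{
    assert(nBlockSize > 0); // a zero block size would never grow
    std::size_t nCapacity = nInitialCapacity;
    std::size_t nCount = 0;
    while (nCapacity < nTotalBytes) {
        nCapacity += nBlockSize;
        ++nCount;
    }
    return nCount;
}
```

Starting from 64 bytes and growing to 1 MB (1,048,576 bytes), doubling needs 14 reallocations, while growing by 64 byte blocks needs 16,383. Larger blocks narrow the gap, at the cost of the same kind of overshoot that doubling has.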
If the GString was ALL about performance, it would be of little interest beyond an internal tool needed by a performance oriented XML Parser. However, the GString is plenty robust, filled with common needs in a wide array of applications. The interfaces are fully documented in GString.h. Here is a summary. I left out all the typical String methods like Mid() Left() Right(), Upper() Compare*() and many others; see GString.h for complete documentation.
void write(const char *pSource,__int64 nBytes);
void Write(const char *pSource,__int64 nBytes) {write(pSource,nBytes);};
void write(unsigned char *pSource,__int64 nBytes) {write((const char *)pSource,nBytes);};
void Write(unsigned char *pSource,__int64 nBytes) {write((const char *)pSource,nBytes);};
bool FromFile(const char* pzFileName, bool bThrowOnFail = 1);
bool FromFileAppend(const char* pzFileName, bool bThrowOnFail = 1);
bool ToFile(const char* pzFileName, bool bThrowOnFail = 1);
bool ToFileAppend(const char* pzFileName, bool bThrowOnFail = 1);
void MergeMask(const char *szSrc, const char *szMask);
void FormatBinary(unsigned char *pData, __int64 nBytes, int bIncludeAscii=1);
void FormatBinary(const GString &strBinary, int bIncludeAscii =1);
const char *CommaNumeric();
const char *AbbreviateNumeric();
void StripQuotes();
void EscapeXMLReserved();
void TrimLeft(char ch = ' ', short nCnt = -1);
void TrimLeftWS();
void TrimLeftBytes(__int64 nCnt);
void PadRight(__int64 nCnt, char ch = ' ');
void Append(__int64 nCnt, char ch = ' ');
void TrimRight(char ch = ' ', short nCnt = -1);
void TrimRightBytes(__int64 nCnt);
void TrimRightWS();
__int64 FindCaseInsensitive( const char *lpszSub, __int64 nStart = 0 ) const;
__int64 Find( const char *pstr, __int64 nStart = 0 ) const;
__int64 Find( char ch, __int64 nStart = 0 ) const;
__int64 FindNth( const char *pstr, int Nth, __int64 nStart = 0 ) const;
__int64 FindNth( char ch, int Nth, __int64 nStart = 0 ) const;
__int64 FindOneOf(const char *pzCharsToSearchFor) const;
const char *FindStringAfter(const char *pSearchFor) const;
GString FindStringBetween(const char *pSearchForBegin, const char *pSearchForEnd) const;
__int64 FindBinary(GString &strToFind, __int64 nStart = 0);
__int64 ReverseFind( char chToFind ) const;
__int64 ReverseFind( char chToFind, __int64 nStart ) const;
__int64 ReverseFind( const char *pzToFind, __int64 nStart = -1, int bMatchCase = 0 ) const;
__int64 ReverseFindNth( const char *pzToFind, int Nth) const;
__int64 ReverseFindOneOf( const char *pzToFind ) const;
__int64 ReverseFindOneOf( const char *pzToFind, __int64 nStart ) const;
void Insert( __int64 nIndex, char ch );
void Insert( __int64 nIndex, const char *str, __int64 nStrLen = -1 );
__int64 InsertBefore( const char *pzMatch, const char *pzInsertThis, int bMatchCase = 0);
__int64 InsertAfter( const char *pzMatch, const char *pzInsertThis, int bMatchCase = 0);
void Remove ( __int64 nStart, __int64 nLen );
__int64 RemoveAll ( char ch );
__int64 RemoveAll ( const char *pStrToRemove, int bMatchCase = 0 );
int RemoveFirst ( char ch );
int RemoveFirst ( const char *pStrToRemove, int bMatchCase = 0 );
int RemoveLast ( const char *pStrToRemove, int bMatchCase = 0 );
int RemoveLast ( char ch );
void Replace( char chWhat, char chReplaceWith, int nFirstOccuranceOnly = 0 );
void Replace( const char * szWhat, char chReplaceWith, int nFirstOccuranceOnly = 0 );
void Replace( const char * szWhat, const char *szReplaceWith, int nFirstOccuranceOnly = 0 );
void ReplaceCaseInsensitive( const char * szWhat, const char *szReplaceWith, __int64 nStart = 0, int nFirstOccuranceOnly = 0 );
void Replace( char chWhat, const char *szReplaceWith, int nFirstOccuranceOnly = 0 );
void ReplaceChars(const char *pzCharSet, char chReplaceWith);
The new GThread, first of all, is a Windows thing, inspired by that "designed for each other" concept. Windows Mobile, Windows 32, Windows 64, and Windows Phone all need a pthread interface (POSIX Threads) for ServerCore, and for the thread synchronization within XMLFoundation caching. iOS, Linux, Android, AIX, Solaris, and HPUX all have an official pthread implementation. Microsoft decided not to be bound to the POSIX standard. I guess you can't be a leader if you always follow. Windows Run Time (aka managed .NET code or WinRT) has a completely different threading model besides the Win32 threading model. I needed a pthread interface for WinRT and I was forced to build my own - but the vast majority of the work was already complete, thanks to a combination of Win32 PThreads, the implementation from John E. Bossom, and the publication of "namespace ThreadEmulation" Copyright (c) Microsoft Corporation. GThread.cpp works on all Windows platforms. It is NOT intended to implement the whole of POSIX threads - only the small subset necessary within the XMLFoundation code. This is a clearly defined abstract interface (defined with #defines) in GThread.h. This better positions XMLFoundation to further customize GThread.cpp. Unlike the previously used PThread.cpp (which was "designed" to implement POSIX), GThread is "designed" for the needs of XMLFoundation. As was the case between the ObjectFactory and the XML Parser - being designed for each other makes all the difference in the world. If you can do it better than the standard, I guess it's time to quit following that standard. I suspect GThread will someday further optimize the integration to BOTH Windows threading models to make the most of a fully native solution. For example, consider this code snipped from xmlDefines.h that uses a native threading optimization for Windows.
#ifdef _WIN32
#ifdef __WINPHONE
#define XML_MUTEX gthread_mutex_t
#define XML_INIT_MUTEX(m) gthread_mutex_init(m,0);
#define XML_DESTROY_MUTEX(m) gthread_mutex_destroy(m);
#define XML_LOCK_MUTEX(m) gthread_mutex_lock(m);
#define XML_UNLOCK_MUTEX(m) gthread_mutex_unlock(m);
#else
#define XML_MUTEX CRITICAL_SECTION
#define XML_INIT_MUTEX(m) InitializeCriticalSection(m);
#define XML_DESTROY_MUTEX(m) DeleteCriticalSection(m);
#define XML_LOCK_MUTEX(m) EnterCriticalSection(m);
#define XML_UNLOCK_MUTEX(m) LeaveCriticalSection(m);
#endif
#else
#define XML_MUTEX pthread_mutex_t
#define XML_INIT_MUTEX(m) pthread_mutex_init(m,0);
#define XML_DESTROY_MUTEX(m) pthread_mutex_destroy(m);
#define XML_LOCK_MUTEX(m) pthread_mutex_lock(m);
#define XML_UNLOCK_MUTEX(m) pthread_mutex_unlock(m);
#endif
PThread.cpp is still included in case you need it, but it is now unused by the XMLFoundation and several applications I have built upon it.
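Here is one way application code might sit on top of macros like these. The macro definitions are repeated from the xmlDefines.h excerpt so the sketch stands alone (the Windows Phone branch is left out), while XmlMutexGuard and SharedCounter are hypothetical classes that do not exist in XMLFoundation. The guard is just the usual C++ scope-based lock, so the unlock cannot be forgotten on an early return or an exception.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#define XML_MUTEX CRITICAL_SECTION
#define XML_INIT_MUTEX(m) InitializeCriticalSection(m);
#define XML_DESTROY_MUTEX(m) DeleteCriticalSection(m);
#define XML_LOCK_MUTEX(m) EnterCriticalSection(m);
#define XML_UNLOCK_MUTEX(m) LeaveCriticalSection(m);
#else
#include <pthread.h>
#define XML_MUTEX pthread_mutex_t
#define XML_INIT_MUTEX(m) pthread_mutex_init(m,0);
#define XML_DESTROY_MUTEX(m) pthread_mutex_destroy(m);
#define XML_LOCK_MUTEX(m) pthread_mutex_lock(m);
#define XML_UNLOCK_MUTEX(m) pthread_mutex_unlock(m);
#endif

// Hypothetical scope guard: locks in the constructor, unlocks in the destructor.
class XmlMutexGuard {
public:
    explicit XmlMutexGuard(XML_MUTEX *pMutex) : m_pMutex(pMutex)
    {
        XML_LOCK_MUTEX(m_pMutex);
    }
    ~XmlMutexGuard() { XML_UNLOCK_MUTEX(m_pMutex); }
    XmlMutexGuard(const XmlMutexGuard &) = delete;
    XmlMutexGuard &operator=(const XmlMutexGuard &) = delete;

private:
    XML_MUTEX *m_pMutex;
};

// Hypothetical shared object that owns its mutex for its whole lifetime.
class SharedCounter {
public:
    SharedCounter() : m_nValue(0) { XML_INIT_MUTEX(&m_mutex); }
    ~SharedCounter() { XML_DESTROY_MUTEX(&m_mutex); }
    SharedCounter(const SharedCounter &) = delete;
    SharedCounter &operator=(const SharedCounter &) = delete;

    // Safe to call from several threads; returns the value after incrementing.
    int increment()
    {
        XmlMutexGuard guard(&m_mutex);
        return ++m_nValue;
    }

private:
    XML_MUTEX m_mutex;
    int m_nValue;
};
```

The same source compiles against CRITICAL_SECTION on Windows and pthread_mutex_t elsewhere, which is the whole point of hiding the native primitive behind the #defines.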
If you have ever built an application that stores and retrieves "Application Settings", you had a situation where you could have used GProfile. On Windows, it is popular to put your application configuration settings in the Windows Registry. There are reasons not to do that; however, it is a practice that dates back to 16 bit Windows 3.1 development when Charles Petzold documented and exemplified the Windows API. Even back then, it was popular to add your own [Section] into the WIN.INI file. The WIN.INI file stored "Application Settings" for Windows such as wallpaper settings and other user preferences. The Windows API allowed you to write in WIN.INI - so did notepad.exe.
There are some advantages to writing your configuration settings in a file that is not managed by the operating system. It becomes portable to Linux and iOS as well as Windows. It becomes very simple to move configurations from machine to machine when the "Application Settings" are in a file rather than all over in the registry. It also becomes very easy to manage multiple configurations by maintaining multiple versions of the configuration file. It's not so simple to copy a snapshot of the Windows registry and switch between versions. You can easily do that with GProfile.
GProfile also allows you to encrypt your application configuration settings. You could encrypt individual keys in the registry, but that would require a bit more code and work than using GProfile. The Windows Registry has an API called RegNotifyChangeKeyValue() that allows your application to be notified when an external source has changed a value. GProfile has a method called RegisterChangeNotification() that does the same thing. Additionally, you can easily switch between INI format and XML format to store your "Application Settings". Settings stored in the registry are not so easily exported, and then only to a .REG file - a far cry from the simple INI or open XML that GProfile supports.
The GProfile is ancient, like the GString. In fact, not until May of 2014 were the last of the C++ "short" datatypes converted to "bool". When GProfile was first written, not all C++ compilers supported 'bool', or they just defined it as a short, so now GProfile uses more modern and intuitive syntax. The following are a few of the method interfaces to GProfile.
void RegisterChangeNotification(const char *pzSection,
const char *pzEntry, fnChangeNotify fn);
void UnRegisterChangeNotification(const char *pzSection, const char *pzEntry);
GStringList *ListChangeNotifications(){return &lstChangeNotifications;}
const char *LastLoadedConfigFile(){return m_strFile;}
void SetConfig(const char *szSection, const char *szEntry, const char *pzValue);
void SetConfig(const char *szSection, const char *szEntry, int nValue);
void SetConfig(const char *szSection, const char *szEntry, long lValue);
void SetConfig(const char *szSection, const char *szEntry, __int64 lValue);
void SetConfigBinary(const char *szSection, const char *szEntry,
unsigned char *lValue, int nValueLength);
void SetConfigCipher(const char *szSection, const char *szEntry,
const char *pzPassword, const char *lValue, int nValueLength);
long WriteCurrentConfig(const char *pzPathAndFileName, bool bWriteXML = 0);
long WriteCurrentConfig(GString *pzDestStr, bool bWriteXML = 0);
long WriteCurrentConfigSection(GString *pzDestStr,
const char *pzSection, bool bWriteXML = 0);
void GetSectionNames(GStringList *lpList);
const GList *GetSection(const char *szSectionName);
__int64 GetSectionEntryCount(const char *szSectionName);
GProfileSection *RemoveSection(const char *szSection);
void AddSection(GProfileSection *pS, int bIssueChangeNotification = 1);
bool RemoveEntry(const char *szSection, const char *szEntry);
__int64 ValueLength(const char *szSection, const char *szEntry);
bool DoesExist(const char *szSectionName, const char *pzEntry);
bool DoesExist(const char *szSectionName);
bool GetBoolean(const char *szSectionName, const char *szKey, bool bThrowNotFound = true );
bool GetBool(const char *szSectionName, const char *szKey, bool bThrowNotFound = true);
__int64 GetInt64(const char *szSectionName, const char *szKey, bool bThrowNotFound = true);
int GetInt(const char *szSectionName, const char *szKey, bool bThrowNotFound = true);
const char *GetString(const char *szSectionName,
const char *szKey, bool bThrowNotFound = true);
const char *GetPath(const char *szSectionName, const char *szKey,
bool bThrowNotFound = true);
GProfile(const char *szConfigData, __int64 dwSize, bool bIsXML);
GProfile(const char *pzFilePathAndName, bool bIsXML);
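To make the section/entry model and the change notification idea concrete, here is a minimal sketch. It is a stand-in written with the standard library, NOT GProfile itself: MiniProfile, OnPortChanged and g_nNotifications are hypothetical names, the callback signature is an assumption (the listing above does not show how fnChangeNotify is declared), and returning 0 for a missing entry when bThrowNotFound is false is a choice made for this sketch only.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for illustration only; this is NOT GProfile.
class MiniProfile
{
public:
    // Assumed callback shape; the real fnChangeNotify may differ.
    typedef void (*fnChangeNotify)(const char *pzSection, const char *pzEntry,
                                   const char *pzValue);

    // Parses "[Section]" headers and "Key=Value" lines from INI text.
    explicit MiniProfile(const std::string &strIni)
    {
        std::istringstream in(strIni);
        std::string line, section;
        while (std::getline(in, line))
        {
            if (!line.empty() && line[line.size() - 1] == '\r')
                line.erase(line.size() - 1);            // tolerate CRLF input
            if (line.empty())
                continue;
            if (line[0] == '[')
            {
                std::string::size_type end = line.find(']');
                if (end != std::string::npos)
                    section = line.substr(1, end - 1);
                continue;
            }
            std::string::size_type eq = line.find('=');
            if (eq != std::string::npos)
                m_data[section][line.substr(0, eq)] = line.substr(eq + 1);
        }
    }

    void RegisterChangeNotification(const char *pzSection, const char *pzEntry,
                                    fnChangeNotify fn)
    {
        m_watch[Key(pzSection, pzEntry)] = fn;
    }

    // Stores the value, then fires the callback registered for that entry.
    void SetConfig(const char *szSection, const char *szEntry, const char *pzValue)
    {
        m_data[szSection][szEntry] = pzValue;
        std::map<std::string, fnChangeNotify>::iterator it =
            m_watch.find(Key(szSection, szEntry));
        if (it != m_watch.end())
            it->second(szSection, szEntry, pzValue);
    }

    // Returns 0 when the entry is missing and bThrowNotFound is false.
    const char *GetString(const char *szSection, const char *szKey,
                          bool bThrowNotFound = true)
    {
        SectionMap::iterator s = m_data.find(szSection);
        if (s != m_data.end())
        {
            std::map<std::string, std::string>::iterator e = s->second.find(szKey);
            if (e != s->second.end())
                return e->second.c_str();
        }
        if (bThrowNotFound)
            throw std::runtime_error("profile entry not found");
        return 0;
    }

private:
    typedef std::map<std::string, std::map<std::string, std::string> > SectionMap;
    static std::string Key(const char *s, const char *e)
    {
        return std::string(s) + "\n" + e;
    }
    SectionMap m_data;
    std::map<std::string, fnChangeNotify> m_watch;
};

// Hypothetical callback used to observe notifications.
static int g_nNotifications = 0;
static void OnPortChanged(const char *, const char *, const char *)
{
    ++g_nNotifications;
}
```

With this sketch, constructing a MiniProfile from "[HTTP]\r\nPort=8080\r\n", registering OnPortChanged for the HTTP/Port entry and then calling SetConfig("HTTP", "Port", "9090") invokes the callback once, which is the same pattern the RegisterChangeNotification() interface above is described as supporting.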
Building the Windows Phone Example is very simple once you have the development environment set up. The WP8 Emulator is Hyper-V, so you need to have a Core i5 or i7 CPU that has Intel VT-x/EPT to see anything work. You will also need Win 8 Pro or Enterprise. Install the Windows Phone SDK after installing VS2012 (or select the Windows Phone Development option during the install of VS2013). Under the Examples folder, open the solution for Windows Phone, then build and run it on the emulator. The example application does the same thing as the example for iPhone - it shows how to deal with XML, and it starts an HTTP server on the phone using ServerCore.cpp.
The port to iOS is complete. The XMLFoundation concepts work beautifully in Objective C++. I added a new example program called ObjectiveObjects that shows how to use all the C++ examples in Objective C++. The example added for iOS is more complete than the example for Android, in that it documents converting XML to Objects. Like the Android example, the iOS example also shows how to use ServerCore.cpp to create an HTTP server on the phone. All of the code of interest in the new example is found in the file ViewController.m. Here is a bit of that file:
@implementation ViewController
@synthesize button1 = _button1;
@synthesize button2 = _button2;
@synthesize button3 = _button3;
@synthesize textView = _textView;
class MyCustomObject : public XMLObject
{
public:
    GString m_strString;
    GString m_strColor;
    int m_nInteger;
    char m_szNative[10];
    GStringList m_strList;

    virtual void MapXMLTagsToMembers()
    {
        MapMember( &m_strList, "StringList", "Wrapper");
        MapMember( &m_nInteger, "Number");
        MapMember( &m_strString, "String");
        MapMember( m_szNative, "FixedBuffer", sizeof(m_szNative) );
        MapAttribute(&m_strColor, "Color");
    }
    DECLARE_FACTORY(MyCustomObject, Thing)
    MyCustomObject(){} ~MyCustomObject(){};
};
IMPLEMENT_FACTORY(MyCustomObject, Thing)
char pzXML[] =
"<Thing Color='Red'>"
"<String>Owners Word</String>"
"<Number>777</Number>"
"<FixedBuffer>native</FixedBuffer>"
"<Wrapper>"
"<StringList>one</StringList>"
"<StringList>two</StringList>"
"</Wrapper>"
"</Thing>";
int StartHere0()
{
    MyCustomObject O;
    O.FromXMLX(pzXML);

    GString strDebug;
    strDebug << "Yo! Check out O:" << O.m_strString <<
        "[" << O.m_nInteger << "]:" << O.m_szNative << "\n\n\n";
    XlogInfo(strDebug);

    O.m_strString = "Root was here";
    GString strXMLStreamDestinationBuffer = "<?xml version=\"1.0\" standAlone='yes'?>\n";
    strXMLStreamDestinationBuffer << "<!DOCTYPE totallyCustom SYSTEM \"http:...\">\n";
    O.ToXML( &strXMLStreamDestinationBuffer);
    XlogInfo(strXMLStreamDestinationBuffer);
    return 0;
}
- (IBAction)test2:(id)sender {
StartHere0();
}
This design pattern for processing XML is described in more detail further down in this document. All the concepts presented here apply to iOS as well as all the other platforms it already supported.
And the code to start the HTTP server from ObjectiveC++ is even simpler:
#include "/Users/user/Desktop/XMLFoundation/Servers/Core/ServerCore.cpp"
const char *pzBoundStartupConfig =
"[System]\r\n"
"Pool=5\r\n"
"ProxyPool=0\r\n"
"\r\n"
"[HTTP]\r\n" "Enable=yes\r\n"
"Index=index.html\r\n"
"Home=%s\r\n" "Port=%s\r\n";
int g_isRunning = 0;
void StartHTTPServer(NSString *strHome, NSString *strPort)
{
if (!g_isRunning)
{
g_isRunning = 1;
SetServerCoreInfoLog( iOSInfoLog );
const char *pzHome = [strHome UTF8String];
const char *pzPort = [strPort UTF8String];
GString strCfgData;
strCfgData.Format(pzBoundStartupConfig,pzHome,pzPort);
GProfile *pGP = new GProfile((const char *)strCfgData, (int)strCfgData.Length());
SetProfile(pGP);
server_start("-- iOS Server --");
}
else
{
GString G("Server is already running");
iOSInfoLog(777, G);
}
}
- (IBAction)test1:(id)sender {
NSArray *paths = NSSearchPathForDirectoriesInDomains
(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectory = [paths objectAtIndex:0];
NSString *filePath = [documentsDirectory stringByAppendingPathComponent:@"index.html"];
NSString *str = @"<html><head><title>Hello</title>"
@"</head><body><p>Hello World</p></body></html>";
[str writeToFile:filePath atomically:TRUE encoding:NSUTF8StringEncoding error:NULL];
StartHTTPServer(documentsDirectory,@"8080");
}
For more detailed build information, see the document "XMLFoundation for iOS" in the source distribution.
An Android sample program was added that displays a simple GUI from Java that uses ServerCore.cpp to build an HTTP server application. You will find this documentation in the source download.
Although the Android example focuses on the use of ServerCore.cpp and does not make use of the XML-to-Object code, all of that code is fully ported to Android. Android is where the JavaXMLFoundation should be used: it processes XML in a native binary that uses JNI for the object bindings, so the developer has a pure Java experience and faster XML processing. Finally, if your object model is built with the Android NDK, you can make full use of the C++ XMLFoundation and all the Object-to-XML features on Android.
The December 21, 2012 build extended and widened the library interfaces with emphasis on the future in software design, and even style. XMLFoundation maps the data in raw XML to lists, arrays, strings, ints, and even int64s in the application layer. All of these data types have been supported for over a decade already. New interfaces in the 2012 version support mapping char (1 byte), short (2 bytes), and char buf[n bytes] to fully complete the mapping to every native C++ data type. The foundational GString is now indexed by 64 bit addressing which pushes all XML document size limitations into almost infinity. This long addressing scheme has been added in such a way that 32 bit applications will still use 64 bit addressing, granting them the bounds of infinity as well. Benchmark tests confirm that XMLFoundation is the fastest approach for moving XML into the application layer. The overhead of pushing an extra 4 bytes on the call stack during tokenization is measurable but insignificant in light of all the stack operations eliminated by using a custom non SAX interface to the XML parser (read the details in the 'Faster than Fast' section). On 64 bit systems, there is no performance penalty to pay at all since the registers are 64 bits wide already.
The GString is such a sexy article of engineering that it gets used to hold all types of streamed data in an application, not just XML. By design, the GString replaced ostream, which the tokenizer (aka the lexical analyzer or the XML Parser) was initially built with. By overloading the << operator, it was a very simple task to port this work to a better stream class. In 2012, we deal with file sizes and offsets that require 64 bit indexing on the average or above average home computer. Granted, it will be many years before the average home computer allocates contiguous regions of memory that large - but high end servers do it already and they have registers that are 64 bits wide. To keep GString positioned to serve mankind in ALL situations, it now uses a 64 bit index. Target the future.
A 32 bit test parsed XML containing element and attribute tags mapped to various lists, string and integers. This was executed while counting cpu cycles using the assembly code in GPerformanceProfile.cpp for these results running in a native 32 bit operating system that is not under WOW or virtualization:
Tokenizing with a 64 bit index in the new XMLFoundation 32 bit build on a 32 bit OS: (176,121) CPU cycles
Tokenizing with a 32 bit index in the old XMLFoundation 32 bit build on a 32 bit OS: (170,144) CPU cycles
Merely widening the integer index at the lowest level of the XML parser caused the machine code produced by the C++ compiler to PUSH and POP more data on the stack, hence it now takes more CPU cycles to process the same amount of XML. If you understand what caused the difference, then you can understand why the XMLFoundation is preferable to SAX if you want the fastest solution. Truly, this is the fastest solution on earth for processing XML in 32 bit, even though it is now optimized for 64 bit. The fastest solution will be the one selected to process the largest XML data sets in the world because the decision will be made by an engineer, not a politician. Those data sets will need this very large indexing scheme. Smaller data sets no longer need to worry that some freak occurrence (an exception) might (however unlikely) surpass 32 bit indexing thresholds. English words fail to express what raw numbers so emphatically and eloquently assert is the fastest way to process XML. Aside from all this raw horsepower produced through efficient algorithmic design, the application source code that uses XML is organized and simple.
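To put the two measurements in proportion, the arithmetic can be written out as below. This is only a worked calculation, assuming the quoted figures read as 176,121 and 170,144 cycles; the constant and function names are invented for the illustration.

```cpp
// Cycle counts quoted above for the 32 bit build on a 32 bit OS.
const long kCycles64BitIndex = 176121;
const long kCycles32BitIndex = 170144;

// Extra cycles spent after widening the index to 64 bits.
long ExtraCycles()
{
    return kCycles64BitIndex - kCycles32BitIndex;
}

// Relative overhead as a percentage of the old 32 bit index result.
double OverheadPercent()
{
    return 100.0 * ExtraCycles() / kCycles32BitIndex;
}
```

The difference is 5,977 cycles, or roughly 3.5 percent of the original count, which is the sense in which the overhead is measurable but small.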
As the name suggests, it provides a foundation for XML support in an application, however this is much more than just another XML parser. It applies a unique approach to handling XML that allows your application code to focus on the application rather than traversing DOM or subscribing to SAX events. The most unique feature of the XMLFoundation is the object oriented encapsulation that provides XML support in the application layer. XMLFoundation allows you to easily integrate XML with your GUI, or with your server objects, and it natively supports COM, DCOM, and CORBA objects.
XMLFoundation contains a small, fast, and portable XML tokenizer that has been refined and optimized in many large software projects. My involvement with XML pre-dates the finalization of the XML 1.0 recommendation by W3C. For years, the only XML Parser that could match XMLFoundation tokenization performance was "Expat" by James Clark - but as you will see, the unique ability to bypass DOM and SAX altogether makes XMLFoundation the fastest solution available for moving XML to and from application layer objects - and it requires far fewer lines of code to do it.
The performance of the stack based XML parser is at the top of its class for non-validating parsers. Parsing and tokenization is only half the task; the other half is getting the results into the member variables, lists, and objects that they need to be in to be useful in the application layer - it is in that task that XMLFoundation is in a class of its own. The performance is unparalleled because the memory buffer that contains the source XML parses directly into your custom class objects without ever being copied or temporarily stored in a DOM tree. It parses directly into your lists, objects, arrays, indexed data structures, and all native C++ data types. It even has support for common containers of element data such as MFC CStrings. It's been used in Java too. That said, speed of execution is less impressive than the speed of development and overall reduction in lines of code required to effectively use XML in your application.
XML is in the Foundation, but the foundation does much more than just XML. It is also a web services framework implemented in ServerCore.cpp. It can be extended several ways for HTTP as well as for other protocols. This allows you to build your application on a multi-threaded server blueprint that has been used on many platforms. It has been used to build servers that are not even XML based, but needless to say it works great for building an XML based server. The services framework supports a unique design approach for both static and dynamic server extensions, with examples of both - but XMLFoundation does even more than XML and Web Services.
If you are building an application that does not use XML and never will, XMLFoundation is still a very valuable tool for solving many very common development tasks. The data structure classes alone (List, Hash, Stack, Tree, Array, QSort) are very useful. They all have "Iterator" objects so that data structures can be read-referenced by multiple threads at the same time without blocking. The interface is standard to all data structures. If you find MFC or Rogue Wave Standard C++ library data structures useful, you will likely find XMLFoundation data structures even more so.
XMLFoundation also has standard algorithm implementations (Encryption, Compression, Data hash, Encoding). These are based on the works of other authors. They have been included into the XMLFoundation in a simplified build format. They all compile under C++, so if you are using them on AS/400, AIX, Solaris, Linux, or other like platforms - you do not even need to reference a C compiler from the makefile, only your C++ compiler. They are also organized into single .CPP files for each implementation - often a consolidation of many individual C source files in the original authors' publications.
XMLFoundation also has a plethora of application utilities including (Sorts, Performance Timers, Disk Directory, Exceptions, INI Profiles, Caching, String, Stream). XMLFoundation has many utilities that MFC does not. They are complete, comment documented with examples, and thoroughly tested on many software projects.
XMLFoundation is very portable. It builds on all versions of Windows (Win95 through Windows8 and Windows Mobile). Portions were initially developed on a RISC machine, and it was used in Solaris and Linux as early as 2001. Some of the compilers that have been used to build XMLFoundation include: CC5.0, Xlc, IntelC++, KAIc++, ForteC++, Visual C++, Borland C++, and eMc++. However, I believe it works with any C++ compiler, because it does not use namespaces, iostreams, or STL - all areas that are prone to porting problems in my experience. It does have template classes but their inclusion is optional as part of the implementation rather than part of the foundation. XMLFoundation and all the sample applications have recently been built and verified on Ubuntu and Fedora. The source is distributed with VC6 makefiles so that the source can be imported into projects using every version of the Microsoft compilers from 1998 through Visual Studio 2010. Now the source includes a Visual Studio 2012 project file with 32 and 64 bit targets defined.
The build dependencies are meticulously correct. Smart linkers leave out everything you don't use, so don't expect to see code bloat as a punishment for using XMLFoundation. Other development libraries were not designed as well from a build perspective. Your application will not load any DLLs as a result of using the XMLFoundation. Xfer is another project I manage (Xfer Docs) that is built on the XMLFoundation for the platform independence - the code is tight and the product(s) built on XMLFoundation reflect that.
I suppose an entire article could be written about each of the foundational classes, and I'm certain that they will be written. They are all well commented and coded with a highly experienced approach. The String class uses stack space when possible to avoid heap allocations. It's the best string implementation I've ever seen. The INI Profile class uses triggers that allow your application to pick up real-time configuration changes much like RegNotifyChangeKeyValue() in the Windows SDK. Exceptions can be configured to unwind the call stack to a memory buffer like Java's printStackTrace(). The Tree has an iterator. The Directory can delete recursively - on all platforms. The Stack is entirely inline, with standard and macro methods - it could not be any faster if it was coded directly in assembly. The StringList puts MFC's CStringList to shame, just look at the interfaces. The GHash puts Microsoft's CMapStringToPtr to shame. It is unspeakably faster. Look at the "MFCTypesFromXML" example and see for yourself.
You cannot build a house on a foundation of wet cement that has not cured yet. With cement, minimizing stress prior to curing minimizes cracking in your foundation. The same is true of software. The XMLFoundation is solid and completely cured. It would be too bold to say that the XMLFoundation has no bugs in it, but it has none that I am aware of and the code has been heavily used. It is a complete foundation. Building an application on any foundation like Java 1.0 or .NET 1.0 or anything 1.0 means that if you don't get slowed down by the bugs, you will be slowed down when you find all the missing functionality. This code was first released to the public July 4, 2002 - the XMLFoundation was already very mature for its age - it came from a good family - its mother had already been used on the largest software project in the world. Since then I have built several complex applications on it and many others have as well. It was completely stress tested with SMP hardware during a recent Fortune 50 proof of concept implementation. XMLFoundation interfaces are well established constants, no longer a curing foundation that is still forming.
The mother of the XMLFoundation was "The XML Object Framework", born in 1998 and 1999 (it was a long labor) for a client of mine. The XMLFoundation was born the following year. The XMLFoundation sported a completely new implementation of the XML parser based on the custom GString stream class that was also born in 2000. XML Journal Magazine reviewed a product built on the early XMLFoundation object factorization and called it "5 Star / World Class" in XMLJournal Magazine Volume 2 Issue 7 (note: they did not review XMLFoundation, they reviewed TransactXML). XMLFoundation was heavily developed the following two years before it became public in 2002. This project is mature and stable.
XMLFoundation absolutely IS the future in certain technology subsets. It is a gift to the world of engineering, and it comes with all the source code. Universities that want to teach algorithms, applications, or OO Design will find the XMLFoundation to be a great source code to base a curriculum on. Independent authors who want to write about cutting edge technology will find XMLFoundation a worthy subject. The future was written in the past.
XML is data. "Objects from data" is not a new concept. Programmers have been doing that for years, even before they were called objects. We still need to get data into objects today. The data can be XML or a result set, and the object might be a CDialog, a CORBA Object, a COM object, or your own invention. You still need to get the same thing done. Programmers have been doing this as long as there have been programmers.
If you apply enough force, you can make the cube fit into the round hole. If you apply enough force you can do anything - even police California. The brute force approach is to parse the XML into a DOM tree, and traverse the tree to gather the data required by application/object variables. This approach produces volumes of "simple" source code to move data from XML into Structured Objects, a poor approach with respect to implementation time and long term maintenance.
Alternatively, the OO approach generalizes this process into reusable functionality that enables objects to serialize to and from XML directly. OO is pronounced ohhh-ohhh - and it is short for Object Oriented (in case you didn't know) - it's poetic tech lingo - a code of its own.
Software developers of every language have a similar need. They must either:
A. Write their own Object-XML tools,
B. Find some production quality framework ready to use or,
C. Use brute force and budget for maintenance programmers.
We can mostly rule out option A because it takes a lot of time and the purpose of the project is to build product not tools. Option C is also unwise if you have any long term plans for your product or want to be able to quickly add new features. Option B leaves several paths and it wouldn't be right for me to toot my own horn and tell you that XMLFoundation is the best option available in the entire software industry to accomplish this fundamental task - so I encourage you to research this yourself and I expect that you will agree XMLFoundation is not just the best free solution, it's the best solution.
It's difficult to directly compare XMLFoundation to other solutions because the utilities in XMLFoundation, and many of the features in XMLFoundation are not found in other solutions. That said, here is a starting point for your own research:
Microsoft developed the "Xml.Serialization.XmlSerializer" for C#, but it only supports shallow serialization (no nested or complex objects) and it lacks many other features found in the XMLFoundation. I would wager that even the limited support it does provide is slower than XMLFoundation but I have not put the two technologies to a speed comparison.
IBM Developer Works has posted 2 or more XML Serialization libraries but they are based on an external XML Parser - so by nature of their design they must be slower.
The list goes on, and on, and on, and on of likeminded solutions.
From a procedural perspective, we put data in square sets just to make programming simple. Consider this example data that is a "Customer" with a list of "Orders" where each order has a list of "LineItems". That is not a square dataset - but for the sake of the application layer we have forced it to be square for the last 4 decades.
CUSTOMER | CUST_ID | ORDER_ID | ORDER_DATE | LINEITEM_ID | LINEITEM_DESC | PRICE |
Brian | 777 | 1 | July 4, 1777 | 7 | Firecrackers | $111 |
Brian | 777 | 1 | July 4, 1777 | 14 | Ariel Shells | $222 |
Brian | 777 | 1 | July 4, 1777 | 21 | Party Favors | $444 |
Brian | 777 | 2 | July 4, 2009 | 28 | Attorney Fees | $222 |
Brian | 777 | 2 | July 4, 2009 | 35 | State Fines | $555 |
The repeated values (the CUSTOMER, CUST_ID, ORDER_ID and ORDER_DATE columns, highlighted in red in the original table) fill the holes to make non-square data square. That repeated data is normally a pointer reference to the last sort break at the DBMS kernel level, but various toolsets often expand it long form so that 1 instance of a "Row" object does not rely on the data in another instance. It's a terrible situation that has plagued applications for as long as I can remember. The problem is that in reality there is no such thing as a "Row" object - it was more of a temporary/tool-object to get the data into real objects like Customers, Orders and LineItems. Countless data access products in the form of VBX, OCX, ActiveX and various frameworks and libraries serve up square data sets to applications that MANUALLY code the transfer of data into their application objects with volumes of code that looked something like this:
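As a rough sketch of that style of code, the block below walks the five rows of the table and starts a new Customer or Order whenever a sort break is detected. Everything in it is hypothetical and written for illustration: Tool is a made-up forward-only cursor standing in for whatever data access product is in use, and the Row, Customer, Order and LineItem structs are plain standard C++ types, not XMLFoundation classes.

```cpp
#include <string>
#include <vector>

struct LineItem { int id; std::string desc; int price; };
struct Order    { int id; std::string date; std::vector<LineItem> items; };
struct Customer { int id; std::string name; std::vector<Order> orders; };

// One flattened row of the square result set shown in the table above.
struct Row
{
    const char *customer; int custId; int orderId; const char *orderDate;
    int lineItemId; const char *lineItemDesc; int price;
};

// Hypothetical forward-only cursor standing in for a data access tool.
class Tool
{
    const Row *m_rows; int m_count; int m_pos;
public:
    Tool(const Row *rows, int count) : m_rows(rows), m_count(count), m_pos(-1) {}
    bool Next() { return ++m_pos < m_count; }
    const Row &Current() const { return m_rows[m_pos]; }
};

// Walks the rows and starts a new Customer or Order at each sort break.
std::vector<Customer> LoadCustomers(Tool &tool)
{
    std::vector<Customer> result;
    while (tool.Next())
    {
        const Row &r = tool.Current();
        if (result.empty() || result.back().id != r.custId)
        {
            Customer c; c.id = r.custId; c.name = r.customer;
            result.push_back(c);
        }
        Customer &cust = result.back();
        if (cust.orders.empty() || cust.orders.back().id != r.orderId)
        {
            Order o; o.id = r.orderId; o.date = r.orderDate;
            cust.orders.push_back(o);
        }
        LineItem li; li.id = r.lineItemId; li.desc = r.lineItemDesc; li.price = r.price;
        cust.orders.back().items.push_back(li);
    }
    return result;
}

// The five rows from the table above.
static const Row kRows[] = {
    { "Brian", 777, 1, "July 4, 1777",  7, "Firecrackers",  111 },
    { "Brian", 777, 1, "July 4, 1777", 14, "Ariel Shells",  222 },
    { "Brian", 777, 1, "July 4, 1777", 21, "Party Favors",  444 },
    { "Brian", 777, 2, "July 4, 2009", 28, "Attorney Fees", 222 },
    { "Brian", 777, 2, "July 4, 2009", 35, "State Fines",   555 },
};
```

Even in this small form, every column is copied by hand and the sort break logic has to be repeated for each level of nesting, which is the maintenance cost the surrounding text describes.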
Notice all the use of Tool that represents some sort of data set tool, class or library. Building software to accomplish this task of copying data into objects without such a tool would dramatically increase the lines of code required to move the square dataset into your application objects. Knowing what tools to use can be the difference between the success or failure of an entire project. One bad tool, or one missing tool can make all the difference in the world to a software developer.
This is the same example data from the square result set represented in XML:
You can see the objects, Customers, Orders, and LineItems.
<Customer id="777">
    <Name>Brian</Name>
    <Order id="1">
        <Date>July 4, 1777</Date>
        <LineItem id="7">
            <Desc>Firecrackers</Desc>
            <Price>111</Price>
        </LineItem>
        <LineItem id="14">
            <Desc>Ariel Shells</Desc>
            <Price>222</Price>
        </LineItem>
        <LineItem id="21">
            <Desc>Party Favors</Desc>
            <Price>444</Price>
        </LineItem>
    </Order>
    <Order id="2">
        <Date>July 4, 2009</Date>
        <LineItem id="28">
            <Desc>Attorney Fees</Desc>
            <Price>222</Price>
        </LineItem>
        <LineItem id="35">
            <Desc>State Fines</Desc>
            <Price>555</Price>
        </LineItem>
    </Order>
</Customer>
There is nothing square about XML. XML is an N-ary tree. That's why we naturally use DOM (Document Object Model) to traverse the data. For 1000s of years, we thought that the world was flat. Engineers made it be square because that was easier for them to cope with and now we live in the days where it begins to take its true shape. Unfortunately as of 2010, the opportunity of the paradigm data shape shift has not been harnessed by most programmers that grew up in the square world and are only familiar with square tools. They take the most obvious development path. If you present the problem of sorting to someone who has no tools, they will likely build a "bubble sort" - because that is the most obvious and immediate solution.
Typically the XML is parsed into a tree structure. This means that the linear and contiguous memory buffer of source XML is copied into many fragmented pieces of memory across the heap - each element and in many cases each token gets its own heap space. This makes the XML elements and attributes programmatically accessible with loops and recursion, just like Tool did for square datasets. The XML parser puts the Elements and Attributes into this temporary fragmented memory tree structure so that the application programmer can get at the information to copy it once more into a final structure that can be displayed on the GUI or used by the application. It is likely going to take as much or more code to get from the temporary DOM tree into the objects as it did to get from the square result set into the objects. In many cases, it will require recursion that is difficult to debug - much more difficult than the old fashioned iterative code required to copy from square result sets. Below is a code sample of some common tasks:
-------------------------------------------------------
MSXML::IXMLDOMNamedNodeMapPtr pAttrList = m_pCurNode->Getattributes();
_bstr_t bstrAttrName = (_bstr_t)(LPCTSTR)m_strName;
MSXML::IXMLDOMAttributePtr pNewAttr = m_pDOMDoc->createAttribute(bstrAttrName);
_bstr_t bstrAttrValue = (LPCTSTR)m_strValue;
pNewAttr->PutnodeValue((_variant_t)bstrAttrValue);
pAttrList->setNamedItem(pNewAttr);
AddNodeToTree(pNewAttr, m_hCurItem);
The square world is becoming part of history like the flat world. I remember back in the early 90s, we tried to rid ourselves of the square world with something called "The Object Database". It was a great concept and the only reason square prevailed against it is because nobody could implement an Object Database that was fast enough. Who cares how clean the code is if the application is dysfunctional because it is too slow? This is why XMLFoundation is so performance oriented - that's what it takes to change the world. The clean code alone is not enough.
Now I'll explain how to accomplish the task of loading up your object with the information in the XML using a fully object oriented approach to data handling. Customer, Order and LineItem are derived from XMLObject. They must implement 1 virtual method called MapXMLTagsToMembers() that would look like this:
void Customer::MapXMLTagsToMembers()
{
    MapMember(&m_OrderList, Order::GetStaticTag());
    MapAttribute(&m_nCustomerID, "id");
    MapMember(&m_strName, "Name");
}
void Order::MapXMLTagsToMembers()
{
    MapAttribute(&m_nOrderID, "id");
    MapMember(&m_LineItemList, LineItem::GetStaticTag());
    MapMember(&m_strDate, "Date");
}
void LineItem::MapXMLTagsToMembers()
{
    MapAttribute(&m_nLineItemID, "id");
    MapMember(&m_strDesc, "Desc");
    MapMember(&m_strPrice, "Price");
}
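To give a feel for why one mapping method can replace hand-written traversal code, here is a deliberately tiny sketch of the idea: each mapped tag is stored next to the address of the member it fills, and incoming tag/value pairs are dispatched through that table. This is an assumption-laden illustration using the standard library, not how XMLFoundation is actually implemented; MiniMappedObject, MiniLineItem and Assign are hypothetical names, and the sketch does not distinguish attributes from elements the way MapAttribute() and MapMember() do.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical miniature of the mapping idea; not the XMLFoundation classes.
class MiniMappedObject
{
    // Exactly one of the two pointers is set for each mapped tag.
    struct Slot { std::string *pStr; int *pInt; };
    std::map<std::string, Slot> m_map;
protected:
    void MapMember(std::string *p, const char *pzTag) { Slot s = { p, 0 }; m_map[pzTag] = s; }
    void MapMember(int *p, const char *pzTag)         { Slot s = { 0, p }; m_map[pzTag] = s; }
public:
    virtual ~MiniMappedObject() {}
    virtual void MapXMLTagsToMembers() = 0;

    // Would be called once per tag/value pair by a tokenizer (not shown).
    // Returns false when the tag has no mapped member.
    bool Assign(const char *pzTag, const char *pzValue)
    {
        if (m_map.empty())
            MapXMLTagsToMembers();          // build the table on first use
        std::map<std::string, Slot>::iterator it = m_map.find(pzTag);
        if (it == m_map.end())
            return false;
        if (it->second.pStr)
            *it->second.pStr = pzValue;     // string member: copy the text
        else
            *it->second.pInt = std::atoi(pzValue);  // int member: convert
        return true;
    }
};

// Mirrors the LineItem mapping above, using the miniature base class.
class MiniLineItem : public MiniMappedObject
{
public:
    int m_nLineItemID;
    std::string m_strDesc;
    std::string m_strPrice;
    MiniLineItem() : m_nLineItemID(0) {}
    virtual void MapXMLTagsToMembers()
    {
        MapMember(&m_nLineItemID, "id");
        MapMember(&m_strDesc, "Desc");
        MapMember(&m_strPrice, "Price");
    }
};
```

Because the table holds member addresses, values can be written straight into the object as tokens arrive, with no intermediate tree to walk afterwards; the same table could also be read in the other direction to generate XML.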
Now all the object assignment and creation code is summed up into this one line.
Customers.FromXML( pzXML )
If you had a trace statement in the constructor of the Order, you would see that it was called for every appearance of an Order in the XML.
Now suppose you wanted to manipulate some member variables then regenerate the XML: just assign your member variables normally then regenerate your XML - that's just as easy.
const char *pzXML = Order.ToXML();
MapXMLTagsToMembers() defines everything needed for your objects to read or write XML as a base method. Without the XMLFoundation, you would have to code all that looping and mapping twice if you wanted both reading and writing XML. Without the XMLFoundation, you will also have a larger maintenance issue if any XML document structure changes, because you'll have to hunt through your looping and recursion routines to find the Element name to change. XMLFoundation provides countless other niceties, such as mapping any number of XML tags to the same member, and conditional inclusion of members in the output XML based on tag name or on member state, such as DIRTY indicating that the member was updated, so that ToXML() generates a delta of the data rather than the entire set. You can specify element order or have them output alphabetically. These are common needs that can each be accomplished in 1 line of code rather than pages of code.
-------------------------------------------------------
MapMember(&m_nVersion,"VersionNumber");
MapMember(&m_nVersion,"ProtocolVersion");
SetMemberSerialize("VersionNumber", false );
void AddAttribute( const char * pzName, const char * pzValue, int nUpdate=0 );
It's fun to compare the differences between DOM and XMLFoundation, but much of the functionality in the XMLFoundation cannot be compared to anything in DOM. For example, the XMLFoundation maintains a bit flag field for each member that it manages. These are the values that can be managed:
#define DATA_DIRTY 0x01
#define DATA_NOT_NULL 0x02
#define DATA_CACHED 0x04
#define DATA_NULL 0x08
#define DATA_SERIALIZE 0x10
The following interface uses some of the member state flags:
bool setMemberDirty(void *pAddressOfMemberToSet, int bDirty = 1);
bool setMemberDirty(char *pzTagNameOfMemberToSet, int bDirty = 1);
bool isMemberDirty(void *pAddressOfMemberToCheck);
bool isMemberDirty(char *pzTagNameOfMemberToCheck);
bool isMemberNull(void *pAddressOfMemberToCheck);
bool isMemberNull(char *pzTagNameOfMemberToCheck);
bool isMemberCached(void *pAddressOfMemberToCheck);
bool isMemberCached(char *pzTagNameOfMemberToCheck);
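The flag interface above can be pictured as plain bit operations on a per-member state byte. The following is a minimal, self-contained sketch, not the XMLFoundation implementation: the `MemberState` struct, its method names, and `includeInDelta()` are hypothetical, and only the `DATA_*` values are taken from the list above.

```cpp
#include <cassert>

// Flag values copied from the list above.
#define DATA_DIRTY     0x01
#define DATA_NOT_NULL  0x02
#define DATA_CACHED    0x04
#define DATA_NULL      0x08
#define DATA_SERIALIZE 0x10

// Hypothetical per-member state holder (illustration only).
struct MemberState
{
    unsigned char flags = DATA_NULL | DATA_SERIALIZE; // starts unassigned

    void setDirty(bool dirty)
    {
        if (dirty) flags |= DATA_DIRTY;
        else       flags &= static_cast<unsigned char>(~DATA_DIRTY);
    }
    // Assigning a value clears NULL and marks the member NOT_NULL and DIRTY.
    void markAssigned()
    {
        flags &= static_cast<unsigned char>(~DATA_NULL);
        flags |= DATA_NOT_NULL | DATA_DIRTY;
    }
    bool isDirty() const { return (flags & DATA_DIRTY) != 0; }
    bool isNull()  const { return (flags & DATA_NULL)  != 0; }
};

// A delta writer would emit only the members whose state is dirty.
inline bool includeInDelta(const MemberState &m) { return m.isDirty(); }

// Usage: a fresh member is null and clean until a value is assigned.
inline void memberStateExample()
{
    MemberState m;
    assert(m.isNull() && !includeInDelta(m));
    m.markAssigned();
    assert(!m.isNull() && includeInDelta(m));
}
```

Keeping the state in a few bits per member is what makes a "delta only" serialization cheap: deciding whether to emit a member is a single mask test.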
XMLFoundation also has many options available during the creation of the XML. DOM has nothing that compares.
#define ORDER_MEMBERS_ALPHABETICALLY 0x01
#define RECURSE_OBJECTS_DEEP 0x02
#define INCLUDE_ALL_CACHED_MEMBERS 0x04
#define EXCLUDE_SHORT_TERMINATION 0x08
#define EXCLUDE_MAPPED_ATTRIBUTES 0x10
#define EXCLUDE_UNMAPPED_ATTRIBUTES 0x20
#define INCLUDE_DOCTYPE_DECLARATION 0x40
#define FULL_SERIALIZE 0x80
#define USE_OBJECT_MARKERS 0x100
#define NO_WHITESPACE 0x200
#define NO_EMPTY_STRINGS 0x400
It also has a SAX-like (but faster and far simpler) way to subscribe to notifications.
virtual MemberDescriptor *HandleUnmappedMember( const char *pzTag );
virtual void *ObjectMessage( int nCase, char *pzArg1,
char *pzArg2, unsigned int nArg3, void *pArg4)
The approach used by XMLFoundation is faster than SAX. Since the object factory and the XML tokenizer were built for each other, they do some unusual tricks for each other. The tokenizer uses a unique approach to begin with. It's purely pointer based. Tokens are structures that point into the source XML, except for entities that get expanded into a special memory region. Tokens do not hold copies of any data. During object factorization, it becomes necessary to have the token data in a null terminated string format. The big performance boosting hack is that to obtain null terminated strings, the tokenizer actually plunks a null down over the first byte past the end of the token data. It keeps track of the data it clobbers and restores it before parsing out the next token. There are no event calls that needlessly push data on the stack just to immediately pop it back off. Performance profilers showed that call stack pushes and pops were the single largest consumer of CPU cycles in the tokenization process. XMLFoundation eliminates them by "pulling" the data through a call to [void getToken(token *tok)], rather than the SAX approach that gets the data "pushed" into the application events with between 2 and 7 arguments depending on the token type. SAX would be the fastest approach if the XMLFoundation did not exist. The XMLFoundation is the only XML parser that uses this approach. It is non-standard, and not in compliance with W3C interfaces to an XML Parser. For our uses, it's better than any W3C standard.
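The null-termination trick described above can be sketched in a few lines. This is a self-contained illustration under stated assumptions, not the XMLFoundation tokenizer: the `Token`, `ScopedTerminator`, and `tokenEquals` names are hypothetical, and entity expansion is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A token points into the source buffer; it owns no copy of the data.
struct Token
{
    char       *start;   // first byte of the token inside the XML buffer
    std::size_t length;  // number of bytes in the token
};

// Temporarily writes a '\0' over the first byte past the token and
// restores the clobbered byte when it goes out of scope.
class ScopedTerminator
{
public:
    explicit ScopedTerminator(const Token &t)
        : m_pos(t.start + t.length), m_saved(*m_pos)
    {
        *m_pos = '\0';
    }
    ~ScopedTerminator() { *m_pos = m_saved; }
    ScopedTerminator(const ScopedTerminator &) = delete;
    ScopedTerminator &operator=(const ScopedTerminator &) = delete;
private:
    char *m_pos;
    char  m_saved;
};

// Compares the token text with pzExpected using ordinary C string
// functions; no copy of the token data is ever made.
inline bool tokenEquals(const Token &t, const char *pzExpected)
{
    ScopedTerminator guard(t);
    return std::strcmp(t.start, pzExpected) == 0;
}

// Usage: the buffer is unchanged after the comparison.
inline void tokenExample()
{
    char xml[] = "<Name>Bob</Name>";
    Token text{xml + 6, 3};                       // points at "Bob"
    assert(tokenEquals(text, "Bob"));
    assert(std::strcmp(xml, "<Name>Bob</Name>") == 0);
}
```

The sketch requires a writable buffer, and the byte after the last token must still belong to the buffer (the terminating null of the document serves that purpose).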
I realize that the vast majority of people who use XMLFoundation would never care about these grungy technical details. To say that it is very fast is enough for most people, but I am also writing to the people at the Apache Foundation, and Microsoft, and IBM, W3C, and the many other people who have built their own XML Parser implementations. Fast is an understatement. Performance is a prevailing design pattern found throughout the XMLFoundation. For example, the XMLObject class is carefully designed to add minimal CPU cycles during construction because it is to the XMLFoundation what CObject is to MFC. It has been carefully designed to add minimal entries to the virtual method table. In many cases, virtual calls were consolidated for that purpose.
The Object Factory is the part of the XMLFoundation that instantiates objects for you based on certain element tags in the source XML. It is based on the same principle as DECLARE_DYNCREATE() that allows MFC to instantiate CView derived classes for you. In the XMLFoundation, it is called DECLARE_FACTORY(). The XMLFoundation uses this macro to instantiate COM and CORBA objects as well.
Every object that derives from XMLObject must have 1 macro in the class definition, the DECLARE macro, normally in your .h source file. It must also have one macro at global scope, often in the .cpp file matching the .h file, or you may choose to consolidate all of your IMPLEMENT macros in a single .cpp file. These macros supply the XML tag and 'this' object's name, aka the class name. Terminology Note: Within the XMLFoundation, the term 'tag' means 'Element Name' and sometimes 'Attribute Name'.
These macros write a method that returns new instances of 'this' object type. The address of this global static function is stored in a structure keyed by tag name. As the tags are encountered during the XML parsing, objects are created to contain the data that is expected to follow.
If a tag is mapped to an object in a list or tree structure, then every time that tag is encountered at the level it is mapped, it will create a new instance for you and put it in the data structure (list, tree, etc.) you specified with all its member variables already assigned from the source XML as you have them mapped.
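The factory mechanism described above, a creation function stored in a structure keyed by tag name, can be sketched generically. This is a self-contained illustration and not what DECLARE_FACTORY or IMPLEMENT_FACTORY actually expand to: `Node`, `Registrar`, `registry()`, and `createForTag()` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an XML-aware base class.
struct Node
{
    virtual ~Node() = default;
    virtual const char *tag() const = 0;
};

using Factory = std::unique_ptr<Node> (*)();

// Registry of factory functions keyed by element tag name.
inline std::map<std::string, Factory> &registry()
{
    static std::map<std::string, Factory> r;
    return r;
}

// Registers a factory during static initialization, playing the role
// that a global-scope IMPLEMENT macro plays in the text above.
struct Registrar
{
    Registrar(const char *pzTag, Factory f) { registry()[pzTag] = f; }
};

struct Order : Node
{
    const char *tag() const override { return "Order"; }
    static std::unique_ptr<Node> create() { return std::make_unique<Order>(); }
};
static Registrar g_registerOrder("Order", &Order::create);

// Called by a parser each time it meets a start tag: returns a new
// instance when a factory is registered for that tag, otherwise null.
inline std::unique_ptr<Node> createForTag(const std::string &tag)
{
    auto it = registry().find(tag);
    if (it == registry().end())
        return nullptr;
    return it->second();
}

// Usage: two <Order> tags yield two distinct Order instances.
inline void factoryExample()
{
    auto first  = createForTag("Order");
    auto second = createForTag("Order");
    assert(first && second && first.get() != second.get());
    assert(!createForTag("Unknown"));
}
```

A parser built on such a registry never needs to know the concrete classes; adding a new object type only means registering one more tag.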
XMLFoundation has support for mapping to all native C++ data types. It also has support for mapping into data container objects. It has specific support for RWCString, CString, and GString, and it's very easy to add support for others by deriving from the class "StringAbstraction" and supplying the pure virtual methods that will enable any kind of data container class to interoperate with the Object Factory for automatic member assignments. These are the MapMember methods in XMLObject:
void MapMember(bool *pValue,const char *pTag, int nBoolReadability = 1);
void MapMember(char *pValue,const char *pTag);
void MapMember(short *pValue,const char *pTag, const char *pzTranslationMapIn = 0, ...
void MapMember(int *pValue,const char *pTag, const char *pzTranslationMapIn = 0, ...
void MapMember(long *pValue,const char *pTag, const char *pzTranslationMapIn = 0, ...
void MapMember(__int64 *pValue, const char *pTag, const char *pzTranslationMapIn = 0 ...
void MapMember(double *pValue, const char *pTag);
void MapMember(char *pValue,const char *pTag,int nMaxLen,
const char *pzTranslationMapIn = 0, ...
void MapMember(GString *pValue,const char *pTag, const char *pzTranslationMapIn = 0, ...
void MapMember(void *pValue,const char *pTag,StringAbstraction *pHandler, ... );
void MapMember(GHash *pDataStructure,
const char *pzObjectName,const char *pNestedInTag = 0);
void MapMember(GBTree *pDataStructure,const char *pzObjectName,
const char *pNestedInTag = 0);
void MapMember(GQSortArray *pDataStructure,const char *pzObjectName,
const char *pNestedInTag = 0);
void MapMember(void *pDataStructure,KeyedDataStructureAbstraction *pHandler,
char *pzObjectName, ...
void MapMember(GStringList *pStringCollection,
const char *pzElementName,const char *pNestedInTag=0, ...
void MapMember(void *pStringCollection, char *pzElementName,
StringCollectionAbstraction *pHandler,...
void MapMember(GArray *pIntegerArray, const char *pzElementName,
const char *pNestedInTag = 0, ...
void MapMember (void *pIntegerArray, const char *pzElementName,
IntegerArrayAbstraction *pHandler, ...
void MapMember(void *pList, char *pObjectTag,ListAbstraction *pHandler,
const char *pNestedInTag=0, ...
void MapMember(GList *pList,char *pObjectTag,const char *pNestedInTag = 0,
ObjectFactory pFactory=0);
void MapMember(XMLObject *pObj, const char *pDefaultTagOverride = 0,
const char *pzWrapper = 0 );
void MapMember(XMLObject **pObj,const char *pzTag,
const char *pNestedInTag= 0,ObjectFactory pFactory=0);
The following code can be found in the example programs. Inheritance of XML maps works intuitively and enables you to organize and manage your code efficiently.
class CMatter : public XMLObject
{
public:
GString m_strWeight;
virtual void MapXMLTagsToMembers()
{
MapMember(&m_strWeight, "Weight");
}
DECLARE_FACTORY(CMatter, Matter)
CMatter(){}
~CMatter(){};
};
IMPLEMENT_FACTORY(CMatter, Matter)
class CLife : public CMatter
{
public:
GString m_strDNA;
virtual void MapXMLTagsToMembers()
{
MapMember( &m_strDNA, "DNA");
CMatter::MapXMLTagsToMembers();
}
DECLARE_FACTORY(CLife, Life)
CLife(){ }
~CLife(){};
};
IMPLEMENT_FACTORY(CLife, Life)
class CHuman : public CLife
{
public:
GString m_strFingerPrint;
GString m_strGender;
virtual void MapXMLTagsToMembers()
{
MapMember(&m_strFingerPrint,"FingerPrint");
MapMember(&m_strGender,"Gender");
CLife::MapXMLTagsToMembers();
}
DECLARE_FACTORY(CHuman, Human)
CHuman(){}
~CHuman(){};
};
IMPLEMENT_FACTORY(CHuman, Human)
char pzXML3[] =
"<Human>"
"<Gender>Male</Gender>"
"<DNA>1101010001010101101011000010101010</DNA>"
"<FingerPrint>Unique</FingerPrint>"
"<Weight>777</Weight>"
"</Human>";
void Main()
{
CHuman O;
O.FromXMLX(pzXML3);
GString strDebug;
strDebug << "\n\n\nGender:" << O.m_strGender << " FingerPrint:"
<< O.m_strFingerPrint << "\n" << "DNA:" << O.m_strDNA
<< " Weight:" << O.m_strWeight << "\n\n";
printf(strDebug);
printf(O.ToXML());
CLife life;
life.FromXML(pzXML3);
strDebug.Empty();
strDebug << "\n\nDNA:" << life.m_strDNA << " "
<< "Weight:" << life.m_strWeight << "\n\n";
printf(strDebug);
printf(life.ToXML());
}
So, for example, you may create an object CPlant that, like CHuman, is derived from CLife. A CPlant would contain the elements of CLife (DNA) and of CMatter (Weight) by inheritance.
If each XML message represents a transaction, it is wise to map the commonalities of all transactions, or groups of transactions into a base class that allows derivatives to inherit the base elements of the transaction that will only be maintained in one place.
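The way a derived class extends its parent's map can be shown without the library. The following is a self-contained miniature, not XMLFoundation code: `MiniObject`, `mapMember()`, `assign()`, and the "Species" tag are hypothetical, and only string members are handled. It models the CPlant case mentioned above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical miniature of a tag-to-member mapper (illustration only).
class MiniObject
{
public:
    virtual ~MiniObject() = default;

    // Assigns value to the member mapped to tag; false if the tag is unmapped.
    bool assign(const std::string &tag, const std::string &value)
    {
        if (m_map.empty()) mapMembers();   // build the map on first use
        auto it = m_map.find(tag);
        if (it == m_map.end()) return false;
        *it->second = value;
        return true;
    }
protected:
    virtual void mapMembers() = 0;
    void mapMember(std::string *pMember, const char *pzTag) { m_map[pzTag] = pMember; }
private:
    std::map<std::string, std::string *> m_map;
};

struct MiniMatter : MiniObject
{
    std::string weight;
protected:
    void mapMembers() override { mapMember(&weight, "Weight"); }
};

struct MiniLife : MiniMatter
{
    std::string dna;
protected:
    void mapMembers() override
    {
        mapMember(&dna, "DNA");
        MiniMatter::mapMembers();          // inherit the base mappings
    }
};

// The plant adds one tag of its own and inherits DNA and Weight.
struct MiniPlant : MiniLife
{
    std::string species;
protected:
    void mapMembers() override
    {
        mapMember(&species, "Species");
        MiniLife::mapMembers();
    }
};

// Usage: inherited tags are assignable, unrelated tags are rejected.
inline void plantExample()
{
    MiniPlant p;
    assert(p.assign("DNA", "1101") && p.assign("Weight", "3"));
    assert(!p.assign("Gender", "Male"));
}
```

The same shape applies to transactions: the shared header fields live in one base class map, and each transaction type only maps what is specific to it.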
By using the XMLFoundation, you inherit some powerful navigation features that can be used to help you debug your application with the Dump() member. Because the factory manages all the object relationships, a new kind of object navigation arises: objects know their creators so an "Order" can know at runtime if it resides inside a list in a "Customer", or some other kind of object, or if it is not contained by another object at all. This is what a full Dump() output looks like:
----------------------------------------------------------------------------------
Object Dump My comments
----------------------------------------------------------------------------------
Object Instance name: MyOrder Dump of Order Object
{
string OID =
string UpdateTime =
References = 1
--------------------------------
Type :string
Tag :OrderDate
Value :1776-07-04 The Order Date is July 4 1776
State :(Clean | Valid | Cached)
Kind :Element
--------------------------------
Type :string
Tag :ShippedDate
Value :2010-07-04 The Ship Date is July 4 2010
State :(Clean | Valid | Cached)
Kind :Element
--------------------------------
Type :List<XMLObject *>
Tag :LineItem contains a list of 3 LineItem objects
Contains:3 items
Object Instance name: MyOrderLineItem
{
string OID = 1121.0000 The 1st begins here
string UpdateTime =
References = 26
--------------------------------
Type :string
Tag :Description
Value : Description is empty
State :(Clean | Null | Uncached) here we can see that it was never
assigned, it was not set to ""
Kind :Element
--------------------------------
Type :int
Tag :ProductID
Value :11 ProductID is 11
State :(Clean | Valid | Cached)
Kind :Element
--------------------------------
Type :string
Tag :UnitPrice
Value :21.0000 Unit Price is 21.0000
State :(Clean | Valid | Cached)
Kind :Element
}
Object Instance name:
MyOrderLineItem <--- here begins the 2nd of 3 line items
{
string OID = 332.5000
string UpdateTime =
References = 21
--------------------------------
Type :string
Tag :Description
Value :
State :(Clean | Null | Uncached)
Kind :Element
--------------------------------
Type :int
Tag :ProductID
Value :33
State :(Clean | Valid | Cached)
Kind :Element
--------------------------------
Type :string
Tag :UnitPrice
Value :2.5000
State :(Clean | Valid | Cached)
Kind :Element
}
Object Instance name: MyOrderLineItem
{
string OID = 7234.8000
string UpdateTime =
References = 23
--------------------------------
Type :string
Tag :Description
Value :
State :(Clean | Null | Uncached)
Kind :Element
--------------------------------
Type :int
Tag :ProductID
Value :72
State :(Clean | Valid | Cached)
Kind :Element
--------------------------------
Type :string
Tag :UnitPrice
Value :34.8000
State :(Clean | Valid | Cached)
Kind :Element
}
}
This is an example of what is involved to get XML to the GUI. The XML is somewhat complex to show how simple the code will be. The XML is a "Customer" with a list of "Orders" where each order has a list of "LineItems". This is the XML:
<Customer>
<ContactName>New Dude</ContactName>
<City>Antioch</City>
<Country>All of them</Country>
<Order>
<ShippedDate>1997-09-02</ShippedDate>
<OrderDate>1997-08-25</OrderDate>
<LineItem>
<UnitPrice>45.6000</UnitPrice>
<ProductID>28</ProductID>
<Description/>
</LineItem>
<LineItem>
<UnitPrice>18.0000</UnitPrice>
<ProductID>39</ProductID>
<Description/>
</LineItem>
</Order>
<Order>
<ShippedDate>Futuristic</ShippedDate>
<OrderDate>Tomorrow</OrderDate>
<LineItem>
<UnitPrice>1234567.77</UnitPrice>
<ProductID>1234567</ProductID>
<Description/>
</LineItem>
</Order>
</Customer>
Notice that the XMLFoundation will parse directly into the CStrings that are already DDX bound to MFC's UpdateData(). This is accomplished through Multiple Inheritance. Our Dialog class derives from both MFC's CDialog and XMLFoundation's XMLObject.
The sample application reads XML and displays it in the GUI where it can be changed by the user, then saved back out to XML that reflects the users changes.
The complete code for this example is in "XMLDialog", but for the purpose of understanding what it takes to integrate XMLFoundation with an MFC Dialog, this shows you ALL the code of interest.
#include "xmlObject.h"
#include "GList.h"
class CXMLDialogDlg : public CDialog, public XMLObject
{
GList m_lstOrders;
virtual void MapXMLTagsToMembers();
virtual void *ObjectMessage( int nCase, char *pzArg1, char *pzArg2,
unsigned int nArg3 = 0, void *pArg4 = 0 );
DECLARE_FACTORY(CXMLDialogDlg, Customer);
CString m_strCity;
CString m_strCountry;
CString m_strName;
CString m_strRichEditXML;
};
IMPLEMENT_FACTORY(CXMLDialogDlg, Customer)
void CXMLDialogDlg::MapXMLTagsToMembers()
{
MapMember(&m_strName, "ContactName", &gC);
MapMember(&m_strCity, "City", &gC);
MapMember(&m_strCountry, "Country", &gC);
MapMember(&m_lstOrders, MyOrder::GetStaticTag(), &gGListHandler, 0 );
}
void CXMLDialogDlg::OnBtnMakeXML()
{
UpdateData(TRUE);
m_strRichEditXML = ToXML();
UpdateData(FALSE);
}
void CXMLDialogDlg::OnBtnLoadGUI()
{
FromXML(m_strRichEditXML);
UpdateData(FALSE);
}
void *CXMLDialogDlg::ObjectMessage( int nCase, char *pzArg1,
char *pzArg2, unsigned int nArg3, void *pArg4 )
{
if(nCase == MSG_SUBOBJECT_UPDATE)
{
MyOrder *pO = (MyOrder *)pArg4;
int nItemIndex = m_List.InsertItem(LVIF_TEXT|LVIF_PARAM, 0,
pO->m_strOrderDate,
0, 0, 0, (LPARAM)pO);
m_List.SetItemText(nItemIndex, 1, pO->m_strShippedDate);
}
return 0;
}
The XMLFoundation was designed and built for CORBA before it ever added any support for MFC. If you have a pre-existing CORBA system that needs some XML tools, you have come to the right place. If you are building a new CORBA system, this is the best tool available for XML support.
If you have read this document all the way to this point, then you will likely understand how the XMLFoundation works for CORBA from this tiny piece of code:
class CustomerImpl : public virtual CustomerBOAImpl, public virtual XMLObject
Along with the IMPLEMENT_ORB_FACTORY() macro defined in XMLObject.h, this is how CORBA can natively support FromXML() and ToXML() by using the XMLFoundation. The Object Factory can instantiate your interface objects for you based on the XML.
CORBA implementations can be done in Java or C++. The XMLFoundation supports both. CORBA breaks down the language barrier allowing Java applications to easily, and natively deal with C++ objects. This example details the creation of C++ CORBA objects - The Java implementation is nearly identical further blurring the lines between Java/C++ within the same project.
The C++ CORBA implementation will bridge into J2EE Application servers everywhere, it will work for any ORB but a few of the most popular ones have been tested, and the makefiles are included with the CORBA sample that ships with the XMLFoundation. The three makefiles included are for:
- Borland/Enterprise Studio - Visibroker
- IONA/iPortal Enterprise - Orbix
- BEA/Weblogic Enterprise - ObjectBroker (works great with Tuxedo implementations)
This example extends the ORB to provide native XML accessors. The sample CORBA application is based around 1 very simple object type. It has a unique integer we call a CustomerID and a string we call a CustomerName. Each customer may contain 0 to n references to another object of the same type as itself, a MyCORBAObject. This would model something like a list of Customers that were referred by 'this' customer.
The IDL Looks Like This
module ExCORBA
{
interface MyCORBAObject
{
void getXMLState(out string s);
void setXMLState(in string s);
void setState(in string s, in long l);
void addSubObject(in string s, in long l);
void delSubObjects();
MyCORBAObject getSubObjectIOR(in long l);
void dumpState(out string s);
};
};
Follow this 12 Step Program
This is a very simple application. The client application makes 12 calls to the server. Every even numbered call is exactly the same - it is a call to getXMLState() to see what's going on in the server. The client obtains an initial IOR from a server serialized IOR upon server startup.
Step 1 - Assign some state in a native CORBA call. This is a typical CORBA data assignment operation. Two values are set in the object. The client assigns two members on the server. The code looks like this on the client:
CustObject1->setState("Root",777);
Step 2 - Remember - every even numbered call in this 12 step program is exactly the same
CORBA::String_var s;
CustObject1->getXMLState(s);
Step 2 (behind the scenes) This is how getXMLState() is implemented on the CORBA server:
void ExCORBAImpl::getXMLState( CORBA::String_out s)
{
const char *p = ToXML();
s = CORBA::string_dup(p);
}
This is the XML sent from the server to the client as the result from step 2.
<MyCORBAImpl>
<CustomerID>777</CustomerID>
<CustomerName>Root</CustomerName>
</MyCORBAImpl>
The tag names are configured by the ExCORBAImpl object like this:
void ExCORBAImpl::MapXMLTagsToMembers()
{
MapMember(&_nCustID, "CustomerID");
MapMember(&_strCustName, "CustomerName",&gGenericStrHandler);
MapMember(&m_lstCMyImplObjs, "MyCORBAImpl",&gGListHandler,0);
}
Step 3 - Update the state of the object through XML. Step 1 used a typical CORBA object accessor to assign the state. Step 3 accomplishes the same through XML. All CORBA objects based on XMLObject support this member assignment style. This is the code on the client:
CustObject1->setXMLState("<MyCORBAImpl><CustomerName>SuperUser</CustomerName></MyCORBAImpl>");
On the server, the implementation is extremely simple - it looks like this:
void ExCORBAImpl::setXMLState( const char* pzXML )
{
FromXML( pzXML );
}
Step 4 - Just like step 2, call getXMLState() and this is the result:
<MyCORBAImpl>
<CustomerID>777</CustomerID>
<CustomerName>SuperUser</CustomerName>
</MyCORBAImpl>
Step 5 - Add CORBA Sub-Objects through XML. Step 5 is a lot like step 3 where we updated the name "Root" to "SuperUser" through an XML assignment. This time, we'll add an object reference. This is a large concept to fully realize and apply - the XML creates CORBA objects.
The client code looks like this:
CustObject1->setXMLState(
"<MyCORBAImpl>"
"<MyCORBAImpl>"
"<CustomerID>123</CustomerID>"
"<CustomerName>Al Gore</CustomerName>"
"</MyCORBAImpl>"
"<MyCORBAImpl>"
"<CustomerID>456</CustomerID>"
"<CustomerName>George Bush Jr.</CustomerName>"
"</MyCORBAImpl>"
"</MyCORBAImpl>");
Even though we are creating new CORBA objects, we write NO SPECIAL CODE on the server.
We are calling the same implementation on the server that we called in Step 3:
void ExCORBAImpl::setXMLState( const char* pzXML )
{
FromXML( pzXML );
}
Step 6 - Just like step 2, call getXMLState() and this is the result:
<MyCORBAImpl>
<CustomerID>777</CustomerID>
<CustomerName>SuperUser</CustomerName>
<MyCORBAImpl>
<CustomerID>123</CustomerID>
<CustomerName>Al Gore</CustomerName>
</MyCORBAImpl>
<MyCORBAImpl>
<CustomerID>456</CustomerID>
<CustomerName>George Bush Jr.</CustomerName>
</MyCORBAImpl>
</MyCORBAImpl>
Step 7 - Get a CORBA object reference for object instance 456, an object created by the XML. On the client, the code looks like this:
ExCORBA::MyCORBAObject_var CustObject2;
CustObject2 = CustObject1->getSubObjectIOR(456);
and on the server, we walk the list of objects and return the first one that matches the supplied CustomerID like this:
ExCORBA::MyCORBAObject_ptr ExCORBAImpl::getSubObjectIOR(CORBA::Long CustomerID)
{
GListIterator it(&m_lstCMyImplObjs);
while(it())
{
XMLObject *pO = (XMLObject *)it++;
ExCORBAImpl *pIO = (ExCORBAImpl *)pO->GetInterfaceObject();
if (pIO->GetCustomerID() == CustomerID)
{
return pIO->_this();
}
}
return 0;
}
Step 8 - Exactly like steps 2, 4, and 6, EXCEPT we are calling getXMLState() on the object reference returned by step 7.
<MyCORBAImpl>
<CustomerID>456</CustomerID>
<CustomerName>George Bush Jr.</CustomerName>
</MyCORBAImpl>
Step 9 - Add a Sub-Object without using XML. In the same way that we used a traditional member assignment in step 1, we can create a new object reference to demonstrate the two models seamlessly working together. On the client:
CustObject1->addSubObject("Michelangelo",1475);
and on the server, the code looks like this:
void ExCORBAImpl::addSubObject( const char* s, CORBA::Long l )
{
ExCORBAImpl *p = new ExCORBAImpl;
p->_nCustID = l;
p->_strCustName = s;
m_lstCMyImplObjs.AddLast((XMLObject *)p);
}
Step 10 - Get an object reference to the object created in step 9 and display its state in XML. This is the client code:
CustObject2 = CustObject1->getSubObjectIOR(1475);
CustObject2->getXMLState(s);
and the result is:
<MyCORBAImpl>
<CustomerID>1475</CustomerID>
<CustomerName>Michelangelo</CustomerName>
</MyCORBAImpl>
Step 11 - Deleting Sub-objects. All objects, no matter how they were created, are destroyed the same way. The list contains both Factory-created objects and objects created the traditional way. Once again, this shows how seamlessly the ORB fits together with the XMLFoundation's Object Factory: the CORBA Implementation/Interface and the XMLObject are one and the same. This cleans up the whole mess.
CustObject1->delSubObjects();
on the server:
void ExCORBAImpl::delSubObjects() IT_THROW_DECL((CORBA::SystemException))
{
GListIterator it(&m_lstCMyImplObjs);
while(it())
{
XMLObject *pO = (XMLObject *)it++;
pO->DecRef();
}
m_lstCMyImplObjs.RemoveAll();
}
Step 12 - To see that step 11 worked, view the XML state like we did in 2, 4, 6, 8, & 10. Now all the contained objects are gone, and "SuperUser" is alone.
<MyCORBAImpl>
<CustomerID>777</CustomerID>
<CustomerName>SuperUser</CustomerName>
</MyCORBAImpl>
Create a basic ATL COM project with Visual Studio.
Visual Studio will write your IDL and implementation header files. The following code sample is the standard implementation header file with the addition of deriving from public XMLObject, the DECLARE_FACTORY macro, and MapXMLTagsToMembers.
class ATL_NO_VTABLE CAddress :
public CComObjectRootEx<CComSingleThreadModel>,
public CComCoClass<CAddress, &CLSID_Address>,
public IDispatchImpl<IAddress, &IID_IAddress,
&LIBID_ATLExample2012Lib, 1, 0>,
public XMLObject
{
void MapXMLTagsToMembers(){};
public:
DECLARE_FACTORY(CAddress, Address)
In your implementation file, you'll need to add the macro at a global scope and implement MapXMLTagsToMembers() to define the Object to XML mappings.
This example maps an integer, a string, and a list of COM objects.
void CMyATLObj::MapXMLTagsToMembers()
{
MapObjectID("CustomerID",1);
MapMember(&m_nInteger, "CustomerID");
MapMember(&m_strString, "CustomerName", &gGenericStrHandler);
MapMember(&m_lstCMyATLObj, CMyATLObj::GetStaticTag(),&gGListHandler,0);
}
The ExATLCOM sample application builds under VC6 and the ATLExample2012 builds under newer versions of Visual Studio. They both implement COM in a way that makes it NATIVE. You will see the additional methods that have been added to the COM Object, most notably put_XMLState(), which has the ability to assign member variables and create COM objects when supplied well-formed XML as input.
STDMETHODIMP CMyATLObj::put_XMLState(BSTR newVal)
{
_bstr_t b(newVal);
FromXML((const char *)b);
return S_OK;
}
XMLFoundation has been serving up the XML related needs of the application layer for nearly a decade. It has been used to build a wide variety of application types. A common recurring need in the application layer has to do with "data updates". Any application that receives XML updates might consider the performance advantages and reduction in development labor gained by using the XMLFoundation to solve the problem. For example, suppose you had some large dump of XML data. In your application layer, you need to quickly access individual pieces of that information. In just a few lines of code, the XML can be mapped to a keyed data structure for fast indexed reads by your application. If the initial XML dataset was 100+ million records, you will want to provide updates to your indexed information rather than rebuilding the entire index. You could write the code to search for the data to update, or allow the XMLFoundation to manage it for you. Another common example in distributed systems: Data is often cached at a middle tier or in the application itself. Efficiently designed systems only update an "Address" rather than a whole "Customer" and all his "Orders" when an "Address" changes. XMLFoundation can greatly simplify this task.
At the core of caching is something XMLFoundation calls the OID, or Object ID. It is a unique key to the object, and any object that participates in XMLFoundation caching must have one. The definition of the OID can come from 2 places. It can be defined in the XML data, or it can be defined by the object that mapped the data. This is how a "MyOrderLineItem" object might define the OID. It uses a combination of the "ProductID" and "UnitPrice", so in this example a price change constitutes a different object. Normally an OID has a direct correlation to DBMS indexes in properly normalized data. ObjectIDs can be made to work well over poor data models too. This example code uses two XML Elements ("ProductID" and "UnitPrice") to build the unique object ID. MapObjectID() also allows you to use Attributes to define the OID.
void MyOrderLineItem::MapXMLTagsToMembers()
{
MapObjectID("ProductID",1,"UnitPrice",1);
}
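The effect of a two-part OID on a cache can be sketched without the library. The Dump() output earlier shows OID values such as 1121.0000 for ProductID 11 and UnitPrice 21.0000, which suggests the key parts are concatenated in order; the sketch below assumes exactly that. It is not XMLFoundation code: `LineItem`, `makeOID()`, and `LineItemCache` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical line item with the two key fields used above.
struct LineItem
{
    std::string productID;
    std::string unitPrice;
    std::string description;
};

// Builds the composite key by concatenating the key fields in order,
// e.g. "11" + "21.0000" gives "1121.0000" (assumed from the Dump() output).
inline std::string makeOID(const LineItem &item)
{
    return item.productID + item.unitPrice;
}

// Cache keyed by OID: an incoming item updates the existing instance
// when its OID is already present, otherwise it is inserted as new.
class LineItemCache
{
public:
    // Returns true if an existing object was updated in place.
    bool upsert(const LineItem &item)
    {
        auto result = m_items.insert({makeOID(item), item});
        if (!result.second)
            result.first->second = item;
        return !result.second;
    }
    std::size_t size() const { return m_items.size(); }
private:
    std::map<std::string, LineItem> m_items;
};

// Usage: same ProductID and UnitPrice updates; a new price is a new object.
inline void oidExample()
{
    LineItemCache cache;
    LineItem item{"11", "21.0000", ""};
    assert(!cache.upsert(item));      // first sighting: inserted
    item.description = "Widget";
    assert(cache.upsert(item));       // same OID: updated in place
    item.unitPrice = "22.0000";
    assert(!cache.upsert(item));      // price changed: different object
}
```

Plain concatenation can collide ("1" + "121" and "11" + "21" produce the same key), so a design that controls its own key format might add a separator between the parts.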
Alternatively, the OID can be directly defined by the data itself with a special attribute named "OID", so that NO CODE needs to be written.
<MyOrderLineItem oid='777'>
<ProductID>123</ProductID>
<UnitPrice>7.77</UnitPrice>
</MyOrderLineItem>
The sample application "ObjectCache" provides over 30 test cases that detail the usage of object caching.
For the most part, objects contain members, and members are mapped to attributes and elements in the XML. That is how most XML documents are arranged, but as you will see here there are two other forms of markup for getting untagged data into the object. As a prerequisite to reading the following, you must know a little bit about what the XML specification refers to as CDATA, so go plug that into your favorite search engine if you are not familiar with unparsed data; when you pop your stack of things to do, you will find yourself ready to continue reading this:
<Thing Color='Red White and Blue'>
<![CDATA[-Object Data-=
<String>Capitol Capital G</String>
<Number>777</Number>
<Wrapper>
<StringList>one</StringList>
<StringList>two</StringList>
</Wrapper>=-More Object Data-
</Thing>
The class declaration below has Maps for all of the elements and attributes in the XML above. It makes no provisions for Object Data, Parsed or Unparsed. It only maps the "Color" attribute, the "String", the "Number", and the StringList in Wrapper.
class MyCustomObject : public XMLObject
{
public:
GString m_strString;
GString m_strColor;
int m_nInteger;
GStringList m_strList;
virtual void MapXMLTagsToMembers()
{
MapMember( &m_strList, "StringList", "Wrapper");
MapMember( &m_nInteger, "Number");
MapMember( &m_strString, "String");
MapAttribute(&m_strColor, "Color");
}
DECLARE_FACTORY(MyCustomObject, Thing)
MyCustomObject(){}
~MyCustomObject(){};
};
IMPLEMENT_FACTORY(MyCustomObject, Thing)
This is how your code will obtain this "unmapped" object data.
void ObjectDataAndCDataExample()
{
MyCustomObject O;
O.FromXMLX(pzXML);
GString *pG = O.GetCDataStorage();
printf(*pG);
printf("\n\n");
int nOffsetFromStartofXML = pG->Buf() - pzXML;
pG = O.GetObjectDataStorage();
nOffsetFromStartofXML = pG->Buf() - pzXML;
printf(*pG);
printf("\n\n");
printf(O.ToXML());
printf("\n\n");
MyCustomObject O2;
O2.m_nInteger = 777;
O2.m_strString = "G.G.G.Guru";
O2.m_strColor = "Gold, Green and White";
*(O2.GetCDataStorage()) << "x<data>x";
O2.SetObjectDataValue("-- object <data> is parsed --");
printf(O2.ToXML());
}
5Loaves is included with the XMLFoundation, but not in the foundation. It is a tool that uses XMLFoundation. It needs the XMLFoundation but the XMLFoundation does not need 5Loaves. 5Loaves is implemented in a single file called ServerCore.cpp. It is a unique piece of code and it is very simple to use. There is no header file for ServerCore.cpp and it causes no DLLs to be loaded by your application. It is a portable, properly threaded, and well written server core that can be applied to countless custom server implementations. It is a Proxy. It is an HTTP Server. It is a 'connection joiner'. ServerCore.cpp is currently used to accept TCP connections for an advanced networking product called Xfer. (Note: Xfer is not open source). The ServerCore has a good portable threading model with some unique features that allow you to limit such things as the number of connections per IP/Subnet and the number of connections per second. The 5Loaves HTTP Server works faster than IIS or Apache in some cases. It has been designed to be fast and includes unique features such as 'content caching' to serve up prebuilt HTTP headers and data from memory rather than from disk.
The Core of 5Loaves is a ground up POSIX threaded TCP server. This server template can be applied to build many types of applications that service TCP connections. The sample programs are server applications that use the 5Loaves ServerCore. One example is the 5Loaves shell console. It is a command line interpreter like the DOS prompt or a Unix shell written from scratch. It is a useful application starting point if you ever need a simple shell that has been tested in Linux, Solaris, AIX, HPUX, and Windows. There is also a Windows Service application. It contains the proper implementation for integrating 5Loaves with the Windows Service Control Manager. There is also an example program with all the source to an ActiveX implementation of 5Loaves so you can see that this server core can be embedded just about anywhere in any development language. The HTTP server does support binary plugins like ISAPI, and there is an example program that creates plugins. It's an advanced HTTP server. The following code sample shows you how to start the HTTP service from inside your own process. You can't do that with IIS or Apache.
#include "../Core/ServerCore.cpp"
// Startup profile template; the two %s placeholders receive the home
// directory and the port.
const char *pzBoundStartupConfig =
"[System]\r\n"
"Pool=20\r\n"
"ProxyPool=0\r\n"
"\r\n"
"[HTTP]\r\n"
"Enable=yes\r\n"
"Index=Index.html\r\n"
"Home=%s\r\n"
"Port=%s\r\n";
void CMyClass::StartHTTPServer(const char *pzHomeDirectory, const char *pzPort)
{
    GString strCfgData;
    strCfgData.Format(pzBoundStartupConfig, pzHomeDirectory, pzPort);
    // hand the formatted profile to the server core, then start it
    SetProfile(new GProfile(strCfgData, strCfgData.Length()));
    server_start();
}
Approach #1 - "Low Level Static Code"
Adding on to the rather trivial amount of code shown above for integrating ServerCore, you may want to develop a custom "dynamic content" server based on this HTTP server implementation that can be easily integrated into YOUR process (unlike IIS or Apache). There are several ways to go about accomplishing this task, and depending on your situation one way may be more appropriate than another. One way of implementing a custom ServerCore extension is via "Low Level Static Code". The advantage of this form of integration is that no DLLs are loaded. Another possible advantage is that no form of integration could possibly be faster. Using this approach, ServerCore will manage nothing except multi-threading the connections for you and the initial TCP network read. This can be done by adding this one line of code:
#define SERVERCORE_CUSTOM_HTTP
prior to adding this line of code:
#include "../Core/ServerCore.cpp"
You will also have to create a file called ServerCoreCustomHTTP.cpp; a sample implementation has been provided in the Servers/Core folder. To see it work in the "5Loaves" example project, add the #define SERVERCORE_CUSTOM_HTTP into the file Servers/5loaves/Console.cpp, then create a text file called 5Loaves.txt that you can place in the same folder as the binary or at "C:\\" with these contents:
[System]
Pool=20
ProxyPool=0
[HTTP]
Enable=yes
This sets the thread pool to 20, which limits your server to 20 concurrent client connections, and supports no proxy connections. You may set the thread pool to any value that you have enough hardware resources to support. Making this example work in the Windows Service is equally simple: add #define SERVERCORE_CUSTOM_HTTP into Servers/WebServerService/WebServerService.cpp (directly above the inclusion of ServerCore.cpp). This will extend the service application to run a custom 'low level - static code' extension that is implemented in ServerCoreCustomHTTP.cpp exactly like the console application.
I have two products that both use this "Low Level Static Code" approach to extending ServerCore.cpp. It is my preferred approach when building an "application" based on ServerCore.cpp because the integration is "tight" beyond the definition of "tight". You should search ServerCore.cpp for SERVERCORE_CUSTOM_HTTP to see for yourself that this is not a function call but a true inline implementation into the very first call stack frame of the servicing thread. Since the code is being added directly into the lowest level possible, there is no function call dispatch: it executes the extension without even adding a new frame on the call stack (you can't do that with IIS or Apache - much less in YOUR process space). This makes the extension code look strange because there will be no open scope { or close of scope }, and you will exit with a goto rather than a return. For example, consider this complete example found in Servers/Core/ServerCoreCustomHTTP.cpp:
if (memcmp(sockBuffer,"GET",3) == 0)
{
GString strRequest;
strRequest.SetFromUpTo(&sockBuffer[4]," ");
GString strResponse("Server Response: Hello World");
HTTPSend(td->sockfd, strResponse, strResponse.Length());
goto KEEP_ALIVE;
}
else if (memcmp(sockBuffer,"POST /",6) == 0)
{
}
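The sample handler pulls the request target out of the raw socket buffer by copying from just past "GET " up to the next space. As a minimal standalone sketch of that step, using only the standard library (ExtractGetTarget is a hypothetical helper written for illustration, not a function in ServerCore.cpp):

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not part of ServerCore.cpp): returns the request
// target of a GET request line such as "GET /index.html HTTP/1.1".
// Returns an empty string when the buffer does not start with "GET ".
std::string ExtractGetTarget(const char *buffer, std::size_t length)
{
    assert(buffer != nullptr);

    static const char kPrefix[] = "GET ";
    const std::size_t prefixLen = sizeof(kPrefix) - 1;
    if (length < prefixLen || std::memcmp(buffer, kPrefix, prefixLen) != 0)
        return std::string();

    // Copy everything after "GET " up to (not including) the next space
    // or the end of the request line.
    std::size_t end = prefixLen;
    while (end < length && buffer[end] != ' ' &&
           buffer[end] != '\r' && buffer[end] != '\n')
        ++end;
    return std::string(buffer + prefixLen, end - prefixLen);
}
```

A custom handler could branch on the returned path to serve different dynamic pages; unlike the sample, the sketch takes an explicit length so it never reads past the bytes actually received.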
Now - to see this example work... Run 5Loaves.exe with the 5Loaves.txt in the same directory. Put this URL into your browser: http://127.0.0.1/ Your browser will display "Server Response: Hello World"
Approach #2 - "Binary Plugin"
This next approach to extending an HTTP service is more typical. Both IIS and Apache support both CGI and ISAPI to support user developed web server extensions via "Binary Plugins". This approach allows you to rebuild the extension without rebuilding the HTTP server. You will create a DLL (under Windows) or an SO (under Unix) that works a lot like ISAPI. The HTTP service will load and execute your extension, giving you access to everything necessary to build any kind of custom extension. The ServerCore adds one additional layer of abstraction by invoking the "Plugin" through the "Language Driver" described in the next section.
The XMLFoundation supports "Language Drivers". 5Loaves is among the applications that implement them. Language drivers allow user developed extensions to be invoked programmatically. Just about every programming language can have a language driver developed for it. Several complete Language Driver implementations come with the XMLFoundation source code. The programmer's code is "the plugin" that is executed by the "Language Driver". The majority of people who use this technology will probably be developing "plugins", but you may also develop an application that allows users to develop their own plugins. All of the source code for everything I speak of is included in the XMLFoundation, see [IntegrationBase.h/cpp][IntegrationLanguages.h/cpp], using DynamicLibrary.h to load the DLL/SOs on many platforms.
I have used the Language Driver functionality in several applications. For example, I have an XSLT that allows user extensions in addition to the built-in XSL keywords. This allows my application to pass a string value into a user defined method in a plugin that might need to do a database lookup to translate a code - a task that is too complex for any XSL keyword. Loading the "Language Driver" and executing a plugin does not require much code. This is the code to pack 4 arguments and invoke a plugin called "Test1" inside "PluginExample.dll" through the CStdCall Language Driver:
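char pzArgBuff[256];
// three null-terminated text arguments plus a 4-byte slot ("....") at offset 8
sprintf(pzArgBuff,"aaa%cbbb%c....fast%c",0,0,0);
// overwrite the slot with a native binary value
// (the 4-byte size assumes a 4-byte unsigned long, as on Windows)
unsigned long *pL = (unsigned long *)&pzArgBuff[8];
*pL = 777;
// byte length of each argument, in order
char pzArgSizes[16];
strcpy(pzArgSizes,"4|4|4|5");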
InterfaceInstance *pII = GetInterfaceInstance("CStdCall");
int nOutResultSize;
const char *pzResults = pII->InvokeEx( "PluginExample.dll", "PluginExample", "Test1",
pzArgBuff,pzArgSizes,&nOutResultSize, "anonymous",
"password" );
The code shown above would be in the hosting application, or the application that supports custom extensions. It will then execute the plugin that is even easier to develop (code shown below):
ExposedMethods() should define the arguments like this:
Test1&&One&&char *&&Two&&char *&&Three&&Packed Unsigned Long&&Four&&char *
extern "C" __declspec(dllexport) void Test1(void *pHandle, DriverExec Exec,
const char *One, const char *Two, const char *Three, const char *Four)
{
PlugInController PIC(pHandle, Exec);
(*(unsigned long *)Three)++;
unsigned long uThree = *(unsigned long *)Three;
uThree++;
PIC.AppendResults("Argument 3 is Native Packed Binary in a Plugin");
}
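The host packs all arguments into one buffer and describes them with a size string such as 4|4|4|5, and the plugin reads the third argument back as native binary. A minimal standalone sketch of that layout follows. PackedArgs is a hypothetical helper written for illustration, not part of the XMLFoundation, and it uses a fixed-width std::uint32_t on the assumption that the binary argument is meant to occupy exactly 4 bytes on every platform:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: accumulates arguments into one contiguous buffer and
// records each argument's byte length in a '|'-separated size string.
struct PackedArgs
{
    std::vector<char> buffer;
    std::string sizes;

    void Append(const void *data, std::size_t n)
    {
        assert(data != nullptr || n == 0);
        const char *p = static_cast<const char *>(data);
        buffer.insert(buffer.end(), p, p + n);
        if (!sizes.empty())
            sizes += '|';
        sizes += std::to_string(n);
    }

    // Text arguments are stored together with their terminating null byte.
    void AppendString(const char *s) { Append(s, std::strlen(s) + 1); }

    // A fixed-width integer is copied as native binary; memcpy-style copying
    // avoids writing through a possibly unaligned pointer cast.
    void AppendU32(std::uint32_t v) { Append(&v, sizeof(v)); }
};
```

Appending "aaa", "bbb", the value 777, and "fast" in that order yields the size string 4|4|4|5 and places the binary value at offset 8 of the buffer.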
One sample application titled "PluginExample" is devoted to C++ plugins. It shows how to write several types of POST handler plugins for the HTTP server. You will find the utility class CMultiPartForm handy if you need to write a handler for a Multipart HTTP POST. Other plugin utilities like PlugInController, found in PluginBuilder.h, make the job of building a plugin as easy as possible. ServerCore.cpp has the implementation using (InterfaceInstance *)pII->InvokeEx() described above already complete so that you may extend the HTTP service via your own DLL (or COM object or Perl/Python script). This is how to build a custom dynamic web page from an HTTP server plugin:
- Compile the CStdCall Language Driver
- Compile the PluginExample
- Setup the 5Loaves.txt Configuration File like this (but set the [TXML]Drivers= and [CStdCall]Path= to where steps 1 and 2 compiled to):
[System]
Pool=20
ProxyPool=0
[HTTP]
Enable=yes
EnableServerExtensions=yes
ServerPlugin1=/Test2WWWPage|CStdCall|PluginExample.dll|PluginExample|Test2
ServerPlugin2=/Test3WWWPage|CStdCall|PluginExample.dll|PluginExample|Test3
ServerPlugin3=/Test4WWWPage|CStdCall|PluginExample.dll|PluginExample|Test4
ServerPlugin4=/Test5WWWPage|CStdCall|PluginExample.dll|PluginExample|Test5
ServerPlugin5=/Test7WWWPage|CStdCall|PluginExample.dll|PluginExample|Test7
ServerPlugin6=/Page1|CStdCall|PluginExample.dll|PluginExample|Page1
ServerPlugin7=/Page2|CStdCall|PluginExample.dll|PluginExample|Page2
ServerPlugin8=/Page3|CStdCall|PluginExample.dll|PluginExample|Page3
[TXML]
Drivers=C:\XMLFoundation\Drivers\Debug
[CStdCall]
Path=C:\XMLFoundation\Examples\C++\HTTP.Xfer.Messaging-PluginExample
The HTTP service will loop through the ServerPluginN entries and dispatch the plugin calls for you. You must configure the ServerPluginN entries in numerical order, and the HTTP service will load each entry up to the first numerical break. You may also manage the dispatch partially yourself by using a wildcard * in the first argument, for example:
ServerPlugin1=/PluginPage*|CStdCall|MyDLL.dll|MyDLL|DoIt
This would map the following URLs to DoIt() in MyDLL.dll:
http://127.0.0.1/PluginPageFoo.html
http://127.0.0.1/PluginPageBar%20Argument1%20Argument2
http://127.0.0.1/PluginPageHello
- Put the 5Loaves.txt file into the project directory if you run 5Loaves.exe under a debugger, or put 5Loaves.txt in the same directory as 5Loaves.exe if you run 5Loaves.exe outside the debugger. Either way, when it starts, 5Loaves will display this message:
5Loaves>Windows Console using [5Loaves.txt]
Listening on port[80]
All bound ports are now being serviced
started
5Loaves>
- Now execute the most basic plugin that takes 0 arguments and is simply a static bound html page. Put this URL into your browser: http://127.0.0.1/Page1
- The browser will show a simple web page with 3 edit fields and a "Submit" button. If you see that page, the plugin executed properly through an HTTP GET, which loaded an HTML page that will POST back the 3 arguments into another handler that exists in that same plugin. The information POSTed by the HTML form calls a method with 3 arguments. So if you supply the values 1, 2, and 3 to the three edit fields and press Submit, your browser will display this:
5Loaves HTTP Server Invoked me with [1] and [2] and [3]!
- Now try this URL: http://127.0.0.1/test2WWWPage&root&777&superuser and this will be the result:
5Loaves HTTP Server Invoked me with [root] and [777] and [superuser]!
- Look at the Plugin example to see the details of implementing a plugin. You will see that this is the plugin handler that was called in step 5 and 6:
extern "C" __declspec(dllexport) void Test2(void *pHandle, DriverExec Exec,
    const char *One, const char *Two, const char *Three)
{
    PlugInController PIC(pHandle, Exec);
    PIC.AppendResults("5Loaves HTTP Server Invoked me with [");
    PIC.AppendResults(One);
    PIC.AppendResults("] and [");
    PIC.AppendResults(Two);
    PIC.AppendResults("] and [");
    PIC.AppendResults(Three);
    PIC.AppendResults("]!");
}
5Loaves provides tunneling with encryption and compression as a base service. It can be used to secure internet connections much like SSH. It works by running some form of server with the 5Loaves engine on two machines. The data passed between those machines can be compressed or encrypted or simply logged. Starting the server is done exactly the same no matter if you are running the HTTP service or the Tunneling service - only the configuration startup string changes. The following configuration example will open a listener on port 1972, anything it receives will be encrypted and compressed and sent to [www.ExampleServer.com] on port 2009. The server will decrypt and decompress the data from port 2009 and forward it to port 1972, so the data on port 1972 at the server will be as if the client had directly sent it. To open a tunnel entry point on the client side, use this:
[Tunnel1]
Enable=yes
LocalPort=1972
RemotePort=2009
RemoteMachine=www.ExampleServer.com
Timeout=30
CompressEnabled=yes
CipherPass=Tiger
RawPacketProxy=no
To exit the tunnel on the server:
[Proxy1]
Enable=yes
LocalPort=2009
RemotePort=1972
RemoteMachine=127.0.0.1 (we could also use an internal resource like 192.168.*)
Timeout=60
RawPacketProxy=no
CompressEnabled=yes
CipherPass=Tiger
Note: You can start any number of tunnels, simply increment the section [Tunnel2] and [Proxy2]. ServerCore will stop loading tunnels at the first break of numeric order of the [sectionN].
This type of usage may be of interest especially to web developers or people who are just curious. Sometimes, it's interesting to see the data between the HTTP server and the browser. Redirects and HTML frames and JavaScript can make that difficult. This section sets up a clear text proxy just for the purpose of seeing the transmission log between the browser and the web server. Once this is running, connect with a web browser to 127.0.0.1 and you will in fact be connecting to the endpoint configured under the RemoteMachine= entry.
[Tunnel2]
Enable=yes
LocalPort=80
RemotePort=80
RemoteMachine=www.SampleWebSite.com
Timeout=65000
RawPacketProxy=yes
LogPath=c:\HTMLSpy
LogBinary=no
LogEnabled=yes
In the folder HTMLSpy, you will see a log file of the communication between the browser and the web server. I used Firefox as a browser, and the 5Loaves HTTP server for this example. I loaded the page twice in the browser so that you can see the caching mechanism working between the browser and the web server in this log file.
The header is defined like this: "tx->s" means "transmit to server", so that is information that came from the browser; it is followed by the time, then the total bytes transmitted, 470 in this case. The reply is "tx->c" or "transmit to client"; you can see that the HTTP server responded with 181 bytes that instructed the browser to use the version it has cached. You can also see that the "Server:" name was set to "MyWebServer"; that is a variable in the 5Loaves HTTP server, unlike the static names IIS and Apache use.
tx->s00:51:42-000470>
GET / HTTP/1.1
Host: 127.0.0.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT
If-None-Match: 7430174881
Cache-Control: max-age=0
tx->c00:51:42-000181>
HTTP/1.1 304 Not Modified
Server: MyWebServer
Date: Sun, 16 Aug 2009 07:01:12 GMT
Connection: keep-alive
Keep-Alive: timeout=20, max=149
ETag: 7430174881
Content-Length: 0
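Each logged transmission starts with a header such as tx->s00:51:42-000470>, which carries the direction, the time, and the byte count. A minimal sketch of reading one of those headers back, for example in a small log analysis tool, could look like this. ParseLogHeader, DirectionName, and the LogHeader struct are hypothetical names, and the layout is inferred only from the samples shown here:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// One parsed log header such as "tx->s00:51:42-000470>".
struct LogHeader
{
    char direction;  // 's' = transmitted to server, 'c' = transmitted to client
    int hour;
    int minute;
    int second;
    long byteCount;  // total bytes in the logged transmission
};

// Hypothetical helper: parses a header line, returns false if it does not match.
bool ParseLogHeader(const std::string &line, LogHeader &out)
{
    char dir = 0, closing = 0;
    int h = 0, m = 0, s = 0;
    long n = 0;
    const int fields = std::sscanf(line.c_str(), "tx->%c%2d:%2d:%2d-%ld%c",
                                   &dir, &h, &m, &s, &n, &closing);
    if (fields != 6 || closing != '>' || (dir != 's' && dir != 'c') || n < 0)
        return false;
    out.direction = dir;
    out.hour = h;
    out.minute = m;
    out.second = s;
    out.byteCount = n;
    return true;
}

// Human-readable direction for a header that ParseLogHeader accepted.
const char *DirectionName(const LogHeader &header)
{
    assert(header.direction == 's' || header.direction == 'c');
    return header.direction == 's' ? "to server" : "to client";
}
```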
By changing LogBinary to yes, we can see what a small GIF file looks like over the wire from the web server.
[Tunnel2]
LogBinary=yes
tx->c01:05:29-000317>
48 54 54 50 2F 31 2E 31 20 32 30 30 20 4F 4B 0D 0A 44 61 74 65 3A 20 57 65 HTTP/1.1 200 OK..Date: We
64 2C 20 30 31 20 4A 75 6C 20 32 30 30 39 20 30 31 3A 30 35 3A 32 39 20 47 d, 01 Jul 2009 01:05:29 G
4D 54 0D 0A 53 65 72 76 65 72 3A 20 4D 79 57 65 62 53 65 72 76 65 72 0D 0A MT..Server: MyWebServer..
43 6F 6E 6E 65 63 74 69 6F 6E 3A 20 6B 65 65 70 2D 61 6C 69 76 65 0D 0A 4B Connection: keep-alive..K
65 65 70 2D 41 6C 69 76 65 3A 20 74 69 6D 65 6F 75 74 3D 32 30 2C 20 6D 61 eep-Alive: timeout=20, ma
78 3D 31 34 39 0D 0A 4C 61 73 74 2D 6D 6F 64 69 66 69 65 64 3A 20 53 75 6E x=149..Last-modified: Sun
2C 20 31 33 20 4D 61 72 20 32 30 30 35 20 32 32 3A 33 32 3A 33 36 20 47 4D , 13 Mar 2005 22:32:36 GM
54 0D 0A 45 54 61 67 3A 20 31 31 31 30 37 34 39 35 35 36 0D 0A 43 6F 6E 74 T..ETag: 1110749556..Cont
65 6E 74 2D 74 79 70 65 3A 20 69 6D 61 67 65 2F 67 69 66 0D 0A 43 6F 6E 74 ent-type: image/gif..Cont
65 6E 74 2D 6C 65 6E 67 74 68 3A 20 37 34 0D 0A 0D 0A 47 49 46 38 39 61 10 ent-length: 74....GIF89a.
00 10 00 91 00 00 00 00 00 FF FF FF FF FF FF 00 00 00 21 F9 04 01 00 00 02 ..................!......
00 2C 00 00 00 00 10 00 10 00 00 02 1B 94 8F A9 CB 07 AD C0 83 4E 52 23 2D .,...................NR#-
CD BA F1 BE 7C 5B 76 91 E5 54 5E EA CA 1A 05 00 3B ....|[v..T^.....;
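The binary log above shows 25 bytes per row as hexadecimal, followed by the same bytes as text with a dot for anything unprintable. A rough sketch of producing that kind of dump is shown below. HexDump is a hypothetical function, not the ServerCore logging code, and padding a short final row so the text column stays aligned is an assumption of this sketch:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a byte range as rows of hex values followed
// by a printable-ASCII column, with '.' standing in for unprintable bytes.
std::string HexDump(const unsigned char *data, std::size_t length,
                    std::size_t bytesPerLine = 25)
{
    assert(bytesPerLine > 0);
    std::string out;
    char hex[4];  // two hex digits, one space, terminating null
    for (std::size_t i = 0; i < length; i += bytesPerLine)
    {
        std::string ascii;
        for (std::size_t j = 0; j < bytesPerLine; ++j)
        {
            if (i + j < length)
            {
                const unsigned char c = data[i + j];
                std::snprintf(hex, sizeof(hex), "%02X ", static_cast<unsigned>(c));
                out += hex;
                ascii += (c >= 0x20 && c < 0x7F) ? static_cast<char>(c) : '.';
            }
            else
            {
                out += "   ";  // pad a short final row
            }
        }
        out += ascii;
        out += '\n';
    }
    return out;
}
```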
5Loaves has a unique network connectivity utility built in. It allows machines behind a firewall that cannot "listen" for connections from outside the network to accept connections from anywhere without any firewall configuration changes. There is a complete example called "FilePoster" that puts a file on a machine behind a firewall. This is a bare bones 'proof of concept' implementation that gives you a working model to customize for your own purposes. It requires three machines to see it work as designed.
Machine 1 - (the switchboard) should be located on the internet. You must run the HTTP service along with the "Switchboard Service"; this example runs it on port 81 just in case you have IIS or Apache already on port 80.
[SwitchBoardServer]
Enable=yes
Name=/PublicPath/
[HTTP]
Enable=yes
Port=81
ContentCache=0
UseKeepAlives=1
HTTPHeaderServerName=5Loaves
KeepAliveTimeOut=20
ShowIPAddressPageName=ShowIP
Home=d:\home
[Trace]
HTTPHeaderTrace=0
ThreadTrace=0
ConnectTrace=1
Machine 2 - (the server) This is the machine behind the firewall that you want to open up a connection path to. It will poll the switchboard server looking for connections. It should run 5Loaves with the configuration below. This will accept remote data and write it to disk in a file at "c:\5LMessages\UBTsAccountForYou". You could change the application logic that writes the file - you can do anything with the data, which may contain commands, database queries, or custom logic.
[Messaging]
Enable=yes
AcceptFrom=UBTsAccountForYou
DefaultSwitchBoardServer=10.20.30.40
DefaultSwitchBoardPort=81
UseBrowserProxy=no
[MsgFrom-UBTsAccountForYou]
Enable=yes
CheckAtSwitchBoard=yes
Name=/PublicPath/UBTsAccountForYou
DiskLocation=c:\5LMessages\UBTsAccountForYou
LetSenderPlaceFile=No
PollIntervalSeconds=20
Machine 3 - (the client) runs the FilePoster sample application. Machine 3 can reach "the switchboard (machine 1)" but not "the server (machine 2)". We will send the data to Machine 2 and get a response back from that machine. In this case, we are simply writing the data we send to a file, but the data could just as easily have been an SQL statement and the return data could be the result set rather than just a confirmation that the file was written.
How it works: The "server" polls into the "switchboard" with an HTTP GET. The "client" pushes a multipart HTTP POST to the "switchboard". The switchboard joins the connections and proxies the data. An HTTP GET needs an HTTP "200 OK", so the "switchboard" server rips off the POST headers from the data sent up by the "client" and replaces them with an HTTP 200 followed by the POST data that gets proxied straight through. Once this initial message proxy is complete, the client connection that POSTed it waits in the switchboard for the server to POST back a response. Then the switchboard goes through the same process of ripping off the POST HTTP header and replacing it with a 200 OK before sending the response back to the client. Lastly, the switchboard server replies with an empty HTTP 200 to the server's response POST to complete the normal HTTP request/response design for both the client and the server. This allows it to pass through HTTP proxy servers, and direct support for them is included. Technically, this is a loophole through most networks that only allow HTTP because, as you see, we invented a new protocol that looks like HTTP but in fact is not.
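The header swap the switchboard performs can be sketched in a few lines. This is only an illustration of the idea under stated assumptions: PostToOkResponse is a hypothetical function, the whole message is assumed to be in memory already, and the exact response headers the real switchboard emits are not shown in this article, so the Content-Length header here is a guess at a reasonable minimum:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drops the request line and headers of a complete
// HTTP POST and returns the same body behind a "200 OK" status line, so the
// result can be handed to a peer that is waiting on an HTTP GET.
// Returns an empty string if the end of the headers has not arrived yet.
std::string PostToOkResponse(const std::string &postMessage)
{
    static const std::string kHeaderEnd = "\r\n\r\n";
    const std::string::size_type pos = postMessage.find(kHeaderEnd);
    if (pos == std::string::npos)
        return std::string();

    // Everything after the blank line is the POST body.
    assert(pos + kHeaderEnd.size() <= postMessage.size());
    const std::string body = postMessage.substr(pos + kHeaderEnd.size());

    std::string response = "HTTP/1.1 200 OK\r\n";
    response += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    response += "\r\n";
    response += body;
    return response;
}
```

A real connection joiner would stream the body through as it arrives rather than buffering it; the sketch buffers only to keep the transformation easy to see.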
There is little need for an English description of how to map bits of a member variable using this method found in xmlObject.h:
void MapMemberBit(unsigned char *pValue, const char *pTag, int nBit1to8,
const char *pzCommaSeparated0Values,
const char *pzCommaSeparated1Values);
void MapMemberBit(unsigned short *pValue, const char *pTag, int nBit1to16,
const char *pzCommaSeparated0Values,
const char *pzCommaSeparated1Values);
void MapMemberBit(unsigned int *pValue, const char *pTag, int nBit1to32,
const char *pzCommaSeparated0Values,
const char *pzCommaSeparated1Values);
void MapMemberBit(unsigned __int64 *pValue,const char *pTag, int nBit1to64,
const char *pzCommaSeparated0Values,
const char *pzCommaSeparated1Values);
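For readers who do want it spelled out, here is a rough standalone sketch of what such a bit mapping could do. It is not the XMLFoundation implementation: ApplyBitFromText and TextFromBit are hypothetical names, and the sketch assumes that values are matched without regard to case, that bit 1 is the least significant bit, and that the first entry of each list is the text written back out.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Case-insensitive string comparison.
static bool EqualsNoCase(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    return true;
}

// True if value equals one of the comma-separated entries.
static bool ListContains(const std::string &commaSeparated, const std::string &value)
{
    std::istringstream in(commaSeparated);
    std::string item;
    while (std::getline(in, item, ','))
        if (EqualsNoCase(item, value))
            return true;
    return false;
}

// Sets or clears bit nBit1to8 (1 = least significant) depending on which
// list contains xmlValue. Returns false and leaves bits alone if neither does.
bool ApplyBitFromText(unsigned char &bits, int nBit1to8,
                      const std::string &zeroValues, const std::string &oneValues,
                      const std::string &xmlValue)
{
    assert(nBit1to8 >= 1 && nBit1to8 <= 8);
    const unsigned char mask = static_cast<unsigned char>(1u << (nBit1to8 - 1));
    if (ListContains(oneValues, xmlValue))
    {
        bits |= mask;
        return true;
    }
    if (ListContains(zeroValues, xmlValue))
    {
        bits &= static_cast<unsigned char>(~mask);
        return true;
    }
    return false;
}

// Text for the bit's current state: the first entry of the matching list.
std::string TextFromBit(unsigned char bits, int nBit1to8,
                        const std::string &zeroValues, const std::string &oneValues)
{
    assert(nBit1to8 >= 1 && nBit1to8 <= 8);
    const unsigned char mask = static_cast<unsigned char>(1u << (nBit1to8 - 1));
    const std::string &list = (bits & mask) ? oneValues : zeroValues;
    return list.substr(0, list.find(','));
}
```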
The StartHere0.cpp example program contains this code:
class MyCustomObject : public XMLObject
{
public:
unsigned char m_bits;
virtual void MapXMLTagsToMembers()
{
MapMemberBit( &m_bits, "Seven77thBit", 7,
"False,No,Off,0", "True,Yes,On,1");
MapMemberBit( &m_bits, "OnItsSide8", 8,
"Black", "White");
}
DECLARE_FACTORY(MyCustomObject, Thing)
MyCustomObject(){m_bits=0;}
~MyCustomObject(){};
};
IMPLEMENT_FACTORY(MyCustomObject, Thing)
char pzXML[] =
"<Thing>"
"<Seven77thBit>on</Seven77thBit>"
"<OnItsSide8>white</OnItsSide8>"
"</Thing>";
int main(int argc, char* argv[])
{
    MyCustomObject O;
    O.FromXMLX(pzXML);
    // print out the value of each individual bit in m_bits
    // Change "<OnItsSide8>" from "white" to "black" in pzXML
    // and watch how the bits change
    for(int i=0;i<8;i++)
        printf("bit%d=%d\n",i+1,((O.m_bits&(LONG_ONE<<i))!=0)); // :)
    // this is the output
    // bit1=0
    // bit2=0
    // bit3=0
    // bit4=0
    // bit5=0
    // bit6=0
    // bit7=1
    // bit8=1
    // turn the 8th bit off - changes the value to "Black" in the XML
    // - remove the following line and it stays "White" in the output XML
    O.m_bits &= ~(1<<7);
    printf(O.ToXML());
    // this is the output
    // <Thing>
    // <Seven77thBit>True</Seven77thBit>
    // <OnItsSide8>Black</OnItsSide8>
    // </Thing>
    return 0;
}
Translating values before the assignment of member variables from XML is a common need in the application layer. So is the reverse: translating the value stored in the member variable to the value that will appear in the XML. Without writing ANY code, we can Map() the translation logic with fine tuned control. This is done using the last 3 arguments of MapMember(). Here is the comment taken from XMLObject.h explaining those last three arguments:
That may sound complex - but it's very easy to use. See the example program TranslatingXML.cpp where this (simple) code was taken from. The yellow string affects how member variables are assigned a value. The green string affects how the XML is created. Each member variable has its own translation rules. Notice where the other colors are referenced in the places where they are used.
int main(int argc, char* argv[])
{
    MyCustomObject O;
    O.FromXMLX(pzXML);
    O.m_strString2 = "1";
    O.m_strString3 = "2";
    O.m_strString4 = "777";
    O.m_strString5 = "anybody";
    O.m_strString6 = "anything";
    O.m_strString7 = "XMLFoundation";
    O.m_strString8 = "xmlFoundation";
    O.m_strString9 = "";
    O.m_strString10 = "";
    printf(O.ToXMLX());
}
XMLFoundation supports Java too. It supports all Java data types like (string, byte, bool, double, long, short) as well as Java data structures such as (Vector Stack ArrayList LinkedList TreeSet HashSet). The XMLFoundation for Java is binary. If you want to build it yourself, you can, because the source code to the entire JavaXMLFoundation is public and included in this release. There is no need to build it, but it's nice to be able to.
I can hear some uneducated Java programmer already saying "I only want a 'Pure' Java solution". This is Pure. It's as pure as the JVM, because if you look closely, you'll see that it is actually an enhancement to the JVM. The JVM (Java Virtual Machine) is written in C and C++. Your Java code runs anywhere the JVM can compile, and the XMLFoundation works the same way. The good news is that Java programmers don't have to deal with C++ just because their JVM is written in C++. The same is true of the XMLFoundation for Java. The XMLFoundation uses JNI (Java Native Interface). It parses the XML in the native binary (that was created by a C++ compiler just like the JVM) and instantiates 'pure' Java Objects through JNI, then it assigns all the member variables just like the C++ XMLFoundation does. Can you think of a faster way to get the job done? This is some sample code that uses the JavaXMLFoundation. A much more detailed example is included in the source code:
import java.util.Iterator;
import java.util.Vector;
class MyLineItem extends XMLObject
{
private String item;
private String quantity;
private int ItemID;
MyLineItem()
{
super("LineItem");
}
MyLineItem(int nID, String itm, String qty)
{
super("LineItem");
item = itm;
quantity = qty;
ItemID = nID;
}
void MapXMLTagsToMembers()
{
MapMember(quantity, "quantity", "Quantity");
MapMember(item, "item", "Description");
MapMember(ItemID, "ItemID", "SKU");
MapObjectId(this, "ItemID"); }
}
class Customer2
{
public String name;
private int CustID;
private XMLObject ContainedXMLObj;
public MyOrder objOrder;
private Vector vecStrings;
long l;
short s;
double d;
byte b;
boolean z;
public void XMLDump()
{
MyExchange("out");
System.err.println( ContainedXMLObj.toXML() );
}
void MyExchange(String inOut)
{
ContainedXMLObj.Member(this, inOut, b, "b","byte","", 1);
ContainedXMLObj.Member(this, inOut, z, "z","bool","", 1);
ContainedXMLObj.Member(this, inOut, l, "l","long","", 1);
ContainedXMLObj.Member(this, inOut, d, "d","double","", 1);
ContainedXMLObj.Member(this, inOut, s, "s","short","", 1);
ContainedXMLObj.Member(this, inOut, name, "name","FirstName","", 1);
ContainedXMLObj.Member(this, inOut, CustID, "CustID","CustomerID","", 1);
ContainedXMLObj.Member(this, inOut, objOrder,
"objOrder", "Order","MyOrder","");
ContainedXMLObj.Member(this, inOut, vecStrings,"vecStrings",
"StringItem","String", "StringList Level2Wrapper");
}
public Customer2( String strXML )
{
ContainedXMLObj = new XMLObject("Customer", false );
ContainedXMLObj.fromXML(strXML);
MyExchange("in");
}
public void ApplyXML( String strXML )
{
ContainedXMLObj.fromXML(strXML);
MyExchange("in");
}
}
This is some of the code I developed inside the JavaXMLFoundation that interacts with the JVM. (Don't worry you'll never have to work with this code.)
jobject MakeObjectInstance(JNIEnv *env,const char *pzObjectType,
DynamicXMLObject *pDO,DynamicXMLObject *pDXOOOwner)
{
if (env->ExceptionOccurred())
{
env->ExceptionClear();
}
jclass clazzA = env->FindClass(pzObjectType);
jobject objReturnValue = 0;
GString strType("Ljava/lang/String;");
jmethodID midctor = env->GetMethodID(clazzA, "<init>", "(LXMLObject;)V");
if (env->ExceptionOccurred())
{
env->ExceptionClear();
}
if (midctor)
{
jclass clazzX = env->FindClass("XMLObject");
jmethodID midX =
env->GetMethodID(clazzX, "<init>", "(Ljava/lang/String;)V");
jstring tagX = env->NewStringUTF(pDO->GetObjectTag());
jobject objX = env->NewObject(clazzA, midX, tagX );
jclass clazz = env->GetObjectClass(objX);
jfieldID fid = env->GetFieldID(clazz, "oH", "I");
env->SetIntField(objX, fid, CastDXMLO(pDO));
objReturnValue = env->NewObject(clazzA, midctor, objX );
}
else
{
objReturnValue = env->AllocObject(clazzA);
jclass clazz = env->GetObjectClass(objReturnValue);
jfieldID fid = env->GetFieldID(clazz, "oH", "I");
if (env->ExceptionOccurred())
{
env->ExceptionClear();
}
if (fid)
{
env->SetIntField(objReturnValue, fid, CastDXMLO(pDO));
union CAST_THIS_TYPE_SAFE_COMPILERS
{
jobject mbrObj;
void * mbrVoid;
}Member;
Member.mbrObj = env->NewGlobalRef(objReturnValue);
pDXOOOwner->addSubUserLanguageObject(Member.mbrVoid);
pDO->setUserLanguageObject(Member.mbrVoid);
cacheManager.addAlternate( pDO );
}
else if (strType.CompareNoCase(pzObjectType) != 0)
{
GString Err;
Err.Format("Object type [%s] must either be derived from XMLObject\n"
"or supply a constructor %s(XMLObject o)",pzObjectType,pzObjectType);
}
}
pDO->SetObjectType(pzObjectType);
return objReturnValue;
}
Java programmers will derive from this class - then the use is nearly identical to the C++ XMLFoundation:
import java.util.Vector;
import java.util.Stack;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.TreeSet;
import java.util.HashSet;
public class XMLObject {
static { System.loadLibrary("JavaXMLFoundation"); }
private static int InstanceId = 0;
private native void JavaMap(int oH, int DataType,String strName,String xmlTag,
String strWrapper, String ObjType, String strContainerType, int nSource);
private native void JavaExchange(Object o, int oH, String inout,int nType,
String b,String c,String d,String strObjectType, String strContainerType,
int nSource);
private native int JavaConstruct(int n, String strXMLTag, int bAutoDataSync);
private native void JavaDestruct(int oH);
private native void JavaMapCacheDisable(int oH);
private native void JavaMapOID(int oH, Object o, String Key1, String Key2,
String Key3, String Key4, String Key5);
private native XMLObject JavaGetSubObj(int oH);
private native void JavaFromXML(int oH, String strXML);
private native String JavaToXML(int oH);