Introduction
After finishing my XSearch article, I decided that I wanted to support quoted strings, just like web search engines. At first I planned to build in support for quoted strings; then I realized that this really should be generalized, because I could see other situations where I would want to use it.
So I decided to write a separate function to handle extracting quoted tokens, and at the same time do something about how painful it is to use strtok (i.e., looping to get all tokens). I wrote down a list of what I would like the XTokenString function to do:
- No looping! Tokens must be returned in a CStringArray. This allows you to verify the number of tokens found immediately. Another input parameter lets you specify the maximum number of tokens to be returned, to eliminate runaway looping in case of bad data.
- The string must not be modified (e.g., by inserting nul characters). This allows the input parameter to be const, and avoids the need for casting.
- Specify delimiters, just like strtok.
- Optionally trim leading/trailing whitespace from returned tokens.
- Optionally handle quoted tokens. To take the example of web search engines, double quotes are used to indicate exact matches, and so a quoted token may include characters specified as token delimiters.
- Optionally handle escaped characters, like \" (or any of the token delimiter characters).
- Optionally return empty tokens. For example, a CSV record in which not all values are present:
Dietrich,Hans,,,,,213-555-1234
should return:
Dietrich
Hans
<empty token>
<empty token>
<empty token>
<empty token>
213-555-1234
where empty tokens are returned for the missing fields (address, city, state, zip). It is also important to handle the special cases of a leading or trailing empty token, and of several consecutive empty tokens.
The implementation of XTokenString assumes that this option will not be used when the delimiters are whitespace. Otherwise, two consecutive spaces would produce an empty token in the returned array, which is probably not what you want.
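To make the requirements above concrete, here is a minimal sketch of such a tokenizer in standard C++. It is not the XTokenString source: it returns a std::vector<std::string> instead of a CStringArray, and the names split_tokens and TokenOptions (and all of its fields) are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical options for this sketch; not XTokenString's actual parameters.
struct TokenOptions {
    bool trim = true;            // strip leading/trailing whitespace from each token
    bool quotes = true;          // text inside "..." may contain delimiter characters
    bool escapes = true;         // a backslash makes the next character literal
    bool keep_empty = false;     // return empty tokens (e.g. missing CSV fields)
    std::size_t max_tokens = 0;  // 0 means no limit
};

static std::string trim_ws(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    return s.substr(b, s.find_last_not_of(ws) - b + 1);
}

// The input is taken by const reference and never modified.
std::vector<std::string> split_tokens(const std::string& input,
                                      const std::string& delims,
                                      const TokenOptions& opt = TokenOptions()) {
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;

    auto full = [&]() {
        return opt.max_tokens != 0 && tokens.size() >= opt.max_tokens;
    };
    // Finish the token being built and store it if the options allow it.
    // Note: in this sketch, trimming also applies to quoted text.
    auto flush = [&]() {
        std::string tok = opt.trim ? trim_ws(current) : current;
        current.clear();
        if (full()) return;
        if (!tok.empty() || opt.keep_empty) tokens.push_back(tok);
    };

    for (std::size_t i = 0; i < input.size(); ++i) {
        char c = input[i];
        if (opt.escapes && c == '\\' && i + 1 < input.size()) {
            current += input[++i];       // take the escaped character literally
        } else if (opt.quotes && c == '"') {
            in_quotes = !in_quotes;      // the quote marks themselves are dropped
        } else if (!in_quotes && delims.find(c) != std::string::npos) {
            flush();                     // a delimiter outside quotes ends the token
            if (full()) break;
        } else {
            current += c;
        }
    }
    flush();                             // last token (possibly a trailing empty one)
    return tokens;
}
```

With keep_empty set and "," as the delimiter, the CSV record above comes back as seven tokens, four of them empty. With keep_empty left off, the same record produces only the three non-empty tokens.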
XTokenString In Action
The demo app allows you to compare the behavior of XTokenString with that of strtok. You can choose a built-in string or enter your own. The four checkboxes allow you to control how XTokenString will parse the string.
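For reference, the strtok side of that comparison can be reproduced with a few lines of standard C++. This is only a sketch, and the helper name strtok_tokens is hypothetical. It shows the two things strtok forces on you: a loop to collect the tokens, and a writable copy of the input, because strtok overwrites delimiters with nul characters.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Collect every token that strtok finds in the input.
// The input is copied into a scratch buffer first, because strtok
// writes nul characters into the buffer it is parsing.
std::vector<std::string> strtok_tokens(const std::string& input, const char* delims) {
    std::vector<char> buf(input.begin(), input.end());
    buf.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Because strtok treats a run of consecutive delimiters as a single separator, the record Dietrich,Hans,,,,,213-555-1234 yields only three tokens here, and the empty fields are lost. It also knows nothing about quotes, so a quoted phrase containing a space is split in two when space is a delimiter.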
XTokenString Function
- XTokenString() - Parses a string to extract tokens.
How To Use
To integrate XTokenString into your app, you first need to add the following files to your project:
- XTokenString.cpp
- XTokenString.h
For details on how to use XTokenString, refer to the code in XTokenStringTestDlg.cpp.
Revision History
Version 1.0 - 2005 August 2
Usage
This software is released into the public domain. You are free to use it in any way you like, except that you may not sell this source code. If you modify it or extend it, please consider posting the new code here for everyone to share. This software is provided "as is" with no express or implied warranty. I accept no liability for any damage or loss of business that this software may cause.
I attended St. Michael's College of the University of Toronto, with the intention of becoming a priest. A friend in the University's Computer Science Department got me interested in programming, and I have been hooked ever since.
Recently, I have moved to Los Angeles where I am doing consulting and development work.
For consulting and custom software development, please see www.hdsoft.org.