Introduction
This article covers the implementation of a dynamic ANSI string for the C language.
A string, or more precisely a character array, is an important data structure for programmatic text manipulation and presentation, and much of that is already covered by the standard library. However, as far as I know, the C language standard still has no dynamically managed string type. There are third-party libraries, free or otherwise, but they tend to pull in a lot of additional code.
I hope this low-level dynamic string implementation will help you in your C coding wherever a dynamic string is a must.
Background
When you write a program in C, you often need to hold or manipulate text of variable size. Normally, this is done by hand-writing code that handles growing, shrinking, and so on. That code is mainly pointer arithmetic and memory management, and it can get very messy.
I wrote this code back in 1994. I will probably make a Unicode version, but back then the ANSI-only implementation was sufficient for my needs.
Implementation
The dynamic C string is implemented to be efficient in both speed and memory operations. It keeps track of how much memory was allocated and of the length of the string that uses that allocation. This allows it to skip reallocation on later assignments to the same string if the previously allocated memory is already big enough. The extra allocated memory can be released by the programmer whenever needed.
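The allocation-reuse idea described above could be sketched roughly as follows. This is not the library's source: the names SketchStr and sketch_set are hypothetical, and the sketch assumes that the allocated size counts the terminating NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the library's internal state. */
typedef struct SketchStr {
    char  *data;          /* heap buffer, NULL while nothing is stored */
    size_t data_length;   /* characters in use, excluding the NUL      */
    size_t alloc_length;  /* bytes currently allocated (assumption:
                             this count includes the NUL)              */
} SketchStr;

/* Assign src to s, reallocating only when the buffer is too small.
   src must not point into s's own buffer.
   Returns 1 on success, 0 if memory could not be obtained. */
static int sketch_set(SketchStr *s, const char *src)
{
    size_t need;

    assert(s != NULL && src != NULL);
    need = strlen(src) + 1;               /* +1 for the terminator */
    if (need > s->alloc_length) {
        char *p = (char *)realloc(s->data, need);
        if (p == NULL)
            return 0;                     /* old contents stay valid */
        s->data = p;
        s->alloc_length = need;
    }
    memcpy(s->data, src, need);           /* copies the NUL as well */
    s->data_length = need - 1;
    return 1;
}
```

Assigning a shorter text after a longer one leaves alloc_length unchanged in this sketch, so no call to the allocator is made on that path.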
It does not depend on any frameworks or libraries.
It should compile for any platform and run on any device that runs compiled C code.
The base type for the string is HSTRINGA, which stands for a HANDLE to an ANSI string. The handle shields the user from the messy string data details by hiding the implementation. An HSTRINGA can be stored as a member of a C structure, passed as a parameter, copied, or returned from a function just as a normal pointer would be.
It was originally built with a C89-compliant compiler, so there should be no issues using this code with older compilers as long as they support the C89 standard.
Internally, HSTRINGA encapsulates the tagSTRINGA structure:
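typedef struct tagSTRINGA
{
    char* data;          /* character buffer */
    size_t data_length;  /* length of the string that uses the buffer */
    size_t alloc_length; /* amount of memory allocated for the buffer */
} STRINGA;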
The structure is not visible to the user; it is exposed only as a handle:
DECLARE_HANDLE(HSTRINGA);
This approach provides type safety: your compiler gives either a warning or an error if a different pointer type is passed in by accident, whereas a void pointer in its place would not produce even a warning. At the same time, it hides the members of the tagSTRINGA structure. The user has no access to these members; they are used internally to track the state of the string.
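To illustrate the handle technique, here is a minimal sketch of a typed opaque handle. The macro SKETCH_DECLARE_HANDLE is a hypothetical portable lookalike modeled on the Windows SDK macro DECLARE_HANDLE, and HSKETCH, SketchImpl, and the sketch_ functions are invented for this example; the actual contents of str.h may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Each handle becomes a pointer to its own dummy struct type, so the
   compiler can tell different handle types (and char*) apart. */
#define SKETCH_DECLARE_HANDLE(name) \
    struct name##__ { int unused; }; \
    typedef struct name##__ *name

SKETCH_DECLARE_HANDLE(HSKETCH);   /* what a public header would expose */

/* What stays private inside the .c file. */
typedef struct SketchImpl {
    char  *data;
    size_t data_length;
    size_t alloc_length;
} SketchImpl;

static HSKETCH sketch_create(void)
{
    SketchImpl *p = (SketchImpl *)malloc(sizeof *p);
    if (p != NULL) {
        p->data = NULL;
        p->data_length = 0;
        p->alloc_length = 0;
    }
    return (HSKETCH)p;            /* the caller only ever sees the handle */
}

static size_t sketch_length(HSKETCH h)
{
    assert(h != NULL);
    return ((SketchImpl *)h)->data_length;  /* cast back internally */
}

static void sketch_destroy(HSKETCH h)
{
    if (h != NULL) {
        free(((SketchImpl *)h)->data);
        free(h);
    }
}
```

With this layout, passing a char* where an HSKETCH is expected is a pointer type mismatch that compilers diagnose, while a void* parameter would accept it silently.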
API Overview
API functions are prefixed with Astr_ to prevent namespace collisions with other similar libraries.
Creation and destruction:
Astr_Create()
Astr_Destroy(HSTRINGA hstr)
Copying:
Astr_Copy(HSTRINGA hstr)
Size management:
Astr_GetLength(HSTRINGA hstr)
Astr_GetAllocLength(HSTRINGA hstr)
Astr_IsEmpty(HSTRINGA hstr)
Astr_FreeExtra(HSTRINGA hstr)
Astr_Empty(HSTRINGA hstr)
Assignment, concatenation, deletion, replacement:
Astr_Set(HSTRINGA hstr, const char* str)
Astr_Get(HSTRINGA hstr)
Astr_SetAt(HSTRINGA hstr, size_t index, char ch)
Astr_GetAt(HSTRINGA hstr, size_t index)
Astr_Cat(HSTRINGA hstr, const char* str)
Astr_Insert(HSTRINGA hstr, size_t index, const char* str)
Astr_InsertCh(HSTRINGA hstr, size_t index, char c)
Astr_Replace(HSTRINGA hstr, const char* pOld, const char* pNew)
Astr_ReplaceCh(HSTRINGA hstr, char chOld, char chNew)
Astr_Remove(HSTRINGA hstr, const char* str)
Astr_RemoveCh(HSTRINGA hstr, char chRemove)
Astr_Delete(HSTRINGA hstr, size_t index, size_t count)
Case conversions and reversal:
Astr_ToUpper(HSTRINGA hstr)
Astr_ToLower(HSTRINGA hstr)
Astr_Reverse(HSTRINGA hstr)
Trimming white space:
Astr_TrimRight(HSTRINGA hstr)
Astr_TrimLeft(HSTRINGA hstr)
Astr_Trim(HSTRINGA hstr)
Search:
Astr_Find(HSTRINGA hstr, const char* sub, size_t start)
Astr_FindCh(HSTRINGA hstr, char ch, size_t start)
Astr_ReverseFind(HSTRINGA hstr, char ch)
Astr_FindOneOf(HSTRINGA hstr, const char* char_set)
Extraction:
Astr_Mid(HSTRINGA hstr, size_t start, size_t count)
Astr_Left(HSTRINGA hstr, size_t count)
Astr_Right(HSTRINGA hstr, size_t count)
Format:
Astr_Format(HSTRINGA hstr, const char* fmt, ...)
Allocation and Deallocation
This code snippet demonstrates basic usage of HSTRINGA:
HSTRINGA hstr = Astr_Create();
Astr_Set(hstr, "Hello Dynamic C String!");
printf("%s\n", Astr_Get(hstr));
Astr_Destroy(hstr);
A string must be deallocated by calling the Astr_Destroy(HSTRINGA) function when you are done using it.
Any HSTRINGA handle returned by an API function must be deallocated individually because it is a complete copy of the original string. This prevents mishaps such as accidentally freeing two pointers to the same string.
Copying
HSTRINGA hCopy;
HSTRINGA hstr = Astr_Create();
Astr_Set(hstr, "Hello Dynamic C String!");
printf("%s\n", Astr_Get(hstr));
hCopy = Astr_Copy(hstr);
Astr_Destroy(hstr);
Astr_Destroy(hCopy);
The copy operation creates an independent copy of the original string. Any subsequent modification of the original will not affect the copy, and vice versa. Each copy must be deallocated separately.
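The independence of a copy comes from duplicating the character buffer rather than sharing it. A minimal sketch of such a deep copy, using the hypothetical names SketchStr, sketch_from, sketch_copy, and sketch_free rather than the library's own code, might look like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchStr {        /* hypothetical internal state */
    char  *data;
    size_t data_length;
    size_t alloc_length;
} SketchStr;

/* Build a string object from C text (helper for this example). */
static SketchStr *sketch_from(const char *text)
{
    SketchStr *s;

    assert(text != NULL);
    s = (SketchStr *)malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data_length = strlen(text);
    s->alloc_length = s->data_length + 1;
    s->data = (char *)malloc(s->alloc_length);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->data, text, s->alloc_length);   /* includes the NUL */
    return s;
}

/* Deep copy: the result owns a separate buffer. */
static SketchStr *sketch_copy(const SketchStr *src)
{
    assert(src != NULL && src->data != NULL);
    return sketch_from(src->data);
}

static void sketch_free(SketchStr *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Because the two objects never share a buffer, each one is released with its own sketch_free call, which mirrors the one-destroy-per-handle rule above.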
Size Management
Call the Astr_GetLength function to get the number of bytes in the HSTRINGA object. The count does not include the null terminator.
Call the Astr_GetAllocLength function to determine the total memory allocated to the string, which can differ from the string length. To decide whether to shrink the memory manually, compare this value with the actual string length.
Astr_IsEmpty tests an HSTRINGA object for the empty condition.
Call the Astr_FreeExtra function to free any extra memory previously allocated by the string but no longer needed. This should reduce the memory overhead consumed by the string object. The function reallocates the buffer to the exact length returned by Astr_GetLength.
A call to Astr_Empty makes the HSTRINGA object a null string and frees its memory. It does not destroy the string object, so the object can be reused later.
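/* lengths are cast for printing; %d does not match size_t */
printf("size:%lu\n", (unsigned long)Astr_GetLength(hstr));
printf("alloc:%lu\n", (unsigned long)Astr_GetAllocLength(hstr));
if(Astr_IsEmpty(hstr))
{
    /* the string holds no characters */
}
else
{
    /* the string holds at least one character */
}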
if(Astr_GetAllocLength(hstr) > Astr_GetLength(hstr))
{
Astr_FreeExtra(hstr);
assert(Astr_GetAllocLength(hstr) == Astr_GetLength(hstr));
}
Astr_Empty(hstr);
assert(Astr_Get(hstr) == NULL);
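Shrinking an oversized buffer usually comes down to a single realloc call. The following is a minimal sketch with hypothetical names (SketchStr, sketch_free_extra); unlike the assertion in the example above, it assumes the allocated size counts the terminating NUL, so the library's own bookkeeping may differ by one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchStr {        /* hypothetical internal state */
    char  *data;
    size_t data_length;           /* excludes the NUL                 */
    size_t alloc_length;          /* assumption: includes the NUL     */
} SketchStr;

/* Give back memory beyond what the current text needs.
   Returns 1 on success, 0 if realloc failed (string left unchanged). */
static int sketch_free_extra(SketchStr *s)
{
    size_t need;
    char *p;

    assert(s != NULL);
    if (s->data == NULL)
        return 1;                         /* nothing allocated */
    need = s->data_length + 1;
    if (need == s->alloc_length)
        return 1;                         /* already a tight fit */
    p = (char *)realloc(s->data, need);
    if (p == NULL)
        return 0;
    s->data = p;
    s->alloc_length = need;
    return 1;
}
```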
Assignment, Concatenation, Insertion, Deletion, Replacement
size_t i;
HSTRINGA hstr = Astr_Create();
Astr_Set(hstr, "Hello Dynamic C String!");
printf("%s\n", Astr_Get(hstr));
Astr_SetAt(hstr, 0, 'h');
Astr_SetAt(hstr, 6, 'd');
Astr_SetAt(hstr, 14, 'c');
Astr_SetAt(hstr, 16, 's');
printf("%s\n", Astr_Get(hstr));
for(i = 0; i < Astr_GetLength(hstr); i++)
{
printf("%c\n", Astr_GetAt(hstr, i));
}
Astr_Cat(hstr, " For all");
Astr_Cat(hstr, " your c coding");
Astr_Cat(hstr, " needs");
printf("%s\n", Astr_Get(hstr));
Astr_Insert(hstr, 6, "awesome ");
printf("%s\n", Astr_Get(hstr));
Astr_InsertCh(hstr, 6, '\'');
printf("%s\n", Astr_Get(hstr));
Astr_Replace(hstr, "C", "awesome C");
printf("%s\n", Astr_Get(hstr));
Astr_ReplaceCh(hstr, 'l', 'L');
printf("%s\n", Astr_Get(hstr));
Astr_Delete(hstr, 5, 7);
Astr_Destroy(hstr);
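Insertion is where the pointer arithmetic becomes visible: the tail of the string is shifted right with memmove and the new text is copied into the gap. Below is a minimal sketch under stated assumptions. SketchStr and sketch_insert are hypothetical names, and an index past the end is clamped to an append, which may not match what Astr_Insert does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchStr {        /* hypothetical internal state */
    char  *data;
    size_t data_length;
    size_t alloc_length;
} SketchStr;

/* Insert str at index, growing the buffer if needed.
   Returns 1 on success, 0 on allocation failure. */
static int sketch_insert(SketchStr *s, size_t index, const char *str)
{
    size_t add, need;
    int was_null;

    assert(s != NULL && str != NULL);
    if (index > s->data_length)
        index = s->data_length;           /* assumption: clamp to append */
    add = strlen(str);
    need = s->data_length + add + 1;
    was_null = (s->data == NULL);
    if (need > s->alloc_length) {
        char *p = (char *)realloc(s->data, need);
        if (p == NULL)
            return 0;
        if (was_null)
            p[0] = '\0';                  /* give the empty string a NUL */
        s->data = p;
        s->alloc_length = need;
    }
    /* Shift the tail (including its NUL) to open a gap of add bytes. */
    memmove(s->data + index + add, s->data + index,
            s->data_length - index + 1);
    memcpy(s->data + index, str, add);    /* fill the gap */
    s->data_length += add;
    return 1;
}
```

memmove is used for the shift because the source and destination ranges overlap; memcpy is only safe for the final copy into the gap.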
Case Conversions and Reversal
HSTRINGA hstr = Astr_Create();
Astr_Set(hstr, "String to reverse");
printf("%s\n", Astr_Get(hstr));
Astr_Reverse(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_Reverse(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_ToUpper(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_ToLower(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_Destroy(hstr);
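For readers curious about what such in-place operations involve, here is a minimal sketch on a plain char buffer. The names sketch_reverse and sketch_to_upper are hypothetical and the code is not taken from the library.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Reverse a NUL-terminated string in place by swapping from both ends. */
static void sketch_reverse(char *s)
{
    size_t i, j;
    char tmp;

    assert(s != NULL);
    j = strlen(s);
    if (j < 2)
        return;                           /* nothing to swap */
    for (i = 0, j = j - 1; i < j; i++, j--) {
        tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}

/* Convert to upper case in place. The cast to unsigned char avoids
   undefined behavior in toupper for negative char values. */
static void sketch_to_upper(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}
```

Neither operation changes the length, so no reallocation is needed for them.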
Trimming White Space
HSTRINGA hstr = Astr_Create();
Astr_Set(hstr, "String to trim ");
printf("%s\n", Astr_Get(hstr));
Astr_TrimRight(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_Set(hstr, " String to trim");
printf("%s\n", Astr_Get(hstr));
Astr_TrimLeft(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_Set(hstr, " String to trim ");
printf("%s\n", Astr_Get(hstr));
Astr_Trim(hstr);
printf("%s\n", Astr_Get(hstr));
Astr_Destroy(hstr);
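Trimming can be done in place without any allocation. The sketch below works on a plain char buffer; sketch_trim is a hypothetical name, and "white space" here means whatever isspace accepts, which may be broader or narrower than what the library treats as white space.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Remove leading and trailing white space in place.
   Returns the new length. */
static size_t sketch_trim(char *s)
{
    size_t start = 0, end;

    assert(s != NULL);
    end = strlen(s);
    while (end > 0 && isspace((unsigned char)s[end - 1]))
        end--;                            /* drop trailing white space */
    while (start < end && isspace((unsigned char)s[start]))
        start++;                          /* skip leading white space */
    memmove(s, s + start, end - start);   /* slide the kept part left */
    s[end - start] = '\0';
    return end - start;
}
```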
Search
size_t f;
HSTRINGA hstr = Astr_Create();
Astr_Set(hstr, "Hello Dynamic C String!");
printf("%s\n", Astr_Get(hstr));
f = Astr_Find(hstr, "String", 0);
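/* f is a size_t, so it is cast for printing; %d does not match size_t */
printf("\nfound at %lu\n", (unsigned long)f);
f = Astr_Find(hstr, "to", f);
printf("found at %lu\n", (unsigned long)f);
f = Astr_Find(hstr, "tr", f);
printf("found at %lu\n", (unsigned long)f);
f = Astr_Find(hstr, "nonexistent", 0);
printf("found at %lu\n", (unsigned long)f);
f = Astr_FindCh(hstr, 'r', 0);
printf("found at %lu\n", (unsigned long)f);
f = Astr_FindOneOf(hstr, "tuwxyz");
printf("found at %lu\n", (unsigned long)f);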
Astr_Destroy(hstr);
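An index-returning search can be layered on top of strstr by converting the resulting pointer back into an offset. The sketch below uses the hypothetical names sketch_find and SKETCH_NPOS; the article does not say what the library returns when nothing is found, so the (size_t)-1 sentinel is purely an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed "not found" marker; the library's actual value is unknown. */
#define SKETCH_NPOS ((size_t)-1)

/* Find sub in s, starting the search at index start.
   Returns the index of the first match or SKETCH_NPOS. */
static size_t sketch_find(const char *s, const char *sub, size_t start)
{
    const char *hit;

    assert(s != NULL && sub != NULL);
    if (start > strlen(s))
        return SKETCH_NPOS;               /* start lies past the end */
    hit = strstr(s + start, sub);
    return hit != NULL ? (size_t)(hit - s) : SKETCH_NPOS;
}
```

With a sentinel like this, the result of one search should be checked before it is reused as the start index of the next one.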
Extraction
Extraction functions are among the few whose return type is HSTRINGA. Any function that returns an HSTRINGA creates a new string object, so the result must be deallocated individually.
HSTRINGA h, hstr;
double d = 3.9876;
hstr = Astr_Create();
Astr_Format(hstr, "double %.*f", 10, d);
printf("%s\n", Astr_Get(hstr));
h = Astr_Mid(hstr, 6, 4);
printf("%s\n", Astr_Get(h));
Astr_Destroy(h);
h = Astr_Left(hstr, 8);
printf("%s\n", Astr_Get(h));
Astr_Destroy(h);
h = Astr_Right(hstr, 4);
printf("%s\n", Astr_Get(h));
Astr_Destroy(h);
Astr_Destroy(hstr);
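The reason every extracted piece needs its own cleanup is that it lives in a freshly allocated buffer. A minimal sketch on plain C strings, with the hypothetical names sketch_mid, sketch_left, and sketch_right and with out-of-range arguments clamped (an assumption, not documented library behavior):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of count characters starting at start.
   The caller owns the result and must free() it. */
static char *sketch_mid(const char *s, size_t start, size_t count)
{
    size_t len;
    char *out;

    assert(s != NULL);
    len = strlen(s);
    if (start > len)
        start = len;                      /* clamp the start index */
    if (count > len - start)
        count = len - start;              /* clamp the count */
    out = (char *)malloc(count + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s + start, count);
    out[count] = '\0';
    return out;
}

/* Left and right extraction expressed through sketch_mid. */
static char *sketch_left(const char *s, size_t count)
{
    return sketch_mid(s, 0, count);
}

static char *sketch_right(const char *s, size_t count)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    if (count > len)
        count = len;
    return sketch_mid(s, len - count, count);
}
```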
Formatting
An HSTRINGA object can be formatted with a printf-like call; the internal buffer grows as needed to hold the result.
Example of the formatting:
int i = 10;
double d = 3.9876;
float f = 3.14f;
__int64 i64 = 200; /* Microsoft-specific 64-bit integer type */
const char* s = "Test";
HSTRINGA hstr = Astr_Create();
/* %I64d is the Microsoft-specific conversion for __int64 */
Astr_Format(hstr, "int %d, double %f, float %f, int64: %I64d", i, d, f, i64);
printf("%s\n", Astr_Get(hstr));
Astr_Format(hstr, "string: %s, int %d, double %f, float %f", s, i, d, f);
printf("%s\n", Astr_Get(hstr));
Astr_Format(hstr, "double %.2f, float %.3f", d, f);
printf("%s\n", Astr_Get(hstr));
/* the precision (10) is passed as an argument through %.*f */
Astr_Format(hstr, "double %.*f", 10, d);
printf("%s\n", Astr_Get(hstr));
Astr_Destroy(hstr);
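One common way to size a buffer for printf-style output is to format twice: once to measure, once to write. The sketch below uses vsnprintf and va_copy, which are C99 features, so it is not how a C89 library such as this one would have done it; sketch_format is a hypothetical name and the code only illustrates the growth idea.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into a newly allocated buffer of exactly the right size.
   Returns NULL on failure; the caller must free() the result. */
static char *sketch_format(const char *fmt, ...)
{
    va_list ap, ap2;
    int n;
    char *buf;

    assert(fmt != NULL);
    va_start(ap, fmt);
    va_copy(ap2, ap);                     /* the list is consumed twice */
    n = vsnprintf(NULL, 0, fmt, ap);      /* pass 1: measure only */
    va_end(ap);
    if (n < 0) {
        va_end(ap2);
        return NULL;                      /* encoding or format error */
    }
    buf = (char *)malloc((size_t)n + 1);
    if (buf != NULL)
        vsnprintf(buf, (size_t)n + 1, fmt, ap2);  /* pass 2: write */
    va_end(ap2);
    return buf;
}
```

A string type that tracks its allocated size could reuse its existing buffer in the second pass whenever the measured length already fits.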
Using the Code
Add "str.c" to your project and include "str.h" in the source files that use the string.
Enjoy!
Points of Interest
The code shows how to:
- Hide C language structures from the user
- Manage memory at a low level
- Use pointer arithmetic
History
- 31st August, 2018. Original article
- 30th November, 2018. Fixed several typos in the example code
- 20th December, 2019. Fixed several parts of the article