Duck Three Ways: File I/O and String Performance Comparison Between C#, C++, and C

Different languages and approaches are evaluated for a file and string processing benchmark.
Introduction
I was interested in comparing the performance of different programming languages and different approaches to a basic programming problem. So I developed the same application in C#, C++, and C, and drew conclusions about which languages and approaches to use in different circumstances.
The Benchmark
The problem I came up with: load an input CSV file into an array of data structures, then use those data structures to write the data back out to an output CSV file. Your basic input/output data processing problem, without any data processing. In other words, a file I/O and data-structure serialization benchmark.
I chose UTF-16 (C#'s Encoding.Unicode) for the CSV text file, as C# is a Unicode language and C/C++ can work with Unicode data pretty well at this point. I chose to have seven data fields per line of CSV text, typical demographic stuff: first name, last name, address line 1, address line 2, city, state, and zip code. To keep things simple, each field is limited to 127 characters of Unicode text.
The CSV generation program (the "gen" project in the attached code) outputs 100,000 lines of this random CSV data to a file on the desktop.
// C# script to generate the data used by the test programs
using System.Text;
// These constants are shared with the C program that uses fixed-length fields
const int FIELD_LEN = 128;
const int FIELD_CHAR_LEN = FIELD_LEN - 1;
const int RECORD_COUNT = 100 * 1000;
var rnd = new Random(0); // same output each gen
string rnd_str()
{
StringBuilder sb = new StringBuilder(FIELD_CHAR_LEN);
int rnd_len = rnd.Next(1, FIELD_CHAR_LEN);
for (int r = 1; r <= rnd_len; ++r)
sb.Append((char)('A' + rnd.Next(0, 25))); // random letter (A-Y; Next's upper bound is exclusive)
return sb.ToString();
}
string output_file_path =
Path.Combine
(
Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
"recordio.csv"
);
if (output_file_path == null)
throw new NullReferenceException("output_file_path");
using (var output = new StreamWriter(output_file_path, false, Encoding.Unicode))
{
for (int r = 1; r <= RECORD_COUNT; ++r)
output.Write($"{rnd_str()},{rnd_str()},{rnd_str()},{rnd_str()},{rnd_str()},{rnd_str()},{rnd_str()}\n");
}
Let's Skip to the Punchline: What are the Results?
Here are the timings of the loading and writing phases of the different programs. Each program runs through the load/write cycle four times; I took the best load and best write times across all the runs for each program, the best of the best.
Approach / Language                  | Load (ms) | Write (ms)
-------------------------------------|-----------|-----------
Approach 1: C# with loops (net)      |       317 |        178
Approach 2: C# in bulk (net2)        |       223 |        353
Approach 3: C++ loops (class)        |     2,489 |      1,379
Approach 4: C batches (struct)       |       107 |        147
Approach 5: C++ batches (class2)     |       202 |        136
Approach 6: C/C++ batches (recordio) |       108 |        149
Conclusions and Points of Interest
The C# programs, loops and batch, are clean, easy to read, and perform well. The loops version uses StreamReader / StreamWriter and is intuitive and easy to develop and maintain. The batch version uses the File class functions ReadAllLines / WriteAllLines and is much faster than the loops version for reading, LINQ and all, but slower for writing. Given this, you'd use ReadAllLines / LINQ for loading and StreamWriter for writing.
The big news is that something is very wrong with Approach 3: C++ loops (class). It comes down to the std::getline call for loading and the stream output for writing; the rest of the code costs little. I'm interested in folks reproducing these numbers and reporting how to solve these performance problems.
The C batches program handily outruns the others at loading data because it uses fixed-size strings packed together in structs stored sequentially in one array, so the data locality is excellent. We're reading 90 MB of CSV in around 100 ms. Wow! For some reason, the C program is slower at writing the data to the output file; the code looks clean, so I'm not sure why.
Inspired by the C program, the "C++ batches" program does not come close to C at reading the data, but bests them all at writing it out. You would choose C for reading and C++ for writing.
Approach 6, recordio - rhymes with rodeo - is the hybrid of the C and C++ batch approaches in a reusable package. It yields 108 / 149, besting C#'s bests of 223 / 178. The difference in write performance is not significant, but the 2X speedup in load performance cannot be ignored. Fixed-length strings packed into a single vector of structs can't be beaten by any arrangement of C++ wstrings or C# strings. The code is clean, both the implementation and the test driver illustrating use of the reusable class; have at it.
My overall advice would be to do this sort of thing in C/C++, using structs with fixed-width fields (much as you might imagine rows are stored in a database), and to take advantage of recordio or something similar to get great file and string I/O.
If you're in C# code, use File.ReadAllLines and LINQ to load the data, and StreamWriter to write it back out, and get respectable performance with great productivity and security.
Here's more information about the different approaches.
Approach 1: C# with Loops
In the attached code, this is the "net" project. This might be the most intuitive of the languages and approaches. You have StreamReader slurp up the data, and StreamWriter write it all out:
// C# performance test program
using System.Diagnostics;
using System.Text;
var sw = Stopwatch.StartNew();
string input_file_path =
Path.Combine
(
Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
"recordio.csv"
);
if (input_file_path == null)
throw new NullReferenceException("input_file_path");
for (int iter = 1; iter <= 4; ++iter)
{
// Read lines into a list of objects
var nets = new List<info>();
using (var input = new StreamReader(input_file_path, Encoding.Unicode))
{
while (true)
{
string? line = input.ReadLine();
if (line == null)
break;
else
nets.Add(new info(line.Split(',')));
}
}
Console.WriteLine($".NET load took {sw.ElapsedMilliseconds} ms");
sw.Restart();
// Write the objects to an output CSV file
using (var output = new StreamWriter("output.csv", false, Encoding.Unicode))
{
foreach (var cur in nets)
output.Write($"{cur.firstname},{cur.lastname},{cur.address1},{cur.address2},{cur.city},{cur.state},{cur.zipcode}\n");
}
Console.WriteLine($".NET write took {sw.ElapsedMilliseconds} ms");
sw.Restart();
}
// NOTE: Using struct did not change performance, probably because the
// contents of the strings are not stored consecutively, so
// any data locality with the array of info objects is irrelevant
class info
{
public info(string[] parts)
{
firstname = parts[0];
lastname = parts[1];
address1 = parts[2];
address2 = parts[3];
city = parts[4];
state = parts[5];
zipcode = parts[6];
}
public string firstname;
public string lastname;
public string address1;
public string address2;
public string city;
public string state;
public string zipcode;
}
Approach 2: C# in Bulk
In the attached source, this is the "net2" project. You might say to yourself, "Loops are tedious. I can use the File class functions like ReadAllLines to do things en masse. In .NET I trust!" It's certainly less code...
// C# performance test program
using System.Diagnostics;
using System.Text;
var sw = Stopwatch.StartNew();
string input_file_path =
Path.Combine
(
Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
"recordio.csv"
);
if (input_file_path == null)
throw new NullReferenceException("input_file_path");
for (int iter = 1; iter <= 4; ++iter)
{
// Read CSV file into a list of objects
var nets =
File.ReadAllLines(input_file_path, Encoding.Unicode)
.Select(line => new info(line.Split(',')));
// NOTE: Select is lazily evaluated, so the per-line parsing actually runs
// during the enumerations below; append .ToList() to charge that cost to
// the load phase instead
Console.WriteLine($".NET 2 load took {sw.ElapsedMilliseconds} ms");
sw.Restart();
// Write the objects to an output CSV file
int count = nets.Count();
string[] strs = new string[count];
int idx = 0;
foreach (var cur in nets)
strs[idx++] = $"{cur.firstname},{cur.lastname},{cur.address1},{cur.address2},{cur.city},{cur.state},{cur.zipcode}";
// No trailing \n here: WriteAllLines appends a line terminator to each string
File.WriteAllLines("output.csv", strs, Encoding.Unicode);
Console.WriteLine($".NET 2 write took {sw.ElapsedMilliseconds} ms");
sw.Restart();
}
Approach 3: C++ Loops
C++ has come a long way when it comes to Unicode file I/O and streams. It is now easy to write C++ code that rivals C#'s Loopy Approach 1 for readability and simplicity:
// C++ loop performance test program
#include <codecvt>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
// FIELD_COUNT, FIELD_CHAR_LEN, and the timer helper are defined in the
// attached project's shared code
// Our record type, just a bunch of wstrings
struct info
{
info(const std::vector<std::wstring>& parts)
: firstname(parts[0])
, lastname(parts[1])
, address1(parts[2])
, address2(parts[3])
, city(parts[4])
, state(parts[5])
, zipcode(parts[6])
{
}
std::wstring firstname;
std::wstring lastname;
std::wstring address1;
std::wstring address2;
std::wstring city;
std::wstring state;
std::wstring zipcode;
};
// Split a string by a separator, returning a vector of substrings
std::vector<std::wstring> split(const std::wstring& str, const wchar_t separator)
{
std::vector<std::wstring> retVal;
retVal.reserve(FIELD_COUNT); // cheat...
std::wstring acc;
acc.reserve(FIELD_CHAR_LEN); // ...a little
for (wchar_t c : str)
{
if (c == separator)
{
retVal.push_back(acc);
acc.clear();
}
else
acc.push_back(c);
}
if (!acc.empty())
retVal.push_back(acc);
return retVal;
}
int main(int argc, char* argv[])
{
timer t;
for (int iter = 1; iter <= 4; ++iter)
{
// Read the file into a vector of line strings
std::vector<std::wstring> lines;
{
std::wifstream input(argv[1], std::ios::binary);
input.imbue(std::locale(input.getloc(), new std::codecvt_utf16<wchar_t,
0x10ffff, std::codecvt_mode(std::consume_header | std::little_endian)>));
if (!input)
{
std::cout << "Opening input file failed\n";
return 1;
}
std::wstring line;
while (std::getline(input, line))
lines.push_back(line);
}
// Process the lines into a vector of structs
std::vector<info> infos;
infos.reserve(lines.size());
for (const auto& line : lines)
infos.emplace_back(split(line, ','));
t.report("class load ");
// Write the structs to an output CSV file
{
std::wofstream output("output.csv", std::ios::binary);
output.imbue(std::locale(output.getloc(), new std::codecvt_utf16<wchar_t,
0x10ffff, std::codecvt_mode(std::generate_header | std::little_endian)>));
if (!output)
{
std::cout << "Opening output file failed\n";
return 1;
}
for (const auto& record : infos)
{
output
<< record.firstname << ','
<< record.lastname << ','
<< record.address1 << ','
<< record.address2 << ','
<< record.city << ','
<< record.state << ','
<< record.zipcode << '\n';
}
}
t.report("class write");
}
}
For the file load step, a quick timing check shows that all the time is spent in the std::getline() call. And for the file write step, all the time is spent in the output loop; where else would it be? This mystery is left as an exercise for the reader. What is wrong with this simple code?
Approach 4: C Batches
If we are willing to load the entire text into memory, then we can play tricks with slicing and dicing the data in place, taking advantage of fixed-length string buffers and character-level string operations that exploit data locality. What fun!
// C-style performance test program (compiled as C++: note std::max,
// references, and nullptr below)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <wchar.h>
#include <algorithm>
const size_t FIELD_LEN = 128;
const size_t FIELD_CHAR_LEN = FIELD_LEN - 1;
const size_t FIELD_COUNT = 7;
const size_t RECORD_LEN = std::max(FIELD_COUNT * FIELD_LEN + 1, size_t(1024));
// Struct with fixed char array fields
struct info
{
wchar_t firstname[FIELD_LEN];
wchar_t lastname[FIELD_LEN];
wchar_t address1[FIELD_LEN];
wchar_t address2[FIELD_LEN];
wchar_t city[FIELD_LEN];
wchar_t state[FIELD_LEN];
wchar_t zipcode[FIELD_LEN];
};
// Read a comma-delimited string out of a buffer
void read_str(const wchar_t*& input, wchar_t* output)
{
size_t copied = 0;
while (*input && *input != ',' && copied++ < FIELD_CHAR_LEN) // bound the copy (truncates overlong fields)
*output++ = *input++;
*output = '\0';
if (*input == ',')
++input;
}
// Initialize a record using a buffer of text
void set_record(info& record, const wchar_t* buffer)
{
read_str(buffer, record.firstname);
read_str(buffer, record.lastname);
read_str(buffer, record.address1);
read_str(buffer, record.address2);
read_str(buffer, record.city);
read_str(buffer, record.state);
read_str(buffer, record.zipcode);
}
// Output a record to a buffer of text
wchar_t* add_to_buffer(const wchar_t* input, wchar_t* output, wchar_t separator)
{
while (*input)
*output++ = *input++;
*output++ = separator;
return output;
}
int64_t output_record(const info& record, wchar_t* buffer)
{
const wchar_t* original = buffer;
buffer = add_to_buffer(record.firstname, buffer, ',');
buffer = add_to_buffer(record.lastname, buffer, ',');
buffer = add_to_buffer(record.address1, buffer, ',');
buffer = add_to_buffer(record.address2, buffer, ',');
buffer = add_to_buffer(record.city, buffer, ',');
buffer = add_to_buffer(record.state, buffer, ',');
buffer = add_to_buffer(record.zipcode, buffer, '\n');
return buffer - original;
}
int main(int argc, char* argv[])
{
timer t;
for (int iter = 1; iter <= 4; ++iter)
{
// Open input file
FILE* input_file = nullptr;
if (fopen_s(&input_file, argv[1], "rb") != 0)
{
printf("Opening input file failed\n");
return 1;
}
// Compute file length
fseek(input_file, 0, SEEK_END);
int file_len = ftell(input_file);
fseek(input_file, 0, SEEK_SET);
// Read file into memory
wchar_t* file_contents = (wchar_t*)malloc(file_len + 2);
if (file_contents == nullptr)
{
printf("Allocating input buffer failed\n");
return 1;
}
if (fread(file_contents, file_len, 1, input_file) != 1)
{
printf("Reading input file failed\n");
return 1;
}
size_t char_len = file_len / 2;
file_contents[char_len] = '\0';
fclose(input_file);
input_file = nullptr;
// Compute record count and delineate the line strings
size_t record_count = 0;
for (size_t idx = 0; idx < char_len; ++idx)
{
if (file_contents[idx] == '\n')
{
++record_count;
file_contents[idx] = '\0';
}
}
// Allocate record array
info* records = (info*)malloc(record_count * sizeof(info));
if (records == nullptr)
{
printf("Allocating records list failed\n");
return 1;
}
// Process memory text into records
wchar_t* cur_str = file_contents;
wchar_t* end_str = cur_str + file_len / 2;
size_t record_idx = 0;
while (cur_str < end_str)
{
set_record(records[record_idx++], cur_str);
cur_str += wcslen(cur_str) + 1;
}
if (record_idx != record_count)
{
printf("Record counts differ: idx: %d - count: %d\n",
(int)record_idx, (int)record_count);
return 1;
}
t.report("struct load ");
// Write output file
wchar_t* file_output = (wchar_t*)malloc(record_count * RECORD_LEN * sizeof(wchar_t)); // RECORD_LEN is in characters, not bytes
if (file_output == nullptr)
{
printf("Allocating file output buffer failed\n");
return 1;
}
size_t output_len = 0;
for (size_t r = 0; r < record_count; ++r)
{
int new_output = output_record(records[r], file_output + output_len);
if (new_output < 0)
{
printf("Writing to output buffer failed\n");
return 1;
}
output_len += new_output;
}
FILE* output_file = nullptr;
if (fopen_s(&output_file, "output.csv", "wb") != 0)
{
printf("Opening output file failed\n");
return 1;
}
if (fwrite(file_output, output_len * 2, 1, output_file) != 1)
{
printf("Writing output file failed\n");
return 1;
}
fclose(output_file);
output_file = nullptr;
t.report("struct write");
// Clean up
free(file_contents);
file_contents = nullptr;
free(records);
records = nullptr;
}
return 0;
}
Approach 5: C++ Batches
Perhaps that pile of C leaves a bad taste in your mouth. Could we apply the same batch approach in C++?
// C++ batch test program
#include <codecvt>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
// Our record type, just a bunch of wstrings
struct info
{
info(const std::vector<wchar_t*>& parts)
: firstname(parts[0])
, lastname(parts[1])
, address1(parts[2])
, address2(parts[3])
, city(parts[4])
, state(parts[5])
, zipcode(parts[6])
{
}
std::wstring firstname;
std::wstring lastname;
std::wstring address1;
std::wstring address2;
std::wstring city;
std::wstring state;
std::wstring zipcode;
};
void parse_parts(wchar_t* buffer, wchar_t separator, std::vector<wchar_t*>& ret_val)
{
ret_val.clear();
ret_val.push_back(buffer); // start at the beginning
while (*buffer)
{
if (*buffer == separator)
{
*buffer = '\0';
if (*(buffer + 1))
ret_val.push_back(buffer + 1);
}
++buffer;
}
}
int main(int argc, char* argv[])
{
timer t;
for (int iter = 1; iter <= 4; ++iter)
{
// Read the file into memory
std::vector<char> file_contents;
{
std::ifstream file(argv[1], std::ios::binary | std::ios::ate);
std::streamsize size = file.tellg();
file.seekg(0, std::ios::beg);
file_contents.resize(size + 2); // save room for null termination
if (!file.read(file_contents.data(), size))
{
std::cout << "Reading file failed\n";
return 1;
}
// the two extra zero bytes from resize() above already null-terminate
}
// Get the lines out of the data
std::vector<wchar_t*> line_pointers;
parse_parts(reinterpret_cast<wchar_t*>
(file_contents.data()), '\n', line_pointers);
// Process the lines into data structures
std::vector<info> infos;
infos.reserve(line_pointers.size());
std::vector<wchar_t*> line_parts;
for (wchar_t* line : line_pointers)
{
parse_parts(line, ',', line_parts);
infos.emplace_back(line_parts);
}
t.report("C++ 2 load");
// Write the structs to an output CSV file
std::wstring output_str;
output_str.reserve(file_contents.size() / 2);
for (const auto& record : infos)
{
output_str += record.firstname;
output_str += ',';
output_str += record.lastname;
output_str += ',';
output_str += record.address1;
output_str += ',';
output_str += record.address2;
output_str += ',';
output_str += record.city;
output_str += ',';
output_str += record.state;
output_str += ',';
output_str += record.zipcode;
output_str += '\n';
}
std::ofstream output_file("output.csv", std::ios::binary);
if (!output_file)
{
std::cout << "Opening output file failed\n";
return 1;
}
output_file.write(reinterpret_cast<const char*>
(output_str.c_str()), output_str.size() * 2);
output_file.close();
t.report("C++ 2 write");
}
}
Approach 6: C/C++ Hybrid
Let's take the best from approaches 4 and 5, and make a reusable class that has a prayer of real-world usability.
First, the reusable class. It is templated on the record type, so the record class can define its own field and record length limits:
// C/C++ hybrid file / string I/O class
namespace recordio
{
template<class record>
class lineio
{
public:
static void load(const char* inputFilePath, std::vector<record>& records)
{
// Initialize our output
records.clear();
// Open input file
FILE* input_file = nullptr;
if (fopen_s(&input_file, inputFilePath, "rb") != 0)
{
throw std::runtime_error("Opening input file failed");
}
// Compute file length
fseek(input_file, 0, SEEK_END);
int file_len = ftell(input_file);
fseek(input_file, 0, SEEK_SET);
// Read file into memory
size_t char_len = file_len / 2;
std::unique_ptr<wchar_t[]> file_contents(new wchar_t[file_len / 2 + 1]);
if (fread(reinterpret_cast<void*>
(file_contents.get()), file_len, 1, input_file) != 1)
{
throw std::runtime_error("Reading input file failed");
}
file_contents[char_len] = '\0';
fclose(input_file);
input_file = nullptr;
// Compute record count and delineate the line strings
size_t record_count = 0;
for (size_t idx = 0; idx < char_len; ++idx)
{
if (file_contents[idx] == '\n')
{
++record_count;
file_contents[idx] = '\0';
}
}
records.reserve(record_count);
// Process memory text into records
wchar_t* cur_str = file_contents.get();
wchar_t* end_str = cur_str + file_len / 2;
while (cur_str < end_str)
{
records.emplace_back(cur_str);
cur_str += wcslen(cur_str) + 1;
}
}
static void write(const char* outputFilePath, const std::vector<record>& records)
{
std::wstring output_str;
output_str.reserve(record::max_record_length * records.size());
for (const auto& cur : records)
{
cur.get_record_str(output_str);
output_str += '\n';
}
// Write output file
std::ofstream output_file(outputFilePath, std::ios::binary);
if (!output_file)
{
throw std::runtime_error("Opening output file failed");
}
output_file.write(reinterpret_cast<const char*>
(output_str.c_str()), output_str.size() * 2);
}

// read_str helper and the rest of the class elided here; see the attached source
};
}
The "derived" record types have a little work to do, but it's pretty minimal:
// C/C++ hybrid class and test driver
// Struct with fixed char array fields
// Looks familiar...
struct info
{
info() {}
info(const wchar_t* str)
{
recordio::lineio<info>::read_str(str, firstname);
recordio::lineio<info>::read_str(str, lastname);
recordio::lineio<info>::read_str(str, address1);
recordio::lineio<info>::read_str(str, address2);
recordio::lineio<info>::read_str(str, city);
recordio::lineio<info>::read_str(str, state);
recordio::lineio<info>::read_str(str, zipcode);
}
void get_record_str(std::wstring& str) const
{
str += firstname;
str += ',';
str += lastname;
str += ',';
str += address1;
str += ',';
str += address2;
str += ',';
str += city;
str += ',';
str += state;
str += ',';
str += zipcode;
}
const static size_t max_field_length = FIELD_LEN;
const static size_t max_record_length = RECORD_LEN;
wchar_t firstname[FIELD_LEN];
wchar_t lastname[FIELD_LEN];
wchar_t address1[FIELD_LEN];
wchar_t address2[FIELD_LEN];
wchar_t city[FIELD_LEN];
wchar_t state[FIELD_LEN];
wchar_t zipcode[FIELD_LEN];
};
int main(int argc, char* argv[])
{
if (argc != 2)
{
printf("Usage: recordio <input CSV file path>\n");
return 0;
}
timer t;
for (int iter = 1; iter <= 4; ++iter)
{
std::vector<info> records;
recordio::lineio<info>::load(argv[1], records);
t.report("recordio load ");
recordio::lineio<info>::write("output.csv", records);
t.report("recordio write ");
}
printf("All done.\n");
return 0;
}
And that's it! Looking forward to the comments!
History
- 25th September, 2022: Initial version