Many C++ developers who are accustomed to the << and >> operators on text streams miss them on binary streams. Simplistic Binary Stream is nothing more than a bare-bones wrapper over the read and write functions of the STL fstream. Readers may compare it to serialization libraries such as Boost Serialization and MFC Serialization because of the seemingly similar << and >> overloads. Simplistic Binary Stream is not a serialization library: it does not handle versioning, backward/forward compatibility or endianness correctness, and leaves all of that to the developer. Developers who used Boost Serialization in the past may still remember being bitten when files written by versions 1.42-1.44 were rendered unreadable by newer versions. Using a serialization library puts your file format in the hands of a third party outside your control. While Simplistic Binary Stream offers none of the conveniences of a serialization library, it puts the developer in the driver's seat over the file format.
Anyone using the library to read or write a file format is advised to implement another layer above it. In this article, we will look at the usage before looking at the source code. Simplistic Binary Stream comes in two flavors: file and memory streams. The file stream encapsulates the STL fstream, while the memory stream uses an STL vector<char> to hold the data in memory. Developers can use the memory stream to parse, in memory, files downloaded from the network.
The examples of writing and then reading are similar for memory and file streams, except that we flush and close the output file stream before reading it back with the input stream.
#include <iostream>
#include "MiniBinStream.h"
void TestMem()
{
simple::mem_ostream out;
out << 23 << 24 << "Hello world!";
simple::mem_istream in(out.get_internal_vec());
int num1 = 0, num2 = 0;
std::string str;
in >> num1 >> num2 >> str;
std::cout << num1 << "," << num2 << "," << str << std::endl;
}
void TestFile()
{
simple::file_ostream out("file.bin", std::ios_base::out | std::ios_base::binary);
out << 23 << 24 << "Hello world!";
out.flush();
out.close();
simple::file_istream in("file.bin", std::ios_base::in | std::ios_base::binary);
int num1 = 0, num2 = 0;
std::string str;
in >> num1 >> num2 >> str;
std::cout << num1 << "," << num2 << "," << str << std::endl;
}
The output is the same for both:
23,24,Hello world!
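To make the resulting byte layout concrete: judging from the header listed later in this article, an int is written as its raw sizeof(int) bytes, and a string as an int length followed by its characters, with no terminator. The sketch below reproduces that layout with plain standard-library calls. The helpers append_int, append_string, build_example and int_at are hypothetical names used only for illustration; they are not part of the library.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: append the raw bytes of an int to the buffer.
void append_int(std::vector<char>& buf, int value)
{
    const char* p = reinterpret_cast<const char*>(&value);
    buf.insert(buf.end(), p, p + sizeof(value));
}

// Hypothetical helper: append an int length prefix, then the characters.
void append_string(std::vector<char>& buf, const std::string& s)
{
    append_int(buf, static_cast<int>(s.size()));
    buf.insert(buf.end(), s.begin(), s.end());
}

// Builds the byte sequence that corresponds to: out << 23 << 24 << "Hello world!";
std::vector<char> build_example()
{
    std::vector<char> buf;
    append_int(buf, 23);
    append_int(buf, 24);
    append_string(buf, "Hello world!");
    return buf;
}

// Reads an int back from a byte offset; the caller must keep the offset in range.
int int_at(const std::vector<char>& buf, std::size_t offset)
{
    int value = 0;
    std::memcpy(&value, &buf[offset], sizeof(value));
    return value;
}
```

With a 4-byte int, the buffer is 24 bytes long: two ints, a length of 12, then the 12 characters. The bytes are in the byte order of the machine that wrote them.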
Say we have a Product structure. We can overload the stream operators for it as shown below:
#include <vector>
#include <string>
#include "MiniBinStream.h"
struct Product
{
Product() : product_name(""), price(0.0f), qty(0) {}
Product(const std::string& name,
float _price, int _qty) : product_name(name), price(_price), qty(_qty) {}
std::string product_name;
float price;
int qty;
};
simple::mem_istream& operator >> (simple::mem_istream& istm, Product& val)
{
return istm >> val.product_name >> val.price >> val.qty;
}
simple::file_istream& operator >> (simple::file_istream& istm, Product& val)
{
return istm >> val.product_name >> val.price >> val.qty;
}
simple::mem_ostream& operator << (simple::mem_ostream& ostm, const Product& val)
{
return ostm << val.product_name << val.price << val.qty;
}
simple::file_ostream& operator << (simple::file_ostream& ostm, const Product& val)
{
return ostm << val.product_name << val.price << val.qty;
}
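The four overloads above repeat the same two bodies. One possible way to write each body only once is a helper function template parameterized on the stream type, with the operators reduced to one-line forwarders. The sketch below is self-contained, so it uses two stand-in stream types (sink_a and sink_b) instead of the library's classes; those stand-ins, append and write_product are hypothetical names. Whether the same helper drops in unchanged with simple::mem_ostream and simple::file_ostream is an assumption that should be checked against the header.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Product
{
    std::string product_name;
    float price;
    int qty;
};

// Stand-ins for two stream classes with no common base class (hypothetical;
// sink<0> and sink<1> are distinct, unrelated types).
template<int Tag>
struct sink
{
    std::vector<char> bytes;
};
typedef sink<0> sink_a;
typedef sink<1> sink_b;

// Append n raw bytes starting at p.
inline void append(std::vector<char>& bytes, const void* p, std::size_t n)
{
    const char* c = static_cast<const char*>(p);
    bytes.insert(bytes.end(), c, c + n);
}

template<int Tag>
sink<Tag>& operator<<(sink<Tag>& s, int v)
{
    append(s.bytes, &v, sizeof(v));
    return s;
}

template<int Tag>
sink<Tag>& operator<<(sink<Tag>& s, float v)
{
    append(s.bytes, &v, sizeof(v));
    return s;
}

template<int Tag>
sink<Tag>& operator<<(sink<Tag>& s, const std::string& v)
{
    s << static_cast<int>(v.size());      // length prefix
    append(s.bytes, v.data(), v.size());  // characters, no terminator
    return s;
}

// The member-by-member logic, written once for any stream type.
template<typename OStream>
OStream& write_product(OStream& ostm, const Product& val)
{
    return ostm << val.product_name << val.price << val.qty;
}

// Each operator overload becomes a one-line forwarder.
inline sink_a& operator<<(sink_a& s, const Product& p) { return write_product(s, p); }
inline sink_b& operator<<(sink_b& s, const Product& p) { return write_product(s, p); }
```

The stream type is resolved at compile time, so no common base class or virtual function is needed. The same pattern applies to the reading side with a read_product helper.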
The reader will notice that we overload the operators for memory and file streams with identical code. That is unfortunate, and it happens because the two types of streams are not derived from a common base class. Even if they were, it would not help, because the write and read functions are member function templates, and a member function template cannot be virtual: templates are instantiated at compile time, while virtual dispatch is resolved at runtime. If the struct contains only fundamental types and the developer can pack its members with no padding, as shown below, then the whole struct can be written or read in one go instead of processing the members one by one.
#if defined(__linux__)
#pragma pack(push)
#pragma pack(1)
// Your struct declaration here.
#pragma pack(pop)
#endif
#if defined(WIN32)
#pragma warning(disable:4103)
#pragma pack(push,1)
// Your struct declaration here.
#pragma pack(pop)
#endif
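As a sketch of the one-go approach, the block below packs a small struct, verifies at compile time that it has no padding, and copies it to and from a byte buffer as a single unit. The Header struct and the to_bytes and from_bytes helpers are hypothetical, and the static_assert assumes a C++11 compiler. The form #pragma pack(push, 1) is accepted by GCC, Clang and Visual C++.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

#pragma pack(push, 1)
struct Header  // hypothetical example struct with fundamental members only
{
    std::uint8_t  version;
    std::uint32_t count;
    std::uint16_t flags;
};
#pragma pack(pop)

// 1 + 4 + 2 bytes; compilation fails if the compiler inserted any padding.
static_assert(sizeof(Header) == 7, "Header must be packed without padding");

// Copy the whole struct into a byte buffer in one go.
std::vector<char> to_bytes(const Header& h)
{
    std::vector<char> buf(sizeof(Header));
    std::memcpy(&buf[0], &h, sizeof(Header));
    return buf;
}

// Copy the whole struct back out, after checking that enough bytes are present.
Header from_bytes(const std::vector<char>& buf)
{
    if (buf.size() < sizeof(Header))
        throw std::runtime_error("Not enough bytes for Header");
    Header h;
    std::memcpy(&h, &buf[0], sizeof(Header));
    return h;
}
```

This only works for structs without pointers, std::string members or virtual functions, and the bytes remain in the byte order of the writing machine.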
Next, we overload the operators for writing and reading a vector of Product, and add helpers for printing it on the console. Rule of thumb: never write a size_t to the stream, because its size depends on the platform (32-bit or 64-bit).
simple::mem_istream& operator >> (simple::mem_istream& istm, std::vector<Product>& vec)
{
int size=0;
istm >> size;
if(size<=0)
return istm;
for(int i=0; i<size; ++i)
{
Product product;
istm >> product;
vec.push_back(product);
}
return istm;
}
simple::file_istream& operator >> (simple::file_istream& istm, std::vector<Product>& vec)
{
int size=0;
istm >> size;
if(size<=0)
return istm;
for(int i=0; i<size; ++i)
{
Product product;
istm >> product;
vec.push_back(product);
}
return istm;
}
simple::mem_ostream& operator << (simple::mem_ostream& ostm, const std::vector<Product>& vec)
{
int size = vec.size();
ostm << size;
for(size_t i=0; i<vec.size(); ++i)
{
ostm << vec[i];
}
return ostm;
}
simple::file_ostream& operator << (simple::file_ostream& ostm, const std::vector<Product>& vec)
{
int size = vec.size();
ostm << size;
for(size_t i=0; i<vec.size(); ++i)
{
ostm << vec[i];
}
return ostm;
}
void print_product(const Product& product)
{
using namespace std;
cout << "Product:" << product.product_name << ", Price:" << product.price << ", Qty:" << product.qty << endl;
}
void print_products(const std::vector<Product>& vec)
{
for(size_t i=0; i<vec.size() ; ++i)
print_product(vec[i]);
}
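The rule of thumb about size_t can be taken one step further with the fixed-width types from <cstdint>, which have the same size on every platform. The sketch below stores an element count as exactly four bytes. It assumes a C++11 compiler, and append_count and read_count are hypothetical helper names, not part of the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: store a count as exactly 4 bytes on every platform.
void append_count(std::vector<char>& buf, std::size_t count)
{
    if (count > std::numeric_limits<std::uint32_t>::max())
        throw std::length_error("count does not fit in 32 bits");
    const std::uint32_t fixed = static_cast<std::uint32_t>(count);
    const char* p = reinterpret_cast<const char*>(&fixed);
    buf.insert(buf.end(), p, p + sizeof(fixed));
}

// Hypothetical helper: read the 4-byte count back, with a bounds check.
std::size_t read_count(const std::vector<char>& buf, std::size_t offset)
{
    std::uint32_t fixed = 0;
    if (offset > buf.size() || buf.size() - offset < sizeof(fixed))
        throw std::out_of_range("not enough bytes for a count");
    std::memcpy(&fixed, &buf[offset], sizeof(fixed));
    return fixed;
}
```

The range check on the writing side makes the narrowing from size_t explicit instead of silently truncating a very large container size.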
We test the overloaded operators for Product
using the code below:
void TestMemCustomOperatorsOnVec()
{
std::vector<Product> vec_src;
vec_src.push_back(Product("Book", 10.0f, 50));
vec_src.push_back(Product("Phone", 25.0f, 20));
vec_src.push_back(Product("Pillow", 8.0f, 10));
simple::mem_ostream out;
out << vec_src;
simple::mem_istream in(out.get_internal_vec());
std::vector<Product> vec_dest;
in >> vec_dest;
print_products(vec_dest);
}
void TestFileCustomOperatorsOnVec()
{
std::vector<Product> vec_src;
vec_src.push_back(Product("Book", 10.0f, 50));
vec_src.push_back(Product("Phone", 25.0f, 20));
vec_src.push_back(Product("Pillow", 8.0f, 10));
simple::file_ostream out("file.bin", std::ios_base::out | std::ios_base::binary);
out << vec_src;
out.flush();
out.close();
simple::file_istream in("file.bin", std::ios_base::in | std::ios_base::binary);
std::vector<Product> vec_dest;
in >> vec_dest;
print_products(vec_dest);
}
The output is as follows:
Product:Book, Price:10, Qty:50
Product:Phone, Price:25, Qty:20
Product:Pillow, Price:8, Qty:10
All the source code is in a single header file: just include MiniBinStream.h to use the stream classes. The classes do not use any C++11/14 features and have been tested on VS2008, GCC 4.4 and Clang 3.2. Each class is a thin wrapper over fstream (or over vector<char> for the memory streams), so the code should be largely self-explanatory.
#ifndef MiniBinStream_H
#define MiniBinStream_H
#include <fstream>
#include <vector>
#include <string>
#include <cstring>
#include <stdexcept>
#include <iostream>
namespace simple
{
class file_istream
{
public:
file_istream() {}
file_istream(const char * file, std::ios_base::openmode mode)
{
open(file, mode);
}
void open(const char * file, std::ios_base::openmode mode)
{
m_istm.open(file, mode);
}
void close()
{
m_istm.close();
}
bool is_open()
{
return m_istm.is_open();
}
bool eof() const
{
return m_istm.eof();
}
std::ifstream::pos_type tellg()
{
return m_istm.tellg();
}
void seekg (std::streampos pos)
{
m_istm.seekg(pos);
}
void seekg (std::streamoff offset, std::ios_base::seekdir way)
{
m_istm.seekg(offset, way);
}
template<typename T>
void read(T& t)
{
if(m_istm.read(reinterpret_cast<char*>(&t), sizeof(T)).bad())
{
throw std::runtime_error("Read Error!");
}
}
void read(char* p, size_t size)
{
if(m_istm.read(p, size).bad())
{
throw std::runtime_error("Read Error!");
}
}
private:
std::ifstream m_istm;
};
template<>
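// inline: a full specialization defined in a header is not implicitly inline,
// so without it, including this header in two .cpp files causes duplicate symbols.
inline void file_istream::read(std::vector<char>& vec)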
{
if(m_istm.read(reinterpret_cast<char*>(&vec[0]), vec.size()).bad())
{
throw std::runtime_error("Read Error!");
}
}
template<typename T>
file_istream& operator >> (file_istream& istm, T& val)
{
istm.read(val);
return istm;
}
template<>
inline file_istream& operator >> (file_istream& istm, std::string& val)
{
int size = 0;
istm.read(size);
if(size<=0)
return istm;
std::vector<char> vec((size_t)size);
istm.read(vec);
val.assign(&vec[0], (size_t)size);
return istm;
}
class mem_istream
{
public:
mem_istream() : m_index(0) {}
mem_istream(const char * mem, size_t size)
{
open(mem, size);
}
mem_istream(const std::vector<char>& vec)
{
m_index = 0;
m_vec.clear();
m_vec.reserve(vec.size());
m_vec.assign(vec.begin(), vec.end());
}
void open(const char * mem, size_t size)
{
m_index = 0;
m_vec.clear();
m_vec.reserve(size);
m_vec.assign(mem, mem + size);
}
void close()
{
m_vec.clear();
}
bool eof() const
{
return m_index >= m_vec.size();
}
std::ifstream::pos_type tellg()
{
return m_index;
}
bool seekg (size_t pos)
{
if(pos<m_vec.size())
m_index = pos;
else
return false;
return true;
}
bool seekg (std::streamoff offset, std::ios_base::seekdir way)
{
if(way==std::ios_base::beg && offset < m_vec.size())
m_index = offset;
else if(way==std::ios_base::cur && (m_index + offset) < m_vec.size())
m_index += offset;
else if(way==std::ios_base::end && (m_vec.size() + offset) < m_vec.size())
m_index = m_vec.size() + offset;
else
return false;
return true;
}
const std::vector<char>& get_internal_vec()
{
return m_vec;
}
template<typename T>
void read(T& t)
{
if(eof())
throw std::runtime_error("Premature end of array!");
if((m_index + sizeof(T)) > m_vec.size())
throw std::runtime_error("Premature end of array!");
std::memcpy(reinterpret_cast<void*>(&t), &m_vec[m_index], sizeof(T));
m_index += sizeof(T);
}
void read(char* p, size_t size)
{
if(eof())
throw std::runtime_error("Premature end of array!");
if((m_index + size) > m_vec.size())
throw std::runtime_error("Premature end of array!");
std::memcpy(reinterpret_cast<void*>(p), &m_vec[m_index], size);
m_index += size;
}
void read(std::string& str, const unsigned int size)
{
if (eof())
throw std::runtime_error("Premature end of array!");
// Check against the requested size, not the string's current length.
if ((m_index + size) > m_vec.size())
throw std::runtime_error("Premature end of array!");
str.assign(&m_vec[m_index], size);
m_index += str.size();
}
private:
std::vector<char> m_vec;
size_t m_index;
};
template<>
inline void mem_istream::read(std::vector<char>& vec)
{
if(eof())
throw std::runtime_error("Premature end of array!");
if((m_index + vec.size()) > m_vec.size())
throw std::runtime_error("Premature end of array!");
std::memcpy(reinterpret_cast<void*>(&vec[0]), &m_vec[m_index], vec.size());
m_index += vec.size();
}
template<typename T>
mem_istream& operator >> (mem_istream& istm, T& val)
{
istm.read(val);
return istm;
}
template<>
inline mem_istream& operator >> (mem_istream& istm, std::string& val)
{
int size = 0;
istm.read(size);
if(size<=0)
return istm;
istm.read(val, size);
return istm;
}
class file_ostream
{
public:
file_ostream() {}
file_ostream(const char * file, std::ios_base::openmode mode)
{
open(file, mode);
}
void open(const char * file, std::ios_base::openmode mode)
{
m_ostm.open(file, mode);
}
void flush()
{
m_ostm.flush();
}
void close()
{
m_ostm.close();
}
bool is_open()
{
return m_ostm.is_open();
}
template<typename T>
void write(const T& t)
{
m_ostm.write(reinterpret_cast<const char*>(&t), sizeof(T));
}
void write(const char* p, size_t size)
{
m_ostm.write(p, size);
}
private:
std::ofstream m_ostm;
};
template<>
inline void file_ostream::write(const std::vector<char>& vec)
{
m_ostm.write(reinterpret_cast<const char*>(&vec[0]), vec.size());
}
template<typename T>
file_ostream& operator << (file_ostream& ostm, const T& val)
{
ostm.write(val);
return ostm;
}
template<>
inline file_ostream& operator << (file_ostream& ostm, const std::string& val)
{
int size = val.size();
ostm.write(size);
if(val.size()<=0)
return ostm;
ostm.write(val.c_str(), val.size());
return ostm;
}
inline file_ostream& operator << (file_ostream& ostm, const char* val)
{
int size = std::strlen(val);
ostm.write(size);
if(size<=0)
return ostm;
ostm.write(val, size);
return ostm;
}
class mem_ostream
{
public:
mem_ostream() {}
void close()
{
m_vec.clear();
}
const std::vector<char>& get_internal_vec()
{
return m_vec;
}
template<typename T>
void write(const T& t)
{
std::vector<char> vec(sizeof(T));
std::memcpy(reinterpret_cast<void*>(&vec[0]), reinterpret_cast<const void*>(&t), sizeof(T));
write(vec);
}
void write(const char* p, size_t size)
{
for(size_t i=0; i<size; ++i)
m_vec.push_back(p[i]);
}
private:
std::vector<char> m_vec;
};
template<>
inline void mem_ostream::write(const std::vector<char>& vec)
{
m_vec.insert(m_vec.end(), vec.begin(), vec.end());
}
template<typename T>
mem_ostream& operator << (mem_ostream& ostm, const T& val)
{
ostm.write(val);
return ostm;
}
template<>
inline mem_ostream& operator << (mem_ostream& ostm, const std::string& val)
{
int size = val.size();
ostm.write(size);
if(val.size()<=0)
return ostm;
ostm.write(val.c_str(), val.size());
return ostm;
}
inline mem_ostream& operator << (mem_ostream& ostm, const char* val)
{
int size = std::strlen(val);
ostm.write(size);
if(size<=0)
return ostm;
ostm.write(val, size);
return ostm;
}
}
#endif // MiniBinStream_H
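One detail of the listing is worth knowing when reading files: file_istream::read throws only when bad() is set. When a standard stream runs out of data in the middle of a read, it sets failbit and eofbit but not badbit, so a truncated file does not raise the exception. The sketch below demonstrates this standard-library behavior with std::istringstream; read_state and try_read are hypothetical names used for illustration.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Stream state after trying to read `wanted` bytes from `available`.
struct read_state
{
    bool bad;
    bool fail;
    bool eof;
    std::streamsize got;
};

read_state try_read(const std::string& available, std::size_t wanted)
{
    std::istringstream in(available, std::ios_base::in | std::ios_base::binary);
    std::string buf(wanted, '\0');
    if (wanted > 0)
        in.read(&buf[0], static_cast<std::streamsize>(wanted));

    read_state st;
    st.bad = in.bad();      // unrecoverable stream error
    st.fail = in.fail();    // true for a short read
    st.eof = in.eof();      // true when the data ran out
    st.got = in.gcount();   // bytes actually extracted
    return st;
}
```

Callers that need to detect truncated input may therefore want to check eof() after reading, or compare the number of bytes obtained with the number requested.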
The current version requires C++11, and the stream classes are now templates:
template<typename same_endian_type>
class file_istream {...}
template<typename same_endian_type>
class mem_istream {...}
template<typename same_endian_type>
class ptr_istream {...}
template<typename same_endian_type>
class file_ostream {...}
template<typename same_endian_type>
class mem_ostream {...}
How do we pass same_endian_type to the class? Use std::is_same<> on the endianness of the data and the endianness of the platform.
using same_endian_type = std::is_same<simple::BigEndian, simple::LittleEndian>;
simple::mem_ostream<same_endian_type> out;
out << (int64_t)23 << (int64_t)24 << "Hello world!";
simple::ptr_istream<same_endian_type> in(out.get_internal_vec());
int64_t num1 = 0, num2 = 0;
std::string str;
in >> num1 >> num2 >> str;
cout << num1 << "," << num2 << "," << str << endl;
If your data and platform always share the same endianness, you can skip the test by specifying std::true_type
directly.
simple::mem_ostream<std::true_type> out;
out << (int64_t)23 << (int64_t)24 << "Hello world!";
simple::ptr_istream<std::true_type> in(out.get_internal_vec());
int64_t num1 = 0, num2 = 0;
std::string str;
in >> num1 >> num2 >> str;
cout << num1 << "," << num2 << "," << str << endl;
Advantages of compile-time check
- For same_endian_type = true_type, the swap function is an empty function which is optimised away.
- For same_endian_type = false_type, the swapping is done without any prior runtime check cost.
Disadvantages of compile-time check
- Cannot parse files or data whose endianness varies from one input to the next. I believe this scenario is rare.
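The compile-time selection described above can be sketched with tag dispatch: the result type of std::is_same derives from std::true_type or std::false_type, and overload resolution picks the matching function with no runtime branch. This is a simplified illustration restricted to 32-bit values, not the library's implementation; reverse32, convert and load are hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

enum class Endian { Big, Little };
using BigEndian = std::integral_constant<Endian, Endian::Big>;
using LittleEndian = std::integral_constant<Endian, Endian::Little>;

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t reverse32(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Same endianness: return the value untouched.
inline std::uint32_t convert(std::uint32_t v, std::true_type) { return v; }

// Different endianness: always swap, without a runtime check.
inline std::uint32_t convert(std::uint32_t v, std::false_type) { return reverse32(v); }

// The tag type is chosen by the caller at compile time.
template<typename same_endian_type>
std::uint32_t load(std::uint32_t raw)
{
    return convert(raw, same_endian_type());
}
```

For example, load<std::is_same<BigEndian, LittleEndian>>(0x01020304u) selects the swapping overload, while load<std::true_type>(0x01020304u) returns its argument unchanged.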
Swap functions are listed below:
enum class Endian
{
Big,
Little
};
using BigEndian = std::integral_constant<Endian, Endian::Big>;
using LittleEndian = std::integral_constant<Endian, Endian::Little>;
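// Declared first so that the functions below can find it.
// Primary template: sizes without a specialization (e.g. 1 byte) are left untouched.
template<typename T, size_t N>
struct swap_endian
{
void operator()(T& ui)
{
}
};
// Non-integral types are left untouched.
template<typename T>
void swap_if_integral(T& val, std::false_type)
{
}
template<typename T>
void swap_if_integral(T& val, std::true_type)
{
swap_endian<T, sizeof(T)>()(val);
}
// Same endianness: nothing to do.
template<typename T>
void swap(T& val, std::true_type)
{
}
// Different endianness: swap integral types only.
template<typename T>
void swap(T& val, std::false_type)
{
std::is_integral<T> is_integral_type;
swap_if_integral(val, is_integral_type);
}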
template<typename T>
struct swap_endian<T, 8>
{
void operator()(T& ui)
{
union EightBytes
{
T ui;
uint8_t arr[8];
};
EightBytes fb;
fb.ui = ui;
std::swap(fb.arr[0], fb.arr[7]);
std::swap(fb.arr[1], fb.arr[6]);
std::swap(fb.arr[2], fb.arr[5]);
std::swap(fb.arr[3], fb.arr[4]);
ui = fb.ui;
}
};
template<typename T>
struct swap_endian<T, 4>
{
void operator()(T& ui)
{
union FourBytes
{
T ui;
uint8_t arr[4];
};
FourBytes fb;
fb.ui = ui;
std::swap(fb.arr[0], fb.arr[3]);
std::swap(fb.arr[1], fb.arr[2]);
ui = fb.ui;
}
};
template<typename T>
struct swap_endian<T, 2>
{
void operator()(T& ui)
{
union TwoBytes
{
T ui;
uint8_t arr[2];
};
TwoBytes fb;
fb.ui = ui;
std::swap(fb.arr[0], fb.arr[1]);
ui = fb.ui;
}
};
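The specializations above swap bytes through a union. Reading a union member other than the one most recently written is not defined behavior in standard C++, although common compilers support it. As an alternative sketch, and not the library's code, the same effect can be obtained with std::memcpy and std::reverse, which also covers every integer size with a single function. The name byte_swapped is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Returns a copy of value with its bytes in reverse order.
template<typename T>
T byte_swapped(T value)
{
    static_assert(std::is_integral<T>::value,
                  "byte_swapped expects an integral type");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));   // copy out the object representation
    std::reverse(bytes, bytes + sizeof(T));  // reverse the byte order
    std::memcpy(&value, bytes, sizeof(T));   // copy the reversed bytes back
    return value;
}
```

For a 1-byte type the reversal is a no-op, which matches the behavior of the empty primary template in the listing.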
The code is hosted on GitHub.