tarlib – Windows TAR Library

5 Oct 2012

Our latest addition to the open-source projects we feature on the site is tarlib, a small C++ library for Windows applications that need to handle TAR files. Most archiving tools for Windows support TAR archives, so if you just need to extract or create one, you can use any of them (my favorite is 7-zip). Doing the same from within your own application is more involved. Solutions already exist: you can use, for instance, the LZMA SDK (from 7-zip) or the commercial library Chilkat. My proposal is a library with a simple API that lets you process TAR files with ease.

TAR Description

If you need tarlib, you probably already know something about TAR files. Anyway, you can get more information in the following articles:

Here is a short summary of the TAR format:

  • TAR archives consist of a series of objects, the most common being files and folders
  • Each such object is preceded by a header of 512 bytes
  • The information in the header is encoded in ASCII, and numbers are written in octal
  • The file data is written unaltered, but it is padded to a multiple of 512 bytes
  • The end of the archive is marked with at least two consecutive 512-byte blocks filled with zeros
  • There are different versions of the TAR format (UNIX V7, “old GNU” and GNU, STAR and POSIX) and different implementations
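The points above can be illustrated with a few helper functions. This is only a minimal sketch and not part of tarlib: the names parse_octal, padded_size and is_zero_block are made up for illustration, and the parser does not handle the binary (base-256) encoding that some GNU archives use for very large values.

```cpp
#include <cstddef>
#include <cstdint>

// Size of a TAR header and of the blocks that file data is padded to.
const std::size_t kTarBlockSize = 512;

// Parse an octal ASCII field, such as the size field of a header.
// Leading spaces are skipped; parsing stops at the first character
// that is not an octal digit (typically a NUL or a space).
std::uint64_t parse_octal(const char* field, std::size_t length)
{
   std::uint64_t value = 0;
   std::size_t i = 0;
   while (i < length && field[i] == ' ')
      ++i;
   for (; i < length && field[i] >= '0' && field[i] <= '7'; ++i)
      value = value * 8 + static_cast<std::uint64_t>(field[i] - '0');
   return value;
}

// Number of bytes the data of an entry occupies in the archive,
// that is, the file size rounded up to a multiple of 512.
std::uint64_t padded_size(std::uint64_t filesize)
{
   return (filesize + kTarBlockSize - 1) / kTarBlockSize * kTarBlockSize;
}

// True if a 512-byte block contains only zeros. At least two such
// consecutive blocks mark the end of the archive.
bool is_zero_block(const char* block)
{
   for (std::size_t i = 0; i < kTarBlockSize; ++i)
      if (block[i] != 0)
         return false;
   return true;
}
```

A reader that walks an archive by hand would parse the size from each header, skip padded_size(size) bytes of data, and stop once it meets two zero blocks in a row.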

Library

tarlib is written in C++ with Visual Studio and requires Windows XP or later (because it uses file system APIs that were introduced with Windows XP). The library is provided as a set of C++ files (headers and cpps) that you can include in your application.

Note that:

  • The library is distributed under the Creative Commons Attribution-ShareAlike license.
  • The software is provided “as-is”. No claim of suitability, guarantee, or any warranty whatsoever is provided.

The current version (v1.1):

  • is able to read (and process) existing TAR files
  • does not support creation of TAR files
  • supports parsing tar objects representing files and folders (as these are the most common objects on Windows at least)

Library API

There are a few classes/structures the library provides for handling TAR files.

  • tarFile: represents a TAR file.
    • bool open(std::string const &filename, tarFileMode mode, tarFormatType type) opens the specified TAR file for reading or writing (writing is not supported in v1.1)
    • bool extract(std::string const &folder) extracts the content of the archive (files and folders) to the specified destination
    • tarEntry get_first_entry() retrieves the first entry in a tar archive
    • tarEntry get_next_entry() retrieves the next entry in a tar archive
    • void rewind() re-positions the file cursor at the beginning of the archive
  • tarEntry: represents an object in a TAR file. It contains the header for the entry and methods to process the entry:
    • bool is_empty() indicates whether this is an empty entry (empty entries are used to mark the end of the archive)
    • bool is_md5() indicates whether this is an entry that contains the MD5 hash of the actual TAR file (always found at the end of the archive)
    • void rewind() re-positions the file cursor at the beginning of the object’s data (so you can read it again)
    • bool extract(std::string const &folder) extracts the current entry (file or folder) to the specified folder
    • size_t read(char* buffer, size_t chunksize = tarChunkSize) reads from the current position in the object’s data to the provided buffer; this function does not read past the end of the object’s data
    • static tarEntry makeEmpty() creates a tarEntry representing an empty object
    • static tarEntry makeMD5(char* buffer, size_t size) creates a tarEntry from a buffer containing the MD5 hash for the TAR object

Examples

Example 1: Extract a TAR archive to a specified folder using tarFile:

C++
void extract1(std::string const &filename, std::string const &destination)
{
   // create tar file with path and read mode
   tarFile tarf(filename, tarModeRead);

   // extract to folder
   tarf.extract(destination);
}

Example 2: Extract a TAR archive to a specified folder using a loop that iterates through the entries of the TAR archive:

C++
void extract2(std::string const &filename, std::string const &destination)
{
   // create tar file with path and read mode
   tarFile tarf(filename, tarModeRead);

   // get the first entry; the empty entry marks the end of the archive,
   // so check before processing (this also covers an archive with no entries)
   tarEntry entry = tarf.get_first_entry();
   while(!entry.is_empty())
   {
      // if the entry is a directory create the directory
      // (createfolder and path_combine are helper functions not shown here)
      if(entry.header.indicator == tarEntryDirectory)
      {
         createfolder(path_combine(destination, entry.header.filename));
      }
      // if the entry is a normal file create the file
      else if(entry.header.indicator == tarEntryNormalFile || 
             entry.header.indicator == tarEntryNormalFileNull)
      {         
         entry.extractfile_to_folder(destination);
      }

      // get the next entry in the TAR archive
      entry = tarf.get_next_entry();
   }
}
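Example 2 decides what to do with an entry based on the indicator stored in its header. In the POSIX ustar layout this is the type flag, a single byte at offset 156 of the 512-byte header, where '0' (or a NUL byte in older archives) denotes a regular file and '5' a directory. The sketch below classifies a raw header block on that basis. EntryKind and classify are hypothetical names, and the assumption that tarlib's tarEntryNormalFile, tarEntryNormalFileNull and tarEntryDirectory correspond to these values is mine, not something the library documentation states here.

```cpp
#include <cstddef>

// Hypothetical classification of an entry, for illustration only.
enum class EntryKind { RegularFile, Directory, Other };

// Offset of the type flag byte inside a ustar header block.
const std::size_t kTypeFlagOffset = 156;

// Classify a raw 512-byte header block by its type flag.
EntryKind classify(const char* header)
{
   switch (header[kTypeFlagOffset])
   {
   case '0':   // regular file
   case '\0':  // regular file in older archives
      return EntryKind::RegularFile;
   case '5':   // directory
      return EntryKind::Directory;
   default:    // links, devices, extended headers and so on
      return EntryKind::Other;
   }
}
```

Note that a block filled with zeros also has a NUL type flag, which is one more reason to test for the empty entry before looking at the indicator.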

Example 3: A simplified version of the 2nd example:

C++
void extract3(std::string const &filename, std::string const &destination)
{
   // create tar file with path and read mode
   tarFile tarf(filename, tarModeRead);

   // get the first entry; stop at the empty entry that ends the archive
   tarEntry entry = tarf.get_first_entry();
   while(!entry.is_empty())
   {
      // extract the current entry (file or folder)
      entry.extract(destination);

      // get the next entry in the TAR archive
      entry = tarf.get_next_entry();
   }
}

Example 4: Explicitly process the entries of a TAR file (no automatic extraction to disk; the data can be processed in memory):

C++
#include <list>
#include <string>

void extract4(std::string const& filename)
{
   // create tar file with path and read mode
   tarFile tarf(filename, tarModeRead);

   std::list<tarEntry> entries;

   // collect all entries up to the empty entry that ends the archive
   tarEntry entry = tarf.get_first_entry();
   while(!entry.is_empty())
   {
      // add the entry to the list
      entries.push_back(entry);

      // get the next entry in the TAR archive
      entry = tarf.get_next_entry();
   }

   // iterate through the entries
   for(std::list<tarEntry>::iterator it = entries.begin();
      it != entries.end();
      ++it)
   {
      // named differently from the loop variable above to avoid shadowing
      tarEntry& current = *it;

      // consider the files
      if(current.header.indicator == tarEntryNormalFile ||
         current.header.indicator == tarEntryNormalFileNull)
      {
         // position the TAR file cursor at the beginning of the entry
         current.rewind();

         // read from the TAR file in chunks
         char chunk[8*1024];
         size_t total = 0;
         while(total < current.header.filesize)
         {
            size_t readBytes = current.read(chunk, sizeof(chunk));

            // nothing was read (error or truncated archive): stop
            // instead of looping forever
            if(readBytes == 0)
               break;

            // do something with the read buffer
            // ...

            total += readBytes;
         }
      }
   }
}
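For in-memory processing, the chunked loop of Example 4 can collect the data of an entry into a buffer. The following is a sketch under assumptions: read_all is a hypothetical helper, and FakeEntry is a stand-in that only mimics the read(buffer, chunksize) shape described in the API section, so the code can be compiled and tried without tarlib. With the real library you would pass a tarEntry and its header.filesize instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for tarEntry: serves bytes from memory through
// a read(buffer, chunksize) method and never reads past the end.
struct FakeEntry
{
   std::vector<char> data;
   std::size_t position = 0;

   std::size_t read(char* buffer, std::size_t chunksize)
   {
      std::size_t count = std::min(chunksize, data.size() - position);
      if (count > 0)
         std::memcpy(buffer, data.data() + position, count);
      position += count;
      return count;
   }
};

// Read up to `expected` bytes from an entry into a vector, chunk by chunk.
// Stops early if the entry returns no more data.
template <typename Entry>
std::vector<char> read_all(Entry& entry, std::size_t expected)
{
   std::vector<char> content;
   content.reserve(expected);

   char chunk[8 * 1024];
   while (content.size() < expected)
   {
      std::size_t want = std::min(sizeof(chunk), expected - content.size());
      std::size_t got = entry.read(chunk, want);
      if (got == 0)
         break; // error or truncated data: avoid an endless loop
      content.insert(content.end(), chunk, chunk + got);
   }
   return content;
}
```

Holding a whole file in memory is only reasonable for small entries; for large ones, process each chunk as it arrives, as Example 4 does.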

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect, Visma Software
Romania
Marius Bancila is the author of Modern C++ Programming Cookbook and The Modern C++ Challenge. He has been a Microsoft MVP since 2006, initially for VC++ and nowadays for Development technologies. He works as a system architect for Visma, a Norwegian-based company. He works with various technologies, both managed and unmanaged, for desktop, cloud, and mobile, mainly developing with VC++ and VC#. He keeps a blog at http://www.mariusbancila.ro/blog, focused on Windows programming. You can follow Marius on Twitter at @mariusbancila.

Comments and Discussions

 
Question: Can I create TAR files with Visual Basic 6.0? (LeMarS, 25-Mar-21 5:55)
Question: Memory leak (Member 9260210, 28-Oct-14 22:38)
Question: License (SepBen, 3-Feb-14 21:09)
Answer: Re: License (Marius Bancila, 3-Feb-14 21:18)
Question: no sources? (SepBen, 8-Jan-14 23:42)
Answer: Re: no sources? (Marius Bancila, 9-Jan-14 1:32)
This is actually a blog post fetched from another blog and posted here. That's why you don't see attached sources. However, if you follow the link in the first sentence of the article you will find the sources.
Question: Create TAR archive function. (rsa_m, 31-Aug-13 3:36)
Answer: Re: Create TAR archive function. (Marius Bancila, 1-Sep-13 23:10)
