
MergeFiles

3 Jun 2016, CPOL, 1 min read
A C function that merges the content from an arbitrary number of text files into a Character-Separated-Variable-Width result file

Background

The other day, the following question was posted in QA:
"How do I merge two files with two columns side by side in one file?"

In my opinion, this description can be abstracted to something like "form a Character-Separated-Variable-Width file from multiple text files, using the content of each file to populate each column".

Introduction

I love a good exercise, particularly if it gives me an opportunity to flex my rusty C-fu. So, in response, I wrote such a function my way.

  • Mine accepts an array containing an arbitrary number of file names.
  • Rather than failing, it reports errors opening input files, but otherwise treats unreadable input files as empty.
  • It accepts a sequence of characters to use as the column separator.
  • A parameter specifies whether or not a "header line" containing the names of the files should be included.
  • It does not require that the input files all have the same number of lines -- "empty" values will be written to the output when input files run out of data.
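  • The return value is the total number of lines written to the output (including the header line, if any), or the negated errno value if the output file cannot be opened.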

I'm also considering adding the ability for it to put QUOTEs around values if specified. Another potential feature is the ability to filter out empty lines.
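
As a rough idea of what the QUOTE feature might look like, here is a minimal sketch. QuoteField is a hypothetical helper, not part of MergeFiles, and it assumes the common CSV convention of doubling any QUOTE embedded in the value. It works on a string in memory rather than on the character stream that MergeFiles uses, so it would need adapting before it could be dropped in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies Value into Out wrapped in QUOTEs,   */
/* doubling any embedded QUOTE. Returns the number of characters   */
/* written (excluding the terminator), or -1 if Out is too small.  */
int
QuoteField
(
  const char* Value
,
  char*       Out
,
  size_t      Size
)
{
  size_t n = 0 ;
  size_t i ;

  assert ( Value != NULL && Out != NULL ) ;

  /* Even an empty value needs two QUOTEs and a terminator */
  if ( Size < 3 )
  {
    return ( -1 ) ;
  }

  Out [ n++ ] = '"' ;

  for ( i = 0 ; Value [ i ] != '\0' ; i++ )
  {
    /* Conservative check: room for a doubled QUOTE, */
    /* the closing QUOTE, and the terminator         */
    if ( n + 4 > Size )
    {
      return ( -1 ) ;
    }

    if ( Value [ i ] == '"' )
    {
      Out [ n++ ] = '"' ;
    }

    Out [ n++ ] = Value [ i ] ;
  }

  Out [ n++ ] = '"' ;
  Out [ n ] = '\0' ;

  return ( (int) n ) ;
}
```

With a large enough buffer, the value AAAAA would come out as "AAAAA", and a value containing a single QUOTE would come out with that QUOTE doubled.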

Using the Code

C
result = MergeFiles ( argc - 2 , argv [ 1 ] , argv + 2 , "\t" , true ) ;

The result could be something like:

f:\>FileMerge CON A.txt B.txt C.txt
A.txt   B.txt   C.txt
AAAAA   BBBBB   CCCCC
AAAAA   BBBBB
AAAAA           CCCCC
AAAAA   BBBBB
AAAAA   BBBBB   CCCCC
        BBBBB
                CCCCC

                CCCCC

                CCCCC

f:\>
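
The call above relies on a particular command-line layout: the destination comes first, followed by the source files. A minimal sketch of that mapping follows. MergeArgs and ParseMergeArgs are hypothetical names used only for illustration, and the check for a missing destination is my assumption about what a driver program would want.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the first three MergeFiles arguments */
typedef struct
{
  int    Count ;
  char*  Dest ;
  char** Source ;
} MergeArgs ;

/* Splits a command line of the form                  */
/*   program destination [source1 [source2 ...]]      */
/* Returns false if no destination was supplied.      */
bool
ParseMergeArgs
(
  int        argc
,
  char*      argv[]
,
  MergeArgs* Args
)
{
  bool result = false ;

  assert ( argv != NULL && Args != NULL ) ;

  if ( argc >= 2 )
  {
    /* Everything after the program name and the destination */
    Args->Count  = argc - 2 ;
    Args->Dest   = argv [ 1 ] ;

    /* A char** pointing at the first source file name */
    Args->Source = argv + 2 ;

    result = true ;
  }

  return ( result ) ;
}
```

A driver could then pass Args.Count, Args.Dest and Args.Source straight through as the first three arguments of MergeFiles.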

MergeFiles

Here's the function:

C
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

int
MergeFiles
(
  int   Count
,
  char* Dest
,
  char* Source[]
,
  char* Delimiter
,
  bool  Headers
)      
{
  int result = 0 ;
  
  FILE* dst ;
  
  if ( ( ( dst = fopen ( Dest , "w" ) ) ) == NULL )
  { 
    /* Save errno first: printf is allowed to change it */
    int err = errno ;

    printf ( "\nError opening %s %d" , Dest , err ) ;
    
    result = 0 - err ;
  } 
  else
  {     
    int i ;      
    int j = 0 ;      
    
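    FILE** src = (FILE**) calloc ( Count , sizeof(FILE*) ) ;

    /* calloc may legitimately return NULL when Count is zero */
    if ( src == NULL && Count > 0 )
    {
      printf ( "\nError allocating memory for %d files" , Count ) ;

      fclose ( dst ) ;

      return ( 0 - ENOMEM ) ;
    }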
    
    if ( Delimiter == NULL )
    {
      Delimiter = "" ;
    }
                       
    for ( i = 0 ; i < Count ; i++ )
    {                  
      if ( Headers )
      {
        if ( i > 0 )
        {     
          fprintf ( dst , "%s" , Delimiter ) ;      
        }

        fprintf ( dst , "%s" , Source [ i ] ) ; 
      }
      
      if ( ( ( src [ i ] = fopen ( Source [ i ] , "r" ) ) ) == NULL )
      {
        printf ( "\nError opening %s %d" , Source [ i ] , errno ) ;
      }
      else
      {
        j++ ;
      }
    }  

    if ( Headers )
    {
      fputc ( '\n' , dst ) ;      
      
      result++ ;
    }
    
    while ( j > 0 )                    
    {                                    
      for ( i = 0 ; i < Count ; i++ )    
      {                                  
        if ( i > 0 )
        {
          fprintf ( dst , "%s" , Delimiter ) ;      
        }
         
        if ( src [ i ] != NULL ) 
        {                                
          while ( 1 )
          {            
            int c = getc ( src [ i ] ) ;
                      
            if ( c == '\n' )
            {
              break ;
            }
            else if ( c == EOF )
            {   
              fclose ( src [ i ] ) ;
              
              src [ i ] = NULL ;
              
              j-- ;
              
              break ;
            }   
            else
            {
              fputc ( c , dst ) ;
            }
          }
        }     
      }
      
      fputc ( '\n' , dst ) ;      
      
      result++ ;
    }

    free ( src ) ;

    fclose ( dst ) ;
  } 
    
  return ( result ) ;
}   
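
The heart of the function is the inner loop that copies one line from a source file to the destination. As a sketch only, that loop could be pulled out into a helper like the one below. CopyLine is a hypothetical name, not something in the code above, and unlike the original loop it leaves closing the source file to the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies characters from src to dst up to, */
/* but not including, the next newline. Returns '\n' if a        */
/* newline ended the line, or EOF if the source ran out.         */
int
CopyLine
(
  FILE* src
,
  FILE* dst
)
{
  int c ;

  assert ( src != NULL && dst != NULL ) ;

  while ( ( c = getc ( src ) ) != '\n' && c != EOF )
  {
    fputc ( c , dst ) ;
  }

  return ( c ) ;
}
```

In MergeFiles, the caller would then close the file, set src [ i ] to NULL, and decrement j whenever CopyLine returns EOF. As in the original loop, a final line with no trailing newline is still copied before EOF is reported.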

The only C compiler I have handy that supports bool is:

gcc version 3.2 (mingw special 20020817-1)

I used the -std=gnu99 switch.

History

  • 2016-06-03: First published

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States United States
BSCS 1992 Wentworth Institute of Technology

Originally from the Boston (MA) area. Lived in SoCal for a while. Now in the Phoenix (AZ) area.

OpenVMS enthusiast, ISO 8601 evangelist, photographer, opinionated SOB, acknowledged pedant and contrarian

---------------

"I would be looking for better tekkies, too. Yours are broken." -- Paul Pedant

"Using fewer technologies is better than using more." -- Rico Mariani

"Good code is its own best documentation. As you’re about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn’t needed?’" -- Steve McConnell

"Every time you write a comment, you should grimace and feel the failure of your ability of expression." -- Unknown

"If you need help knowing what to think, let me know and I'll tell you." -- Jeffrey Snover [MSFT]

"Typing is no substitute for thinking." -- R.W. Hamming

"I find it appalling that you can become a programmer with less training than it takes to become a plumber." -- Bjarne Stroustrup

ZagNut’s Law: Arrogance is inversely proportional to ability.

"Well blow me sideways with a plastic marionette. I've just learned something new - and if I could award you a 100 for that post I would. Way to go you keyboard lovegod you." -- Pete O'Hanlon

"linq'ish" sounds like "inept" in German -- Andreas Gieriet

"Things would be different if I ran the zoo." -- Dr. Seuss

"Wrong is evil, and it must be defeated." –- Jeff Ello

"A good designer must rely on experience, on precise, logical thinking, and on pedantic exactness." -- Nigel Shaw

“It’s always easier to do it the hard way.” -- Blackhart

“If Unix wasn’t so bad that you can’t give it away, Bill Gates would never have succeeded in selling Windows.” -- Blackhart

"Use vertical and horizontal whitespace generously. Generally, all binary operators except '.' and '->' should be separated from their operands by blanks."

"Omit needless local variables." -- Strunk... had he taught programming
