I wrote a console application that uses iTextSharp to read a .pdf file and save its text as a .csv file. The .pdf file name is currently hardcoded, but I would like to read more than 100 .pdf files and save them as .csv files.

The files will be named like this:

DT12345678, DT98765432, FR123567, FR988654 ...

C#
using System;
using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

class Program
{
    static void Main(string[] args)
    {
        string fileName = "test.pdf";
        StringBuilder text = new StringBuilder();

        if (File.Exists(fileName))
        {
            PdfReader pdfReader = new PdfReader(fileName);

            for (int page = 1; page <= pdfReader.NumberOfPages; page++)
            {
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                // GetTextFromPage already returns a .NET string, so no re-encoding is needed.
                string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
                text.Append(currentText);
            }

            // Close the reader once, after every page has been read.
            pdfReader.Close();
        }

        // Write the extracted text to the output file, then echo it to the console.
        using (StreamWriter write = new StreamWriter("test.csv"))
        {
            write.Write(text.ToString());
        }
        Console.WriteLine(text.ToString());
    }
}


What I have tried:

I couldn't try anything because I have no reference point.
Posted
Updated 20-Sep-20 20:54pm
Comments
Sandeep Mewara 21-Sep-20 2:48am    
It's not clear where you are stuck. It seems you wrote code to read and save one PDF. Now you want to extend it to multiple files, so what is the issue?
Member 14783397 21-Sep-20 6:05am    
The issue is how to do that.

1 solution

You have code to read a pdf file.
So extract that into a separate method that accepts a single parameter - the path to the file - and returns the entire content. Test it, and make sure it works.
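As a rough sketch of that extraction step, the reading logic from the question could be moved into a method like the one below. The class name `PdfText` and method name `ReadPdfText` are made up for illustration. The iTextSharp calls are the same ones the question already uses. The reader is closed once, after the page loop, rather than inside it.

```csharp
using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfText
{
    // Hypothetical helper: returns the text of every page of the PDF at `path`.
    public static string ReadPdfText(string path)
    {
        if (!File.Exists(path))
            throw new FileNotFoundException("PDF not found", path);

        var text = new StringBuilder();
        var reader = new PdfReader(path);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // A fresh strategy per page, as in the question's code.
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page, strategy));
            }
        }
        finally
        {
            // Close once, after all pages have been read (or if reading failed).
            reader.Close();
        }
        return text.ToString();
    }
}
```

Testing this on a single known file first, as suggested above, makes it easier to trust the results once it runs over a whole folder.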

You can then call that method as many times as you need in a loop to get the content of all the files.
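One possible shape for that loop is sketched below. It assumes that all the PDFs sit in one folder and that each PDF should produce its own .csv with the same base name (for example, DT12345678.pdf becomes DT12345678.csv). The question does not say either way, so adjust if you want one combined file. `BatchConverter` and `ConvertAll` are illustrative names. The single-file reader is passed in as a delegate, so the loop does not depend on how the text is extracted.

```csharp
using System;
using System.IO;

public static class BatchConverter
{
    // Runs `extractText` on every *.pdf in `inputFolder` and writes the result
    // to a file with the same base name and a .csv extension in `outputFolder`.
    // Returns the number of PDF files processed.
    public static int ConvertAll(string inputFolder, string outputFolder,
                                 Func<string, string> extractText)
    {
        Directory.CreateDirectory(outputFolder); // no-op if it already exists
        int count = 0;

        foreach (string pdfPath in Directory.EnumerateFiles(inputFolder, "*.pdf"))
        {
            string name = Path.GetFileNameWithoutExtension(pdfPath); // e.g. DT12345678
            string csvPath = Path.Combine(outputFolder, name + ".csv");
            File.WriteAllText(csvPath, extractText(pdfPath));
            count++;
        }
        return count;
    }
}
```

A call might look like `BatchConverter.ConvertAll(@"C:\pdfs", @"C:\csv", ReadPdfText);`, where the folder paths are placeholders and `ReadPdfText` stands for whatever method you wrote to read a single PDF.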

You will then probably need to process that content into actual data before outputting it as CSV, but that will depend on the data content, and we have no idea what your PDF files contain, or what you need in each column of the CSV. It is unlikely that the PDF content will be in CSV format already, but it is possible!
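How to split the extracted text into fields depends entirely on your PDFs, so that part cannot be shown here. Once you do have the values for a row, though, they need to be quoted properly before being written out. The sketch below shows only that last step, using the common CSV convention of wrapping a field in double quotes when it contains a comma, quote, or line break, and doubling any embedded quotes. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Csv
{
    // Quotes a single field if it contains a comma, double quote, or line break.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool mustQuote = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return mustQuote ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins already-extracted field values into one CSV line.
    public static string ToLine(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(Escape));
    }
}
```

For example, `Csv.ToLine(new[] { "DT12345678", "Smith, John" })` yields `DT12345678,"Smith, John"`, so the comma inside the second value does not split it into two columns.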
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


