I have developed a WinForms application that reads the content of a PDF file as a string, but it reads only one file at a time. I want to read a whole set of PDF files (a collection), and I don't know how to give the PdfReader class a folder path instead of a single file. I think a foreach loop might be useful here.

My code is as follows:
C#
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.IO;
using System.Collections;
using System.Windows.Forms;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

namespace test
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }
   
        // Extracts the text of every page of the PDF at the given path.
        public static string ExtractTextFromPdf(string path)
        {
            using (PdfReader reader = new PdfReader(path))
            {
                StringBuilder text = new StringBuilder();

                // Pages are numbered from 1 up to NumberOfPages.
                for (int i = 1; i <= reader.NumberOfPages; i++)
                {
                    text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
                }

                return text.ToString();
            }
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Reads a single hard-coded file; the returned text is not used yet.
            Form1.ExtractTextFromPdf(@"D:\Data Sets\Enron\168.pdf");
        }
    }
}


My requirement is to read all the PDF files in a folder. The path would be @"D:\Data Sets\Enron". The Enron folder contains a set of PDF files, and I want to pick up one PDF file at a time and read its content. I think foreach may be useful, but I don't know how to read every PDF file in the Enron folder.

1 solution

This snippet shows how to get the names of all PDF files in a directory
and then iterate over them:
C#
string pathName = @"D:\Data Sets\Enron";

string[] pdfFileNames = Directory.GetFiles(pathName, "*.pdf");

foreach(string pdfFileName in pdfFileNames)
{
    ExtractTextFromPdf(pdfFileName);
}
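
Note that the loop above discards the string returned by ExtractTextFromPdf. If you need the text afterwards, one option is to collect it per file. The following is only a minimal sketch, not part of the original answer: PdfFolderReader and ReadAll are hypothetical names, and the extraction function is passed in as a delegate so the helper does not depend on iTextSharp directly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PdfFolderReader
{
    // Hypothetical helper: maps each PDF path in 'folder' to its extracted text.
    // 'extract' is any function that turns a file path into text,
    // for example Form1.ExtractTextFromPdf from the question.
    public static Dictionary<string, string> ReadAll(string folder, Func<string, string> extract)
    {
        var results = new Dictionary<string, string>();

        // Only the top-level folder is searched, as in the snippet above.
        foreach (string pdfFileName in Directory.GetFiles(folder, "*.pdf"))
        {
            results[pdfFileName] = extract(pdfFileName);
        }

        return results;
    }
}
```

It could be called from button1_Click like this, assuming the ExtractTextFromPdf method from the question:

PdfFolderReader.ReadAll(@"D:\Data Sets\Enron", Form1.ExtractTextFromPdf);

Each entry in the returned dictionary then holds one file path and the text read from that file.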
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


