Hi all,
I am having trouble extracting the text of a PDF document paragraph by paragraph. Please help me out.

My code is:
C#
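// Requires: using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
private string ReadPdf(string _filePath)
{
    PdfReader rd = new PdfReader(_filePath);
    int pageNumber = 1;
    string oContent = "";
    while (pageNumber <= rd.NumberOfPages)
    {
        // Extract the text of the current page (pages are numbered from 1).
        oContent += PdfTextExtractor.GetTextFromPage(rd, pageNumber);
        ++pageNumber;
    }
    rd.Close();
    return oContent;
}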

With the above code I am able to read the PDF text, but it is extracted line by line. I want it extracted paragraph by paragraph.

C#
// Requires: using System.IO; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(path);
StringWriter output = new StringWriter();
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    // Append the text of page i; the strategy keeps text in content-stream order.
    output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
}
reader.Close();
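
The extraction above still returns one string per page with a line break after every visual line, so paragraphs have to be rebuilt afterwards. Below is a minimal sketch of one possible heuristic, not something iTextSharp provides: `ParagraphGrouper`, `GroupLines` and the `shortLineRatio` parameter are hypothetical names introduced here. It assumes a paragraph ends at a blank line, or at a line that ends with sentence punctuation and is noticeably shorter than the longest line on the page. Real PDFs may need a position-based approach instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper (not part of iTextSharp): rebuilds paragraphs
// from page text that was extracted line by line.
public static class ParagraphGrouper
{
    public static List<string> GroupLines(string pageText, double shortLineRatio = 0.8)
    {
        string[] lines = pageText.Replace("\r\n", "\n").Split('\n');
        // The longest line approximates the full text width of the page.
        int maxLength = lines.Max(l => l.Trim().Length);
        var paragraphs = new List<string>();
        var current = new StringBuilder();

        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0)
            {
                // A blank line always closes the current paragraph.
                Flush(current, paragraphs);
                continue;
            }

            if (current.Length > 0) current.Append(' ');
            current.Append(line);

            // Assumption: a short line ending in sentence punctuation
            // is the last line of a paragraph.
            bool endsSentence = ".!?".IndexOf(line[line.Length - 1]) >= 0;
            bool isShort = line.Length < maxLength * shortLineRatio;
            if (endsSentence && isShort) Flush(current, paragraphs);
        }

        Flush(current, paragraphs);
        return paragraphs;
    }

    // Moves the accumulated text into the result list, if there is any.
    private static void Flush(StringBuilder current, List<string> paragraphs)
    {
        if (current.Length == 0) return;
        paragraphs.Add(current.ToString());
        current.Clear();
    }
}
```

To use it, pass the string returned by `PdfTextExtractor.GetTextFromPage` for each page to `ParagraphGrouper.GroupLines` and tune `shortLineRatio` for the documents at hand. The heuristic will misjudge a paragraph whose last line happens to fill the full width.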
 
Comments
[no name] 29-Aug-13 9:55am    
The question was asked and answered 2 years ago. There is no real need to keep answering it.
Sam Hobbs 18-Jan-19 18:38pm    
Actually, the original response has no relevance whatsoever to the question.
Sam Hobbs 18-Jan-19 18:39pm    
Thank you, chetan2020. I was looking for a simple sample of the use of iTextSharp. This got me started at least.
Hi,
This will be helpful for you:

C#
protected void Page_Load(object sender, EventArgs e)
        {
            SqlServer server = new SqlServer("Data Source=KSHIT6773-G13\\SQLEXPRESS;Initial Catalog=Test;Integrated Security=True");
            string[] sql = { "SELECT E.Name, D.Name FROM Employee E, Department D WHERE D.DepartmentID = E.Department" };
            string[] table = { "EMPDEPT" };
            DataSet ds = new DataSet();
            ds = server.GetDataSet(sql, table, false);

            ReportCRtoPDF rptObj = new ReportCRtoPDF();
            rptObj.SetDataSource(ds);
            
            DiskFileDestinationOptions dsk = new DiskFileDestinationOptions();
            dsk.DiskFileName = Request.PhysicalApplicationPath + "files\\CrtoPDF.pdf";
            ExportOptions ex = new ExportOptions();
            ex.ExportDestinationType = ExportDestinationType.DiskFile;
            ex.ExportFormatType = ExportFormatType.PortableDocFormat;
            ex.ExportDestinationOptions = dsk;
            rptObj.Export(ex);
        }
 
Comments
Sam Hobbs 18-Jan-19 18:37pm    
I searched for a simple sample use of iTextSharp and I see no relevance whatsoever that this has to the question.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)