Hi,

I am looking for C# code that reads a Word document (.doc/.docx) with all of its formatting, including images, bullets, fonts (bold/italic/underline), tables, headers and footers, and so on.

I am using WordprocessingDocument.Open() and Microsoft.Office.Interop.Word.Application, but I am unable to get exactly what I am looking for.

Please suggest some good snippets.

Thanks in advance.
Comments
BillWoodruff 30-Oct-13 7:28am    
Please "improve" your question and tell us what characters, or formats, you can read, and what characters, or formats, you can't read.

1 solution

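If you are not getting proper results from OpenXml then there isn't any...

However, what you can do is parse the XML yourself. WordprocessingDocument.Open() returns a document object rather than raw XML, but its parts (for example MainDocumentPart.Document) expose the underlying XML, and you can get the expected results from there. Note that this applies to .docx only; the Open XML SDK cannot open the older binary .doc format.

It is a bit laborious, but it will work as you want.
For example, you can check: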
http://stackoverflow.com/questions/4824619/batch-conversion-of-docx-to-clean-html
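
As a rough illustration of walking the document content, here is a minimal sketch that assumes the DocumentFormat.OpenXml NuGet package and a .docx file. The class name DocxDump and its methods are hypothetical names chosen for this example. It only reports formatting applied directly to a run or paragraph; formatting inherited from styles or from the numbering definitions is not resolved here and would need extra work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, for illustration only.
public static class DocxDump
{
    // An on/off element such as <w:b/> counts as "on" when it is present
    // and its val attribute is either missing or true.
    static bool IsOn(OnOffType flag)
    {
        return flag != null && (flag.Val == null || flag.Val.Value);
    }

    // Returns the run text followed by the direct formatting found on it.
    public static string DescribeRun(Run run)
    {
        var tags = new List<string>();
        RunProperties props = run.RunProperties;
        if (props != null)
        {
            if (IsOn(props.Bold)) tags.Add("bold");
            if (IsOn(props.Italic)) tags.Add("italic");
            Underline u = props.Underline;
            if (u != null && u.Val != null && u.Val.Value != UnderlineValues.None)
                tags.Add("underline");
            if (props.RunFonts != null && props.RunFonts.Ascii != null)
                tags.Add("font=" + props.RunFonts.Ascii.Value);
        }
        string text = string.Concat(run.Elements<Text>().Select(t => t.Text));
        return tags.Count == 0 ? text : text + " [" + string.Join(", ", tags) + "]";
    }

    // Prints paragraphs, list items, tables, headers, footers and image parts.
    public static void Dump(string path)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            MainDocumentPart main = doc.MainDocumentPart;
            foreach (var element in main.Document.Body.Elements())
            {
                if (element is Paragraph para)
                {
                    // Direct numbering properties usually indicate a bullet or numbered item.
                    bool isListItem = para.ParagraphProperties?.NumberingProperties != null;
                    Console.WriteLine(isListItem ? "List item:" : "Paragraph:");
                    foreach (Run run in para.Descendants<Run>())
                        Console.WriteLine("  " + DescribeRun(run));
                }
                else if (element is Table table)
                {
                    foreach (TableRow row in table.Elements<TableRow>())
                        Console.WriteLine("Row: " + string.Join(" | ",
                            row.Elements<TableCell>().Select(c => c.InnerText)));
                }
            }
            foreach (HeaderPart header in main.HeaderParts)
                Console.WriteLine("Header: " + header.Header.InnerText);
            foreach (FooterPart footer in main.FooterParts)
                Console.WriteLine("Footer: " + footer.Footer.InnerText);
            // Image bytes can be read with image.GetStream() if they are needed.
            foreach (ImagePart image in main.ImageParts)
                Console.WriteLine("Image: " + image.Uri + " (" + image.ContentType + ")");
        }
    }
}
```

Calling DocxDump.Dump(@"C:\temp\sample.docx") (the path is just a placeholder) prints one line per run, table row, header, footer and image part. For a legacy .doc file, one option is to convert it to .docx first, for example with Word itself, since this approach reads only the Open XML format.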
 
Comments
Afzaal Ahmad Zeeshan 8-Aug-15 10:41am    
The question thread is almost dead: it is 3 years old and the author has no interest in getting help.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


