I want to extract the h1 and h2 headings from a Word .docx file, together with the page number each heading appears on. For example, page 1 might contain "heading h1" and "heading h2", and other pages contain further h1 and h2 headings. I want each heading along with the page it was found on. The result could look something like this:

array(
    0 => array(
        'h1' => array('h1 headings go here'),
        'h2' => array('h2 headings go here...'),
        'page' => 'page number here'
    )
)
I am able to get the headings by opening the .docx as a ZIP archive and reading its XML with DOMDocument, but I cannot determine which page a particular heading comes from.

What is the best way to achieve this?

What I have tried:

I have tried reading the .docx by treating it as a ZIP archive and parsing its word/document.xml with DOMDocument. I can read the content, but I cannot tell which page a particular piece of content comes from.
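
For reference, here is a minimal sketch of one possible approach, written in Python for brevity (the same steps should map onto PHP's ZipArchive and DOMDocument). A .docx file does not store page numbers, because pagination is computed when the document is rendered. When Word saves a file it usually leaves `w:lastRenderedPageBreak` markers where pages ended at that time, so counting those markers up to each heading gives an approximate page number. The function name `headings_by_page` and the style IDs `Heading1` and `Heading2` are assumptions for illustration. Files written by other tools may have no such markers, and explicit page breaks (`w:br` with `w:type="page"`) are deliberately not counted, since counting both kinds may overcount.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's {uri}tag notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Assumed style IDs: these are Word's English defaults and may differ
# in localized or custom templates.
HEADING_STYLES = {"Heading1": "h1", "Heading2": "h2"}


def headings_by_page(docx_file):
    """Return [{'page': n, 'h1': [...], 'h2': [...]}, ...] for a .docx file."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    pages = {}
    page = 1
    for para in root.iter(W + "p"):
        # Count the rendered page breaks recorded inside this paragraph.
        # Assumption: in a short heading, such a break sits before the text,
        # so the heading belongs to the page after the break.
        page += sum(1 for _ in para.iter(W + "lastRenderedPageBreak"))

        style = para.find(W + "pPr/" + W + "pStyle")
        level = HEADING_STYLES.get(style.get(W + "val")) if style is not None else None
        if level is None:
            continue  # not an h1/h2 paragraph

        # A heading's text may be split across several runs.
        text = "".join(t.text or "" for t in para.iter(W + "t"))
        entry = pages.setdefault(page, {"page": page, "h1": [], "h2": []})
        entry[level].append(text)

    return [pages[n] for n in sorted(pages)]
```

Calling `headings_by_page("report.docx")` (a hypothetical file name) returns one entry per page that contains at least one heading, in the shape described above. The page numbers reflect the layout at the time Word last saved the file, so they are only an estimate. If exact numbers are required, a more reliable route is probably to render the document first, for example by converting it to PDF.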
Posted
Updated 12-May-16 19:30pm
v2

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


