First, regarding extracting specific text from Word documents, you can check out my earlier post:
Find Text in Word Documents
The code provided there converts the DOCX file into a string. As for your second requirement (extracting by coordinates), that is not really possible.
Word files are flow documents: their content is not fixed in place the way it is in PDF files. In other words, in a PDF a specific piece of text is defined at specific coordinates and will always be rendered in the same location for anyone viewing the file.
In a Word file, however, a piece of text carries no information about where it will be rendered. It could end up on the first page or the fifth; the content itself does not care. The viewing application decides where it goes, and different applications can render the same document differently.
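To make this concrete, here is a minimal Python sketch (not the code from the linked post) that pulls the body text out of a DOCX file using only the standard library. The function name docx_to_text is my own, and the sketch assumes the usual package layout where the main body lives in word/document.xml. It ignores headers, footnotes, tabs and line breaks. Note that the runs it reads contain only text and formatting, no page numbers or coordinates.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a DOCX file, one line per paragraph.

    `source` is a path or a binary file-like object. This is a simplified
    sketch: it only reads w:t text nodes from the main document part.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is split into runs; join their text pieces in order.
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

Calling docx_to_text("report.docx") (a hypothetical file name) gives you one string that you can search with text.find(...). What you get back is a character offset into that string, which says nothing about which page or position the text will occupy once an application lays the document out.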