I have a Word document (.docx file). I need to display its first page on my web page, perhaps as an image.

What I have tried:

I tried to extract the first page from the docx using OpenXML.
Steps: extract the first page using OpenXML, convert the extracted docx to an image format, then add the image to the web page.

Is there an alternative or easier solution? How can I accomplish this?
RaultKlawas 25-Jan-18 4:32am    
This cannot be done with OpenXML SDK, at least not in any easy way.
It's because:

A) The OpenXML SDK does not know where the first page ends, unless the document contains an explicit PageBreak element.
In general, the OpenXML SDK does not know how many pages the document has or where each page begins and ends. All it can tell you is the XML content of "document.xml".

B) The OpenXML SDK cannot draw or visualize any document element. Again, all it can tell you is the XML content of "document.xml".
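
To illustrate point A, here is a minimal sketch of what the OpenXML SDK can report about page boundaries. It assumes the DocumentFormat.OpenXml NuGet package is referenced; the class and method names (PageBreakProbe, CountExplicitPageBreaks, CountCachedPageBreakHints) are hypothetical and used only for illustration.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: reports only what is stored in document.xml.
static class PageBreakProbe
{
    // Counts explicit page breaks (<w:br w:type="page"/>) in the document body.
    public static int CountExplicitPageBreaks(WordprocessingDocument doc)
    {
        return doc.MainDocumentPart.Document.Body
            .Descendants<Break>()
            .Count(b => b.Type != null && b.Type.Value == BreakValues.Page);
    }

    // Counts <w:lastRenderedPageBreak/> elements. These are hints that Word
    // may have cached when it last laid out the file; they can be missing or stale.
    public static int CountCachedPageBreakHints(WordprocessingDocument doc)
    {
        return doc.MainDocumentPart.Document.Body
            .Descendants<LastRenderedPageBreak>()
            .Count();
    }
}
```

A document could be opened for this with WordprocessingDocument.Open("Input.docx", false). A count of zero does not mean the document fits on one page; it only means no page boundary was written into the XML, which is why a layout engine is still needed.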

So this task requires a tool that would paginate and render the document's content.
For instance, all word processing applications have a rendering engine that is responsible for this task. The rendering engine paginates the document's content and calculates exactly where each element is positioned, so that the document can be drawn in the application's GUI.

So what you could do is use Microsoft Word Interop in C#, like this:

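using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

// Word generally expects full paths, so resolve the relative names first.
string input = Path.GetFullPath("Input.docx");
string output = Path.GetFullPath("Output.png");

var application = new Word.Application();
application.Visible = false;

var document = application.Documents.Open(input);
// Hide proofing and revision marks so they are not drawn into the image.
document.ShowGrammaticalErrors = false;
document.ShowRevisions = false;
document.ShowSpellingErrors = false;

// Word collections are 1-based: first window, first pane, first page.
var page = document.Windows[1].Panes[1].Pages[1];
// The page is returned as enhanced metafile (EMF) bytes.
var bits = page.EnhMetaFileBits as byte[];

using (var stream = new MemoryStream(bits))
using (var image = Image.FromStream(stream))
{
    image.Save(output, ImageFormat.Png);
}

// Close without saving the display settings changed above.
document.Close(SaveChanges: false);
application.Quit();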

However, running MS Word on a server is neither recommended nor supported, so instead you could try out this Word processing library for C# and VB.NET.
Here is how you would accomplish this task with it:

string input = "Input.docx";
string output = "Output.png";

var document = DocumentModel.Load(input);
var options = new ImageSaveOptions(ImageSaveFormat.Png);
// 0 selects the first page.
options.PageNumber = 0;
document.Save(output, options);

An alternative approach is to convert the Word document to PDF or HTML in C#, and then either display the PDF through an iframe or display the HTML directly on your website.
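
As a rough sketch of the PDF route, the snippet below exports only the first page to PDF through Word Interop and builds the iframe markup for it. It assumes Word is installed on the machine, so the server caveat above still applies; the class and method names (FirstPagePdf, ExportFirstPage, BuildIframe) and the iframe size are hypothetical choices for illustration.

```csharp
using System.Net;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper for the "convert to PDF, show in an iframe" approach.
static class FirstPagePdf
{
    // Exports page 1 of the document to a PDF file. Both paths should be full paths.
    public static void ExportFirstPage(string inputPath, string outputPath)
    {
        var application = new Word.Application { Visible = false };
        try
        {
            var document = application.Documents.Open(inputPath, ReadOnly: true);
            // Restrict the export to the page range 1..1.
            document.ExportAsFixedFormat(outputPath,
                Word.WdExportFormat.wdExportFormatPDF,
                Range: Word.WdExportRange.wdExportFromTo, From: 1, To: 1);
            document.Close(SaveChanges: false);
        }
        finally
        {
            // Always shut Word down, even if the export fails.
            application.Quit();
        }
    }

    // Builds the iframe element that embeds the exported PDF in a page.
    public static string BuildIframe(string pdfUrl)
    {
        return "<iframe src=\"" + WebUtility.HtmlEncode(pdfUrl)
             + "\" width=\"100%\" height=\"600\"></iframe>";
    }
}
```

The URL passed to BuildIframe is whatever address your site serves the exported PDF from. How the PDF appears inside the iframe depends on the visitor's browser and its PDF viewer.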
I hope this helps.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


