Introduction
Word automation is the development of software routines that optimize or automate existing Word procedures. There are two main ways to automate Word, distinguished by how the code binds to the Office library: early binding and late binding. A brief introduction to both is necessary before turning to the automation itself.
Early Binding
Early binding is the more commonly used technique. Here, we add a reference to the Office library directly to the project.
Then, we import the namespace in the code file and write code against the classes and interfaces provided by the Office library. The code looks roughly like this:
public object CreateWordApplication()
{
    Word.Application wordApp = new Word.ApplicationClass();
    return wordApp;
}
Early binding has several advantages:
- Code will run considerably faster.
- Debugging is very easy.
- You get IntelliSense support while coding (i.e., typing a class name or keyword and pressing dot brings up a popup of the available members).
The main disadvantage is that the code supports only the Office version referenced in the application.
Late Binding
Late binding uses another mechanism, Reflection, to invoke the Office libraries, and does not need any Office reference. It mainly relies on two lines to communicate with the Office library:
Type wordType = Type.GetTypeFromProgID("Word.Application");
object wordApplication = Activator.CreateInstance(wordType);
By using this wordApplication object, we can perform almost all operations. The main advantages of late binding are that it uses whichever Word version is installed and active on the machine, and that it is version independent. Its performance, however, is slower than that of early binding.
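To see the mechanism in isolation, the sketch below applies the same reflection calls to a plain .NET object, so it runs without Word installed. The class and method names (LateBindingDemo, Call, Get, CountAfterTwoAdds) are made up for this illustration, and ArrayList is only a stand-in for the Word application object.

```csharp
using System;
using System.Collections;
using System.Reflection;

public static class LateBindingDemo
{
    // Look up and call a method by name at run time.
    public static object Call(object target, string method, params object[] args)
    {
        return target.GetType().InvokeMember(
            method,
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null, target, args);
    }

    // Read a property by name at run time.
    public static object Get(object target, string property)
    {
        return target.GetType().InvokeMember(
            property,
            BindingFlags.GetProperty | BindingFlags.Public | BindingFlags.Instance,
            null, target, null);
    }

    // Adds two items to a list and reads its Count, all through reflection.
    public static int CountAfterTwoAdds()
    {
        object list = new ArrayList();   // the compiler only sees "object"
        Call(list, "Add", "first");
        Call(list, "Add", "second");
        return (int)Get(list, "Count");
    }
}
```

Because the member names are plain strings, a typo such as "Ad" compiles without complaint and fails only when the call is executed, which is the trade-off late binding makes for version independence.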
Developing an automation with early binding is very easy. This article gives a brief idea of Word automation using both early binding and late binding.
Problem Specification
Here, I discuss a simple Word automation that counts the occurrences of a word in a document. We also replace a word with another word. First, we implement it using early binding with a Word 2003 Office reference. After that, we use late binding to solve the same problem across all Word versions.
Word Automation: Early Binding
The first step in doing Word automation with early binding is installing a particular version of Office on your machine. Then, create a new Class Library named "WordAutomation" and add a reference to the Office library in your project, as mentioned above.
Using Code
Using Namespaces
Create a new class named WordAutomation in your project and use the following namespaces:
using System.IO; // needed for File.Exists in CreateWordDoc
using Word;
using Office;
Creating a WordApplication Object
The following code will create a Word Application object.
public object CreateWordApplication()
{
    Word.Application wordApp = new Word.ApplicationClass();
    return wordApp;
}
Closing a WordApplication
Here, the parameter is a Word Application object already created using the CreateWordApplication() function.
public bool CloseWordApp(object wordApplication)
{
    bool isSuccess = false;
    if (wordApplication != null)
    {
        object missing = System.Reflection.Missing.Value;
        object saveChanges = Word.WdSaveOptions.wdSaveChanges;
        Word.Application wordApp =
            wordApplication as Word.ApplicationClass;
        wordApp.Quit(ref saveChanges, ref missing, ref missing);
        isSuccess = true;
    }
    return isSuccess;
}
Creating a Word Document Object
The CreateWordDoc function mainly requires a Word document file name and a Word application created using the CreateWordApplication() function. We can open the document in either read-only or read-write mode.
public object CreateWordDoc(object fileName,
    object wordApplication, bool isReadonly)
{
    Word.Document wordDoc = null;
    Word.Application wordApp = null;
    if (wordApplication != null)
    {
        wordApp = wordApplication as Word.ApplicationClass;
    }
    // Open the document only if the file exists and the application is valid.
    if (File.Exists(fileName.ToString()) && wordApp != null)
    {
        object readOnly = isReadonly;
        object isVisible = true;
        object missing = System.Reflection.Missing.Value;
        wordDoc = wordApp.Documents.Open(ref fileName, ref missing,
            ref readOnly, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref isVisible);
    }
    return wordDoc;
}
Closing a Word Document Object
The CloseWordDoc() function will close a Word document already created using the CreateWordDoc() function.
public bool CloseWordDoc(object wordDocument, bool canSaveChange)
{
    bool isSuccess = false;
    if (wordDocument != null)
    {
        object missing = System.Reflection.Missing.Value;
        object saveChanges = null;
        if (canSaveChange)
        {
            saveChanges = Word.WdSaveOptions.wdSaveChanges;
        }
        else
        {
            saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
        }
        Word.Document wordDoc = wordDocument as Word.Document;
        wordDoc.Close(ref saveChanges, ref missing, ref missing);
        isSuccess = true;
    }
    return isSuccess;
}
Getting the Word Count
A Word document contains different objects. The main area of a Word document is called Content. Other objects include Shapes, Comments, Headers and Footers, etc. This function finds the word count across the above sections. The main parameters of this function are a Word document and the text to search for in the document. The function that actually counts the word occurrences is GetCountFromRange(), which is explained below.
public int GetWordCount(object wordDoc, string word)
{
    int count = 0;
    do
    {
        if (wordDoc == null)
        {
            break;
        }
        if (word.Trim().Length == 0)
        {
            break;
        }
        Word.Document wordDocument = wordDoc as Word.Document;
        wordDocument.Activate();
        // Main body of the document.
        count += GetCountFromRange(wordDocument.Content, wordDocument, word);
        foreach (Word.Comment com in wordDocument.Comments)
        {
            count += GetCountFromRange(com.Range, wordDocument, word);
            // Note: this break leaves the loop after the first comment.
            break;
        }
        // Headers and footers of the last section.
        foreach (Word.HeaderFooter header in wordDocument.Sections.Last.Headers)
        {
            count += GetCountFromRange(header.Range, wordDocument, word);
        }
        foreach (Word.HeaderFooter footer in wordDocument.Sections.Last.Footers)
        {
            count += GetCountFromRange(footer.Range, wordDocument, word);
        }
        foreach (Word.Shape shape in wordDocument.Shapes)
        {
            // HasText is a tri-state value; a negative value means the shape has text.
            if (shape.TextFrame.HasText < 0)
            {
                count += GetCountFromRange(shape.TextFrame.TextRange, wordDocument, word);
            }
        }
    }
    while (false);
    return count;
}
Getting Word Count from a Range
This function counts the occurrences of a text inside a range. The range may be any of the following:
- WordDocument.Content
- Comment.Range
- HeaderFooter.Range
- Shape.TextFrame.TextRange
- etc.
We can search for or replace text by using a range; the sample code is shown below.
private int GetCountFromRange(
    Word.Range range, Word.Document wordDocument, string word)
{
    int count = 0;
    object missing = System.Reflection.Missing.Value;
    object matchAllWord = true; // passed as the MatchWholeWord argument
    object item = Word.WdGoToItem.wdGoToPage;
    object whichItem = Word.WdGoToDirection.wdGoToFirst;
    wordDocument.GoTo(ref item, ref whichItem, ref missing, ref missing);
    range.Find.ClearFormatting();
    range.Find.Forward = true;
    range.Find.Text = word;
    range.Find.Execute(
        ref missing, ref missing, ref matchAllWord, ref missing,
        ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing);
    // Keep searching until no further match is found.
    while (range.Find.Found)
    {
        ++count;
        range.Find.Execute(
            ref missing, ref missing, ref matchAllWord, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing);
    }
    return count;
}
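To make the idea of a whole-word count concrete, here is a rough plain-string analogue that needs no Word installation. It is only a sketch: the WholeWordCounter class is hypothetical, the case-insensitive matching is an assumption of this example, and a regular expression does not reproduce Word's own matching rules exactly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordCounter
{
    // Counts occurrences of "word" in "text" that are delimited by word boundaries.
    public static int Count(string text, string word)
    {
        if (string.IsNullOrEmpty(text) || word == null || word.Trim().Length == 0)
        {
            return 0;
        }
        // \b matches a word boundary, so "cat" does not match inside "category".
        string pattern = @"\b" + Regex.Escape(word.Trim()) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

For example, WholeWordCounter.Count("The cat sat on the mat; the category is cats.", "cat") counts only the standalone "cat", skipping "category" and "cats".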
Find and Replace
The Find and Replace feature is similar to Find; the only difference is that we need to provide the replacement text as well. The function ReplaceRange() replaces the Find text with the Replace text.
public bool FindReplace(object wordDoc, object wordApplication,
    string findText, string replaceText)
{
    bool isSuccess = false;
    do
    {
        if (wordDoc == null)
        {
            break;
        }
        if (wordApplication == null)
        {
            break;
        }
        if (replaceText.Trim().Length == 0)
        {
            break;
        }
        if (findText.Trim().Length == 0)
        {
            break;
        }
        Word.Application wordApp = wordApplication as Word.ApplicationClass;
        Word.Document wordDocument = wordDoc as Word.Document;
        // Main body of the document.
        ReplaceRange(wordDocument.Content, wordDocument,
            wordApp, findText, replaceText);
        foreach (Word.Comment comment in wordDocument.Comments)
        {
            ReplaceRange(comment.Range, wordDocument,
                wordApp, findText, replaceText);
        }
        // Headers and footers of the last section.
        foreach (Word.HeaderFooter header in
            wordDocument.Sections.Last.Headers)
        {
            ReplaceRange(header.Range, wordDocument,
                wordApp, findText, replaceText);
        }
        foreach (Word.HeaderFooter footer
            in wordDocument.Sections.Last.Footers)
        {
            ReplaceRange(footer.Range, wordDocument, wordApp,
                findText, replaceText);
        }
        foreach (Word.Shape shp in wordDocument.Shapes)
        {
            if (shp.TextFrame.HasText < 0)
            {
                ReplaceRange(shp.TextFrame.TextRange,
                    wordDocument, wordApp,
                    findText, replaceText);
            }
        }
        isSuccess = true;
    }
    while (false);
    return isSuccess;
}
Replace within a Range
This function will replace a Find text with a Replace text.
private void ReplaceRange(Word.Range range,
    Word.Document wordDocument, Word.Application wordApp,
    object findText, object replaceText)
{
    object missing = System.Reflection.Missing.Value;
    wordDocument.Activate();
    object item = Word.WdGoToItem.wdGoToPage;
    object whichItem = Word.WdGoToDirection.wdGoToFirst;
    object forward = true;
    wordDocument.GoTo(ref item, ref whichItem,
        ref missing, ref missing);
    object replaceAll = Word.WdReplace.wdReplaceAll;
    object matchAllWord = true; // passed as the MatchWholeWord argument
    range.Find.ClearFormatting();
    range.Find.Replacement.ClearFormatting();
    range.Find.Execute(ref findText, ref missing, ref matchAllWord,
        ref missing, ref missing, ref missing, ref forward,
        ref missing, ref missing, ref replaceText, ref replaceAll,
        ref missing, ref missing, ref missing, ref missing);
}
Word Automation: Late Binding
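Late binding does not require any Office references or namespace imports in the project. Coding is harder, because there is no compile-time checking or IntelliSense for the Office objects, so we have to know the names of the functions and properties in advance.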
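As an aside, on C# 4 and later the dynamic keyword is another way to write late-bound calls without spelling out reflection by hand. This is not the approach used in this article, which sticks to reflection; the sketch below only illustrates the idea on a plain .NET object so it runs without Word. The DynamicLateBinding class is hypothetical, and ArrayList stands in for a COM object such as the one returned by Activator.CreateInstance(Type.GetTypeFromProgID("Word.Application")).

```csharp
using System;
using System.Collections;

public static class DynamicLateBinding
{
    // Member names are resolved at run time, not by the compiler.
    public static int CountAfterTwoAdds()
    {
        dynamic list = new ArrayList();  // stand-in for a late-bound COM object
        list.Add("first");               // looked up by name when this line runs
        list.Add("second");
        return list.Count;               // converted to int at run time
    }
}
```

The trade-off is the same as with reflection: a misspelled member name still compiles and fails only at run time.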
Using Code
Using Namespaces
For working with late binding, you have to use the Reflection mechanism, so the following namespace usage is required (System and System.IO are needed for Type, Activator, and File):
using System;
using System.IO;
using System.Reflection;
Creating a WordApplication Object
If at least one Word version is installed on your machine, this function gets the Word type using the program ID of Word. If the system has multiple versions installed, only one of them is active at a time, and this function gets the type of that active version. The CreateInstance method then creates the Word Application object.
public object CreateWordApplication()
{
    string message = "Failed to create word application. Check whether" +
        " word installation is correct.";
    // Resolve the Word type from its program ID.
    Type wordType = Type.GetTypeFromProgID("Word.Application");
    if (wordType == null)
    {
        throw new Exception(message);
    }
    object wordApplication = Activator.CreateInstance(wordType);
    if (wordApplication == null)
    {
        throw new Exception(message);
    }
    return wordApplication;
}
Closing a WordApplication
For closing a Word application, we have to invoke the Quit method of the Word application. Here, a helper called InvokeMember() is used to invoke the Quit function; it is explained in the section below. The parameter wordApplication is the instance created by the CreateWordApplication() function.
public bool CloseWordApp(object wordApplication)
{
    bool isSuccess = false;
    if (wordApplication != null)
    {
        // -1 is the integer value of Word.WdSaveOptions.wdSaveChanges.
        object saveChanges = -1;
        InvokeMember("Quit", wordApplication, new object[] { saveChanges });
        isSuccess = true;
    }
    return isSuccess;
}
Creating a Word Document Object
Here, wordApplication and the file name are required to create a Word document. Inside the function, we use the GetProperty() function to get the property named Documents from the Word application; it is explained in the section below.
public object CreateWordDoc(object fileName, object wordApplication, bool isReadonly)
{
    object wordDocument = null;
    if (File.Exists(fileName.ToString()) && wordApplication != null)
    {
        object readOnly = isReadonly;
        object isVisible = true;
        object missing = System.Reflection.Missing.Value;
        object wordDocuments = GetProperty("Documents", wordApplication);
        // Same argument order as Documents.Open in the early binding version.
        wordDocument = InvokeMember("Open", wordDocuments,
            new object[] { fileName, missing, readOnly, missing, missing, missing,
                missing, missing, missing, missing,
                missing, isVisible });
    }
    return wordDocument;
}
Closing a Word Document Object
This function mainly requires two parameters: a Word document and the save-changes information. The main thing to understand here is that we have to pass the save-changes information to the Close() function of the Word document, which is an enum value (Word.WdSaveOptions); you can refer to the same function in the early binding section for a better understanding. Instead of that enum member, we just pass its integer value here, because we don't have access to the enum. Here, the value -1 is the integer value of Word.WdSaveOptions.wdSaveChanges, and the value 0 is the integer value of Word.WdSaveOptions.wdDoNotSaveChanges.
public bool CloseWordDoc(object wordDocument, bool canSaveChange)
{
    bool isSuccess = false;
    if (wordDocument != null)
    {
        object saveChanges = null;
        if (canSaveChange)
        {
            saveChanges = -1; // wdSaveChanges
        }
        else
        {
            saveChanges = 0; // wdDoNotSaveChanges
        }
        InvokeMember("Close", wordDocument,
            new object[] { saveChanges });
        isSuccess = true;
    }
    return isSuccess;
}
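Since late-bound code cannot see the Word enums, the integer values end up scattered through the code as magic numbers. One way to keep them readable is a small constants class, sketched below. The WordConstants class and its SaveOption helper are hypothetical names for this illustration; the values are the ones used by the late-bound code in this article, and it is worth confirming them in the Object Browser for your Word version.

```csharp
using System;

public static class WordConstants
{
    // Word.WdSaveOptions
    public const int WdSaveChanges = -1;
    public const int WdDoNotSaveChanges = 0;
    public const int WdPromptToSaveChanges = -2;

    // Word.WdGoToItem and Word.WdGoToDirection
    public const int WdGoToPage = 1;
    public const int WdGoToFirst = 1;

    // Word.WdReplace
    public const int WdReplaceNone = 0;
    public const int WdReplaceOne = 1;
    public const int WdReplaceAll = 2;

    // Maps the canSaveChange flag used in CloseWordDoc to the save option value.
    public static int SaveOption(bool canSaveChange)
    {
        return canSaveChange ? WdSaveChanges : WdDoNotSaveChanges;
    }
}
```

With this in place, a call such as InvokeMember("Close", wordDocument, new object[] { WordConstants.SaveOption(canSaveChange) }) says what it means without a comment.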
Getting the Word Count
Compare the function below with its early binding version and try to understand how the early binding code is converted to late binding.
public int GetWordCount(object wordDoc, string word)
{
    int count = 0;
    do
    {
        if (wordDoc == null)
        {
            break;
        }
        if (word.Trim().Length == 0)
        {
            break;
        }
        InvokeMember("Activate", wordDoc);
        // Main body of the document.
        object content = GetProperty("Content", wordDoc);
        count += GetCountFromRange(content, wordDoc, word);
        object comments = GetProperty("Comments", wordDoc);
        object count1 = GetProperty("Count", comments);
        int rangeCount = (int)count1;
        // Word collections are 1-based.
        for (int i = 1; i <= rangeCount;)
        {
            object comment = InvokeMember("Item",
                comments, new object[] { i });
            object range = GetProperty("Range", comment);
            count += GetCountFromRange(range, wordDoc, word);
            // Note: as in the early binding version, this break leaves the
            // loop after the first comment.
            break;
        }
        // Headers and footers of the last section.
        object sections = GetProperty("Sections", wordDoc);
        object last = GetProperty("Last", sections);
        object headers = GetProperty("Headers", last);
        rangeCount = (int)GetProperty("Count", headers);
        for (int i = 1; i <= rangeCount; i++)
        {
            object header = InvokeMember("Item",
                headers, new object[] { i });
            object range = GetProperty("Range", header);
            count += GetCountFromRange(range, wordDoc, word);
        }
        object footers = GetProperty("Footers", last);
        rangeCount = (int)GetProperty("Count", footers);
        for (int i = 1; i <= rangeCount; i++)
        {
            object footer = InvokeMember("Item",
                footers, new object[] { i });
            object range = GetProperty("Range", footer);
            count += GetCountFromRange(range, wordDoc, word);
        }
        object shapes = GetProperty("Shapes", wordDoc);
        rangeCount = (int)GetProperty("Count", shapes);
        for (int i = 1; i <= rangeCount; i++)
        {
            object shape = InvokeMember("Item",
                shapes, new object[] { i });
            object textFrame = GetProperty("TextFrame", shape);
            int hasText = (int)GetProperty("HasText", textFrame);
            if (hasText < 0)
            {
                object range = GetProperty("TextRange", textFrame);
                count += GetCountFromRange(range, wordDoc, word);
            }
        }
    }
    while (false);
    return count;
}
Getting Word Count from a Range
Here also, the logic is the same as in early binding; only a conversion is done.
private int GetCountFromRange(object range, object wordDocument, string word)
{
    int count = 0;
    object missing = System.Reflection.Missing.Value;
    object matchAllWord = true;
    object item = 1;      // Word.WdGoToItem.wdGoToPage
    object whichItem = 1; // Word.WdGoToDirection.wdGoToFirst
    InvokeMember("GoTo", wordDocument, new object[] { item, whichItem });
    object find = GetProperty("Find", range);
    InvokeMember("ClearFormatting", find);
    // Set the search options as properties instead of Execute arguments.
    SetProperty("Forward", find, true);
    SetProperty("Text", find, word);
    SetProperty("MatchWholeWord", find, true);
    InvokeMember("Execute", find);
    bool found = (bool)GetProperty("Found", find);
    while (found)
    {
        ++count;
        InvokeMember("Execute", find);
        found = (bool)GetProperty("Found", find);
    }
    return count;
}
Find and Replace
Find and Replace is similar to Find; the only difference is that we need to provide the replacement text as well. The function ReplaceRange() replaces the find text with the replace text.
public bool FindReplace(object wordDoc, object wordApplication,
    string findText, string replaceText)
{
    bool isSuccess = false;
    do
    {
        if (wordDoc == null)
        {
            break;
        }
        if (wordApplication == null)
        {
            break;
        }
        if (replaceText.Trim().Length == 0)
        {
            break;
        }
        if (findText.Trim().Length == 0)
        {
            break;
        }
        // Main body of the document.
        object content = GetProperty("Content", wordDoc);
        ReplaceRange(content, wordDoc, wordApplication, findText, replaceText);
        object comments = GetProperty("Comments", wordDoc);
        object count1 = GetProperty("Count", comments);
        int rangeCount = (int)count1;
        for (int i = 1; i <= rangeCount; i++)
        {
            object comment = InvokeMember("Item",
                comments, new object[] { i });
            object range = GetProperty("Range", comment);
            ReplaceRange(range, wordDoc, wordApplication, findText, replaceText);
        }
        // Headers and footers of the last section.
        object sections = GetProperty("Sections", wordDoc);
        object last = GetProperty("Last", sections);
        object headers = GetProperty("Headers", last);
        rangeCount = (int)GetProperty("Count", headers);
        for (int i = 1; i <= rangeCount; i++)
        {
            object header = InvokeMember("Item",
                headers, new object[] { i });
            object range = GetProperty("Range", header);
            ReplaceRange(range, wordDoc, wordApplication, findText, replaceText);
        }
        object footers = GetProperty("Footers", last);
        rangeCount = (int)GetProperty("Count", footers);
        for (int i = 1; i <= rangeCount; i++)
        {
            object footer = InvokeMember("Item",
                footers, new object[] { i });
            object range = GetProperty("Range", footer);
            ReplaceRange(range, wordDoc, wordApplication, findText, replaceText);
        }
        object shapes = GetProperty("Shapes", wordDoc);
        rangeCount = (int)GetProperty("Count", shapes);
        for (int i = 1; i <= rangeCount; i++)
        {
            object shape = InvokeMember("Item",
                shapes, new object[] { i });
            object textFrame = GetProperty("TextFrame", shape);
            int hasText = (int)GetProperty("HasText", textFrame);
            if (hasText < 0)
            {
                object range = GetProperty("TextRange", textFrame);
                ReplaceRange(range, wordDoc, wordApplication, findText, replaceText);
            }
        }
        isSuccess = true;
    }
    while (false);
    return isSuccess;
}
Replace Within a Range
This function will replace a find text with a replace text within a range.
private void ReplaceRange(object range, object wordDocument, object wordApp,
    string findText, string replaceText)
{
    object missing = System.Reflection.Missing.Value;
    InvokeMember("Activate", wordDocument);
    object item = 1;       // Word.WdGoToItem.wdGoToPage
    object whichItem = 1;  // Word.WdGoToDirection.wdGoToFirst
    InvokeMember("GoTo", wordDocument, new object[] { item, whichItem });
    object replaceAll = 2; // Word.WdReplace.wdReplaceAll
    object matchAllWord = true;
    object find = GetProperty("Find", range);
    InvokeMember("ClearFormatting", find);
    object replacement = GetProperty("Replacement", find);
    InvokeMember("ClearFormatting", replacement);
    // Argument order follows Find.Execute in the early binding version.
    InvokeMember("Execute", find,
        new object[] { findText, false, true, missing, missing,
            missing, true, missing, missing,
            replaceText, replaceAll });
}
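The early binding and late binding classes expose the same public method signatures, so the calling sequence is identical for both. The sketch below shows one possible way to chain the calls so that the document and the application are always closed, even if counting fails. IWordAutomation, WordCountRunner, and FakeAutomation are hypothetical names introduced for this illustration; the interface only restates signatures from this article, and the fake exists so the flow can be exercised without Word.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: restates the public signatures used in this article.
public interface IWordAutomation
{
    object CreateWordApplication();
    object CreateWordDoc(object fileName, object wordApplication, bool isReadonly);
    int GetWordCount(object wordDoc, string word);
    bool CloseWordDoc(object wordDocument, bool canSaveChange);
    bool CloseWordApp(object wordApplication);
}

public static class WordCountRunner
{
    // Opens the file read-only, counts the word, and always cleans up.
    public static int CountWord(IWordAutomation automation, string path, string word)
    {
        object app = null;
        object doc = null;
        try
        {
            app = automation.CreateWordApplication();
            doc = automation.CreateWordDoc(path, app, true);
            return automation.GetWordCount(doc, word);
        }
        finally
        {
            // Close the document before quitting Word.
            if (doc != null) automation.CloseWordDoc(doc, false);
            if (app != null) automation.CloseWordApp(app);
        }
    }
}

// Stand-in implementation that records the call order instead of driving Word.
public sealed class FakeAutomation : IWordAutomation
{
    public readonly List<string> Calls = new List<string>();

    public object CreateWordApplication() { Calls.Add("CreateApp"); return new object(); }

    public object CreateWordDoc(object fileName, object wordApplication, bool isReadonly)
    {
        Calls.Add("OpenDoc");
        return new object();
    }

    public int GetWordCount(object wordDoc, string word) { Calls.Add("Count"); return 3; }

    public bool CloseWordDoc(object wordDocument, bool canSaveChange) { Calls.Add("CloseDoc"); return true; }

    public bool CloseWordApp(object wordApplication) { Calls.Add("CloseApp"); return true; }
}
```

To use this with the real classes, either automation class from this article would need to declare that it implements the interface. Without the finally block, an exception in the middle of the sequence can leave a hidden Word process running.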
Reflection Utilities
Invoking a Method
private object InvokeMember(string method, object instance, object[] parameters)
{
    // Call the named method on the instance with the given arguments.
    Type type = instance.GetType();
    return type.InvokeMember(method,
        BindingFlags.InvokeMethod, null, instance, parameters);
}
Invoking a Method Without Parameters
private object InvokeMember(string method, object instance)
{
    Type type = instance.GetType();
    return type.InvokeMember(method,
        BindingFlags.InvokeMethod, null, instance, new object[] { });
}
Getting a Property
private object GetProperty(string propertyName, object instance)
{
    Type type = instance.GetType();
    return type.InvokeMember(propertyName,
        BindingFlags.GetProperty, null, instance, new object[] { });
}
Setting a Property
private void SetProperty(string propertyName, object instance, object value)
{
    Type type = instance.GetType();
    type.InvokeMember(propertyName, BindingFlags.SetProperty,
        null, instance, new object[] { value });
}
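One practical detail when debugging these helpers: when the invoked member itself throws, reflection reports it as a TargetInvocationException, and the real error sits in InnerException. The sketch below is one possible variant of the InvokeMember helper that surfaces the inner exception. SafeInvoker is a hypothetical name, and the demonstration uses an ArrayList so it runs without Word.

```csharp
using System;
using System.Collections;
using System.Reflection;

public static class SafeInvoker
{
    // Invokes the named method and rethrows the underlying error if the call fails.
    public static object Invoke(string method, object instance, params object[] parameters)
    {
        try
        {
            return instance.GetType().InvokeMember(
                method,
                BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
                null, instance, parameters);
        }
        catch (TargetInvocationException ex)
        {
            // Reflection wraps whatever the target threw; surface the inner exception.
            // Rethrowing this way resets the stack trace of the inner exception.
            if (ex.InnerException != null) throw ex.InnerException;
            throw;
        }
    }

    // Demonstration on a plain .NET object: removing from an empty list fails.
    public static string DescribeFailure()
    {
        try
        {
            Invoke("RemoveAt", new ArrayList(), 0);
            return "no error";
        }
        catch (ArgumentOutOfRangeException)
        {
            return "ArgumentOutOfRangeException";
        }
    }
}
```

Callers can then catch the specific exception type they care about instead of unwrapping TargetInvocationException at every call site.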
Points of Interest
Getting Office Class Details
Before writing late-bound code, you have to know the object details of the Office libraries. One of the better solutions for this is the Object Browser of the Office applications. To get the Object Browser, open Microsoft Word, then open the menu Tools->Macro->Visual Basic Editor, or press Alt+F11. This opens a new Microsoft Visual Basic environment. From this editor, open the View menu and click the Object Browser submenu, or press F2. The Object Browser window lists all the object information.