The only format I can really help with here is PDF, and for that you can extract text with Xpdf. However, getting an accurate word count from some PDFs may be impossible, depending on how the program that created the file lays out its output. Just because something appears as a word in a PDF viewer does not mean it was stored as a word in the document; PDF is a very complex format.
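As a rough sketch of the PDF route, something like the following Python could work. It assumes the `pdftotext` command-line tool that ships with Xpdf is installed and on your PATH; the function names `count_words` and `pdf_word_count` are just my own for illustration. It counts whitespace-separated tokens, so the caveat above still applies: if the PDF stores a word as separate fragments, the count will be off.

```python
import subprocess


def count_words(text):
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def pdf_word_count(pdf_path):
    """Extract text with the pdftotext CLI and count the words in it.

    Assumes pdftotext is installed; "-" as the output file
    tells it to write the extracted text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,           # raise CalledProcessError if pdftotext fails
        capture_output=True,  # collect stdout instead of printing it
    )
    text = result.stdout.decode("utf-8", errors="replace")
    return count_words(text)
```

Calling `pdf_word_count("report.pdf")` (a hypothetical file name) would return an integer, which is best treated as an estimate rather than an exact figure.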
As has been mentioned here, getting a word count from an image would require OCR, but I don't know enough about it to give you a recommendation. I do know, however, that once again you may be unable to get an accurate word count with OCR.
.docx documents are essentially a zipped collection of XML files, and shouldn't be too difficult to work with. But I don't know enough about the format to help beyond that.
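To give an idea of how that might look, here is a minimal sketch using only the Python standard library. It assumes the main body text lives in `word/document.xml` inside the archive and that visible text sits in `w:t` elements grouped into `w:p` paragraphs; the function name `docx_word_count` is my own. It ignores headers, footers, footnotes, tabs and line breaks, so take the result as an approximation.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_word_count(source):
    """Approximate word count for the main body of a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    total = 0
    for paragraph in root.iter(W_NS + "p"):
        # Text runs inside one paragraph are joined without a separator,
        # because a single word can be split across several runs.
        text = "".join(node.text or "" for node in paragraph.iter(W_NS + "t"))
        total += len(text.split())
    return total
```

Counting per paragraph keeps words in adjacent paragraphs from being glued together, which would happen if all the text were concatenated first.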
.doc documents, unlike .docx, are not zip archives. They use an older binary container format (an OLE2 compound file), and I don't know anything about the format of the data stored inside it.
I think your best bet is to pick a single file type and stick with it.