Is there a sample that shows how to split a .docx file into multiple files by searching for keywords?
For example, we have a large Word file whose content repeats, but with a different ID and different information each time. I would like to split it into separate files.
How can we achieve this in C#?
Separating based on sections or paragraphs is not suitable for our scenario.
Needless to say, the formatting should be preserved.
Thanks.
PS: My current code is below. It does not work properly: the count value that I compute inside the loop and pass to the split seems to be incorrect.
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

// Index of the first paragraph that contains the delimiter (stays 0 if none is found).
int count = 0;
Regex regex = new Regex("Delimiter");
OpenXmlPowerToolsDocument doc = WmlDocument.FromFileName(TxtSource.Text);
using (OpenXmlMemoryStreamDocument streamDoc = new OpenXmlMemoryStreamDocument(doc))
using (WordprocessingDocument document = streamDoc.GetWordprocessingDocument())
{
    XDocument mainDocument = document.MainDocumentPart.GetXDocument();
    for (int i = 1; i < 1000; i++)
    {
        // Test one paragraph at a time against the delimiter.
        IEnumerable<XElement> content = mainDocument.Descendants(W.p).Skip(i).Take(1);
        int matches = OpenXmlRegex.Match(content, regex);
        if (matches > 0)
        {
            count = i;
            break;
        }
    }
}
// Take `count` elements from the source, starting at position 0.
List<Source> documentSource = new List<Source> {
    new Source(new WmlDocument(TxtSource.Text), 0, count, true)
};
int filenumber = 2;
string filename = string.Format("{0}test_{1}.docx", Txtdest.Text, filenumber);
DocumentBuilder.BuildDocument(documentSource, filename);
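To make the goal clearer, the range computation I am after could be sketched roughly as follows. This is plain C# with no Open XML calls: `SplitSketch` and `SplitRanges` are hypothetical names, and `texts` is assumed to hold the text of each element in document order. It also assumes that every chunk starts at an element matching the delimiter and that anything before the first delimiter becomes a chunk of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SplitSketch
{
    // Hypothetical helper: returns one (Start, Count) pair per chunk.
    // A chunk begins at each element whose text matches the delimiter.
    public static List<(int Start, int Count)> SplitRanges(
        IReadOnlyList<string> texts, Regex delimiter)
    {
        // Collect the index of every element that matches the delimiter.
        var starts = new List<int>();
        for (int i = 0; i < texts.Count; i++)
        {
            if (delimiter.IsMatch(texts[i]))
            {
                starts.Add(i);
            }
        }

        var ranges = new List<(int Start, int Count)>();

        // Content before the first delimiter (or the whole input when
        // nothing matches) becomes its own chunk.
        if (texts.Count > 0 && (starts.Count == 0 || starts[0] > 0))
        {
            ranges.Add((0, starts.Count == 0 ? texts.Count : starts[0]));
        }

        // Each delimiter opens a chunk that runs to the next delimiter
        // or to the end of the input.
        for (int k = 0; k < starts.Count; k++)
        {
            int end = k + 1 < starts.Count ? starts[k + 1] : texts.Count;
            ranges.Add((starts[k], end - starts[k]));
        }

        return ranges;
    }
}
```

Each pair would then be passed as the start and count arguments of a `Source`, one output file per pair. That only works if the indexes are counted over the same elements that `Source` counts, which I am assuming here rather than stating as fact.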