NRTFTree - A class library for RTF processing in C#

7 Sep 2007, LGPL3
Class library to manage RTF files.
NRtfTree Demo Screenshot

Introduction

NRtfTree Library (LGPL) is a set of classes written entirely in C# that may be used to manage RTF documents in your own applications. NRtfTree will help you:

  • Open and parse RTF files.
  • Analyze the content of RTF files.
  • Add, modify and remove document elements (i.e. text, control words, control symbols).
  • Create new RTF documents.

Background

RTF (Rich Text Format) is a method of encoding formatted text and graphics for easy transfer between applications. An RTF document can contain text, images, tables, lists, hyperlinks and many other text and graphic elements. In addition, RTF is the format used internally by the RichTextBox control included in the .NET Framework. Nevertheless, the functionality of that control does not cover every aspect of RTF file management.
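
To make the terms used in the rest of this article concrete (groups, control words, text), here is a minimal sketch that builds a tiny RTF document by hand and writes it to disk. It does not use NRtfTree; the class name MinimalRtf and the file name hello.rtf are assumptions made up for this illustration.

```csharp
using System;
using System.IO;

public static class MinimalRtf
{
    // Returns a tiny RTF document.
    //   { ... }      a group
    //   \rtf1 \ansi  header control words (RTF version 1, ANSI character set)
    //   \fonttbl     font table group declaring font 0 (\f0) as Arial
    //   \fs24        font size in half-points (24 = 12pt)
    //   \b           bold, limited to the group that contains it
    //   \par         end of paragraph
    public static string Build()
    {
        // Verbatim string (@"...") so that backslashes are not treated as C# escapes.
        return @"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0\fs24 {\b Hello} world\par}";
    }

    public static void Main()
    {
        string rtf = Build();
        File.WriteAllText("hello.rtf", rtf); // can be opened in any RTF-aware editor
        Console.WriteLine(rtf);
    }
}
```

The keys that appear later in the examples ("f", "fs", "b") are control words like the ones in this string, with the leading backslash removed.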

Using the Code

NRtfTree has two modes of operation:

  1. DOM-like mode: RTF documents are loaded into a tree structure, and the library provides several methods to traverse it, access tag contents, and modify or create nodes. This implementation requires the entire content of a document to be parsed and stored in memory.

    In this mode, the main classes are RtfTree and RtfTreeNode.
  2. SAX-like mode: The RTF file parser is implemented as an event-driven model in which the programmer provides callback methods that the parser invokes as it traverses the RTF document.

    In this mode, the main classes are RtfReader and SARParser.
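
To illustrate what "event-driven" means in the SAX-like mode, the following toy scanner walks an RTF string once and calls back into a handler for each group, control word and run of text, without ever building a tree. It is only a sketch of the idea and is not NRtfTree's parser: the names IToyRtfHandler, ToyRtfScanner and LoggingHandler are hypothetical, the input is assumed to be plain ASCII, and control symbols such as \~ or \'hh are simply skipped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical callback interface for this illustration only.
public interface IToyRtfHandler
{
    void StartGroup();
    void EndGroup();
    void Keyword(string key, bool hasParam, int param);
    void Text(string text);
}

public static class ToyRtfScanner
{
    // Single pass over the input; nothing is stored except the current position.
    public static void Scan(string rtf, IToyRtfHandler handler)
    {
        int i = 0;
        while (i < rtf.Length)
        {
            char c = rtf[i];
            if (c == '{') { handler.StartGroup(); i++; }
            else if (c == '}') { handler.EndGroup(); i++; }
            else if (c == '\\')
            {
                i++; // skip the backslash
                int keyStart = i;
                while (i < rtf.Length && char.IsLetter(rtf[i])) i++;
                string key = rtf.Substring(keyStart, i - keyStart);
                if (key.Length == 0)
                {
                    // Control symbol: this toy skips the character after the backslash.
                    if (i < rtf.Length) i++;
                    continue;
                }
                // Optional numeric parameter, possibly negative.
                int numStart = i;
                if (i + 1 < rtf.Length && rtf[i] == '-' && char.IsDigit(rtf[i + 1])) i++;
                while (i < rtf.Length && char.IsDigit(rtf[i])) i++;
                bool hasParam = i > numStart;
                int param = hasParam ? int.Parse(rtf.Substring(numStart, i - numStart)) : 0;
                // A single space after a control word is a delimiter, not text.
                if (i < rtf.Length && rtf[i] == ' ') i++;
                handler.Keyword(key, hasParam, param);
            }
            else
            {
                int textStart = i;
                while (i < rtf.Length && rtf[i] != '{' && rtf[i] != '}' && rtf[i] != '\\') i++;
                handler.Text(rtf.Substring(textStart, i - textStart));
            }
        }
    }
}

// Records every event in order so the sequence of callbacks can be inspected.
public class LoggingHandler : IToyRtfHandler
{
    public readonly List<string> Events = new List<string>();
    public void StartGroup() { Events.Add("{"); }
    public void EndGroup() { Events.Add("}"); }
    public void Keyword(string key, bool hasParam, int param)
    {
        Events.Add(hasParam ? key + "=" + param : key);
    }
    public void Text(string text) { Events.Add("text:" + text); }
}

public static class Program
{
    public static void Main()
    {
        LoggingHandler handler = new LoggingHandler();
        ToyRtfScanner.Scan(@"{\rtf1 Hello {\b World}}", handler);
        Console.WriteLine(string.Join(" | ", handler.Events));
    }
}
```

The trade-off between the two modes follows from this shape: a callback-based scan needs very little memory but the handler must keep track of any context itself, while the DOM-like mode keeps the whole document in memory in exchange for random access and editing.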

Examples

The following lines show how you can use the class library in your own code.

  1. DOM-like mode

    This code loads an RTF document into an RtfTree object and inspects all the child nodes:

    C#
    public void doSomething()
    {
        //Create the RTF tree object
        RtfTree tree = new RtfTree();
    
        //Load and parse RTF document
        tree.LoadRtfFile(@"c:\rtfdoc.rtf");  //verbatim string: "\r" is not treated as an escape
        
        //Get root node
        RtfTreeNode root = tree.RootNode;
    
        RtfTreeNode node = new RtfTreeNode();
    
        for(int i = 0; i < root.ChildNodes.Count; i++)
        {
            node = root.ChildNodes[i];
    
            if(node.NodeType == RTF_NODE_TYPE.GROUP)
            {
                //...
            }
            else if(node.NodeType == RTF_NODE_TYPE.CONTROL)
            {
                //...
            }
            else if(node.NodeType == RTF_NODE_TYPE.KEYWORD)
            {
                switch(node.NodeKey)
                {
                    case "f":  //Font type
                        //...
                        break;
                    case "cf":  //Font color
                        //...
                        break;
                    case "fs":  //Font size
                        //...
                        break;
                }
            }
            else if(node.NodeType == RTF_NODE_TYPE.TEXT)
            {
                //...
            }
        }
    }
  2. SAX-like mode

    This is an example implementation of a simple SAX-like RTF parser:

    C#
    public class MyParser : SARParser
    {
        //...
    
        public override void StartRtfDocument()
        {
          doc += 
            "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n";
    
          doc += "<DOCUMENT>\r\n";
        }
    
        public override void EndRtfDocument()
        {
            doc += "\r\n</DOCUMENT>";
        }
        
        public override void StartRtfGroup()
        {
            //...
        }
    
        public override void EndRtfGroup()
        {
            //...
        }
    
        public override void RtfControl(string key, 
                                bool hasParam, int param)
        {
            //..
        }
    
        public override void RtfKeyword(string key, 
                                bool hasParam, int param)
        {
            switch(key)
            {
               case "b":  //bold font
                    //...
                    break;
               case "i":  //Italic font
                    //...
                    break;
               //...
            }
        }
    
        public override void RtfText(string text)
        {
            doc += text;
        }
    }

    Once you have completed the parser, you can start parsing the RTF document by calling the method RtfReader.Parse(). The reader then calls the handlers you have overridden automatically, as many times as necessary:

    C#
    //Create the parser ("res" is assumed to be defined elsewhere)
    MyParser parser = new MyParser(res);
    
    //Create the reader and associate the parser
    RtfReader reader = new RtfReader(parser);
    
    //Load the RTF document (rutaRTF holds the path to the RTF file)
    reader.LoadRtfFile(rutaRTF);
    
    //Start parsing
    reader.Parse();
  3. RtfDocument class

    You can create new RTF documents using the new class RtfDocument (beta):

    C#
    RtfDocument doc = new RtfDocument("testdoc.rtf");
    
    RtfTextFormat format = new RtfTextFormat();
    format.size = 20;
    format.bold = true;
    format.underline = true;
    
    doc.AddText("Title", format);
    doc.AddNewLine();
    doc.AddNewLine();
    
    format.size = 12;
    format.bold = false;
    format.underline = false;
    
    doc.AddText("This is a test.", format); 
    doc.AddText("This is a text.");
    
    doc.AddNewLine();
    
    doc.AddImage("test.png", 50, 50);
    
    doc.Close();
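
The DOM-like example above only looks at the direct children of the root node, but group nodes contain children of their own. A recursive walk, sketched below, visits the whole tree. It relies only on the members already used in that example (ChildNodes, NodeType, NodeKey, RTF_NODE_TYPE); the namespace is taken from the v0.2.0 history notes, the TreeWalker class is a hypothetical helper, and member or enum names may differ in other versions of the library.

```csharp
using System;
using Net.Sgoliver.NRtfTree.Core; // namespace as listed in the v0.2.0 history notes (assumed)

// Hypothetical helper, not part of NRtfTree.
public static class TreeWalker
{
    // Prints every keyword in the tree, indented by its nesting depth.
    // ChildNodes is only read on the root node and on group nodes.
    public static void Walk(RtfTreeNode node, int depth)
    {
        for (int i = 0; i < node.ChildNodes.Count; i++)
        {
            RtfTreeNode child = node.ChildNodes[i];

            if (child.NodeType == RTF_NODE_TYPE.KEYWORD)
            {
                Console.WriteLine(new string(' ', depth * 2) + child.NodeKey);
            }
            else if (child.NodeType == RTF_NODE_TYPE.GROUP)
            {
                Walk(child, depth + 1); // descend into the nested group
            }
        }
    }

    public static void Main(string[] args)
    {
        RtfTree tree = new RtfTree();
        tree.LoadRtfFile(args[0]); // path to an RTF file, passed on the command line
        Walk(tree.RootNode, 0);
    }
}
```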

Software License

NRtfTree Library is licensed under the GNU LGPL license.

More Information

You can find up-to-date information on my personal home page (Spanish) or on the NRtfTree SourceForge project page (English).

History

  • 2007/09/02 - v0.3.0 beta 1
    • New license: LGPL.
    • New classes to create RTF documents (basic support in beta): RtfDocument, RtfColorTable, RtfFontTable and RtfTextFormat.
    • RtfTree class:
      • New property MergeSpecialCharacters. When it is set to true, a special character ('\') found in the document is converted to a Text node and, where possible, merged with adjacent text nodes.
      • New property Text. Returns plain text from the RTF document.
      • New method GetEncoding(). Returns document encoding.
    • RtfTreeNode class:
      • New property Tree. Returns a reference to owner RTF tree.
      • New method ToString().
      • New method InsertChild(). Inserts a new node at the specified location.
      • Methods SelectXXXByType() have been replaced by SelectXXX() overloads.
      • New methods SelectSibling() (3 overloads).
    • RtfNodeCollection class:
      • New method Insert(). Inserts a new node at the specified location.
      • New method RemoveRange(). Removes a range of nodes from the list.
    • InfoGroup class:
      • New method ToString().
    • Fixed Bugs:
      • Group and Root node types initialization with "ROOT" and "GROUP".
      • NRtfTree.Rtf property didn't include last '}' in a group node RTF code.
      • NRtfTree did not correctly handle the special characters '\', '{' and '}' as part of the text.
      • Methods RtfTreeNode.AppendChild() and InsertChild() should update Root and Tree properties recursively.
  • 2006/12/10 - v0.2.1
    • Fixed - Bug in NRtfTree.SaveRtf() - Special character hex codes with one digit.
  • 2005/12/17 - v0.2.0
    • New namespaces: Net.Sgoliver.NRtfTree.Core and Net.Sgoliver.NRtfTree.Util
    • New classes: ImageNode, ObjectNode, InfoGroup.
    • RtfTreeNode class:
      • New properties: LastChild, NextSibling, PreviousSibling, Rtf.
      • New methods: CloneNode(), HasChildNodes(), SelectSingleNode(), SelectSingleChildNode(), SelectChildNodes(), SelectNodes(), SelectSingleChildNodeType(), SelectChildNodesByType(), SelectNodesByType(), SelectSingleNodeByType().
      • New indexer [equivalent to SelectSingleChildNode()].
      • Some optimization changes.
    • RtfTree class:
      • New methods: ToStringEx(), SaveRtf(), GetColorTable(), GetFontTable() and GetInfoGroup().
      • Some optimization changes.
      • Some bugs fixed.
    • RtfNodeCollection class:
      • New methods: IndexOf(), AddRange()
    • RtfLex class:
      • parseText() now ignores new line, tabs and null characters.
      • Some optimization changes.
  • 2005/08/13 - v0.1
    • First public release.

License

This article, along with any associated source code and files, is licensed under The GNU Lesser General Public License (LGPLv3)