Formatting .NET Assembly Summary Documentation

cigwork

Mar 5, 2013

CPOL

Using XSLT to create on-line summary documentation for .NET Assemblies

Download source - 17.3 KB

Some General Hemming and Hawing by way of a Preamble

I'm afraid that there are no great coding secrets revealed in this article. I'm posting it because, although there seem to be a great number of similar tools available, they either cost cold hard cash or are free but fairly inflexible in what they do. I thought (hoped) that other .NET developers would find this either immediately useful or a good leg up for developing their own (improved) version.

If you're completely new to XSLT then you may find the recursive templates and the demonstration of how to pass global parameters to a template from code useful; otherwise it's all pretty vanilla.

Background

All versions of C# and later versions of VB.NET allow the addition of summary comment blocks on types, methods and properties and these summary comment blocks can be output as an XML file during compilation if document file generation is requested. Document file generation can be requested from the IDE (project / properties) or by specifying the /doc flag on the command line.

The conversion of the compiler generated XML to a display format meeting local standards is left to the programmer. This document outlines the construction of a simple summary HTML document generator using the tools available in Framework 3.5.

The compiler recognises the following tags in source code:

Tag	Remark
`<summary>`	A short description of the item.
`<remarks>`	A fuller description about or additional commentary on the item.
`<value>`	Describes the value that a property gets or sets.
`<param>`	Provide commentary on a method parameter. What's valid, what's not and so on.
`<returns>`	Commentary on the return value of a method.
`<exception cref="...">`	Lists exceptions thrown by a method, type or property. `<exception cref="System.ArgumentException">`Thrown when Fred is null.`</exception>`
`<example>`	This is intended to hold text associated with an example. However, because my "local standard" is to put one or two lines of sample code here rather than in the `<c>` and `<code>` elements this utility formats the content of this element as code rather than as text.
`<code>`	Intended for multiline code samples.
`<c>`	Intended for in-line code samples.
`<see cref="{a.n.other_member}">...</see>`	Intended for in-line cross references.
`<seealso cref="{a.n.other_member}">...</seealso>`	Intended for a separate "see also" section.
`<list>` `<para>`	For bulleted, numbered or tabular lists, and for paragraphs within other elements.

The utility as supplied doesn't deal with the less commonly used elements listed above but it does allow you to embed simple HTML formatting in the <remarks> section as shown below.

  /// <summary>
  /// Apply an XSL transform to a well formed XML string 
  /// returning the transform output as a string.
  /// </summary>
  /// <param name="xmlToTransform">A string containing well formed XML.</param>
  /// <param name="xslTemplatePath">Fully specified XSLT file path</param>
  /// <param name="paramList">A name, value dictionary of global template parameters. Can be empty but not null.</param>
  /// <returns>A well formed XML string.</returns>
  /// <example>
  ///string template = Server.MapPath(@"~/XSL/ToTypeHierarchyXML.xsl");
  ///string transformOutput = Library.XML.ApplyTransform(source, template, new Dictionary&lt;string, string&gt;());
  /// </example>
  /// <exception cref="System.Xml.Xsl.XsltException"></exception>
  /// <exception cref="System.Xml.XmlException">
  /// Method rethrows any XML or XSLT exceptions it encounters.
  /// </exception>

  /// <remarks>
  /// <ol type="1">
  /// <li>The template file must exist and the process must have read access.</li>
  /// <li>This and other methods are not intended for use with large XML documents.</li>
  /// </ol>
  /// </remarks>

  public static string ApplyTransform(string xmlToTransform, 
                                      string xslTemplatePath,
                                      Dictionary<string,string> paramList)

Each component of the assembly is represented by a <member> block and the ownership of methods and properties by types is shown by the use of fully qualified names rather than in the structure of the XML. Each fully qualified member name is given a single letter prefix to indicate its classification.

Member prefixes include:

Prefix	Group	Comment
N:	Namespace
T:	Type	Includes class, struct, interface, enumeration and delegate.
M:	Method
P:	Property
E:	Event
F:	Field	Ignored by this utility.
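
For what it's worth, the prefix table above boils down to a simple lookup. The following is a minimal C# sketch of that lookup; the names MemberNames and Classify are made up for illustration and are not part of the utility, which does this work in XSLT.

```csharp
public static class MemberNames
{
    // Map the single-letter prefix of a documentation member name,
    // e.g. "M:Documenter.Library.XML.FileTransform(System.String,System.String)",
    // to the group it belongs to.
    public static string Classify(string memberName)
    {
        // A valid name is at least "X:" where X is the prefix letter.
        if (string.IsNullOrEmpty(memberName) || memberName.Length < 2 || memberName[1] != ':')
            return "Unknown";

        switch (memberName[0])
        {
            case 'N': return "Namespace";
            case 'T': return "Type";
            case 'M': return "Method";
            case 'P': return "Property";
            case 'E': return "Event";
            case 'F': return "Field";
            default: return "Unknown";
        }
    }
}
```

With this sketch, MemberNames.Classify("T:Documenter.Library.XML") returns "Type".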

The extract below is from an XML document file produced by the compiler.

<?xml version="1.0"?>
<doc>
    <assembly>
        <name>Documenter</name>
    </assembly>
    <members>

        :

        <member name="T:Documenter.Library.XML">
            <summary>
            Group XML appropriate methods.
            </summary>
        </member>

        <member name="M:Documenter.Library.XML.FileTransform(System.String,System.String)">
            <summary>
            Apply a transform to a file returning a string
            </summary>
            <param name="filePath"></param>
            <param name="xslTemplate"></param>
            <returns></returns>
        </member>

        :

        <member name="T:Documenter.Library.ForTesting.myHandler">
            <summary>
            Delegate : Here to generate an event member for test purposes.
            </summary>
            <param name="alpha">first parameter</param>
            <param name="beta">second parameter</param>
            <returns>True if method invocation succeeds.</returns>
        </member>

        :

    </members>
</doc>

You'll note from the fragment above that:

- The return type of a method or property isn't included unless the coder explicitly puts it in the <returns> element.
- The nature (class, struct, delegate, enum etc.) of a type isn't available.
- The scope (public, private etc.) of a member isn't available.
- The parameter types for methods are given as a comma delimited string.
- Parameter types for delegates aren't available, only the parameter description.

Some experimentation (and not a little swearing) showed this "flat" format generated by the compiler not to be suitable for direct generation of output using XSLT 1.0 (Framework 3.5 does not support XSLT 2.0) so the document generation process is run as a two step process:

1. Apply an XSLT template to convert the flat form to a more hierarchical form.
2. Apply a second XSLT template to the transformed XML to generate an HTML page.

The templates used are:

TypeSelect.xsl	Allows us to display the documentation for a single type (class, struct, whatever).
ToTypeHierarchyXML.xsl	Generates the more hierarchical form.
ToHTML.xsl	Generates the output HTML from the intermediate XML produced by the type hierarchy transform.

TypeSelect

This uses a simple for-each to identify the <member> element for a type as well as any nested types it may contain and these elements are used to generate a cut-down copy of the "flat" source XML.

ToTypeHierarchy

There are three parts to this template worth mentioning:

- The main scan
- unadornedName
- toNodes

Main Scan

The flat nature of the compiler generated XML means that the easiest way to process it is using nested for-each iterators (beware; for-each is not an indexed for loop). This is almost certainly not the fastest nor most elegant solution, but it is easy to implement and understand.

A side effect of this approach is that if the owning type (class/interface/struct) doesn't have a summary block then none of its methods will be documented.

unadornedName

Member (method, property, event) names were initially extracted using one or more calls to the substring-after() function, but very, very occasionally the first one or two characters of the member name would be stripped. Most unsatisfactory.

There is no built in delimited string splitter in XSLT 1.0 so we have to roll our own. Complicating factors are that xsl:variables are write once read many and there is no equivalent of an indexed for loop. The standard approach is to use recursion.

This template takes a fully qualified member name, such as "M:Documenter.Library.Extensions.DefaultValue", and returns the member name without its prefix, namespace or type qualifiers. Use of the term "returns" is slightly misleading; it may be better to think of recursive templates as delaying the writing of the element or attribute to the output stream until the desired end point is reached.
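
To make the recursion concrete, here is a minimal sketch rather than the template shipped in the download. The inline stylesheet, the lastSegment template name and the UnadornedNameDemo wrapper are all assumptions for illustration. The stylesheet strips the prefix and any parameter list, then calls itself with the result of substring-after() until no '.' remains, and only then writes anything to the output.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class UnadornedNameDemo
{
    // Hypothetical stylesheet: a recursive named template that keeps
    // discarding everything up to the first '.' until no '.' is left.
    private const string Stylesheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>

  <!-- Drop the prefix (e.g. 'M:') and any parameter list before recursing. -->
  <xsl:template match='/member'>
    <xsl:call-template name='lastSegment'>
      <xsl:with-param name='text'
        select=""substring-before(concat(substring-after(@name, ':'), '('), '(')""/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name='lastSegment'>
    <xsl:param name='text'/>
    <xsl:choose>
      <xsl:when test=""contains($text, '.')"">
        <xsl:call-template name='lastSegment'>
          <xsl:with-param name='text' select=""substring-after($text, '.')""/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select='$text'/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>";

    public static string GetUnadornedName(string memberName)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader xsl = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsl);
        }

        // Wrap the name in a one-element document for the template to match.
        XmlDocument input = new XmlDocument();
        XmlElement member = input.CreateElement("member");
        member.SetAttribute("name", memberName);
        input.AppendChild(member);

        using (StringWriter output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Under these assumptions, passing "M:Documenter.Library.Extensions.DefaultValue(System.String,System.String)" gives "DefaultValue". Removing the parameter list first matters, because the parameter types contain dots of their own.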

toNodes

This is handed a CSV list of parameter types. Unlike unadornedName, we are interested in writing to the output stream at each stage, not just at the end. As each parameter type is encountered a <paramType> node is written. If the remaining input string is not empty then a recursive call is made to the template.

An interesting (read annoying) wrinkle was found late on.

If you have a method with a signature:

   public static string ApplyTransform(string xmlToTransform, 
                                       string xslTemplate,
                                       Dictionary<string,string> paramList)

You end up with intermediate XML of the form:

   String,String,Dictionary{System.String,System.String}

So it becomes necessary to treat the "{" & "}" as escape characters in toNodes to avoid splitting the generic type's arg list. The result is a nested <xsl:choose> structure to handle this.
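
The brace handling is easier to see outside XSLT. The sketch below is a C# illustration of the same idea, not a port of the toNodes template: it walks the string once and only splits on commas seen while not inside a "{...}" group. The names ParamTypes and Split are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ParamTypes
{
    // Split a comma-delimited parameter type list, treating commas inside
    // {...} (generic type arguments) as part of the current item.
    public static List<string> Split(string paramTypes)
    {
        List<string> result = new List<string>();
        if (string.IsNullOrEmpty(paramTypes))
            return result;

        StringBuilder current = new StringBuilder();
        int depth = 0; // number of unclosed '{' characters seen so far

        foreach (char c in paramTypes)
        {
            if (c == '{')
                depth++;
            else if (c == '}' && depth > 0)
                depth--;

            if (c == ',' && depth == 0)
            {
                // A top-level comma ends the current parameter type.
                result.Add(current.ToString().Trim());
                current.Length = 0;
            }
            else
            {
                current.Append(c);
            }
        }

        // Whatever is left after the last top-level comma is the final item.
        result.Add(current.ToString().Trim());
        return result;
    }
}
```

Given "String,String,Dictionary{System.String,System.String}" this returns three items, the last being "Dictionary{System.String,System.String}". Counting depth rather than using a simple flag also copes with nested generic arguments.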

Having said that, the toNodes and unadornedName templates look to me as though they are eminently reusable.

TypeHierarchy output - Intermediate XML

The converted output has the following general layout:

  <assembly name="...">
    <type name="...">

      <typeHeader>
      <summary> a summary comment</summary> 
      <!-- 
        Delegates and other parameterised types will also have param, 
        value and returns elements.
        -->
      <param name="firstArg">The first argument.</param> 
      <param name="secondArg">The second argument.</param> 
      </typeHeader>


      <!-- method comment -->
      <method name="..." paramTypes="...">
      <summary>...    </summary> 
      <paramType typeName="..." /> 
      <paramType typeName="..." /> 
      <param name="..." /> 
      <param name="..." /> 
      <returns>...</returns>
      <remarks>...</remarks>
      <example>...</example>
      <exception cref="">...</exception>
      </method>

      <!-- property comment -->
      <property name="..." paramTypes="...">
      <summary>...    </summary> 
      <paramType typeName="..." /> 
      <paramType typeName="..." /> 
      <param name="..." /> 
      <param name="..." /> 
      <returns>...</returns>
      <remarks>...</remarks>
      <example>...</example>
      <exception cref="">...</exception>
      </property>

      <!-- event comment -->
      <event name="..." paramTypes="...">
      <summary>...    </summary> 
      <paramType typeName="..." /> 
      <paramType typeName="..." /> 
      <param name="..." /> 
      <param name="..." /> 
      <returns>...</returns>
      <remarks>...</remarks>
      <example>...</example>
      <exception cref="">...</exception>
      </event>

      <!-- T:Assembly.Namespace.Class.AType--> 
      <nestedType xref="Assembly.Namespace.Class.AType" 
                  name="AType" 
                  summary="A nested type (struct, class, enum, delegate)." /> 

    </type>
  </assembly>

Method Example

  <type name="Library.Extensions">
  <!-- M:Documenter.Library.Extensions.DefaultValue(System.String,System.String) --> 
  <method name="DefaultValue" paramTypes="System.String,System.String">
  <summary>Deal with null strings.</summary> 
  <paramType typeName="System.String" /> 
  <paramType typeName="System.String" /> 
  <param name="s" /> 
  <param name="defaultValue" /> 
  </method>

Notes:

- The <returns> and <remarks> elements do not appear in the example above because they were empty in the source XML for the method.
- paramTypes is retained as a CSV string attribute of the member type to allow the display of parameter types as a single string using XSLT 1.0 should it be necessary.
- The <paramType> and <param> elements are kept separate for two main reasons:
  - The comma separated string listing the parameter types is generated by the compiler, so it is always up to date.
  - Once created, the <param> nodes in the source <summary> block are not updated automatically if the method signature changes, so there may be fewer (or more) <param> elements than <paramType> elements.

ToHTML

This template has only one section of significance: the paramType template.

paramType

This template is responsible for matching parameter names with parameter types. Within the template the most important lines are:

      <xsl:variable name="position" select="position()"/>
      <xsl:variable name="paramName" select="../param[$position]/@name"/>

The first line notes the position (a 1-based index) of the current <paramType> node in the sequence of <paramType> nodes for the current member. The next line retrieves the parameter name from the sequence of <param> nodes of the parent member (method etc.) of the current <paramType>. If the summary block for the member is up to date there will be a 1:1 correspondence. This correspondence means that we can detect the addition of a new parameter where there has been no update of the member's <summary> block by the absence of a matching <param> node for a <paramType> node. Unfortunately it isn't possible to identify the removal or renaming of a parameter.

Something else worth noting is the separation of the position() call from the node access. Use of a single line:

  <xsl:variable name="paramName" select="../param[position()]/@name"/>

...results in the first node in the <param> sequence being retrieved for each <paramType>, regardless of the <paramType> index position. This looks unexpected, since position() should "...report the position of the context item in the sequence", but it is standard XPath 1.0 behaviour: inside a predicate the context item is the node being tested, so within param[...] the call to position() refers to each <param> in turn rather than to the current <paramType>. The predicate is therefore true for every <param>, and taking @name as a string yields the first one. By retrieving the value as $position in the first line we ensure that the <paramType> context is used.
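
The difference between the two forms can be checked outside the stylesheet with the XPath support in System.Xml. This is a small sketch against a made-up <method> fragment; the class name PositionContextDemo and the sample XML are assumptions for illustration only.

```csharp
using System.Xml;

public static class PositionContextDemo
{
    // Hypothetical intermediate XML: two parameter types, two parameters.
    private const string Sample =
        "<method>" +
        "<paramType typeName='System.String'/>" +
        "<paramType typeName='System.Int32'/>" +
        "<param name='first'/>" +
        "<param name='second'/>" +
        "</method>";

    // Number of <param> nodes selected by "../param[position()]" when the
    // current node is the second <paramType>.
    public static int CountWithInlinePosition()
    {
        XmlNode paramType = SecondParamType();
        return paramType.SelectNodes("../param[position()]").Count;
    }

    // Name found when the index is fixed before the predicate is evaluated,
    // which is what the $position variable achieves in the template.
    public static string NameAtIndex(int index)
    {
        XmlNode paramType = SecondParamType();
        return paramType.SelectSingleNode("../param[" + index + "]/@name").Value;
    }

    private static XmlNode SecondParamType()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Sample);
        return doc.SelectSingleNode("/method/paramType[2]");
    }
}
```

CountWithInlinePosition() returns 2, because the predicate is evaluated once per <param> and is true each time; NameAtIndex(2) returns "second".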

  <!-- Lay out parameters where we have parameter types available. -->
  <xsl:template match="paramType">

    <span class="typeName">
      <!-- Mark by-reference (ref/out) parameters, which carry an '@' suffix, with (out) -->
      <xsl:choose>
        <xsl:when test="contains(@typeName, '@')">
          <xsl:value-of select="normalize-space(substring-before(@typeName,'@'))"/>
          <xsl:value-of select="' (out) '"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="normalize-space(@typeName)"/>
          <xsl:text disable-output-escaping="yes"> </xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </span>

    <xsl:variable name="position" select="position()"/>
    <span class="parameterName">
      <!-- If the summary block is up to date show the parameter name
       otherwise note that the block is out of date. -->
      <xsl:variable name="paramName" select="../param[$position]/@name"/>
      <xsl:choose>
        <xsl:when test="string-length($paramName) = 0">
          <span class="remarks">{ Summary block needs updating. }</span>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="normalize-space($paramName)"/>
        </xsl:otherwise>
      </xsl:choose>
    </span>

    <!-- Write out any remarks for this parameter -->
    <div class="indentedRemarks">
      <xsl:value-of select="../param[$position]"/>
    </div>

  </xsl:template>

Document Generation

Once we have our templates, creating the output couldn't really be any easier...

  void onGenerate(object sender, EventArgs e)
  {

    string templatePath = null;
    Dictionary<string, string> searchParams = new Dictionary<string,string>();
    string typeName = fqClassName.Text;

    // Get the content of the document file. StreamReader reads to the end
    // of the stream and skips any byte order mark at the start of the file.
    HttpPostedFile f = FileUpload1.PostedFile;
    string documentation;
    using (System.IO.StreamReader reader = new System.IO.StreamReader(f.InputStream, System.Text.Encoding.UTF8))
    {
      documentation = reader.ReadToEnd().Trim();
    }

    if (string.IsNullOrEmpty(documentation))
      Response.Write(@"Couldn't upload the XML. Try again.");
    else
    {

      // If we're only interested in one type then extract it and its constituents
      // into a mini version of the source XML.       
      if (!string.IsNullOrEmpty(typeName))
      {
        searchParams.Add("typeNameSought", typeName);
        templatePath = Server.MapPath(@"~/XSL/SelectType.xsl");
        documentation = Library.XML.ApplyTransform(documentation, 
                                                   templatePath, 
                                                   searchParams);
      }

      // Now turn the flattish compiler output into something
      // with a bit more of a hierarchy about it then...
      templatePath = Server.MapPath(@"~/XSL/ToTypeHierarchyXML.xsl");
      documentation = Library.XML.ApplyTransform(documentation, 
                                                 templatePath, 
                                                 new Dictionary<string, string>());

      // ...turn the hierarchical XML into HTML before...
      templatePath = Server.MapPath(@"~/XSL/ToHTML.xsl");
      documentation = Library.XML.ApplyTransform(documentation, 
                                                 templatePath, 
                                                 new Dictionary<string, string>());

      // ...pushing it back to the user.
      Response.Write(documentation);
    }

    Response.End();
  }

Notes

If there are non-printing characters before the opening <?xml ... ?> declaration in the source XML then an XML exception is thrown when the transform is attempted. VB.NET seems to be guilty of this.

The Library.XML class is just a wrapper for some standard .NET XslCompiledTransform calls, and MSDN has a good explanation of their use.
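
The body of ApplyTransform isn't reproduced in this article, so the following is only a guess at its general shape, not the code in the download. It uses XslCompiledTransform and passes the dictionary entries to the stylesheet as global parameters through an XsltArgumentList. The class name XmlTransformSketch is hypothetical and error handling is omitted.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlTransformSketch
{
    // Apply the stylesheet at xslTemplatePath to an XML string and return
    // the transform output as a string.
    public static string ApplyTransform(string xmlToTransform,
                                        string xslTemplatePath,
                                        Dictionary<string, string> paramList)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        transform.Load(xslTemplatePath);

        // Each dictionary entry becomes the value of a global <xsl:param>
        // of the same name (in no namespace).
        XsltArgumentList arguments = new XsltArgumentList();
        foreach (KeyValuePair<string, string> pair in paramList)
        {
            arguments.AddParam(pair.Key, string.Empty, pair.Value);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xmlToTransform)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(input, arguments, output);
            return output.ToString();
        }
    }
}
```

On the stylesheet side the value is picked up by a top-level declaration such as <xsl:param name="typeNameSought"/>, after which it can be referenced as $typeNameSought.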

Running the transforms from a browser to a web page has a number of advantages:

- No need for everyone to have their own local copy of the util.
- Everyone automatically uses the latest version of the templates if local documentation standards change.
- Not so much temptation or need to print stuff off.
- The documentation is always as current as the last compiler run.

Using the Utility

1. Browse for the required document file.
2. If you are interested in a specific type enter its fully qualified name, otherwise leave blank to get all types in the assembly.
3. Click the [Generate] button.
4. Read the output...

The yellow title bars? Ahh BeOS. Now there was a proper operating system...

Points Of Interest

I stumbled across a couple of points that may be worth pointing out if you are new to XSL.

Don't be afraid of using <xsl:variable>. It may not be good style to do so, but variables can make things a great deal easier to read (and write), especially if you've got deeply nested string function calls.

substring-before returns an empty string if the string doesn't contain the delimiting string or character used. This is unhelpful. There are a couple of ways around this and I've used both. One is an <xsl:choose> block (see unadornedName for an example), although the choose block is a little verbose. Much more straightforward is the use of concat() to append the delimiter, which guarantees that it will be found.

  substring-before( concat( @name, '(' ) , '(' )

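A similar trick with substring-after is less useful. Prepending the delimiter, as below, means the first match is always the prepended character, so the expression returns @name unchanged whether or not it contains the delimiter. Use an <xsl:choose> when you need the text after a delimiter that may be absent.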

  substring-after(  concat( '(', @name ) , '('  )
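
The substring-before guard above can be tried without writing a stylesheet, because the same XPath 1.0 string functions are available through XPathNavigator.Evaluate. A minimal sketch follows; the class name SubstringGuardDemo and the sample strings are assumptions for illustration.

```csharp
using System.Xml;
using System.Xml.XPath;

public static class SubstringGuardDemo
{
    // Evaluate an XPath 1.0 string expression against an empty document.
    // Only expressions that produce a string should be passed in.
    public static string Eval(string xpath)
    {
        XPathNavigator navigator = new XmlDocument().CreateNavigator();
        return (string)navigator.Evaluate(xpath);
    }
}
```

Eval("substring-before('Fred', '(')") returns an empty string because the delimiter is missing, whereas Eval("substring-before(concat('Fred', '('), '(')") returns "Fred". When the delimiter is already present, as in 'Fred(System.String)', the guarded form still returns "Fred" because only the first match counts.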

Questions

Why isn't class XXYZ or method HH32A showing up in the generated document?

- Because whoever wrote the code for XXYZ and HH32A was a lazy tyke who didn't include summary blocks for them.
- There's a bug.

Why doesn't this utility document fields?

Because most of the time it'd clutter the output beyond the point of usefulness. This utility is intended to give a programmer a feel for an assembly as quickly as possible. See the next point...

Why on earth would I want to bother with this when I can just go straight to the source code?

Indeed you can, but consider the following:

You're starting cold on an existing huge project and you're expected to be productive in hours rather than weeks. A summary document produced by a utility like this provides an easy to read overview of the capabilities and purpose of each class and how the various classes within the assembly are connected and, because parameter types are fully qualified, across assemblies. It can take a lot of spelunking through code to put that information together in your head.
You're part way through a life stretch (with no remission for good behaviour) on a major project and you've come up against something that makes you think you'll need a new library method. If you have summary documentation it becomes very much easier to check to see if a method that does what you want already exists.
You're wrapping up a major new module or leaving to take up a new job (or, better yet, you've been left a massive inheritance by Great Aunt Hilda and no longer need to slave in the code mines) and have been asked to produce a hand-over guide for colleagues by your :
```
  [ ] Project Manager
  [ ] Team Leader,
  [ ] Evil Overlord 
  (tick all that apply)
```
No need to worry. You can just point the PM, TL or EO at a utility such as this. That and a few class diagrams from the IDE dropped into your favourite word processor will go a very long way to satisfying that requirement.
You're a PM, TL or EO on a project and have just lost a long serving team member and you need to get your remaining ~~deadbeats~~valued staff up to speed on his or her areas of expertise before the whole project goes pear shaped.
You are a CMMI auditor (boo! hiss!) and want to carry out a local coding standards ~~witchhunt~~check.

Of course all of the foregoing only applies if coders actually bother to write meaningful comments in the summary blocks. Hey ho...

History

February 2013 - Add type selection.
February 2012 - First cut.