Object Mapping Part I - The Row Cursor

Marc Clifton, J. Dunlap

17 Dec 2006

A row cursor implementation suitable for synchronizing and navigating a DataView against an object-mapped instance.

Download source files - 17.6 Kb

Introduction

The code for the ViewRecord was written by Justin Dunlap.

Background

I don't use object-relational mapping (ORM) because I use a client-server system that handles the table relationships for me at the server. This works quite well when working with the client-side UI components. However, I do need something for client and server-side plug-in business rules that is more robust than coding row["LastName"], for example. What I asked Justin to write for me was a code generator that creates a class containing the fields, their property getters and setters, and the necessary logic to do simple field validation (such as not-null). So this isn't ORM. Instead, it's straightforward object mapping--ORM without the "R".

Part I of this two-part article discusses something I discovered is pretty much a requirement--an intelligent row cursor--and the analysis of my requirements for an OM that led to this decision. Part II discusses the code generator--what it needs to do its job and what it generates--along with a set of additional requirements for the generated OM class.

The Architecture

The following describes the various requirements and their impact on the design that led to the final architecture.

Requirement: Hide The DataView

One of the requirements I made was to hide the DataView and the DataRowView classes from the application. The interface to the OM should be abstracted enough that the underlying data collection could be anything--a DataView, a DataSet, XML, even a comma separated file (CSF). The only time you would need to reference the underlying collection type is in the constructor to the concrete OM representation.

Impact

This requirement affected the architecture in the following ways:

An interface could be used for navigating the data collection independent of the concrete implementation of that data collection.
Because the application can now be isolated from the underlying concrete data set, the typical scenario of iterating through the rows of a DataView would now be different. Instead, the application would navigate through the collection using the navigation methods or the iterator provided by the implementing data collection manager.
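As a minimal sketch of what this isolation could look like, the following uses a hypothetical IRowNavigator interface and a hypothetical ListNavigator implementation over an in-memory list. Neither name comes from the ViewRecord source (the article's actual interface is IRecord, shown later); they only illustrate that application code can navigate without knowing the concrete collection type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical navigation interface, for illustration only.
public interface IRowNavigator
{
  bool HasRecords { get; }
  void First();
  bool Next();
  object this[string fieldName] { get; }
}

// One possible concrete implementation: rows held as dictionaries in a list.
// A DataView-backed or XML-backed navigator would expose the same interface.
public class ListNavigator : IRowNavigator
{
  private readonly List<Dictionary<string, object>> rows;
  private int index = -1;

  public ListNavigator(List<Dictionary<string, object>> rows)
  {
    this.rows = rows;
  }

  public bool HasRecords { get { return rows.Count > 0; } }

  public void First() { index = rows.Count > 0 ? 0 : -1; }

  public bool Next()
  {
    if (index + 1 >= rows.Count) return false;
    index++;
    return true;
  }

  public object this[string fieldName] { get { return rows[index][fieldName]; } }
}

public static class NavigatorDemo
{
  // Application code sees only the interface, never the underlying collection.
  public static List<string> CollectLastNames(IRowNavigator nav)
  {
    List<string> names = new List<string>();
    if (!nav.HasRecords) return names;
    nav.First();
    do { names.Add((string)nav["LastName"]); } while (nav.Next());
    return names;
  }
}
```

Only the constructor call (new ListNavigator(rows)) mentions the concrete type; CollectLastNames would work unchanged against any other implementation of the interface.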

Requirement: There Is No Separate Record Manager

A typical implementation would have a record manager that implements record navigation and OM persistence, returning an OM object when navigating the record collection. I decided against this approach and instead Justin implemented the record navigation in a base class that the OM generated class derives from. This means that the OM class provides both the mapping functions and, via the base class, the navigation functions.

Impact

This design decision has some interesting side effects:

You can reuse the same OM object as you navigate through the record collection. If you're navigating through a million records, re-using the OM object is probably a good thing rather than instantiating a new OM object for every iteration. It's faster and doesn't stress out the garbage collector. Yes, you could pass in an object to a record manager for the purposes of re-using that object.
Position is managed by the OM object directly rather than by a separate record manager. There are pros and cons to this. The con is that the OM object is really handling two separate tasks, which is usually indicative of a bad design. On the pro side, it avoids the use of templates, which would otherwise be required so that the record manager can return the strongly typed OM object. I think it also makes things cleaner when manipulating a record collection in a multithreaded application. Since the OM instance knows where it is, you don't need to keep track of a separate record manager class associated with a processing thread.
The implementation of the enumerator looks weird because the concrete class implements its own iteration. So, the GetEnumerator method returns this. I find that non-intuitive.
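The "GetEnumerator returns this" pattern can be sketched as follows. SelfEnumeratingCursor is a hypothetical stand-in, not the ViewRecord class itself; it only shows how one object can act as both the enumerable and the enumerator, so that every iteration hands back the same reused instance.

```csharp
using System;
using System.Collections;

// Hypothetical cursor that is its own enumerator (illustration only).
public class SelfEnumeratingCursor : IEnumerable, IEnumerator
{
  private readonly string[] items;
  private int index = -1;

  public SelfEnumeratingCursor(string[] items)
  {
    this.items = items;
  }

  // The "mapped field" of whatever record the cursor is positioned on.
  public string Value { get { return items[index]; } }

  // IEnumerable: rewind and hand back this same object.
  public IEnumerator GetEnumerator()
  {
    Reset();
    return this;
  }

  // IEnumerator: the current element is the cursor itself, repositioned.
  public object Current { get { return this; } }

  public bool MoveNext()
  {
    index++;
    return index < items.Length;
  }

  public void Reset() { index = -1; }
}

public static class SelfEnumeratingDemo
{
  // Counts how many iterations yield the very same instance as the cursor.
  public static int CountSameInstance(SelfEnumeratingCursor cursor)
  {
    int same = 0;
    foreach (SelfEnumeratingCursor c in cursor)
    {
      if (ReferenceEquals(c, cursor)) same++;
    }
    return same;
  }
}
```

One consequence of this design is that two nested foreach loops over the same instance would interfere with each other, since there is only one position to share.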

From my perspective, it just seems simpler to be able to have an OM object and tell that object, "position yourself here" or "load the next record".

Requirement: Support Multithreaded Record Processing

I would really like to be able to split up a record collection amongst multiple processors and have each processor handle a subset of the records for processing.

Impact

This really means that I can't use a simple iterator. Do I really want to hand off each row to a thread in a thread pool as I iterate through the records? Or do I want to put the record into a queue for worker threads to pull off and process? These are completely valid approaches but involve a lot of overhead. For example, if I put the records into a queue, that means the queue count is potentially as large as the record count. When processing a lot of records, that could be really inefficient.

Or, I could give each worker thread a record index range to work on. That would be simple enough and accomplish the task quite simply.
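Splitting the collection into index ranges might look like the sketch below. IndexRange and RangePartitioner are hypothetical names introduced here for illustration; the start index and count they produce correspond roughly to what a per-thread start-info object would need to carry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical description of the slice of records one worker handles.
public struct IndexRange
{
  public readonly int Start;
  public readonly int Count;

  public IndexRange(int start, int count)
  {
    Start = start;
    Count = count;
  }
}

public static class RangePartitioner
{
  // Divides recordCount records into workerCount contiguous ranges.
  // The first (recordCount % workerCount) ranges get one extra record.
  public static List<IndexRange> Split(int recordCount, int workerCount)
  {
    if (recordCount < 0) throw new ArgumentOutOfRangeException("recordCount");
    if (workerCount <= 0) throw new ArgumentOutOfRangeException("workerCount");

    List<IndexRange> ranges = new List<IndexRange>();
    int baseSize = recordCount / workerCount;
    int remainder = recordCount % workerCount;
    int start = 0;

    for (int w = 0; w < workerCount; w++)
    {
      int size = baseSize + (w < remainder ? 1 : 0);
      ranges.Add(new IndexRange(start, size));
      start += size;
    }
    return ranges;
  }
}
```

For example, RangePartitioner.Split(10, 3) yields the ranges (0, 4), (4, 3) and (7, 3). Note that these are starting positions only; once workers begin deleting records, fixed indices stop being meaningful.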

What if those worker threads delete records?

Now there's an interesting question! What do you think this code does, given that dv is a DataView (no, it doesn't mean Divine Vitality) of 2 or more records?

foreach (DataRowView drv in dv)
{
  drv.Delete();
}

This code throws an IndexOutOfRangeException! Why? Let's say you have 3 records. The above code is the equivalent of saying:

dv.Delete(0);
dv.Delete(1);
dv.Delete(2);

But guess what? By the time dv.Delete(2) is called, there's only one record, at index 0! So we have a case for a row cursor independent of the DataView's intrinsic row indexing mechanism. This row cursor should adjust accordingly so that the following code works:

/// <summary>
/// This test validates that deleting
/// all records actually works using an iterator.
/// </summary>
[Test, Sequence(7)]
public void DeleteAllRecordsTest()
{
  foreach (ViewRecord vr in viewRec)
  {
    vr.Delete();
  }

  Assertion.Assert(dv.Count == 0, "Expected all records to be deleted.");
}
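For comparison, without a tracking row cursor the usual workaround on a plain DataView is to delete by index from the end toward the start, so that the indices still to be visited never shift. The sketch below assumes a small hypothetical "Person" table built just for the example; DeleteAllDemo is not part of the article's download.

```csharp
using System;
using System.Data;

public static class DeleteAllDemo
{
  // Builds a throwaway view with the requested number of rows (example data).
  public static DataView BuildView(int rows)
  {
    DataTable table = new DataTable("Person");
    table.Columns.Add("LastName", typeof(string));
    for (int i = 0; i < rows; i++)
    {
      table.Rows.Add("Name" + i);
    }
    table.AcceptChanges();
    return table.DefaultView;
  }

  // Deletes every row in the view, walking backwards so that removing
  // row i never changes the index of a row we have yet to visit.
  public static void DeleteAll(DataView dv)
  {
    for (int i = dv.Count - 1; i >= 0; i--)
    {
      dv.Delete(i);
    }
  }
}
```

This works for a single pass on a single thread, but it pushes the bookkeeping onto every caller, which is exactly what the row cursor is meant to avoid.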

Similarly, in a multithreaded environment, we don't want the behavior of one thread that is deleting a record to affect the behavior of another thread that is working on a record (including potentially deleting it as well). Therefore, this should work as well, when running on multiple threads:

protected void DeleteRecords(object si)
{
  StartInfo startInfo = (StartInfo)si;
  MockPersonViewRecord rec = startInfo.Rec;

  for (int i = 0; i < startInfo.Count; i++)
  {
    rec.Delete();
    // automatically moves to the next record.
  }
}

Requirement: Allow For Reverse Navigation And Direct Positioning

In addition to iterating through the row collection, the row cursor should be able to be positioned directly given an index. The additional methods First, Last, Next, and Previous are also implemented.

Requirement: The Row Cursor Should Always Track Where The Row Is In The List

This means many things. If I have a sorted list and I'm at the fifth record and I insert a record that ends up, as a result of the sorting, at an index prior to my current position, I still want the iterator (or the Next or Previous method) to return the correct record even though I've now moved my current position from index 5 to index 6. For example, let's say I have the following list of last names:

Clifton
Dunlap

sorted by last name, and I'm at the "Clifton" record. If I add a record with a last name of "A", when I call Next, I want to be positioned at "Dunlap", as the following unit test validates:

[Test, Sequence(0)]
public void InsertBeforeCurrentRowTest()
{
  viewRec.First();
  viewRec.NewRow();
  DataRowView drv = viewRec.CurrentRow;
  drv["LastName"] = "A";
  viewRec.CommitRecord();
  // The added record should now be the first record, but should not have 
  // affected our current record.
  viewRec.Next();
  Assertion.Assert(viewRec.CurrentRow["LastName"].ToString() == "Dunlap", 
       "Record index was not correctly positioned.");
}

If I did this with a dumb row cursor, the "Clifton" record at index 0 would go to index 1 but my row cursor would still be at index 0.
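The difference can be shown with a simplified model. TrackingCursor below is a hypothetical class over a sorted in-memory list, not the ViewRecord implementation (which reacts to DataView events instead); it only demonstrates the index adjustment that keeps the cursor on the same record after an insert.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cursor over a sorted list of last names (illustration only).
public class TrackingCursor
{
  private readonly List<string> names = new List<string>(); // kept sorted
  private int current = -1;

  public int Position { get { return current; } }
  public string Current { get { return names[current]; } }

  // Inserts at the sorted position and shifts the cursor if the new
  // entry landed at or before the record the cursor is on.
  public void Insert(string name)
  {
    int pos = names.BinarySearch(name, StringComparer.Ordinal);
    if (pos < 0) pos = ~pos; // bitwise complement gives the insertion point
    names.Insert(pos, name);
    if (current >= 0 && pos <= current) current++;
  }

  public void First() { current = names.Count > 0 ? 0 : -1; }

  public bool Next()
  {
    if (current + 1 >= names.Count) return false;
    current++;
    return true;
  }
}
```

With "Clifton" and "Dunlap" loaded and the cursor on "Clifton", inserting "A" moves the cursor from index 0 to index 1, so it is still on "Clifton" and a subsequent Next lands on "Dunlap". A dumb cursor would skip the increment in Insert and end up pointing at "A".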

Impact

This can only really be done if I have a row cursor and the list changes are tracked so that when I call the Next method, I get the correct row at the correct position. Granted, iterating through a collection while modifying the collection is something that usually throws an exception. Oddly enough, the DataView class doesn't throw a collection modified exception when, say, deleting rows within an iterator. But more to the point, I think the row cursor should always track where the currently loaded row is positioned in the list, whether records are inserted, deleted, or the field on which one is sorting is changed.

Caveat

One of the frustrating things about a DataRowView instance is that when the sort field or row filter is changed, the physical row that the DataRowView is pointing to actually changes, because the DataRowView is referencing the physical row based on its index; change the sort or row filter and the collection changes, but not the DataRowView index. The following test should work, given that the row cursor is positioned on the row containing "Marc" for the first name, and there is a row with "Mary" somewhere in the collection as well, with nothing alphabetically in between:

[Test, Sequence(7)]
public void SortOrderChangeTest()
{
  viewRec.First();
  viewRec.View.Sort = "FirstName";
  Assertion.Assert(viewRec.FirstName == "Marc", 
       "Collection should not have changed.");
  Assertion.Assert(viewRec.CurrentRow["FirstName"].ToString() == "Marc", 
       "Collection should not have changed.");
  viewRec.Next();
  Assertion.Assert(viewRec.FirstName == "Mary", "Incorrect next row.");
}

However, in order to achieve this (and similarly when the row filter is changed) the routines handling the list change have to brute force iterate through each row to locate the position of the DataRowView that references the physical DataRow prior to the sort change. That's unfortunate, but the row cursor does stay positioned correctly!
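A brute-force lookup of this kind could be sketched as below. RowLocator and its sample table are hypothetical and written for this example; the idea is simply to remember the physical DataRow before the change and then scan the view for the DataRowView that wraps it afterwards.

```csharp
using System;
using System.Data;

public static class RowLocator
{
  // Scans the view for the DataRowView wrapping the given physical row.
  // Returns its index, or -1 if the row is no longer visible in the view.
  public static int IndexOfRow(DataView view, DataRow row)
  {
    for (int i = 0; i < view.Count; i++)
    {
      if (ReferenceEquals(view[i].Row, row)) return i;
    }
    return -1;
  }

  // Example data: three first names in unsorted insertion order.
  public static DataView BuildSampleView()
  {
    DataTable table = new DataTable("Person");
    table.Columns.Add("FirstName", typeof(string));
    table.Rows.Add("Zed");
    table.Rows.Add("Marc");
    table.Rows.Add("Mary");
    return new DataView(table);
  }
}
```

In this sample, the "Zed" row starts at index 0; after setting Sort to "FirstName" the scan finds it at index 2, and after applying a RowFilter that excludes it the scan returns -1. The cost is linear in the number of rows for every sort or filter change.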

A Row Cursor Is Not So Simple After All

The simplest concept of a row cursor is just an index into the data collection. First sets the index to 0, Next increments it, Previous decrements it, and Last sets it to the last item in the collection. Sounds simple enough, but as you can see from the above requirements, it actually isn't that simple. More importantly, if you have an object-mapping system, it is critical that the object that maps to a particular record is always synchronized with where that record is. Because the DataView is such a nice component with its Sort and RowFilter properties, any OM needs to work with a DataView, in my opinion, rather than the underlying DataTable. If we worked just with the DataTable, a row cursor would be simpler. Any row cursor implementation needs to be careful to handle the case where the DataRowView changes as a result of a change to the Sort or RowFilter properties, besides the "regular" changes to the collection that can occur when inserting or deleting items. Issues such as updating the cursor position when the collection is changed are less important, but in my opinion handling them makes for a consistent and more robust tool.

The ViewRecord hooks the DataView's ListChanged event to help track changes to the current row index. The following code is from the event handler itself.

private void DataViewListChanged(Object sender, ListChangedEventArgs e)
{
  lock (dataView)
  {
    // Update the current row index based on the change, and handle the
    // case where the current record is removed.
    switch (e.ListChangedType)
    {
      case ListChangedType.ItemDeleted:
        // The list has changed, making the current record invalid.
        listChanged = true;

        if (e.NewIndex < currentRowIndex)
        {
          // A row before the current row was deleted, so we shift back.
          currentRowIndex--;
        }
        // If we are deleting the current row...
        // (This is an "else if" so that the decrement above cannot make an
        // earlier deleted index look like the current row.)
        else if (e.NewIndex == currentRowIndex)
        {
          // And there's more rows in the record set from our position...
          if (currentRowIndex < dataView.Count)
          {
            // Set the current row.
            LoadRecord(currentRowIndex);
            // Overrides the Next method.
            recordDeleted = true;
          }
          else
          {
            // Otherwise, we're at the end of the record set.
            EndOfData();
          }
        }
        break;

      case ListChangedType.ItemAdded:
        // The list has changed, making the current record invalid.
        listChanged = true;

        // If this is a new record, set the row index.
        if (currentRowIndex == -1)
        {
          currentRowIndex = e.NewIndex;
        }
        break;

      case ListChangedType.ItemMoved:
        // The list has changed, making the current record invalid.
        listChanged = true;

        // If the moved row comes before this row...
        if (e.OldIndex < currentRowIndex)
        {
          // If the row was moved so that it is after the current row,
          // then this row will have been shifted backwards.
          if (e.NewIndex >= currentRowIndex)
          {
            currentRowIndex--;
          }
        }
        // If the row that was moved is this row, then the event's new index 
        // is this row's new index.
        else if (e.OldIndex == currentRowIndex)
        {
          currentRowIndex = e.NewIndex;
        }
        else if (e.OldIndex > currentRowIndex)
        {
          // If a record is moved from past the current row index to prior 
          // to it, then the current row index must be incremented.
          if (e.NewIndex <= currentRowIndex)
          {
            currentRowIndex++;
          }
        }
        break;

      case ListChangedType.ItemChanged:
        break;

      case ListChangedType.Reset:
        ResetRow();
        listChanged = true;
        break;
    }
  }
}

The ListChangedType Cases

The following describes the cases that ViewRecord deals with when the ListChanged event is raised.

ItemDeleted

In this case, if the row being deleted is prior to the current row, then the row index is decremented. Additionally, if the row being deleted is the current row, a check is made to see if the next record exists, and if it does, that record is automatically loaded. A flag is then set to indicate that the call to Next shouldn't do anything, as the record is already positioned on the record following the one deleted.
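The "skip the next call to Next" behavior can be modeled in isolation. DeletingCursor below is a hypothetical, simplified cursor over an in-memory list, written for illustration; it mirrors the recordDeleted flag described above but is not the ViewRecord code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cursor that stays correctly positioned across deletes.
public class DeletingCursor
{
  private readonly List<string> rows;
  private int current;
  private bool recordDeleted;

  public DeletingCursor(IEnumerable<string> source)
  {
    rows = new List<string>(source);
    current = rows.Count > 0 ? 0 : -1;
  }

  public int Count { get { return rows.Count; } }
  public bool AtEnd { get { return current < 0 || current >= rows.Count; } }
  public string Current { get { return rows[current]; } }

  // Deletes the current row. The following row (if any) slides into the
  // same index, so the cursor is already positioned on the "next" record.
  public void Delete()
  {
    if (AtEnd) throw new InvalidOperationException("No current record.");
    rows.RemoveAt(current);
    recordDeleted = current < rows.Count;
  }

  // After a delete, the first call to Next is a no-op that reports success,
  // because the cursor is already on the record following the deleted one.
  public bool Next()
  {
    if (recordDeleted)
    {
      recordDeleted = false;
      return true;
    }
    if (current + 1 >= rows.Count)
    {
      current = rows.Count; // park past the end
      return false;
    }
    current++;
    return true;
  }
}
```

With this in place, a loop of the form "do { cursor.Delete(); } while (cursor.Next());" removes every row without skipping any, which is the behavior the delete-all unit test relies on.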

ItemAdded

The ItemAdded case only sets the current row index if the record being added is a new record. This change event is a bit quirky though. The following unit test illustrates this:

[Test, Sequence(0)]
public void InsertBeforeCurrentRowTest()
{
  viewRec.First();
  viewRec.LastName = "A";
  viewRec.FirstName = "B";
  viewRec.ID = Guid.NewGuid();
  viewRec.AddRecord();
  // The added record should now be the current record and the first record, 
  // and the next record should be the second record.
  viewRec.Next();
  Assertion.Assert(viewRec.LastName == "Clifton", 
       "Record index was not correctly positioned.");
}

The record set consists of a sorted list. This is important to remember.

The AddRecord call ends up making a call to dataView.AddNew() via the NewRow() method:

currentRowIndex = -1;
NewRow();
ValidateAndCommitFields();
// The behavior of EndEdit when adding a new record with a sort field is that
// the currentRow will point to the last row regardless of where the row got
// moved to. See A below.
currentRow.EndEdit();

Calling dataView.AddNew() results in the ItemAdded change event being fired. AddRecord has set the currentRowIndex to -1, so the first condition is met, and the currentRowIndex is then set to the index of the row being added. This is all well and good. When EndEdit is called, the event fires two more times. The first time, the list change type is again ItemAdded! The currentRowIndex is now at the last record, but the NewIndex is 0, because the row is being added at the beginning of the list. Interestingly, the next event is the ItemMoved change type.

So we have three events that fire when a row is added:

dataView.AddNew() results in an ItemAdded event firing with e.NewIndex equal to the last record (newly added) in the list.
currentRow.EndEdit() results in an ItemAdded event firing with e.NewIndex equal to the position of the record in its sorted location. e.OldIndex is -1, incidentally.
ItemMoved then fires, with e.OldIndex set to the index of the last record and e.NewIndex set to the index of where the record got moved.

In my opinion, the second ItemAdded event should not fire, but since it does, we ignore it or any ItemAdded event that does not affect our cursor. It's possible that we could determine from the NewIndex and OldIndex that the record is moving, but I'd prefer to have the ItemMoved change handle that situation.
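If you want to observe this event sequence yourself, a small logger along the following lines should do. ListChangedLogger and its two-row sample table are hypothetical and exist only for this example; the exact events raised may differ between framework versions, so the sketch records what happens rather than assuming a particular sequence.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;

public static class ListChangedLogger
{
  // Adds a row that sorts to the front of a sorted view and returns a
  // description of every ListChanged event raised along the way.
  public static List<string> LogAdd()
  {
    DataTable table = new DataTable("Person");
    table.Columns.Add("LastName", typeof(string));
    table.Rows.Add("Clifton");
    table.Rows.Add("Dunlap");

    DataView view = new DataView(table);
    view.Sort = "LastName";

    List<string> log = new List<string>();
    view.ListChanged += delegate(object sender, ListChangedEventArgs e)
    {
      log.Add(e.ListChangedType + " new=" + e.NewIndex + " old=" + e.OldIndex);
    };

    DataRowView drv = view.AddNew(); // first event is raised here
    drv["LastName"] = "A";
    drv.EndEdit();                   // further events are raised here

    return log;
  }
}
```

Printing each entry of the returned list shows the change type together with NewIndex and OldIndex, which is enough to compare against the three events described above.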

ItemMoved

As the code above illustrates, there are three cases:

A row is moved from earlier in the list to later relative to the current position, so decrement the current row position.
The current row is moved, so merely set it to its new position.
A row is moved from later in the list to earlier relative to the current position, so increment the current row position.
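The three cases above can be expressed as a pure function, which makes them easy to test apart from the DataView. CursorMath.AfterMove is a hypothetical helper written for this illustration; it follows the same comparisons as the ItemMoved branch of the event handler.

```csharp
using System;

public static class CursorMath
{
  // Returns the cursor's new index after some row moved from oldIndex to
  // newIndex, given that the cursor was at index current before the move.
  public static int AfterMove(int current, int oldIndex, int newIndex)
  {
    // Case 2: the cursor's own row moved, so follow it.
    if (oldIndex == current) return newIndex;

    // Case 1: a row before the cursor moved to at or after it.
    if (oldIndex < current && newIndex >= current) return current - 1;

    // Case 3: a row after the cursor moved to at or before it.
    if (oldIndex > current && newIndex <= current) return current + 1;

    // The move happened entirely on one side of the cursor: no change.
    return current;
  }
}
```

For a cursor at index 3: AfterMove(3, 1, 5) gives 2, AfterMove(3, 3, 0) gives 0, AfterMove(3, 5, 1) gives 4, and a move entirely before or after the cursor, such as AfterMove(3, 0, 1), leaves it at 3.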

ItemChanged

ItemChanged is ignored. Changing the value of a sorted-on field results in an ItemMoved event rather than an ItemChanged event, as this unit test illustrates:

public void ChangeSortFieldTest()
{
  viewRec.Last();
  viewRec.Previous();
  Assertion.Assert(viewRec.LastName == "Harrison", 
         "Not at second to last record.");
  viewRec.LastName = "D";
  viewRec.Update();
  viewRec.Next();
  Assertion.Assert(viewRec.LastName == "Dunlap", "Not at third record.");
}

The only time the ItemChanged case occurs, as far as I've determined, is when a non-sorting field value changes, in which case we don't really care because the record doesn't move.

Reset

This change event occurs when the Sort or RowFilter is changed, or some other major change to the list happens. In this case, we have to brute force synchronize to the index of the data view row that references the physical row we were at. If no such data view row exists (perhaps the RowFilter excluded it) then the current row position is set to -1.

The ViewRecord Class

The ViewRecord class is abstract and is intended to be used as the base class for a code-generated mapping class of the underlying DataView. However, you can use it as a simple row cursor. In Part II, I'll describe how the code generator that Justin wrote works.

Implementing A Simple RowCursor

The MockEmptyViewRecord class included with the unit tests illustrates an empty class suitable for a simple row cursor. In fact, a series of unit tests use this mock object to test just the record navigation and row cursor aspects of the ViewRecord class.

using System;
using System.Data;

using Interacx.Dev;

namespace ViewRecordUnitTests
{
  /// <summary>
  /// A mock view record that does not implement the record loading 
  /// and committing.
  /// This object is intended to test only record navigation and manipulation.
  /// </summary>
  public class MockEmptyViewRecord : ViewRecord
  {
    public MockEmptyViewRecord(DataView dataView)
      : base(dataView)
    {
    }

    protected override void LoadAllFields()
    {
    }

    protected override void LoadField(string fieldName)
    {
    }

    protected override void ValidateAndCommitFields()
    {
    }
  }
}

As you can see, there are three methods that are stubbed out that deal exclusively with moving data between the OM class and the underlying record.

The IRecord Interface

The IRecord interface describes the methods and properties available generically:

using System;
using System.Collections;

namespace Interacx.Dev
{
  public interface IRecord : IEnumerable
  {
    /// <summary>
    /// Returns true if the record collection has records.
    /// </summary>
    bool HasRecords { get;}

    /// <summary>
    /// Returns true if the list has changed--its order or the number of 
    /// entries.
    /// </summary>
    bool ListChanged { get;}

    /// <summary>
    /// Gets/sets the current record position.
    /// </summary>
    int Position { get; set;}

    /// <summary>
    /// Used for creating records without a derived object-mapping class.
    /// </summary>
    void NewRow();

    /// <summary>
    /// Creates a new record and populates the record with the current 
    /// OM field values.
    /// </summary>
    void Add();

    /// <summary>
    /// Updates the current record with the current OM field values.
    /// </summary>
    bool Update();

    /// <summary>
    /// Deletes the current record. May be used without a derived 
    /// object-mapping class.
    /// </summary>
    void Delete();

    /// <summary>
    /// Navigates to the first record.
    /// </summary>
    void First();

    /// <summary>
    /// Navigates to the last record.
    /// </summary>
    void Last();

    /// <summary>
    /// Navigates to the next record. Returns false if there are no further 
    /// records.
    /// </summary>
    bool Next();

    /// <summary>
    /// Navigates to the previous record. Returns false if there are no 
    /// further records.
    /// </summary>
    bool Previous();

    /// <summary>
    /// Loads the record at the specified index and sets the current row 
    /// position to that record.
    /// </summary>
    /// <param name="index"></param>
    void LoadRecord(int index);
  }
}

In addition, the ViewRecord class gives you access to the underlying DataView and DataRowView, as it is intended to be used with a DataView:

/// <summary>
/// Returns the underlying DataView.
/// </summary>
public DataView View
{
  get { return dataView; }
}

/// <summary>
/// Returns the underlying DataRowView for the current row.
/// </summary>
public DataRowView CurrentRow
{
  get { return currentRow; }
}

The Unit Tests

The unit tests are written for my Advanced Unit Test engine and perform a variety of tests, as illustrated here. They are probably not too comprehensive.

Conclusion

Generally speaking, the creators of the .NET framework must have looked long and hard at the issue of whether a row cursor is a good thing and decided it wasn't. It seems to me that a lot of effort has gone into making a row cursor unnecessary for most uses of the DataView and DataTable classes. However, I feel that if you want to do something more sophisticated than forward iteration, handle multithreaded list processing, perform object mapping, and all the while keep yourself synchronized with changes that are occurring to the list, then it seems that an intelligent row cursor is a necessary requirement.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


