Introduction to the Entity Framework, or, My Initial Exploration into the Entity Framework
I wanted to learn about the Entity Framework using some basic database patterns:
- singleton table
- one-to-one association
- one-to-many association
- many-to-many association
to see how the Entity Framework handles these relationship patterns.
Requirements
To use the Entity Framework and related wizards, you have to have VS2008 SP1 and .NET 3.5 SP1 installed.
What You Will Learn
Basic information on creating a project that uses the Entity Framework:
- Working with a simple table (PersonName)
- Simple editing and data binding
- Working with a one-to-one relationship (Person has a PersonName)
- Working with a one-to-many relationship (Person has a collection of PersonName)
- Working with a many-to-many relationship via an association table (Persons and Locations)
- One or more people can be associated with zero or more locations
What You Will Not Learn
- Anything having to do with best practices, because I have no idea
- Anything having to do with performance, again because I have no idea
- How you might use the Entity Framework in an n-tier environment (something to do with proxies, but again, I'm clueless)
The Data Model
Actually, I'm going to be creating the data model in steps. When all is said and done, we should have a good understanding of how the Entity Framework works with some basic database architecture patterns.
PersonName Table
First, we'll start off with a simple PersonName table that includes an ID, FirstName and LastName fields. This table will be used for demonstrating using the Entity Framework with a simple table.
Person Table
The Person table will contain the ID, Gender, and PersonNameID fields, and will demonstrate a simple one-to-one relationship in which one Person references one PersonName. The idea here is to explore how the Entity Framework handles "including" or "joining" two tables.
Revisiting the PersonName Table
I'm going to change the PersonName table later to have a PersonID field to establish a one-to-many relationship so we can support the concept that a Person can have several names: maiden name, real name, aliases, etc. Put another way, I'll be creating a master-detail relationship and seeing how that works in the Entity Framework.
Location Table
The Location table will consist of the ID, StreetNumber, StreetName, and Zip fields. There will also be a PersonLocationAssociation table to establish a many-to-many association between Person and Location. For example, several people can "live at" a location. An office has many people that "work at" that location. The initial association table will simply have ID, PersonID, and LocationID fields.
Working with a Simple Table
Create the Model
So, to begin, let's create a form for a simple PersonName table. In your favorite database (one supported by the Entity Framework, harhar), create a new database and add the PersonName table:
ID | integer , primary key, not nullable, identity |
FirstName | nvarchar(64) , not nullable |
LastName | nvarchar(64) , not nullable |
Remember that ID is an identity value. All the ID fields in tables we're going to create are designated as identity values.
Create an EntityFramework WinForms Application
- Launch Visual Studio 2008 and create a WinForms application called PersonName.
- In the Solution Explorer, right-click on the PersonName project and select Add/New Item...
- Click on the ADO.NET Entity Data Model wizard found in the template list (if you don't see it, you don't have VS2008 SP1 and/or .NET 3.5 SP1 installed).
- Change the name in the textbox below "PersonName.edmx". This is the Entity Data Model (EDM) metadata file.
- Click "OK".
- You will want to "Generate from database", so click "Next" again.
- Select from the list of database servers, or create a new connection. Test the connection before proceeding with the next step.
These next two steps are very important:
- In the textbox below the checkbox "Save entity connection setting in App.Config as:", you will first want to make sure this box is checked (by default, it is).
- Second, you will want to put in a name that makes sense and has the correct casing, because this will actually generate the context class using this name. So, type in "PersonNameEntities".
Finally:
- Click Next.
- Select the PersonName table from the list of tables.
- Click "Finish".
A bunch of references will be added to your project; after generating the code, a diagram of your data model (showing just the PersonName table) will be displayed.
Creating the User Interface
Now, let's create a simple user interface to edit the PersonName table.
- In the Form Designer, add a
DataGridView
to the form and name it "dgvPersonName
". - Double-click on the form to generate the
Load
event handler. - Add the following
using
statements at the top of the code file:
using System.Data.Objects;
using System.Data.Objects.DataClasses;
- Add the following field to the class:
protected PersonNameEntities personNameContext;
- Instantiate the
personNameContext
field in the form's constructor after the InitializeComponent()
call:
public Form1()
{
InitializeComponent();
personNameContext = new PersonNameEntities();
}
- And, for the form's
Load
event handler, add a few lines of code so it looks like this:
private void Form1_Load(object sender, EventArgs e)
{
ObjectQuery<PersonName> personNameQuery = personNameContext.PersonName;
dgvPersonName.DataSource = personNameQuery;
dgvPersonName.Columns["ID"].Visible = false;
}
We're hiding the ID because this is an identity field assigned by the database.
- In the form designer, add a "Save" button and create a
Click
event handler that calls SaveChanges()
for personNameContext
.
private void btnSave_Click(object sender, EventArgs e)
{
personNameContext.SaveChanges();
}
Run the Program!
You should have something similar to the above, where you can fill in names, edit them, delete them, and so forth, and save your changes to the database.
OK, this is pretty nifty, as I haven't had to write a single SQL statement in this, albeit, it being a two-tier application.
So, What's Going On?
PersonNameEntities
is auto-generated and derived from ObjectContext
. What is this ObjectContext
class? Well, MSDN probably says it best:
"Provides facilities for querying and working with entity data as objects."
and:
The ObjectContext
class is the primary class for interacting with data as objects that are instances of entity types that are defined in an Entity Data Model (EDM). An instance of the ObjectContext
class encapsulates the following:
- A connection to the database, in the form of an
EntityConnection
object. - Metadata that describes the model, in the form of a
MetadataWorkspace
object. - An
ObjectStateManager
object that manages objects persisted in the cache.
The auto-generated subclass, PersonNameEntities
, has a useful method and property:
AddToPersonName(PersonName personName)
which you would use to programmatically add a PersonName
to the collection managed by the context. What I find a little odd is that there isn't a "RemovePersonName" method, although the base ObjectContext
class does have a DeleteObject(object entity)
method for programmatically marking entities for deletion.
There is also a getter property for PersonName
, which we used in this line:
ObjectQuery<PersonName> personNameQuery = personNameContext.PersonName;
The ObjectQuery
class represents a typed query, whose type is the class specified in the generic parameter. Of note is that the data is not loaded until the object is actually used in a foreach
statement, assigned to fill a list collection, or when you explicitly call the Execute
method. Also of note is that the ObjectQuery
class implements IListSource
, so it can be used as a data source for a list control like a grid, which is what we're doing in the next line:
dgvPersonName.DataSource = personNameQuery;
Finally, the table fields are exposed as properties in the PersonName
class (which was generated for us as well), so you can take advantage of strong typing and the field names when accessing the fields in code.
Working with a One-to-One Association
In this section, I will look at how a one-to-one association works. Thi
s is not a common database association pattern, but it's useful to look at because it has similarities to how a view would work with the Entity Framework.
Create the Model
Delete out the data in the PersonName table by executing this SQL statement (or use the app above to delete all the rows!):
delete from PersonName
Now, let's create a Person table that has a one-to-one relationship with the PersonName table:
ID | integer , primary key, identity, not nullable |
PersonNameID | integer , foreign key to PersonName.ID, not nullable |
Gender | char(1) , not nullable |
It's a bit contrived, but will suit our purposes. Let's add a name directly with SQL:
insert into personname (firstname, lastname) values ('Marc', 'Clifton')
select * from personname
Using the ID that is returned from the PersonName table, populate the Person table:
insert into person (gender, personnameid) values ('M', 12)
Replace "12" with whatever the correct ID is.
Create an EntityFramework WinForms Application
Create a new WinForms application called PersonPersonName
, following the steps above. This time however, when you select the tables, select both the Person and PersonName tables. Add the code to initialize the context as we did above.
Create the User Interface
Add a grid view to the form. Since this is a one-to-one association, we want to be able to edit the name and gender on a single row of the grid, and we want the Entity Framework to handle the separate CRUD operations (or at least, that's the idea.)
As before, double-click on the form to create the Load
event handler.
Now, we can load the data in a couple different ways. For example, we can suck in the contents of both tables and do the join in LINQ. This obviously isn't the preferred way of doing a join, because you get everything from both tables then do the join on the client side, rather than having the database do the join and return only the qualified rows.
private void Form1_Load(object sender, EventArgs e)
{
ObjectQuery<Person> person = ppnContext.Person;
ObjectQuery<PersonName> personName = ppnContext.PersonName;
var ppnQuery = from p in person
join pn in personName
on p.PersonName.ID equals pn.ID
select new { p.Gender, pn.FirstName, pn.LastName };
dgvPersonPersonName.DataSource = ppnQuery;
}
You can also write it like this:
private void Form1_Load(object sender, EventArgs e)
{
var ppnQuery=ppnContext.Person.Join(
ppnContext.PersonName,
person => person.PersonName,
personName => personName,
(person, personName) => new
{
Gender=person.Gender,
LastName=personName.LastName,
FirstName=personName.FirstName});
dgvPersonPersonName.DataSource = ppnQuery;
}
What is the Generated SQL?
Now, is the database doing the join for you, or does this just decompose to something similar to the first example? I have no idea! There is a debugger visualizer that Scott Gu has posted on his blog; however, I'm using a Vista-64 and it appears not to work. Someone else posted a comment regarding this problem as well. Since I'm using SQL 2005 Express, I don't have a profiler. When I get this set up with SQL 2005 or SQL 2008, I'll update the article with more information on the SQL statement that the server is executing.
Notice the Read Only DataGridView
Notice that the DataGridView
control is in a read-only state:
Because the resulting data set is constructed by joining two tables, we can no longer edit (change, insert, or delete) the data using the control's edit capabilities. I've dealt with this issue successfully in other frameworks, and I'm disappointed that the Entity Framework does not provide the smarts to provide you with an editable data set when the data set is constructed from a join statement(s).
Deleting a Row
So, let's modify our UI so we can add, edit, and delete rows "manually". To begin, we will add a BindingSource
so we can get the currently selected row and modify the query so that we get something more concrete than an anonymous type, which means we need to define a class for the Person-PersonName record:
public class PPN
{
public string Gender { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public Person PersonObject { get; set; }
public PersonName PersonNameObject {get; set;}
}
The query is modified to return an IQueryable<PPN>
collection, assign that collection to a BindingSource
, which is itself assigned to the DataGridView
control.
private void Form1_Load(object sender, EventArgs e)
{
IQueryable<PPN> ppnQuery=ppnContext.Person.Join(
ppnContext.PersonName,
person => person.PersonName,
personName => personName,
(person, personName) => new PPN()
{
Gender=person.Gender,
LastName=personName.LastName,
FirstName = personName.FirstName,
PersonObject=person,
PersonNameObject=personName
}).AsQueryable();
bsppn.DataSource = ppnQuery;
dgvPersonPersonName.DataSource = bsppn;
dgvPersonPersonName.Columns["PersonObject"].Visible = false;
dgvPersonPersonName.Columns["PersonNameObject"].Visible = false;
}
We can now delete the selected record when the user clicks on the "Delete" button (which we added to our UI):
private void btnDelete_Click(object sender, EventArgs e)
{
if (bsppn.Current != null)
{
PPN current = bsppn.Current as PPN;
ppnContext.DeleteObject(current.PersonObject);
ppnContext.DeleteObject(current.PersonNameObject);
ppnContext.SaveChanges();
Form1_Load(null, EventArgs.Empty);
}
}
The above quote acquires the PPN
record from the binding source, deletes the Person
and PersonName
records, and then saves the changes. The form data is reloaded. This is not a best practice, but I've implemented it this way so that you can see the change reflected in the DataGridView
.
If you think about what is going on in the above handler, this seems like it would have been more natural:
ppnContext.DeleteObject(current.PersonObject);
ppnContext.DeleteObject(current.PersonObject.PersonName);
However, PersonObject.PersonName
is null
. Similarly, one might imagine that deleting current.PersonObject
would be sufficient, expecting perhaps that the Entity Framework will perform a cascading delete. But, since PersonObject.PersonName
is null
, we can imagine that a cascading delete probably shouldn't work. In fact, this gives us a clue that we aren't really using the Entity Framework with best practices in mind--it seems that behind the scenes, the Entity Framework is still acquiring the records of both tables independently and joining them at the client side.
Why Not Use a View?
At this point, you probably have already asked, "Why not use a View?" The answer (for now) is that I want to keep things as "simple" (even if that means "stupid") as possible. If I use a View, it's also probably a best practice to use Stored Procedures for the CRUD operations against that View. Instead, I want to explore the capabilities of the Entity Framework as much as possible.
Later on, I'll be looking at real master-detail relationships and the Include(string path)
method rather than the Join<>
operation.
Adding a New Record
As the DataGridView
control is in a read-only state, I created separate controls to add a new record:
The code is simple enough (again, not necessarily representative of a best practice):
private void btnAdd_Click(object sender, EventArgs e)
{
PersonName pn = new PersonName();
pn.FirstName = tbFirstName.Text;
pn.LastName = tbLastName.Text;
Person p = new Person();
p.Gender = tbGender.Text;
p.PersonName = pn;
ppnContext.AddToPerson(p);
ppnContext.SaveChanges();
Form1_Load(null, EventArgs.Empty);
}
Note that I'm assigning the PersonName
instance to p.PersonName
, as the Entity Framework then knows about the association, so I don't have to deal with the primary and foreign keys.
Editing an Existing Row
Let's add some simple data binding to the TextBox
controls (by this time, I've refactored the Form Load
event, so the binding only happens once):
private void Form1_Load(object sender, EventArgs e)
{
LoadData();
tbGender.DataBindings.Add(new Binding("Text", bsppn, "Gender"));
tbFirstName.DataBindings.Add(new Binding("Text", bsppn, "FirstName"));
tbLastName.DataBindings.Add(new Binding("Text", bsppn, "LastName"));
}
As the DataGridView
control is in a read-only state, the data binding is one way. This means we have to manually update the Person
and PersonName
objects of the selected row:
private void btnUpdate_Click(object sender, EventArgs e)
{
if (bsppn.Current != null)
{
PPN current = bsppn.Current as PPN;
current.PersonObject.Gender = tbGender.Text;
current.PersonNameObject.FirstName = tbFirstName.Text;
current.PersonNameObject.LastName = tbLastName.Text;
ppnContext.SaveChanges();
LoadData();
}
}
Working With a One-to-Many Association
A more common association pattern is the one-to-many association. Here, I will look at creating this pattern between Person
and PersonName
by reversing the foreign key field, so that a Person can have many names, like maiden name, alias, and so forth.
Create the Model
Change the model so that PersonName has a PersonID which is a foreign key to the Person.ID table. Remove the PersonNameID field from the Person table.
Also, create a foreign key associating PersonName.PersonID with Person.ID.
Create an EntityFramework WinForms Application
As previously, create a WinForms application and add the Entity Framework information for the Person and PersonName tables using the ADO.NET Data Model Wizard.
Once you've done this, notice how the Person.PersonName
property is a collection! The Entity Framework has correctly understood that the relationship between Person
and PersonName
is a one-to-many relationship, and creates a PersonName
collection allowing us to associate many names to one person.
Create the User Interface
For this example, I created two DataGridView
controls, one for the master (Person) data, and the other for the detail (PersonName) data. A "Save" button saves the context, and as usual, the Form Load
event is used to load the data set. Notice, when we run the application, that both grids are editable:
The Load
event handler looks like this:
private void Form1_Load(object sender, EventArgs e)
{
ObjectQuery<Person> personQuery = mpnContext.Person.Include("PersonName");
bsPerson.DataSource = personQuery;
dgvPerson.DataSource = bsPerson;
dgvPerson.Columns["ID"].Visible = false;
dgvPerson.Columns["PersonName"].Visible = false;
dgvPersonName.DataBindings.Add(new Binding("DataSource",
bsPerson, "PersonName"));
dgvPersonName.Columns["ID"].Visible = false;
dgvPersonName.Columns["Person"].Visible = false;
}
Notice the use of the Include(string path)
method, which loads the data into a master-detail object model, meaning that each Person
instance has a collection of PersonName
objects specific to that association.
A BindingSource
tracks the currently selected row in the person DataGridView
, and this binding source is used to dynamically bind the Person.PersonName
collection with the detail DataGridView
control.
Reality Check
This is all that is necessary to add, edit, and delete rows (caveat, see cascading deletes next) in this master-detail model, which is certainly nifty. However, when using a highly normalized database, it's unrealistic to assume that a master-detail relationship can be represented by one master table and one detail table. Rather, the master and detail are probably Views. This makes working with a master-detail data set more complex, as we saw above.
Cascading Deletes
If we try to delete a master record and save the results, the following exception occurs (this assumes you did not specify "Cascade
" for the OnDelete
rule of the foreign key):
What we really want is to cascade the delete. We "should have" specified Cascade
for the OnDelete
rule of the foreign key, and then this would have generated the correct EDM. So instead, let's edit the EDM manually.
Now, we can delete master records without generating an error, and the detail records associated with the master will be automatically removed.
Reality Check
Integrity checking often depends on the context in which the application is run. For example, a standard user should probably not be able to remove Person records if there are associated detail records. However, a super-user or a high level administrator probably ought to have this capability. Implementing a cascade rule in the database and the EDM doesn't capture the context information that makes this determination. Yes, one can do this programmatically. But, do you really think doing this programmatically is the best approach when you're designing an enterprise architecture application? Personally, I think not.
Working With a Many-to-Many Association
In this last example, we'll look at a many-to-many association, between PersonName and a new table, Location, and the PersonNameLocationAssociation table.
Create the Model
Modify PersonName, removing the PersonID column, since we don't want to confuse the issue by requiring a Person record.
The Location table looks like this:
ID | integer , primary key, identity |
Number | nvarchar(16) , not nullable |
Street | nvarchar(64) , not nullable |
Zip | nvarchar(5) , not nullable |
The PersonNameLocationAssocation table is defined as:
PersonNameID | integer , foreign key to PersonName.ID, cascade on delete |
LocationID | integer , foreign key to Location.ID, cascade on delete |
The composite primary key is defined consisting of both fields.
Create an Entity Framework WinForms Application
This should be pretty routine by now. Create a WinForms application and use the ADO.NET Entity Data Model wizard to generate the classes for the PersonName, Location, and PersonNameLocationAssociation tables.
Notice that the wizard generates an entity data model that consists only of the PersonName and Location tables:
This is an artifact of the association table: it contains only the foreign keys. Later, I'll add another field (the association type) and we'll look at the EDM again. Also notice the "Navigation Properties"--we can link the Location direction to PersonName, and vice versa, without explicitly going through the association table.
Reality Check
This is a sort of cute shortcut, but is it really practical? In my experience, an association table will have additional properties about the association, perhaps describing the association type, the begin and end dates of the association, and so forth. So realistically, this shortcut is not something I would expect to be able to take advantage of in the real world.
The UI
The UI I created for this is rather simplistic, allowing you to create PersonName
and Location
records, and then associate them by selecting a row in each grid and clicking on the "Associate" button:
The Form Load
event handler should look familiar:
private void Form1_Load(object sender, EventArgs e)
{
ObjectQuery<PersonName> personQuery = context.PersonName;
ObjectQuery<Location> locationQuery = context.Location;
bsPersonName.DataSource = personQuery;
bsLocation.DataSource = locationQuery;
dgvPersonName.DataSource = bsPersonName;
dgvLocation.DataSource = bsLocation;
dgvPersonName.Columns["ID"].Visible = false;
dgvPersonName.Columns["Location"].Visible = false;
dgvLocation.Columns["ID"].Visible = false;
dgvLocation.Columns["PersonName"].Visible = false;
}
The Associate button event handler creates the association between the selected records:
private void btnAssociate_Click(object sender, EventArgs e)
{
if ((bsLocation.Current != null) && (bsPersonName.Current != null))
{
PersonName pn = (PersonName)bsPersonName.Current;
Location loc = (Location)bsLocation.Current;
pn.Location.Add(loc);
loc.PersonName.Add(pn);
context.SaveChanges();
}
}
As you can see, I've used a BindingSource
to manage the currency (the selected row) in the two DataGridView
controls.
After making a couple associations, we can verify that they were saved by querying the association table:
So, that's straightforward.
Association Tables With Property Fields
The last thing I want to look at in this article is, what happens when you add some property fields to the association. For example, let's add the AssocType
field (I'm really lazy, so we'll make an nvarchar(64)
) to the PersonNameLocationAssociation table, so we can enter some information about the association, like "lives at", "works at", and so forth.
Delete the .edmx file from the project and use the ADO.NET Entity Data Model wizard to add the PersonName, Location, and PersonNameLocationAssociation tables again (I don't see an easy way of updating an existing EDM).
Notice that now the association table is explicitly included in the EDM:
To make the association, we now have to work with all three entities:
private void btnAssociate_Click(object sender, EventArgs e)
{
if ((bsLocation.Current != null) && (bsPersonName.Current != null))
{
PersonName pn = (PersonName)bsPersonName.Current;
Location loc = (Location)bsLocation.Current;
PersonNameLocationAssociation assoc = new PersonNameLocationAssociation();
assoc.PersonName = pn;
assoc.Location = loc;
assoc.AssocType = "Lives At";
pn.PersonNameLocationAssociation.Add(assoc);
loc.PersonNameLocationAssociation.Add(assoc);
context.SaveChanges();
}
}
Yes, I hard-coded the AssocType
value. However, you can see from the above code that it's easy to work with an explicit association table. Oddly, I didn't have to add the association to the context's collection:
context.AddToPersonNameLocationAssociation(assoc);
so there must be some behind-the-scenes magic going on.
Miscellaneous Comments
The code that is generated includes factory methods, such as:
public static PersonName CreatePersonName(int id, string firstName, string lastName)
but notice that these methods include the table's ID, which is pointless (or supposedly so) because the IDs, at least in the data model I've been using, are identity fields, so they're assigned by the server, not the client. I did read something about the values being temporarily used in relationships, but it would seem unnecessary.
The factory method for the association table seems particularly useless as it requires the ID values of the associated tables (again, assigned by the server in my model, as they are identity fields):
public static PersonNameLocationAssociation
CreatePersonNameLocationAssociation(
int personNameID,
int locationID,
string assocType)
Other things that might be interesting to look at in future articles:
- client-side validation
- client-side error handling of server errors
- caching
- n-tier architecture
- system performance
- SQL statement performance
- disconnected state (Smart Client)
- security / encryption
- application context
- etc.
Basically, the common things one would have to deal with in an enterprise architecture, all of which touch upon the data service architecture, in this case being the Entity Framework.
Conclusion
So, that's an introduction on the Entity Framework with regards to different database association patterns. Being new to the Entity Framework, I hope that I haven't made any gross errors, but if I have, please leave a comment and I'll update the article. The Entity Framework seems to be pretty easy to use for basic stuff, and hopefully, Microsoft will address my pet peeve, not being able to update a joined data set easily, in some future release. Another problem that bothers me is not being able to easily view the generated SQL. And of course, I feel like I've just scratched the surface in determining whether the Entity Framework is a viable technology, and the list above seems rather daunting.