Saturday, October 4, 2008

My First NHibernate Project, Part 2

Part1, Part3

The next time...


Now that I have one project under my belt, what will I do differently next time? With more experience and a better understanding of NHibernate, I'd probably do several things differently if I were starting this project over. Here's a list of the main things I'd consider the next time I start a new project using NHibernate.

NConstruct... to use or not to use?
Although I'll consider NConstruct again (if nothing else, their development staff appears quite responsive to feedback and is very quick to incorporate requests), I'll probably try creating at least some of the classes and mapping files myself. Maybe I'm just a bit of a control freak, but now that I'm familiar with the file structures, I think it will be easier to create them myself than to use a tool and then have to correct the code generated by that tool. Still, it was able to generate quite a lot of code and XML with very little effort, so I'd probably still use it for most of my basic entities unless I decide to use interfaces and then just clean up the property names afterward (Refactor! is a pretty handy tool for helping with that). Maybe I'll beat my DBA around the head a little bit as well to see if I can get him to loosen up on his table naming rules a bit, which would drastically cut down on the amount of fixing up I need to do to property names!

Interfaces
Next time I think I'll also look into using interfaces with NHibernate. This is a bit of a trade-off, so I'm not sure whether I'll stick with that approach. On one hand, using interfaces makes unit testing easier. On the other hand, with interfaces you can't map database columns to fields... you must map to properties, which may not be acceptable if your setter method does things that you don't need to do when initializing the object from the database. For example, if your workplace is anything like mine, you may have to work with a legacy database with some not-too-clean data. You might want to add validation in your setter so that new data going in will be clean and shiny, but you couldn't use that setter when initializing an object from the database because you need to accept the record as-is, errors and all.
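To make the field-versus-property trade-off concrete, here is a minimal sketch of a concrete class whose setter validates new data while NHibernate populates the backing field directly. The Customer class, its ZipCode property, the five-character rule, and the column name are all hypothetical. The access="field.camelcase-underscore" setting in the comment is NHibernate's naming strategy for mapping a property called ZipCode to a field called _zipCode.

```vbnet
Imports System

' Hypothetical entity; the class, property and column names are for
' illustration only.
'
' Assumed mapping-file fragment. The access attribute tells NHibernate to
' read and write the _zipCode field directly instead of the property:
'
'   <property name="ZipCode" column="CustomerZipCode"
'             access="field.camelcase-underscore" />
Public Class Customer
    Private _zipCode As String

    Public Overridable Property ZipCode() As String
        Get
            Return _zipCode
        End Get
        Set(ByVal value As String)
            ' Validation runs only for values assigned by application code.
            ' Rows loaded from the database bypass it, so legacy data that
            ' breaks the rule can still be read as-is.
            If value Is Nothing OrElse value.Length <> 5 Then
                Throw New ArgumentException("ZIP code must be five characters.")
            End If
            _zipCode = value
        End Set
    End Property
End Class
```

An interface has no fields, so this kind of mapping is only available when NHibernate is pointed at the concrete class.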

Components
Components sound interesting because they're basically a way to deconstruct part of a table into a separate class which is then included as a property in the table's entity representation. The classic example of this is a table that includes address information (such as almost any database that includes contact records). The address fields (street address, city, state, zip code, etc.) can be defined as a component of type Address and the entity will then contain a property of type Address that holds the values of these columns. This is a good way to create small, general-purpose business objects that can be reused in different applications. In my current project I have a couple of tables that could have been represented this way had I been able to absorb everything about NHibernate at once! Instead, I modified the NConstruct-derived classes for these tables so that they implemented a pre-existing business object interface, encapsulated a concrete instance of that interface within the class to hold the properties of that interface, and modified the generated getters and setters of the entity class to use that instance. This could have been accomplished more easily using a component (although in this specific case it was a better implementation choice to not use a component, but at the time I didn't have enough familiarity with NHibernate to evaluate the applicability of components to the problem).
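As a rough illustration of the address example, a component might look something like the sketch below. The Contact and Address classes, their properties, and the table and column names in the mapping comment are all hypothetical; only the component element itself is standard NHibernate mapping syntax.

```vbnet
' Hypothetical classes; all names here are for illustration only.
'
' Assumed mapping-file fragment: the address columns live in the Contact
' table but are exposed through a separate Address class.
'
'   <class name="Contact" table="Contact">
'     <id name="Id" column="ContactId">
'       <generator class="native" />
'     </id>
'     <component name="MailingAddress" class="Address">
'       <property name="Street" column="ContactStreet" />
'       <property name="City" column="ContactCity" />
'     </component>
'   </class>

' Reusable value object with no identity of its own.
Public Class Address
    Private _street As String
    Private _city As String

    Public Property Street() As String
        Get
            Return _street
        End Get
        Set(ByVal value As String)
            _street = value
        End Set
    End Property

    Public Property City() As String
        Get
            Return _city
        End Get
        Set(ByVal value As String)
            _city = value
        End Set
    End Property
End Class

' Entity that owns the component.
Public Class Contact
    Private _id As Integer
    Private _mailingAddress As Address

    Public Overridable Property Id() As Integer
        Get
            Return _id
        End Get
        Set(ByVal value As Integer)
            _id = value
        End Set
    End Property

    Public Overridable Property MailingAddress() As Address
        Get
            Return _mailingAddress
        End Get
        Set(ByVal value As Address)
            _mailingAddress = value
        End Set
    End Property
End Class
```

The same Address class could then be mapped as a component of any other table that carries address columns.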

Named Queries
Named queries could also be useful for encapsulating some of the data layer. My data layer code consists of a number of methods that query the database to return single entities or collections of entities that match specific criteria. It also contains methods that return collections of arrays consisting of specific properties from entities (for example, when I only need a couple properties such as the id and name in order to populate a combobox or a list). This means that the data layer consists mainly of methods that look basically like this:

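Public Function GetSomeEntitiesBySomeProperty(ByVal propertyValue As PropertyType) As IList(Of EntityType)
    Try
        ' Get the session managed by the data layer.
        Dim s As ISession = GetSession()
        ' queryString is an HQL query with one positional parameter; use the
        ' IQuery setter that matches the parameter's actual type.
        Dim entities As IList(Of EntityType) = s.CreateQuery(queryString) _
            .SetString(0, propertyValue).List(Of EntityType)()
        Return entities
    Finally
        ' Always release the session, even if the query throws.
        DoSessionCleanup()
    End Try
End Function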

where GetSession and DoSessionCleanup are methods in my data layer that manage the reuse of Session objects.

The question then is where does the query string come from? It could be hard-coded into each method in a string, but that's a fairly inflexible approach. It could also be coded into the method using the ICriteria interface, but that suffers from the same problem (not to mention that it's just much too wordy for my taste, but if you like the declarative programming approach, feel free to go for it; I find that approach quite useful when writing unit tests in NUnit, but I'll probably pass in favor of HQL and SQL queries in NHibernate).

I ended up putting my query strings in Resource strings, the advantage being that it takes the query strings out of my code and into a resource file where it can be modified without touching the actual code. I could achieve basically the same end result using named queries, so that's an approach I'll probably look into next time (or maybe this time... there's still time to move the queries from the resource file into the mapping files if I want to give it a shot). The main advantage of moving the queries from the resource file into the named queries is that named queries are parsed once as opposed to queries passed in as strings, which are parsed each time unless they're cached (I believe NHibernate currently caches a certain number of the most recently used queries).
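For comparison, a named-query version of the lookup might look roughly like this. It is only a sketch: the query name, the User entity, and the GroupName property in the mapping comment are hypothetical, and the session is passed in as a parameter to keep the example self-contained (in my data layer it would come from GetSession, with DoSessionCleanup in a Finally block).

```vbnet
Imports System.Collections.Generic
Imports NHibernate

' Assumed mapping-file fragment (the query name, entity and property are
' hypothetical). A <query> element sits inside <hibernate-mapping>,
' after the <class> elements:
'
'   <query name="User.ByGroupName">
'     <![CDATA[ from User u where u.GroupName = :groupName ]]>
'   </query>

Public Module NamedQuerySketch
    ' Runs a named query that takes a single named parameter and returns
    ' the results as a typed list.
    Public Function ListByNamedQuery(Of T)( _
            ByVal s As ISession, _
            ByVal queryName As String, _
            ByVal parameterName As String, _
            ByVal parameterValue As Object) As IList(Of T)
        Return s.GetNamedQuery(queryName) _
            .SetParameter(parameterName, parameterValue) _
            .List(Of T)()
    End Function
End Module
```

With the fragment above in a mapping file, a call such as ListByNamedQuery(Of User)(session, "User.ByGroupName", "groupName", "Admins") would take the place of a method that reads its HQL from a resource string.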

Generics and ICriteria
Since the majority of my queries are very similar -- "Get all of the {some object}s where {some property} is {some value}", occasionally with an "ordered by {some property}" -- I could probably also drastically cut down on the number of queries I need to write if I used some combination of ICriteria and generic methods. But wait... didn't I just say that ICriteria wasn't any more flexible than hard-coding the queries as strings? Yes, but for basic lookups you can get a lot of bang for your buck if you write your lookup method using generics and a few input parameters. Here's a quick example of my previous pseudocode implemented using generics:
Public Function GetFilteredEntities(Of T As EntityBase) _
        (ByVal propertyName As String, ByVal propertyValue As Object, _
        Optional ByVal orderBy As String = Nothing) As IList(Of T)
    Try
        Dim s As ISession = GetSession()
        ' Filter on a single property: "where {propertyName} = {propertyValue}".
        Dim crit As ICriteria = s.CreateCriteria(GetType(T)) _
            .Add(Expression.Eq(propertyName, propertyValue))
        ' Sort only if the caller supplied a non-blank property name.
        If orderBy IsNot Nothing AndAlso Not String.IsNullOrEmpty(orderBy.Trim()) Then
            crit.AddOrder(Order.Asc(orderBy))
        End If
        ' Use the generic List overload so the result is an IList(Of T).
        Return crit.List(Of T)()
    Finally
        DoSessionCleanup()
    End Try
End Function

This one method could replace over a dozen methods in my current data layer, along with their associated query string resources.

While generic classes are a very powerful feature, I'm starting to believe that generic methods may be even more powerful. By using generics, this method can return a list of any entity type instead of a single type. In the above example, all of my entity classes inherit from the base class EntityBase so I specified that the generic type passed in be of that type or a subclass. Since each entity will have different properties and I may need to filter the same entity by different properties, I am passing in the name of the property by which to filter as a parameter and the value of that property as another parameter. Also, since ordering the results is a common requirement, I've added an optional parameter to specify a property by which to order the results.

Granted, this one method only handles a very limited set of possible queries, but I've found that this limited set actually covers most of the queries I need to do. In my experience I'm usually doing one of three things. Most often, I'm getting a single item from the database by Id. If I'm not getting a single item, I'm either getting everything from a table (but usually only if the table is fairly small), in which case I could use a method like the one above but without the propertyValue and propertyName parameters, or I'm getting a simple subset of a table, filtered on a single value (all users in group X, all orders for customer Y, all items for order Z, etc.). In this last case, the method above would work perfectly well. I could also use the method listed above for the first case where I'm getting a single item by Id, but I'd probably write another method similar to the one above except that it executes a Load() method instead of using ICriteria and returns a single item instead of a collection containing one item.
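The other two cases could be sketched along these lines. The module and method names are hypothetical, and the session is passed in as a parameter to keep the example self-contained rather than calling GetSession and DoSessionCleanup as my real data layer would.

```vbnet
Imports System.Collections.Generic
Imports NHibernate

Public Module LookupSketch
    ' Returns a single entity by its identifier. Load may hand back an
    ' uninitialized proxy and only fail later if the row does not exist;
    ' ISession.Get returns Nothing for a missing row instead.
    Public Function GetEntityById(Of T)(ByVal s As ISession, ByVal id As Object) As T
        Return s.Load(Of T)(id)
    End Function

    ' Returns every row of the mapped table, so it is best kept for
    ' small lookup tables.
    Public Function GetAllEntities(Of T As Class)(ByVal s As ISession) As IList(Of T)
        Return s.CreateCriteria(GetType(T)).List(Of T)()
    End Function
End Module
```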

If you're concerned with needing to filter on multiple criteria, you could easily extend the method listed above by passing in a dictionary of property name/value pairs instead of the single propertyName, propertyValue parameters. The method would then iterate over the dictionary, adding crit.Add(Expression.Eq(prop.Key, prop.Value)) for each property in the dictionary. This would build a filter where all of the properties must equal their specified values. If you wanted to return everything from a table, pass in an empty dictionary (I'd probably allow Nothing as a value and check for that to skip the loop as well).
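A minimal sketch of that dictionary-based variant follows. It assumes NHibernate 2.0, where the criteria classes live in the NHibernate.Criterion namespace (in 1.2 they were under NHibernate.Expression), and it uses Restrictions.Eq, which builds the same kind of equality criterion as Expression.Eq. The module name is hypothetical and the session is passed in to keep the example self-contained.

```vbnet
Imports System.Collections.Generic
Imports NHibernate
Imports NHibernate.Criterion

Public Module CriteriaSketch
    ' Returns the entities of type T whose properties equal all of the
    ' supplied values. Nothing or an empty dictionary returns every row.
    Public Function GetFilteredEntities(Of T As Class)( _
            ByVal s As ISession, _
            ByVal filters As IDictionary(Of String, Object), _
            Optional ByVal orderBy As String = Nothing) As IList(Of T)
        Dim crit As ICriteria = s.CreateCriteria(GetType(T))

        ' Each entry adds one "property = value" restriction; criteria
        ' added this way are combined with AND.
        If filters IsNot Nothing Then
            For Each prop As KeyValuePair(Of String, Object) In filters
                crit.Add(Restrictions.Eq(prop.Key, prop.Value))
            Next
        End If

        ' Sort only if the caller supplied a property name.
        If Not String.IsNullOrEmpty(orderBy) Then
            crit.AddOrder(Order.Asc(orderBy))
        End If

        Return crit.List(Of T)()
    End Function
End Module
```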

If you really wanted to get fancy, you could probably come up with ways to pass in a data structure that can specify more advanced filtering such as "greater than" and "less than" instead of just "equals" or to allow "or" as well as "and," but then your setup for the method and handling of the data structure start becoming too cumbersome to be useful, in my opinion. Unless I really needed the data layer to be totally dynamic, I'd go as far as the dictionary of property values and write specialized methods for the remaining 10% of my queries that can't fit into my general-purpose methods.

Coming up

In part three of my series on my first NHibernate project, I'll be discussing the main lessons I've learned and giving a few tips for new NHibernate users.

Monday, September 29, 2008

Lessons from My First NHibernate Project

Part2, Part3

I recently used NHibernate for the first time. As a result, I learned a number of things... some obvious, some not so obvious... that I thought might be helpful to other people coming to NHibernate for the first time. This is the first article of a three-part feature. I originally planned to post them all as one large article, but life interfered and it's now three weeks after I had started this article and I've decided to split it up so I can get part of it out there now while I finish the rest.

This first article briefly describes my experience getting started with NHibernate. The following two articles describe the things I think I might do differently next time now that I know a bit more about NHibernate, and the various tips and tricks I've learned so far along the way. There are a couple aspects of NHibernate that I felt are especially tricky for new users. I've mentioned those in this series of articles, but I'm planning to write a separate article covering each of those in more detail as well.

None of these articles are step-by-step tutorials on configuring or using NHibernate. I found that the documentation on NHibernate's site is very useful for that and writing an article on that would be redundant. Also, there are several articles on CodeProject that are quite useful for programmers getting started with NHibernate for the first time. This series of articles is more a listing of tips, tricks, and traps to avoid that I thought would be useful for programmers after they've gone through NHibernate's Quick Start Guide.

The Learning Curve

When it came time to start developing the data layer of my current work project, I decided to try NHibernate instead of coding yet another data layer myself in ADO.NET hoping to (1) save some development time, and (2) learn a new tool. Being new to NHibernate and a bit daunted by all the XML files required by it, I also decided to try NConstruct (the free lite version) to automate the generation of the initial classes and XML files. I seriously considered ActiveRecord from Castle Project since that replaces NHibernate's config files with attribute tags on the classes and properties, but finally decided against it for a couple reasons:
  • ActiveRecord doesn't support stored procedures, and at the time I was planning to use stored procedures (I didn't realize at the time that although NHibernate does now support stored procedures, its implementation is rather clunky to say the least. Also, after quite a bit of research, I discovered that many of my reasons for using stored procedures weren't totally valid and I ended up doing the project without them).
  • By using attributes, entity classes defined using ActiveRecord are not true POCOs (Plain Old CLR Objects). A true POCO should not include anything in its implementation that ties it to a data source or service. At its most basic, an entity POCO class would consist of little more than one or more constructors and a property for each column in the database table being modelled. NHibernate follows this contract by placing the mapping information between the class and table in a separate XML file. ActiveRecord breaks this contract slightly by putting the mapping information inside the class itself.

NConstruct worked fairly well for me, but I quickly discovered a few big drawbacks. The main drawback is that you cannot use NConstruct to modify its own generated NHibernate classes and mapping files if your database changes. Since the application and database are being developed in parallel, I'm occasionally making changes to the database structure as I start developing features that require additional tables or refine the design. However, if your database structure changes you need to either:
  • run NConstruct again and completely replace the existing classes and mapping files
  • create a new NConstruct project and then manually move the changed classes and mapping files into your existing project
  • bite the bullet and learn to make the changes in the mapping files and code yourself
I decided to go with the last option.

Like any new technology or API, there's always a learning curve when you first try to use it. NHibernate's certainly no exception. While I'm convinced I made the right decision, the ride has not been the easiest, and I've wondered more than once if I actually saved much time by using it rather than writing the ADO.NET code myself. I think I have... especially since you get a lot of things for "free" with NHibernate such as session management and transactions that normally suck up a lot of development time to get right when added into a basic data layer. However, it's taken a bit more effort to get there than I had originally planned. But now that I have one NHibernate project under my belt I believe that I'll be able to save a considerable amount of time developing data layers for future projects.

False Starts

I'll readily admit my first NHibernate data layer is not without its share of mistakes. For starters, since NConstruct doesn't generate interfaces, its code makes a lot of direct references to other entity classes (modelling parent-child relationships, for example). I extracted interfaces for all of the generated classes, but because of the way the original code was generated, my interfaces are a bit of a mess and NHibernate isn't using them (and since I knew almost nothing about NHibernate when I started, I didn't know how to change that). I'm using the interfaces instead of the classes outside the data layer, but again because of the way the code was generated, I've had to do a lot of casting between interfaces and concrete classes. Working under a time crunch, I originally found it easier to just do the casting, but I really need to go back and clean that up!

The second problem I discovered is that I needed to make a lot of changes to the auto-generated code, which basically means that the code can never be auto-generated again (thus my decision to write any changes to the mappings myself instead of trying to regenerate them using NConstruct). The main reasons for this are:
  • Being new to both NHibernate and NConstruct, I made some mistakes when using NConstruct's wizard that I didn't discover right away
  • NConstruct couldn't do everything I wanted it to do (they've since changed some of that)
  • I don't code the same way the makers of NConstruct do
An example of the first situation is my lack of understanding of how NHibernate handles creating sub-classes and NConstruct's implementation of that. As a result, instead of creating true subclasses for the types of Documents in the system (all of which derive several basic properties from a base Document class and table), I ended up having to encapsulate the base document class inside the classes of documents that derive from it... an implementation issue that rears its ugly head from time to time. Next time I'll know better and be able to do this the correct way.

I also found that I needed to go back through the mapping files and classes and change all of the property names. By default, NConstruct generates property and field names that are identical to the column names in the database table. Since our DBA prefixes the table name to all column names, the property names generated were less than friendly on the fingers (for example, my "FieldDefinition" class had properties like "FieldDefinitionName" and "FieldDefinitionIsEditable" where I wanted the entity class' properties to be simply "Name" and "IsEditable").

Finally, my database included quite a few many-to-many linking tables (for example, a User can belong to multiple Groups and each Group can contain multiple Users). Instead of creating a many-to-many set, NConstruct created a separate class for the linking table. Maybe that was due to my lack of knowledge about NConstruct and NHibernate, but I found it extremely annoying to correct.

On the plus side, all of these issues brought me up to speed on NHibernate mapping files very quickly!

Now, it may sound like I'm knocking NConstruct quite a bit. Not so. I think it's a very nice application... it just has a few rough edges. Also, like I mentioned earlier, my programming style is a bit different from that of NConstruct's programmers, so I had to redo several things to make the generated classes work for me. I can't reasonably expect them to generate code that can fit everyone's preferences exactly, though, so I don't have a complaint with that. Even with all of the extra work, NConstruct still probably got me up and running with NHibernate much faster than if I had to create the mapping files and classes completely on my own.

Sunday, August 24, 2008

XML Documentation

OK, that last post ended up as being a bit of a tease, so I'll jump right into things today. I write most of my comments using XML. Before the days of XML, I used to write "header comments" for all my major code blocks. Each of these header sections would contain a number of pre-defined elements. With the advent of XML and the inclusion of support for XML-style comments in Visual Studio, it became easy to create consistent, thorough documentation for each primary element of my code (by "primary element" I mean classes, properties, methods, and events).

Using XML comments in .NET code has several advantages.

  • XML's well-defined tags and the assistance of auto-complete make it easy to create consistent comment blocks.
  • Starting with Visual Studio 2005, XML documentation comments can be extracted from VB, C#, and C++ source code files. When you compile with the /doc option, the compiler extracts the XML documentation comments from your source code into an XML documentation file. NDoc and Sandcastle can then use that file to create HTML and CHM help files.
  • Finally, these XML files can also be used to add IntelliSense in Visual Studio to your own classes.

I try to comment every class, method, property, and event. In order for the compiler to extract documentation, the code comments need to meet the following two conditions:

  • The comment must begin with ''' in VB files and /// in C# or C++ files.
  • The documentation to be extracted must be encased within XML tags.

While the compiler will accept any valid XML tags, I generally use a specific set of tags for each element that I'm commenting. In the remainder of this post I'll describe how I comment classes and properties. I'll finish my discussion of XML comments in the next post where I describe how I comment methods and events.

    Classes
    At the very minimum, I use the <summary> tag to provide a short summary of the class. This summary describes what the class is and its use. A more detailed description goes in the <remarks> tag. Other tags I may use in a class comments block are <example> (usually used with the <code> tag), and <seealso> to add a See Also section to the documentation containing links to related classes. Finally, I add the non-standard tag <author> to all my class comment blocks.

    Here's an example:

    ''' <summary>
    ''' A simple class that represents a US mailing address.
    ''' </summary>
    ''' <author>Keven Lehmann</author>
    ''' <remarks>
    ''' <para>
    ''' This class contains properties for all of the elements in a common US
    ''' mailing address. It provides basic validation for state abbreviations
    ''' and ZIP code formats (basic and ZIP+4) but does not verify that the
    ''' address represents an actual US mailing address.
    ''' </para>
    ''' <para>
    ''' The following example demonstrates a typical use for this class:
    ''' </para>
    ''' <example>
    ''' <code>
    ''' Public Function GetSampleAddress() As String
    '''     Dim addr As USAddress = New USAddress()
    '''     addr.Recipient = "Keven Lehmann"
    '''     addr.Street = "101 Main St"
    '''     addr.City = "Somewhere"
    '''     addr.State = "MD"
    '''     addr.ZIP = "21212-1212"
    '''     Return addr.ToUspsString()
    ''' End Function
    ''' </code>
    ''' </example>
    ''' </remarks>
    Public Class USAddress
    ...
    End Class


    Properties
    For properties, I always include the <summary> and <value> tags, although <value> often turns out to be a rehashing of the <summary>. If I feel there is anything about the property that needs extra clarification, I'll include that in a <remarks> tag. Other common tags I may use with a property are <example> for examples of use and <see> to refer to related functions from within the body of the documentation.

    Here is an example:

    ''' <summary>
    ''' Gets the <see cref="Person"></see>'s full name.
    ''' </summary>
    ''' <value>
    ''' The full name of the <b>Person</b>, comprising their
    ''' <see cref="FirstName"></see> and <see cref="LastName"></see>,
    ''' separated by a space.
    ''' </value>
    Public ReadOnly Property FullName() As String
    ...
    End Property


    That's it for right now. Next time I'll finish up XML comments by covering Method and Event comments.

    Sunday, August 17, 2008

    Documentation is the Key

    Documentation is one of the most important, yet overlooked, aspects of programming. No one programs in a vacuum. At some point, someone else will look at your code and have to understand it. For that matter, at some point in the future you may have to look at your code and understand it! When the code is fresh in your mind, understanding it is easy. But how easy is it to follow six months from now? One year? Two years? By that time, several projects have come and gone and that code is stale and musty in your mind. Do you really want to laboriously read through it line-by-line just to understand how it works again? Wouldn't it be easier if you could skim through your code and quickly home in on the piece you need to fix or change?

    Clearly-written code can certainly help you achieve that objective, but I feel that clearly-documented code can make the task even easier. With well-documented code you can read through the documentation rather than the code itself to understand what each piece of code is supposed to do. And well-written documentation in plain English (or French or German or whatever your native language is) is always much easier to understand than the actual code itself (and if it isn't, then you really need to work on your writing skills).

    If you're tasked with making changes to a project that you've never been involved with before and the previous developers didn't properly document their work, you may quickly feel the urge to track down those programmers and do bodily harm (if so, hopefully they were consultants who were long gone and not your co-workers). No documentation means having to slog through thousands of lines of code (or more) to track down the code you need. Sometimes you have to wade through all of the code just to get a basic understanding of the application before you can even dream of modifying it. Don't be one of those programmers that other programmers will one day want to kill!

    On the other hand, good documentation means having external documents that help you see the overall structure of the application -- the big picture. From that, you can see which pieces of the code (layers, namespaces, or classes) are important to your current task. More detailed documentation about those code sections (both external documents as well as class-level comments) can help you drill-down to even smaller pieces of the code. Finally, documentation on the class properties and methods as well as inline comments can help you pinpoint the exact sections of your code that you need.

    Good documentation involves several things. The main aspects of well-documented code are internal documentation, self-documenting code, and external documentation.

    Internal Documentation
    Actively documenting the body of your code is perhaps the biggest topic of the three types of documentation I mentioned. Thus, I'll start with that and spend most of my time talking about it. Internal documentation itself can be broken down into several topics: class-level comments; method-, property-, and event-level comments; and inline (or code block) comments. In addition, there are a couple side-topics I'd like to discuss: XML documentation and code regions.

    It is vitally important to rigorously comment your code. Unfortunately, writing comments tends to be boring compared to the challenge of coding and is often neglected. It can also be time-consuming and thus easy to skip when you're working under a tight deadline. I know I'm still far from perfect when it comes to properly documenting my code, but I'm aware of its importance and since I constantly strive to improve the quality of my code, I also constantly strive to improve the quality of my code comments (which I feel has a net effect of improving the quality of the code itself).

    The next few sections delve into each aspect of internal documentation in more detail. At the end, I'll provide a few tips and tricks I've discovered that help make it easier to include and maintain good documentation. Some of them even make you a more efficient programmer, so you have even fewer excuses for not including good comments in your code!

    OK, I'm long-winded, I'll admit it. So without further ado, let's start commenting about comments... next time. ;-)
