Thursday, November 27, 2008

Thankful not just today...

Today is Thanksgiving over here in the U.S. It's a day that spurs us to think back over the past year and remember all the things for which we're thankful. While Thanksgiving the "Day" is almost over, I think we all need to remember that today is not the only day in which we should remember to be "thankful."

This year has been very turbulent for a great many people... both in the U.S. as well as the rest of the world. The crumbling housing and financial markets have sent shockwaves through an already fragile economy, with ripples that can be felt worldwide. Violence such as the recent attacks in Mumbai has shattered lives. The passage of Proposition 8 in California and similar referendums in Arizona and Florida have torn apart the stability of many families. For these things, no one can be thankful.

But even amongst these tragedies, there is room for most of us to be thankful. Most of us have been spared being directly affected by the worst that 2008 has tossed at us, and for that we can be thankful, even while we think of those who have been less fortunate. Amongst the darkness there have been rays of light shining forth. While the elections on November 4th were marred by Proposition 8, a brighter light shone forth that day.

In each of our lives we can find something for which to be thankful, something to lift our spirits and encourage us to make the new year a better one than the last.

So, looking back on 2008 and forward to 2009, here are a few of the things for which I'm thankful:


  • A stable job in a time of uncertainty.

  • A job that I actually enjoy with people I enjoy working with.

  • The love of my partner, family, and friends.

  • A house when so many are finding theirs in danger.

  • The joy brought into my life by the adoption of my two newest cats.

  • The memory of many years of love given by my sweetheart kitty Ophelia who passed away last year.

  • The continuing joy and love given by my other surviving cat Edison (also a rescue).

  • My health.

  • The good health of those I love.

  • November 4th and the hope for a long overdue change in our country's direction.

  • That Vista on my new laptop doesn't suck as badly as I was afraid it would.

  • That I get to sleep in tomorrow!

  • The presence of Hope which for so long seemed absent.

OK, so they're not all profound, but being thankful for the little things in life is just as important as being thankful for the big things. They're also easier to find. So for all those little things in life and the ability to find them, I'm thankful.

Thursday, November 20, 2008

XML Documentation (continued)

Wow! I got sidetracked on getting out some other posts and, well... life, and I just realized I never continued the articles I started about XML documentation! For my earlier posts about code documentation, see Documentation is the Key and XML Documentation. So, as promised, here's how I use XML comments to document methods and events.

Methods
Documenting methods can get a little more complicated. Like all other comment blocks, I always include a <summary></summary> tag that gives a brief description of the method. If the method contains parameters, I'll follow the summary with a <param> tag for each parameter passed. If the method returns a value, I'll follow the parameter tags with a <returns></returns> tag that describes the return value. If the method throws any exceptions, those get documented within <exception></exception> tags. Finally, if the method needs further explanation, I'll include a <remarks></remarks> tag and possibly an example in an <example></example> tag.

In addition to the above tags, method documentation might also include <see></see> tags to refer to related properties, methods, or classes. If I need to refer to a parameter by name within the documentation (outside of that parameter's <param> tag), I'll surround it with a <paramref></paramref> tag. If any parameter type is an enumeration, I may also include a <list></list> with enumeration values and their meaning within the context of the method.

Method comments can get rather complex, so here's an example:

''' <summary>
''' Saves the document to a file.
''' </summary>
''' <param name="filename">
''' The name of the file to which the document will be saved.
''' </param>
''' <param name="format">
''' The format in which to save the document. Can be any of
''' the following values:
''' <list type="table">
''' <item>
''' <term>XML</term>
''' <description>
''' Document text and formatting saved using XML schema.
''' </description>
''' </item>
''' <item>
''' <term>Binary</term>
''' <description>
''' Document saved as binary data.
''' </description>
''' </item>
''' <item>
''' <term>RTF</term>
''' <description>
''' Document saved in rich-text format.
''' </description>
''' </item>
''' </list>
''' </param>
''' <param name="overwrite">
''' If <b>True</b> and this file already exists, it will be overwritten
''' without warning. If <b>False</b> and this file already exists, this
''' function will not save the file and will exit with a return value of
''' <b>False</b>.
''' </param>
''' <returns>
''' <b>True</b> if the document was saved with no errors; otherwise,
''' <b>False</b>.
''' </returns>
''' <exception cref="ArgumentNullException">
''' Occurs if the filename is null (<b>Nothing</b> in Visual Basic).
''' </exception>
''' <exception cref="IOException">
''' Occurs if an error occurred attempting to write the file.
''' </exception>
''' <remarks>
''' <seealso cref="Load"></seealso>
''' </remarks>
Public Function Save(ByVal filename As String, ByVal format As FileFormatEnum, _
        ByVal overwrite As Boolean) As Boolean
    ...
End Function


Events
Events are pretty straightforward. I usually don't do more than include a <summary> tag and possibly a <remarks> tag.

Example:

''' <summary>
''' Occurs when the completion status of an operation on the document
''' has changed.
''' </summary>
''' <remarks>
''' The <see cref="ProgressUpdatedEventArgs"></see> event argument
''' passed with the event contains an integer property,
''' <see cref="Progress"></see>, that contains the current completion
''' status, represented as a percentage.
''' </remarks>
Public Event ProgressUpdated As EventHandler(Of ProgressUpdatedEventArgs)


More Info
For a complete list of XML tags suggested by Microsoft, see Recommended Tags for Documentation Comments (C#).

Tuesday, November 11, 2008

Signed Assemblies and COM References

Something weird happened recently. A co-worker needed to use an assembly I had written, but he was going to be referencing it from a project that would be called from a classic ASP application so the assembly needed to be strongly-signed. "No problem," I thought. I strongly-signed my assembly and recompiled. A few minutes later another application I was working on -- which also used my newly-signed assembly -- started getting the following error:

"The located assembly's manifest definition with name [AssemblyName] does not match the assembly reference"

So what happened? I wouldn't think that just strongly-signing an assembly would cause problems, but apparently it did. After a bit of digging, I discovered the problem: my assembly had references to COM DLLs. When you add a reference to a COM DLL, Visual Studio creates an interop wrapper around the COM DLL... but you already knew that. What you might not have realized (I certainly didn't!) is that in order to use a COM DLL in a signed assembly, the interop file also has to be strongly signed. When I signed my assembly I inadvertently broke it because the interop files were still unsigned.

Once I knew that, the solution was pretty simple: remove the references to the COM interops and add them back again after the project is strongly-signed. With the project strongly-signed, Visual Studio creates strongly-signed versions of the Interop files.

Not so fast...
Actually, the solution wasn't quite that easy. In addition to a reference to the strongly-signed assembly, my application also had references to the same COM DLLs used by that assembly. That meant I had a conflict because my application had unsigned interops for those DLLs but the signed assembly expected those interops to be signed.

Since I no longer actually needed those references in my project (I did at one point, but they were no longer necessary), I deleted them. After deleting those references from my application, I started getting "File Not Found" messages. I didn't need references to the COM DLL in my application, but my referenced signed assembly did. The problem in this case was that I had copied my strongly-signed assembly to a shared folder and referenced it from there. What I didn't do was copy the strongly-signed interop files, so my application could no longer find any interop files for those COM DLLs.

Once I copied those interop files to my network share, everything was fine. I also modified my post-build event to make sure those interop files got copied whenever a release build of the project was run. My assembly worked, my coworker's project worked, and my application worked. Above all, I learned something interesting about the inner workings of .NET.

Conclusion
The whole experience was both frustrating and fascinating. The error was very confusing and there didn't seem to be much useful information online. Apparently using a COM DLL in a .NET assembly that's then used in a COM environment doesn't come up all that often! Once I started getting a handle on the problem it seemed more logical, but then the next error in the stack reared up. Working through the chain of problems to get everything resolved was an exercise in persistence more than ingenuity, but in the end everything came together.

I never paid much attention to strongly-signing assemblies before this because I'd frankly never had a need to do it. The software I write is deployed internally, so I don't need to worry about third parties using my assemblies and I haven't had to write anything in .NET that needed to be used in a COM environment. However, since strongly signing a project takes all of two seconds once you've created a key I'm going to start strongly-signing all of my reusable code libraries and controls. It doesn't make any difference when the assembly is used in a non-signed project, but is essential in order to be used by a signed project so it just makes sense to me to be consistent and do it on every project.

Sunday, October 19, 2008

My First NHibernate Project, Part 3

Part 1, Part 2

Lessons Learned

What did I take away from this whole experience? For one thing, now that I've worked with NHibernate a bit, I've come to really enjoy it. Its ability to map classes to database structures and its handling of many of the most annoying pieces of data layer coding should make writing my next data access layer much easier. NHibernate has its own little quirks that must be accounted for, but overall, if you prefer working with actual objects in your code instead of DataSets, Tables, and Rows, then NHibernate's for you.

Throughout the course of this exercise I've encountered several things that had me pulling out my hair until I found the answer. I've also discovered several tips and tricks to make my life a bit easier. Since I know I'm not the only one to encounter these issues, here's my list of top tips for working with NHibernate.

Don't forget to embed the mapping files!
This is an easy thing to forget, and one that bit me more than once after adding new tables to my database, writing the mappings and classes, and then getting all kinds of obscure errors when trying to use them! It also appears to be one of the most common beginner mistakes, based on the number of times I've seen this tip mentioned. Always, always, always remember to change the Build Action (in the Properties Window) to "Embedded Resource." If you don't, NHibernate won't be able to find the mapping file and you'll encounter all sorts of confusing exceptions.

Don't lose your identity
NConstruct created an "Id" property in each of my entity classes that's mapped to the primary key of the underlying table. I like this convention and have stuck with it, but with one small modification... I'm not exposing the properties' Set block publicly. In C# this is very easy because you can set access to the Get and Set blocks separately, so I made my Id's Get block public but my Set block Protected (I considered making BOTH protected since you shouldn't be using your primary key in any business logic, but eventually decided against it). VB.NET has allowed separate access levels on Get and Set since Visual Basic 2005, so the same approach works there. In earlier versions of VB.NET you can't do this if you map the database columns directly to properties, but if you map the database columns to fields you should be able to write your Id property using the ReadOnly modifier.

Of course, this only applies if your primary keys are automatically generated by the database (which is generally a very good idea). If you are assigning the primary keys yourself and need to have the consumer of the entity objects provide the key value for new objects, then you will need to make your Id property read/write.
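As a rough sketch of the field-mapped variant described above: the entity below keeps its key in a private field and exposes it through a ReadOnly property. The `Customer` class, the `_id` field name, and the `access="field.camelcase-underscore"` mapping mentioned in the comment are assumptions for illustration, not code from my project.

```vbnet
' Hypothetical entity; class and member names are illustrative only.
Public Class Customer
    ' NHibernate fills this field directly when the id is mapped with
    ' field access (assumed here: access="field.camelcase-underscore").
    Private _id As Integer
    Private _name As String

    ' Consumers can read the key but can never assign it.
    ' Overridable so NHibernate can still build a lazy-loading proxy.
    Public Overridable ReadOnly Property Id() As Integer
        Get
            Return _id
        End Get
    End Property

    ' An ordinary read/write property for comparison.
    Public Overridable Property Name() As String
        Get
            Return _name
        End Get
        Set(ByVal value As String)
            _name = value
        End Set
    End Property
End Class
```

Because nothing outside the class can write to `_id`, only NHibernate (through the field mapping) ever sets the key.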

If your id is set by the database, remember to use the <generator class="native"/> tag inside the "id" tag (you could also use "identity," "sequence," or "hilo" depending on your database). If your application is assigning the identity to the object, use the "assigned" generator class. If your entity class is responsible for assigning its own id, then you will need to use one of the other supplied generator classes or write your own generator class.

Closing Sessions is good, but creates extra work
The easiest approach to session management is to create one session, keep it open, and use it throughout the lifetime of the application. However, since Sessions encapsulate database connections, which are a limited resource on the database, this usually isn't the best approach... you can probably get away with it on a single-user database, but generally this is a "bad thing".

Since database interaction in typical applications (i.e. information is retrieved from the database and presented to the user; the user reads and possibly edits the data; any changes are written back to the database) is very "bursty," sessions kept open throughout the life of an application will remain idle most of the time, consuming database resources unnecessarily. A better approach is to only keep sessions open long enough to perform the required database CRUD (create, read, update, delete) operations and then close the session to free up database resources for other users.

This approach leads to a more complicated session management strategy, however. For one thing, you need to reconnect your entities to a session before you can save any changes to them back to the database. This could be a newly-created session, or a session that you've reconnected to the database. I've found that the latter approach works best for me.

Basically, the first time I need a session I get one from the SessionFactory. After I'm done with it, I flush it and disconnect it. Disconnecting the session frees up the database connection but leaves the session object available for reuse, freeing my application from the expense of having to create a new session the next time I need one. When it's time to save or update the database, I reconnect the session, lock the entity to the session, and call the appropriate session methods to persist my changes back to the database.
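Here's a minimal sketch of that flush/disconnect/reconnect cycle. The `SessionManager` class and its method names are hypothetical; the `ISession` calls (`Flush`, `Disconnect`, `Reconnect`, `IsConnected`, `SaveOrUpdate`) are the NHibernate 1.2 members as I understand them. I've used `SaveOrUpdate` in the save step because it copes with new, detached, and already-attached entities alike.

```vbnet
Imports NHibernate

' Hypothetical holder for one reusable session.
Public Class SessionManager
    Private ReadOnly _factory As ISessionFactory
    Private _session As ISession

    Public Sub New(ByVal factory As ISessionFactory)
        _factory = factory
    End Sub

    ' Returns the shared session, opening or reconnecting it as needed.
    Public Function GetSession() As ISession
        If _session Is Nothing Then
            _session = _factory.OpenSession()
        ElseIf Not _session.IsConnected Then
            _session.Reconnect()
        End If
        Return _session
    End Function

    ' Writes pending changes, then releases the database connection
    ' while keeping the session object around for reuse.
    Public Sub DoSessionCleanup()
        If _session IsNot Nothing AndAlso _session.IsConnected Then
            _session.Flush()
            _session.Disconnect()
        End If
    End Sub

    ' Persists one entity using the shared session.
    Public Sub Save(ByVal entity As Object)
        Try
            Dim s As ISession = GetSession()
            ' Lock(entity, LockMode.None) is the call for reattaching an
            ' unmodified instance; SaveOrUpdate also handles modified ones.
            s.SaveOrUpdate(entity)
        Finally
            DoSessionCleanup()
        End Try
    End Sub
End Class
```

The connection is only held between `GetSession` and `DoSessionCleanup`, which is the whole point of the approach.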

Lazy loading is very useful... but tricky
If you make the same decision I did and disconnect your sessions when not in use, you'll probably run afoul of one of the other issues I had: problems retrieving items from collections that were defined as lazy loading. While lazy loading is definitely useful since it delays reading data from the database until it's actually needed, it won't work if the session that loaded the entity is disconnected. There are various ways of dealing with this, but that's a broad enough topic for an article of its own (soon to come).

Sets... not as easy to work with as you'd think
Sets are one of the main types of collections in NHibernate. Unless you have a relationship between tables where the child table has a column that explicitly stores sequential index values, you'll most likely be retrieving child records as an ISet(Of T). The concrete class that you'll be dealing with (if the collection is loaded and not proxied because it's defined as lazy loaded) is a HashedSet.

If you're used to automatically going to some variant of the List class whenever you need a collection (and especially if you migrated to VB.NET from VB6), you'll be shocked to discover that ISet doesn't have an Item property. ISet is not an indexed collection. In fact, that's the whole point of being a set: a set is an unordered collection of distinct values. What this means to you as a developer is that only one of any specific entity can exist within the set and that the set is not sorted or organized in any specific way. Technically, if you retrieve the elements in a set there is a predictable order in which they will be retrieved, but for all practical intents, you cannot rely on the elements in a set to be retrieved in any specific order. A set exists to hold the child entities related to your parent entity, nothing else.

This can take some getting used to. While constructs like For Each x In y... Next will work on Sets, constructs like
For i As Integer = 0 To fieldSet.Count - 1
    Dim x As EntityType = fieldSet(i)
    ' do something here with x
Next
will not because there is no default Item property (there's no Item property at all). This means that if you are used to accessing collection items through their index your strategies for accessing items in the collection are going to have to be a bit different.
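One workaround, sketched below, is to copy the set into a List whenever you genuinely need an index or a defined order. `HashSet(Of String)` from .NET 3.5 is only a stand-in here for the ISet(Of T) that NHibernate hands back, and the field names are made up.

```vbnet
Imports System
Imports System.Collections.Generic

Module SetAccessDemo
    Sub Main()
        ' Stand-in for the set NHibernate would return.
        Dim fields As New HashSet(Of String)()
        fields.Add("FirstName")
        fields.Add("LastName")
        fields.Add("FirstName")   ' duplicate, silently ignored by the set

        ' For Each works on any set, but the order is not guaranteed.
        For Each name As String In fields
            Console.WriteLine(name)
        Next

        ' Copy into a List to get indexed access and an explicit order.
        Dim ordered As New List(Of String)(fields)
        ordered.Sort()
        For i As Integer = 0 To ordered.Count - 1
            Console.WriteLine("{0}: {1}", i, ordered(i))
        Next
    End Sub
End Module
```

The copy is a snapshot, so changes to the list are not reflected in the set; adds and removes still have to go through the set itself.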

Objects in collections need special care
I had an entity class that contained a collection of entities from a related table (in this case a form Document and its collection of Fields). When I tried to remove one of the fields from the document I discovered that the collection's Remove method was failing and I didn't know why. I finally solved the problem but it took quite some time to figure out (1) what exactly was happening, and (2) what to do about it. The ins and outs of this one are a whole article by itself, so I'll try to summarize this one here and write a more detailed explanation later.

Basically, what was happening was this: I had overridden my entity class' Equals() and GetHashCode() methods and I was apparently a bit over-zealous in what I felt was necessary to say that two instances were "equal" to each other. The real problem was actually my implementation of GetHashCode(), but you can't really override one without overriding the other (more on that in a later article).

Since my collection of fields was an unordered collection, it was defined in the mapping file as a Set. In my code, this set was implemented as a HashedSet. The importance of this is the fact that when an item is added to a HashedSet, its hashcode is used as the key for storing the entity. If you later try to remove that item (or find it using Contains) and pass in a reference to an item in the collection, the HashedSet calculates the hashcode of the item being passed in and uses that hashcode to find the item in the collection.

I got into trouble because the properties of the object I used in my hashcode were not immutable. Somewhere between adding the field to the collection and trying to remove it, some of the properties that are used in the hashcode were changed. As a result, when I tried to remove the item, its current hashcode no longer matched the hashcode that was used to place it in the collection so the HashedSet was unable to find the item again to remove it.
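The failure is easy to reproduce outside NHibernate. The sketch below uses the .NET 3.5 `HashSet(Of T)` as a stand-in for HashedSet, and a hypothetical `Field` class whose hash code depends on a mutable `Position` value (my real entities hashed on different properties).

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical entity whose hash code depends on a mutable member.
Public Class Field
    Public Position As Integer

    Public Sub New(ByVal initialPosition As Integer)
        Position = initialPosition
    End Sub

    Public Overrides Function Equals(ByVal obj As Object) As Boolean
        Dim other As Field = TryCast(obj, Field)
        Return other IsNot Nothing AndAlso other.Position = Position
    End Function

    Public Overrides Function GetHashCode() As Integer
        Return Position.GetHashCode()
    End Function
End Class

Module HashCodeDemo
    Sub Main()
        Dim fields As New HashSet(Of Field)()
        Dim f As New Field(1)
        fields.Add(f)                           ' stored under the hash for Position = 1

        f.Position = 2                          ' hash code changes while f is in the set

        Console.WriteLine(fields.Contains(f))   ' False
        Console.WriteLine(fields.Remove(f))     ' False
        Console.WriteLine(fields.Count)         ' 1 (the item is still in there)
    End Sub
End Module
```

The object is still in the set, but the set looks for it under its new hash code and never finds it.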

In short, the solution was to change which properties I used in my hashcode (I got a little less zealous about what constituted a "unique" entity) and to make those properties immutable. The actual implementation turned out to be a real pain because that meant having to make a number of changes in the rest of my code to accommodate the fact that I could no longer change the values of those properties. Since GetHashCode and Equals are closely related it also meant changing the implementation of Equals and rethinking exactly what I mean by "equals." To be totally consistent, this also meant changing my implementation of GetHashCode and Equals in several other entity classes as well as modifying code that was no longer valid because the properties used in those methods were now also immutable.

There's a lot more involved in the underlying theory behind object equality, hashcodes, and my final implementations, but those are topics for a later discussion.

Bulk Updates -- Not NHibernate's cup of tea
As much as I'm growing to like NHibernate, I discovered one weakness: its lack of support for bulk updates. I believe that the recently-released version 2.0 of NHibernate is making inroads on this problem, but with NHibernate 1.2 it doesn't appear that bulk updates/deletes/inserts are possible. In order to insert, update, or delete multiple items using NHibernate 1.2 it appears you only have two choices:
  1. Perform each insert, update, or delete one at a time
  2. Bypass NHibernate and perform your mass operation using an ADO.NET Command.
The first option is obviously very inefficient, especially if you have to retrieve each entity, update it, and save the changes back to the database. Unless you're only dealing with a small number of entities, this isn't really even an option. The second option is really the only option of the two. Fortunately, you won't have to create and maintain a second connection to the database. You can share the connection object used by NHibernate by calling s.Connection.CreateCommand().
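As a sketch of the second option, the function below runs a single UPDATE through the session's own connection. The table and column names are hypothetical, and the `@cutoff` parameter syntax assumes SQL Server; other databases use different parameter markers.

```vbnet
Imports System.Data
Imports NHibernate

Public Module BulkOperations
    ' Marks every document older than the cutoff as archived and
    ' returns the number of rows affected.
    ' Document, Archived and CreatedOn are made-up names.
    Public Function ArchiveOldDocuments(ByVal s As ISession, _
            ByVal cutoff As Date) As Integer
        ' Reuse NHibernate's connection instead of opening a second one.
        Using cmd As IDbCommand = s.Connection.CreateCommand()
            cmd.CommandText = _
                "UPDATE Document SET Archived = 1 WHERE CreatedOn < @cutoff"

            Dim p As IDbDataParameter = cmd.CreateParameter()
            p.ParameterName = "@cutoff"
            p.DbType = DbType.DateTime
            p.Value = cutoff
            cmd.Parameters.Add(p)

            Return cmd.ExecuteNonQuery()
        End Using
    End Function
End Module
```

Two caveats: entities already loaded in the session won't know about rows changed this way, so reload them if you need the new values, and if the session has an active transaction the command would also need to be enlisted in it (I believe `ITransaction.Enlist` is the call for that), which this sketch leaves out.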


From what I understand, in NHibernate 2.0 you can execute bulk queries. To do this, you can write a named SQL Query, get it into an IQuery object, and then call ExecuteUpdate() on that object. The next release of NHibernate should go even further by allowing you to create mass update queries using HQL. For more information about bulk queries, check out this article.


Saturday, October 4, 2008

My First NHibernate Project, Part 2

Part1, Part3

The next time...


Now that I have one project under my belt, what will I do differently next time? I think now that I have more experience and a better understanding of NHibernate I'd probably do several things a bit differently if I were starting this project over. Here's a list of the main things I'd consider the next time I started a new project using NHibernate.

NConstruct... to use or not to use?
Although I'll consider NConstruct again (if nothing else, their development staff appears quite responsive to feedback and is very quick to incorporate requests), I'll probably try creating at least some of the classes and mapping files myself. Maybe I'm just a bit of a control freak, but now that I'm familiar with the file structures, I think it will be easier to create them myself than to use a tool and then have to correct the code generated by that tool. Still, it was able to generate quite a lot of code and XML with very little effort, so I'd probably still use it for most of my basic entities unless I decide to use interfaces and then just clean up the property names afterward (Refactor! is a pretty handy tool for helping with that). Maybe I'll beat my DBA around the head a little bit as well to see if I can get him to loosen up on his table naming rules a bit, which would drastically cut down on the amount of fixing up I need to do to property names!

Interfaces
Next time I think I'll also look into using interfaces with NHibernate. This is a bit of a trade-off, so I'm not sure whether I'll stick with that approach though. On one hand, using interfaces makes unit testing easier. On the other hand, with interfaces you can't map database columns to fields... you must map to properties, which may not be acceptable if your setter method does things that you don't need to do when initializing the object from the database. For example, if your workplace is anything like mine, you may have to work with a legacy database with some not-too-clean data. You might want to add validation in your setter so that new data going in will be clean and shiny, but couldn't use that setter when initializing an object from the database because you need to accept the record as-is, errors and all.

Components
Components sound interesting because they're basically a way to deconstruct part of a table into a separate class which is then included as a property in the table's entity representation. The classic example of this is a table that includes address information (such as almost any database that includes contact records). The address fields (street address, city, state, zip code, etc) can be defined as a component of type Address and the entity will then contain a field of type Address that contains the values of these columns. This is a good way to create small, general-purpose business objects that can be reused in different applications. In my current project I have a couple tables that could have been represented this way had I been able to absorb everything about NHibernate at once! Instead, I modified the NConstruct-derived classes for these tables so that they implemented a pre-existing business object interface, encapsulated a concrete instance of that interface within the class to hold the properties of that interface, and modified the generated getters and setters of the entity class to use that instance. This could have been accomplished more easily using a component (although in this specific case it was a better implementation choice to not use a component, but at the time I didn't have enough familiarity with NHibernate to evaluate the applicability of components to the problem).

Named Queries
Named queries could also be useful for encapsulating some of the data layer. My data layer code consists of a number of methods that query the database to return single entities or collections of entities that match specific criteria. It also contains methods that return collections of arrays consisting of specific properties from entities (for example, when I only need a couple properties such as the id and name in order to populate a combobox or a list). This means that the data layer consists mainly of methods that look basically like this:

Public Function GetSomeEntitiesBySomeProperty(ByVal propertyValue As PropertyType) As IList(Of EntityType)
    Try
        Dim s As ISession = GetSession()
        ' queryString is an HQL query with a single positional parameter (?).
        Dim entities As IList(Of EntityType) = s.CreateQuery(queryString) _
                .SetParameter(0, propertyValue).List(Of EntityType)()
        Return entities
    Finally
        ' Always release the session, even if the query throws.
        DoSessionCleanup()
    End Try
End Function

where GetSession and DoSessionCleanup are methods in my data layer that manage the reuse of Session objects.

The question then is where does the query string come from? It could be hard-coded into each method in a string, but that's a fairly inflexible approach. It could also be coded into the method using properties of the ICriteria interface, but that suffers from the same problem (not to mention is just much too wordy for my taste, but if you like the declarative programming approach, feel free to go for it; I find that approach quite useful when writing unit tests in NUnit, but I'll probably pass in favor of HQL and SQL queries in NHibernate).

I ended up putting my query strings in Resource strings, the advantage being that it takes the query strings out of my code and into a resource file where it can be modified without touching the actual code. I could achieve basically the same end result using named queries, so that's an approach I'll probably look into next time (or maybe this time... there's still time to move the queries from the resource file into the mapping files if I want to give it a shot). The main advantage of moving the queries from the resource file into the named queries is that named queries are parsed once as opposed to queries passed in as strings, which are parsed each time unless they're cached (I believe NHibernate currently caches a certain number of the most recently used queries).
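For comparison, here's roughly what one of those lookup methods might look like with a named query. The `Customer` entity, the query name, and the HQL shown in the comment are all hypothetical; `GetNamedQuery`, `SetString`, and `List(Of T)` are the NHibernate calls I'd expect to use.

```vbnet
Imports System.Collections.Generic
Imports NHibernate

' Hypothetical mapped entity.
Public Class Customer
    Private _id As Integer
    Private _city As String

    Public Overridable ReadOnly Property Id() As Integer
        Get
            Return _id
        End Get
    End Property

    Public Overridable Property City() As String
        Get
            Return _city
        End Get
        Set(ByVal value As String)
            _city = value
        End Set
    End Property
End Class

Public Module NamedQueryExample
    ' Assumes the mapping file declares something like:
    '   <query name="Customer.ByCity">
    '     from Customer c where c.City = :city
    '   </query>
    Public Function GetCustomersByCity(ByVal s As ISession, _
            ByVal city As String) As IList(Of Customer)
        ' The HQL lives in the mapping file; the code only knows its name.
        Return s.GetNamedQuery("Customer.ByCity") _
                .SetString("city", city) _
                .List(Of Customer)()
    End Function
End Module
```

The method body barely changes from the resource-string version; only the source of the query text does.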

Generics and ICriteria
Since the majority of my queries are very similar -- "Get all of the {some object}s where {some property} is {some value}" -- occasionally with an "ordered by {some property}," I could probably also drastically cut down on the number of queries I need to write if I used some combination of ICriteria and generic methods. But wait... didn't I just say that ICriteria wasn't any more flexible than hard-coding the queries as strings? Yes, but for basic lookups you can get a lot of bang for your buck if you write your lookup method using generics and a few input parameters. Here's a quick example of my previous pseudocode implemented using generics:
Public Function GetFilteredEntities(Of T As EntityBase) _
        (ByVal propertyName As String, ByVal propertyValue As Object, _
        Optional ByVal orderBy As String = Nothing) As IList(Of T)
    Try
        Dim s As ISession = GetSession()
        ' Filter on a single property of the requested entity type.
        Dim crit As ICriteria = s.CreateCriteria(GetType(T)) _
                .Add(Expression.Eq(propertyName, propertyValue))
        ' Apply the optional ascending sort.
        If orderBy IsNot Nothing AndAlso Not String.IsNullOrEmpty(orderBy.Trim()) Then
            crit.AddOrder(Order.Asc(orderBy))
        End If
        ' List(Of T)() returns the typed IList(Of T) the signature promises.
        Return crit.List(Of T)()
    Finally
        DoSessionCleanup()
    End Try
End Function

This one method could replace over a dozen methods in my current data layer, along with their associated query string resources.

While generic classes are a very powerful feature, I'm starting to believe that generic methods may be even more powerful. By using generics, this method can return a list of any entity type instead of a single type. In the above example, all of my entity classes inherit from the base class EntityBase so I specified that the generic type passed in be of that type or a subclass. Since each entity will have different properties and I may need to filter the same entity by different properties, I am passing in the name of the property by which to filter as a parameter and the value of that property as another parameter. Also, since ordering the results is a common requirement, I've added an optional parameter to specify a column by which to order the results.

Granted, this one method only handles a very limited set of possible queries, but I've found that this limited set actually covers most of the queries I need to do. In my experience I'm usually doing one of three things. Most often, I'm getting a single item from the database by Id. If I'm not getting a single item, I'm either getting everything from a table (but usually only if the table is fairly small), in which case I could use a method like the one above but without the propertyValue and propertyName parameters, or I'm getting a simple subset of a table, filtered on a single value (all users in group X, all orders for customer Y, all items for order Z, etc). In this last case, the method above would work perfectly well. I could also use the method listed above for the first case where I'm getting a single item by Id, but I'd probably write another method similar to the one above except that it executes a Load() method instead of using ICriteria and returns a single item instead of a collection containing one item.

If you're concerned with needing to filter on multiple criteria, you could easily extend the method listed above by passing in a dictionary of property name/value pairs instead of the single propertyName, propertyValue parameters. The method would then iterate over the dictionary, adding crit.Add(Expression.Eq(prop.Key, prop.Value)) for each property in the dictionary. This would build a filter where all of the properties must equal their specified values. If you wanted to return everything from a table, pass in an empty dictionary (I'd probably allow Nothing as a value and check for that to skip the loop as well).
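A sketch of that dictionary-based variant follows. To keep it self-contained I've passed the session in as a parameter and constrained T to `Class` rather than my own EntityBase; both are assumptions, as is the module name. The fully qualified `NHibernate.Expression` names are the NHibernate 1.2 ones (I understand 2.0 moves them to `NHibernate.Criterion`).

```vbnet
Imports System.Collections.Generic
Imports NHibernate

Public Module CriteriaHelpers
    ' Returns all entities of type T whose properties equal every
    ' name/value pair in filters. Nothing or an empty dictionary
    ' returns the whole table.
    Public Function GetFilteredEntities(Of T As Class)( _
            ByVal s As ISession, _
            ByVal filters As IDictionary(Of String, Object), _
            Optional ByVal orderBy As String = Nothing) As IList(Of T)

        Dim crit As ICriteria = s.CreateCriteria(GetType(T))

        ' Each entry adds one "property = value" restriction (ANDed together).
        If filters IsNot Nothing Then
            For Each prop As KeyValuePair(Of String, Object) In filters
                crit.Add(NHibernate.Expression.Expression.Eq(prop.Key, prop.Value))
            Next
        End If

        ' Optional ascending sort on a single property.
        If Not String.IsNullOrEmpty(orderBy) Then
            crit.AddOrder(NHibernate.Expression.Order.Asc(orderBy))
        End If

        Return crit.List(Of T)()
    End Function
End Module
```

A caller would build a `Dictionary(Of String, Object)`, add an entry per property to match, and pass it in along with the session.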

If you really wanted to get fancy, you could probably come up with ways to pass in a data structure that can specify more advanced filtering such as "greater than" and "less than" instead of just "equals" or to allow "or" as well as "and," but then your setup for the method and handling of the data structure start becoming too cumbersome to be useful in my opinion. Unless I really needed the data layer to be totally dynamic, I'd go as far as the parameter arrays and write specialized methods for the remaining 10% of my queries that can't fit into my general-purpose methods.

Coming up

In part three of my series on my first NHibernate project, I'll be discussing the main lessons I've learned and giving a few tips for new NHibernate users.

Monday, September 29, 2008

Lessons from My First NHibernate Project

Part2, Part3

I recently used NHibernate for the first time. As a result, I learned a number of things... some obvious, some not so obvious... that I thought might be helpful to other people coming to NHibernate for the first time. This is the first article of a three-part feature. I originally planned to post them all as one large article, but life interfered and it's now three weeks after I had started this article and I've decided to split it up so I can get part of it out there now while I finish the rest.

This first article briefly describes my experience getting started with NHibernate. The following two articles describe the things I think I might do differently next time now that I know a bit more about NHibernate, and the various tips and tricks I've learned so far along the way. There are a couple aspects of NHibernate that I felt are especially tricky for new users. I've mentioned those in this series of articles, but I'm planning to write a separate article covering each of those in more detail as well.

None of these articles are step-by-step tutorials on configuring or using NHibernate. I found that the documentation on NHibernate's site is very useful for that and writing an article on that would be redundant. Also, there are several articles on CodeProject that are quite useful for programmers getting started with NHibernate for the first time. This series of articles is more a listing of tips, tricks, and traps to avoid that I thought would be useful for programmers after they've gone through NHibernate's Quick Start Guide.

The Learning Curve

When it came time to start developing the data layer of my current work project, I decided to try NHibernate instead of coding yet another data layer myself in ADO.NET, hoping to (1) save some development time, and (2) learn a new tool. Being new to NHibernate and a bit daunted by all the XML files required by it, I also decided to try NConstruct (the free lite version) to automate the generation of the initial classes and XML files. I seriously considered ActiveRecord from Castle Project since that replaces NHibernate's config files with attribute tags on the classes and properties, but finally decided against it for a couple of reasons:
  • ActiveRecord doesn't support stored procedures, and at the time I was planning to use stored procedures (I didn't realize at the time that although NHibernate does now support stored procedures, its implementation is rather clunky to say the least. Also, after quite a bit of research, I discovered that many of my reasons for using stored procedures weren't totally valid and I ended up doing the project without them).
  • By using attributes, entity classes defined using ActiveRecord are not true POCOs (Plain Old CLR Objects). A true POCO should not include anything in its implementation that ties it to a data source or service. At its most basic, an entity POCO class would consist of little more than one or more constructors and a property for each column in the database table being modelled. NHibernate follows this contract by placing the mapping information between the class and table in a separate XML file. ActiveRecord breaks this contract slightly by putting the mapping information inside the class itself.
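
To make the POCO idea concrete, here is a minimal sketch of what such an entity might look like, using the FieldDefinition class I mention later in this article. The Id property and the second constructor are assumptions added for illustration, not the actual class from my project:

```vbnet
' A sketch of a POCO entity: no attributes, no persistence base class,
' and no reference to NHibernate anywhere in the file.
Public Class FieldDefinition

    Private _id As Integer          ' assumed surrogate key
    Private _name As String
    Private _isEditable As Boolean

    ' Parameterless constructor so a persistence layer can create instances.
    Public Sub New()
    End Sub

    Public Sub New(ByVal name As String, ByVal isEditable As Boolean)
        _name = name
        _isEditable = isEditable
    End Sub

    ' Properties are Overridable so that NHibernate's default
    ' lazy-loading proxies can subclass the entity.
    Public Overridable Property Id() As Integer
        Get
            Return _id
        End Get
        Set(ByVal value As Integer)
            _id = value
        End Set
    End Property

    Public Overridable Property Name() As String
        Get
            Return _name
        End Get
        Set(ByVal value As String)
            _name = value
        End Set
    End Property

    Public Overridable Property IsEditable() As Boolean
        Get
            Return _isEditable
        End Get
        Set(ByVal value As Boolean)
            _isEditable = value
        End Set
    End Property

End Class
```

Everything that ties this class to a table lives in the separate mapping file, which is the point of the POCO approach.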

NConstruct worked fairly well for me, but I quickly discovered a few big drawbacks. The main drawback is that you cannot use NConstruct to modify its own generated NHibernate classes and mapping files if your database changes. Since the application and database are being developed in parallel, I'm occasionally making changes to the database structure as I start developing features that require additional tables or refine the design. However, if your database structure changes you need to either:
  • run NConstruct again and completely replace the existing classes and mapping files
  • create a new NConstruct project and then manually move the changed classes and mapping files into your existing project
  • bite the bullet and learn to make the changes in the mapping files and code yourself
I decided to go with the last option.

Like any new technology or API, there's always a learning curve when you first try to use it. NHibernate's certainly no exception. While I'm convinced I made the right decision, the ride has not been the easiest, and I've wondered more than once if I actually saved much time by using it rather than writing the ADO.NET code myself. I think I have... especially since you get a lot of things for "free" with NHibernate such as session management and transactions that normally suck up a lot of development time to get right when added into a basic data layer. However, it's taken a bit more effort to get there than I had originally planned. But now that I have one NHibernate project under my belt I believe that I'll be able to save a considerable amount of time developing data layers for future projects.

False Starts

I'll readily admit my first NHibernate data layer is not without its share of mistakes. For starters, since NConstruct doesn't generate interfaces, its code makes a lot of direct references to other entity classes (modelling parent-child relationships, for example). I extracted interfaces for all of the generated classes, but because of the way the original code was generated, my interfaces are a bit of a mess and NHibernate isn't using them (and since I knew almost nothing about NHibernate when I started, I didn't know how to change that). I'm using the interfaces instead of the classes outside the data layer, but again because of the way the code was generated, I've had to do a lot of casting between interfaces and concrete classes. Working under a time crunch, I originally found it easier to just do the casting, but I really need to go back and clean that up!

The second problem I discovered is that I needed to make a lot of changes to the auto-generated code, which basically means that the code can never be auto-generated again (thus my decision to write any changes to the mappings myself instead of trying to regenerate them using NConstruct). The main reasons for this are:
  • Being new to both NHibernate and NConstruct, I made some mistakes when using NConstruct's wizard that I didn't discover right away
  • NConstruct couldn't do everything I wanted it to do (they've since changed some of that)
  • I don't code the same way the makers of NConstruct do
An example of the first situation is my lack of understanding of how NHibernate handles creating sub-classes and NConstruct's implementation of that. As a result, instead of creating true subclasses for the types of Documents in the system (all of which derive several basic properties from a base Document class and table), I ended up having to encapsulate the base document class inside the classes of documents that derive from it... an implementation issue that rears its ugly head from time to time. Next time I'll know better and be able to do this the correct way.

I also found that I needed to go back through the mapping files and classes and change all of the property names. By default, NConstruct generates property and field names that are identical to the column names in the database table. Since our DBA prefixes the table name to all column names, the property names generated were less than friendly on the fingers (for example, my "FieldDefinition" class had properties like "FieldDefinitionName" and "FieldDefinitionIsEditable" where I wanted the entity class' properties to be simply "Name" and "IsEditable").
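
For anyone facing the same cleanup, the rename comes down to pointing a friendly property name at the prefixed column in the mapping file. The fragment below is only a sketch: the table name and the id column are guesses for illustration, and the assembly and namespace attributes a real mapping file would carry are left out:

```xml
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <!-- Table name and id column are assumed for illustration -->
  <class name="FieldDefinition" table="FieldDefinition">
    <id name="Id" column="FieldDefinitionId">
      <generator class="native" />
    </id>
    <!-- name = property on the class, column = prefixed database column -->
    <property name="Name" column="FieldDefinitionName" />
    <property name="IsEditable" column="FieldDefinitionIsEditable" />
  </class>
</hibernate-mapping>
```

The name attribute has to match the property on the entity class exactly, so the class and the mapping file need to be changed together.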

Finally, my database included quite a few many-to-many linking tables (for example, a User can belong to multiple Groups and each Group can contain multiple Users). Instead of creating a many-to-many set, NConstruct created a separate class for the linking table. Maybe that was due to my lack of knowledge about NConstruct and NHibernate, but I found it extremely annoying to correct.
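
For reference, a many-to-many set in an NHibernate mapping file looks roughly like the fragment below. All of the table and column names here (Users, UserGroup, UserId, GroupId) are hypothetical stand-ins rather than my actual schema:

```xml
<class name="User" table="Users">
  <id name="Id" column="UserId">
    <generator class="native" />
  </id>
  <!-- UserGroup is the linking table; it gets no class of its own -->
  <set name="Groups" table="UserGroup">
    <key column="UserId" />
    <many-to-many class="Group" column="GroupId" />
  </set>
</class>
```

The Group class would carry the mirror-image set of Users, with one of the two sides marked inverse="true" so that only one side writes rows to the linking table.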

On the plus side, all of these issues brought me up to speed on NHibernate mapping files very quickly!

Now, it may sound like I'm knocking NConstruct quite a bit. Not so. I think it's a very nice application... it just has a few rough edges. Also, like I mentioned earlier, my programming style is a bit different from that of NConstruct's programmers so I had to redo several things to make the generated classes work for me. I can't reasonably expect them to generate code that can fit everyone's preferences exactly, though, so I don't have a complaint with that. Even with all of the extra work NConstruct still probably got me up and running with NHibernate much faster than if I had to create the mapping files and classes completely on my own.

Sunday, August 24, 2008

XML Documentation

OK, that last post ended up being a bit of a tease, so I'll jump right into things today. I write most of my comments using XML. Before the days of XML, I used to write "header comments" for all my major code blocks. Each of these header sections would contain a number of pre-defined elements. With the advent of XML and the inclusion of support for XML-style comments in Visual Studio, it became easy to create consistent, thorough documentation for each primary element of my code (by "primary element" I mean classes, properties, methods, and events).

Using XML comments in .NET code has several advantages.

  • XML's well-defined tags and the assistance of auto-complete make it easy to create consistent comment blocks.
  • Starting with Visual Studio 2005, XML Documentation Comments can be extracted from VB, C#, and C++ source code files. When you compile with the /doc parameter, the compiler will extract all XML tags from your source code and create an XML documentation file. These files can then be used by NDoc and Sandcastle to create HTML and CHM help files.
  • Finally, these XML files can also be used to add IntelliSense in Visual Studio to your own classes.

I try to comment every class, method, property, and event. In order for the compiler to extract documentation, the code comments need to meet the following two conditions:

  • The comment must begin with ''' in VB files and /// in C# or C++ files.
  • The documentation to be extracted must be encased within XML tags.

While the compiler will accept any valid XML tags, I generally use a specific set of tags for each element that I'm commenting. In the remainder of this post I'll describe how I comment classes and properties. I'll finish my discussion of XML comments in the next post where I describe how I comment methods and events.

    Classes
    At the very minimum, I use the <summary> tag to provide a short summary of the class. This summary describes what the class is and its use. A more detailed description goes in the <remarks> tag. Other tags I may use in a class comment block are <example> (usually used with the <code> tag), and <seealso> to add a See Also section to the documentation containing links to related classes. Finally, I add the non-standard tag <author> to all my class comment blocks.

    Here's an example:

    ''' <summary>
    ''' A simple class that represents a US mailing address.
    ''' </summary>
    ''' <author>Keven Lehmann</author>
    ''' <remarks>
    ''' <para>
    ''' This class contains properties for all of the elements in a common US
    ''' mailing address. It provides basic validation for state abbreviations
    ''' and ZIP code formats (basic and ZIP+4) but does not verify that the
    ''' address represents an actual US mailing address.
    ''' </para>
    ''' <para>
    ''' The following example demonstrates a typical use for this class:
    ''' </para>
    ''' <example>
    ''' <code>
    ''' Public Function GetSampleAddress() As String
    '''     Dim addr As USAddress = New USAddress()
    '''     addr.Recipient = "Keven Lehmann"
    '''     addr.Street = "101 Main St"
    '''     addr.City = "Somewhere"
    '''     addr.State = "MD"
    '''     addr.ZIP = "21212-1212"
    '''     Return addr.ToUspsString()
    ''' End Function
    ''' </code>
    ''' </example>
    ''' </remarks>
    Public Class USAddress
    ...
    End Class


    Properties
    For properties, I always include the <summary> and <value> tags, although <value> often turns out to be a rehashing of the <summary>. If I feel there is anything about the property that needs extra clarification, I'll include that in a <remarks> tag. Other common tags I may use with a property are <example> for examples of use and <see> to refer to related functions from within the body of the documentation.

    Here is an example:

    ''' <summary>
    ''' Gets the <see cref="Person"></see>'s full name.
    ''' </summary>
    ''' <value>
    ''' The full name of the <b>Person</b>, comprising their
    ''' <see cref="FirstName"></see> and <see cref="LastName"></see>,
    ''' separated by a space.
    ''' </value>
    Public ReadOnly Property FullName() As String
    ...
    End Property


    That's it for right now. Next time I'll finish up XML comments by covering Method and Event comments.

    Sunday, August 17, 2008

    Documentation is the Key

    Documentation is one of the most important, yet overlooked, aspects of programming. No one programs in a vacuum. At some point, someone else will look at your code and have to understand it. For that matter, at some point in the future you may have to look at your code and understand it! When the code is fresh in your mind, understanding it is easy. But how easy is it to follow six months from now? One year? Two years? By that time, several projects have come and gone and that code is stale and musty in your mind. Do you really want to laboriously read through it line-by-line just to understand how it works again? Wouldn't it be easier if you could skim through your code and quickly home in on the piece you need to fix or change?

    Clearly-written code can certainly help you achieve that objective, but I feel that clearly-documented code can make the task even easier. With well-documented code you can read through the documentation rather than the code itself to understand what each piece of code is supposed to do. And well-written documentation in plain English (or French or German or whatever your native language is) is always much easier to understand than the actual code itself (and if it isn't, then you really need to work on your writing skills).

    If you're tasked with making changes to a project that you've never been involved with before and the previous developers didn't properly document their work, you may quickly feel the urge to track down those programmers and do bodily harm (if so, hopefully they were consultants who were long gone and not your co-workers). No documentation means having to slog through thousands of lines of code (or more) to track down the code you need. Sometimes you have to wade through all of the code just to get a basic understanding of the application before you can even dream of modifying it. Don't be one of those programmers that other programmers will one day want to kill!

    On the other hand, good documentation means having external documents that help you see the overall structure of the application -- the big picture. From that, you can see which pieces of the code (layers, namespaces, or classes) are important to your current task. More detailed documentation about those code sections (both external documents as well as class-level comments) can help you drill-down to even smaller pieces of the code. Finally, documentation on the class properties and methods as well as inline comments can help you pinpoint the exact sections of your code that you need.

    Good documentation involves several things. The main aspects of well-documented code are internal documentation, self-documenting code, and external documentation.

    Internal Documentation
    Actively documenting the body of your code is perhaps the biggest topic of the three types of documentation I mentioned. Thus, I'll start with that and spend most of my time talking about it. Internal documentation itself can be broken down into several topics: class-level comments; method-, property-, and event-level comments; and inline (or code block) comments. In addition, there are a couple side-topics I'd like to discuss: XML documentation and code regions.

    It is vitally important to rigorously comment your code. Unfortunately, writing comments tends to be boring compared to the challenge of coding and is often neglected. It can also be time-consuming and thus easy to skip when you're working under a tight deadline. I know I'm still far from perfect when it comes to properly documenting my code, but I'm aware of its importance and since I constantly strive to improve the quality of my code, I also constantly strive to improve the quality of my code comments (which I feel has a net effect of improving the quality of the code itself).

    The next few sections delve into each aspect of internal documentation in more detail. At the end, I'll provide a few tips and tricks I've discovered that help make it easier to include and maintain good documentation. Some of them even make you a more efficient programmer, so you have even fewer excuses for not including good comments in your code!

    OK, I'm long-winded, I'll admit it. So without further ado, let's start commenting about comments... next time. ;-)


    Thursday, August 7, 2008

    Apologies, life, and software deadlines

    Yes, it's been a long time since my last blog post. I've been pulling a lot of extra work trying to hit a milestone for an application I've been developing. As a result, in my free time I've mainly wanted to just unwind so other commitments have suffered a bit. You should see the state of my yard!

    Anyway, the deadline has come and gone, and now I'm scrambling to hit the NEXT milestone! It just never ends in the life of a software engineer, does it? Our life is a series of deadlines stretching off (way off) into the distance. In my case, I'm writing an application that's much more complex than anything I've attempted before and there's only one other developer working with me on this project so we're in for a marathon, not a sprint (sorry all you Scrum adherents!).

    As the project lead, I've found this a very interesting challenge. I've actually been working on this project for some time now and am approximately 2/3 through the whole thing. The whole process has led to a number of things I hope to write about here.

    To give you a quick teaser until then, this project is an application to assist our staff with the document creation process. Sounds like all we need to do is purchase Microsoft Word, right? However, these documents are built on over 100 templates (all of which may need to be highly customized), have several input fields, pull information from our databases, and have to integrate with our document imaging and workflow systems.

    Actually, we already have such a system. I wrote it several years ago when the previous system was made obsolete during a mainframe upgrade. And it was built on Microsoft Word. However, now we also need to capture information from these documents for statistical analysis (we currently do capture statistics from these documents, but it's manually entered into our statistics database by another department after the documents have been committed to our document imaging system).

    Sounds like a fairly straightforward addition, right? It's definitely a good idea and one I fully support, but a basic analysis of the actual system quickly reveals a number of twists in an otherwise simple-sounding task. First, the users need to play pretty fast and loose with the templates. Depending on the need, some of the statistics that need to be captured are currently added to the text of the document, not existing fields. Some could come from fields that are inserted dynamically into the text through a variety of predefined text blocks. Also, some statistical elements may occur a variable number of times in a document. Some documents don't contain new statistics, but modify existing statistics. Finally, one of the "templates" is a totally blank document for free-form entry (that one's probably going to be a goner if I have anything to say about it).

    Put it all together, and our existing VB6 application bundled on top of Microsoft Word templates and VBA code just wasn't quite up to the task. So the decision was made to dump Word from the system (we found a fairly good word processor control that we can embed in the new application) and rewrite the application in .NET.

    All this is a long-winded way of saying that I'm sorry for the long delay since my last post and (hopefully) software deadlines will no longer distract me from my blog. I plan to post my next article about coding conventions soon and you can expect to hear more about the past, present, and future of this project as well.

    Saturday, July 12, 2008

    Newest Members of the Family

    I was planning to post the second installment of my article series on code conventions, but I've just been too darn busy... having fun with the newest members of my family. I adopted two new cats from the local SPCA and have been spending most of my spare time playing with them. The two actually seem to like each other, and our existing cat, Edison, seems to actually tolerate them as well. It did seem a little touch-and-go at first, though... lots of hissing and growling in the Lehmann household for a few days.

    And just because it's against the laws of the internet (and cat-lovers) to talk about their new furry friends without posting at least one picture, here are some shots of the new kitties:



    Roxy - exploring bags on the kitchen floor



    Gessle - and his humongous tail!

    Both are still a little on the skinny side, but gaining weight fast. Roxy started out kind of quiet, but now that she's getting used to her new home she's starting to become more and more lively. Gessle's just a sweetheart. He acts like a big kitten at times, but he's also an affectionate lover boy.

    So, who can guess the source of the names? Roxy is a diminutive of Roxette, one of our favorite musical groups. Gessle's named in honor of Per Gessle, the male half of Roxette.

    And just so he doesn't feel left out, here's a pic of Edison. At almost 10 years old, he's the "old man" of the family:




    Edison - our sleepy, 20 lb "little guy"

    Monday, June 30, 2008

    Code Conventions

    And no, by code "conventions" I'm not talking about JavaOne or VSLive!

    Code conventions are the standards you live by (or are forced by your employer to live by) that define how your code should look. Code conventions commonly specify things like how you name your variables, functions, and classes; guidelines for how to organize the text of your code on-screen; and requirements for properly documented code. If you haven't had to comply with code conventions (and maybe even if you have), you may wonder why anyone would want or need such stifling-sounding constraints. The easiest way to answer that question is with another question: have you ever had to look at someone else's code and try to understand it?

    I'm assuming almost everyone answered yes to that question. If you've ever had a hard time trying to figure out just what was going through the head of the person who wrote that code you've been staring at, you know why we need code conventions (unless it was actually a bad piece of code, not just badly written... there's plenty of that out there too, but that's another topic entirely). Code conventions help make it easier for others to read and understand your code. Or to put it more selfishly: code conventions make it easier for you to understand someone else's code!

    Now are you ready to accept that code conventions just might be a good thing?

    The main problem with this idea is the question "who makes the standards?" There are some common, wide-spread conventions out there... some good, some not-so-good (Hungarian notation, anyone?). Most likely, the standards you follow will be the ones set down by your employer... hopefully they're making you follow some good ones! Without any universal standards -- and the firepower to force everyone to follow them -- you'll still undoubtedly run into lots of marginally intelligible code, but if your company enforces any sort of code conventions at least you'll be able to read your coworker's code fairly easily after he quits to form what he thinks will be the next MySpace.

    When I started at my current employer, there were no documented code conventions for VB development. We have a fairly small number of programmers so perhaps it was felt that there was no need for them, but all it takes is one sloppy programmer to make a big mess. Since some of our most complex systems have been written by multiple contractors, the code we have to maintain was all over the place, stylistically speaking. It didn't take me too long to realize that even a small team needs some guidelines on how their source code should be written. And since I was a lead programmer, I got to dictate what those guidelines would be!

    When I sat down to start writing my guidelines for coding conventions, I decided to focus on four main areas: documentation, code formatting, programming techniques, and source code organization. Each of these is a very large topic. Some of what I eventually developed came from my own experiences writing and maintaining code. A lot of the naming conventions I adopted come from Sun Microsystems' Java Code Conventions. Many of them I'm still developing as I go along.

    Over the next several posts I hope to share the highlights in each of these areas. And since I know it's impossible for me to have thought of everything, I'd love to see what anyone else has to say on this topic as well!


    Wednesday, June 25, 2008

    Managing and Motivating Developers: Tips for Management Cluefulness

    OK, I said I wasn't going to just post links to other content very often, but every now and then I run across something that states a concept better than I ever could. This is one of those.

    Managing and Motivating Developers: Tips for Management Cluefulness

    Monday, June 23, 2008

    How the brain works: Illusions video and other Weird Science videos at 5min

    OK... I normally try not to just post links to other pages, but I have to say I've always loved optical illusions and these are kind of freaky to watch.

    How the brain works: Illusions video and other Weird Science videos at 5min


    Wednesday, June 18, 2008

    Change is Good (part 3)

    The ever-growing (and changing) framework
    As if .NET wasn't big enough already, I'm afraid to find out how many new classes were added between .NET 2.0 and .NET 3.5. With so many classes and types in the framework, can anyone really call themselves a .NET expert? I know I'm a long way away from knowing every class in the .NET 2.0 framework, and I seriously doubt there's anyone out there who's used all of them.

    Let's face it... we're all specialists nowadays. As much as I like to consider myself a generalist, a multi-purpose programmer, a Renaissance man of code, if you will, I'm as much a specialist as anybody. I've specialized in the APIs that I've had to use to get my work done. Right now that means I've specialized mainly in .NET. And a subset of .NET at that. I've had no need for Remoting yet, for example, but I'm sure others use it all the time. I've used the classes in the Reflection namespace fairly often, but I'm sure there are plenty of .NET programmers who've never needed them. And as much as I love Java, I haven't written a line of Java code for at least three years now, so yeah, I'm a specialist.

    I just happen to be a "specialist" who specializes in learning technologies as needed. ;-)

    Seriously.

    Each time I'm faced with a new problem, one of my first thoughts is "has Microsoft already done this for me?" If I can't find a class for what I need in the humongous .NET framework, then I wonder "has someone else already done this for me?" And if they have, then I ask myself "can I convince my company to buy it?" (assuming it's not open source) It's the lazy programmer's approach to code, but it works fairly well. Of course, if I find out that Microsoft or someone else has written a class or API that might fit what I need, I then need to learn how to use that! But isn't that still much easier than writing it yourself?

    Besides, once you've gone through that learning curve, you've added another tool to your programmer's toolkit and become just a wee bit less specialized.

    I'm currently being "lazy" by trying to learn about NHibernate, but that's a story (or several) for another time.

    What if I don't care about .NET?
    Well, I'm not going to say that you should! I'm certainly not a Microsoft advocate -- anyone who's ever been on the receiving end of my rants about the latest idiosyncrasy I've found in Visual Studio or the .NET framework can attest to that. But honestly, Java's not much different. Ten years ago, Java was a much smaller creature than it is today. And when you consider the various flavors of Java like J2SE, J2EE (especially J2EE!), and J2ME, your typical Java programmer has as much of a learning curve as your typical .NET programmer.

    And hey, life's not any easier for you web programmers out there, you know. First, do you use ASP, ASP.NET, Perl, PHP, Python, Ruby on Rails, or some other language as the basis for your web apps (this decision probably isn't yours to make if you're not your own boss)? Add to that the fact that you'll probably need to get fairly good with JavaScript, HTML, DOM, etc. And what about AJAX? There's Atlas, Dojo, Google Web Toolkit (GWT), etc. And I haven't even mentioned Flash and AIR yet! (well, OK, I just did).

    The point is, as each language grows and evolves, it becomes more complex. As each ecosystem (such as the world of web programming) grows, more languages spring up (sorry, Java folks, no pun intended) to fill the various ecological niches.

    And I've just barely scratched the surface. Change is the name of the game for us programmers. We'd better enjoy the challenge, because like it or not, we're all along for the ride.
