Home
  Latest posts
  My Writings
  My Code
  My Gallery
  About me
 
  rssfeed Syndication
 
Bloggtoppen.se
 
 
Links
  Cornerstone
  SweNug
 
Post categories
  misc (48)
  Architecture (21)
  C# (19)
  Asp.Net (2)
  Vb.Net (2)
  Training (7)
  Data (19)
  Events (40)
  Platform (2)
  Orcas (4)
  Updates (3)
  Methods (10)
  Tools (6)
  Announcements (14)
  Languages (1)
  Patterns (6)
  Opinions (11)
  Fun (3)
  Ineta (1)
  Opinion (0)
  Practices (2)
  WCF (5)
 
 
 

ORM and legacy data

Sunday, June 03, 2007

A week or so ago Gavin from the Hibernate team wrote a very humoristic post (you can read it http://blog.hibernate.org/cgi-bin/blosxom.cgi/2007/05/23#in-defence) where he made a clear case for ORM against OODBM. One of his points was that ORM is the only one of the two that can handle legacy in a satisfying way.

I have used ORM in several projects over the past years, and at least 2/3 of those has been against some kind of legacy database. The thing about legacy (or really screwed up design of the data model) is that they often don't conform to even the second rule of normalization. Mostly they are flat representations of the business problems at hand.

I though I should share one of my absolute nightmare stories and elaborate a bit on the design choices that where made to support the applications scenario over that nightmare.

This scenario is based on a database where the data model was built to support the the lowest common denominator of several RDMBS, where SQL Server, Oracle, DB2 and Sybase is just some of them. This means a data model with almost no normalization for some parts and extreme normalization for others. On top of that I'm certain that parts of that model has been written by an trainee or a monkey of some sort.

There is a table with date bookings, thees date bookings can be one of two things. Either it is a stand alone booking or it is a booking connected to a resource. If the booking is standalone, the ITEM_NO column will hold the value "Appointment" and the description column will hold a value. If it is connected to a resource, the ITEM_NO column will contain an ID of that resource and description will instead be null.

While ORM is really good at handling a lot of different mapping scenarios, some are just not mappable at all when it comes to object models and if it is really hard to find an object model to hold the data, the ORM will fall flat.

In this scenario it was really hard to create a model that actually made sense. Since the domain had no interest in tracking back to the resource but just wanted the resource name as a description if the description was empty. The idea was to only have a Appointment class with the information.

The first idea was to use inheritance and create different mappings for the two scenarios. Now that did not really work out. Inheritance in ORM uses two approaches, one is "table per class" which weren't suitable here since we only had one class the other is using a discriminator column to choose what concrete class to use. Not suitable as well. There was two possible ways of deciding how to separate them, either by the ITEM_NO column or the by the Description column and both of them had no clear distinct way to discriminate between concrete classes.

The second idea was to create a many-to-one relationship with the item class where we just got the value for the resource and use encapsulation in the Description property to make sure that we got the right information presented.

Something like:

protected virtual Resource Resource {...}
public virtual string Description {
  get {
     if ( Resource != null )
      return Resource.Description;
     else
      return _description;    
   }
}

This is a perfectly valid solution. The problem I had with this was that the data model actually got to put constraints on my domain model and made it ugly, though functional. I did not want that.

So instead I let the Data Model get constraints from the Domain Model. The approach for this scenario that I took was to create a view in the database that supported a clean model. The view looked something like this:

create view vwAppointments
as
select no, dateBooked,
      case when resources.description is null then
          appointments.description
     else
         resources.description
     end as description
from appointments
left outer join resources
on resources.item_no = appointments.item_no

Joy! Now I could model my scenario as it was supposed to be in the application and not care about the constraints the legacy database put upon me.

So, yes most ORM are good at mapping legacy, but helping them a bit by cleaning up the model with some nice views do help.

kick it on DotNetKicks.com
 

Comments
6/4/2007 12:56:00 AM     -   henke
 
Great tip! Thanks! Had an aha-moment just as when I saw the presentation at infoq saying how to refactor old databases. (by using a trigger to keep two columns' contents identical and setting a death-time of the older column, at which all systems should be using the new one)
 
6/4/2007 5:07:00 PM   http://youredoingitwrong.mee.nu   -   mcgurk
 
Wow, crappy legacy database. I got a better one.

I write a web frontend against a legacy database. The database was written by a self-taught coder and contains several goodies. These are my favorites:

The data in the system is spread out among multiple databases. So before I can query, I must know which database to query from, then join with tables in the master database.

Each database has its own number generator, essentially a single table with an identity field. Every record in every table in each database gets its ID from this table. So, to insert a record in a table, you must first go to the number generator and get your ID. Then you use that in your new record.

One particular table has a field called foobar (not really). If the int value in tarfu (another field in the same table) has a value of 32, then foobar is the id of a record in another table having to do with security. If the value is other than 32, then foobar contains data that controls how objects are filtered within the table.

The database contains five different security models. Five. Every data access must be tested against each one. My longest query is over 11 pages long, printed in Notepad with the default settings.

I subscribe to the DGPSMN model of database access. DEAR GOD PLEASE SAVE ME NOW!
 
5/5/2009 10:34:00 PM   http://www.orasissoftware.com   -   Adnan
 
Great posting. We ran into the same problems, making an ORM work with a legacy database that was huge. We got so frustrated that we eventually wrote a tool to get the job done. The tool is now a mature product. You download it from http://www.orasissoftware.com

This may sound like a product pitch, but really it got us out of the ORM legacy database mess we were into, and then it was developed into a mature product. It also works very nicely with new databases so it works in both modes


Comment
Title:
Your name:
Your url:
Text:
Please enter the text from the image: