Friday, August 11, 2006

Lots of progress

Well, I used to work for a guy who said writing software is a lot like baking waffles. The first waffle sticks to the iron and doesn't come out very well, but it primes the iron with just enough grease to make the second waffle perfect.

So it seems I have just baked my second waffle. I'm putting the finishing touches on Shades, a framework for ORMapping. My first waffle was JDOMax. Why the hell does the world need yet another ORMapping framework? Let me tell you:

  1. Too much XML configuration
  2. Transparent object persistence misidentified as necessary
  3. Relationships between records ARE NOT analogs to references between Objects.
  4. Inheritance relationships are rarely natural in data models.
  5. Transitive closure persistence unnecessary
  6. Too many transactional states increase complexity beyond the original problem

I learned all of the above the hard way! I'm not just pontificating on this crap. I spent roughly 3 years developing JDOMax and passing the Sun Test Compatibility Suite, so my first waffle was one mother of a waffle.

Looking at 1, "Too much XML configuration", your first reaction might be to think this is an implementation detail; just a consequence of bad XML design. I argue that this configuration complexity is a natural reflection of inherent complexity. Unfortunately, the inherent complexity lies not in the problem you are trying to solve, but in the act of object relational mapping itself. A static specification of a mapping cannot easily convey *contextually* relevant information. In other words, a "mapping" is too static. What is really needed is an Object/Relational Interface, the implementation of which is *code*, and which can therefore, in a few lines, make contextually relevant decisions, so that ORMapping becomes dynamic and flexible rather than static and rigid.

Shades has absolutely no XML or annotation configuration. Rather, there's an interface called ORMapping that operates a lot like a TableModel. A DefaultORMapping is usually extended by the program, and only a few methods need to be implemented. At first I was shocked by how easy it was, but it's now apparent to me that O/R Mapping is better suited to programmatic configuration than document configuration. The reason XML configuration of O/R Mapping is so complex is because...well, because in THIS CASE it's way more complex than just writing a few lines of code.
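
To give a feel for what "mapping as code" can look like, here is a rough sketch. It is NOT the Shades API: the `SketchMapping` interface, its method names, the `archive` flag, and the `STUDENT_ARCHIVE` and `STUDENT_FNAME` names are all made up for illustration, loosely following the shape of a Swing TableModel. The point is only that a mapping written in code can make a decision at runtime (here, which table to target) that a static document would need extra configuration machinery to express.

```java
/** Demo driver; everything in this file is a hypothetical sketch, not the Shades API. */
class MappingSketch {
    /** Builds a parameterized INSERT statement from whatever the mapping reports. */
    static String insertSql(SketchMapping<?> m) {
        StringBuilder cols = new StringBuilder();
        StringBuilder params = new StringBuilder();
        for (int i = 0; i < m.getColumnCount(); i++) {
            if (i > 0) {
                cols.append(", ");
                params.append(", ");
            }
            cols.append(m.getColumnName(i));
            params.append("?");
        }
        return "INSERT INTO " + m.getTableName() + " (" + cols + ") VALUES (" + params + ")";
    }

    public static void main(String[] args) {
        System.out.println(insertSql(new StudentMapping(false)));
        System.out.println(insertSql(new StudentMapping(true)));
    }
}

/** Hypothetical mapping interface, shaped a bit like javax.swing.table.TableModel. */
interface SketchMapping<T> {
    String getTableName();
    int getColumnCount();
    String getColumnName(int column);
    Object getColumnValue(T record, int column);
}

/** Plain object with no knowledge of persistence. */
class Student {
    final String firstName;
    final String lastName;

    Student(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

/** The mapping is ordinary code, so it can decide things from context. */
class StudentMapping implements SketchMapping<Student> {
    private final boolean archive; // assumed contextual switch, invented for this sketch

    StudentMapping(boolean archive) {
        this.archive = archive;
    }

    public String getTableName() {
        // A runtime decision that a static mapping file cannot make on its own.
        return archive ? "STUDENT_ARCHIVE" : "STUDENT_TABLE";
    }

    public int getColumnCount() {
        return 2;
    }

    public String getColumnName(int column) {
        return column == 0 ? "STUDENT_FNAME" : "STUDENT_LNAME";
    }

    public Object getColumnValue(Student s, int column) {
        return column == 0 ? s.firstName : s.lastName;
    }
}
```

The whole "configuration" for one class is four short methods, and the table choice is a one-line conditional.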

Somewhere along the line we all just accepted that XML was the way to configure O/R mapping. We bought into a real big advantage of XML - that it's externalized, and can in theory be edited by this mythical "deployer" person. The problem is that this mythical deployer does not exist. In reality he's the programmer, and it turns out, now that the wow cool factor has worn off, that the XML is harder to deal with than a few lines of code.

2. Transparent Object Persistence Misidentified as Necessary
This is a big deal. Transparent Object Persistence (TOP) is the idea that you operate on Objects with no concern for the fact that they may be persisted in a relational database. And even beyond that, TOP espouses that it is the Object AND its references that are in the database. This just turns out to be wrong. THERE IS NO *INHERENT* MAPPING BETWEEN RELATIONSHIPS IN THE DATABASE AND REFERENCES IN THE JVM. But it is *possible* to relate object references to relationships in the datastore. The problem is that the mapping is not one-to-one, and again we are back to a solution that creates at least as many problems as it solves.

Quick example. A Teacher Object has a collection of Student Objects. Each Student Object has a field named teacher. The database has an FK in STUDENT_TABLE that points at the PK of the teacher table. So in Java land you remove a student from a teacher's collection. This must immediately null the teacher field in the student Object. This is unexpected. And making it work leads to a complex implementation of proxy collections. When the transaction is committed, the FK in the student is set to null. It's simply not at all clear that removing an element from a collection in memory is the right way to remove a relationship from the database. It's just not an intentional way to program. In other words, in order to "forget" about the relational database, we have to learn more than we ever bargained for, and understand subtleties that can be mind bending even for experts.

And the transparency turns out to be nothing near transparent. Consider a STUDENT_TABLE that places a unique constraint on STUDENT_LNAME. If you created a new Student in memory, and added it to a Teacher Object's collection, when would you find out this was invalid? You'd find out when you committed, and a nested datastore exception was thrown. What's so transparent about that? The transparency is a myth. And how would you handle the exception when you created 19 other Students in the same transaction? Yes, yes, it is *possible* to handle using an array of nested exceptions, blah blah blah. BUT FOLKS, IT'S NOT EASY. YOU HAVE TO LEARN MORE TO MAKE IT WORK THAN TO SOLVE YOUR ORIGINAL PROBLEM!!!!!!!!!!!!
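
The Teacher/Student collection problem above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not how any particular JDO implementation does it: `ManagedStudentList` is a made-up stand-in for the kind of proxy collection a transparent-persistence layer has to supply, and the two list fields on `Teacher` exist only to put the two behaviors side by side.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Demo driver; all class names here are hypothetical. */
class RelationshipSketch {
    public static void main(String[] args) {
        Teacher teacher = new Teacher();

        // Plain collection: the two sides of the relationship drift apart.
        Student plain = new Student();
        plain.teacher = teacher;
        teacher.plainStudents.add(plain);
        teacher.plainStudents.remove(plain);
        System.out.println("plain back-reference still set: " + (plain.teacher == teacher));

        // Proxy collection: removal has a hidden side effect on the Student.
        Student managed = new Student();
        teacher.managedStudents.add(managed);
        teacher.managedStudents.remove(managed);
        System.out.println("managed back-reference cleared: " + (managed.teacher == null));
    }
}

class Student {
    Teacher teacher; // in-memory counterpart of the FK in STUDENT_TABLE
}

class Teacher {
    final List<Student> plainStudents = new ArrayList<>();
    final List<Student> managedStudents = new ManagedStudentList(this);
}

/** Hypothetical proxy collection that keeps Student.teacher in sync with membership. */
class ManagedStudentList extends AbstractList<Student> {
    private final Teacher owner;
    private final List<Student> delegate = new ArrayList<>();

    ManagedStudentList(Teacher owner) {
        this.owner = owner;
    }

    @Override
    public Student get(int index) {
        return delegate.get(index);
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public void add(int index, Student s) {
        delegate.add(index, s);
        s.teacher = owner; // side effect the caller never asked for
    }

    @Override
    public Student remove(int index) {
        Student s = delegate.remove(index);
        s.teacher = null; // what a commit would later write as a NULL FK
        return s;
    }
}
```

Even this toy version needs a custom collection class just to keep one pair of fields consistent, and it still says nothing about when or how the FK update reaches the database.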
