Monday, July 6, 2009

Database Design, Software Design: How are they similar?

After I presented myself as someone versed in both databases and server-side Java, my interviewer asked a great question: what design considerations do data modeling and software design have in common? Though I stumbled a bit on my answer, two good things came of it: (1) I did get the job and (2) I've thought about the answers to that question ever since.

That's actually one of the better parts of stumbling over interview questions - it motivates you to circle back to the topic and nail it for future reference. In the bigger picture, that's one of the better things about interviewing frequently (whether you need a job or not) - it reminds you where your knowledge gaps are. But I digress.

Today I came across two posts that, taken together, reminded me of one of the first database-software-similarity answers I came up with. The first is about primary keys, and the second is about responsibility-driven design. The primary-key post advocates maintaining separate tables for separate concerns, and the responsibility-driven post advocates maximizing cohesion in your classes with the Single Responsibility Principle. I'd suggest that, in principle, we're talking about the same thing in both cases.

The primary-key post concludes with this: "Database design skills begin with identifying the kinds of things that must be tracked, putting each into a table, and assigning the primary keys to those tables." I'd suggest an analogous takeaway for the software developer: add equals and hashCode implementations to those application classes that call for unique identification at runtime - in particular for use in Java collections. Given this, perhaps the well-worn analogy between database tables and Java classes is not quite right - maybe it's table == collection, where the objects in the collection implement equals and hashCode. And, if so, then hashCode (backed by a consistent equals, since two distinct objects can share a hash value) == primary key.
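To make the analogy concrete, here's a minimal sketch rather than a recommended implementation. The Customer class, its customerId and name fields, and the countDistinct helper are all hypothetical names I've made up for illustration. The idea is that equals and hashCode are defined only in terms of the field that identifies the object, so a HashSet treats two objects with the same customerId as the same "row", much as a table would reject a duplicate primary key.

```java
import java.util.HashSet;
import java.util.Set;

public class CustomerDemo {

    // Hypothetical application class; customerId plays the role of the key.
    static final class Customer {
        final long customerId;
        final String name;

        Customer(long customerId, String name) {
            this.customerId = customerId;
            this.name = name;
        }

        // Identity is defined by customerId alone, not by name.
        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Customer)) return false;
            return customerId == ((Customer) other).customerId;
        }

        // Must agree with equals: equal objects produce equal hash codes.
        @Override
        public int hashCode() {
            return Long.hashCode(customerId);
        }
    }

    // Adds every customer to a HashSet (the "table") and returns how many
    // distinct entries remain once duplicates by customerId are dropped.
    static int countDistinct(Customer... customers) {
        Set<Customer> table = new HashSet<>();
        for (Customer c : customers) {
            table.add(c);
        }
        return table.size();
    }

    public static void main(String[] args) {
        int distinct = countDistinct(
                new Customer(1L, "Ada"),
                new Customer(1L, "Ada L."),   // same customerId as the first
                new Customer(2L, "Grace"));
        System.out.println(distinct); // prints 2
    }
}
```

One difference from a real primary key is worth noting: HashSet.add silently ignores the duplicate and keeps the first object, whereas a database would typically raise a constraint violation on the second insert.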

Note that I'm not suggesting that an object's hashCode value should be used as the primary key value in any table that represents it, nor am I suggesting there's necessarily a one-to-one correspondence between classes and tables. Those kinds of discussions are moving more towards implementation concerns; I'm speaking strictly from a design perspective.

I'll follow up with more thoughts about the similarities between database and software design, plus some articles that clear up misconceptions I've seen about normalization and other database topics.

