So, one of the things I thought I might like about Google+ (as opposed to Twitter) is that I can write quick rebuttals to blog posts that I find on the internet, without the effort of maintaining a full-blown blog. This is my first stab at it. Enjoy:
The article in question: http://smoothspan.wordpress.com/2011/07/22/nosql-is-a-premature-optimization/
posted by Bob Warfield.
My attempt at a rebuttal:
First of all, I should start off by saying that I agree that most small to medium size projects probably* won't have enough traffic or scale to need a NoSQL solution for that reason right out of the gate. However, that alone is not enough to make NoSQL a premature optimization. The reason is the general exception in the definition of "premature optimization is the root of all evil": it only applies to "optimizations" that make the codebase less readable, less modular, and less well-designed. A lot of the time, using a NoSQL database can actually make domain modelling easier.
As the original article is divided into three main points, let's tackle them one by one:
1. "NoSQL technologies require more investment than Relational to get going with."
Obviously, I disagree with this statement. A well-designed NoSQL persistence layer can often be cheaper to implement than one built on an RDBMS, and the reason is often "the Object-relational Impedance Mismatch" **:
Put simply, mapping certain concepts that make a ton of sense in the object-oriented world to a relational database can be a huge pain. If you're not using a relational database to persist your data, that pain might just up and vanish. Less pain equals faster development, sometimes dramatically so. How much faster it can be is probably best summed up by my colleague +Chris Turner
in his recent post: http://plus.google.com/107237485306756328866/posts/8a7raE3MJfz
I've had experiences like this as well, and so have others. Here's a specific example: the current project I am working on features a JSON REST API. We have found that using MongoDB (which persists JSON-like documents) as our persistence layer means that we can use one consistent model the whole way through our app, never having to shoehorn a concept into a weird construction in a different paradigm. As a result, the development cost of persistence has been very low. Even though we don't have enough data to require NoSQL, it has been a positive choice for us: very far from a premature optimization.
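To make the "one consistent model" idea concrete, here is a minimal sketch in Python. It is not our project's code and it does not use MongoDB's API: `InMemoryDocumentStore`, `Order`, and `LineItem` are hypothetical names, and the store is just a dictionary standing in for a document database. The point it tries to show is that the nested structure the API returns is the same structure that gets persisted, with no mapping layer in between.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Order:
    order_id: str
    customer: str
    items: List[LineItem] = field(default_factory=list)


class InMemoryDocumentStore:
    """Hypothetical stand-in for a document database; not MongoDB's API."""

    def __init__(self):
        self._docs = {}

    def save(self, key, document):
        # Round-trip through JSON so only JSON-representable data is stored.
        self._docs[key] = json.loads(json.dumps(document))

    def load(self, key):
        return self._docs[key]


def order_to_document(order):
    # The nested dict is both the stored form and the API response body.
    return asdict(order)


store = InMemoryDocumentStore()
order = Order("o-1", "alice", [LineItem("widget", 2), LineItem("gadget", 1)])
store.save(order.order_id, order_to_document(order))

# What the REST layer would send back: the stored document, unchanged.
api_response = json.dumps(store.load("o-1"), sort_keys=True)
print(api_response)
```

With a relational store, the same order would typically be split across an orders table and a line-items table and reassembled on every read; here the nested list travels as one document.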
The post mentions two other points: "a learning curve", and "an operational overhead". The learning curve is fair, but in my experience part of that curve is unlearning some of the tricks that years of doing relational have made second nature (like an entity to represent a many-to-many relationship, or master-detail relationships), and realizing that what would be "the dumb approach" in a relational system might not be that dumb after all.
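As an illustration of the many-to-many habit, here is a small sketch comparing the two styles. The students-and-courses domain is made up for the example, the relational side uses Python's built-in `sqlite3` module, and the "document" side is just plain dictionaries rather than any particular NoSQL product. Treat it as one possible shape, not a recommendation for every data model.

```python
import sqlite3

# Relational approach: a third table exists only to link students and courses.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(id),
        course_id INTEGER REFERENCES course(id)
    );
    INSERT INTO student VALUES (1, 'alice');
    INSERT INTO course VALUES (10, 'databases'), (11, 'networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 11);
""")
rows = db.execute("""
    SELECT c.title FROM course c
    JOIN enrollment e ON e.course_id = c.id
    WHERE e.student_id = ?
    ORDER BY c.title
""", (1,)).fetchall()
relational_titles = [title for (title,) in rows]

# Document approach: the student document simply lists its course ids.
students = {1: {"name": "alice", "course_ids": [10, 11]}}
courses = {10: {"title": "databases"}, 11: {"title": "networks"}}
document_titles = sorted(
    courses[course_id]["title"] for course_id in students[1]["course_ids"]
)

print(relational_titles, document_titles)
```

Both halves answer the same question (which courses is this student taking?), but the second one never needs the linking entity. Whether embedding a list of ids is the right call depends on how the data is queried and updated.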
The operational overhead concern is a fair one, as many ops people are a lot more reluctant to adopt NoSQL than us developers. But not all of them; I've met a few ops people who are incredibly open to the idea, and not just because they have visions of turning the budget for Oracle support contracts into more salary ;-) At any rate, consider that this might largely be an issue for medium companies, because a lot of the time small companies don't do correct operational monitoring and maintenance anyway: they just install MySQL on a box, put it in a closet, and pray that nothing goes wrong.
2. "There is no particular advantage to NoSQL until you reach scales that require it....."
Well, obviously there is. I've been ranting about why for a few paragraphs already, so I'll leave this point alone for now.
3. "If you are fortunate enough to need the scaling, you will have the time to migrate to NoSQL and it isn’t that expensive or painful to do so when the time comes"
This is the point at which my opinion of the article descended into incredulity. Not because point #3 is outlandish (it isn't), but because it directly contradicts point #1! How can moving an app from relational storage to NoSQL storage be easy, but developing that same app to persist to NoSQL storage from the beginning be prohibitively hard? Something doesn't feel right here, and looking back at the article, I think I've found it.
The gist of the problem is in this quote: "as Sid Anand says, 'How do you translate relational concepts, where there is an entire industry built up on an understanding of those concepts, to NoSQL?'" Why would you have to translate those concepts if you aren't using them? The big reason would be that you've gotten into the habit of designing your applications using those concepts. This leads me to suspect that Mr Warfield may partake of a bad practice that is depressingly common in the "enterprise computing" world -- modelling applications in terms of their (almost always relational) persistence instead of designing a persistence strategy in terms of the application.
This is a really, really bad practice. Modelling an enterprise application in terms of its persistence is like designing a video game around the file format of its save game function. Persistence should be an entirely internal concern. Nonetheless, many people think about the schema design first, and it leads to incredibly nasty problems: like a database schema that can't be changed without breaking an external application, because it was "cheaper" for another app to just connect to a big common database.
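One way to keep persistence an internal concern is to have the application code talk to a small interface and hide the storage behind it. The sketch below is only an illustration of that idea: `CustomerRepository`, `InMemoryCustomerRepository`, and `rename_customer` are hypothetical names, and the in-memory dictionary stands in for whatever database (relational or otherwise) actually sits underneath.

```python
from typing import Optional, Protocol


class CustomerRepository(Protocol):
    """What the application needs from persistence, and nothing more."""

    def save(self, customer_id: str, data: dict) -> None: ...

    def find(self, customer_id: str) -> Optional[dict]: ...


class InMemoryCustomerRepository:
    """Hypothetical implementation; a real one could wrap any database."""

    def __init__(self):
        self._rows = {}

    def save(self, customer_id, data):
        # Store a copy so callers cannot mutate stored state by accident.
        self._rows[customer_id] = dict(data)

    def find(self, customer_id):
        return self._rows.get(customer_id)


def rename_customer(repo: CustomerRepository, customer_id: str, new_name: str) -> dict:
    # Application logic: it knows about customers, not about tables or schemas.
    customer = repo.find(customer_id)
    if customer is None:
        raise KeyError(customer_id)
    updated = {**customer, "name": new_name}
    repo.save(customer_id, updated)
    return updated


repo = InMemoryCustomerRepository()
repo.save("c-1", {"name": "Alice"})
print(rename_customer(repo, "c-1", "Alice B."))
```

Because other code only ever sees `rename_customer` and the repository interface, the storage format can change without anything outside noticing, which is exactly what a shared database schema makes impossible.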
For whatever reason, I haven't seen the same kind of thinking in the NoSQL world. It might be because the different NoSQL technologies are less similar to each other than RDBMSes are, or because the lack of an impedance mismatch makes it easier to extend OO/DDD concepts everywhere, or because everyone is using a different product, or something else entirely. I don't really know why, but I do know that I don't want it to show up.
So there's my first rebuttal to a random post that showed up in my Twitter feed. It's longer than I thought it was going to be, and it has footnotes -- sorry about that. The point of this idea is to not expend the effort in editing and polish that a full-blown blog would demand, so I'm not going to edit this post to death, OCD be damned.
* There are probably cases where the right NoSQL solution could turn a very large project into a medium project (or, perhaps the other way around), but let's leave that alone for today.