Our Boot from Network Datanode design was conceived in ignorance of real world application. Serial Number vs. MAC Address debates ensued in Ivory Tower minds and a schema was built. I’m currently in-between designs. I’ve consulted our resident data genius and he devised a superior schema that I have not yet been able to implement.
Today was spent trying to figure out why I built a view to contain my base information and how in the Hell to add new DataNodes so everything works as expected. Knowing the future and working with the past can be painful, especially when multiplied by the 0.0.2 changes to the existing schema that “seemed like a good idea” at the time. The bonus multiplier for getting this done is that I have 3 clusters to add in the next 60 days, a new admin to get trained and 6 developers to support. Did I mention that I get to figure out how to re-process failed files in production? The developer who created the process in no longer available and didn’t have time for a knowledge transfer. That’s why we pay $200/hour for Professional Services! So they can go away and leave us trying to understand their mistakes! (Just a little bitter)
What’s the lesson here? Just because you have incrementally better designs, doesn’t mean you should half-ass implement them. Live with what you have until you can fully move to a better design. Full on releases are much better than temporary fixes that get forgotten when you’re distracted for 3 days. I’m planning to create a new database to house all of the changes coming in releas 1.1 of Indostan. Otherwise we’ll be stuck in hybrid upgrade Hell forever. Yes, lazy admin, that means you have to do something the “hard/wrong” way EVEN WHEN you now there is a better/easier one. Until the new stuff is fully baked, use the old stuff.
I know this flies in the face of “continuous deployment” models, but it works much better for foundation level applications. It allows me to compartmentalize change and when I have 3-5 hair on fire events per day, that is essential. Today seemed slow and I think there were 4 crisis that had to be solved “right now.”
Being able to strike a balance between growth and stability is a valuable skill for any administrator. It’s sometimes harder for Hadoop Admins because there are new and better options weekly if not daily which makes it much more of a requirement. Choose an intelligent design and run it for a while. 3 weeks, 3 months or maybe 6 months, but take the time to let evolve in your mind and within your environment. Then re-evaluate and make changes. 6 months is a lifetime in Hadoop add-ons.
~~ GM