Why Enterprise Hadoop jobs will not require Java skills in 3-5 years.

In 1979, RSI’s Oracle version 2 ran on Digital’s VAX minicomputers (32-bit AND virtual memory!). To be proficient with the first commercial RDBMS, you had to possess mad Macro-11 or PL-11 (the high-level version) skills to make many of the functions we now take for granted actually work. Many basic tools that DBAs and developers use today simply didn’t exist. You had to roll your own. Even the data dictionary was a new concept and often in flux.

Hello World, Macro-11 style:

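        .MCALL  .TTYOUT,.EXIT
HELLO:: MOV     #MSG,R1 ;STARTING ADDRESS OF STRING
1$:     MOVB    (R1)+,R0 ;FETCH NEXT CHARACTER
        BEQ     DONE    ;IF ZERO, EXIT LOOP
        .TTYOUT         ;OTHERWISE PRINT IT
        BR      1$      ;REPEAT LOOP
DONE:   .EXIT

MSG:    .ASCIZ /Hello, world!/
        .END    HELLO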

Don’t forget the RT-11 commands to assemble, link, and run!



Hello, world!

It was an immature but revolutionary way to store and recall information. Bell Labs saw the business benefits of the Oracle RDBMS and thus much hype and exuberance flowed in the land:

“They could take this data out of the database in interesting ways, make it available to nontechnical people, but then look at the data in the database in completely ad hoc ways.” – Ed Oates

During these early days you needed a room full of advanced computer science academics just to keep the system functioning, at each and every business. There were no safety nets, and everyone had their own perspective on how to do a multi-join query WITH an aggregate function (and on the 4th day RBO was created, and it was good). Read consistency was still 5 years away! As time went on, the best brains from the IT collective pioneered the standards and best practices that we all use today. As the tech matured, the need for low-level Macro-11 developers diminished; their hand-rolled work was absorbed into a more mature product that could appeal to large non-tech companies. Patterns were established, and the need for highly skilled programmers just to keep the data store functioning went away. Interestingly, the data and the patterns of its flow remained. That is why enterprises have DBAs, not developers, to maintain modern relational databases.

Inevitably, there are times when advances demand new low-level programming skills on a large scale. When RSI released Version 3 in C, there was high demand for developers who could read and speak the prose of Mr. Ritchie. This was necessary for recompiling and testing a consistent code base across everything from minis and mainframes to PCs. While C was quite portable, there was much work to be done in the storage subsystems. Again, as the need for low-level tech skills went away, the data remained.

When we look at the new world of Hadoop, we must understand that this type of tech revolution has occurred before. Right now there is much work afoot to solve the primitive questions. This undoubtedly requires a new breed of low-level Java developers… for a while. We see the results of these efforts in tools like Pig, Hive, Impala, and Stinger, glued together via HCat. Once the dust settles, I wouldn’t stake my professional future on mastering MapReduce; I would focus on mastering the higher-level tools instead, because they give enterprises quicker access to business insight. As Hadoop’s primitive issues are resolved into standards and patterns over the next 3-5 years, the need for Java developers will diminish substantially. Just look at how many PL-11 or C++ programmers your enterprise has on its DBA teams; the low-level tech comes and goes, but the data remains.
