Consulting and Contracting

I was a 1099 contractor for a local consulting "broker." Basically, we could use the company name and pick our own gigs. I had a few lined up helping another consultant manage school systems. I'll get back to the schools. I thought it was interesting to see the types of small businesses I got involved with. Dentists! They usually don't know or care about IT, but they can make a lot of money by installing Practice Management Software and things like intraoral cameras. It helps them "sell" the dental work that needs done. A few GP doctors and realtors were regular customers too. I ended up creating a "bundle" where I sourced external hard drives and backup software. I got to set up off-site backup rotations just like enterprise-class systems.
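
The rotation itself was nothing fancy. Here's a hypothetical modern sketch of that kind of rotation, assuming an external drive mounted at /mnt/backup and using rsync in place of whatever packaged backup software shipped with the bundle (paths and retention are invented for illustration):

```bash
#!/bin/bash
# Nightly backup rotation sketch (hypothetical paths and retention).
# Keeps one dated snapshot per weekday on the attached external drive;
# the drive itself gets swapped and carried off-site on a schedule.

SRC="/data/practice"                 # assumed: practice-management data
DEST="/mnt/backup/$(date +%A)"       # e.g. /mnt/backup/Monday

mkdir -p "$DEST"
rsync -a --delete "$SRC/" "$DEST/"   # mirror tonight's data into today's slot

# Record what ran so a quick glance confirms the rotation is healthy.
echo "$(date -Is) backed up $SRC to $DEST" >> /mnt/backup/backup.log
```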

The schools were interesting because they needed a quick way to wipe and re-install all of the computers in the district over the summer. The first time we did it, it took a few weeks to complete the 10 buildings. The second year we worked with IT students and managed to get each building down to a day. It's multiple laps around the building while systems are plugged in, turned on, wiped, loaded and tested. We only had 1 fire.

The 3rd year was a blast. One other consultant and I were able to do each building in less than 8 hours. That included installing new managed switches in each building. We configured the switches before taking them out, so it was just a matter of swapping hardware and testing. We did have to replace a few cables, but things went pretty smoothly. The real fun was the PXE boot.

PXE1 is super handy if you understand how to set up your environment. We had all of the computers set to attempt a PXE boot before attempting to boot from the hard drive. That allowed us to run scripts, in this case Bash scripts. We had a floppy-based minimal Linux that could work with all of the NICs in the district. Since each building was assigned a unique subnet, we could determine where we were (to identify the local server for the building), and using dmidecode we could determine what type of system we were running on. Using that, we could determine which gold image broadcast we needed to attach to. We had 1 system broadcasting UDP streams of the images, and once all of the clients were set up and waiting, we hit the button and went to lunch. 45 minutes later we came back, did a quick test and shut them down. If any failed or had errors, we documented them for a cleanup crew.
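
For flavor, here's a minimal sketch of what that boot-time script might look like today. The subnet-to-server mapping, model names, image names, and the use of udpcast's udp-receiver are my assumptions for illustration, not the exact tooling we ran back then:

```bash
#!/bin/bash
# Boot-time imaging sketch: figure out where we are and what we are,
# then join the matching multicast image broadcast.

# Which building are we in? Derive it from the third octet of our IP.
ip_addr=$(hostname -I | awk '{print $1}')
building=$(echo "$ip_addr" | cut -d. -f3)
local_server="10.0.${building}.10"        # assumed per-building server address

# What hardware is this? dmidecode tells us the model.
model=$(dmidecode -s system-product-name)

case "$model" in
  *OptiPlex*) image="lab-desktop" ;;      # assumed model-to-image mapping
  *Latitude*) image="teacher-laptop" ;;
  *)          image="generic" ;;
esac

echo "Building $building (server $local_server), model '$model' -> waiting for image '$image'"

# Join the broadcast and write the gold image straight to disk.
# udp-receiver (from udpcast) blocks until the sender starts the stream.
udp-receiver --nokbd --portbase 9000 | gzip -dc | dd of=/dev/sda bs=1M
```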

I really didn't enjoy "selling" my own work, so I headed back to the corporate world.

1 https://en.wikipedia.org/wiki/Preboot_Execution_Environment


Grease Monkey ~~ GM
Posted in Experience | Tagged | Leave a comment

Software Distribution

My OS/2 experience allowed me to land a gig with a regional bank doing Software Distribution (and installation) over their private banking network. This was sometime in 1996 and the Internet was not yet widely used or secured.

The bank had an IBM mainframe and we ran a package called "Distribution Manager." It allowed us to create packages of software along with the software's automatic installation routines. All of the IBM software used "response" files that allowed us to do silent installations. Packages could be bundled into Groups and Groups into Profiles. This allowed us to push updates that included many Packages by selecting only 1 Profile.

We were able to remotely install and configure all of the software on all of the systems in the bank's branches. Each branch had 1 large server that controlled the local PCs and provided a gateway for Distribution Manager. It was a pretty complex setup and worked very well.

We maintained a “Golden” image for each region’s servers and we could remotely rebuild the server when drives eventually failed. At one point I created a disk duplication system that used an external Compaq RAID controller and a Linux floppy disk that had the dd and tee commands. We could blast 8 copies at a time.
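
The trick was nothing more exotic than chaining dd through tee. A rough sketch of the idea, using modern bash process substitution and made-up device names (the real rig fanned out through the Compaq RAID controller):

```bash
#!/bin/bash
# Disk duplication sketch: read the master drive once and fan the stream
# out to several target drives at the same time. Device names are illustrative.

dd if=/dev/sda bs=1M status=progress \
  | tee >(dd of=/dev/sdb bs=1M) \
        >(dd of=/dev/sdc bs=1M) \
        >(dd of=/dev/sdd bs=1M) \
  | dd of=/dev/sde bs=1M

sync   # flush write buffers before pulling the drives
```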

I took a brief contracting gig where I was supposed to be doing Software Distribution for a large energy company. On my first day I was informed that my boss and teammate were both out for a week. It got worse from there. After 6 months of nothing to do, I took a programming gig for a credit union. Programming Windows software makes me physically ill. I went back to the bank after 1 year.

I bounced around some roles at the bank as a DBA and Team Lead.

Politics in some places is brutal, so I went to work for myself… sort of.


Grease Monkey ~~ GM
Posted in Experience | Tagged | Leave a comment

My Second Start-up

Sometime in 1991 I got in contact with Gary and Phil. They were working on a new idea and needed some programming done.

Fax machines were extremely popular. 1990 Census data was becoming available. We had contacts at The Ohio State University. Let's mash them together.

Product 1: Small Business Reports. Working with the Chair of the Business school at OSU, we asked, "If you have these data, what would you look at to determine if a market is over- or under-served for a given retail type?" Based on Population and Business Census data, we created 2-page reports that hit the highlights of the analysis in brief. We also had a 6-page option that was much more conversational.

The idea was that these reports would be generated on the fly using rules given by our business experts. They would then be sent via fax to the buyer. Given the various business codes and regions available, we had something over 50 million possible reports. I contacted the large fax providers at the time and they said they could host them; we'd just need to fax them the documents and they'd deliver on demand. When I told them we needed to generate the reports on demand, they said that wasn't possible.

I wish they had told me that before I did it. The POC was on a 286 with a Bigmouth board and an Intel SatisFAXtion card. I was able to write an app that answered the phone via the Bigmouth, played a message, processed DTMF input and followed a process for getting all of the required info, including the destination fax number and credit card info. After the data was captured, a process was kicked off to analyze the data and create the reports. I got to create a template language so I could properly format the report independently of the data. The report was converted to fax format and sent to the customer. The entire process took about 45 seconds. Since the fax was never scanned, the image quality was 100%.
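
The template language was just placeholders that got filled in once the analysis finished. As a much-simplified modern illustration of the idea (the placeholder names and files here are invented, the real thing was Pascal on DOS):

```bash
#!/bin/bash
# Tiny template-fill sketch: substitute analysis results into a report
# template. Placeholder names and values are invented for illustration.

# report.tpl might contain lines like:
#   Retail type {TYPE} in ZIP {ZIP} appears {VERDICT}.
#   Households in trade area: {HOUSEHOLDS}

sed -e "s/{TYPE}/florists/" \
    -e "s/{ZIP}/43215/" \
    -e "s/{VERDICT}/under-served/" \
    -e "s/{HOUSEHOLDS}/18,240/" \
    report.tpl > report.txt
```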

We eventually found a company in LA that understood the process and we started working with them. Thanks Ned. 🙂

Product 2: Similar to SBRs, Franchise Fax was about providing uniform and comparative franchise information. We got access to a database of UFOCs (Uniform Franchise Offering Circulars), so we did the same kind of thing.

Product 3: Import/Export Index. The US government hosted a site for import/export requests. You could subscribe and get a free index of the day's items, then call and request additional details for any of the indexed items and we'd send them along.

I got to do a lot of Pascal development, a little bit of C and tons of DBF stuff. OS/2 Warp came out and I converted my system immediately.

We worked on a few other projects, but I wasn't directly involved. Next up: Software Distribution.


Grease Monkey ~~ GM
Posted in Experience | Tagged | Leave a comment

My First Start-up

Impact Resources: I started in 1989 when they had outgrown their first office condo. I had to work from home for a few weeks as there was no room for me in the office. I had my trusty PS/2 Model 30 and I was ready to roll.

The company did Consumer Behavior Research by means of surveying people in shopping malls and strip centers. We collected 400-ish data points per survey and paid an entire dollar for your time. Things like where do you shop for X? Do you prefer value, quality, name brand? What radio station do you listen to? What type of music do you like? What kind of car do you drive? And how much money do you make?

All these data points were hand-keyed into PCs and stored on floppy disks. (We didn't have fancy LANs at the time.) The data from the disks was loaded onto a large (300MB) hard drive and summarized. There were several steps involved in "normalizing" the data, and until I was hired, that process involved putting the data on tape, sending it out for processing and restoring from tape when it was returned. It usually took about 2 weeks to turn this around.

When it was all done, you could do some pretty cool queries. You could, for example, show that although the pop radio station was Arbitron #1 in every time slot, it was listened to by 17-24 year old girls with $0.50 of disposable income per week. Meanwhile, the country station attracted 30-60 year olds with average income > $100,000 and lots of disposable money. As a business owner, you could find out why people shop, or don't, at your business and update your messaging. Then find an avenue to reach them. As a media outlet you could find your listeners and craft a sales pitch for local businesses to advertise.

We did the top 50 Metropolitan Statistical Areas in the US, so 50 markets to sell local results. Then we normalized those results into a nationwide view in a product called MART USA. MART: Market Audience Research Traffic.

The "secret sauce" for MART was the adaptable weighting of response records. If you selected a subset of the data, it was weighted on the fly to more accurately represent each record. In my testing, the reality was that you could pre-weight the records (weighted averages based on demographic data) and be within 0.05% of the on-the-fly method. The weighting method was built into the query tool, and it was written by some professor in Utah. He wrote it in Pascal, which was very in vogue at the time. Since it was written in Pascal, the entire data refinement process had to be as well. (It didn't have to be, but the founders didn't know that.)

So I wrote a series of programs that took the data through several stages of transformation. Each one read a schema from disk, then read a record and applied as many changes as it required. Most of these were things like changing your store of choice from "Mom's attic" to "other". Sometimes we redacted income: if you're 17, you probably don't make $2M a year.
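
In spirit it was a little rules engine: read a rule, apply it record by record. A hypothetical modern miniature of the same idea, where the field positions and file names are invented for illustration (the real programs were Pascal):

```bash
#!/bin/bash
# Normalization sketch: apply simple cleanup rules to survey records.
# Assumes pipe-delimited records with age in field 3, income in field 4,
# and store-of-choice in field 7 -- all invented for illustration.

awk -F'|' -v OFS='|' '
  {
    # Collapse one-off store answers into "other".
    if ($7 == "Moms attic") $7 = "other"

    # Redact implausible income, e.g. a 17-year-old reporting $2M a year.
    if ($3 < 18 && $4 > 1000000) $4 = ""

    print
  }
' surveys_raw.txt > surveys_clean.txt
```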

We were able to reduce the processing time from 2 weeks to 2 hours. Not bad.

In addition to developing the database processing, I got to play network admin. We bought a "big" server and installed NetWare 2.15c. It was a forward-thinking version, as it came with both 5.25″ and 3.5″ floppy disks. We spent about $10k on the server, as it had 8MB RAM and two 300MB SCSI disks. By the time I left we had 2 servers, 1 Dev and 1 Prod. Each had 3x 300MB disks, for nearly 2GB of data.

What happened: We got a buyout offer from Advo (https://www.referenceforbusiness.com/history2/5/ADVO-INC.html) and everyone was pretty excited. The weekend before the final signing, the venture capital guys decided to dilute the founders' stock and fire them. When the Advo team showed up to sign, they asked about Gary and Phil. When they were told about the firings, Advo pulled the deal. 6 months later, Impact Resources was out of business.

I was able to jump out before the collapse and ended up at my second start-up.


Grease Monkey ~~ GM
Posted in Experience | Tagged | Leave a comment

If you need to watch “your” people work, you’ve failed.

Remote workers are not a new concept. According to every news outlet, this is the first time in history people have "had" to work from home.

Let that sink in… had to work from home?

I thrive on it! I get to do things that profit my business and I don't commute. Less CO2 for the universe… yay. Whatever. I have a car that will resell for 10% more because of lower miles.

Back to the point. I used to work for a regional bank. My job was in jeopardy because I wouldn't move to another city where the new boss lived. He wanted to be able to walk out of his office and literally see his employees! That meant everyone in one city, one building and under his eye. Megalomania aside, who in the fuck let this concept happen?! He was 4 levels down (at least) from the C-suite. And what fucking C-level would pull this shit?

I took the payoff. I got re-hired by the same bank, because I had an awesome boss* who knew how to navigate the corporate BS and maintain levels. Hilarity ensued and we grew older. The point of the story is:

Why do businesses allow bad actors to thrive? The knee-jerk reaction is profit! FTW! Sadly, I think it's more about fear and self-preservation. We're afraid to call someone out in the corporate world… because they might have connections, be protected or whatever.

I feel like I'm speaking out of both sides of my mouth with that last bit. I've always been a "just do your job very well and be rewarded" kind of guy. After 35+ years of IT, I regret that this is not always true. That's not to say I eschew capitalism; it just means I didn't know terrible people could be so smart.

My bad. I apparently forgot everything I know about human behavior.


Grease Monkey ~~ GM
Posted in Opinion | Leave a comment

Hadoop Irrelevance

Has Hadoop gone the way of COBOL? Still used by some crusty installations, but past its prime for relevance?

I was a cloud skeptic. I still am, but… the world has embraced cloud. For reasons beyond my understanding, it is now Generally Acceptable to run your business on any of the 3 major cloud providers. Generally Acceptable here has a legal meaning that implies you aren't incompetent or negligent in moving your system there.

As a former systems administrator, I think it’s entirely crazy to put all of your stuff in a cloud, because of single points of failure. 1 company, 1 SPOF.

What does this have to do with Hadoop? Only that the first round of treatment for ailing Hadoop installs was Hadoop in the Cloud. EMR, HDI and whatever the Google version is called promised the same things Hortonworks and Cloudera did for on-prem installs, but eliminated "the problem" of building and maintaining it yourself. Just spin it up and start Hadooping, they said.

"So AWS, how do I submit jobs to my new EMR cluster?" Just like always! Just create these ssh tunnels and… Right… https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-connect-master-node-ssh.html
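
For the record, the "just create these ssh tunnels" dance looks roughly like this. The key file and master node DNS name are placeholders:

```bash
# Forward the YARN ResourceManager UI (port 8088) on the EMR master node
# to a local port so you can watch jobs from your laptop.
ssh -i ~/my-emr-key.pem -N \
  -L 8157:ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8088 \
  hadoop@ec2-XX-XX-XX-XX.compute-1.amazonaws.com

# Then browse to http://localhost:8157 ... and repeat for every other UI or port you need.
```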

So the hype busted again and CEOs started hearing about Spark and Kafka. They don't know how it's any different, but it sounds cool and everyone is talking about it.

The reality is that Spark and Kafka really are different from Hadoop, and they solve different problems. The old elephant was designed for large, batchy problems. You (Netflix) can work some magic to do more than batch, but it requires actual knowledge and design. Most enterprise Hadoop customers don't like that; they just want silver bullets.

So Kafka is fast and so is Spark and they can “stream” things together and be the cool kids at the table for a while. It’s much more complex than most people need, but it’s buzzy.

Meanwhile, our good buddies at the Bezos Circus have been stealing from Google again. (Actually, Google gave this one away… probably to track your movements.) EKS is born, and now you can run Spark on k8s against your cloud object store. And this actually works!

It's dynamic compute (on demand, even) with low-cost, reasonable-performance storage. Learning how to spec out a Spark job will take a while, but that's on the Dev and/or Data Science side. Because Spark is parallel by design, it scales out pretty quickly. Because cloud providers own the network connectivity, we don't have to worry about building out data and compute racks with high-speed interconnects.
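
For context, a bare-bones spark-submit against a Kubernetes cluster looks something like this. The API server address, image, namespace, class and bucket are placeholders, and a real job will carry plenty more conf lines:

```bash
spark-submit \
  --master k8s://https://my-cluster-api.example.com:6443 \
  --deploy-mode cluster \
  --name etl-sketch \
  --conf spark.executor.instances=4 \
  --conf spark.kubernetes.container.image=myrepo/spark:3.1.1 \
  --conf spark.kubernetes.namespace=spark-jobs \
  --class com.example.ETLJob \
  s3a://my-bucket/jars/etl-job.jar
```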

There are still pain points: Security Groups, IAM roles, internode communication and that pesky "How do I connect again?" But once you solve them, they're easy to copy. A few software companies will even build your cluster for you on demand. Databricks and Dataiku come to mind.

GCP seems to be the least painful place to implement this design. AWS comes in a distant second, with much pain, gnashing of teeth and circular documentation that doesn't really answer the question you asked. Microsoft Azure comes in dead last in ease of use. You'll consider breaking your own hand as a reason you couldn't work on that project. You'll consider it again. Microsoft really does suck.

At the end of the cycle, Hadoop will still have some devout users who understood what it was meant for and had success using it correctly. Those users will probably also have Spark on k8s, because it solves a different problem. So my prediction is that yes, Hadoop will survive, but much the way COBOL has. Probably not for as long, though…

Followup: https://www.theguardian.com/technology/2021/dec/15/amazon-down-web-services-outage-netflix-slack-ring-doordash-latest

https://www.zdnet.com/article/aws-misfires-once-more-just-days-after-a-massive-failure/


Grease Monkey ~~ GM
Posted in Administration, Opinion | Tagged , , | Leave a comment

Dear Samsung, Google, LG and other Phone Makers

STOP MAKING PRETTY PHONES! I am so sick of hearing about beautiful design, stylish curves and all glass. WTF do you think this is? A Beauty contest? Pretty phones break really easily. Plus the first thing I do to my “pretty” phone is wrap it up in a protective case. No one ever sees the phone!

Give me a ruggedized phone w/ mobile pay and SD Slots. A card holder would be nice, but I’m not gonna push it.

STOP MAKING PRETTY PHONES! They’re pocket computers anyway.


Grease Monkey ~~ GM
Posted in Opinion, Rant | Leave a comment

Rebuilding the site

I’m working on a rebuild of the site. If nothing else, I like being able to look up some of the old articles. I might even post some new ones.


Grease Monkey ~~ GM
Posted in Administration | Leave a comment

The Data Cluster Administrator

This was written and intended to be published on May 31, 2017. Sorry for the delay?

Hadoop Admin, Elasticsearch Admin, GlusterFS Admin, etc. What's the difference? Subtle details regarding peer-elected leaders vs. dedicated leaders, etc., but at the end of the day clusters are clusters and data is data.

So, although I have been a Linux Admin, DB Admin, Hadoop Admin and semi-AWS Admin, I’m going to re-brand. Welcome to the brave new world of Data Cluster Administration.

What does a Data Cluster Administrator do? Just about all of the above.


Grease Monkey ~~ GM
Posted in Opinion | Leave a comment

Progress and Setbacks — Hadoop 3.0 breaks things

241 properties have been deprecated in Hadoop 3.0. Is that enough change for you? Dozens of sites have dedicated space to discovering and explaining all the new goodness in Hadoop 3.0. I haven’t read any that discuss the problems that this massive overhaul brings to Hadoop stability. It may be that I don’t read enough?

Way back in Dec. 2017, The Apache Software Foundation announced GA of 3.0. Hortonworks waited until v3.1 before releasing HDP 3.0 in the summer of 2018. Now, vendors are trying to decide when the massive changes are worth implementing. Hadoop may be on 3.1, but HDP is 3.0, and I know very few enterprise-class users who will be running it in production. This is part of the problem.

I'm personally excited about advancements in software and especially Hadoop. YARN now supports Docker containers, the NameNode can now be split (is that different from federated?) and there are other good things that you can read about in other blogs. I am not, however, an enterprise. I also don't have to develop software that needs to contend with all of these changes.

I have friends who develop software for Hadoop, and I do not envy them. They must now incorporate and test 241 property name changes in their code and maintain backward compatibility. That does not include the myriad changes required to deal with other deprecated pieces. For example, the MapReduce engine for Hive is no longer an option. That flat out breaks a piece of software I use. I'm sure the Hadoop devs had good reason for removing it, but it does cause issues, and we've seen this kind of thing before.
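
Concretely, the client-side fix is usually just forcing the execution engine to Tez at the session level; something like this, where the connection string is a placeholder:

```bash
# Hive on HDP 3.0-era stacks no longer accepts the MapReduce engine,
# so explicitly ask for Tez when the client still defaults to mr.
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default" \
  --hiveconf hive.execution.engine=tez \
  -e "SELECT count(*) FROM some_table;"
```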

Surely Hadoop could not fall into this trap. The Python guys (gender neutral) have been battling with broken upgrade paths for years. Lots of years. I just hope that Hadoop can overcome this divide; otherwise I feel its market share will continue to erode and deepen the trough of disillusionment.

You may ask, “What can I do to help prevent this erosive divide?” I guess the only thing you can do is push for 3.0 migrations. Hadoop definitely needs to grow or it will fade away. Application developers and integrators are going to need to bite the bullet and spend time and money on refactoring their code. The only way that will happen is by applying the pressure of customer demands. Customers need to communicate to their vendors that they have a timeline for migrating to Hadoop 3.x and they need to know their vendors will have a functional version by then or be replaced.

I'm sure your mileage will vary, but my first experience connecting software to an HDP 3.0 stack did not go well. Nothing exploded, but nothing worked either.


Grease Monkey ~~ GM
Posted in Opinion | Leave a comment