Author Archives: Grease Monkey

About Grease Monkey

Computer nerd since the '80s. Data nerd since the '90s. Generic nerd for a lifetime.

Kafka on AWS EC2 w/ SSL and External Visibility

I’m truly shocked by how difficult this information is to gather up in 1 place. Maybe because AWS has their own version of Kafka functionality. At any rate, after much reading and irritation I have it working. There is still … Continue reading


Drilling thru Multiple Clusters

…or Using Apache Drill to join data across discrete domains. We’ve been doing some work with Redshift lately. While it’s an effective tool for storing and crunching thru large amounts of structured data, it’s limited by a few “-isms” that … Continue reading


T vs. V and W Shaped People

We talk a lot about hiring T shaped people at my current gig and I think it’s a misnomer for a couple of reasons. First, it implies a ratio of depth to width that is askew. Developers and Admins in … Continue reading


System Administration Rules to Live By

I’ve had a variation of these running around for a while. Tweaks may come and go with trends, but the concepts are the same. When they say “Go Big!” they don’t mean it. Start with optimistic scripts. Finish them defensively. … Continue reading


A wonderful, ugly script that just keeps working

Today we’re going to look at parts of a complex “nudge” script, as I’ve described previously. It has a few more bells and whistles, and it constantly amazes me how well it adapts. I’ll show the good bits in sections so … Continue reading


The 3 Question Test

A burger and fries costs $1.10; the burger costs $1 more than the fries. How much do the fries cost? 5 servers can sort 5 TB of data in 5 minutes; how long would 100 servers take to sort 100 … Continue reading


Experimenting w/ Neo4j

Graph databases are a really neat concept. We’ve started playing with Neo here as we attempt to link customers with visits and actions based on those visits. It seems like a really good fit at first glance. Our challenge is … Continue reading


Just give it a nudge.

The second definition of nudge, according to Webster, is to “prod lightly: urge into action.” We use that concept in our data environments for various long running processes; for things that we want to happen frequently, but with an unknown … Continue reading


Redshift ups and downs

AWS Redshift has been popular lately around my current gig. We’ve got a couple of clusters in use and a few more in POC mode. For the in-use clusters, pre-paid instances are easy to justify. A few thousand dollars and you … Continue reading


Quick Split to Fix data silliness

We have a vendor sending us daily updates on shipping info. We have a well known and defined structure for each type of data and those types map neatly to tables in our database. We have about 9 tables that … Continue reading
