Author Archives: Grease Monkey

A wonderful, ugly script that just keeps working

Today we’re going to look at parts of a complex “nudge” script like the one I’ve described previously. It has a few more bells and whistles, and it constantly amazes me how well it adapts. I’ll show the good bits in sections so … Continue reading

The 3 Question Test

A burger and fries costs $1.10; the burger costs $1 more than the fries. How much do the fries cost? 5 servers can sort 5 TB of data in 5 minutes; how long would 100 servers take to sort 100 … Continue reading

Experimenting w/ Neo4j

Graph databases are a really neat concept. We’ve started playing with Neo here as we attempt to link customers with visits and actions based on those visits. It seems like a really good fit at first glance. Our challenge is … Continue reading

Just give it a nudge.

The second definition of nudge, according to Webster, is to “prod lightly: urge into action.” We use that concept in our data environments for various long-running processes: things that we want to happen frequently, but with an unknown … Continue reading

Redshift ups and downs

AWS Redshift has been popular lately around my current gig. We’ve got a couple of clusters in use and a few more in POC mode. For the in-use clusters, pre-paid instances are easy to justify. A few thousand dollars and you … Continue reading

Quick Split to Fix data silliness

We have a vendor sending us daily updates on shipping info. We have a well-known, defined structure for each type of data, and those types map neatly to tables in our database. We have about 9 tables that … Continue reading

MapR is a better base?

I’ve heard about MapR for a long time and haven’t given it much consideration vs. OSS stacks. I’m reconsidering my position and conducting some evaluations. Why? MaprFS is a real POSIX file system that runs on raw devices, not atop … Continue reading

Ubuntu sucks and so does Debian

Full disclosure: I cut my teeth on Slackware and Red Hat in the mid-’90s. I even tried Yggdrasil once. That being said… I fully fail to understand the allure of Ubuntu or its Mommy distro, Debian. Yes, I know Ubuntu … Continue reading

Cloud Hadoop? Buzzword Fiesta!

We haven’t quite jumped the shark yet, but this is going to be full of buzzwords. Started a new gig where we’re building Dev, POC and possibly some prod clusters on AWS. Once again the first 80% of this was … Continue reading

Hadoop 2.0 GA

I’ve been watching the Hadoop user mailing lists and Jira counts. It sure seems like 2.0 GA is more like 2.0 Beta 1. I’m looking forward to RC 1 before we move it into a serious cluster. Just my $0.02.
