Monthly Archives: December 2012
Life on the edge of data node writes
If you’re serious about using Hadoop you should subscribe to the User Mailing Lists. They are a great source of insight into how things are performing, new features, and common problems. I’m currently working on a JIRA to clarify documentation … Continue reading
HCatalog – Embrace the independence
Codd’s Rule 9: Logical data independence: Changes to the logical level (tables, columns, rows, and so on) must not require a change to an application based on the structure. Logical data independence is more difficult to achieve than physical data … Continue reading
Trivia for Christmas
To start the ball rolling, what was the quote on the wall in “It’s a Wonderful Life” at the beginning of the “Run on the banks”? It’s a fairly famous quote and should be much more popular than it is. … Continue reading
The Ten Commandments of Hadoop (Work In Progress – feel free to edit)
Thy namenode shall always persist (We will have multiple recovery methods) We accept data as is (Come as you are) We love and expect metadata (Nothing enters, changes, or exits without metadata) The family tree will always be maintained. (We … Continue reading
EXT4 vs. XFS for HDFS
HDFS sits atop a local filesystem. The FS type used can impact the performance and resilience of the HDFS cluster, so picking the right one is important. According to the Hadoop Wiki, either of these is acceptable. Yahoo tends to use XFS (or … Continue reading
Thought of the day. Storing Confio detailed data in Hadoop?
If you manage a lot of Oracle, M$SQL, or DB2, you need to check out Confio. It’s an agentless performance trending tool that will make your life much easier. It tracks every SQL run by user/client machine/SQL statement/etc and … Continue reading
Hello, McFly?
Should you really trust a data integration team that spends 30 minutes arguing that they SHOULDN’T include all source system customer attribute data in MDM?
Confessions of a data architect
My name is DataG and I’m a data modeler. It’s been 6 weeks since my last star-schema. Let’s face it. Codd, Imhoff, Inmon, and Kimball paved the way for almost every data analyst and app-dev professional since the relational model … Continue reading
You forgot to WHAT?!
Silly Admin, you forgot to Rack Aware Enable your Hadoop cluster. Now you’ve got all of your data in Rack 1. Lucky for you, there’s a way to fix it. Create and configure your rack aware script and restart your … Continue reading