Thought of the day. Storing Confio detailed data in Hadoop?

If you manage a lot of Oracle, M$SQL, or DB2, you need to check out Confio. It's an agent-less performance trending tool that will make your life much easier. It tracks every SQL statement that gets run, broken down by user, client machine, SQL text, and so on, and trends those over time.

So I was listening to an IT security architect give an informative presentation, and the topic of SQL injection and RDBMS security in general came up. It got me thinking: if an enterprise really wanted to keep tabs on all of the goings-on in its databases for forensic purposes, why not leverage Confio, which is already pulling that data?

Confio typically rolls the detailed data up into aggregates after 30 days, so there is plenty of time to snag the detail and store it in HDFS.
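A minimal sketch of what that snag could look like, assuming the detail rows have already been pulled out of the Confio repository by some query of your own. The row shape (`sampled_at`, `db_user`, `client_machine`, `sql_text`), the function names, and the `year=/month=/day=` directory layout are all hypothetical choices for illustration, not anything Confio defines. The 30-day window is the only number taken from above.

```python
import csv
from collections import defaultdict
from datetime import datetime, timedelta
from pathlib import Path

# From the post: detail is typically rolled up after about 30 days.
DETAIL_RETENTION = timedelta(days=30)

# Hypothetical column names; map these to whatever your export query returns.
FIELDS = ["sampled_at", "db_user", "client_machine", "sql_text"]


def archive_deadline(sampled_at):
    """Latest moment a row can still be exported before it is rolled up."""
    return sampled_at + DETAIL_RETENTION


def partition_dir(base, sampled_at):
    """Date-partitioned directory, e.g. base/year=2006/month=03/day=13."""
    return (
        Path(base)
        / f"year={sampled_at:%Y}"
        / f"month={sampled_at:%m}"
        / f"day={sampled_at:%d}"
    )


def archive_rows(rows, base):
    """Append rows to one CSV per day and return {file path: rows written}."""
    by_day = defaultdict(list)
    for row in rows:
        by_day[partition_dir(base, row["sampled_at"])].append(row)

    written = {}
    for directory, day_rows in by_day.items():
        directory.mkdir(parents=True, exist_ok=True)
        out_file = directory / "detail.csv"
        is_new = not out_file.exists()
        with out_file.open("a", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS)
            if is_new:
                writer.writeheader()
            for row in day_rows:
                # Store timestamps as ISO 8601 text so they sort and parse cleanly.
                writer.writerow({**row, "sampled_at": row["sampled_at"].isoformat()})
        written[str(out_file)] = len(day_rows)
    return written
```

The sketch writes to a local staging directory. From there the tree could be copied into the cluster with something like `hdfs dfs -put`, and the per-day layout keeps a question such as "what ran on a given date" down to reading one small directory.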

You could even pull Teradata’s dbc.qrylog while you're at it.

Imagine knowing what SQL statements you ran 3+ years ago.  Cool.

6 Responses to Thought of the day. Storing Confio detailed data in Hadoop?

  1. I certainly see the value of storing and evaluating queries and performance, but “Cool?” That's some serious data geekiness. 😉

  2. DataG says:

    It's cool if I want to know who accessed a database at 1am on 3/13/2006 and what they did.

  3. jbattisti says:

    I did the math on how much data that will be if we keep all of it for seven years and have three copies: a little over a petabyte. We probably want to think about having an archive cluster with fewer copies and denser nodes.

  4. DataG says:

    Interesting growth numbers. This makes me think that we should still consider aggregation in our data even though we have a bigger train. So after N years we roll it up by day/week/lunar cycle. Perhaps a new commandment?

  5. PizzaBoy says:

    This would be even more interesting if we could instrument some alerts and triggers when we saw certain questionable behavior.

  6. PizzaBoy says:

    It could also be useful for analyzing the types of SQL we are seeing and the growth of a particular type over time. That could help us focus on warning the development community about bad SQL, on simply solving big problems across the board, or on spotting a change in the character of a database from, say, primarily reads to a ton more updates.
