July 2, 2013

Notes and comments, July 2, 2013

I’m not having a productive week, part of the reason being a hard drive crash that took out early drafts of what were to be last weekend’s blog posts. Now I’m operating from a laptop, rather than my preferred dual-monitor set-up. So please pardon me if I’m concise even by comparison to my usual standards.

*Basic and unavoidable ETL (Extract/Transform/Load) of course excepted.

**I could call that ABC (Always Be Comparing) or ABT (Always Be Testing), but they each sound like – well, like The Glove and the Lions.


7 Responses to “Notes and comments, July 2, 2013”

  1. Igor on July 2nd, 2013 5:19 pm

    Vertica for $2K/terabyte for hardware/software combined? Really? What kind of hardware are you talking about given that bare SAS drives cost $500/GB or more (and over $1,000/GB in RAID1)?
    Or is it really about $2M/petabyte, which not many need or can afford?

  2. Igor on July 2nd, 2013 5:26 pm

    I meant (of course) $500/TB – not $500/GB

  3. Curt Monash on July 2nd, 2013 6:04 pm

    Yes, $2 million/petabyte, and that would be with reasonable assumptions about compression.
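The arithmetic behind this exchange can be sketched as follows. This is a rough illustration, not Vertica pricing detail: the $2K/terabyte, $500/TB, and RAID1 doubling figures come from the comments above, while the compression ratios and the helper name `disk_cost_per_user_tb` are hypothetical values chosen only to show how compression changes the picture.

```python
TB_PER_PB = 1000  # decimal units: 1 petabyte = 1,000 terabytes

# Figures quoted in the comments above.
system_price_per_tb = 2_000   # hardware + software, dollars per TB
raw_disk_cost_per_tb = 500    # bare SAS drives, dollars per TB
raid1_mirror_factor = 2       # RAID1 stores every block twice


def disk_cost_per_user_tb(raw_cost, mirror_factor, compression_ratio):
    """Disk spend needed to hold one TB of uncompressed user data.

    compression_ratio is user-data size divided by on-disk size,
    so 4.0 means the data shrinks to a quarter of its original size.
    """
    return raw_cost * mirror_factor / compression_ratio


price_per_pb = system_price_per_tb * TB_PER_PB
print(f"System price per petabyte: ${price_per_pb:,}")

# Hypothetical compression ratios, for illustration only.
for ratio in (1.0, 2.0, 4.0):
    disk = disk_cost_per_user_tb(raw_disk_cost_per_tb, raid1_mirror_factor, ratio)
    share = disk / system_price_per_tb
    print(f"compression {ratio:.0f}x: disk ${disk:,.0f}/TB ({share:.0%} of system price)")
```

With no compression, mirrored disk alone would take half of a $2K/terabyte budget; the higher the assumed compression ratio, the smaller that share becomes.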

  4. Alan Musnikow on July 8th, 2013 10:27 pm

    With regard to “a hard drive crash that took out early drafts of what were to be last weekend’s blog posts:”

    I do all writing and coding in a Dropbox folder so that each file is copied to the cloud within seconds after it is saved to the local disk, as long as the laptop or desktop is connected to the internet. When the hard disk on my Ubuntu laptop failed, I was able to recover from Dropbox the last saved version of everything on which I was working. The free version of Dropbox is more than sufficient for months of my work.

    Dropbox also downloads any newer version of every file in the Dropbox folder to each of my laptops and desktops, including those running Windows 7 and XP, within seconds after I turn one on. This allows me to start working on a file on one PC and continue working on the file on another PC.

    I use free accounts at MiMedia, Mozy, SkyDrive, and Ubuntu One for more extensive backups.

    Nevertheless, a hard drive crash is a pain in the neck, or lower down.

  5. Wayne Thompson on July 10th, 2013 9:41 am

    Enjoyed the post. I too am seeing clients do more stratified modeling. Rather than clustering, I like to use the target (response) variable and a supervised (predictive) technique like a shallow decision tree to define initial segments (aka clusters). I then develop a model in each segment, and compare the stratified models to a fit on the entire training data to evaluate potential lift / error reduction. Definitely also seeing more use of boosting methods that require less data preparation.
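The workflow described in this comment can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the commenter's actual tooling: the data is synthetic, the "shallow decision tree" is reduced to a single split (depth 1) on one predictor, the per-segment models are simple least-squares lines, and all function names are made up for the example.

```python
from statistics import mean


def fit_line(points):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx = mean(x for x, _ in points)
    my = mean(y for _, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    b = sum((x - mx) * (y - my) for x, y in points) / sxx if sxx else 0.0
    return my - b * mx, b


def sse(points, model):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in points)


def spread(ys):
    """Squared error of values around their own mean (tree impurity)."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(points, min_leaf=3):
    """Depth-1 regression tree: the threshold on x that minimises the
    spread of the target y within the two resulting segments."""
    pts = sorted(points)
    best_err, best_threshold = None, None
    for i in range(min_leaf, len(pts) - min_leaf + 1):
        if pts[i][0] == pts[i - 1][0]:
            continue  # cannot split between identical x values
        err = spread([y for _, y in pts[:i]]) + spread([y for _, y in pts[i:]])
        if best_err is None or err < best_err:
            best_err = err
            best_threshold = (pts[i - 1][0] + pts[i][0]) / 2
    return best_threshold


# Synthetic training data: two regimes with different slopes, plus a
# small deterministic wobble standing in for noise.
data = []
for i in range(20):
    x = i * 0.5
    y = (1 + 2 * x) if x < 5 else (30 - x)
    data.append((x, y + 0.3 * ((i % 3) - 1)))

# Step 1: use the target to define segments via a supervised split.
threshold = best_split(data)
segments = [[p for p in data if p[0] < threshold],
            [p for p in data if p[0] >= threshold]]

# Step 2: fit one model per segment and one on the whole training set.
stratified_sse = sum(sse(seg, fit_line(seg)) for seg in segments)
global_sse = sse(data, fit_line(data))

# Step 3: compare to estimate the error reduction from stratifying.
print(f"split at x = {threshold}")
print(f"global SSE = {global_sse:.2f}, stratified SSE = {stratified_sse:.2f}")
print(f"error reduction = {1 - stratified_sse / global_sse:.1%}")
```

In practice the comparison would be made on held-out data rather than the training set, since splitting into segments always fits the training data at least as well.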

  6. More notes on predictive modeling | DBMS 2 : DataBase Management System Services on July 12th, 2013 4:37 am

    […] July 2 comments on predictive modeling were far from my best work. Let’s try […]

  7. Some stuff I’m thinking about (early 2014) | DBMS 2 : DataBase Management System Services on February 11th, 2014 5:20 pm

    […] have to be permanent, and in fact there are a number of fairly low-cost RDBMS offerings, such as petascale Vertica, the Teradata 1000 series, or […]
