Backup, Replicate & Multiplex – Redundancy Helps

Flashback! It's November 2007; I am just about three (3) months into my new job as a DBA, and one of the few things I am tasked to do is babysit an OLTP system with 256GB of RAM, a 15TB Oracle database and 32 multicore processors –an HP ProLiant, a beast, nothing quite like anything I had ever worked with. The system basically maintained the prepaid subscriber lifecycle process –dynamically computing daily subscriber additions and churn figures by making use of RGE events. So thirty (30) minutes after arriving at the office, I accidentally truncated the table that stored historical details of all churned subscribers, going as far back as 1998 (9 years' worth of data). We had a very rudimentary backup mechanism at the time –daily Oracle exports of “critical” aggregated tables. I spent the whole day trying to figure out what to do, and I never told anyone what I had done 😉 The good thing is I managed to restore the previous day's hot backup and then performed a very primitive roll-forward of all that day's transactions. That day is probably amongst the few days of my past life I have chosen to remember, because it taught me a number of lessons, the primary one being: backup, replicate and multiplex data –“backups, backups and more backups”.

The sad part of it all, though, is that prior to that incident I lost four (4) years of data I had gathered whilst at varsity –between 2003 and 2007– when the disk controller for my 350GB Seagate hard drive (see snapshot below) got fried. I learnt my lesson the hard way. As a side note, I still keep the Seagate hard drive in the hope that I will one day manage to retrieve my data (a twisted form of cryopreservation, perhaps?).

350GB of data gone – cryopreserved, perhaps?

I am certainly not the first person to accidentally lose data. A few moons ago, Google accidentally deleted about 150K Gmail accounts, and not so long ago a Flickr employee accidentally wiped a user's 4K photos… the list is endless, but the moral of the story is: backup, replicate and multiplex the data you deem important –and “disk only backups” are risky. I back up important data incrementally, on a daily basis, and then I perform a full backup on the first day of every month. The incremental backups are multiplexed on my 1.5TB hard drive and on 4 different computers –mufasa, nightmare & my computer at the laboratory (all located in the same building on campus) and my laptop at home. On a monthly basis, I make a full/complete backup that I archive to DVD. Not only that, I also use Dropbox and Google Docs to store academic-related documents. Shit! Could I be obsessed :^)
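For the curious, the gist of that rotation is simple enough to script. Below is a rough Python sketch of the idea –daily incrementals hard-linked against the previous day's snapshot, a clean full copy on the 1st of the month, pushed to each mirror via rsync. The paths (and the SSH setup for mufasa and nightmare) are placeholders, not my actual configuration:

```python
#!/usr/bin/env python3
"""Rough sketch of a daily-incremental / monthly-full rsync rotation.
Assumes rsync is on the PATH and SSH access to the remote mirrors;
every path below is a hypothetical placeholder."""

import subprocess
from datetime import date

SOURCE = "/home/me/important/"        # hypothetical source directory
MIRRORS = [
    "/media/backup-1.5tb/snapshots",  # local 1.5TB drive
    "mufasa:/backups/snapshots",      # machines named above; the
    "nightmare:/backups/snapshots",   # paths are made up
]

def snapshot(mirror):
    today = date.today()
    dest = f"{mirror}/{today.isoformat()}"
    cmd = ["rsync", "-a", "--delete"]
    if today.day != 1:
        # Incremental: hard-link unchanged files against yesterday's
        # snapshot, so only changed files consume new space. A relative
        # --link-dest path is resolved against the destination directory.
        yesterday = today.replace(day=today.day - 1)
        cmd.append(f"--link-dest=../{yesterday.isoformat()}")
    # On the 1st there is no --link-dest, so a full copy is written.
    subprocess.run(cmd + [SOURCE, dest], check=True)

for mirror in MIRRORS:
    snapshot(mirror)
```

The nice thing about the --link-dest trick is that every dated directory looks like a full backup, but unchanged files cost nothing beyond a hard link.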

My rudimentary backup strategy

My primitive backup strategy is illustrated above. I basically use a combination of rsync and Git to replicate my information –the only thing I have not yet started doing is periodically validating my backups. It's probably something I would have to seriously plan for.
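If I ever do get around to it, validation need not be fancy. Here is a rough sketch of one possible approach –hash every file under the source, hash the corresponding snapshot, and flag any mismatches. Again, the paths are placeholders:

```python
#!/usr/bin/env python3
"""Sketch of the missing validation step: compare SHA-256 digests of the
source tree against a snapshot of it. Paths are hypothetical."""

import hashlib
import os

def manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Hash in 1MB chunks so large files don't exhaust memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests

def verify(source, snapshot):
    """Return relative paths whose digests differ or are missing."""
    src, snap = manifest(source), manifest(snapshot)
    return [p for p, d in src.items() if snap.get(p) != d]

bad = verify("/home/me/important", "/media/backup-1.5tb/snapshots/2011-03-01")
print("all good" if not bad else f"{len(bad)} files differ: {bad[:5]}")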