Whenever I talk about backup, I remember my L1 days, when I used to interview for the position of “Desktop Engineer” and the interviewer would ask me about backup. I would try to recite the bookish definition, which even I didn’t understand. That brings me to a good point: never speak in an interview about things you yourself don’t understand. OK, so let’s get back to backup. Here is another simple definition for you: “Backup is nothing but a copy of data taken from one place and stored in another place, so that if there is a failure or data corruption at the source, you can restore the data from the place where you copied it”.
So, backup is nothing but a copy of data. This copy can be taken once a month, every week, every day or every hour, depending on the criticality of the application. If it’s a transaction-based system such as Core Banking, then you will definitely need to take the backup at least every day. You may be thinking that a backup once a day may not suffice: if a disaster or failure happens somewhere within those 24 hours (say at the 12th hour), you lose 12 hours of data, which is not acceptable. To tackle such a situation there are other mechanisms, such as NDR, which protect your data in these scenarios. If you are reading the sections in sequence, you already know what an NDR is and how it protects you against data loss.
So, coming back to backup frequency: it can be anything from once a month to once a week, once a day or even once every hour.
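To make the link between backup frequency and data loss concrete, here is a minimal Python sketch of the arithmetic. It assumes backups run on a fixed schedule and complete instantly; the function names and the dates are purely illustrative and not taken from any backup product.

```python
from datetime import datetime, timedelta


def last_backup_before(failure: datetime, first_backup: datetime,
                       interval: timedelta) -> datetime:
    """Return the most recent scheduled backup at or before the failure."""
    if failure < first_backup:
        raise ValueError("no backup exists before the failure")
    # timedelta // timedelta gives the number of whole intervals elapsed.
    completed = (failure - first_backup) // interval
    return first_backup + completed * interval


def data_lost(failure: datetime, first_backup: datetime,
              interval: timedelta) -> timedelta:
    """Window of changes that exists only on the failed source."""
    return failure - last_backup_before(failure, first_backup, interval)


if __name__ == "__main__":
    first = datetime(2024, 1, 1, 0, 0)     # assumed start of the schedule
    failure = datetime(2024, 1, 3, 12, 0)  # failure at the 12th hour of the day
    print(data_lost(failure, first, timedelta(days=1)))  # prints 12:00:00
```

With a daily schedule the failure at the 12th hour costs 12 hours of data. Moving to an hourly schedule caps the loss at just under one hour, which is why the frequency has to follow the criticality of the application.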
Although making copies seems simple, there are lots of complexities involved when it comes to backing up data on hundreds of servers, keeping the copies organized, and being able to restore only the required data instead of the entire copy. There are also particular challenges when backing up databases: you need a consistent backup copy of the database, otherwise a normal file copy would be useless for a restore. To mitigate all these challenges, enterprises normally go for a robust backup solution which takes care of such issues, is highly automated, and can do restores whenever you want and wherever you want. Some renowned backup solutions are Veritas NetBackup, Commvault, HP Data Protector, etc.
I would like to document here some principles which are used in the industry:
· For VM instances, the backup configuration would be snapshot backups, i.e. the entire VM is backed up along with its data as a snapshot at a particular point in time, which can be restored in case of a VM failure.
· For physical systems, it would be agent-based backup, where the file system is backed up as per the requirement given by the project team. An agent sitting on the physical server collects data from specific folders or file systems and copies it onto the backup server.
· For databases such as SQL Server and Oracle, the backup would be an online, agent-based backup over the network. This agent understands the database dynamics while taking the backup. If it used a normal copy process and simply started copying the files, the data would already have changed by the time the copy completed, and such a backup would be useless. So the agent takes a point-in-time snapshot of the database and backs up only the data up to that snapshot.
· Many times, we run the backups directly over the SAN rather than over the Ethernet network to achieve faster speeds.
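The point about database-aware backups is easier to see in code. The sketch below uses SQLite from the Python standard library purely as a stand-in: agents for SQL Server or Oracle use those databases' own interfaces, which are not shown here. The table name, file name and the helper names `online_backup` and `demo` are all made up for illustration.

```python
import os
import sqlite3
import tempfile


def online_backup(source: sqlite3.Connection, backup_path: str) -> None:
    """Copy a live database through the engine instead of copying its files."""
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup() goes through SQLite's online backup API, so the
        # copy is transactionally consistent even though the source is open.
        source.backup(target)
    finally:
        target.close()


def demo():
    """Return (rows in the backup, rows in the live database afterwards)."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, amount INTEGER)")
    source.executemany("INSERT INTO txn (amount) VALUES (?)",
                       [(100,), (250,), (75,)])
    source.commit()

    with tempfile.TemporaryDirectory() as tmp:
        backup_path = os.path.join(tmp, "txn_backup.db")  # hypothetical name
        online_backup(source, backup_path)

        # The application keeps writing after the backup point.
        source.execute("INSERT INTO txn (amount) VALUES (999)")
        source.commit()

        restored = sqlite3.connect(backup_path)
        try:
            backup_rows = restored.execute(
                "SELECT COUNT(*) FROM txn").fetchone()[0]
        finally:
            restored.close()

    live_rows = source.execute("SELECT COUNT(*) FROM txn").fetchone()[0]
    source.close()
    return backup_rows, live_rows


if __name__ == "__main__":
    print(demo())  # prints (3, 4)
```

The backup holds exactly the three rows that were committed at the backup point, while the live database has moved on to four. A plain file copy gives no such guarantee if writes arrive while the copy is running.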
The standard retention policies used in a financial institution are as follows:
• Daily backup – retained for 1 month
• Weekly backup – retained for 1 month
• Monthly backup – retained for 1 month
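A retention policy like this is usually enforced by a periodic job that expires old copies. Here is a minimal sketch of that check. It assumes "1 month" means 30 days, and the `Backup` record and the catalog are hypothetical; real backup products keep their own catalog and expiry logic.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Retention per backup type, taken from the list above.
# Assumption: "1 month" is modelled as 30 days.
RETENTION = {
    "daily": timedelta(days=30),
    "weekly": timedelta(days=30),
    "monthly": timedelta(days=30),
}


@dataclass(frozen=True)
class Backup:
    kind: str       # "daily", "weekly" or "monthly"
    taken_on: date  # date the backup was taken


def is_expired(backup: Backup, today: date) -> bool:
    """A backup expires once it is older than the retention for its type."""
    return today - backup.taken_on > RETENTION[backup.kind]


def expired_backups(catalog, today: date):
    """Return the backups that the retention policy allows us to delete."""
    return [b for b in catalog if is_expired(b, today)]


if __name__ == "__main__":
    catalog = [
        Backup("daily", date(2024, 3, 30)),
        Backup("weekly", date(2024, 2, 25)),
        Backup("monthly", date(2024, 3, 1)),
    ]
    for b in expired_backups(catalog, today=date(2024, 3, 31)):
        print(b.kind, b.taken_on)  # prints: weekly 2024-02-25
```

Keeping the retention periods in one table means a change in policy is a one-line change rather than a code change.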
Once the monthly backup is taken, it must be transferred to a different site to comply with the regulatory guidelines, as you cannot keep your production data and the backup data at the same location.
One way to do this off-siting is to copy the data from the backup appliance to tape and then send the tape to a different location. There are specialized vendors in the market who do this for you.
Apart from tapes, if you have large storage and good bandwidth between datacenters, you can also transfer the data to another datacenter over a network link, which also meets the off-siting requirement.
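Whether the bandwidth is "good" enough comes down to simple arithmetic: the monthly copy has to finish within the window you have. The sketch below shows the calculation. The backup size, link speed and utilization figures are made-up numbers for illustration, not recommendations.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float,
                     utilization: float = 1.0) -> float:
    """Estimate how long it takes to send a backup over a network link.

    utilization is the fraction of the link the transfer can actually use
    (an assumed figure; protocol overhead and other traffic reduce it).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return (size_bytes * 8) / (link_bits_per_second * utilization)


if __name__ == "__main__":
    ten_tb = 10 * 10**12       # hypothetical 10 TB monthly backup
    one_gbps = 1 * 10**9       # hypothetical 1 Gbit/s inter-datacenter link
    hours = transfer_seconds(ten_tb, one_gbps, utilization=0.5) / 3600
    print(f"{hours:.1f} hours")  # prints 44.4 hours
```

If the estimate does not fit the available window, the options are a faster link, sending less data, or falling back to tape.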
Also, an important point to remember is that if you do a tape-out, it is mandatory to encrypt the data, as it is moving out of your organization.