Backup strategy

From CUC3
Revision as of 15:21, 19 May 2006 by import>Cen1001

Thoughts about our backup strategy.

We currently back up via home-grown scripts to JBODs, and that works very well. We have no central fileserver and there is no political support to implement one.

Problems

  • Lack of offsite backups. Not really a problem in practice, because our backup servers are located a very long way away from the machines they back up; this is a huge site.
  • Users need a sysadmin to do a restore for them, and if a sysadmin can't be found they are stuck
  • Running out of space on existing JBODs
  • Workstations tend to accumulate archives of old user accounts: new users need old users' data (or the PI wants to hang onto it), but the old user has gone and the workstation is full
  • Two weeks of backups is not quite enough

Don't want to change the technology: it works very well and all the other options are worse (tapes unreliable and not large enough; DVDs not large enough; UDO hasn't taken off). So we need a gigantic new JBOD to accommodate the growing data. Offsite would be nice, but I'd want an onsite copy too, so I don't see how to do it short of purchasing another JBOD and swapping with another site in Cambridge. Too expensive and not really needed? Could we do something with removable disks for the vital data? What is the vital data anyway, and how big is it?

Older backups could probably be handled by doing something sensible with the existing backup script: a special 'one-month' and 'six-month' run for each machine. So basically I have to sit down and script that, and the problem is solved. It would be good to combine this with using rsync's --link-dest option to do what appear to be 'full' backups for each day. Would have to keep an eye on usage. Might save us from creeping corruption, or hacks not spotted until too late.

How to let people get their own backups back? For a while I thought I'd make a common UID space (now almost done!) and then just give them user accounts on the server, but someone will start using it as extra data storage and destroy their own backups in the process. So I need a way of having the filesystems look mounted read-write to root but read-only to everyone else. This has got to be doable: a read-only NFS export, of course. How to make it so that only theory workstations can read it? Maintain a list (OK, so I already do that for printing and Mathematica; I really need to organise this so there is one list). Some kind of automatic process that notices when a machine drops off the network and removes it from the export list would also be nice.

Account management will be a pain in the neck; we will certainly need common accounts on the backup servers if we do allow people to log in. Investigate LDAP, as NIS is too old. LDAP has the upside that eventually we could extend it to all the workstations and let it run over the network (rather than just the backup farm's secure private network). It could also handle the list problem and, while we're at it, the email address problem.
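If LDAP pans out, the visible change on each machine is small: account lookups go through NSS, so it is just a matter of adding ldap as a source. A sketch of the relevant /etc/nsswitch.conf lines (assuming the LDAP client side, e.g. nss_ldap, is configured separately):

```
passwd: files ldap
group:  files ldap
shadow: files ldap
```

Local system accounts stay in files; real users come from the directory, which is what gives us the common UID space on the backup servers for free.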

Archiving. As much a political as a technical problem. I am going to have to make people sign something when they leave, because I'm having too many problems with this. Have a place on a JBOD kept on the 3rd floor (but where? It's going to be the cupboard in 152A because there is nowhere else) where I put old user accounts; no exceptions! If the server is user-accessible then their colleagues can copy the data or consult it easily. That JBOD backs up to a JBOD in the UCC. Will still need a cutoff date. This data will be static, so no need for regular backups: just a copy at both ends. (A use for pun and pea after we acquire the giant JBOD for the UCC?)

Data protection. People are not going to be ecstatic about world-readable suddenly meaning theory-world-readable. We cannot allow workstations where a user has root onto this system; far too dangerous. root squash, naturally.

Windows users? Must look into Retrospect or others.

The Giant JBOD needs to be huge, maybe 6Tb. We currently use 3Tb and have grown about 1Tb in the last 12 months alone, so 6Tb buys roughly three years of headroom at the current rate.