Keep people without relevant experience out during important updates
Yes, some of them are pretty good at development, and they may be very attentive. But updating a production system requires an engineer to be a little paranoid. If you have not had a bad experience in this area, you will probably be unable to predict the side effects and the bad things that might happen. Ideally, the update should be done by a pair of engineers. That greatly reduces the human factor.
Have a detailed update plan
Well, if the first rule cannot be followed... Prepare a detailed update plan that includes every action (shell commands, file and database operations, and so on). It would be wonderful if you tested it in your staging environment first. Ideally, you should have an auto-install package (deb, rpm, or whatever).
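One way to keep such a plan honest is to write it down as data and let a small runner execute it step by step. Below is a minimal sketch in Python, not a finished tool. The `Step` class, the `run_plan` function, and every command, path, and package name in `PLAN` (`myapp`, `/backup/myapp.sql`, `myapp_2.0_all.deb`) are hypothetical placeholders. The runner stops at the first failed step, and by default it only prints the plan so that you can review it before touching anything.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Step:
    description: str  # what the step does, in plain words
    command: list     # the exact command to run, as an argument list


def run_plan(steps, dry_run=True):
    """Run the steps in order and stop at the first failure.

    Returns the descriptions of the steps that completed.
    With dry_run=True nothing is executed; the plan is only printed.
    """
    done = []
    for number, step in enumerate(steps, start=1):
        print(f"[{number}/{len(steps)}] {step.description}: {' '.join(step.command)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed step cannot be skipped over silently.
            subprocess.run(step.command, check=True)
        done.append(step.description)
    return done


# Hypothetical plan: replace these commands with the real ones for your system.
PLAN = [
    Step("Stop the application", ["systemctl", "stop", "myapp"]),
    Step("Back up the database", ["pg_dump", "--file=/backup/myapp.sql", "myapp"]),
    Step("Install the new package", ["dpkg", "-i", "myapp_2.0_all.deb"]),
    Step("Start the application", ["systemctl", "start", "myapp"]),
]

if __name__ == "__main__":
    run_plan(PLAN, dry_run=True)  # review the plan without executing anything
```

Run it with `dry_run=True` in the staging environment first, read the printed plan together with your pair, and only then switch to `dry_run=False`.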
Have a detailed fallback plan
This is actually a continuation of the previous thought. Your update plan should include a description of a graceful fallback procedure. Some errors appear only on the production system. Or your development team might miss something, or
Have a Plan B
In some cases a fallback scenario cannot help, for example when the update script itself is executed improperly: somebody deleted important files or dropped the wrong database. Yes, your update script contains a backup step. But what if that step failed silently?
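A cheap defence against a silently failed backup step is to verify the backup before the update continues. The sketch below assumes the backup is a plain file copy, so the copy can be compared with the original by checksum. The function names `sha256_of` and `verify_backup` and the `min_bytes` parameter are made up for this illustration. For a database dump a byte-for-byte comparison does not apply, and you would need a different check, such as restoring the dump on a staging server.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source, backup, min_bytes=1):
    """Raise RuntimeError unless backup is a byte-for-byte copy of source."""
    source, backup = Path(source), Path(backup)
    if not backup.is_file():
        raise RuntimeError(f"backup missing: {backup}")
    if backup.stat().st_size < min_bytes:
        raise RuntimeError(f"backup suspiciously small: {backup}")
    if sha256_of(source) != sha256_of(backup):
        raise RuntimeError(f"backup differs from source: {backup}")
```

Call `verify_backup` right after the backup step and before anything destructive. If it raises, the update stops while the original files are still in place.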
The general rule is "expect trouble, always". So regular backups of production are a must. Some data centers (ours, for example) even provide daily server backups in their default service package.
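Regular backups only help if they really keep arriving, so it may be worth checking their age before you start an update. Here is a small sketch under some assumptions: the backups land as files in a single directory, and the function names and the 26-hour limit (a daily backup plus some slack) are invented for this example.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None):
    """Return the age in hours of the most recently modified file in backup_dir.

    Raises RuntimeError if the directory contains no files.
    """
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    if not files:
        raise RuntimeError(f"no backups found in {backup_dir}")
    newest = max(p.stat().st_mtime for p in files)
    now = time.time() if now is None else now
    return (now - newest) / 3600.0


def backups_are_fresh(backup_dir, max_age_hours=26):
    """Return True if the newest backup is no older than max_age_hours."""
    return newest_backup_age_hours(backup_dir) <= max_age_hours
```

If `backups_are_fresh` returns `False`, postpone the update and find out why the backups stopped.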