Server Maintenance Checklist

Just like any other computer, servers need periodic maintenance. Here are twelve things to check on a regular basis to keep your system running smoothly. This is just a quick checklist. It is not meant to be exhaustive or to explain how to do these things, but keeping tabs on these items can reduce server issues. These are just some of the things we do as part of our server management work.

12 Server Maintenance Tips

1. Verify your backups are working. Before making any changes to your production system, be sure that your backups are working. You may even want to run some test recoveries if you are going to delete critical data. While focused on backups, you may want to make sure you have selected the right backup location.
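
One way to make a test recovery concrete is to restore a backup into a scratch directory and compare it against the live data file by file. The sketch below is only an illustration: the function names are made up for this example, and it assumes both copies are ordinary directories reachable on the same machine.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_dir, restored_dir):
    """Return relative paths that are missing from, or differ in, the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(str(rel))
    return problems
```

An empty list means every source file was found in the restored copy with identical contents. Files that changed after the backup was taken will also show up as differences, so compare against a snapshot or a quiet directory where possible.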

2. Check disk usage. Don’t use your production system as an archival system. Delete old logs, emails, and software versions no longer used. Keeping your system free of old software limits security issues. A smaller data footprint means faster recovery should a disk fail. If your usage is exceeding 90% of disk capacity, either reduce usage or add more storage. If your partition reaches 100%, your server may stop responding, database tables can become corrupted and data can be lost.
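
The 90% rule is easy to automate. Here is a minimal sketch using Python's standard `shutil.disk_usage`; the helper names and the idea of running it per mount point are my own assumptions for illustration, not a description of any particular monitoring tool.

```python
import shutil

THRESHOLD_PERCENT = 90.0  # the 90% mark suggested in the checklist


def percent_used(total_bytes, used_bytes):
    """Return used space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def check_mount(path, threshold=THRESHOLD_PERCENT):
    """Return (percent used, True if usage exceeds the threshold) for a path."""
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.total, usage.used)
    return pct, pct > threshold
```

Calling `check_mount("/var")` from a scheduled job, for example, returns the current usage and whether it is over the limit. Note that tools such as `df` may report a slightly different percentage because they account for reserved blocks.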

3. Check RAID Alarms. If you are using RAID (and you should be), check that your RAID’s error notification system is configured properly and works as expected. Most RAID levels tolerate only a single disk failure. If you miss a RAID notification, a simple disk replacement could turn into a catastrophic failure.
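
If you happen to use Linux software RAID (md), a degraded array can be spotted by reading `/proc/mdstat`. The sketch below assumes that setup only; hardware RAID controllers need their vendor's own tools, and the function name is hypothetical. It looks for the status line, where `[2/1] [U_]` means two devices are expected but only one is active.

```python
import re

# Header lines look like "md0 : active raid1 sdb1[1] sda1[0]"
ARRAY_LINE = re.compile(r"^(md\w+)\s*:")
# Status lines contain "[expected/active] [UU_]" device counters
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that report a missing or failed member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if current and status:
            expected, active, flags = status.groups()
            if int(active) < int(expected) or "_" in flags:
                degraded.append(current)
            current = None  # ignore later progress lines for this array
    return degraded
```

On a live system you would pass in the contents of `/proc/mdstat` and send an alert whenever the returned list is not empty. This complements, rather than replaces, the RAID system's own notifications.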

4. Update your OS. Updates for Linux systems are released almost daily. Many of these fix important security issues. At rackAID, we update systems daily (sometimes even more frequently). If you do not have a management service or auto-updates enabled, be sure to review your OS for any critical security updates. Get on the mailing list for your OS so you know when critical security patches are released. If you have a kernel update, you will need to reboot your server unless you use a tool like Ksplice.

5. Update your Control Panel. If you are using a hosting or server control panel, be sure to update it as well. Sometimes this means updating not only the control panel itself, but also software it controls. For example, with WHM/cPanel, you must manually update PHP versions to fix known issues. Simply updating the control panel does not also update the underlying Apache and PHP versions used by your OS.

6. Check application updates. Most security issues we investigate are due to outdated web applications. After you have updated your server, be sure to review the web applications and update them as well.

7. Check remote management tools. If your server is co-located or with a dedicated server provider, you will want to check that your remote management tools work. Remote console, remote reboot and rescue mode are what I call the 3 essential tools for remote server management. You want to know that these will work when you need them.

8. Check for hardware errors. You may want to review the logs for any signs of hardware problems. Overheating notices, disk read errors, and network failures could be early indicators of potential hardware failure. These are rare but worth a look, especially if the system has not been working within normal ranges.
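
A quick log review can be scripted as a keyword scan. The patterns below are assumptions chosen for illustration, loosely based on common Linux kernel messages; the exact wording varies by hardware, driver and distribution, so treat the list as a starting point to tune for your own systems.

```python
import re

# Illustrative patterns only; adjust to the messages your hardware produces.
DEFAULT_PATTERNS = (
    "I/O error",
    "temperature above threshold",
    "hardware error",
    "link is down",
)


def find_hardware_warnings(lines, patterns=DEFAULT_PATTERNS):
    """Return (line_number, line) pairs whose text matches any pattern."""
    matcher = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    hits = []
    for number, line in enumerate(lines, start=1):
        if matcher.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

You could feed this the lines of a kernel or system log file and review whatever it returns. A match is a prompt to look closer, not proof of failing hardware.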

9. Check server utilization. Review your server’s disk, CPU, RAM and network utilization. If you are nearing limits, you may need to plan on adding resources to your server or migrating to a new one.
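
For CPU and RAM, two rough indicators are the load average per core and the share of memory in use. This sketch assumes a Linux host: it relies on `os.getloadavg()` and on the `MemTotal` and `MemAvailable` fields of `/proc/meminfo`. The function names are hypothetical.

```python
import os


def load_per_core(load_average, cpu_count):
    """Normalize a load average by core count; values near 1.0 mean saturation."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def current_load_per_core():
    """Read the 1-minute load average of this machine (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)


def memory_used_percent(meminfo_text):
    """Compute memory in use from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # values are in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total
```

Recording these numbers on a schedule gives you a trend line, which is more useful for capacity planning than any single reading.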

10. Review user accounts. If you have had staff changes, client cancellations or other user changes, you will want to remove these users from your system. Storing old sites and users is both a security and legal risk. Depending on your service contracts, you may not have the right to retain a client’s data after they have terminated services.
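
To get a list of accounts worth reviewing, you can filter the passwd file for users that can actually log in. This is a minimal sketch with assumptions: it expects the standard seven-field `/etc/passwd` format, treats UID 1000 and above as regular users (a common Linux default that differs on some systems), and uses a hypothetical function name.

```python
# Shells that indicate an account cannot log in interactively.
NOLOGIN_SHELLS = {"/usr/sbin/nologin", "/sbin/nologin", "/bin/false"}


def login_accounts(passwd_text, min_uid=1000):
    """Return names of regular accounts with a usable login shell."""
    accounts = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7 or not fields[2].isdigit():
            continue  # skip malformed entries
        name, uid, shell = fields[0], int(fields[2]), fields[6]
        if uid >= min_uid and shell not in NOLOGIN_SHELLS:
            accounts.append(name)
    return accounts
```

Compare the result against your current staff and client list. Anything left over is a candidate for removal. System accounts below the UID cutoff, including root, are deliberately excluded and should be reviewed separately.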

11. Change passwords. I recommend changing passwords every 6 to 12 months, especially if you have given out passwords to others for maintenance.
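
On Linux, the third field of each `/etc/shadow` entry stores the date of the last password change as days since January 1, 1970, which makes it possible to flag old passwords. The sketch below assumes that format and a 365-day limit (the upper end of the range I suggest above); the function names are hypothetical, and reading the real file requires root.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def last_change_date(shadow_days):
    """Convert the shadow 'last change' field (days since epoch) to a date."""
    return EPOCH + timedelta(days=shadow_days)


def stale_accounts(shadow_text, today, max_age_days=365):
    """Return (name, age_in_days) for accounts whose password is too old."""
    stale = []
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        # Skip locked or passwordless accounts; they have nothing to rotate.
        if fields[1] == "" or fields[1].startswith(("!", "*")):
            continue
        age = (today - last_change_date(int(fields[2]))).days
        if age > max_age_days:
            stale.append((fields[0], age))
    return stale
```

Passing `date.today()` as `today` gives the current picture. Keeping the date as a parameter also makes the check easy to test.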

12. Check system security. I suggest a periodic review of your server’s security using a remote auditing tool such as Nessus. Regular security audits serve as a check on system configuration, OS updates and other potential security risks. I suggest this at least 4 times a year and preferably monthly.
