Best common practices

From AdminWiki

Revision as of 03:34, 25 May 2006

This should give you a rundown on the absolute minimum every server should have.

Time issues

Keep in sync

There is absolutely no excuse for not having a correctly synchronized clock. An unsynchronized clock will bite you when you have to compare logfiles from multiple servers, and it causes problems when you need to deliver accurate logs (police investigation, etc.).

The problem has gotten worse in the last few years (at least that's my impression) because processors got faster and/or time-keeping mechanisms sloppier. What the operating system basically does[1] at boot is fetch the current time and date from the RTC (real-time clock), take a rough guess at how many CPU cycles[2] make up one second, and then use this guesstimate for as long as the OS runs, which unfortunately is almost never accurate. Excessive IRQ usage, CPU cycle modulation (power saving) and other factors can increase the inaccuracy further.
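To get a feeling for how quickly a slightly wrong guess adds up, here is a minimal sketch of the arithmetic. The ppm (parts per million) values are purely illustrative assumptions, not measurements from any particular machine, and `drift_seconds` is a made-up helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_seconds(error_ppm: float, days: float) -> float:
    """Accumulated clock error in seconds for a constant frequency error.

    error_ppm is the assumed error of the tick-rate guess in parts per million.
    """
    return error_ppm * 1e-6 * SECONDS_PER_DAY * days


if __name__ == "__main__":
    # Illustrative error rates only; real hardware varies.
    for ppm in (10, 50, 200):
        per_day = drift_seconds(ppm, 1)
        per_half_year = drift_seconds(ppm, 180)
        print(f"{ppm:>4} ppm: {per_day:6.2f} s/day, "
              f"{per_half_year:8.1f} s after 180 days")
```

Under these assumptions, an error of 50 ppm already amounts to a bit more than four seconds per day, which is enough to make logfiles from two servers disagree about the order of events.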

What an NTP daemon basically does is compare the system time with an external time source (usually an NTP server), estimate how far off the OS is, and then discipline the system time. It also tracks the inaccuracy of the system clock, so it can keep the clock in sync even when the NTP server is unreachable for longer periods.
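The "estimating how far off" step can be sketched with the standard four-timestamp calculation used by NTP. This is only the basic arithmetic; a real daemon additionally filters many samples and adjusts the clock frequency, none of which is shown here. The function name and the sample numbers are assumptions for illustration.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate clock offset and round-trip delay from one request/response.

    t1: client clock when the request was sent
    t2: server clock when the request arrived
    t3: server clock when the response was sent
    t4: client clock when the response arrived
    Assumes the network delay is roughly the same in both directions.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # how far the client is behind the server
    delay = (t4 - t1) - (t3 - t2)          # time spent on the wire, both ways
    return offset, delay


if __name__ == "__main__":
    # Made-up sample: the client clock is 5 s behind, 0.1 s network delay
    # each way, 0.01 s processing time on the server.
    offset, delay = ntp_offset_and_delay(100.0, 105.1, 105.11, 100.21)
    print(f"offset: {offset:+.3f} s, round-trip delay: {delay:.3f} s")
```

With the sample values above, the estimate comes out at roughly +5 s offset and about 0.2 s round-trip delay, i.e. the client would have to be moved forward by five seconds.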

Obey the RTC

Another major issue is a wrong time in the RTC. You have to ensure that your system time is correct before your operating system switches to multi-user mode.

A common scenario in countries that observe DST:

  • Your RTC is set to the local timezone.
  • Your server has an uptime >= 180 days, meaning that it probably has passed a DST[3] boundary.
  • Your server crashes.


At this point, if you haven't taken any precautions, you're fucked as soon as the server is online again.

  • Best case: wrong logfile-entries and a few incorrect mtimes on files.
  • Worst case: Important business data (accounting, transactions, etc.) have the wrong timestamps. Good luck correcting these by hand.
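The scenario above can be replayed in a few lines. This is a simulation under stated assumptions, not a test on real hardware: the RTC was last written in winter (UTC+1), nothing updated it afterwards, and at boot the OS interprets the RTC value with the summer offset (UTC+2). The fixed offsets, the crash date and the function names are chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

WINTER = timezone(timedelta(hours=1), "CET")   # offset when the RTC was last set
SUMMER = timezone(timedelta(hours=2), "CEST")  # offset in force at reboot


def rtc_wall_time(true_utc: datetime, rtc_zone: timezone) -> datetime:
    """Naive wall-clock value an RTC shows if it was last set under rtc_zone."""
    return true_utc.astimezone(rtc_zone).replace(tzinfo=None)


def system_time_at_boot(rtc_value: datetime, assumed_zone: timezone) -> datetime:
    """UTC system time the OS derives when it reads the RTC as assumed_zone."""
    return rtc_value.replace(tzinfo=assumed_zone).astimezone(timezone.utc)


def simulate_crash(true_utc: datetime) -> timedelta:
    """Error of the system clock after reboot (negative means 'behind')."""
    rtc = rtc_wall_time(true_utc, WINTER)        # RTC still ticks on winter time
    booted = system_time_at_boot(rtc, SUMMER)    # OS assumes summer time
    return booted - true_utc


if __name__ == "__main__":
    crash = datetime(2006, 6, 15, 12, 0, tzinfo=timezone.utc)
    error = simulate_crash(crash)
    print("system clock error after reboot:", error.total_seconds(), "seconds")
```

In this simulation the machine comes back up one hour behind, and every timestamp written before the clock is corrected carries that error.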


There are a few solutions to this problem:

  • Put ntpdate in your startup scripts after your network has initialized and before ntp-server starts. Test it!
    This has the drawback that you'll still run into trouble when the network or your NTP server of choice is down.
  • Have hwclock write the system time to the RTC every now and then.
    This is still dangerous, since there's a window in which your server will boot with the wrong time in the RTC, but it reduces the risk noticeably.
  • Set the hardware clock to UTC.
    Untested. If anybody successfully uses this in a DST zone, please contact me.
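Why the UTC option should help in principle can be shown with a small simulation. It does not replace a test on a real machine. The idea is that an RTC kept in UTC never has to be reinterpreted across a DST boundary, because the local offset is only applied when the time is displayed. `local_offset` is a hypothetical stand-in for the system's timezone rules, hard-coded to the 2006 Central European DST period.

```python
from datetime import datetime, timedelta, timezone

# Central European DST period in 2006, expressed in UTC.
DST_START = datetime(2006, 3, 26, 1, 0, tzinfo=timezone.utc)
DST_END = datetime(2006, 10, 29, 1, 0, tzinfo=timezone.utc)


def local_offset(utc_time: datetime) -> timedelta:
    """Hypothetical timezone rule: UTC+2 during DST, UTC+1 otherwise."""
    in_dst = DST_START <= utc_time < DST_END
    return timedelta(hours=2) if in_dst else timedelta(hours=1)


def system_time_at_boot(rtc_value: datetime) -> datetime:
    """The RTC holds naive UTC, so no DST rule is needed to read it."""
    return rtc_value.replace(tzinfo=timezone.utc)


def local_display(system_utc: datetime) -> datetime:
    """Local wall-clock time, with the offset applied only for display."""
    return system_utc.astimezone(timezone(local_offset(system_utc)))


if __name__ == "__main__":
    for true_utc in (datetime(2006, 1, 15, 12, 0, tzinfo=timezone.utc),
                     datetime(2006, 6, 15, 12, 0, tzinfo=timezone.utc)):
        rtc = true_utc.replace(tzinfo=None)   # what a UTC RTC would show
        booted = system_time_at_boot(rtc)
        print(booted.isoformat(), "->", local_display(booted).isoformat(),
              "| error:", booted - true_utc)
```

In both the winter and the summer case the simulated system time matches the true time exactly; only the displayed local time changes with the season.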

Footnotes

  1. I'm not completely sure about that. If I'm horribly wrong here, please tell me so ;)
  2. (or any other time source, e.g. HPET)
  3. Daylight saving time


The rest

  • backup
  • monitoring
  • sane logging
  • handling of (security) updates
  • minimum amount of installed packages