Best common practices

From AdminWiki

Revision as of 23:17, 24 May 2006

This should give you a rundown on the absolute minimum every server should have.

Time issues

Keep in sync!

There is absolutely no excuse for not having a correctly synchronized clock. This will bite you when you have to compare logfiles from multiple servers, and it will cause problems when you need to deliver accurate logs (police investigation, etc.).

The problem has got worse over the last few years (at least that's my impression) because processors got faster and/or timekeeping mechanisms sloppier. What the operating system basically does[1] when booting is fetch the current time and date from the RTC, take a wild guess at how many CPU cycles[2] make up one second, and then use this guesstimate for as long as the OS runs. Unfortunately the guess is almost never accurate. Excessive IRQ usage, CPU cycle modulation (power saving) and other factors can add to the inaccuracy.
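To get a feeling for how quickly a small error adds up, here is a bit of arithmetic in Python. The 50 ppm drift rate is purely an assumed figure for illustration, not a measurement from any particular hardware.

```python
SECONDS_PER_DAY = 86_400

def accumulated_drift_seconds(drift_ppm: float, days: float) -> float:
    """Error in seconds built up by a clock that is off by drift_ppm parts per million."""
    return drift_ppm * 1e-6 * days * SECONDS_PER_DAY

# Assumed example: 50 ppm of drift, 180 days without any correction.
# That works out to roughly 13 minutes.
print(round(accumulated_drift_seconds(50, 180), 1))
```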

What an NTP daemon basically does is compare the system time with an external time source (usually an NTP server), estimate how far off the OS is, and then discipline the system time. It also tracks the drift of the system clock so that it can keep the clock reasonably in sync even when the NTP server is unreachable for longer periods.
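The "estimating how far off" part can be sketched with the standard NTP offset and delay formulas, which use four timestamps from one request/response exchange. This is only the core arithmetic, not what a real daemon does in full (filtering, multiple servers and gradual clock adjustment are left out). The function name and the sample numbers are made up for illustration.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t1: client sends the request     (client clock)
    t2: server receives the request  (server clock)
    t3: server sends the reply       (server clock)
    t4: client receives the reply    (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2  # positive: client clock is behind
    delay = (t4 - t1) - (t3 - t2)         # time spent on the network
    return offset, delay

# Made-up exchange: client is 0.5 s behind, 20 ms network delay each way,
# 1 ms processing time on the server.
offset, delay = ntp_offset_and_delay(100.0, 100.52, 100.521, 100.041)
print(round(offset, 3), round(delay, 3))
```

The offset estimate is only exact when the network delay is the same in both directions, which is one reason a daemon averages over many exchanges instead of trusting a single one.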

Obey the RTC!

Another major issue is a wrong time in the RTC. You have to ensure that your system time is correct before your operating system switches to multi-user mode.

Common scenario:

  • Your RTC is set to the local timezone.
  • Your server has an uptime >= 180 days, meaning that it probably has passed a DST[3] boundary.
  • Your server crashes.


At this point, if you haven't taken any precautions, you're fucked.

  • Best case: wrong logfile-entries and a few incorrect mtimes on files.
  • Worst case: Important business data (accounting, transactions, etc.) have the wrong timestamps. Good luck correcting these by hand.
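The scenario above can be played through in a few lines of Python. This is a simplified model under stated assumptions: a hypothetical timezone at UTC+1 in winter and UTC+2 in summer, an RTC that was last written in winter and never updated since, and an OS that interprets the RTC as current local time at boot.

```python
from datetime import datetime, timedelta, timezone

WINTER = timezone(timedelta(hours=1))  # hypothetical zone: UTC+1 in winter
SUMMER = timezone(timedelta(hours=2))  # same zone during DST: UTC+2

def time_after_crash(true_utc: datetime) -> datetime:
    """System time derived at boot from an RTC still running on winter local time."""
    # The RTC has no notion of timezones, it only holds a wall-clock reading.
    rtc_wall_clock = true_utc.astimezone(WINTER).replace(tzinfo=None)
    # At boot the OS assumes that reading is *current* local time (summer).
    return rtc_wall_clock.replace(tzinfo=SUMMER).astimezone(timezone.utc)

crash = datetime(2006, 7, 1, 12, 0, tzinfo=timezone.utc)
error = time_after_crash(crash) - crash
print(error.total_seconds() / 3600)  # error in hours
```

In this model the system comes back up exactly one hour behind, and everything written before the clock is corrected carries that error.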


There are two solutions to this problem:

  • Put ntpdate in your startup scripts, after your network has initialized and before the NTP daemon starts. Test it!
    This has the drawback that when the network or your NTP server of choice is down, you'll still run into trouble.
  • Have hwclock write the system time to the RTC every now and then.
    This is still dangerous, since there's a window in which your server will boot with the wrong time in the RTC, but it reduces the risk noticeably.

Best thing would be combining both.
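A minimal sketch of the combined approach, written in Python rather than as a real init script. It assumes that ntpdate and hwclock are installed and that the script runs as root. The function name, the `run` parameter and the server name in the usage note are placeholders invented for this example.

```python
import subprocess

def sync_clock_at_boot(ntp_server, run=subprocess.call):
    """Step the system clock from NTP and, if that worked, save it to the RTC.

    `run` takes a command list and returns its exit status. It is a
    parameter only so the logic can be tried without touching the clock.
    """
    if run(["ntpdate", ntp_server]) != 0:
        # Network or NTP server down: the RTC value is all we have,
        # so do not overwrite it with an unverified system time.
        return False
    run(["hwclock", "--systohc"])  # write system time to the RTC
    return True
```

Calling `sync_clock_at_boot("ntp.example.org")` from a startup script covers the first solution. Running only the `hwclock --systohc` part from cron every now and then covers the second.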

Footnotes

  1. I'm not completely sure about that. If I'm horribly wrong here, please tell me so ;)
  2. (or any other time source, e.g. HPET)
  3. Daylight saving time


The rest

  • backup
  • monitoring
  • sane logging
  • handling of (security) updates
  • minimum amount of installed packages