See MonitoringAndControllingNTPDev for discussion of this topic.
8. Monitoring and Controlling NTP
See TroubleshootingNTP for checking the behavior of a particular machine.
Monitoring NTP
For ongoing monitoring of an NTP network there are a variety of choices, including:
- whatever else we can find
scripts/stats tools
Foremost among these tools is peer.awk, a script that computes statistics from a peerstats file. It is invoked with:
% gawk -f peer.awk peerstats
where peerstats is the path and name of the peerstats file to be analyzed. Plain awk or nawk may work in place of gawk.
The output looks like this:
% gawk -f peer.awk $NTPSTATS/peerstats.20061120
ident            cnt     mean      rms      max    delay     dist     disp
==========================================================================
127.127.30.0    5241   -0.000    0.002    0.008    0.000    0.000    0.000
server1         1339    2.055    0.346    4.077   17.993   16.592    3.414
server2         1337    3.542    0.339    3.591   16.791   17.514    3.399
server3         1215    2.196    0.348    2.161   12.954  945.456    4.737
server4         1341    0.897    4.305   36.682    0.538   23.602    3.446
server5         1341    2.580    0.316    3.566   16.445   16.261    3.384
127.127.1.0     1339    0.000    0.000    0.000    0.000    0.000    0.000
The ident column identifies the server by IP address or pseudo IP address. The line beginning with 127.127.30.0 is such a pseudo IP address: it designates a hardware reference clock using driver type 30, the Motorola Oncore GPS driver. Likewise, 127.127.1.0 is the pseudo IP address of the local clock driver (type 1). Actual IP addresses have been replaced with server1, server2, etc.
The cnt column gives the number of samples in the peerstats file for the ident.
The mean column contains the arithmetic mean (average) of the offsets for that server or reference clock.
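The rms column contains the root mean square of the offsets. This is a measure of the typical magnitude of the offsets, computed as the square root of the quotient of the sum of the squared offsets and the number of samples. Note that in the sample output above the rms value is smaller than the mean for several servers, which a root mean square taken about zero cannot be; this suggests the script reports the deviation about the mean.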
The max column contains the offset with the greatest absolute value.
The delay column has the mean round trip network delay for a network server.
The dist column is the maximum observed synchronization distance, where synchronization distance is defined as the dispersion plus one half the round trip delay.
The disp column is the dispersion, defined as the maximum error of the server or peer clock relative to the local clock over the network path between them, in seconds.
For all of mean, rms, max, delay, dist, and disp, values closer to 0 (zero) are better.
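The column definitions above can be sketched in Python. This is only an illustration of the definitions as stated in the text, not a port of peer.awk. It assumes the usual peerstats field order (day, seconds past midnight, address, status, offset, delay, dispersion), and the function name summarize_peerstats is made up for this example. Values are left in the units of the input file.

```python
import math
from collections import defaultdict


def summarize_peerstats(lines):
    """Summarize peerstats records per peer, following the column
    definitions in the text (hypothetical helper, not peer.awk)."""
    samples = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated records
        # Assumed field order: day, second, address, status,
        # offset, delay, dispersion.
        ident = fields[2]
        offset, delay, disp = (float(f) for f in fields[4:7])
        samples[ident].append((offset, delay, disp))

    summary = {}
    for ident, rows in samples.items():
        offsets = [r[0] for r in rows]
        cnt = len(rows)
        summary[ident] = {
            "cnt": cnt,
            "mean": sum(offsets) / cnt,
            # RMS as described in the text: sqrt(sum(x^2) / n)
            "rms": math.sqrt(sum(x * x for x in offsets) / cnt),
            # offset with the greatest absolute value, sign kept
            "max": max(offsets, key=abs),
            "delay": sum(r[1] for r in rows) / cnt,
            # distance = dispersion + half the round trip delay
            "dist": max(r[2] + r[1] / 2 for r in rows),
        }
    return summary
```

A typical call would pass an open file, for example summarize_peerstats(open(path)), and print the resulting dictionary one peer per line.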
RRDTOOL notes
RRDTOOL defaults to storing non-negative numbers.
NTP offsets will sometimes go negative.
To allow negative numbers to be stored you need to tune your RRD databases:
rrdtool tune foo.rrd -i ds0:u -i ds1:u
(Do we need to do this for both ds0 and ds1?)
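When several RRD files need the same treatment, the tune command can be assembled programmatically. The following is a minimal sketch: the helper name tune_allow_negative is hypothetical, and the data source names are simply the ones from the command above. Since rrdtool takes the minimum per data source, the sketch emits one -i option for each name it is given.

```python
import subprocess


def tune_allow_negative(rrd_path, ds_names, run=False):
    """Build (and optionally run) an rrdtool tune command that removes
    the lower bound from each listed data source (hypothetical helper)."""
    cmd = ["rrdtool", "tune", rrd_path]
    for ds in ds_names:
        # "-i NAME:u" sets the minimum of data source NAME to unknown
        cmd += ["-i", f"{ds}:u"]
    if run:
        # Requires rrdtool on the PATH; raises if the command fails
        subprocess.run(cmd, check=True)
    return cmd
```

Calling tune_allow_negative("foo.rrd", ["ds0", "ds1"]) returns the same argument list as the command shown above; pass run=True to execute it.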
Who is using my NTP server?
You can check which hosts are talking to your time server by using the monlist command of ntpdc, e.g.
ntpdc -c monlist
Please note that a maximum of 600 entries is supported with current versions of ntpdc.
The protocol used by ntpdc (or, more precisely, the contents of the return packets) is not standardized. It is therefore recommended to use ntpdc only with a matching ntpd, i.e. both should have the same version number.
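For a quick count of distinct clients, the monlist output can be post-processed with a short script. The sketch below rests on two assumptions that should be checked against your ntpdc version: that the column header is followed by a separator line consisting only of "=" characters, and that the first field of each data row is the remote host. The function names are made up for this example.

```python
import subprocess


def monlist_remotes(text):
    """Return the first column of each monlist data row.

    Assumes the header is followed by a separator line made only of '='
    characters, and that the first field of each row is the remote host.
    """
    remotes = []
    in_body = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_body:
            # everything up to and including the '=====' line is header
            in_body = bool(stripped) and set(stripped) == {"="}
            continue
        if stripped:
            remotes.append(stripped.split()[0])
    return remotes


def query_monlist(host="localhost"):
    """Run ntpdc against the given host and parse its monlist output."""
    result = subprocess.run(
        ["ntpdc", "-n", "-c", "monlist", host],
        capture_output=True, text=True, check=True,
    )
    return monlist_remotes(result.stdout)
```

With this in place, len(set(query_monlist())) gives the number of distinct remote addresses in the list, subject to the entry limit mentioned above.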
To get around this 600-entry limit, many server operators run client statistics scripts, such as Wayne Schlitt's ntp_clients and ntp_clients_stats scripts, which can be found at http://www.schlitt.net/scripts/ntp/index.html . They work very well, but can use quite a bit of system resources if your client counts are in the high thousands. Examples of these scripts in action can be found at:
Controlling NTP
Start and stop NTP on your server
Depending on your operating system, there are several ways to start and stop the NTP daemon.
On most Linux machines you can use these commands:
/etc/init.d/ntpd start # start NTP daemon
/etc/init.d/ntpd stop # stop NTP daemon
Some Linux distributions use the name xntpd instead of ntpd, even when they ship a Version 4 NTP implementation (and not Version 3, which has often been called xntpd).
On a Windows machine where NTP was installed with the Meinberg Installer for NTP, you can use the commands
net start ntp
net stop ntp
The NTP Time Server Monitor Application by Meinberg (see above) provides functions to start, stop, and restart the service from a graphical interface.