Discussion:
[chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
James Feeney
2017-10-27 16:31:16 UTC
Permalink
On start-up, in the log file, there are these informational messages, for instance:

chronyd[622]: System clock wrong by 1.693005 seconds, adjustment started
chronyd[622]: System clock was stepped by 1.693005 seconds

Without some foreknowledge, it is unclear if the quantity "1.693005 seconds" is referring to the system clock with respect to the NTP server clock, or is referring to the NTP server clock with respect to the system clock, and so, the idea of "plus" and "minus" is ambiguous in this context.

How about, rather than using the term "wrong", instead use the terms "fast" and "slow" to describe this quantity "1.693005 seconds"? Then the log message might read:

chronyd[622]: System clock fast by 1.693005 seconds, adjustment started

or

chronyd[622]: System clock slow by 1.693005 seconds, adjustment started

Similarly, the "clock was stepped" message could include "+" and "-", or better, "forward" and "back", to clarify the "direction" of the error. Consider:

chronyd[622]: System clock was stepped back by 1.693005 seconds

or

chronyd[622]: System clock was stepped forward by 1.693005 seconds


As it is now, I literally do not know whether my system clock runs fast or slow, or whether the system clock was stepped forward in time, or back in time.

I'm sure that the meaning of "wrong" and "stepped" is obvious to some of you, simply because you already know how the code is written. But the log message could make this clear to the uninitiated user. Ha!
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-10-27 17:10:32 UTC
Permalink
The sign of that offset is already there. Ie, it could also have said
System clock wrong by -1.693005 seconds
iIe, that report is no ambiguous about whether the clock is fast or slow.
Of course the sentence does not indicate what convention chrony uses (ie does
the plus sign mean that the system clock is ahead or behind the true time)
I can never remember which convetion is used but I think it is
UTC-systemclock.
In the tracking report, one has the terminology

System time : 0.000062501 seconds fast of NTP time

which would not be ambigous as to the sign convention.
Post by James Feeney
chronyd[622]: System clock wrong by 1.693005 seconds, adjustment started
chronyd[622]: System clock was stepped by 1.693005 seconds
Without some foreknowledge, it is unclear if the quantity "1.693005 seconds" is referring to the system clock with respect to the NTP server clock, or is referring to the NTP server clock with respect to the system clock, and so, the idea of "plus" and "minus" is ambiguous in this context.
chronyd[622]: System clock fast by 1.693005 seconds, adjustment started
or
chronyd[622]: System clock slow by 1.693005 seconds, adjustment started
chronyd[622]: System clock was stepped back by 1.693005 seconds
or
chronyd[622]: System clock was stepped forward by 1.693005 seconds
As it is now, I literally do not know whether my system clock runs fast or slow, or whether the system clock was stepped forward in time, or back in time.
I'm sure that the meaning of "wrong" and "stepped" is obvious to some of you, simply because you already know how the code is written. But the log message could make this clear to the uninitiated user. Ha!
--
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-10-27 17:43:45 UTC
Permalink
... Of course the sentence does not indicate what convention chrony uses (ie does
the plus sign mean that the system clock is ahead or behind the true time) I> can never remember which convetion is used but I think it is UTC-systemclock.
Ha! Yes, difficult to remember, on occasion. And then, it seems simple enough to add a single word to clarify the "direction" of the time offsets.
In the tracking report, one has the terminology
System time     : 0.000062501 seconds fast of NTP time
which would not be ambigous as to the sign convention.
Including "fast" there does make it much easier to read. I'm thinking that it would be nice to include this sort of thing also in the log messages.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2017-10-30 11:07:23 UTC
Permalink
Post by James Feeney
chronyd[622]: System clock fast by 1.693005 seconds, adjustment started
or
chronyd[622]: System clock slow by 1.693005 seconds, adjustment started
I agree this would be much clearer for the user. I never remember
which sign is for fast and slow in what context (it's not consistent
unfortunately). The trouble is that it would break existing scripts
that parse the log and the parsing itself would be more difficult if
it had to look for the word "slow" or "fast" instead of the sign. I'm
not sure how important that really is.

Do you think it would make sense to keep the sign and indicate whether
it's fast or slow in parentheses?

System clock wrong by +/-?.??????? (slow/fast) ?

System clock stepped by +/-?.??????? (forward/backward) ?

There are other messages that print an offset, so maybe they could be
all changed at once to keep it consistent.

What do others think?
--
Miroslav Lichvar
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Denny Page
2017-10-30 18:42:14 UTC
Permalink
FWIW, I believe “ahead” and “behind” are the most clear in speaking to the condition before initiating a correct. “Fast” and “Slow” somewhat imply an ongoing condition that may or may not be true. It could be the accuracy of setting the RTC before reboot for instance.

System clock ahead by ?.????????
System clock behind by ?.????????

Denny
Post by Miroslav Lichvar
Post by James Feeney
chronyd[622]: System clock fast by 1.693005 seconds, adjustment started
or
chronyd[622]: System clock slow by 1.693005 seconds, adjustment started
I agree this would be much clearer for the user. I never remember
which sign is for fast and slow in what context (it's not consistent
unfortunately). The trouble is that it would break existing scripts
that parse the log and the parsing itself would be more difficult if
it had to look for the word "slow" or "fast" instead of the sign. I'm
not sure how important that really is.
Do you think it would make sense to keep the sign and indicate whether
it's fast or slow in parentheses?
System clock wrong by +/-?.??????? (slow/fast) ?
System clock stepped by +/-?.??????? (forward/backward) ?
There are other messages that print an offset, so maybe they could be
all changed at once to keep it consistent.
What do others think?
--
Miroslav Lichvar
--
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
FUSTE Emmanuel
2017-10-31 09:19:37 UTC
Permalink
Post by Denny Page
FWIW, I believe “ahead” and “behind” are the most clear in speaking to the condition before initiating a correct. “Fast” and “Slow” somewhat imply an ongoing condition that may or may not be true. It could be the accuracy of setting the RTC before reboot for instance.
System clock ahead by ?.????????
System clock behind by ?.????????
Denny
I second you:

For system logs messages
"ahead" / "behind" for offset
"fast" / "slow" for freq

measurements.log are already totally unambiguous/perfectly specified.

Emmanuel.
Post by Denny Page
Post by Miroslav Lichvar
Post by James Feeney
chronyd[622]: System clock fast by 1.693005 seconds, adjustment started
or
chronyd[622]: System clock slow by 1.693005 seconds, adjustment started
I agree this would be much clearer for the user. I never remember
which sign is for fast and slow in what context (it's not consistent
unfortunately). The trouble is that it would break existing scripts
that parse the log and the parsing itself would be more difficult if
it had to look for the word "slow" or "fast" instead of the sign. I'm
not sure how important that really is.
Do you think it would make sense to keep the sign and indicate whether
it's fast or slow in parentheses?
System clock wrong by +/-?.??????? (slow/fast) ?
System clock stepped by +/-?.??????? (forward/backward) ?
There are other messages that print an offset, so maybe they could be
all changed at once to keep it consistent.
What do others think?
--
Miroslav Lichvar
--
��칻�&�zf���k�|�z�ު笵�k�|��ښ)r��0��n�˛���m觶����r�h���隊W!���u��z����!�����_jh�ʊ��+a��i�{az˛��-
Miroslav Lichvar
2017-10-31 11:12:54 UTC
Permalink
Post by FUSTE Emmanuel
Post by Denny Page
FWIW, I believe “ahead” and “behind” are the most clear in speaking to the condition before initiating a correct. “Fast” and “Slow” somewhat imply an ongoing condition that may or may not be true. It could be the accuracy of setting the RTC before reboot for instance.
System clock ahead by ?.????????
System clock behind by ?.????????
Denny
For system logs messages
"ahead" / "behind" for offset
"fast" / "slow" for freq
That looks good to me.

Another question is whether it's correct to say "System clock". The
messages don't actually print the offset of the system clock, but
rather the change in the offset relative to the NTP time that chronyd
is tracking. On start that is the same as the offset of the system
clock, but if there is an unfinished correction, it's the difference
between the old offset and new offset.

There are at least the following options:
1. change the message to better describe what is behind or ahead
2. change the message to print the real offset of the system clock
a) print it on each update as long as the offset is larger than
logchange
b) print it only when the change in the offset is larger than
logchange

Thoughts?
--
Miroslav Lichvar
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-10-31 16:41:16 UTC
Permalink
Post by Miroslav Lichvar
Post by FUSTE Emmanuel
For system logs messages
"ahead" / "behind" for offset
"fast" / "slow" for freq
That looks good to me.
"ahead" and "behind" also will work for both the offset message and the "was stepped" message.
Post by Miroslav Lichvar
Another question is whether it's correct to say "System clock". The
messages don't actually print the offset of the system clock, but
rather the change in the offset relative to the NTP time that chronyd
is tracking. On start that is the same as the offset of the system
clock, but if there is an unfinished correction, it's the difference
between the old offset and new offset.
I'm sorry, I don't understand. What does that mean, "the change in the offset"?

The quantity ( system_clock_time - NTP_server_time ), I think I understand.

Now, are you describing something like
(( new_system_clock_time - new_NTP_server_time ) -
( old_system_clock_time - old_NTP_server_time ))?

Are you saying that there is a system log message that would be saying "the system clock was behind by 5 seconds, but now, after some correction, the system clock is behind by only 3 seconds, and the offset has been 'reduced' by 2 seconds"?

That is what "the difference between the old offset and new offset" suggests to me. But then - really? Is chrony reporting that? Why would anyone want to see a report about that? Instead, it might be useful to know, in that example, that there was still 3 seconds of correction remaining. But that quantity would be exactly the same as saying "the system clock is now behind by 3 seconds".

So, I need some help there.
Post by Miroslav Lichvar
1. change the message to better describe what is behind or ahead
2. change the message to print the real offset of the system clock
a) print it on each update as long as the offset is larger than
logchange
b) print it only when the change in the offset is larger than
logchange
Thoughts?
If I am understanding, I think I prefer option '2.a)'.

Option '2.b)' does not tell me what event is going to compel a system log message - at the first state-change? And only once? Where option '2.a)' will print a log message "on each update" - which I don't know when chrony normally makes those updates - but then, the log messages would inform me. That sounds like a good thing.

I am suspicious that option '1.' means having to describe some really complicated concept, this "change in the offset". But I'm only guessing, since I don't actually understand the "what". Could you give an example?

In common speaking, I still tend to speak in terms of frequency. I might say something like "That clock is running fast" and "That clock is running slow", where "running" and "slow" are adjectives, as in a "fast running clock" and a "slow running clock". Somehow, instinctively, that is the information for which I am searching. I can infer a "slow clock" when the system clock offset is "behind", but in my mind, I want to reduce the concept to an effective relative frequency. Of course, subsequently, I will want to quantify the resulting offset.

I don't know that other people do that in their head. But it might be something to consider, in the printing of system log messages. A "complicated" log message might say something like "The system clock is slow and is behind by 3.000000 seconds." That's a bit more like natural human speech - though it may not be as "manly" as a terse and cryptic log message. Ha!

Of course, there are also assumptions there,
1) that there exists some local reference clock, and
2) that local clock had been set to the correct time in the past, and
3) that that local clock was used to set the system clock, and
4) the system clock is an analog of the local reference clock.

Assumption 4 is not actually valid if, for instance, the local reference clock and the system clock operate from different crystals, which is common. The system clock might actually run fast while the local clock runs slow. But, as you suggest, the casual assumption is also that "the offset is larger than logchange", which, I think, leaves us pretty much in the realm of setting the system clock at boot-up, or at wake-up. I suspect that the casual user understands this. Once the system clock has been set, tweaking the system clock drift is at a very different scale, and no system log message is printed for that change.

I'm also assuming that booting a device that has no local reference clock, like a network appliance, a router or a switch or an access point, there is not going to be any "offset" message or any "stepped by" message printed in the system log. Or, maybe the system clock time simply defaults to the epic, "0", and the offset to the current time is shown?
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-10-31 18:09:07 UTC
Permalink
"ahead" and "behind" also will work for both the offset message and the> "was stepped" message.
Also:
forward / back
advanced / delayed
leading / lagging These are more traditional technical terms describing offsets.
incremented / decremented
increased / decreased "stepped forward by" -> "increased by", etc.
raised / lowered Ha! Our concept of time is so "horizontal".
early / late

Hmm - on second thought, "was stepped ahead" seems natural, but "was stepped behind", not so much. And then, "clock behind by 3.000000 seconds" followed by "was stepped ahead by 3.000000 seconds" might seem to mix-together "behind" and "ahead" in a contradictory way. Different word pairs for each message might be easier to read and understand.

For reference, from my original example:

chronyd[622]: System clock wrong by 1.693005 seconds, adjustment started
chronyd[622]: System clock was stepped by 1.693005 seconds

I might lean more toward "leading / lagging" and "increased / decreased".
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2017-11-06 16:17:38 UTC
Permalink
Post by James Feeney
Post by Miroslav Lichvar
Post by FUSTE Emmanuel
For system logs messages
"ahead" / "behind" for offset
"fast" / "slow" for freq
That looks good to me.
"ahead" and "behind" also will work for both the offset message and the "was stepped" message.
From the other suggestions that have been made, I liked best "was
stepped backward/forward".
Post by James Feeney
Post by Miroslav Lichvar
Another question is whether it's correct to say "System clock". The
messages don't actually print the offset of the system clock, but
rather the change in the offset relative to the NTP time that chronyd
is tracking. On start that is the same as the offset of the system
clock, but if there is an unfinished correction, it's the difference
between the old offset and new offset.
I'm sorry, I don't understand. What does that mean, "the change in the offset"?
The quantity ( system_clock_time - NTP_server_time ), I think I understand.
Now, are you describing something like
(( new_system_clock_time - new_NTP_server_time ) -
( old_system_clock_time - old_NTP_server_time ))?
Are you saying that there is a system log message that would be saying "the system clock was behind by 5 seconds, but now, after some correction, the system clock is behind by only 3 seconds, and the offset has been 'reduced' by 2 seconds"?
No, I think it's something else. The issue is that the clock is
normally corrected by slewing (instead of stepping) and that takes
time. Before a correction is finished, another measurement can be made
and the correction needs to be adjusted. That adjustment is what
chronyd reports with the "clock wrong by" message.

Normally, if both the time source and system clock are stable, that
adjustment should be small and we don't expect to see a second "clock
wrong by" message. But if one of them is not stable enough (we
generally don't know which), the adjustment can exceed the logchange
threshold and another "wrong by" message will be logged.

For example, if the initial offset was 5 seconds and the system clock
was already corrected by 2 seconds when another measurement is made,
which says the offset of the system clock is now actually 4 seconds
instead of the expected 3, a message will be logged that the clock is
wrong by 1 second. The question is which offset of the system clock
should be reported (4 or 1) and when it should be reported (abs(4) >
logchange or abs(1) > logchange).
Post by James Feeney
Post by Miroslav Lichvar
1. change the message to better describe what is behind or ahead
2. change the message to print the real offset of the system clock
a) print it on each update as long as the offset is larger than
logchange
b) print it only when the change in the offset is larger than
logchange
Thoughts?
If I am understanding, I think I prefer option '2.a)'.
What I don't like about 2.a is that if there is a large initial offset
and stepping is not enabled by "makestep", there may be a large number
of the "clock wrong by" messages in the log, even if everything is
working as expected.
--
Miroslav Lichvar
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-11-07 21:27:50 UTC
Permalink
Post by Miroslav Lichvar
From the other suggestions that have been made, I liked best "was
stepped backward/forward".
That's good too.
Post by Miroslav Lichvar
For example, if the initial offset was 5 seconds and the system clock
was already corrected by 2 seconds when another measurement is made,
which says the offset of the system clock is now actually 4 seconds
instead of the expected 3, a message will be logged that the clock is
wrong by 1 second. The question is which offset of the system clock
should be reported (4 or 1) and when it should be reported (abs(4) >
logchange or abs(1) > logchange).
Ahhhhh - too many small numbers - is the "1" from the "5-4" or from the "4-3"?

I do not understand in what sense the clock is then wrong by 1 second. And, I am thinking that the log message should never say that - too confusing. If I am understanding, you are saying that currently, a message like that would be logged - and maybe there should not be a message like that. It is not immediately evident to me why a message like that would be informative. It may be better to only log a simpler message, describing the *current* relationship between the system clock and the NTP server clock.
Post by Miroslav Lichvar
Post by James Feeney
Post by Miroslav Lichvar
1. change the message to better describe what is behind or ahead
2. change the message to print the real offset of the system clock
a) print it on each update as long as the offset is larger than
logchange
b) print it only when the change in the offset is larger than
logchange
Thoughts?
If I am understanding, I think I prefer option '2.a)'.
What I don't like about 2.a is that if there is a large initial offset
and stepping is not enabled by "makestep", there may be a large number
of the "clock wrong by" messages in the log, even if everything is
working as expected.
I am guessing that you mean, if a slewing correction is used, that there would then be a log message for every instance in which a small change is made? Yes, that would be too much unimportant information.

I was supposing that alternative '2.b)' would mean that no message would be printed saying anything about the system clock when the offset was less than logchange. I may have misunderstood. I'm thinking that something should be printed initially about the system clock offset, but then, subsequently, only "state change" events should be printed in the log. For instance, if the system clock drifts and, for some reason, the offset exceeds logchange, or if, when slewing the system clock, the system clock has achieved synchronization.

So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds."
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-11-07 21:37:05 UTC
Permalink
Post by James Feeney
Post by Miroslav Lichvar
From the other suggestions that have been made, I liked best "was
stepped backward/forward".
That's good too.
Post by Miroslav Lichvar
For example, if the initial offset was 5 seconds and the system clock
was already corrected by 2 seconds when another measurement is made,
which says the offset of the system clock is now actually 4 seconds
instead of the expected 3, a message will be logged that the clock is
wrong by 1 second. The question is which offset of the system clock
should be reported (4 or 1) and when it should be reported (abs(4) >
logchange or abs(1) > logchange).
Ahhhhh - too many small numbers - is the "1" from the "5-4" or from the "4-3"?
I do not understand in what sense the clock is then wrong by 1 second. And, I am thinking that the log message should never say that - too confusing. If I am understanding, you are saying that currently, a message like that would be logged - and maybe there should not be a message like that. It is not immediately evident to me why a message like that would be informative. It may be better to only log a simpler message, describing the *current* relationship between the system clock and the NTP server clock.
The system is drifting the clock so as to get rid of the 5 second offset. It
should have drifted it by 2 seconds by this time, so that time should have
read 3 sec wrong. But instead it is 4 seconds wrong. If one allowed the drift
to continue it would be 0 seconds wrong in 1 1/2 measurement periods. With
that same drift however, it would, at the end of the drift still be 1 second
out. Ie, the clock is now out by 1 second from the expected drifting clock.
Post by James Feeney
Post by Miroslav Lichvar
Post by James Feeney
Post by Miroslav Lichvar
1. change the message to better describe what is behind or ahead
2. change the message to print the real offset of the system clock
a) print it on each update as long as the offset is larger than
logchange
b) print it only when the change in the offset is larger than
logchange
Thoughts?
If I am understanding, I think I prefer option '2.a)'.
What I don't like about 2.a is that if there is a large initial offset
and stepping is not enabled by "makestep", there may be a large number
of the "clock wrong by" messages in the log, even if everything is
working as expected.
I am guessing that you mean, if a slewing correction is used, that there would then be a log message for every instance in which a small change is made? Yes, that would be too much unimportant information.
I was supposing that alternative '2.b)' would mean that no message would be printed saying anything about the system clock when the offset was less than logchange. I may have misunderstood. I'm thinking that something should be printed initially about the system clock offset, but then, subsequently, only "state change" events should be printed in the log. For instance, if the system clock drifts and, for some reason, the offset exceeds logchange, or if, when slewing the system clock, the system clock has achieved synchronization.
So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds."
Slewing is the expected behaviour. Stepping is not. Thus one would expect a
log warning on stepping but not slewing, is what I would expect.
Post by James Feeney
--
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2017-11-09 12:25:29 UTC
Permalink
Post by James Feeney
Post by Miroslav Lichvar
For example, if the initial offset was 5 seconds and the system clock
was already corrected by 2 seconds when another measurement is made,
which says the offset of the system clock is now actually 4 seconds
instead of the expected 3, a message will be logged that the clock is
wrong by 1 second. The question is which offset of the system clock
should be reported (4 or 1) and when it should be reported (abs(4) >
logchange or abs(1) > logchange).
Ahhhhh - too many small numbers - is the "1" from the "5-4" or from the "4-3"?
It's the "4-3".
Post by James Feeney
I do not understand in what sense the clock is then wrong by 1 second. And, I am thinking that the log message should never say that - too confusing. If I am understanding, you are saying that currently, a message like that would be logged - and maybe there should not be a message like that. It is not immediately evident to me why a message like that would be informative. It may be better to only log a simpler message, describing the *current* relationship between the system clock and the NTP server clock.
I think the idea was to warn the user when the clock needs a larger
adjustment. The logchange directive which configures this message is
described in the manual as "the threshold for the adjustment of the
system clock that will generate a syslog message. An example ...
message to be generated if a system clock error of over 0.1 seconds
starts to be compensated".

So, if there is a large adjustment running and a new measurement says
the offset of the clock is not what expect (e.g. the clock or the
server has drifted for some reason), should we report by how much the
adjustment which is still running had to be adjusted, or the total
amount of the remaining adjustment? Currently, it's the former.
Post by James Feeney
So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds."
What value should be reported if the previous slew didn't finish yet?
That's the normal case. It's just that the slew is rarely larger than
logchange.
--
Miroslav Lichvar
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-11-09 20:38:18 UTC
Permalink
Post by Miroslav Lichvar
So, if there is a large adjustment running and a new measurement says
the offset of the clock is not what expect (e.g. the clock or the
server has drifted for some reason), should we report by how much the
adjustment which is still running had to be adjusted, or the total
amount of the remaining adjustment? Currently, it's the former.
Well, my intuition and naive assumption would be that any allusion to a system clock offset was describing an "absolute offset", relative to the NTP server clock, and NOT relative to some chronyd internal process, which, in this case, would be the slewing process. And I also naively would expect that other people would make that same assumption.

Any log message that was referring to some offset of the system clock relative to some clock *other* than the NTP server clock, in my mind, should require some very very clear and emphasized language. Even then, how would that be worded? I am still not clear about this offset that is being reported! A log message would seem to require some wording as "involved" as the expression of your example: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3 seconds." I don't even know how often "another measurement is made", so I cannot put that offset error in a precise context.

And, why would anyone care about this "differential slew error"? Is there some attempt being made to illustrate a possible hardware fault, for instance, where the system clock is drifting faster than the chrony slewing algorithm is able to track? And then, if that were the case, I would expect a different log message, such as: "Unable to correct system clock - excessive drift."
Post by Miroslav Lichvar
Post by James Feeney
So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds."
What value should be reported if the previous slew didn't finish yet?
That's the normal case. It's just that the slew is rarely larger than
logchange.
My instinctive answer is "always the absolute offset of the system clock relative to the NTP server clock."

But, I don't actually have enough information to understand what is meant by "didn't finish yet". When, or how often, and in what manner is this "previous slew" measured? Is there some "current slew" that is distinct from a "previous slew"? Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock?

My intuitive assumption would be that the slewing process is simply initiated when chronyd is started, and that the slewing process changes its rate dynamically, however much is required to "catch up" the system clock to the NTP server clock, and that a single slewing process will continue until synchronization is achieved. Does "slewing" work like that? Or some other way?

I have no idea how often chronyd actually "tweaks" the system clock, whether chronyd is in a "slewing" state, or even in a "synchronized" state. Is there some hard-coded maximum allowed offset before the system clock is "tweaked"? Or is there a periodic "tweak" rate? And is this "tweak" interval dynamically adjusted? Is the system clock "frequency" automatically "scaled" by the kernel? I know that those are kind of basic concepts, but I really don't know what chronyd is doing at that low level.

I'm appreciating your patience with this explanation...
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-11-09 21:22:45 UTC
Permalink
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by James Feeney
Post by Miroslav Lichvar
So, if there is a large adjustment running and a new measurement says
the offset of the clock is not what expect (e.g. the clock or the
server has drifted for some reason), should we report by how much the
adjustment which is still running had to be adjusted, or the total
amount of the remaining adjustment? Currently, it's the former.
Well, my intuition and naive assumption would be that any allusion to a system clock offset was describing an "absolute offset", relative to the NTP server clock, and NOT relative to some chronyd internal process, which, in this case, would be the slewing process. And I also naively would expect that other people would make that same assumption.
That is unclear. Chrony knows that it is out by a certain amount. That is why
it is slewing the clock, and in a few seconds or minutes the system time will
be exactly what it thinks NTP time is. It now finds it is out by a second.
Does it report that it is out by a second or by that second plus the three
seconds that it already is correcting.
Post by James Feeney
Any log message that was referring to some offset of the system clock relative to some clock *other* than the NTP server clock, in my mind, should require some very very clear and emphasized language. Even then, how would that be worded? I am still not clear about this offset that is being reported! A log message would seem to require some wording as "involved" as the expression of your example: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3 seconds." I don't even know how often "another measurement is made", so I cannot put that offset error in a precise context.
Precisely, so why would you report that?
Post by James Feeney
And, why would anyone care about this "differential slew error"? Is there some attempt being made to illustrate a possible hardware fault, for instance, where the system clock is drifting faster than the chrony slewing algorithm is able to track? And then, if that were the case, I would expect a different log message, such as: "Unable to correct system clock - excessive drift."
Post by Miroslav Lichvar
Post by James Feeney
So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds."
The logs are not the place to give a complete description of the theory of
chrony operations. That would just be silly, verbost and totally confusing.
One could put that into the documentation.
Post by James Feeney
Post by Miroslav Lichvar
What value should be reported if the previous slew didn't finish yet?
That's the normal case. It's just that the slew is rarely larger than
logchange.
My instinctive answer is "always the absolute offset of the system clock relative to the NTP server clock."
Why? Possibly because your naive expectation is "If the clock is out, fix it
immediately" But that is not what chrony does.
Post by James Feeney
But, I don't actually have enough information to understand what is meant by "didn't finish yet". When, or how often, and in what manner is this "previous slew" measured? Is there some "current slew" that is distinct from a "previous slew"? Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock?
Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error. It is
more important that the the clock deliver a smooth time than that it deliver
the correct time. For example, backstepping the clock (so that there are two
times at which the system clock shows 1496549 seconds for example). That kind
of backstepping can cause havoc to the filesystem.
Post by James Feeney
My intuitive assumption would be that the slewing process is simply initiated when chronyd is started, and that the slewing process changes its rate dynamically, however much is required to "catch up" the system clock to the NTP server clock, and that a single slewing process will continue until synchronization is achieved. Does "slewing" work like that? Or some other way?
chrony keeps measuring its deviation from the ntp time, and slews the clock,
at various rates, to correct for difference. This takes place all the time.
Post by James Feeney
I have no idea how often chronyd actually "tweaks" the system clock, whether chronyd is in a "slewing" state, or even in a "synchronized" state. Is there some hard-coded maximum allowed offset before the system clock is "tweaked"? Or is there a periodic "tweak" rate? And is this "tweak" interval dynamically adjusted? Is the system clock "frequency" automatically "scaled" by the kernel? I know that those are kind of basic concepts, but I really don't know what chronyd is doing at that low level.
It tweaks it every time it makes a measurement. And not to all your questions.
You really need to read about the theory of ntpd clock adjustment which is
that same to lowest order as that of chrony.
Post by James Feeney
I'm appreciating your patience with this explanation...
--
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-11-10 02:44:06 UTC
Permalink
Post by Bill Unruh
That is unclear. Chrony knows that it is out by a certain amount. That is why
it is slewing the clock, and in a few seconds or minutes the system time will
be exactly what it thinks NTP time is. It now finds it is out by a second.
Does it report that it is out by a second or by that second plus the three
seconds that it already is correcting.
What does that mean, "already is correcting"? Either the system clock current instantaneous offset is 1 second, 3 seconds, or 4 seconds, relative to the NTP server clock.
Post by Bill Unruh
Precisely, so why would you report that?
Indeed. I do not see any reason to report that.
Post by Bill Unruh
The logs are not the place to give a complete description of the theory of
chrony operations. That would just be silly, verbost and totally confusing.
One could put that into the documentation.
Yes. So, probably not useful to have a log message that even tries to report that "differential slew error".
Post by Bill Unruh
Why? Possibly because your naive expectation is "If the clock is out, fix it
immediately" But that is not what chrony does.
"Slewing" is a different thing from "fix it immediately". chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately".
Post by Bill Unruh
But, I don't actually have enough information to understand what is meant by "didn't finish yet".  When, or how often, and in what manner is this "previous slew" measured?  Is there some "current slew" that is distinct from a "previous slew"?  Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock?
Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error.
There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances.

When a "new measurement" is made of a "previously begun" slewing process - increasing or decreasing the rate of the system clock - the log message should be clear as to the nature of that measurement. What was measured?
Post by Bill Unruh
chrony keeps measuring its deviation from the ntp time, and slews the clock,
at various rates, to correct for difference. This takes place all the time.
So the system clock frequency is changed, rather than the clock value being increased or decreased with small discontinuous offsets greater than 2 counts. Clearly, that is a challenge for the kernel, since there is no variable frequency oscillator hardware, and some hardware counter will need to increment or decrement by 2 counts instead of by 1 count at some rate, generated using a second counter.
Post by Bill Unruh
It tweaks it every time it makes a measurement.
Presumably, a measurement of the current instantaneous absolute offset of the system clock relative to the NTP server clock. I believe that that is - or would be - the intuitive interpretation of most users. The idea that this "measurement" reported in the the log message would represent something much more abstract - and obtuse - than that is - unfortunate.
Post by Bill Unruh
And not to all your questions. You really need to read about the theory of ntpd clock adjustment which is
that same to lowest order as that of chrony.
That would not hurt, but the issue I am raising is about the meaning of a simple log message, not about the theory of the network time protocol. This does not need to be complicated. I believe that it should not be *made* complicated.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-11-10 04:05:12 UTC
Permalink
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by James Feeney
Post by Bill Unruh
That is unclear. Chrony knows that it is out by a certain amount. That is why
it is slewing the clock, and in a few seconds or minutes the system time will
be exactly what it thinks NTP time is. It now finds it is out by a second.
Does it report that it is out by a second or by that second plus the three
seconds that it already is correcting.
What does that mean, "already is correcting"? Either the system clock current instantaneous offset is 1 second, 3 seconds, or 4 seconds, relative to the NTP server clock.
The three seconds is being corrected, meaning that the clock has been slowed
down so that say in 10 seconds it would be out by 0 seconds. It is a known
error and it is an error that, if everything is left as it is, will be 0 in 10
sec.
Post by James Feeney
Post by Bill Unruh
Precisely, so why would you report that?
Indeed. I do not see any reason to report that.
But you are claiming you should report that 3 sec.
Post by James Feeney
Post by Bill Unruh
The logs are not the place to give a complete description of the theory of
chrony operations. That would just be silly, verbost and totally confusing.
One could put that into the documentation.
Yes. So, probably not useful to have a log message that even tries to report that "differential slew error".
Post by Bill Unruh
Why? Possibly because your naive expectation is "If the clock is out, fix it
immediately" But that is not what chrony does.
"Slewing" is a different thing from "fix it immediately". chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately".
No, it will only make a step if the step size is larger than some value.
Post by James Feeney
Post by Bill Unruh
But, I don't actually have enough information to understand what is meant by "didn't finish yet".  When, or how often, and in what manner is this "previous slew" measured?  Is there some "current slew" that is distinct from a "previous slew"?  Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock?
chrony measures the offset of the clock by doing a least squared fit on the
past N measurements offsets, to determine what the best value of the current
offset is. Once it has that information it alters the rate of the clock so
that in something like DT*2000 seconds (where DT is that measured offset) the offset will
be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the error
disappears in 10*DT if DT is very large and the DT is less than the value to
make a step.)
Post by James Feeney
Post by Bill Unruh
Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error.
There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances.
No, it is not because all the measurements have uncertainties-- due to travel
time fluctuations, due to measurement accuracy, due to interrupt handling
fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is.
Post by James Feeney
When a "new measurement" is made of a "previously begun" slewing process - increasing or decreasing the rate of the system clock - the log message should be clear as to the nature of that measurement. What was measured?
What is measured is the difference between the average of the time when the
ntp packet was sent out and when it was received again according to the local
clock , and the time when the
remote machine claimed to have received it according to the remote clock .
Post by James Feeney
Post by Bill Unruh
chrony keeps measuring its deviation from the ntp time, and slews the clock,
at various rates, to correct for difference. This takes place all the time.
So the system clock frequency is changed, rather than the clock value being increased or decreased with small discontinuous offsets greater than 2 counts. Clearly, that is a challenge for the kernel, since there is no variable frequency oscillator hardware, and some hardware counter will need to increment or decrement by 2 counts instead of by 1 count at some rate, generated using a second counter.
Yes, there is a variable frequency oscillator. Or rather the time in seconds
between the ticks of the cpu clock can be changed.
Post by James Feeney
Post by Bill Unruh
It tweaks it every time it makes a measurement.
Presumably, a measurement of the current instantaneous absolute offset of the system clock relative to the NTP server clock. I believe that that is - or would be - the intuitive interpretation of most users. The idea that this "measurement" reported in the the log message would represent something much more abstract - and obtuse - than that is - unfortunate.
No. In chrony it is the least squares linear fit of the last N measurements of
the offset that is used to estimate the current offset. Because the
instantaneously measured offset is beset by uncertainties that can be
"averaged out" by doing a least squares fit. N is chosen actively to make sure
that over that set of N measurements the offsets are well approximated by a
straight line with gaussian noise.
Unfortunately in order to get a good estimate, one HAS to use "abstract - and
obtuse" in your words, procedures. chrony's way of doing it is a factor of 3
to a 20 times better than ntpd's method according to measurements that both I
and Lichvar have made.
Post by James Feeney
Post by Bill Unruh
And not to all your questions. You really need to read about the theory of ntpd clock adjustment which is
that same to lowest order as that of chrony.
That would not hurt, but the issue I am raising is about the meaning of a simple log message, not about the theory of the network time protocol. This does not need to be complicated. I believe that it should not be *made* complicated.
But what is reported in the logs is precisely tied in with how chrony
estimates the uncertainties.

Note that if you want the raw measurements of the offsets look in
/var/log/chrony/measurements, and /var/log/chrony/refclock logs. There you
will be given the raw offsets of each measurement of the local clock against
the estimate of the time given by the remote server. If that is what you want
to look at, look there. In the tracking log, and in chronyc tracking and otherwise, the best estimate
of the current offset from the time of the servers is reported instead.
Post by James Feeney
--
James Feeney
2017-11-10 19:01:54 UTC
Permalink
Bill, I really do not understand what point you are trying to present here, with respect to the system log message. I was only suggesting that the log message should be more clear as to, let us call it, "leading" and "lagging", "forward" and "backward".
Post by Bill Unruh
The three seconds is being corrected, meaning that the clock has been slowed
down so that say in 10 seconds it would be out by 0 seconds. It is a known
error and it is an error that, if everything is left as it is, will be 0 in 10
sec.
Post by Bill Unruh
Precisely, so why would you report that?
Indeed.  I do not see any reason to report that.
But you are claiming you should report that 3 sec.
Uhm - 4seconds, using Miroslav's example. I am saying that the log message should report that the system clock offset is currently 4seconds. I do not understand where you interpret a "but" out of that. Miroslav has said that, presently, chronyd will report instead, what I understand to be a kind of "differential slew error", which, in this example, would be "1second", not "4seconds". And, I was suggesting that the log message should *not* report "1second".

What is it that you are thinking that the log message should report?

Below, you say that chronyd, by slewing the clock, will correct an offset in "10*DT", "where DT is that measured offset". We see that 10*4seconds = 40seconds, not the 10seconds in your example.

Think of this another way. Observing the display time, adding or subtracting 1second to the display will not give the correct time when the clock is off by 4seconds.
Post by Bill Unruh
"Slewing" is a different thing from "fix it immediately".  chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately".
No, it will only make a step if the step size is larger than some value.
Of course. I should not have to remind you that the "makestep" option takes a "threshold" argument? What is your point?
Post by Bill Unruh
chrony measures the offset of the clock by doing a least squared fit on the
past N measurements offsets, to determine what the best value of the current
offset is. Once it has that information it alters the rate of the clock so
that in something like DT*2000 seconds (where DT is that measured offset) the
offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the
error disappears in 10*DT if DT is very large and the DT is less than the value
to make a step.)
That's useful to know.
Post by Bill Unruh
Post by Bill Unruh
Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error.
There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances.
No, it is not because all the measurements have uncertainties-- due to travel
time fluctuations, due to measurement accuracy, due to interrupt handling
fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is.
You state the obvious. What is your point?

If I wanted to be pedantic - which I often do - I might have said "the instantaneous offset estimate", instead of simply "the instantaneous offset". But, we are all adults here, yes?
Post by Bill Unruh
Yes, there is a variable frequency oscillator. Or rather the time in seconds
between the ticks of the cpu clock can be changed.
More of an electronics hardware guy would take issue with that blurring of the distinction between analog circuitry and digital circuitry. ;)
Post by Bill Unruh
Presumably, a measurement of the current instantaneous absolute offset of the system clock relative to the NTP server clock.  I believe that that is - or would be - the intuitive interpretation of most users.  The idea that this "measurement" reported in the the log message would represent something much more abstract - and obtuse - than that is - unfortunate.
No. In chrony it is the least squares linear fit of the last N measurements of
the offset that is used to estimate the current offset. Because the
instantaneously measured offset is beset by uncertainties that can be
"averaged out" by doing a least squares fit. N is chosen actively to make sure
that over that set of N measurements the offsets are well approximated by a
straight line with gaussian noise. Unfortunately in order to get a good estimate, one HAS to  use "abstract - and
obtuse" in your words, procedures. chrony's way of doing it is a factor of 3
to a 20 times better than ntpd's method according to measurements that both I
and Lichvar have made.
That's nice. Still, you seem to misunderstand what I am referring to as "obtuse". Let me try again. Remember that the topic here is the log message, and nothing about the internal workings of chornyd - which we all admire - and nothing about the theory of the NTP - which is the subject of several standards.

And also, let me remind that the topic here is the *system log* message, and nothing to do with the "/var/log/chrony/*log" log messages, which you have referenced previously. Are we on the same page now?

I repeat the explanation for chronyd's current version of the log message: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3, a message will be logged that the clock is wrong by 1 second." *That message* is what I have called "obtuse".

Perhaps you disagree with Miroslav about what is being reported?

Now, this is information that I do *not* want to know. And, I have not heard an argument that this is information that *anyone* wants to know. So, I have suggested that this log message not be printed. Instead, I have suggested that a simpler log message be printed, a message which describes the current instantaneous offset, of the system clock relative to the NTP server clock *estimate*.

Do you think also that that is a good idea? Or, do you think that that is a bad idea?

Bill, so far, you have failed to say what it is that you think *is* being reported in the system log, and have not said what you think chronyd *should* be reporting in the system log.
Post by Bill Unruh
That would not hurt, but the issue I am raising is about the meaning of a simple log message, not about the theory of the network time protocol.  This does not need to be complicated.  I believe that it should not be *made* complicated.
But what is reported in the logs is precisely tied in with how chrony
estimates the uncertainties.
Ha! Oh, I'm sure it's "tied in" there somewhere. But, as I said, "obtuse", not interested.

Or maybe you are again talking about the "/var/log/chrony/*log" log messages? That would be off topic.
Post by Bill Unruh
Note that if you want the raw measurements of the offsets look in
/var/log/chrony/measurements, and /var/log/chrony/refclock logs. There you
will be given the raw offsets of each measurement of the local clock against
the estimate of the time given by the remote server. If that is what you want
to look at, look there. In the tracking log, and in chronyc tracking  and otherwise, the best estimate
of the current offset from the time of the servers  is reported instead.
As I have tried to explain, that is *not* what I want to see. I only want to see something ever so slightly more informative than the term "wrong", as used now in the system log.

I do not get the impression that you disagree. But you do seem to be going off on tangents - informative, but not entirely relevant.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-11-10 20:06:07 UTC
Permalink
Post by James Feeney
Bill, I really do not understand what point you are trying to present here, with respect to the system log message. I was only suggesting that the log message should be more clear as to, let us call it, "leading" and "lagging", "forward" and "backward".
You are requesting something further. You are also requesting a change in the
value that is reported, not just making the sign of the error clearer. I am in
perfect agreement with the request for making the meaning of the sign clearer.
I am (perhaps) not in agreement with the value you want reported.
Post by James Feeney
Post by Bill Unruh
The three seconds is being corrected, meaning that the clock has been slowed
down so that say in 10 seconds it would be out by 0 seconds. It is a known
error and it is an error that, if everything is left as it is, will be 0 in 10
sec.
Post by Bill Unruh
Precisely, so why would you report that?
Indeed.  I do not see any reason to report that.
But you are claiming you should report that 3 sec.
Uhm - 4seconds, using Miroslav's example. I am saying that the log message should report that the system clock offset is currently 4seconds. I do not understand where you interpret a "but" out of that. Miroslav has said that, presently, chronyd will report instead, what I understand to be a kind of "differential slew error", which, in this example, would be "1second", not "4seconds". And, I was suggesting that the log message should *not* report "1second".
There are two components of that 4 seconds. One is the three seconds which is
already being corrected by the alteration of the slew. The other is the 1
second which is a new offset. The three seconds is already being corrected,
and to report it now again may be confusing.
Post by James Feeney
What is it that you are thinking that the log message should report?
From what Lichvar says, it seems that it now reports the 1 sec which is an
additional uncorrected error.
The measurments log already reports the raw current offset and the corrected
offset.
Post by James Feeney
Below, you say that chronyd, by slewing the clock, will correct an offset in "10*DT", "where DT is that measured offset". We see that 10*4seconds = 40seconds, not the 10seconds in your example.
Think of this another way. Observing the display time, adding or subtracting 1second to the display will not give the correct time when the clock is off by 4seconds.
But that is not what the log reports. Note that as I understand it, chrony
itself reports to other machines the corrected time -- the system time minus
the 3 seconds,( but I am not sure of this). Ie, there is the system time, and the
system time corrected for the ongoing slew,
Post by James Feeney
Post by Bill Unruh
"Slewing" is a different thing from "fix it immediately".  chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately".
No, it will only make a step if the step size is larger than some value.
Of course. I should not have to remind you that the "makestep" option takes a "threshold" argument? What is your point?
Post by Bill Unruh
chrony measures the offset of the clock by doing a least squared fit on the
past N measurements offsets, to determine what the best value of the current
offset is. Once it has that information it alters the rate of the clock so
that in something like DT*2000 seconds (where DT is that measured offset) the
offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the
error disappears in 10*DT if DT is very large and the DT is less than the value
to make a step.)
That's useful to know.
Post by Bill Unruh
Post by Bill Unruh
Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error.
There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances.
No, it is not because all the measurements have uncertainties-- due to travel
time fluctuations, due to measurement accuracy, due to interrupt handling
fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is.
You state the obvious. What is your point?
I have no idea what is obvious to you.
Post by James Feeney
If I wanted to be pedantic - which I often do - I might have said "the instantaneous offset estimate", instead of simply "the instantaneous offset". But, we are all adults here, yes?
Post by Bill Unruh
Yes, there is a variable frequency oscillator. Or rather the time in seconds
between the ticks of the cpu clock can be changed.
More of an electronics hardware guy would take issue with that blurring of the distinction between analog circuitry and digital circuitry. ;)
It is all digital. The cpu clock ticks ticks at a certain, unknown rate. The
system counts those ticks and converts them to time. That coversion process
must be calbrated and that calibration can be changed. It is primarily that
change in calibration that chrony or ntp controls--as I understand it.
Post by James Feeney
Post by Bill Unruh
Presumably, a measurement of the current instantaneous absolute offset of the system clock relative to the NTP server clock.  I believe that that is - or would be - the intuitive interpretation of most users.  The idea that this "measurement" reported in the the log message would represent something much more abstract - and obtuse - than that is - unfortunate.
No. In chrony it is the least squares linear fit of the last N measurements of
the offset that is used to estimate the current offset. Because the
instantaneously measured offset is beset by uncertainties that can be
"averaged out" by doing a least squares fit. N is chosen actively to make sure
that over that set of N measurements the offsets are well approximated by a
straight line with gaussian noise. Unfortunately in order to get a good estimate, one HAS to  use "abstract - and
obtuse" in your words, procedures. chrony's way of doing it is a factor of 3
to a 20 times better than ntpd's method according to measurements that both I
and Lichvar have made.
That's nice. Still, you seem to misunderstand what I am referring to as "obtuse". Let me try again. Remember that the topic here is the log message, and nothing about the internal workings of chornyd - which we all admire - and nothing about the theory of the NTP - which is the subject of several standards.
And also, let me remind that the topic here is the *system log* message, and nothing to do with the "/var/log/chrony/*log" log messages, which you have referenced previously. Are we on the same page now?
But ignoring those messages means that you are in danger of unnecessary
duplication of logs. Why should one care, as long as one knows what it is,
what the system log reports? IF you want to use the reports from chrony, and
if the chrony/*log messages give you what you need, why do you care what is in
the system log messages?
Post by James Feeney
I repeat the explanation for chronyd's current version of the log message: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3, a message will be logged that the clock is wrong by 1 second." *That message* is what I have called "obtuse".
There are two clocks-- the chrony clock and the system clock, which may not be
the same if there is a correction slew going on.
Post by James Feeney
Perhaps you disagree with Miroslav about what is being reported?
Nope. I have not spent the time recently going through the source code to
figure out exactly what is being reported.
Post by James Feeney
Now, this is information that I do *not* want to know. And, I have not heard an argument that this is information that *anyone* wants to know. So, I have suggested that this log message not be printed. Instead, I have suggested that a simpler log message be printed, a message which describes the current instantaneous offset, of the system clock relative to the NTP server clock *estimate*.
Do you think also that that is a good idea? Or, do you think that that is a bad idea?
It all depends on the truth of your contention that this is not "information
that *anyone* wants to know". Certainly if noone wants to know that
information, then reporting it is silly. Unfortunately I do not have the
expertise to know what whether or not your contention is true. I grant that
you do not want to know it.

So, The clock is out by 4 seconds. three of those seconds are being actively
fixed. One is an additional offset that needs now to be fixed. Is it better to
know about that additional offset or about the total? I can certainly see
arguments for either, especially as, if you really want to know, there is a
whole bunch of information in the chrony/*log files which you can look at if
you want more detailed information.
Post by James Feeney
Bill, so far, you have failed to say what it is that you think *is* being reported in the system log, and have not said what you think chronyd *should* be reporting in the system log.
Post by Bill Unruh
That would not hurt, but the issue I am raising is about the meaning of a simple log message, not about the theory of the network time protocol.  This does not need to be complicated.  I believe that it should not be *made* complicated.
But what is reported in the logs is precisely tied in with how chrony
estimates the uncertainties.
Ha! Oh, I'm sure it's "tied in" there somewhere. But, as I said, "obtuse", not interested.
Or maybe you are again talking about the "/var/log/chrony/*log" log messages? That would be off topic.
Post by Bill Unruh
Note that if you want the raw measurements of the offsets look in
/var/log/chrony/measurements, and /var/log/chrony/refclock logs. There you
will be given the raw offsets of each measurement of the local clock against
the estimate of the time given by the remote server. If that is what you want
to look at, look there. In the tracking log, and in chronyc tracking  and otherwise, the best estimate
of the current offset from the time of the servers  is reported instead.
As I have tried to explain, that is *not* what I want to see. I only want to see something ever so slightly more informative than the term "wrong", as used now in the system log.
I will agree that "wrong" is insufficiently precise and is worth changing to
"ahead" or "behind" or something that tells in which direction it is wrong.
That is not where my argument lies. As to what I want, I really do not care.
In my logs the offset is of the order of microseconds or nanoseconds, and that
is far more accurate than I need. Once it gets up to seconds, something is
seriously wrong, and it needs fixing, and I can than look at the chrony/*log
files to get a much more detailed view of the situation.
Post by James Feeney
I do not get the impression that you disagree. But you do seem to be going off on tangents - informative, but not entirely relevant.
Perhaps because it is still not clear to me what it is that you regard as
relevant and certainly because I donot know what your level of knowledge is of
how chrony works.

That report in the syslogs is basically an emergency. The system clock is way
out and chrony has started correcting it. Once it has told you it has started
correcting it, why do you need more information. And if another huge offset
comes in that it needs to tell you about, it tells you what that new
additional correction is. You do not want to know every 4 seconds or every
hour that the clock is still out and that the correction is proceeding, do
you? But if something additional comes in that now needs additional work to
correct it, that is what you want to know surely.
So on my system, these are the three reports over the past two years.
Apr 14 14:45:17 info chronyd[2011]: System clock wrong by 1.013722 seconds,
adjustment started
Dec 15 15:31:14 info chronyd[15415]: System clock wrong by 1007.000208
seconds, adjustment started
Dec 15 15:30:55 info chronyd[15415]: System clock wrong by 1280.000250
seconds, adjustment started

Did I want to receive notifications all during the slew about how wrong the
offset was now? Not particularly. Had there been additional huge offsets, I
probably would want to know about them.
James Feeney
2017-11-14 18:08:11 UTC
Permalink
Yes, the "system clock wrong by" messages were meant to be stacked on
top of each other. If we change it to report the current remaining
correction, it won't be clear how much the clock has drifted between
the messages. If the log had
system clock behind 10 seconds
system clock behind 9 seconds
the clock could have drifted by any value between the logchange
threshold and 9 seconds.
If you were to ask me to describe to some other third person what it is that is now being reported by chronyd in these log messages, based upon my own understanding, derived from your descriptions and Bill's descriptions, I would be unable to provide any kind of precise explanation. I do not mean to be difficult or contrarian. I just do not understand what is being expressed with these "messages ... stacked on top of each other."

You seem to be describing some kind of log message that has no meaning in isolation - like an episode of a TV show that makes no sense unless you've seen the entire series.

Personally, I would say that that is just very bad User Interface design. In contrast, I believe that each log message should be "self contained", that it should have a clear and precise meaning, even in isolation.

So far, after much back and forth, I have come away with an incomplete understanding.

Miroslav, if you really believe in the importance of this "stacked messages", "current remaining correction" concept, then please write a precise mathematical expression for what you are describing, beginning first with a complete list of variable names and their descriptions.

So far, the variables bandied about include:
1) the NTP server clock estimate
2) a not always present on-board Hardware Clock
3) the kernel's System Clock
4) some "hidden" chronyd internal clock
5-10) the Six offsets between each pair of those "clocks"
11-X) some, as yet, uncertain set of "partial offsets"
Y-Z) some, as yet, uncertain set of "cumulative offsets"
A-B) possibly some additional set of statistical "running estimates"
- What value should be compared with the logchange threshold?
- What value should be reported in the message?
- How should be that value described in the message?
Please be certain that each of these proposed "value" choices is included in the list of variable names and the list of equations.

I just like when chronyd reports what it is doing when it starts-up. Still, based upon my best guess, I don't think I care about any "remaining correction" that has no relationship to any "user visible" quantity. I assume that you know the story: "The answer is - 42! But, what is the question?" Does chronyd harbor some "hidden variables"?
The old "system clock wrong" I interpreted as an error
that the clock acquired between updates, but "ahead/behind" I'd
interpret relative to NTP or UTC.
So, you are actually talking about two different things? "acquired between updates", and "relative to [ the NTP server clock estimate ]"? But, "acquired", is relative to what? The NTP server clock estimate?

I'd suggest that two different things should be treated as two different things, with proper descriptions, so that they are not confused. Different log messages could say:

"System clock acquired a 2second lead between NTP clock estimate updates."
"System clock is leading the NTP clock estimate by 2seconds."

But then, what's the difference?
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Denny Page
2017-11-14 19:46:42 UTC
Permalink
I think the core issue is that the log messages are intended to communicate expressions of multiple complex states. I think of it in this simplified way: This is the state from 3 minutes ago; This is the correction action that I computed for that state which was scheduled to be completed 2 minutes from now; This is the state that I computed that I should currently be in based on that correction process; This is the state that I am actually in, and it doesn’t match the computed state. All of this is very difficult to reduce to a single layman oriented log message.

2017-11-14 11:39:17 chronyd: System clock appears to be unreliable

Denny
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-11-14 19:58:22 UTC
Permalink
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by James Feeney
Yes, the "system clock wrong by" messages were meant to be stacked on
top of each other. If we change it to report the current remaining
correction, it won't be clear how much the clock has drifted between
the messages. If the log had
system clock behind 10 seconds
system clock behind 9 seconds
the clock could have drifted by any value between the logchange
threshold and 9 seconds.
If you were to ask me to describe to some other third person what it is that is now being reported by chronyd in these log messages, based upon my own understanding, derived from your descriptions and Bill's descriptions, I would be unable to provide any kind of precise explanation. I do not mean to be difficult or contrarian. I just do not understand what is being expressed with these "messages ... stacked on top of each other."
Each message says that the clock has suffered an unexpected shift in time from
what it should be at that time, and the message says tht that shift is being
corrected. The key works are "unexpected" and "shift". It may also have
suffered a shift in the past, but that past shift is being corrected already
and so it is not unexpected.
Post by James Feeney
You seem to be describing some kind of log message that has no meaning in isolation - like an episode of a TV show that makes no sense unless you've seen the entire series.
Personally, I would say that that is just very bad User Interface design. In contrast, I believe that each log message should be "self contained", that it should have a clear and precise meaning, even in isolation.
Interpreted as above I believe it does have meaning in isolation. "Something
happened to the time and chrony will not try to correct it."
Post by James Feeney
So far, after much back and forth, I have come away with an incomplete understanding.
Miroslav, if you really believe in the importance of this "stacked messages", "current remaining correction" concept, then please write a precise mathematical expression for what you are describing, beginning first with a complete list of variable names and their descriptions.
1) the NTP server clock estimate
2) a not always present on-board Hardware Clock
That has nothing to do with chrony except when the system starts up. So never
worry about that.
Post by James Feeney
3) the kernel's System Clock
4) some "hidden" chronyd internal clock
Chrony corrects errors by altering the drift of the clock (assuming that the
error is not so large as to demand a step of the clock).
Post by James Feeney
5-10) the Six offsets between each pair of those "clocks"
Ignore the hardware clock. It is never of any relevance.
Post by James Feeney
11-X) some, as yet, uncertain set of "partial offsets"
No such thing.
Post by James Feeney
Y-Z) some, as yet, uncertain set of "cumulative offsets"
No such thing.
Post by James Feeney
A-B) possibly some additional set of statistical "running estimates"
No such thing.
The time is out by 5 sec. The rate of the clock is changed so that that 5 sec
offset should go to 0 in 100 seconds. The difference between that prediction
and the estimate of the NTP time is the "current remaining offset".
Post by James Feeney
- What value should be compared with the logchange threshold?
- What value should be reported in the message?
- How should be that value described in the message?
Please be certain that each of these proposed "value" choices is included in the list of variable names and the list of equations.
I think that you first need to go and study the theory of how ntp and chrony
operate. It is really a bit much to be asking Miroslav to give you a complete
less in the theory of chrony.
Post by James Feeney
I just like when chronyd reports what it is doing when it starts-up. Still, based upon my best guess, I don't think I care about any "remaining correction" that has no relationship to any "user visible" quantity. I assume that you know the story: "The answer is - 42! But, what is the question?" Does chronyd harbor some "hidden variables"?
Yes, many. The last N measurements of the offsets from the measured NTP time
from the server, corrected for changes in rate which have occured since they
were measured. These are used to make a least squares fit to estimate what the
difference in rate and time of the system clock and the NTP server's clock.
Post by James Feeney
The old "system clock wrong" I interpreted as an error
that the clock acquired between updates, but "ahead/behind" I'd
interpret relative to NTP or UTC.
So, you are actually talking about two different things? "acquired between updates", and "relative to [ the NTP server clock estimate ]"? But, "acquired", is relative to what? The NTP server clock estimate?
Relative to the corrected chrony time.
Post by James Feeney
"System clock acquired a 2second lead between NTP clock estimate updates."
"System clock is leading the NTP clock estimate by 2seconds."
But then, what's the difference?
Learn how chrony operates.
Post by James Feeney
--
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2017-11-20 14:30:44 UTC
Permalink
Post by James Feeney
You seem to be describing some kind of log message that has no meaning in isolation - like an episode of a TV show that makes no sense unless you've seen the entire series.
I think it has a meaning in isolation. It says that the chronyd's idea
of the true time has shifted by x seconds, even though we don't know
what exactly that is or why it changed.

How to describe that in a syslog message?
Post by James Feeney
Personally, I would say that that is just very bad User Interface design. In contrast, I believe that each log message should be "self contained", that it should have a clear and precise meaning, even in isolation.
So far, after much back and forth, I have come away with an incomplete understanding.
Miroslav, if you really believe in the importance of this "stacked messages", "current remaining correction" concept, then please write a precise mathematical expression for what you are describing, beginning first with a complete list of variable names and their descriptions.
I suspect that description would be very long, for either of those
values, and it couldn't be precisely described in a syslog message.
There are at least four different offsets that chronyd works with,
starting with individual NTP measurements and ending with an estimate
of the offset of the system clock that should be corrected.

BTW, there is an interesting case that could happen if the message was
changed to report the "true" offset of the system clock, but kept the
threshold based on the change. It could report an offset smaller than
the threshold. For example, if the first update was to correct a 10
second offset, and in the middle of the correction there was an update
of -5 seconds, there would be a message saying that the "true" offset
is zero.
--
Miroslav Lichvar
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Gerriet M. Denkmann
2017-11-22 06:49:50 UTC
Permalink
The sub-makes (like make docs) do not work if the make filename (“/Volumes/x.12 + เม่น/Applications/Xcode.app/Contents/Developer/usr/bin/make” in my case) contains any spaces.

Changing all occurrences of : $(MAKE) to: ‘$(MAKE)’ in chrony-3.2/Makefile.in seems to fix this problem.


Kind regards,

Gerriet.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2017-11-22 07:57:11 UTC
Permalink
Post by Gerriet M. Denkmann
The sub-makes (like make docs) do not work if the make filename (“/Volumes/x.12 + เม่น/Applications/Xcode.app/Contents/Developer/usr/bin/make” in my case) contains any spaces.
Changing all occurrences of : $(MAKE) to: ‘$(MAKE)’ in chrony-3.2/Makefile.in seems to fix this problem.
You mean 'make' using the apostrophe character (ASCII 0x27)?

I have never seen commands in a Makefile quoted like that. As I
understand it, make generally does not support filenames with spaces.

Do other projects build ok with paths like that? From a quick search,
it looks like creating a symlink which does not contain spaces is a
recommended work around.
--
Miroslav Lichvar
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Gerriet M. Denkmann
2017-11-22 08:32:48 UTC
Permalink
Post by Gerriet M. Denkmann
The sub-makes (like make docs) do not work if the make filename (“/Volumes/x.12 + เม่น/Applications/Xcode.app/Contents/Developer/usr/bin/make” in my case) contains any spaces.
Changing all occurrences of : $(MAKE) to: ‘$(MAKE)’ in chrony-3.2/Makefile.in seems to fix this problem.
You mean ‘make' using the apostrophe character (ASCII 0x27)?
Yes. Unicode APOSTROPHE = 0x27.
I have never seen commands in a Makefile quoted like that. As I
understand it, make generally does not support filenames with spaces.
I am not very experienced with Makefiles; but I think all Unix stuff supports spaces; although sometimes some additional efforts (like quoting) are required.

Using APOSTROPHEs may well not be the correct way to do this - but it works (with GNU Make 3.81) and I do not know of any better way of quoting.


Kind regards,

Gerriet.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-10-30 20:09:22 UTC
Permalink
existing scripts that parse the log
I am not familiar with any of those scripts - who parses chrony log files? - so my initial reaction is to be unsympathetic. But I don't know - how important is that?
Do you think it would make sense to keep the sign ...
I am confused - unclear about what chrony is actually reporting. In my example, the implicit sign was positive, '+'. But then, in apparent contradiction, "chronyc tracking" reports "Frequency : 5.626 ppm slow", which would have accumulated a *negative* offset, '-', of the system time relative to the actual UTC time.

Is there really a sign associated with the system time reported in the log?
and indicate whether it's fast or slow in parentheses?
If that is necessary, yes, that would work. I'm just not clear about the sign thing, and that the term "wrong" with a back-reference needs to be used.

Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words "forward" and "backward" in parentheses. Still, saying "stepped backward by -2.000000 seconds" would have the flavor of a double negative. Similarly, the phrase "slow by -2.000000 seconds" would have a nagging ambiguity about it.

If chrony were to use explicitly signed numbers in the log messages, the parenthetic slow/fast and forward/backward would be most appropriate.

But then, I'd like to see an explicit 'plus' sign on the numbers, when appropriate, along with the explicit 'minus' sign, to make clear that the numbers are actually signed, and are not simply magnitudes.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-10-30 20:42:09 UTC
Permalink
Post by James Feeney
existing scripts that parse the log
I am not familiar with any of those scripts - who parses chrony log files? - so my initial reaction is to be unsympathetic. But I don't know - how important is that?
I certainly do. Mind you I convert the time to seconds myself with awk.
Post by James Feeney
Do you think it would make sense to keep the sign ...
I am confused - unclear about what chrony is actually reporting. In my example, the implicit sign was positive, '+'. But then, in apparent contradiction, "chronyc tracking" reports "Frequency : 5.626 ppm slow", which would have accumulated a *negative* offset, '-', of the system time relative to the actual UTC time.
The frequency and the offset are not really related to each other.
Post by James Feeney
Is there really a sign associated with the system time reported in the log?
Yes. The offset can be positive or negative.
Post by James Feeney
and indicate whether it's fast or slow in parentheses?
If that is necessary, yes, that would work. I'm just not clear about the sign thing, and that the term "wrong" with a back-reference needs to be used.
Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words "forward" and "backward" in parentheses. Still, saying "stepped backward by -2.000000 seconds" would have the flavor of a double negative. Similarly, the phrase "slow by -2.000000 seconds" would have a nagging ambiguity about it.
If you were going to keep the sign, then you would choose one or the other
(ahead or behind is far better than fast or slow, since those are rate terms,
not offset term. A clock can be slow and still be ahead. Eg the clock could
tick only 59 times per minute, and yet be set ahead.

So, the clock is -3.564452 ahead of NTP time is pretty unambiguous.
Post by James Feeney
If chrony were to use explicitly signed numbers in the log messages, the parenthetic slow/fast and forward/backward would be most appropriate.
It does explicity use signed numbers.

A few lines from my measurements.log

========================================
Date (UTC) Time IP Address L St 1234 abc 5678 LP RP Score Offset Peer del. Peer disp. Root del. Root disp.
========================================================================================================================
2017-10-22 11:02:59 142.103.234.23 N 1 1111 111 1111 6 6 1.00 -3.600e-04 2.440e-02 1.007e-06 0.000e+00 3.052e-05

Note the -3.600 in the offset column.
Post by James Feeney
But then, I'd like to see an explicit 'plus' sign on the numbers, when appropriate, along with the explicit 'minus' sign, to make clear that the numbers are actually signed, and are not simply magnitudes.
--
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
James Feeney
2017-10-30 22:03:38 UTC
Permalink
Post by Bill Unruh
existing scripts that parse the log
I am not familiar with any of those scripts - who parses chrony log files? - so my initial reaction is to be unsympathetic.  But I don't know - how important is that?
I certainly do. Mind you I convert the time to seconds myself with awk.
Ok - and you would need to modify your script, if the word "wrong" was removed, but not if there were trailing text?
Post by Bill Unruh
Do you think it would make sense to keep the sign ...
I am confused - unclear about what chrony is actually reporting.  In my example, the implicit sign was positive, '+'.  But then, in apparent contradiction, "chronyc tracking" reports "Frequency       : 5.626 ppm slow", which would have accumulated a *negative* offset, '-', of the system time relative to the actual UTC time.
The frequency and the offset are not really related to each other.
Uhm - no. The system clock frequency and the instantaneous system clock offset on reboot really *are* related to each other.
Post by Bill Unruh
Is there really a sign associated with the system time reported in the log?
Yes. The offset can be positive or negative.
Ok. All of my hardware clocks must be slow.
Post by Bill Unruh
Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words "forward" and "backward" in parentheses.  Still, saying "stepped backward by -2.000000 seconds" would have the flavor of a double negative.  Similarly, the phrase "slow by -2.000000 seconds" would have a nagging ambiguity about it.
If you were going to keep the sign, then you would choose one or the other
(ahead or behind is far better than fast or slow, since those are rate terms,
not offset term. A clock can be slow and still be ahead. Eg the clock could
tick only 59 times per minute, and yet be set ahead.
Yes, ok, "ahead" and "behind".

But again, frequency errors in my system clock are the cause of instantaneous offsets in the current time. The time is not being set from scratch, with a clock that has never been set. Technically, there are two pieces of information to be stated: 1) the frequency error or drift rate, and 2) the instantaneous absolute offset error. The displayed units serve to distinguish these, "seconds per second" for a frequency error, and "seconds" for an offset error, and so this is already clear in the system log message. I suppose it depends on what language you are use to hearing, as far as fast/slow or ahead/behind.
Post by Bill Unruh
So, the clock is -3.564452 ahead of NTP time is pretty unambiguous.
Ha! No, that sounds like an explicitly contradictory statement to me. I'm not even sure what you are trying to say with that example.
Post by Bill Unruh
If chrony were to use explicitly signed numbers in the log messages, the parenthetic slow/fast and forward/backward would be most appropriate.
It does explicity use signed numbers.
A few lines from my measurements.log
========================================
   Date (UTC) Time     IP Address   L St 1234 abc 5678 LP RP Score Offset Peer del. Peer disp. Root del.  Root disp.
   ========================================================================================================================
   2017-10-22 11:02:59 142.103.234.23  N  1 1111 111 1111  6  6 1.00 -3.600e-04  2.440e-02  1.007e-06  0.000e+00  3.052e-05
Note the -3.600 in the offset column.
I'm confused - why are you presenting your measurements.log? The log file message that I was discussing was the chrony message in the system log.

So, when you said, above, that "The offset can be positive or negative", were you actually talking about your "measurements.log", and not the system log?

Was my meaning unclear? "Explicitly signed" means that a positive number always has an explicit leading '+'. Of course that is still only suggestive, that the number could just as well have been negative, and does not represent a simple magnitude. But that would be my interpretation.

I am back to guessing that chrony does not use signed numbers in the system log message, not having actually looked at the code myself.
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-10-30 23:12:23 UTC
Permalink
Post by James Feeney
Post by Bill Unruh
existing scripts that parse the log
I am not familiar with any of those scripts - who parses chrony log files? - so my initial reaction is to be unsympathetic.  But I don't know - how important is that?
I certainly do. Mind you I convert the time to seconds myself with awk.
Ok - and you would need to modify your script, if the word "wrong" was removed, but not if there were trailing text?
Yes, that occurs in the system logs (/var/log/messages, or journalctl ) and I
do not parse those. There is virtually nothing there in the older versions of
chrony. I go to the /var/log/chrony/* logs
Post by James Feeney
Post by Bill Unruh
Do you think it would make sense to keep the sign ...
I am confused - unclear about what chrony is actually reporting.  In my example, the implicit sign was positive, '+'.  But then, in apparent contradiction, "chronyc tracking" reports "Frequency       : 5.626 ppm slow", which would have accumulated a *negative* offset, '-', of the system time relative to the actual UTC time.
The frequency and the offset are not really related to each other.
Uhm - no. The system clock frequency and the instantaneous system clock offset on reboot really *are* related to each other.
Not sure what you mean. On boot, the system clock frequency is irrelevant
since it was not operating between reboots. What is important is the rtc clock
offset and frequency, and yes, they can be related to each other.
Post by James Feeney
Post by Bill Unruh
Is there really a sign associated with the system time reported in the log?
Yes. The offset can be positive or negative.
Ok. All of my hardware clocks must be slow.
Could be. There is certainly a sign associated.
Post by James Feeney
Post by Bill Unruh
Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words "forward" and "backward" in parentheses.  Still, saying "stepped backward by -2.000000 seconds" would have the flavor of a double negative.  Similarly, the phrase "slow by -2.000000 seconds" would have a nagging ambiguity about it.
If you were going to keep the sign, then you would choose one or the other
(ahead or behind is far better than fast or slow, since those are rate terms,
not offset term. A clock can be slow and still be ahead. Eg the clock could
tick only 59 times per minute, and yet be set ahead.
Yes, ok, "ahead" and "behind".
But again, frequency errors in my system clock are the cause of instantaneous offsets in the current time. The time is not being set from scratch, with a clock that has never been set. Technically, there are two pieces of information to be stated: 1) the frequency error or drift rate, and 2) the instantaneous absolute offset error. The displayed units serve to distinguish these, "seconds per second" for a frequency error, and "seconds" for an offset error, and so this is already clear in the system log message. I suppose it depends on what language you are use to hearing, as far as fast/slow or ahead/behind.
Yes, the rate and the offset are the two pieces that chrony uses. It is not
clear here what you are trying to get at.
Post by James Feeney
Post by Bill Unruh
So, the clock is -3.564452 ahead of NTP time is pretty unambiguous.
Ha! No, that sounds like an explicitly contradictory statement to me. I'm not even sure what you are trying to say with that example.
It means that to get the system time, you add whatever that number is
(-3.564452) to the NTP time.
Post by James Feeney
Post by Bill Unruh
If chrony were to use explicitly signed numbers in the log messages, the parenthetic slow/fast and forward/backward would be most appropriate.
It does explicity use signed numbers.
A few lines from my measurements.log
========================================
   Date (UTC) Time     IP Address   L St 1234 abc 5678 LP RP Score Offset Peer del. Peer disp. Root del.  Root disp.
   ========================================================================================================================
   2017-10-22 11:02:59 142.103.234.23  N  1 1111 111 1111  6  6 1.00 -3.600e-04  2.440e-02  1.007e-06  0.000e+00  3.052e-05
Note the -3.600 in the offset column.
I'm confused - why are you presenting your measurements.log? The log file message that I was discussing was the chrony message in the system log.
Because it is an example of the report of a negative offset.
Post by James Feeney
So, when you said, above, that "The offset can be positive or negative", were you actually talking about your "measurements.log", and not the system log?
The example was the measurement log. Here is an example from
/var/log/daemonlog
:
/var/log/daemonlog:Jan 12 00:58:04 planet chronyd[820]: System clock wrong by
-1.121814 seconds, adjustment started

Note the negative sign.
Post by James Feeney
Was my meaning unclear? "Explicitly signed" means that a positive number always has an explicit leading '+'. Of course that is still only suggestive, that the number could just as well have been negative, and does not represent a simple magnitude. But that would be my interpretation.
I am back to guessing that chrony does not use signed numbers in the system log message, not having actually looked at the code myself.
--
James Feeney
2017-10-31 00:19:13 UTC
Permalink
Post by Bill Unruh
Yes, that occurs in the system logs (/var/log/messages, or journalctl ) and I
do not parse those. There is virtually nothing there in the older versions of
chrony. I go to the /var/log/chrony/* logs
Ok, I see. But that's a different thing altogether.

Do you have any preference with respect to just the system log messages?
--
To unsubscribe email chrony-dev-***@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-dev-***@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Loading...