start_timer and timer fired lag

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

start_timer and timer fired lag

Alexander Petrovsky-2
Hi!

I'm usgin nksip library for working with sip protocol and in my virtual machine (windows -> virtualbox -> ubuntu) I'm observer very strange behaviour with erlang:start_timer [1]. Some time in common test, the timer fires with lacency with hundred milliseconds.

I've check, there is no messages in queue, msacc says there is no problem with schedulers and whole system is idle, erlang:monitor(time_offset, clock_service) show that time in virtual machine flows monotonically.

I wonder, how can I find out why erlang behaves strange sometimes?

Erlang version is:

$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [kernel-poll:false]
Eshell V9.2  (abort with ^G)



--
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: +7 931 9877991


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: start_timer and timer fired lag

Jesper Louis Andersen-2
Pure guess: You are being CPU throttled in order to save power, or the virtual machine is throttling you.

Enable the erlang system monitor and start looking for long schedules. These are usually an indication that you are being throttled by something. Either a bad NIF which isn't dirty-scheduling the right way, or by the underlying system stalling you. Perhaps because your timer is on a separate core and async-fires to another core. In general, timers can be stalled by the system, so you have no guarantee that they will fire close after their trigger point. But if you have timers which are periodically behind schedule, I'd look into the virtual setup.

One simple test is to run outside virtualization to make sure it isn't causing the problems. But I'd enable the system monitor anyway since it is a good way to detect unruly underlying hardware/virtualization layers.

On Mon, Mar 26, 2018 at 4:19 PM Alexander Petrovsky <[hidden email]> wrote:
Hi!

I'm usgin nksip library for working with sip protocol and in my virtual machine (windows -> virtualbox -> ubuntu) I'm observer very strange behaviour with erlang:start_timer [1]. Some time in common test, the timer fires with lacency with hundred milliseconds.

I've check, there is no messages in queue, msacc says there is no problem with schedulers and whole system is idle, erlang:monitor(time_offset, clock_service) show that time in virtual machine flows monotonically.

I wonder, how can I find out why erlang behaves strange sometimes?

Erlang version is:

$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [kernel-poll:false]
Eshell V9.2  (abort with ^G)



--
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: <a href="tel:+7%20931%20987-79-91" value="+79319877991" target="_blank">+7 931 9877991

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: start_timer and timer fired lag

Alexander Petrovsky-2

2018-03-26 18:35 GMT+03:00 Jesper Louis Andersen <[hidden email]>:
Pure guess: You are being CPU throttled in order to save power, or the virtual machine is throttling you.

Enable the erlang system monitor and start looking for long schedules. These are usually an indication that you are being throttled by something. Either a bad NIF which isn't dirty-scheduling the right way, or by the underlying system stalling you. Perhaps because your timer is on a separate core and async-fires to another core. In general, timers can be stalled by the system, so you have no guarantee that they will fire close after their trigger point. But if you have timers which are periodically behind schedule, I'd look into the virtual setup.

 Hm, it's occur from time to time, not for every test run, so looks lile you are right about "being throttled by something".

One simple test is to run outside virtualization to make sure it isn't causing the problems.

Outside VM all tests passed, so I've just want to undestand why and where VM causing problems.
 
But I'd enable the system monitor anyway since it is a good way to detect unruly underlying hardware/virtualization layers.

When I run system_monitor, which metric should I use? Only  long_schedule?
 

On Mon, Mar 26, 2018 at 4:19 PM Alexander Petrovsky <[hidden email]> wrote:
Hi!

I'm usgin nksip library for working with sip protocol and in my virtual machine (windows -> virtualbox -> ubuntu) I'm observer very strange behaviour with erlang:start_timer [1]. Some time in common test, the timer fires with lacency with hundred milliseconds.

I've check, there is no messages in queue, msacc says there is no problem with schedulers and whole system is idle, erlang:monitor(time_offset, clock_service) show that time in virtual machine flows monotonically.

I wonder, how can I find out why erlang behaves strange sometimes?

Erlang version is:

$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [kernel-poll:false]
Eshell V9.2  (abort with ^G)



--
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: <a href="tel:+7%20931%20987-79-91" value="+79319877991" target="_blank">+7 931 9877991

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions



--
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: +7 931 9877991


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: start_timer and timer fired lag

Leo Liu-2
In reply to this post by Jesper Louis Andersen-2
On 2018-03-26 15:35 +0000, Jesper Louis Andersen wrote:
> Enable the erlang system monitor and start looking for long schedules.

What are you referring to by "erlang system monitor"? First time hear
this and would like to try it out ;)

Cheers,
Leo
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: start_timer and timer fired lag

Alexander Petrovsky-2
Leo, I think Jesper talk about 

вт, 27 марта 2018 г. в 5:04, Leo Liu <[hidden email]>:
On 2018-03-26 15:35 +0000, Jesper Louis Andersen wrote:
> Enable the erlang system monitor and start looking for long schedules.

What are you referring to by "erlang system monitor"? First time hear
this and would like to try it out ;)

Cheers,
Leo
--
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: +7 931 9877991


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: start_timer and timer fired lag

Jesper Louis Andersen-2
Yes, that is the thing I'm talking about. Apart from long_schedule, I'd also enable the other kinds of monitors, especially long_gc. Usually, when the system reports one of these things, it is worth investigating why it is doing so. A timer might also arrive late because a poor process has way too much GC work to do, while also having too few cores. So having a marker for this tend to be good infrastructure design.

IIRC, Fred's book "Erlang in Anger" also touches on this (but haven't looked it up, I must admit).

On Tue, Mar 27, 2018 at 10:36 AM Alexander Petrovsky <[hidden email]> wrote:
Leo, I think Jesper talk about 

вт, 27 марта 2018 г. в 5:04, Leo Liu <[hidden email]>:
On 2018-03-26 15:35 +0000, Jesper Louis Andersen wrote:
> Enable the erlang system monitor and start looking for long schedules.

What are you referring to by "erlang system monitor"? First time hear
this and would like to try it out ;)

Cheers,
Leo
--
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: <a href="tel:+7%20931%20987-79-91" value="+79319877991" target="_blank">+7 931 9877991


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: start_timer and timer fired lag

Leo Liu-2
In reply to this post by Alexander Petrovsky-2
On 2018-03-27 08:36 +0000, Alexander Petrovsky wrote:
> Leo, I think Jesper talk about
> http://erlang.org/doc/man/erlang.html#system_monitor-2

On 2018-03-27 12:29 +0000, Jesper Louis Andersen wrote:
> Yes, that is the thing I'm talking about. Apart from long_schedule, I'd
> also enable the other kinds of monitors, especially long_gc. Usually, when
> the system reports one of these things, it is worth investigating why it is
> doing so. A timer might also arrive late because a poor process has way too
> much GC work to do, while also having too few cores. So having a marker for
> this tend to be good infrastructure design.
>
> IIRC, Fred's book "Erlang in Anger" also touches on this (but haven't
> looked it up, I must admit).

Thanks for the pointers. yes, EiA did touch on it.

Leo

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions