lowering jitter: best practices?

lowering jitter: best practices?

Felix Gallo-2
For a game server, I have a large number (~3000) of concurrent gen_fsm processes which need to send out UDP packets every 30 milliseconds.  Each individual UDP-sending function can be presumed to be very short (<1ms) and not dominate or otherwise affect the scheduling.

I've tested a number of different scenarios:

1.  Each gen_fsm schedules itself via the gen_fsm timeout mechanism.  This is the easiest and most natural way, but jitter can be +7ms in the 95% case, and I occasionally see unusual events (e.g. timeout event happens when only 25-28ms of real time have elapsed, despite 30ms being scheduled).  

2.  One gen_fsm 'god timer' process schedules itself and sends out messages to all of the concurrent gen_fsms to trigger them to send out their UDP packets upon its own timeout (a sketch of this shape follows the list).  Jitter is more variable, probably because the BEAM decides that the god timer is being too chatty, and sometimes the gen_fsms overwhelm the god timer in the schedulers.

3.  One port 'god timer' process (written in C, emitting a byte every 30ms).  Jitter is significantly reduced over #2 because apparently port processes get better scheduling and BEAM doesn't get as angry at the chattiness.  About on par with #1, maybe a little better, but the design is unpretty owing to the increased complexity.

Additionally, I've heard of someone using a C-written port-process god-timer per FSM, which sounds to me like it would have the absolute best latency but would involve several thousand external processes, all competing with Erlang for thread execution, which sounds even unprettier.  I haven't tested that one.
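
For concreteness, here's a minimal sketch of the shape of option 2 (illustrative only, not the actual test code; the FSM list and the 'tick' event name are assumptions):

-module(god_timer).
-export([start_link/2]).

%% Fsms: list of gen_fsm pids; IntervalMs: tick period, e.g. 30.
start_link(Fsms, IntervalMs) ->
    Pid = spawn_link(fun() ->
              erlang:send_after(IntervalMs, self(), tick),
              loop(Fsms, IntervalMs)
          end),
    {ok, Pid}.

loop(Fsms, IntervalMs) ->
    receive
        tick ->
            %% Re-arm before fanning out, so broadcast time is not
            %% added to the next period.
            erlang:send_after(IntervalMs, self(), tick),
            [gen_fsm:send_all_state_event(F, tick) || F <- Fsms],
            loop(Fsms, IntervalMs)
    end.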

Ideally I'd get as close to isochronous as possible, because the game 'feels' right if the stream of incoming packets is uniform.

None of the solutions gets me quite where I want to be; it's possible that one of the above is the optimal available path and I just don't know it yet, but I'm wondering if anyone has traveled down this same road and has advice, tuning secrets, or other plans to share.

F.

Re: lowering jitter: best practices?

Sergej Jurečko
What about #2, with the timer process set to high priority?
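
Something like this in the god timer's init, as a sketch (the tick loop itself is unchanged):

init_timer(IntervalMs) ->
    %% Raise the timer process above the normal-priority FSMs so the
    %% scheduler runs it promptly when its tick message arrives.
    process_flag(priority, high),
    erlang:send_after(IntervalMs, self(), tick).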


Sergej

Re: lowering jitter: best practices?

dmkolesnikov
In reply to this post by Felix Gallo-2
hello,

I am curious about #1. Are you using the timer module or erlang:send_after?
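
For context, the two variants as a sketch (the timer module routes through a single timer server process, while the BIF goes straight to the VM's timer wheel):

arm(IntervalMs) ->
    %% timer module: goes through the timer_server process, adding a
    %% message round-trip and a potential bottleneck under heavy use.
    {ok, _TRef} = timer:send_after(IntervalMs, self(), tick),
    %% BIF: handled directly by the runtime's timer wheel.
    erlang:send_after(IntervalMs, self(), tick).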

- Dmitry

Re: lowering jitter: best practices?

Felix Gallo-2
For #1 and #2, I'm using the 4-element tuple return value from the Module:StateName/2 state callbacks:

{next_state,NextStateName,NewStateData,Timeout}

e.g.:

{ next_state, s_firststate,
  StateData#state{ last_time = NewTime,
                   count     = Count + 1,
                   pid       = NotificationPid,
                   jitters   = [Jitter | Jitters] },
  TickInterval };
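
That fragment lives in a state callback shaped roughly like this (a sketch; send_udp/1 stands in for the real packet send):

s_firststate(timeout, #state{ tickinterval = Tick } = StateData) ->
    send_udp(StateData),  % hypothetical UDP send
    %% Returning Tick as the 4th element re-arms the gen_fsm timeout.
    { next_state, s_firststate, StateData, Tick }.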



Re: lowering jitter: best practices?

Jesper Louis Andersen-2

On Tue, May 26, 2015 at 8:52 PM, Felix Gallo <[hidden email]> wrote:
{next_state,NextStateName,NewStateData,Timeout}

This explains why you sometimes get less than 30ms sleep times. If an event reaches the process before Timeout, then the timeout is not triggered. Also, it may explain the jitter you are seeing, because an early event will reset the timeout. Try using gen_fsm:start_timer/2 or erlang:send_after instead; a sketch follows.
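
A sketch of the start_timer variant (send_udp/1 hypothetical); the timer fires as a {timeout, Ref, Msg} event and is not cancelled by other incoming messages:

s_firststate({timeout, _Ref, tick}, #state{ tickinterval = Tick } = StateData) ->
    gen_fsm:start_timer(Tick, tick),  % re-arm first
    send_udp(StateData),              % hypothetical UDP send
    { next_state, s_firststate, StateData }.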

If the problem persists, check lcnt. If you are locked on the timer wheel, then consider release 18 :)


--
J.

Re: lowering jitter: best practices?

Felix Gallo-2
Innovative thinking, Jesper!  But in this case, in this testbed, the fsms aren't getting any messages other than those which they are delivering to themselves.  Which adds to the intrigue.  

I took your suggestion and tried using gen_fsm:start_timer/2.  Interestingly it slightly increased the jitter variance and the negative jitter issue is still present.  It's possible that my, ah, rapidly-and-pragmatically-built testbed suffers from some flaw, but I'm not seeing it.

Here's my code:


Here's sample output on this small but moderately modern non-cloud osx machine:

> test_fsm5:go(1000,40,40,10).
waiting for 1000 FSMs, tickrate 40
avg: 1324.1012703862662
max: 50219
min: -184
median: 1018
95th: 2615
99th: 9698

Note that the max is 50ms of jitter, the min is negative 184µs of jitter, and the median jitter is about 1ms, which correlates well with my beliefs about scheduler wakeup timers...

F.


Re: lowering jitter: best practices?

Chandru-4
In reply to this post by Felix Gallo-2


Here are a few ideas, obviously all untested.

* How about setting the timer to fire initially at 10-15ms, and adjusting the next timer interval based on observed drift? (A sketch follows this list.)
* Let all the gen_fsm processes insert the packets they have to send into an ordered_set ETS table (ordered by send time), and have a single process check the table for messages that are due to be sent?
* Do you have any control on the receiving end? Can some smoothing of this jitter be done there?
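
A sketch of the first idea, scheduling against absolute deadlines so drift is corrected rather than compounded (send_udp/0 is a stand-in; erlang:monotonic_time/1 needs OTP 18):

start(IntervalMs) ->
    Deadline = erlang:monotonic_time(milli_seconds) + IntervalMs,
    erlang:send_after(IntervalMs, self(), tick),
    tick_loop(Deadline, IntervalMs).

tick_loop(Deadline, IntervalMs) ->
    receive
        tick ->
            send_udp(),  % hypothetical packet send
            %% Schedule relative to the previous deadline, not "now",
            %% so per-tick lateness does not accumulate.
            Next  = Deadline + IntervalMs,
            Delay = Next - erlang:monotonic_time(milli_seconds),
            erlang:send_after(max(0, Delay), self(), tick),
            tick_loop(Next, IntervalMs)
    end.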

Chandru

Re: lowering jitter: best practices?

dmkolesnikov
In reply to this post by Felix Gallo-2
Hello,

This appears to be an interesting issue. First of all, I've not seen negative jitter, and my impression was that your FSM loop suffers from measurement error. I changed the measurement loop to be very tiny, but there was no significant gain in jitter; it is on par with your original code.

...
T0 = os:timestamp(),
receive
    ...
after TickInterval ->
    T1     = os:timestamp(),
    Jitter = timer:now_diff(T1, T0) - (TickInterval * 1000),  % microseconds
    ...
end
...

All in all, I ran the measurement both on my laptop (MacBook, Intel i5) and on Amazon (cr1.8xlarge). The results were steady; you can find them below. I've seen jitter over 50ms only for 10K processes on my laptop.
I tend to think that you are experiencing high jitter due to excessive CPU utilisation in your test bed. However, even when you run the test with a single FSM, the jitter is far from 0. Maybe it is time to look at VM internals.

You proposed three possible solutions in your earlier email; I am afraid all of them will suffer from 'jitter' for multiple reasons, including the overhead of network communication. Option #1 (timer per FSM) still looks the most feasible from my perspective. You might implement an adaptive timer to minimise the observed error.

Laptop:
kolesnik@pro:tmp$ erl -sbt ts -sws very_eager -swt high
Erlang/OTP 17 [erts-6.2] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.2  (abort with ^G)
1> test_fsm5:go(1000, 50, 50, 1).
waiting for 1000 FSMs, tickrate 50
avg: 3164.69932160804
max: 13372
min: 4
median: 2426
95th: 8339
99th: 10552
all_done

AWS:

[ec2-user@xxx ~]$ /usr/local/xxx/erts-6.2/bin/erl -sbt ts -sws very_eager -swt high
Erlang/OTP 17 [erts-6.2] [source] [64-bit] [smp:32:32] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.2  (abort with ^G)
1> test_fsm5:go(1000, 50, 50, 1).
waiting for 1000 FSMs, tickrate 50
avg: 998.798351758794
max: 1926
min: 82
median: 998
95th: 1152
99th: 1204
all_done

- Dmitry


Re: lowering jitter: best practices?

Jesper Louis Andersen-2
In reply to this post by Felix Gallo-2
Hi,

I applied the following patch:

; diff -u test_fsm5.orig test_fsm5.erl
--- test_fsm5.orig 2015-05-27 13:54:34.381978128 +0200
+++ test_fsm5.erl 2015-05-27 13:51:38.521826422 +0200
@@ -10,7 +10,7 @@
 -define(MINCOUNT, 300). % iters.  Will not record jitter until this many timeouts have passed.  Crude attempt to give schedulers settle time.
 
 init([TickInterval,NotificationPid]) ->
-  State = #state{ last_time = get_os_time(), tickinterval = TickInterval, pid = NotificationPid },
+  State = #state{ last_time = erlang:monotonic_time(), tickinterval = TickInterval, pid = NotificationPid },
   {ok, s_firststate, State, TickInterval}.
 
 handle_event(_Event, StateName, StateData) ->
@@ -29,10 +29,10 @@
   {ok, StateName, StateData}.
 
 s_firststate(timeout, #state{ last_time = LastTime, count = Count , tickinterval = TickInterval, pid = NotificationPid, jitters = Jitters } = StateData) ->
-  NewTime = get_os_time(),
-  TimeDiff = NewTime - LastTime,
-  Jitter = TimeDiff - (TickInterval * 1000), % microseconds
+  NewTime = erlang:monotonic_time(),
   gen_fsm:start_timer(TickInterval, timeout),
+  TimeDiff = erlang:convert_time_unit(NewTime - LastTime, native, micro_seconds),
+  Jitter = TimeDiff - (TickInterval * 1000), % microseconds
   case {(Count > ?MINCOUNT), (Count < ?MAXCOUNT)} of
     {false, true} ->
       { next_state, s_firststate, StateData#state{ last_time = NewTime, count = Count + 1, pid = NotificationPid, jitters = Jitters } };
@@ -81,10 +81,6 @@
   report(TickFrom,NumFSMs),
   go_run(NumFSMs, TickFrom + TickStep, TickTo, TickStep).
 
-get_os_time() ->
-  {MegaS, S, MicroS} = os:timestamp(),
-  (MegaS * 1000000 * 1000000 + S * 1000000 + MicroS).
-
 await(0) -> ok;
 await(N) ->
   receive _ ->
@@ -93,6 +89,7 @@
 
 report(Tick, NumFSMs) ->
   X = lists:sort([A || {_, A} <- ets:lookup(metrics,jitter)]),
+  file:write_file("observations.txt", [[integer_to_binary(E), $\n] || E <- X]),
   Avg = lists:sum(X)/length(X),
   Max = lists:max(X),
   Min = lists:min(X),

which switches to Erlang/OTP 18-rc2 and also uses the new time API. Machine is a Core i7-4900MQ CPU, ArchLinux, fairly recent linux kernel (Linux lady-of-pain 4.0.4-2-ARCH #1 SMP PREEMPT Fri May 22 03:05:23 UTC 2015 x86_64 GNU/Linux). It also dumps the jitter to "observations.txt" so we can look at the observations.

We run `erl +sbt db +C multi_time_warp` to bind schedulers to cores and request a timing mode which is able to maximize precise/accurate monotonic time.

Then I ran the data through a bit of R:

> x <- read.csv("observations.txt", header=FALSE)
> require(ggplot2)
> p <- ggplot(x, aes(x = V1))
> png("observations.png")
> p + geom_density()
> dev.off()
X11cairo 
       2 
> summary(x)
       V1      
 Min.   : 153  
 1st Qu.: 973  
 Median : 998  
 Mean   :1001  
 3rd Qu.:1028  
 Max.   :8018  

One may wonder what the spread is of the upper quantiles:

> quantile(x$V1,  c(.9, .95, .99, .999, .9999, .99999))
     90%      95%      99%    99.9%   99.99%  99.999% 
1065.000 1083.000 1142.000 1818.001 5045.000 7945.010 

The kernel density plot is attached, and it places itself around 1000 quite precisely.

This is actually somewhat to be expected. When we request a timeout of 40ms, we are somewhere inside a milli-second T. We can be close to the "start", i.e., T.001, or close to the "end", i.e., T.997. If we define T = 0 as our start point for our time, then we clearly can't wake up at P = 40, because 40 - 0.997 is 39.003, which would violate our wait time by waking up too early. Hence, we round our wake-time up to the next full milli-second, which is 41. This tracks extremely well with our numbers.

But it is highly unlikely that our 1000 processes would all trigger in the *same* millisecond, which would make a few of them round up to a different timepoint for wakeup. At least this would be a plausible explanation for jitter less than 1000 micro-seconds.

As for the 99.99th and 99.999th percentile, I think these can be attributed to something else happening in the system: OS, Garbage Collection, etc. I'm not sure the culprit there is the timer wheels.

Another point worth mentioning is that if 1000 processes wake up at the same millisecond edge, then they will queue over N cores. So invariably, this means some of the processes will have jitter. The workload you are working with is very "spiky" in this case, even if the average load on the system is very low. Do the math, assuming perfect spread, each core gets 1000 div 8 = 125 processes. They all awaken at the same time. Even if we can handle each process in 1 micro-second, there will be a process with a 125 micro-second latency. That is, since we pack so many processes in a short window, the system becomes more sensitive to small fluctuations. The numbers suggest we handle each process in less than a microsecond.

Going even faster, in the low nanoseconds, requires a change of system, since Erlang isn't the right tool for the job at that scale. You need to pack data in arrays to get better cache-access patterns at that scale since a DRAM hit is roughly 30ns (or more!). The functional nature of Erlang will hurt here. This is usually a job for OCaml, C, Go, GPUs, FPGAs or ASICs.


--
J.

observations.png (13K) [attachment]

Re: lowering jitter: best practices?

Felix Gallo-2
Jesper, thanks for the analysis.

Since your results suggested that R17 os:timestamp was heavily implicated in my jitter error calculations (all too typical a result for microbenchmarking, but I had to trust something), I tried again on a variety of cloud Linux boxes and got similar results, suggesting that something may be up with R17.5. Since you're further not seeing that with R18rc2, I have to guess that either the new time API or the timer wheel rework is improving matters. Which is excellent news, even if it currently doesn't compile on my box. :)

I'll do some more jitter exploration with R18rc2 in the coming weeks; if anyone is reading this in the future and would like to know how it all went down, please feel free to contact me for details.  Thanks to all for the help.

F. 


