Sender punishment removed


Sender punishment removed

Karl Nilsson-2
So I saw that the sender punishment was removed in [1]. The commit message doesn't outline any of the reasoning behind this. Are there any more details available anywhere that I can read? I understand it never really worked that well, but it would still be interesting to understand a bit further.

On a similar note, what is the current thinking on flow control between Erlang processes? Are there any improvements on mixing a few calls in with the casts?


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Sender punishment removed

Jesper Louis Andersen-2
As Lukas already wrote on Slack somewhere:

Imagine you have a machine from around 2000: single core, around 400 MHz if you are lucky. In that setting, sender punishment can in some situations rectify a system that is going overboard. We simply give the offending process less time on the scheduler in the hope that the process with the overloaded mailbox can catch up and do its work. It is not a surefire solution, but it may avoid some situations in which the system would otherwise topple.

Fast forward 18 years. Now the machines are multicore, commonly with at least 4 threads. Here, a sender might live on one core whereas the receiver might live on another. It is less clear why the punishment strategy is good: we get to stop the sender, but there was already a scheduler on the other core and it is still overloaded. Worse, perhaps all the other cores are able to send messages through to the overloaded process.

As for the flow control: Erlang systems already employ flow control, namely TCP flow control between distributed nodes. I've seen two recent problems pertaining to having flow control inside flow control: gRPC has 3 layers: gRPC itself, HTTP/2 and TCP. And HTTP/2 has a layer on top of TCP. This is dangerous as the flow control of the underlying system can interfere with the flow control of the system above.

By extension, any Erlang-mechanism of flow control needs to protect against a scenario where your application has its own layer and make sure it doesn't interfere.

Personally, I think Ulf Wiger's "Jobs" model paved the way[0]: apply flow control on the input edge of the system, but don't apply it internally. If you do this correctly, the limit at the border should keep the system from overloading. If you apply internal flow control, you also expose yourself to the danger of an artificial internal bottleneck. Rather, sample internally and use this as a feedback mechanism for the input edge.

Also note distributed flow control is a considerably harder problem to solve, and since Erlang is distributed by default, any general solution has to address this as well.
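
To make the edge-limiting idea above concrete, here is a minimal sketch of a feedback rule. It is not the Jobs API: the module name edge_feedback, the halve-or-increment rule, and all parameters are assumptions chosen for illustration. A sampled internal latency is compared against a target, the result adjusts how many requests the input edge admits per interval, and nothing inside the system is throttled.

```erlang
-module(edge_feedback).
-export([adjust/3, admit/2]).

%% Hypothetical feedback rule (an assumption, not taken from Jobs):
%% halve the admitted rate when the sampled internal latency exceeds
%% the target, otherwise probe upward by one request per interval.
adjust(Rate, SampleMs, TargetMs) when SampleMs > TargetMs ->
    max(1, Rate div 2);
adjust(Rate, _SampleMs, _TargetMs) ->
    Rate + 1.

%% Edge decision: accept while fewer than Rate requests have been
%% admitted in the current interval, reject everything else.
admit(AdmittedThisInterval, Rate) when AdmittedThisInterval < Rate ->
    accept;
admit(_AdmittedThisInterval, _Rate) ->
    reject.
```

A process at the input edge would call adjust/3 once per sampling interval and admit/2 once per incoming request. Internal processes keep using plain casts and calls.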



Re: Sender punishment removed

Karl Nilsson-2
Thanks Jesper, keeping flow control to the input edge makes perfect sense to me but developing a good feedback mechanism that is both safe and not overly cautious is likely to be quite challenging in non-trivial, potentially distributed domains.

I have some thinking to do. :) Any further reading along these lines that anyone knows of would be very welcome.


Re: Sender punishment removed

Fred Hebert-2
I have written Handling Overload (https://ferd.ca/handling-overload.html) as a tour of the multiple options in Erlang. It may prove helpful.


Re: Sender punishment removed

Jesper Louis Andersen-2
To be more concrete:

Amazon might have a fancy ALB load balancer.

Amazon might also use HTTP/2 in that fancy ALB load balancer.

HTTP/2 has a flow control system because you are muxing several streams into one TCP connection. Also note that TCP has a send window.

Amazon might have defaulted their HTTP/2 Window frames to 16 kilobytes in the upload direction.

If you do the TCP plots, this leads to a fun situation where the packets flap between MSS-sized and ACK-only packets, and your speed is limited by the RTT of the line.

People running gRPC know that "using the Amazon layer 4 load balancer works", but they have yet to analyze the problem further. They know it "doesn't work with HTTP/2".

gRPC has its own flow control (!!) on top of HTTP/2, because "gRPC has to be transport agnostic in case you want to use UDP".

The end result is that you are not running at maximal utilization of the underlying TCP connection.
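
The RTT limit in the scenario above can be put into numbers. If at most one window of data can be in flight per round trip, throughput is bounded by window size divided by RTT. The sketch below assumes a 16 kilobyte window means 16384 bytes and uses a made-up 50 ms round trip. The module and function names are hypothetical.

```erlang
-module(window_bound).
-export([max_bytes_per_sec/2]).

%% Upper bound on throughput when at most WindowBytes may be in
%% flight per round trip of RttMs milliseconds.
max_bytes_per_sec(WindowBytes, RttMs) when RttMs > 0 ->
    WindowBytes * 1000 / RttMs.
```

With these assumed numbers, window_bound:max_bytes_per_sec(16384, 50) gives 327680.0 bytes per second, roughly 2.6 Mbit/s, no matter how much bandwidth the underlying TCP connection has.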

...

Another thing to think about: if you employ flow control, you have to think about the "lock" situation in which the flow control locks up the system. If the sender can be flow controlled, the problem is twofold: it exists at both the sender and the receiver side. This interaction is somewhat complex compared to interaction on one side only. Corollary: Go, with bounded channels, has way more points where quasi-deadlock can occur than an Erlang program. But the Erlang program, in contrast, can overflow its mailbox.



Re: Sender punishment removed

Max Lapshin-2
It is a pity, but this advice is not enough:

> Just use synchronous OTP calls in gen_* behaviours for all desirable interactions

If you have an HTTP API that sends gen_server:call to some process, then you can get into a storm situation:

1) client comes to HTTP
2) http handler makes gen_server:call to singleton server
3) waits for 5 or 60 seconds and then exits, but message is already in process queue
4) server fetches this useless message from mailbox and starts making useless expensive operations
5) meanwhile client makes another duplicate request and again fills singleton mailbox with the same request


It is a dangerous situation, and sometimes you need to look at the gen_server's message_queue_len before calling it. But if you do that on a multicore machine, you get lock contention, and you will see low CPU usage and low RPS alongside heavy locking.
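
For reference, the queue-length check described above could look like the following sketch. The module name guarded_call and the {error, overloaded} return value are assumptions for illustration. Note that erlang:process_info/2 only accepts a local pid, and that this is exactly the pattern the post warns about under load on multicore machines.

```erlang
-module(guarded_call).
-export([call/3]).

%% Refuse to call Server (a local pid) when its mailbox already holds
%% more than MaxQueueLen messages. The threshold is a caller-chosen,
%% hypothetical parameter.
call(Server, Request, MaxQueueLen) when is_pid(Server) ->
    case erlang:process_info(Server, message_queue_len) of
        {message_queue_len, Len} when Len > MaxQueueLen ->
            {error, overloaded};
        {message_queue_len, _Len} ->
            gen_server:call(Server, Request);
        undefined ->
            %% process_info/2 returns undefined for a dead process.
            {error, noproc}
    end.
```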




Re: Sender punishment removed

Guilherme Andrade

On 23 January 2018 at 10:30, Max Lapshin <[hidden email]> wrote:
1) client comes to HTTP
2) http handler makes gen_server:call to singleton server
3) waits for 5 or 60 seconds and then exits, but message is already in process queue
4) server fetches this useless message from mailbox and starts making useless expensive operations
5) meanwhile client makes another duplicate request and again fills singleton mailbox with the same request

I've recently started to explore sbroker[1] as an alternative (in some cases) and I've been getting very interesting results.

Then again it's also (IIRC) backed by single processes that might become a bottleneck themselves, but at least those processes aren't doing anything else. Also, because it's this big toolkit with a lot of bells and whistles, it's easy to shoot yourself in the foot if you're not already acquainted with some of the abstractions, and so there might be somewhat of a learning curve.

[1]: https://github.com/fishcakez/sbroker

--
Guilherme


Re: Sender punishment removed

Jesper Louis Andersen-2

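%% Server is the gen_server's pid or registered name.
call(Server, Msg) ->
    %% Tag the request with the caller's monotonic time at send.
    gen_server:call(Server, {Msg, erlang:monotonic_time()}).

handle_call({Msg, In}, _From, State) ->
    Out = erlang:monotonic_time(),
    %% Time the request spent in the queue, in milliseconds.
    Sojourn = erlang:convert_time_unit(Out - In, native, millisecond),
    case Sojourn > 5000 of
        true -> {reply, argh, State};
        false -> ...
    end.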

is usually a better way of tracking calls if you have overflow troubles. Sojourn tracking and head-drop from the FIFO is what CoDel does, and it works wonders on network connections. It lets you avoid processing messages in the queue whose callers have already timed out. Better still, you can do case (Sojourn + ExpectedProcessingTime) > ?LIMIT of ..., which proactively avoids doing work that would break the time limit anyway.
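
The proactive variant can be isolated into a small pure function, which makes the decision easy to test on its own. This is only a sketch: the module name sojourn, the function names, and the drop/process atoms are assumptions, and the expected processing time has to come from the application's own measurements.

```erlang
-module(sojourn).
-export([stamp/1, sojourn_ms/2, decide/3]).

%% Attach the sender's monotonic time (native units) to a request.
stamp(Msg) ->
    {Msg, erlang:monotonic_time()}.

%% Milliseconds elapsed between two native monotonic timestamps
%% taken on the same node.
sojourn_ms(In, Out) ->
    erlang:convert_time_unit(Out - In, native, millisecond).

%% Drop the request when time already spent queued plus the expected
%% processing time would exceed the limit; otherwise process it.
decide(SojournMs, ExpectedMs, LimitMs)
  when SojournMs + ExpectedMs > LimitMs ->
    drop;
decide(_SojournMs, _ExpectedMs, _LimitMs) ->
    process.
```

In a handle_call/3 clause, a drop result would map to an immediate error reply, and a process result to doing the real work.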




Re: Sender punishment removed

Loïc Hoguin-3
How about adding this to gen_server directly? Perhaps as a new function?
I think this could be done with and without the expected processing time
in a fairly generic manner.


--
Loïc Hoguin
https://ninenines.eu

Re: Sender punishment removed

Jesper Louis Andersen-2
At some point while writing the post, I considered

receive
    Msg:Sojourn -> ...
end

but this requires an EEP, and then I realized it is better to handle this in the language in an orthogonal fashion. But I like the point of considering such a change on gen_server, indeed. Queue Sojourn is often a far better metric to track on a queue rather than length (because msg processing times differ) or bytes in the queue (because processing time varies).



Re: Sender punishment removed

Guilherme Andrade


On 23 January 2018 at 13:13, Jesper Louis Andersen <[hidden email]> wrote:
I like the point of considering such a change on gen_server, indeed.

I took a look at the OTP source as this seemed simple enough to implement; the code was kind enough to remind me that one can call gen processes on remote nodes. At that point any possibility of using the monotonic clock breaks down.

The functionality could be restricted to local gen processes, but then it's one more crevasse people have to consider when working with the gen behaviours OTP provides, so I don't see it flying as a PR.


Re: Sender punishment removed

Pierre Fenoll-2
Having a timeout inside the pattern matching semantics of receiving messages would be interesting.
But yes, as Guilherme points out, how would that work with receiving from distributed nodes?
Maybe something along the lines of

receive
    Msg:Node -> ...
end

or maybe better:

receive
    Msg:Node:Sojourn -> ...
end

which allows

receive
    Msg:node() -> ...;
    Msg:_:Sojourn -> ...
end

where node() would just get compiled to matching the local node.
Well, this leaves the question of what the libraries would look like in the future.


Cheers,
-- 
Pierre Fenoll



Re: Sender punishment removed

Jesper Louis Andersen-2
On Wed, Jan 24, 2018 at 10:23 PM Pierre Fenoll <[hidden email]> wrote:
But yes as Guilherme points out how would that work with receiving from distributed nodes?

This would work exactly as advertised: the timestamp is taken when the message enters the mailbox on the receiving node's end. You cannot in any way build a safe time construct over multiple nodes, so you opt for the next best case: a lower bound on the sojourn. This means that a message has spent at least XX ms in queues inside the system, but it could have spent more time than that.

I don't think you can do better than this in a distributed network, but of course this requires a proof.



Re: Sender punishment removed

Guilherme Andrade


On 25 January 2018 at 14:28, Jesper Louis Andersen <[hidden email]> wrote:
This would work exactly as advertised: the timestamp is taken when the message enters the mailbox on the receiving node's end. You cannot in any way build a safe time construct over multiple nodes, so you opt for the next best case: a lower bound on the sojourn. This means that a message has spent at least XX ms in queues inside the system, but it could have spent more time than that.

How would this work? Monotonic timestamps originating on different nodes are not comparable. Let's imagine:

- There are two nodes connected through distribution: A and B;
- There's a process named Alice at node A;
- There's a gen process named Bob at node B;
- At a given instant T1, monotonic_timestamp{A} = 3s and monotonic_timestamp{B} = 10s;
- Alice calls Bob; it attaches monotonic_timestamp{A} to the call message;
- Excluding buffering and network delays, the call message arrives at Bob with a measured Sojourn of (10 - 3) = 7s and gets dropped, even though ~0s have passed.

And this is the ugly case, of course. Reverse the roles of Alice and Bob and we would get the bad case, as it would mean a lot of work would go through even if arriving terribly late.
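
Both cases can be checked with plain arithmetic. The sketch below uses the clock readings from the example (3 s on node A, 10 s on node B) and an assumed limit of 5 s. The module and function names are hypothetical.

```erlang
-module(clock_skew).
-export([naive_sojourn/2, verdict/3]).

%% Naive sojourn: the receiver's own monotonic clock reading minus
%% the timestamp the sender attached on a different node.
naive_sojourn(SenderStamp, ReceiverNow) ->
    ReceiverNow - SenderStamp.

%% Drop when the naive sojourn exceeds Limit (same unit as the stamps).
verdict(SenderStamp, ReceiverNow, Limit) ->
    case naive_sojourn(SenderStamp, ReceiverNow) > Limit of
        true -> drop;
        false -> process
    end.
```

Alice calling Bob gives clock_skew:verdict(3, 10, 5) =:= drop although almost no time has passed. With the roles reversed, clock_skew:naive_sojourn(10, 3) is -7, so clock_skew:verdict(10, 3, 5) stays at process until the message is about 12 s late.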

As an alternative, a temporary caller process could always be launched on the same node the gen process resides in; the timestamp would be attached and the actual call performed inside this caller process, so local sojourn would be measurable. But that implies higher saturation of the distribution channels, besides telling us nothing about buffer- or network-induced delays.

I would very much like to see this working, I just don't see a good way of dealing with distribution. But please correct me if I'm wrong in any of the assumptions above.


--
Guilherme


Re: Sender punishment removed

Guilherme Andrade


On 25 January 2018 at 16:34, Jesper Louis Andersen <[hidden email]> wrote:
2. Node B, where Bob lives, receives the message in its TCP stack. At this point the message enters the mailbox of process Bob, so we take a Ts = monotonic_timestamp(), but at node B. This means the timestamp is a bound on the sojourn. The reported value is always going to be lower than the actual sojourn (since we don't measure the TCP travel time). But if the reported sojourn is too high for our taste, we can always reject the message.

Ah yes, that makes perfect sense, of course. I was reasoning within the expectation that we wouldn't want to change something so fundamental as the nature of inboxes in ERTS. But indeed there's no good alternative.


Re: Sender punishment removed

Guilherme Andrade
I've opened a PR[1] proposing a solution for expirable gen calls. Any input is welcome[2].

[1]: https://github.com/erlang/otp/pull/1693
[2]: Including naming. I'm really not fond of 'expirable'.

--
Guilherme
