C-Node breakage with git master

Andreas Schultz
Hi,

Something in the C-node handling changed between otp-20.1.3 and
today's git master branch.

I have a C-node that implements the remote end of net_adm:ping/1.
To do that, it needs to answer

  gen:call({net_kernel, Node}, '$gen_call', {is_auth, node()}, infinity)

Code is at [1].

In 20.1.3, the call is sent as a plain message. With today's git master,
the caller first attempts to set up a monitor (an ERL_MONITOR_P message is sent).

Monitors are not supported by C-nodes, so ei_xreceive_msg returns
an error, causing a connection abort.
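
To illustrate the failure mode described above, here is a minimal C sketch of how a C-node's receive loop could classify incoming control messages so that a monitor request is ignored rather than treated as fatal. It does not use erl_interface. The tag values are the ones listed in the Erlang distribution protocol documentation; `classify_ctrl` and the `ctrl_action` names are hypothetical and exist only for this example.

```c
/* Control message tags from the Erlang distribution protocol
 * documentation. Only the tags used in this sketch are listed. */
enum dist_ctrl_tag {
    CTRL_SEND        = 2,
    CTRL_REG_SEND    = 6,
    CTRL_MONITOR_P   = 19,
    CTRL_DEMONITOR_P = 20
};

/* Hypothetical outcomes for a C-node dispatch loop. */
enum ctrl_action {
    ACTION_DELIVER,   /* hand the payload to the application */
    ACTION_IGNORE,    /* drop the request, keep the connection up */
    ACTION_UNHANDLED  /* tag not modeled in this sketch */
};

/* Decide what to do with an incoming control message.
 * Treating MONITOR_P and DEMONITOR_P as ignorable instead of
 * fatal is the assumption this sketch illustrates. */
enum ctrl_action classify_ctrl(int tag)
{
    switch (tag) {
    case CTRL_SEND:
    case CTRL_REG_SEND:
        return ACTION_DELIVER;
    case CTRL_MONITOR_P:
    case CTRL_DEMONITOR_P:
        return ACTION_IGNORE;
    default:
        return ACTION_UNHANDLED;
    }
}
```

With this shape, a `'$gen_call'` arriving as a registered send would still be delivered, while the monitor request preceding it would no longer tear down the connection.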

Reading through some of the distribution code and the gen.erl code, it seems
that C-node support is somewhat broken to begin with. There is
a comment in gen.erl [2] that suggests that attempting to set up a
monitor on a C-node should return an error. However, this would
need to be done on the sending side. ei_xreceive_msg has no support for
dealing with it, and the error code would not allow the consumer to
implement proper handling.

There are two flags in the distribution protocol (DFLAG_DIST_MONITOR and
DFLAG_DIST_MONITOR_NAME). C-nodes set DFLAG_DIST_MONITOR, but not
DFLAG_DIST_MONITOR_NAME. This seems wrong; IMHO
DFLAG_DIST_MONITOR should be cleared.
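
For readers unfamiliar with the capability flags, the following C sketch shows how the two monitor bits could be tested and cleared in a flags word exchanged during the handshake. The bit values are the ones given in the distribution protocol documentation; the helper names (`peer_supports_monitor`, `peer_supports_monitor_name`, `without_monitor_flags`) are hypothetical and are not part of erl_interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability bits as listed in the distribution protocol docs. */
#define DFLAG_DIST_MONITOR      0x08u
#define DFLAG_DIST_MONITOR_NAME 0x20u

/* True if the peer claims support for monitoring processes by pid. */
bool peer_supports_monitor(uint32_t flags)
{
    return (flags & DFLAG_DIST_MONITOR) != 0;
}

/* True if the peer also claims support for monitoring registered names. */
bool peer_supports_monitor_name(uint32_t flags)
{
    return (flags & DFLAG_DIST_MONITOR_NAME) != 0;
}

/* Clear both monitor bits and leave every other bit untouched,
 * which is what a node without monitor support would advertise. */
uint32_t without_monitor_flags(uint32_t flags)
{
    return flags & ~(DFLAG_DIST_MONITOR | DFLAG_DIST_MONITOR_NAME);
}
```

A flags word with only DFLAG_DIST_MONITOR set, as described for C-nodes above, passes the first check and fails the second; a sender that honored the flags would need to consult both.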

But I also can't find the place where erlang:monitor/2 would actually
check and honor those flags. The monitor BIF seems to always send a
monitor request regardless of the node flags.

So my questions are:

1. Does anyone have an idea what changed to cause the ping/is_auth change?
2. What is the correct way to implement a gen_server in a C-node, or
   is it indeed currently not possible/broken?

Regards
Andreas

[1]: https://github.com/travelping/capwap-dp/blob/master/src/capwap-dp.c#L943
[2]: https://github.com/erlang/otp/blob/master/lib/stdlib/src/gen.erl#L184
--
Dipl.-Inform. Andreas Schultz

email: [hidden email]
phone: +49-391-819099-224

----------------------- enabling your networks ----------------------

Travelping GmbH                     phone:  +49-391-81 90 99 0
Roentgenstr.  13                    fax:    +49-391-81 90 99 299
39108 Magdeburg                     email:  [hidden email]
GERMANY                             web:    http://www.travelping.com

Company Registration: Amtsgericht Stendal        Reg No.:   HRB 10578
Geschaeftsfuehrer: Holger Winkelmann          VAT ID No.: DE236673780
---------------------------------------------------------------------
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: C-Node breakage with git master

Andreas Schultz
----- On Nov 21, 2017, at 4:23 PM, Andreas Schultz [hidden email] wrote:

> Hi,
>
> Something in the C-node handling changed between otp-20.1.3 and
> today git master branch.

Sent too early. The behavior was changed by this commit:

https://github.com/erlang/otp/commit/17e198d6ee60f7dec9abfed272cf4226aea44535

I think the removal of the DFLAG_DIST_MONITOR and DFLAG_DIST_MONITOR_NAME
in that changeset is a mistake.

Also, the removed DFLAG_DIST_MONITOR test proves IMHO that a
C-Node should not set that flag.

Comments?

Regards
Andreas

Re: C-Node breakage with git master

Sverker Eriksson-4
Hi Andreas

Current master is broken with respect to monitor of c-node "processes".

I'm currently scratching my head trying to fix this mess. Nice to have
a shoulder to cry on.

The commit https://github.com/erlang/otp/commit/17e198d6ee60f7dec9abfed272cf4226
is part of work to make distributed operations (send, monitor, link,
etc.) *not* wait for connection setup to complete. They should just
enqueue their request and trigger connection setup.

This can be done in a backward compatible manner as all of them have a
truly asynchronous interface.

All except erlang:monitor, when called toward C-nodes that do not
support process monitoring. The old behaviour is to throw badarg. But keeping
that would force erlang:monitor to remain synchronous, waiting for
connection setup in order to know whether the node supports monitoring.

So the idea is to slightly change the behaviour: instead of throwing badarg, go
ahead and create the monitor, but only let it supervise the connection.
That is, you will only get a 'DOWN' message with 'noconnection' from such
a monitor. This is similar to what gen:call implements today by catching
badarg and using monitor_node.

Yes, erl_interface sets DFLAG_DIST_MONITOR but it does not implement
it. Why? I think it's an old sloppy fix to make erl_global_register()
work, where the c-node receives a monitor request as part of the reply
which it just ignores.


/Sverker



Re: C-Node breakage with git master

Andreas Schultz
Hi,

----- On Nov 21, 2017, at 6:19 PM, Sverker Eriksson [hidden email] wrote:

> Hi Andreas
>
> Current master is broken with respect to monitor of c-node "processes".

Good to know ;-)
 

> All except erlang:monitor, when called toward C-nodes that do not
> support process monitoring. The old behaviour is to throw badarg. But keeping
> that would force erlang:monitor to remain synchronous, waiting for
> connection setup in order to know whether the node supports monitoring.

It would be nice if we had a way to support monitors in a
C-node as well. It would have to be the responsibility of the
C-node to implement whatever kind of monitoring it sees fit when such
a request comes in.

And since I'm dreaming, cached atoms in C-nodes would be great ;-)

> So the idea is to slightly change the behaviour: instead of throwing badarg, go
> ahead and create the monitor, but only let it supervise the connection.
> That is, you will only get a 'DOWN' message with 'noconnection' from such
> a monitor. This is similar to what gen:call implements today by catching
> badarg and using monitor_node.

Sounds good to me.

Andreas


Re: C-Node breakage with git master

Sverker Eriksson-4
I've pushed a fix to master now.
https://github.com/erlang/otp/commit/8ed0d75c186d9da24bd6cfb85732487b17a3b054

/Sverker

