Node not responding to init:stop()


Roger Lipscombe
This sounds similar to
http://erlang.org/pipermail/erlang-questions/2012-December/071223.html,
but it's not quite the same, as far as I can tell.

I've noticed that, at some point since upgrading from OTP-20.3 to
OTP-21.0 (along with the necessary dependency updates), my Erlang
nodes are no longer stopping at the end of our system test run.

The nodes are orchestrated by having 'erlexec' run a bash script that
uses (effectively) 'foo/bin/foo foreground &'. I'm relying on erlexec
killing the bash script, and on that, in turn, killing the nodes. This
works fine when the nodes are using OTP-20.3, but not with OTP-21.0.
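
For illustration, it's roughly like this (a sketch assuming the
erlexec ("exec") library's exec:run/2 options; "run_nodes.sh" is a
made-up name):

%% Start the wrapper script in its own OS process group, so that the
%% whole group can be killed at teardown.
{ok, Pid, _OsPid} = exec:run("./run_nodes.sh",
                             [{group, 0}, kill_group, stdout, stderr]),
%% ...at the end of the test run: SIGTERM goes to the process group.
ok = exec:stop(Pid).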

If I connect to the node, I can issue 'init:stop()', and it returns
ok, but nothing happens. If I use 'application:which_applications()',
I get a timeout.

Unlike the linked discussion, I _can_ repeatedly connect a remote
shell (using erl -remsh), but I have to resort to erlang:halt() to
stop the node. Since my system test environment relies on orderly
process group teardown to stop the nodes, that's not useful.

As I understand it, process group teardown results in SIGTERM, which
results in a call to init:stop(), which should stop the node. That
isn't happening.
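
To check whether the signal even reaches the VM: since OTP 19.3,
signals are delivered as events to the 'erl_signal_server' gen_event
manager, and the default 'erl_signal_handler' reacts to sigterm by
calling init:stop(). A minimal probe handler (module name made up)
could be added alongside the default one:

-module(sigterm_probe).
-behaviour(gen_event).

-export([install/0]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Added alongside the default handler, so normal shutdown behaviour
%% is unchanged; this only reports that the signal arrived.
install() ->
    gen_event:add_handler(erl_signal_server, ?MODULE, []).

init([]) -> {ok, []}.

handle_event(sigterm, State) ->
    io:format("SIGTERM received; init:stop() should follow~n"),
    {ok, State};
handle_event(_Signal, State) ->
    {ok, State}.

handle_call(_Request, State) -> {ok, ok, State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Arg, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.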

How do I figure out why init:stop() isn't working?
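
From a remote shell, one place to start is asking where
application_controller is stuck, using standard process introspection:

1> P = whereis(application_controller).
2> erlang:process_info(P, [current_function, current_stacktrace,
                           status, message_queue_len]).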

Re: Node not responding to init:stop()

Roger Lipscombe
Right, so I've done some digging. I got a crash dump and by looking at
the stack trace for application_controller, I can see that it's
hanging in application_controller:terminate, presumably while waiting
for one of the processes to stop. The pid given is <0.893.0>. Looking
a bit further up the stack trace gets me the process state (possibly
stale?), which contains a bunch of applications and pids. The pid it's
hanging on belongs to 'kernel', which I'd expect to be the last (?)
application to be shut down. The process state (assuming it's not
stale) has a load of other applications in there with pids. There are
a bunch of ours, but also others, such as runtime_tools, cowboy,
amqp_client, hackney, lager, etc.

Is any of this useful in finding out why it's hanging?
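
In case it's useful to anyone digging into something similar: dumps
like this can be browsed with crashdump_viewer (from the observer
application), and a hung node can be made to produce a fresh
erl_crash.dump either with 'kill -USR1 <os pid>' or, from a remote
shell, with erlang:halt/1 and a string slogan:

1> crashdump_viewer:start().               %% browse an existing dump
2> erlang:halt("dump of stuck shutdown").  %% write a new dump and exit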


Re: Node not responding to init:stop()

Siri Hansen
Hi Roger,

There have been a few changes in the shutdown procedure (init.erl) with the introduction of Logger in OTP-21, and some changes in the supervision tree of the Kernel application related to Logger. I don't know if this has anything to do with your problem, but it's certainly possible.

As you say, Kernel is the last application to be taken down (it's the last element in the #state.running of application_controller), and the state you see is probably stale in the sense that all applications listed before kernel in the 'running' list should already have been terminated. From the crash dump - can you see how many processes are still alive? Are they all part of kernel, or are there more applications that did not completely terminate?
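
On a live node the same state can be fetched directly, since
application_controller is a gen_server:

1> sys:get_state(application_controller).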

And if we are to look closer at the logger track - do you know which logger handlers were active? Are there any children under the logger_sup supervisor, and if so, what is their state? You mention lager above - do you know how lager and logger are connected (is lager a handler under logger, or does it run independently)?
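
On a running node, for example:

1> logger:get_handler_config().            %% active handlers and their config
2> supervisor:which_children(logger_sup).  %% children under logger_sup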

If you see anything suspicious related to logger, please feel free to send me the crash dump off-list, and I'll see if I can figure out anything more.

Regards
/siri




Re: Node not responding to init:stop()

Roger Lipscombe
Before I go looking for permission to release the crash dump (it'll
take at least a couple of hours; I'll ask), let's see what we can
figure out from here.

There are 44 processes alive. Most of them look fairly innocuous, but,
in terms of logger, I see:

- A process <0.2130.0> sitting in erlang:apply, issuing
error_logger:stop, which is a gen_server:call, timeout infinity, to
<0.927.0>. Stack is logger_server:'-call_h_async/4-fun-0-'/2 + 152 ->
logger_server:call_h/4 + 96 -> error_logger:removing_handler/1 + 112
-> error_logger:stop/0 + 72
- <0.927.0> is 'logger_sup'; it appears to be trying to call
<0.890.0>, which is 'logger'.
- <0.890.0> is, apparently, sitting in proc_lib:init_p_do_apply/3

I suspect that logger_sup has failed in supervisor:do_terminate, and
is attempting to report an error via logger (which is problematic, I
guess). I see the following on the logger_sup stack:

#{level => error,
  meta =>
      #{domain => [otp,sasl],
        error_logger => #{tag => error_report,type => supervisor_report},
        file => "supervisor.erl",
        gl => <0.894.0>,
        line => 838,
        logger_formatter => #{title => "SUPERVISOR REPORT"},
        mfa => {supervisor,do_terminate,2},
        pid => <0.927.0>,
        report_cb => fun logger:format_otp_report/1,time => 1530616282326266},
  msg =>
      {report,#{label => {supervisor,shutdown_error},
                report =>
                    [{supervisor,{local,logger_sup}},
                     {errorContext,shutdown_error},
                     {reason,killed},
                     {offender,[{pid,<0.929.0>},
                                {id,default},
                                {mfargs,{logger_std_h,start_link,undefined}},
                                {restart_type,temporary},
                                {shutdown,2000},
                                {child_type,worker}]}]}}}

At this point, logger_sup has zero links, is monitoring 'logger' and
is monitored by kernel_sup and the process that's calling
error_logger:stop.

For a *running* node, I get:

1> supervisor:which_children(logger_sup).
[{error_logger,<0.1023.0>,worker,dynamic},
 {default,<0.929.0>,worker,[logger_std_h]}]

Assuming that the low process IDs are relatively stable, that
<0.929.0> for the default logger correlates with the offender in the
supervisor report above.
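
For completeness, the corresponding links and monitors are also
visible on a live node with standard introspection:

1> process_info(whereis(logger_sup), [links, monitors, monitored_by]).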

I *believe* that lager's running independently. That is: I've done
nothing deliberate to connect the two. I'm using lager 3.6.3.
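
One way to double-check that (lager_event is lager's gen_event
manager name):

1> [maps:get(id, H) || H <- logger:get_handler_config()].  %% lager should not appear here
2> gen_event:which_handlers(lager_event).                  %% lager's own backends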

Thanks,
Roger.


Re: Node not responding to init:stop()

Roger Lipscombe
Note that I can reproduce this consistently, so if there's any
experimentation (tracing, patches, etc.) you want me to try, I should
be able to do that.
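
For instance, I could run something like this (standard dbg calls) on
the node just before triggering shutdown, to watch the
logger/error_logger path:

1> dbg:tracer().
2> dbg:p(all, [c]).                  %% trace function calls in all processes
3> dbg:tpl(logger_server, '_', []).  %% all logger_server functions
4> dbg:tpl(error_logger, stop, []).  %% error_logger:stop/0
5> init:stop().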


Re: Node not responding to init:stop()

Siri Hansen
For information: after some off-list troubleshooting, we found the cause of this issue to be a deadlock between error_logger and the logger_sup supervisor. A correction was merged to maint and master today.
Regards
/siri

