OTP 20.0.2 segfault at 5c ip in beam.smp

Gerhard Lazu
Hi,

A RabbitMQ 3.6.11-rc.3 node crashed when OTP 20.0.2 segfaulted. This is the only information that we have about the Erlang VM itself:

    Aug 16 04:08:59 localhost kernel: [61139.180699] 2_scheduler[15162]: segfault at 5c ip 00000000005ba760 sp 00007f8fea1ffc40 error 4 in beam.smp[400000+333000]

This is the last line logged by RMQ, 2 seconds before the Erlang VM segfault:

    =INFO REPORT==== 16-Aug-2017::04:08:57 === closing AMQP connection <0.24230.9> (10.0.16.35:42562 -> 10.0.16.23:5672, vhost: '/', user: 'admin')

beam.smp runs with the following flags:

    /var/vcap/packages/erlang-20.0.2/lib/erlang/erts-9.0.2/bin/beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdbbl 128000 -K true -- -root /var/vcap/packages/erlang-20.0.2/lib/erlang -progname erl -- -home /home/vcap -- -pa /var/vcap/jobs/rabbitmq-server/packages/rabbitmq-server/ebin -noshell -noinput -s rabbit boot -sname rabbit@rmq0-rmq-gcp-36 -boot start_sasl -config /var/vcap/jobs/rabbitmq-server/rabbitmq -kernel inet_default_connect_options [{nodelay,true}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file,"/var/vcap/sys/log/rabbitmq-server/rabbit@rmq0-rmq-gcp-36.log"} -rabbit sasl_error_logger {file,"/var/vcap/sys/log/rabbitmq-server/rabbit@rmq0-rmq-gcp-36-sasl.log"} -rabbit enabled_plugins_file "/var/vcap/jobs/rabbitmq-server/packages/rabbitmq-server/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/var/vcap/jobs/rabbitmq-server/packages/rabbitmq-server/plugins" -rabbit plugins_expand_dir "/var/vcap/store/rabbitmq-server/mnesia/rabbit@rmq0-rmq-gcp-36-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/vcap/store/rabbitmq-server/mnesia/rabbit@rmq0-rmq-gcp-36" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672

We are running on:

    4.4.0-83-generic #106~14.04.1-Ubuntu SMP Mon Jun 26 18:10:19 UTC 2017 x86_64 GNU/Linux

When this happens again, how do I capture more debugging information? I would like to file an OTP bug report, but I'm not sure that we have enough information.

Apparently, this was reported at least once before, against OTP 18.1: https://github.com/rabbitmq/rabbitmq-server/issues/459

Thank you, Gerhard.
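
The kernel line quoted in this message can be decoded a little further. The sketch below is only an illustration, assuming the usual x86 Linux format of that message, where the number after "error" is the page-fault error code in hex (bit 0: page was present, bit 1: write access, bit 2: fault raised in user mode). The function name and dictionary keys are made up for this example.

```python
import re

# Kernel "segfault at ..." message as quoted in the post (x86 Linux format assumed).
SAMPLE = (
    "Aug 16 04:08:59 localhost kernel: [61139.180699] 2_scheduler[15162]: "
    "segfault at 5c ip 00000000005ba760 sp 00007f8fea1ffc40 "
    "error 4 in beam.smp[400000+333000]"
)

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<image>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    err = int(m["err"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "image": m["image"],
        # Distance of the faulting instruction from the start of the mapping.
        "ip_offset_in_mapping": ip - base,
        "ip_inside_mapping": base <= ip < base + size,
        # Page-fault error code bits.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


info = decode_segfault(SAMPLE)
for key, value in info.items():
    print(key, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

For this crash it reports a user-mode read of an unmapped page at address 0x5c, with the instruction pointer 0x1ba760 bytes into the beam.smp mapping. An address that close to zero is often, though not necessarily, a NULL pointer plus a field offset. Resolving the ip to a function still needs the exact beam.smp binary, for example with gdb or addr2line.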

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: OTP 20.0.2 segfault at 5c ip in beam.smp

Lukas Larsson-8
Hello,

To debug this, we need the core file generated by the crash and the exact beam.smp binary that was running when it crashed.

Lukas


Re: OTP 20.0.2 segfault at 5c ip in beam.smp

Gerhard Lazu
The crash dump was never generated because the Erlang VM segfaulted.

Is there a way to force extra debugging output in segfault scenarios?


Re: OTP 20.0.2 segfault at 5c ip in beam.smp

Lukas Larsson-8
You need to instruct the operating system to save a core dump; see http://man7.org/linux/man-pages/man5/core.5.html.

Some operating systems can be configured to write parts of the core dump to the syslog when a segfault happens. That can be useful, but a full core dump contains much more information and is far more useful.

Lukas
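
As a rough illustration of the advice above: on Linux, whether a process may write a core file depends, among other things described in core(5), on its "Max core file size" limit, and the file's name and location come from /proc/sys/kernel/core_pattern. The sketch below reads both for a given pid. It assumes a Linux /proc filesystem, and the helper names are hypothetical.

```python
import sys
from pathlib import Path

LABEL = "Max core file size"


def parse_core_limit(limits_text):
    """Return (soft, hard) core size limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith(LABEL):
            soft, hard = line[len(LABEL):].split()[:2]
            return soft, hard
    raise ValueError("no 'Max core file size' line found")


def core_dumps_allowed(soft):
    """A soft limit of 0 means the kernel will not write a core file."""
    return soft != "0"


def report(pid):
    """Print the core limit of a running process and the system core_pattern."""
    soft, hard = parse_core_limit(Path(f"/proc/{pid}/limits").read_text())
    pattern = Path("/proc/sys/kernel/core_pattern").read_text().strip()
    print(f"pid {pid}: soft={soft} hard={hard} allowed={core_dumps_allowed(soft)}")
    # A pattern starting with '|' pipes the dump to a helper program instead
    # of writing a file next to the process.
    print(f"core_pattern: {pattern}")


if __name__ == "__main__" and len(sys.argv) == 2:
    report(sys.argv[1])  # e.g. the pid of the running beam.smp
```

The limit is inherited from whatever starts the node, so it has to be raised there (for instance with `ulimit -c unlimited` in the start script) before beam.smp is launched; changing it in an unrelated shell has no effect on the running VM.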


Re: OTP 20.0.2 segfault at 5c ip in beam.smp

Sverker Eriksson-4

You can provoke a VM "crash" with erlang:halt(abort) to test if the OS creates a core dump file.


/Sverker
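
One way to try the suggestion above without touching the production node is to start a throwaway VM that immediately calls erlang:halt(abort) and then look at how it died. This is a minimal sketch, assuming `erl` is on the PATH; the helper names are made up, and it does not search for the core file because its location depends on the system's core_pattern.

```python
import signal
import subprocess

# Start a fresh VM, abort it at once. This never connects to an existing node.
ABORT_CMD = ["erl", "-noshell", "-eval", "erlang:halt(abort)."]


def describe_exit(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def provoke_abort(cwd="."):
    """Run the throwaway VM in cwd and report how it terminated."""
    proc = subprocess.run(ABORT_CMD, cwd=cwd)
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    print(provoke_abort())
```

If the VM is reported as killed by SIGABRT but no core file shows up, the core size limit or core_pattern is the likely place to look. The test is only meaningful when run with the same user and limits as the real node.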



Re: OTP 20.0.2 segfault at 5c ip in beam.smp

Gerhard Lazu
Thank you Lukas & Sverker. Patiently waiting for the next Erlang VM segfault : )
