Segfault with Erlang R22

Segfault with Erlang R22

Bekes, Andras G

Hi All,

After upgrading to Erlang R22, my software crashes the Erlang VM with a segmentation fault.
It happens rarely, only after several days of test workload, so I can't reproduce it on demand.

I have collected more than 10 core dumps so far and loaded them into gdb. All of them died at one of these 2 crash points:

Program terminated with signal 11, Segmentation fault.
#0  process_main (x_reg_array=0x20002, f_reg_array=0x2ade29280590) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:4064
4064      if (is_not_tuple(r(0))) {

or

#0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252
5252          c_p->seq_trace_lastcnt = unsigned_val(SEQ_TRACE_TOKEN_SERIAL(c_p));

The software is not doing any tracing when the crash happens, nor does it have any NIFs.

What should be the next step of investigation?

Thanks

Andras G. Bekes, Vice President   
Morgan Stanley | Institutional Securities Tech   
Lechner Odon fasor 8 | Floor 07   
Budapest, 1095   
Phone: +36 1 882-0791   
[hidden email]   
http://mgstn.ly/budapest   
   



NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent required and/or permitted under applicable law, to monitor electronic communications, including telephone calls with Morgan Stanley personnel. This message is subject to the Morgan Stanley General Disclaimers available at the following link: http://www.morganstanley.com/disclaimers.  If you cannot access the links, please notify us by reply message and we will send the contents to you. By communicating with Morgan Stanley you acknowledge that you have read, understand and consent, (where applicable), to the foregoing and the Morgan Stanley General Disclaimers.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Segfault with Erlang R22

Mikael Pettersson-5
On Fri, Oct 18, 2019 at 4:34 PM Bekes, Andras G
<[hidden email]> wrote:

> [...]

I think you should open a bug report in the erlang bug tracker.

The second crash site above is in the remove_message() function, in a
block where ERL_MESSAGE_TOKEN(msgp)
is neither NIL nor am_undefined, but SEQ_TRACE_TOKEN(c_p) is invalid
(not the expected 5-tuple).

Maybe printing *c_p in gdb when that happens could shed some light.

/Mikael

RE: Segfault with Erlang R22

Bekes, Andras G
Hi Mikael,

I filed a bug report in the bug tracker: https://bugs.erlang.org/browse/ERL-1074

Unfortunately printing *c_p did not reveal anything:

(gdb) print c_p
$1 = <value optimized out>
(gdb) print *c_p
value has been optimized out

What should be the next step?
I can reliably produce 5-10 core dumps per week in my test system.

-----Original Message-----
From: Mikael Pettersson [mailto:[hidden email]]
Sent: Friday, October 18, 2019 8:03 PM
To: Bekes, Andras G (IST)
Cc: Erlang Questions
Subject: Re: [erlang-questions] Segfault with Erlang R22

[...]

Re: Segfault with Erlang R22

Mikael Pettersson-5
On Thu, Oct 24, 2019 at 4:57 PM Bekes, Andras G
<[hidden email]> wrote:

> [...]

I'd try to get a backtrace (bt command in gdb) from the crashed
thread, then maybe print
the c_p parameter via its raw value (print *(Process*)0x.....) if gdb
insists that the value
is optimized out.

/Mikael


RE: Segfault with Erlang R22

Bekes, Andras G
Program terminated with signal 11, Segmentation fault.
#0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252

5252          c_p->seq_trace_lastcnt = unsigned_val(SEQ_TRACE_TOKEN_SERIAL(c_p));
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.212.el6_10.3.x86_64
(gdb) bt
#0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252
#1  0x00000000004641a4 in sched_thread_func (vesdp=0x2b8244840200) at beam/erl_process.c:8465
#2  0x000000000069262a in thr_wrapper (vtwd=<value optimized out>) at pthread/ethread.c:118
#3  0x00002b81f80f7dd5 in _L_unlock_48 () from /lib64/libpthread.so.0
#4  0x00002b81f80f5eb3 in __find_thread_by_id () from /lib64/libpthread.so.0
#5  0x0000000000000000 in ?? ()
(gdb)

I am not sure how to "print the c_p parameter via its raw value (print *(Process*)0x.....)".
Where should I take the 0x..... value from?

-----Original Message-----
From: Mikael Pettersson [mailto:[hidden email]]
Sent: Thursday, October 24, 2019 7:10 PM
To: Bekes, Andras G (IST)
Cc: Erlang Questions
Subject: Re: [erlang-questions] Segfault with Erlang R22

[...]

Re: Segfault with Erlang R22

Eckard Brauer
It's been a few years, but IIRC it is either "print *c_p" or "print
*((Process*) c_p)". The problem is probably that the processor has
already left the stack frame where c_p is valid.

You can do "info stack" at this point and select the frame with "frame
<#>" to try it again. If you're a little familiar with assembly
language, you can even have a look at "disassemble <address>" or
"disassemble function" to get an idea of where values are at what point
in the instruction/processing flow - sometimes this helps too.

I'd investigate starting with frame 2 here, as all frames below are
already in libpthread.
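
The commands mentioned above could be combined roughly as follows. This is only a sketch: the frame numbers are examples taken from the backtrace in this thread, and which arguments, locals and registers are still meaningful depends on how the emulator was optimized.

```gdb
# List the frames of the crashed thread (same information as "bt").
info stack
# Select the caller's frame and show what gdb still knows about it.
frame 1
info args
info locals
# Go back to the crashing frame and look at the registers.
frame 0
info registers
# Show the faulting instruction and the few that follow it.
x/8i $pc
# Disassemble a small range starting at the current pc.
disassemble $pc,+32
```

Comparing the faulting instruction with the register contents often shows which register held the bad pointer, even when gdb reports the C variable as optimized out.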

Hope that helps a bit...

Eckard


Am Fri, 1 Nov 2019 18:22:18 +0000
schrieb "Bekes, Andras G" <[hidden email]>:

> [...]



--
Wir haften nicht für die korrekte Funktion der in dieser eMail
enthaltenen Viren. We are not liable for correct function of the
viruses in this email! :)

RE: Segfault with Erlang R22

Bekes, Andras G
I am not entirely sure of what we're doing, but here is the output:

(gdb) frame 0
#0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252
5252          c_p->seq_trace_lastcnt = unsigned_val(SEQ_TRACE_TOKEN_SERIAL(c_p));
(gdb) print x_reg_array
$4 = (Eterm *) 0x20002
(gdb) print *x_reg_array
Cannot access memory at address 0x20002
(gdb) print f_reg_array
$5 = (FloatDef *) 0x2
(gdb) print *f_reg_array
Cannot access memory at address 0x2
(gdb) print c_p
$7 = <value optimized out>
(gdb) print *c_p
value has been optimized out

(gdb) frame 1
#1  0x00000000004641a4 in sched_thread_func (vesdp=0x2b8244840200) at beam/erl_process.c:8465
8465        process_main(esdp->x_reg_array, esdp->f_reg_array);
(gdb) print esdp
$8 = (ErtsSchedulerData *) 0x2b8244840200
(gdb) print *esdp
$9 = {x_reg_array = 0x2b823e940200, f_reg_array = 0x2b823e942240, timer_wheel = 0x2b82450f5c80,
  next_tmo_ref = 0x2b8245136120, timer_service = 0x2b8245176680, tid = 47838514915072, erl_bits_state = {
    byte_buf_ = 0x2b823ad81058 "", byte_buf_len_ = 1, erts_current_bin_ = 0x2b832e60c688 "\n\274\362T",
    erts_bin_offset_ = 32, erts_writable_bin_ = 0}, match_pseudo_process = 0x2b823bec7c78, free_process = 0x0,
  thr_progress_data = {id = 1, is_managed = 1, is_blocking = 0, is_temporary = 0, wakeup_request = {5707836, 5707869,
      5707862, 5707859}, leader = 0, active = 1, confirmed = 5707879, leader_state = {next = 5707875,
      current = 18446744073709551615, chk_next_ix = 2, umrefc_ix = {current = 0, waiting = -1}}}, ssi = 0x2b823be7e680,
  current_process = 0x2b82431401d8, type = ERTS_SCHED_NORMAL, no = 1, dirty_no = 0, flxctr_slot_no = 1,
  current_nif = 0x0, dirty_shadow_process = 0x0, current_port = 0x0, run_queue = 0x2b823be77ec0, virtual_reds = 0,
  cpu_id = -1, aux_work_data = {sched_id = 1, esdp = 0x2b8244840200, ssi = 0x2b823be7e680, current_thr_prgr = 5707878,
    latest_wakeup = 5707869, misc = {ix = 0, thr_prgr = 18446744073709551615}, dd = {thr_prgr = 5707869}, cncld_tmrs = {
      thr_prgr = 5707146}, later_op = {thr_prgr = 5707880, size = 65384, first = 0x2b832e8d76c8,
      last = 0x2b832e8d76c8}, async_ready = {need_thr_prgr = 0, thr_prgr = 18446744073709551615,
      queue = 0x2b8245059880}, delayed_wakeup = {next = 18446744073709551615, sched2jix = 0x2b82443650c8, jix = -1,
      job = 0x2b8244364f00}, yield = {alcu_blockscan = {current = 0x0, last = 0x0}, ets_all = {ongoing = 0x0,
        hfrag = 0x0, tab = 0x0, queue = 0x0}}, debug = {wait_completed = {flags = 0, callback = 0, arg = 0x0}}},
  atom_cache_map = {hdr_sz = -1, sz = 0, long_atoms = 0, cix = {0 <repeats 2048 times>}, cache = {{atom = 0,
        iix = -1} <repeats 2048 times>}}, last_monotonic_time = 54631431230926, check_time_reds = 3137, thr_id = 1,
  unique = 251, ref = 1016430404454740281, alloc_data = {deallctr = {0x0, 0x0, 0x0, 0x2b81f77a9200, 0x2b823ad37200,
      0x2b823bdaa200, 0x2b823e8d5200, 0x2b8240f13200, 0x2b8242ff9200, 0x0, 0x0, 0x2b823fea0200, 0x2b8241f86200, 0x0},
    pref_ix = {0, -1, 1, 1, 1, 1, 1, 1, 1, -1, -1, 1, 1, -1}, flist_ix = {0 <repeats 14 times>}, pre_alc_ix = 0}, io = {
    out = 21255996115, in = 21476723837}, pending_signal = {sig = 0x0, to = 0}, reductions = 1006796378,
  sched_wall_time = {u = {mod = {counter = 0}, need = 0}, enabled = 0, start = 0, working = {total = 0, start = 0}},
  gc_info = {reclaimed = 775481964, garbage_cols = 680476}, nosuspend_port_task_handle = {counter = 0}, ets_tables = {
    count = {counter = 0}, clist = 0x0}}
(gdb) print esdp->x_reg_array
$10 = (Eterm *) 0x2b823e940200
(gdb) print *esdp->x_reg_array
$11 = 2522015978211937347
(gdb) print esdp->f_reg_array
$12 = (FloatDef *) 0x2b823e942240
(gdb) print *esdp->f_reg_array
$13 = {fd = 0.002545, fb = "\323\023\226x@\331d?", fs = {5075, 30870, 55616, 16228}, fw = {2023101395, 1063573824},
  fdw = 4568014792984761299}

Frame 2 is already in pthread/ethread.c.


-----Original Message-----
From: erlang-questions [mailto:erlang-questions-bounces+andras.bekes=[hidden email]] On Behalf Of Eckard Brauer
Sent: Saturday, November 02, 2019 10:01 AM
To: [hidden email]
Subject: Re: [erlang-questions] Segfault with Erlang R22

[...]

Re: Segfault with Erlang R22

Jonas Falkevik
Have you tried loading the ERTS gdb scripts?
They should be found under "erts/etc/unix/etp-commands".
Then you can, for example, get a stack trace from the process:

(gdb) source <path to etp-commands>
(gdb) set $p = (Process *)0x2b82431401d8
(gdb) etp-stacktrace $p

/Jonas



On Thu, Nov 7, 2019 at 5:59 PM Bekes, Andras G <[hidden email]> wrote:
I am not entirely sure of what we're doing, but here is the output:

(gdb) frame 0
#0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252
5252          c_p->seq_trace_lastcnt = unsigned_val(SEQ_TRACE_TOKEN_SERIAL(c_p));
(gdb) print x_reg_array
$4 = (Eterm *) 0x20002
(gdb) print *x_reg_array
Cannot access memory at address 0x20002
(gdb) print f_reg_array
$5 = (FloatDef *) 0x2
(gdb) print *f_reg_array
Cannot access memory at address 0x2
 (gdb) print c_p
$7 = <value optimized out>
(gdb) print *c_p
value has been optimized out

(gdb) frame 1
#1  0x00000000004641a4 in sched_thread_func (vesdp=0x2b8244840200) at beam/erl_process.c:8465
8465        process_main(esdp->x_reg_array, esdp->f_reg_array);
(gdb) print esdp
$8 = (ErtsSchedulerData *) 0x2b8244840200
(gdb) print *esdp
$9 = {x_reg_array = 0x2b823e940200, f_reg_array = 0x2b823e942240, timer_wheel = 0x2b82450f5c80,
  next_tmo_ref = 0x2b8245136120, timer_service = 0x2b8245176680, tid = 47838514915072, erl_bits_state = {
    byte_buf_ = 0x2b823ad81058 "", byte_buf_len_ = 1, erts_current_bin_ = 0x2b832e60c688 "\n\274\362T",
    erts_bin_offset_ = 32, erts_writable_bin_ = 0}, match_pseudo_process = 0x2b823bec7c78, free_process = 0x0,
  thr_progress_data = {id = 1, is_managed = 1, is_blocking = 0, is_temporary = 0, wakeup_request = {5707836, 5707869,
      5707862, 5707859}, leader = 0, active = 1, confirmed = 5707879, leader_state = {next = 5707875,
      current = 18446744073709551615, chk_next_ix = 2, umrefc_ix = {current = 0, waiting = -1}}}, ssi = 0x2b823be7e680,
  current_process = 0x2b82431401d8, type = ERTS_SCHED_NORMAL, no = 1, dirty_no = 0, flxctr_slot_no = 1,
  current_nif = 0x0, dirty_shadow_process = 0x0, current_port = 0x0, run_queue = 0x2b823be77ec0, virtual_reds = 0,
  cpu_id = -1, aux_work_data = {sched_id = 1, esdp = 0x2b8244840200, ssi = 0x2b823be7e680, current_thr_prgr = 5707878,
    latest_wakeup = 5707869, misc = {ix = 0, thr_prgr = 18446744073709551615}, dd = {thr_prgr = 5707869}, cncld_tmrs = {
      thr_prgr = 5707146}, later_op = {thr_prgr = 5707880, size = 65384, first = 0x2b832e8d76c8,
      last = 0x2b832e8d76c8}, async_ready = {need_thr_prgr = 0, thr_prgr = 18446744073709551615,
      queue = 0x2b8245059880}, delayed_wakeup = {next = 18446744073709551615, sched2jix = 0x2b82443650c8, jix = -1,
      job = 0x2b8244364f00}, yield = {alcu_blockscan = {current = 0x0, last = 0x0}, ets_all = {ongoing = 0x0,
        hfrag = 0x0, tab = 0x0, queue = 0x0}}, debug = {wait_completed = {flags = 0, callback = 0, arg = 0x0}}},
  atom_cache_map = {hdr_sz = -1, sz = 0, long_atoms = 0, cix = {0 <repeats 2048 times>}, cache = {{atom = 0,
        iix = -1} <repeats 2048 times>}}, last_monotonic_time = 54631431230926, check_time_reds = 3137, thr_id = 1,
  unique = 251, ref = 1016430404454740281, alloc_data = {deallctr = {0x0, 0x0, 0x0, 0x2b81f77a9200, 0x2b823ad37200,
      0x2b823bdaa200, 0x2b823e8d5200, 0x2b8240f13200, 0x2b8242ff9200, 0x0, 0x0, 0x2b823fea0200, 0x2b8241f86200, 0x0},
    pref_ix = {0, -1, 1, 1, 1, 1, 1, 1, 1, -1, -1, 1, 1, -1}, flist_ix = {0 <repeats 14 times>}, pre_alc_ix = 0}, io = {
    out = 21255996115, in = 21476723837}, pending_signal = {sig = 0x0, to = 0}, reductions = 1006796378,
  sched_wall_time = {u = {mod = {counter = 0}, need = 0}, enabled = 0, start = 0, working = {total = 0, start = 0}},
  gc_info = {reclaimed = 775481964, garbage_cols = 680476}, nosuspend_port_task_handle = {counter = 0}, ets_tables = {
    count = {counter = 0}, clist = 0x0}}
(gdb) print esdp->x_reg_array
$10 = (Eterm *) 0x2b823e940200
(gdb) print *esdp->x_reg_array
$11 = 2522015978211937347
(gdb) print esdp->f_reg_array
$12 = (FloatDef *) 0x2b823e942240
(gdb) print *esdp->f_reg_array
$13 = {fd = 0.002545, fb = "\323\023\226x@\331d?", fs = {5075, 30870, 55616, 16228}, fw = {2023101395, 1063573824},
  fdw = 4568014792984761299}

Frame 2 is already in pthread/ethread.c.
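
One possible source for the 0x..... in Mikael's suggestion: the scheduler data printed above has a current_process field (0x2b82431401d8 in this dump), which should be the process the scheduler was running when it crashed. A sketch of the gdb commands follows. It assumes that esdp is visible in frame 1 (sched_thread_func) and that the binary's debug info defines the Process and ErtsSchedulerData types. The addresses are the ones from this core dump and will differ in every other dump.

```gdb
# Select the scheduler thread's frame (vesdp=0x2b8244840200 in the backtrace)
frame 1
# Address of the process that was executing on this scheduler
print esdp->current_process
# Fallback if esdp is optimized out: cast the raw vesdp value from the backtrace
print ((ErtsSchedulerData *) 0x2b8244840200)->current_process
# Dump the process structure via its raw address (value taken from this dump)
print *(Process *) 0x2b82431401d8
# The field written by the crashing line at beam_hot.h:5252
print ((Process *) 0x2b82431401d8)->seq_trace_lastcnt
```

If the last two commands fail with a memory access error, that would suggest the process pointer itself is bad rather than the sequential trace token it holds.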


-----Original Message-----
From: erlang-questions [mailto:[hidden email]] On Behalf Of Eckard Brauer
Sent: Saturday, November 02, 2019 10:01 AM
To: [hidden email]
Subject: Re: [erlang-questions] Segfault with Erlang R22

It's been a few years, but IIRC it's either "print *c_p" or "print
*((Process*) c_p)". The problem is probably that execution has
already left the stack frame where c_p is valid.

You can do "info stack" at this point and select the frame with "frame
<#>" to try it again. If you're a little familiar with assembly
language, you can even have a look at "disassemble <address>" or
"disassemble function" to get an idea of where values are at what point
in the instruction/processing flow - sometimes this helps too.

I'd investigate starting with frame 2 here, as all frames below are
already in libpthread.

Hope that helps a bit...

Eckard


On Fri, 1 Nov 2019 18:22:18 +0000,
"Bekes, Andras G" <[hidden email]> wrote:

> Program terminated with signal 11, Segmentation fault.
> #0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252
> 5252          c_p->seq_trace_lastcnt = unsigned_val(SEQ_TRACE_TOKEN_SERIAL(c_p));
> Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.212.el6_10.3.x86_64
> (gdb) bt
> #0  process_main (x_reg_array=0x20002, f_reg_array=0x2) at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:5252
> #1  0x00000000004641a4 in sched_thread_func (vesdp=0x2b8244840200) at beam/erl_process.c:8465
> #2  0x000000000069262a in thr_wrapper (vtwd=<value optimized out>) at pthread/ethread.c:118
> #3  0x00002b81f80f7dd5 in _L_unlock_48 () from /lib64/libpthread.so.0
> #4  0x00002b81f80f5eb3 in __find_thread_by_id () from /lib64/libpthread.so.0
> #5  0x0000000000000000 in ?? ()
> (gdb)
>
> I am not sure how to "print the c_p parameter via its raw value
> (print *(Process*)0x.....)". Where should I get the value 0x.....
> from?
>
> -----Original Message-----
> From: Mikael Pettersson [mailto:[hidden email]]
> Sent: Thursday, October 24, 2019 7:10 PM
> To: Bekes, Andras G (IST)
> Cc: Erlang Questions
> Subject: Re: [erlang-questions] Segfault with Erlang R22
>
> On Thu, Oct 24, 2019 at 4:57 PM Bekes, Andras G
> <[hidden email]> wrote:
>  [...] 
>
> I'd try to get a backtrace (bt command in gdb) from the crashed
> thread, then maybe print
> the c_p parameter via its raw value (print *(Process*)0x.....) if gdb
> insists that the value
> is optimized out.
>
> /Mikael
>
>  [...] 
>  [...] 
>  [...] 
>



--
We are not liable for the correct functioning of the viruses
contained in this email! :)
