beam core file R17

Matthew Evans-2
Hi,

This core was found on a live system (R17):

10:16:38:# erl
Erlang/OTP 17 [erts-6.2] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.2  (abort with ^G)
1>

........

May 28 22:18:56 [info   ] plexxi kernel: [1235119.885465] beam.smp[2267] general protection ip:4b698a sp:7faeb6a7d650 error:0 in beam.smp[400000+1ac000]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `/usr/lib/erlang/erts-6.2/bin/beam.smp -K true -A 24 -P 350000 -- -root /usr/lib'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004b698a in sweep_off_heap ()
(gdb) bt
#0  0x00000000004b698a in sweep_off_heap ()
#1  0x00000000004b77d1 in do_minor ()
#2  0x00000000004b8479 in erts_garbage_collect ()
#3  0x00000000004e1374 in process_main ()
#4  0x000000000048071d in sched_thread_func ()
#5  0x0000000000549f89 in thr_wrapper ()
#6  0x00007faeba32ba30 in start_thread () from /lib/libpthread.so.0
#7  0x00007faeb9e8a53d in clone () from /lib/libc.so.6
(gdb) up


_______________________________________________
erlang-bugs mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-bugs

Re: beam core file R17

Mikael Pettersson-5
It looks like your beam.smp binary lacks debugging information, so we only know the general
area where it crashed (sweep_off_heap() as called from do_minor()).  Crashes here would usually
be due to memory corruption, which could be caused by:
- a bug in the VM
- a bug in a NIF
- a bug in HiPE
- a bug in the C compiler used to compile the VM (I've seen that happen at least 3 times)
- a HW error (though you'd then also find e.g. machine check events logged)

If you want to debug this, you should first ensure that your beam.smp gets built and installed
with full debugging information (to verify, attach gdb and run bt and list).  You should also try
running without NIFs or native code, if you use them and can configure the system to do without them.
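
Even without debugging information, the kernel log line in the original post pins down where the fault happened relative to the executable's mapping. The following is a minimal sketch in Python (not something posted in the thread) that parses that one line and computes the offset of the faulting instruction pointer inside the beam.smp[base+size] region. The regular expression and the helper name parse_fault are assumptions made for illustration, based only on the layout of the line quoted above; other kernels may format the message differently.

```python
import re

# Layout assumed from the single kernel line quoted in the original post:
#   ... ip:<hex> sp:<hex> error:<hex> in <name>[<base>+<size>]
FAULT_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+) "
    r"in (?P<name>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_fault(line):
    """Return fault details from a kernel log line, or None if it does not match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    return {
        "name": m.group("name"),
        "ip": ip,
        "base": base,
        "size": size,
        "offset": ip - base,                    # position inside the mapping
        "inside": base <= ip < base + size,     # sanity check on the numbers
    }


# The line exactly as it appears in the original post.
LINE = (
    "May 28 22:18:56 [info   ] plexxi kernel: [1235119.885465] beam.smp[2267] "
    "general protection ip:4b698a sp:7faeb6a7d650 error:0 in beam.smp[400000+1ac000]"
)

if __name__ == "__main__":
    info = parse_fault(LINE)
    print(info["name"], hex(info["ip"]), hex(info["offset"]), info["inside"])
```

The parsed ip is the same address gdb reports for frame #0 (0x4b698a in sweep_off_heap), so the kernel line and the core file agree on the crash site.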

Re: beam core file R17

Matthew Evans-2
Thanks, this software does use NIFs.


Re: beam core file R17

Matthew Evans-2

Thanks,

Fortunately our latest release has moved all the NIF logic into a separate C-node-based process. I am leaning towards the NIF as the cause, since the VM itself has proven to be very stable.



Date: Sun, 29 May 2016 21:23:15 +0200
Subject: Re: [erlang-bugs] beam core file R17
From: [hidden email]
To: [hidden email]
CC: [hidden email]

sweep_off_heap runs when collecting refc binaries (among other things), so if a NIF has by mistake decremented the reference count of a binary too many times, this is the error you will see when the GC inspects the binary. I'd recommend looking for something like that in any NIFs you have.
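
To make the failure mode described above concrete, here is a toy model in Python. It is only a sketch: RefcBinary, Process, buggy_nif and UseAfterFree are hypothetical names invented for illustration and do not correspond to the real ERTS data structures or the erl_nif API. It shows how one extra decrement frees a binary that a process still references, so that a later sweep of the process's off-heap list reaches freed memory. In the real VM that access is in C, so instead of an exception the result may be a segfault like the one in the core file.

```python
class UseAfterFree(Exception):
    """Raised where real C code would be touching freed memory."""


class RefcBinary:
    """Toy reference-counted binary (hypothetical, not the ERTS struct)."""

    def __init__(self, payload):
        self.payload = payload
        self.refc = 1          # one reference, held by the creating process
        self.freed = False

    def release(self):
        if self.freed:
            raise UseAfterFree("release on a freed binary")
        self.refc -= 1
        if self.refc == 0:
            self.freed = True  # memory handed back to the allocator
            self.payload = None


class Process:
    """Toy process that tracks the off-heap binaries it refers to."""

    def __init__(self):
        self.off_heap = []

    def make_binary(self, payload):
        b = RefcBinary(payload)
        self.off_heap.append(b)
        return b

    def sweep(self, live):
        """Drop references to binaries that are no longer reachable."""
        kept = []
        for b in self.off_heap:
            if b.freed:
                raise UseAfterFree("sweep reached a freed binary")
            if b in live:
                kept.append(b)
            else:
                b.release()
        self.off_heap = kept


def buggy_nif(binary):
    """Hypothetical NIF bug: releases a reference it never took."""
    binary.release()


if __name__ == "__main__":
    p = Process()
    b = p.make_binary(b"payload")
    buggy_nif(b)               # refc drops to 0 while p still lists b
    try:
        p.sweep(live=[b])
    except UseAfterFree as exc:
        print("GC hit corrupted state:", exc)
```

Note that the fault shows up in the collector, well after the buggy call has returned, which is why the backtrace points at sweep_off_heap rather than at the NIF itself.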


Re: [erlang-questions] beam core file R17

Matthew Evans-2
For what it's worth, here's the gdb backtrace with symbols:

(gdb) bt
#0  sweep_off_heap (p=0x7faeb8fc5488, fullsweep=-1295963664) at beam/erl_gc.c:2353
#1  0x00000000004b77d1 in do_minor (p=0x7faeb8fc5488, new_sz=<optimized out>, objv=<optimized out>, nobj=<optimized out>) at beam/erl_gc.c:1166
#2  0x00000000004b8479 in minor_collection (recl=<optimized out>, nobj=<optimized out>, objv=<optimized out>, need=<optimized out>, p=<optimized out>) at beam/erl_gc.c:876
#3  erts_garbage_collect (p=0x7faeb8fc5488, need=<optimized out>, objv=<optimized out>, nobj=<optimized out>) at beam/erl_gc.c:450
#4  0x00000000004e1374 in process_main () at beam/beam_emu.c:1858
#5  0x000000000048071d in sched_thread_func (vesdp=<optimized out>) at beam/erl_process.c:7719
#6  0x0000000000549f89 in thr_wrapper (vtwd=<optimized out>) at pthread/ethread.c:106
#7  0x00007faeba32ba30 in ?? ()
#8  0x0000000000000000 in ?? ()



