Cause heartbeat timeout

Cause heartbeat timeout

Dmitry Kolesnikov-2
Hello,

I am trying to debug an issue where a node is terminated by a heartbeat timeout and then recovers. I am looking for advice on how to reproduce a heartbeat timeout in a controlled environment. What is the best approach to freeze an Erlang node?

Best Regards,
Dmitry
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Cause heartbeat timeout

Dmitry Klionsky-2
I have run into this situation twice.
Both times it was on 2-CPU EC2 instances, with a heartbeat timeout of 45 seconds.

The first offender was zlib:gzip/1 on a ~1 GB file; the other was ++ over
two long lists. Each operation took about a minute to complete.

My explanation is that the Erlang side of heart cannot send the heartbeat to
the heart port in time, because either the scheduler it is bound to is busy
or port I/O is busy or congested.
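
A minimal sketch of the list case, for anyone who wants to try it. It uses the -- operator (per the correction later in this thread) and times one long subtraction, optionally on every online scheduler. The module name heart_repro, both function names, and the size N are made up for illustration. Whether this actually starves heart is an assumption: it depends on the OTP release (newer releases are reported to handle -- on large inputs far better than the 2017-era ones) and on the scheduler count, so treat N as a knob to tune.

```erlang
-module(heart_repro).
-export([subtract_stall/1, stall_all/1]).

%% Time one A -- B where every element of B sits at the tail of what is
%% left of A, which is the slow case for a naive linear scan.
%% Returns {ElapsedMilliseconds, LengthOfResult}; the result list is empty.
subtract_stall(N) when is_integer(N), N > 0 ->
    A = lists:seq(1, N),
    B = lists:seq(N, 1, -1),
    {Micros, Result} = timer:tc(fun() -> A -- B end),
    {Micros div 1000, length(Result)}.

%% Run the same subtraction in one process per online scheduler, so that
%% no scheduler is left idle. Memory use grows with the scheduler count.
%% Returns the number of processes started.
stall_all(N) ->
    Count = erlang:system_info(schedulers_online),
    _ = [spawn(fun() -> subtract_stall(N) end) || _ <- lists:seq(1, Count)],
    Count.
```

Start with a small N, check the elapsed time from heart_repro:subtract_stall(N), and grow N until one call runs longer than the heartbeat timeout before switching to heart_repro:stall_all(N).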

Hope this helps.

On 11/08/2017 10:13 AM, Dmitry Kolesnikov wrote:

> Hello,
>
> I am trying to debug an issue of node termination by heartbeat timeout and its recovery. I am looking for advice about the reproduction of heartbeat timeout in controlled environment. What is the best approach to freeze Erlang node?
>
> Best Regards,
> Dmitry

--
BR,
Dmitry


Re: Cause heartbeat timeout

Dmitry Klionsky-2
Correction: the second time it was the -- operator, not ++.

On 11/08/2017 12:49 PM, Dmitry Klionsky wrote:

> I faced a situation like this two times.
> It was on a 2 CPUs EC2 instances. Heartbeat timeout was 45 secs.
>
> The first offender was zlib:gzip/1 and ~1GB file, the other was ++
> over two long lists. Both operations took about 1 min to complete.
>
> My explanation is the heart's erlang part can't send the heartbeat to
> the heart port in time because
> either the scheduler it's bound to is busy or port I/O is busy/congested.
>
> Hope this helps.

--
BR,
Dmitry


Re: Cause heartbeat timeout

Dmitry Kolesnikov-2
In reply to this post by Dmitry Klionsky-2
Thank you for the tips.

I have not managed to reproduce it with lists.
The node gets killed due to OOM.
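
One alternative that needs no memory at all is to suspend the emulator's OS process with SIGSTOP. This is a different technique from the list approach above, sketched here under assumptions: a Unix-like host with a kill command on the PATH, and a node started with -heart. The external heart program is a separate OS process, so it should keep running, stop receiving heartbeats, and act once HEART_BEAT_TIMEOUT expires. The module name heart_freeze and its function names are made up for illustration.

```erlang
-module(heart_freeze).
-export([stop_command/0, freeze/0]).

%% Build the shell command that suspends this node's own OS process.
%% os:getpid/0 returns the emulator's process id as a string.
stop_command() ->
    "kill -STOP " ++ os:getpid().

%% Suspend the whole emulator. The call does not return until the process
%% is resumed from outside (kill -CONT <pid>) or killed.
freeze() ->
    os:cmd(stop_command()).
```

Calling heart_freeze:freeze() from the node's shell freezes every scheduler at once, so the outcome does not depend on how busy any single scheduler is.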

- Dmitry.
