Infinite loop in async_del in erl_async.c

Anders.Ramsell

Hi!

We have recently experienced a problem where our linked-in driver,
which uses the asynchronous thread pool, suddenly enters an infinite
loop, eventually causing the whole Erlang runtime system to shut
down (without generating an erl_crash.dump).

This problem occurred on Windows 2000/2003 Server with SMP support
disabled and with 1024 asynchronous threads (+A 1024), running
Erlang/OTP R12B-2.

After a lot of investigation, the error appeared to be in the
function async_del in erl_async.c, and one of my colleagues now
believes he has identified the bug.

The problem is that the code does not advance past the first
element in a non-empty queue on a thread. If the first queue
element found does not have the id we are looking for, we get an
infinite loop. The code below includes a fix suggested by my
colleague, which solves the problem.

This bug is still present in R14.


static int async_del(long id)
{
    int i;
    /* scan all queues for an entry with async_id == 'id' */

    for (i = 0; i < erts_async_max_threads; i++) {
        ErlAsync* a;
        erts_mtx_lock(&async_q[i].mtx);
       
        a = async_q[i].head;
        while(a != NULL) {
            if (a->async_id == id) {
                if (a->prev != NULL)
                    a->prev->next = a->next;
                else
                    async_q[i].head = a->next;
                if (a->next != NULL)
                    a->next->prev = a->prev;
                else
                    async_q[i].tail = a->prev;
                async_q[i].len--;
                erts_mtx_unlock(&async_q[i].mtx);
                if (a->async_free != NULL)
                    a->async_free(a->async_data);
                async_detach(a->hndl);
                erts_free(ERTS_ALC_T_ASYNC, a);
                return 1;
            }
            a = a->next;         //<-- Add this line.
        }
        erts_mtx_unlock(&async_q[i].mtx);
    }
    return 0;
}
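
To make the traversal issue easier to try outside the emulator, here is a
minimal sketch of the same unlink logic on a simplified doubly linked
queue. The names node, queue, queue_push, queue_del and queue_check are
hypothetical stand-ins for illustration only; they are not part of
erl_async.c, and the mutex handling and the async_free/async_detach calls
are left out. The advance step is written in the for header, which is one
way to make it hard to drop by accident.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ErlAsync: only the fields the scan touches. */
typedef struct node {
    struct node *next;
    struct node *prev;
    long id;
} node;

/* Hypothetical stand-in for one per-thread queue (no mutex in this sketch). */
typedef struct {
    node *head;
    node *tail;
    int len;
} queue;

/* Append an entry with the given id at the tail; returns 0 if malloc fails. */
int queue_push(queue *q, long id)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->id = id;
    n->next = NULL;
    n->prev = q->tail;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    q->len++;
    return 1;
}

/* Unlink and free the entry with the given id; returns 1 if found, else 0. */
int queue_del(queue *q, long id)
{
    node *a;

    /* "a = a->next" is the step that was missing in the original loop. */
    for (a = q->head; a != NULL; a = a->next) {
        if (a->id != id)
            continue;
        if (a->prev != NULL)
            a->prev->next = a->next;
        else
            q->head = a->next;
        if (a->next != NULL)
            a->next->prev = a->prev;
        else
            q->tail = a->prev;
        q->len--;
        free(a);
        return 1;
    }
    return 0;
}

/* Sanity check: a forward walk sees q->len entries and ends at q->tail. */
void queue_check(const queue *q)
{
    int n = 0;
    const node *a, *last = NULL;

    for (a = q->head; a != NULL; a = a->next) {
        assert(a->prev == last);
        last = a;
        n++;
    }
    assert(last == q->tail);
    assert(n == q->len);
}
```

With entries 10, 20 and 30 pushed in that order, queue_del(&q, 20) has to
step past the first element before it finds a match, which is exactly the
case that never terminated without the advance step.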


/Best regards
Anders Ramsell
_______________________________________________
erlang-bugs mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-bugs

Re: Infinite loop in async_del in erl_async.c

Rickard Green-2
Thanks! We'll fix this in the soon-to-be-released R14B03.

Regards,
Rickard Green, Erlang/OTP, Ericsson AB

On May 16, 2011, at 6:51 PM, <[hidden email]> wrote:

> [...]