Understanding supervisor / start_link behaviour


Steve Strong
Hi,

I've got some strange behaviour with gen_event within a supervision tree which I don't fully understand.  Consider the following supervisor (completely standard, feel free to skip over):

<snip>

-module(sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).
-define(SERVER, ?MODULE).

start_link() ->
    supervisor:start_link({local, ?SERVER}, ?MODULE, []).

init([]) ->
    Child1 = {child, {child, start_link, []}, permanent, 2000, worker, [child]},
    {ok, {{one_for_all, 1000, 3600}, [Child1]}}.

</snip>

and corresponding gen_server (the interesting code is the gen_event:start_link call in init/1):

<snip>

-module(child).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2, 
handle_info/2, terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, child}, child, [], []).

init([]) ->
    io:format("about to start gen_event~n"),
    X = gen_event:start_link({local, my_gen_event}),
    io:format("gen_event started with ~p~n", [X]),
    {ok, _Pid} = X,

    {ok, {}, 2000}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    io:format("about to crash...~n"),
    1 = 2,
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

</snip>

If I run this from an erl shell like this:

<snip>

--> erl
Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.2  (abort with ^G)
1> application:start(sasl), supervisor:start_link(sup, []).

</snip>

Then the supervisor & server start as expected.  After 2 seconds the server gets a timeout message and crashes itself; the supervisor obviously spots this and restarts it.  Within the init of the gen_server, it also does a start_link on a gen_event process.  By my understanding, whenever the gen_server process exits, the gen_event will also be terminated.

However, every now and then I see the following output (a ton of sasl trace omitted for clarity!):

<snip>

about to crash...
about to start gen_event
gen_event started with {error,{already_started,<0.79.0>}}
about to start gen_event
gen_event started with {error,{already_started,<0.79.0>}}
about to start gen_event

</snip>

What is happening is that the gen_server is crashing but on its restart the gen_event process is still running - hence the gen_server fails in its init and gets restarted again.  Sometimes this loop clears after a few iterations, other times it can continue until the parent supervisor gives up, packs its bags and goes home.

So, my question is whether this is expected behaviour or not.  I assume that the termination of the linked child is happening asynchronously, and that the supervisor is hence restarting its children before things have cleaned up correctly - is that correct?

I can fix this particular scenario by trapping exits within the gen_server, and then calling gen_event:stop within the terminate.  Is this type of processing necessary whenever a process is start_link'ed within a supervisor tree, or is what I'm doing considered bad practice?

Thanks for your time,

Steve

-- 
Steve Strong, Director, id3as
twitter.com/srstrong
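
The workaround mentioned at the end of the post above (trapping exits in the gen_server and calling gen_event:stop from terminate) might look roughly like the sketch below. The module name child_trap, the stop reasons, and the extra 'EXIT' clause are illustrative assumptions, not code from the thread, and `{stop, deliberate_crash, State}` stands in for the original `1 = 2` crash.

```erlang
-module(child_trap).
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, child}, ?MODULE, [], []).

init([]) ->
    %% With trap_exit set, a shutdown request from the supervisor also
    %% goes through terminate/2 instead of killing the process outright.
    process_flag(trap_exit, true),
    {ok, EventMgr} = gen_event:start_link({local, my_gen_event}),
    {ok, EventMgr, 2000}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(timeout, State) ->
    %% Deliberate failure after 2 seconds (assumed stand-in for 1 = 2).
    {stop, deliberate_crash, State};
handle_info({'EXIT', EventMgr, Reason}, EventMgr) ->
    %% The event manager died first; go down with it.
    {stop, {event_manager_exited, Reason}, EventMgr};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, EventMgr) ->
    %% Stop the helper before this process exits. The catch covers the
    %% case where the event manager is already gone.
    catch gen_event:stop(EventMgr),
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

Whether this fully closes the window depends on gen_event:stop/1 returning only once the event manager has actually terminated; to my understanding that holds on recent OTP releases, but it is worth checking for the release in use.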



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Understanding supervisor / start_link behaviour

Roberto Ostinelli
hi steve,

your gen_event should be started by your supervisor too. in this case, since you specified a one_for_all behaviour, when gen_server crashes, gen_event will be restarted too.

r.
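
A minimal sketch of the suggestion above, with the event manager as a second child of the same one_for_all supervisor. The module name sup_both and the child order are assumptions; the server's init/1 would then no longer call gen_event:start_link itself.

```erlang
-module(sup_both).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Event manager first, so the name my_gen_event is registered
    %% before the server starts. gen_event children list their
    %% modules as 'dynamic'.
    EventMgr = {my_gen_event,
                {gen_event, start_link, [{local, my_gen_event}]},
                permanent, 2000, worker, dynamic},
    %% The gen_server from the original post, minus the start_link
    %% call in its init/1 (assumed change).
    Server = {child, {child, start_link, []},
              permanent, 2000, worker, [child]},
    {ok, {{one_for_all, 1000, 3600}, [EventMgr, Server]}}.
```

With this layout the supervisor itself terminates the remaining children and waits for them to exit before restarting anything, so the restarted event manager should not collide with a still-registered predecessor.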



Re: Understanding supervisor / start_link behaviour

Ahmed Omar
Agree with Roberto: you should put it under the supervisor. Regarding your case, I would guess you are trapping exits in the init of my_gen_event?

On Wed, Jun 1, 2011 at 11:15 PM, Roberto Ostinelli <[hidden email]> wrote:
hi steve,

your gen_event should be started by your supervisor too. in this case, since you specified a one_for_all behaviour, when gen_server crashes, gen_event will be restarted too.

r.


--
Best Regards,
- Ahmed Omar

Re: Understanding supervisor / start_link behaviour

Steve Strong
Yeah, that makes perfect sense and would obviously solve the problem.

The reason we'd gone down this path was that we had a number of "sub" processes (the gen_event just being one example) that we felt would be "polluting" the supervisor; these sub-processes were just helpers of the primary gen_servers that the supervisor was controlling - using a start_link in the primary gen_servers felt like a very clean and easy way of spinning up these other processes in a way that (we thought) would still be resilient to failures.

The thing that bit us was that we naively thought that, due to the sub-process being linked, it would die when the parent died.  Of course, it does, but its death is asynchronous to the notification that the supervisor receives and hence it may well still be alive (doomed, but alive) when the supervisor begins the restart cycle.  Our servers don't crash that often, and when they do this race condition was rarely seen, which reinforced our misconceptions.  The only thing that does surprise me is how many times the supervisor can go round the restart loop before the doomed process finally exits - we have seen it thrash round this loop about 1000 times before the supervisor itself finally fails; I guess it's just down to how things are being scheduled by the VM, and in those cases we were just getting unlucky.

Sounds like best-practice within the OTP world is to have everything started via a supervisor - is that a fair comment?

Cheers,

Steve

-- 
Steve Strong, Director, id3as
twitter.com/srstrong

On Wednesday, 1 June 2011 at 23:57, Ahmed Omar wrote:

> [...]

Re: Understanding supervisor / start_link behaviour

Ladislav Lenart
Hello.

I am by no means an expert on the topic but I would like to point out
that the only reason you get {already_started, ...} error is because
you attempt to register the helper process with {local, ...}. If it is
a helper, there should be no reason for it to be globally accessible.
And if it wasn't registered, the gen_server would be restarted without
issues, creating a new helper process. The old helper would die eventually,
just as you expect it to.


Ladislav Lenart
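
A sketch of the unregistered variant described above: the event manager is started without a name and its pid is kept in the server state, so a restart cannot hit already_started. The module name child_anon and the event_manager call are illustrative assumptions.

```erlang
-module(child_anon).
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, child}, ?MODULE, [], []).

init([]) ->
    %% No {local, Name}: nothing is registered, so a lingering old
    %% event manager cannot clash with the new one.
    {ok, EventMgr} = gen_event:start_link(),
    {ok, EventMgr}.

%% Hypothetical accessor so other processes can ask for the pid.
handle_call(event_manager, _From, EventMgr) ->
    {reply, EventMgr, EventMgr};
handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

The trade-off is that clients can no longer reach the event manager by name and have to obtain the pid from the server, for example with gen_server:call(child, event_manager).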


On 2.6.2011 09:15, Steve Strong wrote:

> [...]

Re: Understanding supervisor / start_link behaviour

Antoine Koener
In reply to this post by Steve Strong

On Jun 2, 2011, at 09:15 , Steve Strong wrote:

> Yeah, that makes perfect sense and would obviously solve the problem.
>
> The reason we'd gone down this path was that we had a number of "sub" processes (the gen_event just being one example) that we felt would be "polluting" the supervisor; these sub-processes were just helpers of the primary gen_servers that the supervisor was controlling - using a start_link in the primary gen_servers felt like a very clean and easy way of spinning up these other processes in a way that (we thought) would still be resilient to failures.

I dealt with such a situation using a supervisor that starts all mandatory processes and a child supervisor.
The child supervisor starts all other processes that need those mandatory processes.
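
One way to read the layout described above, as a rough sketch: a top supervisor starts the mandatory process (here the event manager) and then a child supervisor that owns everything depending on it. The module name top_sup, the dependents_sup name, the restart intensities, and the choice of rest_for_one are all assumptions made for illustration.

```erlang
-module(top_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, top).

%% The same callback module serves both supervisors; the init
%% argument selects which child list to return.
init(top) ->
    Mandatory = {my_gen_event,
                 {gen_event, start_link, [{local, my_gen_event}]},
                 permanent, 2000, worker, dynamic},
    Dependents = {dependents_sup,
                  {supervisor, start_link,
                   [{local, dependents_sup}, ?MODULE, dependents]},
                  permanent, infinity, supervisor, [?MODULE]},
    %% rest_for_one (assumed): if the mandatory process dies, the
    %% child supervisor and everything under it is restarted too.
    {ok, {{rest_for_one, 10, 60}, [Mandatory, Dependents]}};
init(dependents) ->
    Server = {child, {child, start_link, []},
              permanent, 2000, worker, [child]},
    {ok, {{one_for_one, 10, 60}, [Server]}}.
```

In this arrangement a crash of a dependent server is handled by the child supervisor alone and leaves the mandatory process untouched.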


> Sounds like best-practice within the OTP world is to have everything started via a supervisor - is that a fair comment?

Yes, and
application:start(sasl).
is a way to "observe" the whole startup process and see what could be wrong.



Re: Understanding supervisor / start_link behaviour

Ahmed Omar
In reply to this post by Ladislav Lenart
True ({local, Name} registers the process locally under Name, not globally), but you have to consider whether having two instances alive at the same time is acceptable.

Steve, let's put it this way: it's better to start processes under a supervisor, especially if you want to benefit from the standard restart strategies; it keeps your application cleaner.
(As a hack, your case can also be solved by a monitor in the gen_server init, before starting the gen_event:
    %% Wait until any previous my_gen_event is gone. If the name is not
    %% registered, the 'DOWN' message arrives immediately (reason noproc).
    Ref = erlang:monitor(process, my_gen_event),
    receive
        {'DOWN', Ref, process, _Pid, _Reason} ->
            ok
    end,
)
But using a supervisor is much cleaner, safer, and easier to design with, in my opinion.
On Thu, Jun 2, 2011 at 10:23 AM, Ladislav Lenart <[hidden email]> wrote:
> [...]

--
Best Regards,
- Ahmed Omar
Re: Understanding supervisor / start_link behaviour

Ladislav Lenart
On 2.6.2011 11:03, Ahmed Omar wrote:
> True ({local, Name}  -> register locally, not globally, with Name) but you have to be careful if having two instances alive in the same time is acceptable or not.

My bad. By "globally accessible" I meant that the locally
registered process will be available to all processes on
the local node.


Ladislav Lenart


Re: Understanding supervisor / start_link behaviour

mazenharake
In reply to this post by Steve Strong
Steve,

I wouldn't say that you are wrong. I think that you are reasoning well
about not putting the gen_event module under a supervisor because
*that is what links are for*. Just because you have a supervisor
doesn't mean that you shove everything underneath it! If the
gen_server and the gen_event are truly linked (meaning: the gen_server
doesn't act as a "supervisor" keeping track of its gen_event process
and restarting it all the time, but rather they really are linked
and they crash together) then your approach, in my opinion, is good.

There are great benefits in doing it that way. Many will claim that
it is best practice to put *everything* under a supervisor but this is
simply not true. In 90% of cases it *is* the best thing to do, and many
times it is more about how you designed your application than
where to put the supervisors and their children, but doing it the way
you did is not necessarily wrong.

The only problem I see with your approach is that you have registered
the gen_event process which clearly isn't useful (since only the
gen_server should know about it, after all, it started it). Other than
that, this approach is extremely helpful and a nice way to clean up
things after they die/shutdown (Again: assuming truly linked).

There is a big misconception in the community that everything
should/must look like the supervisor-tree model, which shows how
gen_servers are put under supervisors and more supervisors under the
"top" supervisor, but that is not enforced, and the design principles
don't take into account the many cases where this setup actually brings
more headache to the table than just exiting and cleaning up using linked
processes (because they do exist).

/M

On 1 June 2011 21:26, Steve Strong <[hidden email]> wrote:

> [...]

Re: Understanding supervisor / start_link behaviour

Tim Watson-5
In reply to this post by Ladislav Lenart
On 2 June 2011 10:14, Ladislav Lenart <[hidden email]> wrote:
> On 2.6.2011 11:03, Ahmed Omar wrote:
>>
>> True ({local, Name}  -> register locally, not globally, with Name) but you
>> have to be careful if having two instances alive in the same time is
>> acceptable or not.
>
> My bad. By "globally accessible" I meant that the locally
> registered process will be available to all processes on
> the local node.

You might consider gproc for these kinds of use cases. It provides a
great deal of simplification around synchronising startups and
registering names etc.

Re: Understanding supervisor / start_link behaviour

Steve Strong
In reply to this post by mazenharake
That makes a good deal of sense.  I guess the point at which something should get promoted up to a supervision tree rather than being start-linked is when it reaches a complexity level such that it may have issues if multiple instances of the process are running simultaneously.  At that point it stops sounding like a trivial helper process and becomes something that should be managed more actively.

Completely agree that having the gen_event registered wasn't a useful thing, and that not doing so would solve the problem - that was pretty obvious as soon as I spotted the issue; this thread was more to get opinions on how things should best be structured.

-- 
Steve Strong, Director, id3as
twitter.com/srstrong

On Thursday, 2 June 2011 at 11:53, Mazen Harake wrote:

Steve,

I wouldn't say that you are wrong. I think that you are reasoning good
about not putting the gen_event module under a supervisor because
*that is what links are for*. Just because you have a supervisor
doesn't mean the you shove everything underneath there! If the
gen_server and the gen_event are truly linked (meaning: gen_server
doesn't act as a "supervisor" keeping track of its gen_event process
and restarts it all the time but rather that they really are linked
and they crash together) then your approach, in my opinion, is good.

There are great benefits in doing it in that way. Many will claim that
it is best practice to put *everything* under a supervisor but this is
simply not true. 90% of cases it *is* the best thing to do and many
times it is more about how you designed your application rather than
where to put the supervisors and their children but doing it the way
you did is not necessarily wrong.

The only problem I see with your approach is that you have registered
the gen_event process which clearly isn't useful (since only the
gen_server should know about it, after all, it started it). Other than
that, this approach is extremely helpful and a nice way to clean up
things after they die/shutdown (Again: assuming truly linked).

There is a big misconception in the community that everything
should/must look like the supervisor-tree model which shows how
gen_servers are put under supervisors and more supervisors under the
"top" supervisor but that is not enforced and the design principles
doesn't take many cases into account where this setup actually brings
more headache to the table than to just exit and clean up using linked
processes (because they do exist).

/M


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions



Re: Understanding supervisor / start_link behaviour

Frédéric Trottier-Hébert
In reply to this post by mazenharake
There are disadvantages to *not* putting workers under the supervision tree, though. Namely, you'll be losing the ability to have the release handlers walk down the supervision trees to find which processes to suspend/update, and you'll then need to find a different way of doing things.

This is a serious point to consider if you ever plan on going the way of releases/appups if the workers you use are to be long-lived (you don't want them to be killed during a purge). I'm not saying you didn't know this, but I felt I should point it out for the sake of having the arguments clear on the mailing list.
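
To illustrate what "walking down the supervision tree" means, here is a rough sketch that collects every pid reachable through supervisor child lists. It is not the release handler's actual code; the module name `tree_walk` and function `pids/1` are made up for this example. Only `supervisor:which_children/1` is real OTP API.

```erlang
-module(tree_walk).
-export([pids/1]).

%% Collect every pid reachable from Sup by following supervisor child lists.
%% Each child is reported as {Id, Child, Type, Modules}; Child may also be
%% 'undefined' or 'restarting', in which case the entry is skipped.
pids(Sup) ->
    lists:flatmap(
      fun({_Id, Pid, supervisor, _Mods}) when is_pid(Pid) ->
              [Pid | pids(Pid)];
         ({_Id, Pid, worker, _Mods}) when is_pid(Pid) ->
              [Pid];
         (_NotRunning) ->
              []
      end,
      supervisor:which_children(Sup)).
```

With the supervisor from the original post, `tree_walk:pids(sup)` would list the `child` gen_server but not the gen_event it start-linked, since that process is in nobody's child list. That is the sense in which such processes are invisible to anything that discovers processes by walking the tree.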

--
Fred Hébert
http://www.erlang-solutions.com



Re: Understanding supervisor / start_link behaviour

Steve Strong
That is an interesting point, and not something I'd considered to date.

-- 
Steve Strong

Re: Understanding supervisor / start_link behaviour

Jachym Holecek
In reply to this post by mazenharake
# Mazen Harake 2011-06-02:
> I wouldn't say that you are wrong. I think that you are reasoning well
> about not putting the gen_event module under a supervisor because
> *that is what links are for*. Just because you have a supervisor
> doesn't mean that you shove everything underneath there! If the
> gen_server and the gen_event are truly linked (meaning: gen_server
> doesn't act as a "supervisor" keeping track of its gen_event process
> and restarts it all the time but rather that they really are linked
> and they crash together) then your approach, in my opinion, is good.

FWIW couldn't agree more with this. For completeness (it's obvious and you're
no doubt aware of it): 'normal' exits don't kill linked peers, which takes a
little getting used to, but is trivial to manage.
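
A small self-contained sketch of that behaviour, for anyone who wants to see it in a shell. The module and function names are made up for the demo; it assumes the linked peer does not trap exits.

```erlang
-module(link_demo).
-export([run/0]).

%% Returns {PeerAliveAfterNormalExit, PeerAliveAfterCrash}.
run() ->
    {peer_survives(normal), peer_survives(crash)}.

peer_survives(Reason) ->
    Parent = self(),
    %% The owner links to a helper, reports the helper's pid, then exits.
    Owner = spawn(fun() ->
                      Helper = spawn_link(fun() -> receive stop -> ok end end),
                      Parent ! {peer, self(), Helper},
                      exit(Reason)
                  end),
    Peer = receive {peer, Owner, P} -> P end,
    Ref = erlang:monitor(process, Peer),
    receive
        {'DOWN', Ref, process, Peer, _Why} ->
            %% The exit signal from the owner killed the linked peer.
            false
    after 500 ->
            %% Still alive: the exit signal was ignored. Clean up by hand.
            erlang:demonitor(Ref, [flush]),
            exit(Peer, kill),
            true
    end.
```

`link_demo:run()` returns `{true, false}`: the peer outlives an owner that exits with reason `normal`, but dies along with an owner that exits with any other reason. So a helper that must go away on a normal shutdown too needs to be stopped explicitly (or the owner has to exit with a non-normal reason such as `shutdown`).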

As a more general point, designing sensible supervision trees was probably
the most difficult engineering aspect of OTP for me to learn, so I guess
people shouldn't feel too bad if it feels intimidating initially. :-)

BR,
        -- Jachym

Re: Understanding supervisor / start_link behaviour

mazenharake
In reply to this post by Frédéric Trottier-Hébert
True. This is a very valid point.

Personally I have very rarely used the live upgrade tools of a node
(relup/appup/release_handler etc.), so I don't really know the downside
of not putting everything under a supervision tree. But then again, I
simply don't think the fuss of specifying every single thing to
reload/change is worth the "uptime" mark.

The strategy I prefer is to have an architecture which enables me to
take down a node gracefully (detaching it from the cluster),
manually install a release (i.e. untar the release and change
start_erl.data to point to it), and start the node up again. This
should not affect the system, which should still be operational (say
you have 10 nodes and you do this upgrade one by one). Should the new
release not work, or should something unexpected turn up, just change the
start_erl.data file to point to the old release and bounce the node
(your version handling for your applications should support this,
meaning v1.32.424 in this release has *exactly* the same code as
v1.32.424 in the previous release).

This way of working has proven very successful for me (and for the
systems I took part in building). Specifying relups and appups for
this kind of work is, in my opinion, tedious, but some seem to think it
is worth the effort. However, you do have a very important point to
consider when not hanging everything under a supervisor tree. If I had
only 2 nodes to consider, maybe I'd want them up at all times, but then
again they would be built to handle one of them going down (e.g.
when I upgrade them).

