initializing process

kamiseq
Can somebody correct me (and hopefully help with my problem)?

The problem is this:
1) I'm starting a server in Erlang and, among other things, I create one process
that needs to decide whether another computer on the local network is up before
the whole server's initialization is done.
2) The process is a simple gen_server.
(AND HERE IS THE PROBLEM)
3) As long as I don't return from init/1, my process is not seen by others (I
can't really use self() - is that correct??), so I CAN'T RECEIVE ANY MESSAGES.
So I thought I could spawn another process while in init/1 that will loop and
ping the other side until it is ready. But I am afraid there is a chance that
the gen_server will not yet have returned from init/1 when the spawned worker
tries to send its notification, so either I will miss it or there will be some
error.

Is this really a problem, or am I just missing something obvious here?
All I really need is to set up the initial state of the gen_server and then
initialize the server dynamically. I can't hard-code all the conditions in
init/1, and I need to communicate with other processes.

I hope this is more or less clear.

take care

Regards,
Paweł Kamiński

Re: initializing process

Gleb Peregud
2009/9/17 paweł kamiński <[hidden email]>:
> (I cant really use self() - is that correct??) by others so I CANT RECEIVE ANY
> MESSAGES.
Why do you think so? You can use self() in init/1 and you can receive
messages. No problem here. But you have to use message primitives (!
and receive), not the gen_server's counterparts.
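
To illustrate the point about self() and plain message passing inside init/1, here is a minimal sketch. The module name init_wait, the {ping, From} / pong protocol, and the 5 second limit are assumptions made for the example, not something from the original post. Note that start_link/1 blocks its caller until init/1 returns.

```erlang
-module(init_wait).
-behaviour(gen_server).
-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% PeerPid is a hypothetical process that answers {ping, From} with pong.
start_link(PeerPid) ->
    gen_server:start_link(?MODULE, PeerPid, []).

init(PeerPid) ->
    %% self() is already the pid of the gen_server process here,
    %% so the peer can reply to it with a plain send.
    PeerPid ! {ping, self()},
    receive
        pong -> {ok, #{peer => PeerPid}}
    after 5000 ->
        %% Assumed limit: give up instead of blocking forever.
        {stop, peer_timeout}
    end.

handle_call(get_peer, _From, State = #{peer := Peer}) ->
    {reply, Peer, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

The raw receive is needed because the gen_server callbacks (handle_call/3, handle_info/2 and so on) only start running after init/1 has returned.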

> so I though I can spawn another processes while in init/1 that
> will loop and ping another side if it is ready. but I am afraid that there
> might be a chance that the gen_server will not yet return from init and
> spawned worker will try to send some notification, so or I will miss it or
> there will be some error.

There is no problem with synchronizing two such processes. And you
don't have to spawn any process at all - you can do anything you want
in init/1. Either way this process will be stuck in init/1, so no
reason to spawn another process (unless there is something else
important in your situation).

If the initialization can take some extended time I'd use gen_fsm
instead, which will have two states (e.g. 'initializing' and 'ready').

Best regards,
Gleb Peregud

________________________________________________________________
erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org

Re: initializing process

Jayson Vantuyl-2
In reply to this post by kamiseq
What exactly are you trying to do?  It's really easy to create complex  
race conditions and strange startup ordering issues doing something  
like this.

If you can, have the node signal you (much less work).  Just start out
in an "uninitialized" state, set a short timeout, and go idle.
When you receive the "other_node_started" event, go into "running"
mode.  I'd recommend using sync_send_event from the server you're
waiting on.  That guarantees the loop gets closed even if they start up
in a weird order.

Timeouts are handy and highly underutilized.  Other great uses include  
"forcibly garbage collecting during inactivity" and "delaying before  
hibernating".

If you must ping the node from the waiting side, you can simulate a  
delaying loop with a gen_server (and a gen_fsm) by using the timeout  
feature.  Just send the ping and set a timeout.  Have the timeout  
function send the ping again and set a timeout again.  Eventually,  
you'll get the response and can go about your merry way.

Both of these methods prevent you from sitting in init for an  
undefined amount of time.
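
The timeout-driven ping loop described above could look roughly like the following sketch. The module name ping_loop, the {ping, From} / pong protocol, and the 200 ms retry interval are assumptions for illustration. A gen_server timeout is cancelled by any incoming message, so every callback that runs while still uninitialized returns the timeout again.

```erlang
-module(ping_loop).
-behaviour(gen_server).
-export([start_link/1, status/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(RETRY_MS, 200).   % hypothetical retry interval

start_link(PeerPid) -> gen_server:start_link(?MODULE, PeerPid, []).
status(Server) -> gen_server:call(Server, status).

init(PeerPid) ->
    %% Return at once; a timeout of 0 triggers the first ping.
    {ok, #{peer => PeerPid, mode => uninitialized}, 0}.

handle_call(status, _From, State = #{mode := running}) ->
    {reply, running, State};
handle_call(status, _From, State) ->
    {reply, uninitialized, State, ?RETRY_MS}.

handle_cast(_Msg, State = #{mode := uninitialized}) ->
    {noreply, State, ?RETRY_MS};
handle_cast(_Msg, State) ->
    {noreply, State}.

%% No answer yet: ping again and re-arm the timeout.
handle_info(timeout, State = #{peer := Peer, mode := uninitialized}) ->
    Peer ! {ping, self()},
    {noreply, State, ?RETRY_MS};
%% The peer answered: switch to running and stop pinging.
handle_info(pong, State) ->
    {noreply, State#{mode := running}};
handle_info(_Other, State = #{mode := uninitialized}) ->
    {noreply, State, ?RETRY_MS};
handle_info(_Other, State) ->
    {noreply, State}.
```

Because init/1 returns immediately, whoever starts this process is never blocked; callers simply see the uninitialized status until the peer answers.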

All of this gets complicated in the face of restarts.  It may not  
really be workable.  It would help to know more about what you're doing.

I think the "correct" way (i.e. overkill in the style of Ericsson) is  
to do this at an application level.  You can set up a distributed  
Erlang application to start in "phases".  Depending on your  
deployment, that might be the ultimate answer, since it also provides  
hooks for takeover / failover, which may be necessary depending on  
your application.

--
Jayson Vantuyl




Re: initializing process

Richard Andrews-5
2009/9/18 Jayson Vantuyl <[hidden email]>:

> What exactly are you trying to do?  It's really easy to create complex race
> conditions and strange startup ordering issues doing something like this.
>
> If you can have the node signal you (much less work).  Just start out in an
> "uninitialized" state set a short timeout and just go idle.  When you
> receive the "other_node_started" event, go into "running" mode.  I'd
> recommend using sync_send_event from the server you're waiting on.  That
> guarantees to close the loop in case they start up in a weird order.
>
> Timeouts are handy and highly underutilized.  Other great uses include
> "forcibly garbage collecting during inactivity" and "delaying before
> hibernating".
>
> If you must ping the node from the waiting side, you can simulate a delaying
> loop with a gen_server (and a gen_fsm) by using the timeout feature.  Just
> send the ping and set a timeout.  Have the timeout function send the ping
> again and set a timeout again.  Eventually, you'll get the response and can
> go about your merry way.
>
> Both of these methods prevent you from sitting in init for an undefined
> amount of time.

Yep. I use a hello message for this kind of thing. Start in an
unconnected state and start sending hello to a known target. When
responses start getting returned (with a valid ref() of course) then
go into fully running mode. If the target process is the last thing to
start in the peer system then you know the whole peer is up and
running.
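
A hello exchange guarded by a reference might be sketched as below. The module name hello, the message shapes {hello, From, Ref} and {hello_ack, Ref}, and the retry parameters are assumptions for the example. A fresh reference per attempt means a late acknowledgement of an earlier hello is not mistaken for the current one.

```erlang
-module(hello).
-export([wait_for_peer/3, responder/0]).

%% Sends {hello, self(), Ref} to Target every IntervalMs until a matching
%% {hello_ack, Ref} arrives, or gives up after Retries attempts.
%% Target is assumed to be a pid; sending to a dead pid is silently ignored.
wait_for_peer(_Target, _IntervalMs, 0) ->
    {error, no_response};
wait_for_peer(Target, IntervalMs, Retries) ->
    Ref = make_ref(),
    Target ! {hello, self(), Ref},
    receive
        {hello_ack, Ref} -> ok
    after IntervalMs ->
        wait_for_peer(Target, IntervalMs, Retries - 1)
    end.

%% Hypothetical peer side: acknowledges every hello with the same Ref.
responder() ->
    receive
        {hello, From, Ref} ->
            From ! {hello_ack, Ref},
            responder()
    end.
```

Acknowledgements that arrive after their attempt has timed out stay in the caller's mailbox, so a long-running process would want to flush or ignore them.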

> All of this gets complicated in the face of restarts.  It may not really be
> workable.  It would help to know more about what you're doing.
>
> I think the "correct" way (i.e. overkill in the style of Ericsson) is to do
> this at an application level.  You can set up a distributed Erlang
> application to start in "phases".  Depending on your deployment, that might
> be the ultimate answer, since it also provides hooks for takeover /
> failover, which may be necessary depending on your application.

You can do it at the supervisor level rather than application level.
Since children are started in order, you can have one of the children
not return from init() until it confirms the peer is up and running.
This will block the supervisor from starting any subsequent children
until the prerequisites are in place.

Re: initializing process

Jayson Vantuyl-2
I think that synchronizing the supervisor this way is a bad thing, as  
it doesn't allow you to specify a limited "start time".  In other  
words, I'd consider an uninitialized server as running, otherwise the  
supervisory machinery hangs.  I'm also not sure that this behavior is  
guaranteed in the future.

Also, since this is a cross-node thing, it's going to make "restart" a  
painful process that requires manually coordinating multiple  
machines.  Even when we've been scripting these things on EC2, it's  
just not feasible to serialize all of this for large numbers of hosts  
(especially with EC2's understandably long startup delay).

I really think this is better done with a series of gen_fsms that can  
bounce into and out of a "waiting" state, rather than requiring all  
manner of coordination when badness occurs.  In other words, work with
Erlang's supervision framework and synchronization primitives, not  
against them.

Re: initializing process

kamiseq
In reply to this post by Jayson Vantuyl-2
2009/9/18 Jayson Vantuyl <[hidden email]>

> What exactly are you trying to do?


To be exact, this is an application server that gets requests from its
clients via HTTP and then does some computation. But the key feature
of the server is talking to another peer on the local network, gathering
information about other devices in the area, processing that information,
and sending it to clients.

The problem is that the other computer is a third-party hardware and
software black box that I connect to via tcp or rss, and it can switch
itself off. My server must be 100% sure that the other side is alive before
starting everything up. If my server is not responding, clients will notify
local staff that something is going on and someone will go check. The server
may lose its connection with the black box, and then it will again disconnect
the clients. It may be weird, but it helps troubleshooting: I know that my
server is OK and that it is THEIR :) fault.


> It's really easy to create complex race conditions and strange startup
> ordering issues doing something like this.

That is what I was worried about.


> If you can have the node signal you (much less work).  Just start out in an
> "uninitialized" state set a short timeout and just go idle.  ..
>
> If you must ping the node from the waiting side, you can simulate a delaying
> loop with a gen_server (and a gen_fsm) by using the timeout feature.  Just
> send the ping and set a timeout.  Have the timeout function send the ping
> again and set a timeout again.  Eventually, you'll get the response and can
> go about your merry way.

Yep, that is my case: I need to force the communication.
OK, let's say my device controller is a gen_fsm. I initialize it and set the
running_disconnected FSM state, so now my FSM is waiting for some event. Who
will ping the other side? I can't do it in init/1, as it may lead to race
conditions, as you said earlier.

If I lose the connection, the device controller should switch back to the
running_disconnected FSM state, shut down all processes that rely on the
connection, and start pinging again.

Gleb's advice was to do everything in init/1 using PeerPid ! {ping} and
receive, which is one way. Another is to skip behaviours and implement the
device controller as a simple process that does its job, but both are less
elegant.


> I think the "correct" way (i.e. overkill in the style of Ericsson) is to do
> this at an application level.  You can set up a distributed Erlang
> application to start in "phases".  Depending on your deployment, that might
> be the ultimate answer, since it also provides hooks for takeover /
> failover, which may be necessary depending on your application.
>
>
What do you mean by application level, and by "You can set up a distributed
Erlang application to start in "phases""? How do you imagine the architecture
for that?
Thanks for the reply.

Regards,
Paweł Kamiński

Re: initializing process

Ladislav Lenart
Hello.

paweł kamiński wrote:

> yep that is the case I need to force the communication.
> ok lets say my device controller is gen_fsm I initialize it and set
> running_disconnected fsm state, so now my fsm is waiting for some event. who
> will ping the other side? I can't do it in init/1 as it may lead to race
> conditions as you said earlier.

You could send yourself a proper event message (that will trigger
the first ping) directly in the init function. Since you do this
in init, your message will be the first in the process's mailbox
(no one else can know the new pid).
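
A small sketch of this self-message trick, written as a gen_server for brevity (with a gen_fsm the same idea would use gen_fsm:send_event(self(), Event) in init/1). The module name self_kick and the start_pinging message are assumptions for the example. The server records the first message it handles so the ordering can be observed.

```erlang
-module(self_kick).
-behaviour(gen_server).
-export([start_link/0, first_message/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).
first_message(Server) -> gen_server:call(Server, first_message).

init([]) ->
    %% Queued before init/1 returns, so it sits in the mailbox ahead of
    %% anything sent by processes that learn the pid from start_link/0.
    self() ! start_pinging,
    {ok, #{first => undefined}}.

handle_info(Msg, State = #{first := undefined}) ->
    %% This is where the first ping to the peer would be sent.
    {noreply, State#{first := Msg}};
handle_info(_Msg, State) ->
    {noreply, State}.

handle_call(first_message, _From, State = #{first := First}) ->
    {reply, First, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

If the process is started under a registered name, other processes could in principle send to that name before init/1 finishes, so the "first in the mailbox" guarantee is safest with an unregistered process as shown here.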


HTH,

Ladislav Lenart


Re: initializing process

Jayson Vantuyl-2
In reply to this post by kamiseq
> exactly this is a application server that is getting requests from  
> its clients via http and then it is doing some computation. but the  
> key feature of the server is talking to other peer in local network  
> and gathering information about other devices in the area, process  
> that information and send it to clients.
>
> the problem is that the other computer is a third-party hardware and  
> software black-box that I connect via tcp or rss and can switches  
> off itself. my server must be 100% sure that the other side is alive  
> before starting everything up. if my server is not responding  
> clients will notify local stuff that something is going on and  
> someone will go check. the server may loose connection with black-
> box and again will disconnect client. it is maybe weird but help  
> troubleshooting and I know that my server is ok but it is THEIR :)  
> fault.
This is dysfunctional, but helpful to know.  I had assumed you were
writing the client and the server.  Since you connect to the "black
box" using TCP and RSS, I wouldn't think you could ping an Erlang
node on it (since you couldn't install one).  Even if you could, an
Erlang node being up doesn't tell you whether their TCP connections
are working, so I'd recommend a pure-TCP solution.

Here's what I'm thinking.  At the top, put a single supervisor,  
running a one_for_all strategy.  It should have one child, and be  
registered under a name.

Have the child be a gen_fsm with initialized and uninitialized modes.  
In the init function, start in an 'uninitialized' state and specify a  
very short timeout (maybe 500 milliseconds).  When you receive a  
timeout in that state, try to use the gen_tcp:connect (or an http  
client to a harmless URL, if it's an HTTP device).  If there is a good  
connection, tell the supervisor to add a temporary child (the  
supervisor for your real workers).  If it's a bad connection, set a  
timeout again.

In the running state, have the timeout check, but perhaps with a  
longer loop time.  If it gets a good connection, set the timeout.  If  
it gets a bad connection, terminate the gen_fsm by returning  
{stop,"Couldn't make a connection",State}.  If you want to be a bit  
more forgiving, keep track of successive failures to connect and only  
terminate after maybe five consecutive failures (or somesuch).

I think that this has the correct behavior.  Assuming that your actual  
supervisor has a good supervision tree it does the following:

1) When uninitialized, tries to connect every 500 ms (or perhaps  
longer if packets to the remote server go into a black hole).
2) When running, tries to connect every so often.
3) When running, has added a supervisor for the worker
4) When running and connections fail, exits, causing the supervisor to  
kill the actual supervisor (and I believe the temporaryness of it  
causes it not to restart, if not, try transient).
5) Everything is managed by the supervision tree, as it should be.
6) The gen_fsm provides good logging of when and why everything died.
7) The ping function can be made arbitrarily complex but is still  
isolated from the actual workers.
8) No modification of the black box is required.
9) It's probably 30-50 lines of code for the gen_fsm, and one, static  
supervisor.
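
The design above can be sketched as follows. This is not the author's code: it uses gen_statem, the behaviour that later replaced gen_fsm in OTP, rather than gen_fsm itself, and the module name peer_monitor, the OnUp callback, the 5 second running interval, and the failure limit of five are all assumptions. In the described design OnUp would be a fun that calls supervisor:start_child/2 on the top supervisor.

```erlang
-module(peer_monitor).
-behaviour(gen_statem).
-export([start_link/3]).
-export([callback_mode/0, init/1, uninitialized/3, running/3]).

-define(FAST_MS, 500).      % retry interval while uninitialized
-define(SLOW_MS, 5000).     % check interval while running (assumed)
-define(MAX_FAILURES, 5).   % consecutive failures tolerated (assumed)

%% OnUp is a hypothetical zero-arity fun run once the peer is reachable.
start_link(Host, Port, OnUp) ->
    gen_statem:start_link(?MODULE, {Host, Port, OnUp}, []).

callback_mode() -> state_functions.

init({Host, Port, OnUp}) ->
    Data = #{host => Host, port => Port, on_up => OnUp, failures => 0},
    {ok, uninitialized, Data, [{state_timeout, ?FAST_MS, probe}]}.

uninitialized(state_timeout, probe, Data = #{on_up := OnUp}) ->
    case probe(Data) of
        ok ->
            OnUp(),
            {next_state, running, Data, [{state_timeout, ?SLOW_MS, probe}]};
        error ->
            {keep_state_and_data, [{state_timeout, ?FAST_MS, probe}]}
    end;
uninitialized(_Type, _Event, _Data) ->
    keep_state_and_data.

running(state_timeout, probe, Data = #{failures := Failures}) ->
    case probe(Data) of
        ok ->
            {keep_state, Data#{failures := 0},
             [{state_timeout, ?SLOW_MS, probe}]};
        error when Failures + 1 >= ?MAX_FAILURES ->
            %% Exiting lets the one_for_all supervisor tear down the workers.
            {stop, "Couldn't make a connection"};
        error ->
            {keep_state, Data#{failures := Failures + 1},
             [{state_timeout, ?SLOW_MS, probe}]}
    end;
running(_Type, _Event, _Data) ->
    keep_state_and_data.

%% A bare TCP connect is used as the liveness check.
probe(#{host := Host, port := Port}) ->
    case gen_tcp:connect(Host, Port, [binary, {active, false}], 1000) of
        {ok, Socket} -> gen_tcp:close(Socket), ok;
        {error, _Reason} -> error
    end.
```

The probe here only proves that the port accepts connections; a real check against the black box would likely need to exchange some protocol-level data as well.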

This is a good problem.  I teach an Erlang class.  If I decide to use  
this as an example, I'll send you a link to the code.

> what do you mean by application level and "You can set up a  
> distributed Erlang application to start in "phases"" how you imagine  
> the architecture for that?


In Erlang, you are supposed to package things as an application.
Applications are defined by a .app file in the ebin directory of their
providing module.  For example, on my system,
/opt/local/lib/erlang/lib/mnesia-4.4.9/ebin/mnesia.app
contains the application definition for Mnesia.  This is used by the
release system to create a boot script that starts Mnesia, if it's
requested.  When you do application:start(mnesia), this is where it
finds the information to start it up.  See here:
http://www.erlang.org/doc/design_principles/applications.html#7

Once you have an application, look at the start_phases stuff in the  
application documentation.  This shows pretty well how to make an  
application that has multiple phases, and synchronizes them across  
nodes (including doing exciting things like failover).  It's worth  
understanding, although you can probably avoid it for now.
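
As a rough illustration of start phases, a .app file might declare them like this. The application name device_server, the callback module device_server_app, and the phase names connect_peer and go are made up for the example.

```erlang
%% ebin/device_server.app (hypothetical application resource file)
{application, device_server,
 [{description, "Example server that waits for a peer device"},
  {vsn, "0.1.0"},
  {modules, [device_server_app]},
  {registered, []},
  {applications, [kernel, stdlib]},
  %% start/2 in this module starts the top supervisor.
  {mod, {device_server_app, []}},
  %% Phases run in the order listed, after start/2 has returned.
  {start_phases, [{connect_peer, []}, {go, []}]}]}.
```

For each listed phase the application controller calls start_phase(Phase, StartType, PhaseArgs) in the callback module, which should return ok or {error, Reason}.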

In theory, you should develop your module in some sort of OTP-like
root, under lib/module-vsn.  When the time comes, you can roll a boot
script that will start your system (using systools:make_script/2),
automatic upgrade instructions (using systools:make_relup/4), and even
a whole OTP install (using systools:make_tar/2).  Very few people go
to the trouble, but I'm working on stuff to make it easier (as is
Ericsson, see reltool).  Note that this can be a complicated process,
since making the upgrade scripts (i.e. relup) requires having two
copies of the application in the code path, which Erlang doesn't like
to do by default.

If you want to see roughly what the structure should look like, I have  
a git repository on GitHub:  http://github.com/jvantuyl/erl-skel

It's not complete in terms of automation, but the scripts/make_release  
file gives an idea as to how it's done.  The directory structure is  
right, and the Rakefile will handle building the code for most simple  
cases (GNU make was a bit of a problem, and rake is generally  
everywhere now).  Actually making automated releases is still on the  
TODO list, and it doesn't try to build any C extensions at all.

With proper automation, you can eventually have it build a tarfile.  
With the tarfile, you can do an initial install just by uncompressing  
it.  When you run "erl -heart" in the uncompressed directory, it will  
automagically start all of your applications with the parameters you  
specified in your release (.rel) files, handle crashes of the entire  
system, and easily run as a daemon.  It also makes managing multiple  
deployments as easy as versioning a bunch of .rel files.  If you go  
this route, you can even use the automatic code updating stuff (i.e.  
code_change/3 in gen_*, update instructions in .appup files, etc.) to  
update a running system.  With proper preparation, this deploy process  
can even handle downgrades or updating Erlang itself, live!  Like most  
of Erlang, it's powerful, but the learning curve is steep.

--
Jayson Vantuyl