process pools?


process pools?

Julian Assange

Two questions:

        1) Is there an erlang http client around anywhere? There's
           Joe Armstrong's www_tools, but it's very primitive. I'm
           looking for something like curl.haxx.se/libcurl, which
           has APIs for a wide number of languages, not including
           erlang. In particular, something which supports HTTP
           persistence / pipelining. Any decent erlang www spider
           should have these features.

        2) What is the best method to implement worker threads? i.e.
           I'd like a pool of processes (one for each outbound
           socket), which listen on the same message queue for
           requests -- the first free process gets the message.

Cheers,
Julian.

--
 Julian Assange |If you want to build a ship, don't drum up people
                |together to collect wood or assign them tasks and
 proff          |work, but rather teach them to long for the endless
 proff          |immensity of the sea. -- Antoine de Saint Exupery




process pools?

Joe Armstrong (AL/EAB)

Julian Assange wrote:

>Two questions:
>
>        1) ...
>
>        2) What is the best method to implement worker threads? i.e.
>           I'd like a pool of processes (one for each outbound
>           socket), which listen on the same message queue for
>           requests -- the first free process gets the message.
>

I suspect you don't need to keep a pool of workers - I'd just spawn off
a new process for each new request (possibly limiting the number of
processes that can be spawned off).

I use tcp_server.erl (appended) for this; I'm constantly modifying it
(one day I'll get it right).

The call tcp_server:start_raw_server(2000, F/1, 25) starts a listener
on port 2000. The fun F/1 is called every time a new connection is set
up on port 2000. At most 25 parallel sessions are allowed.

Read the comments at the start of the file.

F/1 should go into a receive loop where it can receive gen_tcp messages.

This should probably do what you want.

/Joe Armstrong

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: tcp_server.erl
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20011009/824600c5/attachment.ksh>
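
The attachment is not reproduced here, but a minimal sketch of such a
connection fun might look like this. The calling convention is an
assumption (that F/1 receives the connected socket, and that the socket
is in active mode, delivering {tcp, Socket, Data} and {tcp_closed,
Socket} messages); check tcp_server.erl itself for the real contract.

%% Start a listener on port 2000 with at most 25 parallel sessions.
start() ->
    tcp_server:start_raw_server(2000, fun handle/1, 25).

%% Receive loop for one connection (assumed to get the socket as argument).
handle(Socket) ->
    receive
        {tcp, Socket, Data} ->
            %% Do something with the request; here we just echo it back,
            %% then wait for more data on the same connection.
            gen_tcp:send(Socket, Data),
            handle(Socket);
        {tcp_closed, Socket} ->
            ok
    end.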


process pools?

Matthias Lang
In reply to this post by Julian Assange

 >         2) What is the best method to implement worker threads? i.e
 >            I'd like a pool of processes (one for each outbound

Thread pools are a solution to a particular problem C++ and Java
have: spawning new threads is fairly expensive in many (most? all?)
implementations, so you want to avoid doing that, especially when the
thread will then do a relatively small amount of work.

Erlang's processes are (much) more lightweight than the thread
implementations I've worked with, so you don't need to worry about
'spawn overhead' to the same degree.

The other reason I've seen thread pools used is in situations where
some expensive initialisation has to be done. It doesn't sound like
your program does this.

Matthias
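
To make that concrete, spawn-per-request really is a one-liner in
Erlang; a tiny illustrative sketch (handle/1 is a hypothetical worker
function, not part of any library):

%% Each incoming request gets its own process, so no pool or shared
%% queue is needed just to amortise thread-creation cost.
dispatch(Request) ->
    spawn(fun() -> handle(Request) end).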



process pools?

Julian Assange
>
>  >         2) What is the best method to implement worker threads? i.e
>  >            I'd like a pool of processes (one for each outbound
>
> The other reason I've seen thread pools used is in situations where
> some expensive initialisation has to be done. It doesn't sound like
> your program does this.

The context is persistent http connections. Tearing down the socket
and rebuilding it for each URL would be dramatically slower.

Cheers,
Julian.



process pools? (for gen_servers)

Pascal Brisset
Julian Assange writes:
 > The context is persistent http connections. Tearing down the socket
 > and rebuilding it for each URL would be dramatically slower.

We had plenty of similar problems, and we ended up writing a generic
'pool_server' process which masquerades as a gen_server and
load-balances requests to a pool of actual gen_servers.

Incidentally, this pool_server is the right place to do things like
load regulation (keeping track of how many requests there are in each
server's queue), failover, static routing, retry-on-failure, etc.

A typical use is to multiplex SQL requests over several persistent
connections to a replicated database (switching to the secondary db
when all primary connections are dead, etc). In another context, the
load regulation stuff helps guarantee some kind of fairness among
several pool_servers which share the same workers.
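
The pool_server itself isn't posted here; as a rough sketch of the idea
only (round-robin dispatch, none of the load regulation or failover
Pascal describes), a front gen_server can hold the worker pids and relay
each call to the next one. Module and function names below are made up
for illustration.

-module(pool_sketch).
-behaviour(gen_server).
-export([start_link/1, request/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Workers is a list of pids of ordinary gen_server workers.
start_link(Workers) ->
    gen_server:start_link(?MODULE, Workers, []).

%% Clients talk to the pool exactly as they would to a single worker.
request(Pool, Req) ->
    gen_server:call(Pool, Req).

init(Workers) ->
    {ok, Workers}.

%% Pick the head worker, relay the call from a throwaway process so the
%% pool itself never blocks, and rotate the list for round-robin balancing.
handle_call(Req, From, [W | Rest]) ->
    spawn(fun() -> gen_server:reply(From, gen_server:call(W, Req)) end),
    {noreply, Rest ++ [W]}.

handle_cast(_Msg, State) ->
    {noreply, State}.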

-- Pascal Brisset <pascal.brisset> +33141986741 --
----- Cellicium | 73 avenue Carnot | 94230 Cachan | France -----