design pattern question for messaging system

Miles Fidelman
Hi Folks,

So far, I've mostly been experimenting w/ Erlang, and using Erlang-based
technology (notably CouchDB).  As I'm thinking about a new application,
I'm having trouble getting my hands around an appropriate design
pattern.  I wonder if anybody might be able to point me in the right
direction.

The application is message handling (back to that in a minute).  I
realize that I have a pretty good idea how to handle some kinds of
applications in a highly concurrent fashion, such as:
- modeling/simulation (obviously, each entity - such as a vehicle - is a
process) - this is what led me to Erlang in the first place
- protocol engines as state machines - e.g., spawn a process for each
tcp connection
- transaction systems - spawn a process for each transaction
- transaction oriented

But I'm looking at a work flow application that maps onto a
paper-forms-based model.  It's a classic queuing system - work elements
move from queue to queue as they're worked on.  The obvious first
thought is:
- a process for each queue
- a worker process for each work step
- a message for each piece of work-in-process -- moving from queue to
queue via the worker processes

Except, that model kind of falls down because Erlang messages are
unreliable by design, and don't persist in the event of a process crash
(much less a node crash).

My first two thoughts are:
- spawn a process for each queue entry, pass around the PIDs
- use Mnesia to hold the queues

But neither of those feels quite right.  This must be a solved problem,
but I'm hitting a blind spot.  So... what is the design pattern for
queuing systems and/or reliable message passing in Erlang?

Any good examples to look at?  Good presentation slides or reference
materials to review?

Thanks very much,

Miles Fidelman


--
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: design pattern question for messaging system

Jesper Louis Andersen-2

On Tue, Jul 22, 2014 at 1:53 PM, Miles Fidelman <[hidden email]> wrote:
> But neither of those feels quite right.  This must be a solved problem, but I'm hitting a blind spot.  So... what is the design pattern for queuing systems and/or reliable message passing in Erlang?

Make each message a process. Skip the queues. Each message runs its own state to completion. If you need some kind of capacity constraint, use something like uwiger/jobs, jlouis/safetyvalve, poolboy, basho/sidejob, or similar. The trick is to dualize the world, since Erlang has no channel concept that maps naturally onto a queue.
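
As a rough, language-neutral illustration of "each message is a process, with a capacity constraint", here is a minimal sketch in Python. Threads stand in for Erlang processes and a semaphore stands in for a load regulator; this is not the API of jobs, safetyvalve, poolboy, or sidejob. The names run_to_completion, process_all, and max_active are hypothetical.

```python
import threading

def run_to_completion(item, steps, limiter, results, results_lock):
    """One thread per work item: it carries the item through every step itself."""
    with limiter:  # capacity constraint: blocks while too many items are active
        state = item
        for step in steps:
            state = step(state)
    with results_lock:
        results.append(state)

def process_all(items, steps, max_active=2):
    """Spawn one thread per item and wait for all of them to finish."""
    limiter = threading.BoundedSemaphore(max_active)
    results, results_lock = [], threading.Lock()
    threads = [
        threading.Thread(target=run_to_completion,
                         args=(item, steps, limiter, results, results_lock))
        for item in items
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results  # completion order, not submission order
```

There is no queue object anywhere: the only thing that waits is the work item itself, at the limiter.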




--
J.


Re: design pattern question for messaging system

Daniel Pezely-2
In reply to this post by Miles Fidelman
> Except, that model kind of falls down because Erlang messages are
> unreliable by design, and don't persist in the event of a process crash
> (much less a node crash).

You can get the best of what Erlang offers while also adding a journal mechanism in much the same way that many traditional databases handle transactions reliably.

There is a cascade effect of features:

1. For every message that you send, first log it to disk.  Then in the event of a complete system crash you can replay the journal.

2. Next, write two pieces of information together: the original message itself, of course, and a status flag to be set later, once the action that follows message delivery has completed.

3. This in turn requires updating the status flag upon completion of the action, which should also be a distinct node on your sequence diagram.  (Any I/O operation should be noted.)

These could all go in one file or in two, depending on various factors: the rate at which you generate messages, the rate at which you process them, the number & nature of storage devices, whether you are on AWS-style ephemeral server instances (where it is critical to understand the semantics of the service provider's local versus network-attached storage), etc.

Some tricks include writing a header for every N messages so that you can perform bulk updates, iterating at a granularity of large blocks rather than several bytes at a time and minimizing seek operations during nominal operation. Depending on how you track state internally, you may even be able to forgo the status flag on disk completely.  Lots of directions that you could take this...
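
To make the journal-then-flag sequence in steps 1 to 3 concrete, here is a minimal single-file sketch in Python (chosen only to show the file-level bookkeeping; nothing here is Erlang-specific). One assumption to flag loudly: instead of updating a status flag in place, this variant appends a separate "done" record, which keeps the file append-only. The Journal class, its method names, and the JSON-lines layout are all hypothetical.

```python
import json
import os

class Journal:
    """Append-only journal: one JSON record per line."""

    def __init__(self, path):
        self.path = path

    def _append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before returning

    def log_message(self, msg_id, payload):
        # Steps 1 and 2: persist the message, marked pending, BEFORE sending it.
        self._append({"id": msg_id, "status": "pending", "payload": payload})

    def mark_done(self, msg_id):
        # Step 3: record completion of the action that followed delivery.
        self._append({"id": msg_id, "status": "done"})

    def pending(self):
        """Replay the journal; return {id: payload} for messages never marked done."""
        outstanding = {}
        if not os.path.exists(self.path):
            return outstanding
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["status"] == "pending":
                    outstanding[record["id"]] = record["payload"]
                else:
                    outstanding.pop(record["id"], None)
        return outstanding
```

After a crash, whatever pending() returns is what needs to be re-sent. The sketch does not handle a torn final line from a crash in mid-write, and it syncs on every record, which is exactly the cost the bulk-update trick is meant to reduce.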

-Daniel
--
[hidden email]


Re: design pattern question for messaging system

dmkolesnikov
In reply to this post by Miles Fidelman
Hello,

> Except, that model kind of falls down because Erlang messages are unreliable by design, and don't persist in the event of a process crash (much less a node crash).

This is a trade-off you have to accept. If message persistence is a MUST, then you have to look into a reliable message broker solution.

I think you have already articulated the basic pattern, which is used in many applications:

 --[ mq ]--[ worker ]--[ mq ]--[ worker ]--

You are right: if you keep messages within a mailbox, then a crash of the process destroys the mailbox. Thus, you need an intermediate process to hold messages. The intermediate process does not do any work except enqueue / dequeue operations. Therefore, the number of ways it can fail is limited. The worker is a pool of processes; Jesper Louis gave you a list of pool libraries, and I can give one more if you need ;-)
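
The mq / worker split described above can be sketched roughly as follows, in Python for neutrality. This is an illustration under assumptions, not an Erlang implementation: HoldingQueue plays the intermediate process, an exception plays a worker crash, and the names HoldingQueue and run_worker are hypothetical.

```python
from collections import deque

class HoldingQueue:
    """Stand-in for the intermediate [ mq ] process: it only enqueues and dequeues."""

    def __init__(self, items=()):
        self._items = deque(items)

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        # Returns None when empty, so None itself cannot be used as a work item.
        return self._items.popleft() if self._items else None

    def snapshot(self):
        return list(self._items)

def run_worker(source, sink, step):
    """Move items from source to sink through step.

    A failure in step affects only the item in hand; everything still
    held by source or already placed in sink is untouched.
    """
    failed = []
    while (item := source.dequeue()) is not None:
        try:
            sink.enqueue(step(item))
        except Exception as exc:  # stands in for a worker crash
            failed.append((item, exc))
    return failed
```

Because all the risky work lives in step, the queue holder itself has very little that can go wrong, which is the point of keeping it separate from the worker.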

I am using a similar pattern in one of my applications. The biggest problem is a node crash due to OOM or external factors. I am trying to solve it by duplicating the processing path onto N distinct nodes (my app uses last-write-wins, but you might use another conflict-resolution technique).

You can extend the pattern by adding a persistent mq (p-mq) to checkpoint intermediate results so that they survive a node crash, but I found it complicated.

 --[ p-mq ]--[ worker ]--[ mq ]--[ worker ]--[ mq ]--[ worker ]--[ p-mq ]--

Best Regards,
Dmitry
