Framework / library for at-least-once execution?


Petri Pellinen
Hello everybody,

I'm a new subscriber to the list and new to the Erlang world. I have read a couple of books on Erlang and OTP and, as a long-time Java programmer, am very excited by all the concurrency and high availability features that come out of the box.

Tried searching the archives for answers but am not sure if I came up with the correct search terms and ended up empty-handed.

I'm curious if there is an existing library or framework that would let me submit a "job" and the framework makes sure that the job is run *at least once* to completion in an OTP cluster even if the machine running the submitted job dies during execution. If a machine/node dies during execution of the job then another node should restart the job as soon as possible.

So, from a client perspective, if I get an acknowledgment that a job was successfully received then I can rest assured that the job runs to completion.

If any of you are familiar with Spring Batch in the Java world then this is something similar but not really ETL orientated or for heavy batches - when I say "job" here I really mean any piece of code, even a very lightweight function. I'm trying to come up with an extremely reliable backend solution for delivering and processing messages between parties.

Any information or pointers to relevant existing solutions would be greatly appreciated.

Thanks in advance for any help you may be able to provide!

Kind regards,
Petri


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Framework / library for at-least-once execution?

Michael Truog
On 04/08/2018 07:38 AM, Petri Pellinen wrote:

> I'm curious if there is an existing library or framework that would let me submit a "job" and the framework makes sure that the job is run *at least once* to completion in an OTP cluster even if the machine running the submitted job dies during execution. If a machine/node dies during execution of the job then another node should restart the job as soon as possible.

For real-time at-least-once processing you have two basic high-level choices:
1) Treat a job as a piece of data that you put in a queue to be processed at least once, with distributed consensus keeping the queue fault-tolerant
2) Treat a job as source code in a service that receives task messages, so the concept of a job is abstract, allowing the algorithm and the input/output of the job to change separately
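
To make choice 1 concrete, here is a minimal, language-neutral sketch in Python (not Erlang, and not any existing library). It assumes a single in-memory queue standing in for a replicated one, and every name in it (AtLeastOnceQueue, submit, lease, ack, lease_seconds) is hypothetical. The point it illustrates is that a job is forgotten only after a worker acknowledges it, so a worker that dies mid-job causes the job to be handed out again.

```python
import time
import uuid


class AtLeastOnceQueue:
    """In-memory stand-in for a replicated job queue (hypothetical API)."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self._lease_seconds = lease_seconds
        self._clock = clock
        self._pending = {}  # job_id -> payload, waiting for a worker
        self._leased = {}   # job_id -> (payload, lease deadline)

    def submit(self, payload):
        """Store the job first, then return its id as the acknowledgment."""
        job_id = str(uuid.uuid4())
        self._pending[job_id] = payload
        return job_id

    def lease(self):
        """Hand out one job; the queue keeps it until ack() is called."""
        self._requeue_expired()
        if not self._pending:
            return None
        job_id, payload = next(iter(self._pending.items()))
        del self._pending[job_id]
        deadline = self._clock() + self._lease_seconds
        self._leased[job_id] = (payload, deadline)
        return job_id, payload

    def ack(self, job_id):
        """The worker finished: only now is the job forgotten."""
        self._leased.pop(job_id, None)

    def _requeue_expired(self):
        """Jobs whose worker never acked in time become available again."""
        now = self._clock()
        for job_id, (payload, deadline) in list(self._leased.items()):
            if deadline <= now:
                del self._leased[job_id]
                self._pending[job_id] = payload
```

A job can therefore run more than once (for example when a slow worker outlives its lease), which is why at-least-once jobs are usually written to be idempotent. A real system would also replicate the two dictionaries across nodes, which this sketch does not attempt.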

#2 is similar to #1 because the tasks are still queued to get handled by the service. However, #2 differs in allowing hot-code upgrades/downgrades without extra complexity (i.e., task messages use a clearly defined protocol, and you don't need to be concerned about data structures changing, since they are isolated in the source code as a separate entity).
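
The separation described for #2 can be sketched as follows, again in Python as a language-neutral illustration rather than Erlang or CloudI code. The message format (a JSON object with "v", "task" and "args") and the names encode_task and JobService are assumptions made up for this example. Queued messages carry only a task name, a protocol version and arguments, so the code behind a task can be replaced while messages are still waiting.

```python
import json


def encode_task(name, args, version=1):
    """Serialize a task as a plain, versioned message (hypothetical format)."""
    return json.dumps({"v": version, "task": name, "args": args})


class JobService:
    """Holds the job's code; messages only carry names and arguments."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, version, handler):
        """Install or replace the code for a (task name, version) pair."""
        self._handlers[(name, version)] = handler

    def handle(self, message):
        """Decode one task message and run whatever code is installed now."""
        task = json.loads(message)
        handler = self._handlers[(task["task"], task["v"])]
        return handler(**task["args"])
```

Because the message contains no code and no internal data structures, a message encoded before an upgrade is still valid after it. Only a change to the protocol itself would require a new version number.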

CloudI (https://cloudi.org) provides the #2 approach, which is a natural way to approach the problem in Erlang, though you may be expecting to see the #1 approach. If you were using CloudI to solve this problem, you could use the cloudi_service_quorum source code as a proxy to achieve consensus among different "job" services on separate machines that process the same task message concurrently. Something similar could be done with the CloudI API function mcast_async.
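
The general idea of a quorum proxy can be sketched like this. It is not how cloudi_service_quorum is implemented and it does not use the CloudI API; run_with_quorum and its parameters are hypothetical, the replicas are plain Python callables standing in for services on separate machines, and they are called one after another here rather than concurrently. Results are assumed to be hashable so that they can be counted.

```python
from collections import Counter


def run_with_quorum(task, replicas, quorum=None):
    """Send `task` to every replica and return the result a quorum agrees on."""
    if quorum is None:
        quorum = len(replicas) // 2 + 1  # simple majority
    results = []
    for replica in replicas:
        try:
            results.append(replica(task))
        except Exception:
            # A crashed or unreachable replica contributes no vote.
            continue
    if results:
        value, votes = Counter(results).most_common(1)[0]
        if votes >= quorum:
            return value
    raise RuntimeError("quorum not reached")
```

With three replicas and a majority quorum, the task still completes if one machine dies during execution, because the other two process the same message independently.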

Best Regards,
Michael