How to properly stop complicated async NIF resource.


Max Lapshin-2
Hi.

We are writing a NIF wrapper for OpenMAX, a standard API for transcoding video on Android or Raspberry Pi.

This API requires a separate thread (launched by the library), and this thread receives events from the hardware. The thread wraps hardware events in Erlang messages and sends them to the owner process. All this looks really cool, and the happy path of our code works excellently.

When the whole resource is to be destroyed, we need to clean it up properly and shut down all open hardware handles. Right now we have to write complicated C code in the destructor that calls cond_wait, mutex_lock, etc.
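
To make the complaint concrete, here is a minimal standalone sketch of that kind of blocking teardown. It is plain C with pthreads, not OpenMAX or erl_nif code: `hw_res_t`, `worker_main`, `hw_res_open` and `hw_res_destroy` are hypothetical names, and the worker only stands in for the library's event thread. The destructor has to raise a flag, wake the thread, and then block until the thread reports that its handles are closed.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the wrapped hardware resource. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    pthread_t       worker;
    bool stop_requested;   /* set by the destructor */
    bool worker_done;      /* set by the worker once its handles are closed */
    int  open_handles;     /* stand-in for the open hardware handles */
} hw_res_t;

/* Stand-in for the library's event thread. */
static void *worker_main(void *arg) {
    hw_res_t *r = arg;
    pthread_mutex_lock(&r->lock);
    while (!r->stop_requested)
        pthread_cond_wait(&r->cond, &r->lock);  /* models waiting for hardware */
    r->open_handles = 0;                        /* close handles on this thread */
    r->worker_done = true;
    pthread_cond_broadcast(&r->cond);
    pthread_mutex_unlock(&r->lock);
    return NULL;
}

static int hw_res_open(hw_res_t *r) {
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->cond, NULL);
    r->stop_requested = false;
    r->worker_done = false;
    r->open_handles = 2;
    return pthread_create(&r->worker, NULL, worker_main, r);
}

/* A synchronous destructor: it cannot return until the worker has finished. */
static void hw_res_destroy(hw_res_t *r) {
    pthread_mutex_lock(&r->lock);
    r->stop_requested = true;
    pthread_cond_broadcast(&r->cond);
    while (!r->worker_done)
        pthread_cond_wait(&r->cond, &r->lock);
    pthread_mutex_unlock(&r->lock);
    pthread_join(r->worker, NULL);
    pthread_cond_destroy(&r->cond);
    pthread_mutex_destroy(&r->lock);
}
```

Even in this toy form the destructor needs a mutex, a condition variable and two flags; a real teardown with several hardware states would need more of each.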

The same code in Erlang would be 100 times cleaner and easier, but we cannot write the destructor in Erlang. I also cannot return from the destructor function and ask to continue the destruction later, when the separate thread is woken up by the hardware.

So it would be cool to still have the Erlang VM available in the destructor, or to make something like a "delayed destructor" that says "I promise to finish this destruction".

The question is: is my problem well known, or is it very specific?


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Re: How to properly stop complicated async NIF resource.

Mikael Karlsson-7
Hi Max,
I had a similar problem when interfacing with the ALSA library for sound generation on Linux, but it is probably much simpler than your case, so I am not sure it will help.
For every thread I have a peer gen_server that opens a handle in a resource and starts the callback thread (I get events every 5 ms) in the init/1 and handle_continue/2 functions. When the gen_server is shut down and terminate/2 is called, I call a NIF to close the handle. The resource itself is garbage collected by Erlang (I think :-) ).
What I noticed was that I had to set process_flag(trap_exit, true) in order to have the terminate function called on supervisor:terminate_child/2.
I am not sure what you mean by a delayed destructor, but if you need a lot of time for complete destruction, maybe you can orchestrate it from a separately spawned Erlang process that calls the proper NIFs.

You can check out my code at https://github.com/karlsson/xalsa , mainly c_src/xalsa_nif.c and src/xalsa_server.erl

Best Regards
Mikael

On Wed, 17 July 2019 at 17:53, Max Lapshin <[hidden email]> wrote:
> Hi.
>
> We are writing a NIF wrapper for OpenMAX, a standard API for transcoding video on Android or Raspberry Pi.
>
> This API requires a separate thread (launched by the library), and this thread receives events from the hardware. The thread wraps hardware events in Erlang messages and sends them to the owner process. All this looks really cool, and the happy path of our code works excellently.
>
> When the whole resource is to be destroyed, we need to clean it up properly and shut down all open hardware handles. Right now we have to write complicated C code in the destructor that calls cond_wait, mutex_lock, etc.
>
> The same code in Erlang would be 100 times cleaner and easier, but we cannot write the destructor in Erlang. I also cannot return from the destructor function and ask to continue the destruction later, when the separate thread is woken up by the hardware.
>
> So it would be cool to still have the Erlang VM available in the destructor, or to make something like a "delayed destructor" that says "I promise to finish this destruction".
>
> The question is: is my problem well known, or is it very specific?

Re: How to properly stop complicated async NIF resource.

Max Lapshin-2
I'm afraid of the situation where the only resource owner is killed and the resource must be destroyed by calling the destructor.

It is too late to call into Erlang at that point =(

However, your idea is interesting. I can register some destruction manager, just like an "heir" in ets, and when the resource is going to be killed, or its owner is going to be killed, we send a message to the "destruction manager" and it will clean up without these messy cond_wait_timeout calls.

Re: How to properly stop complicated async NIF resource.

Mikael Karlsson-7
On Wed, 17 July 2019 at 21:45, Max Lapshin <[hidden email]> wrote:
> I'm afraid of the situation where the only resource owner is killed and the resource must be destroyed by calling the destructor.
>
> It is too late to call into Erlang at that point =(
>
> However, your idea is interesting. I can register some destruction manager, just like an "heir" in ets, and when the resource is going to be killed, or its owner is going to be killed, we send a message to the "destruction manager" and it will clean up without these messy cond_wait_timeout calls.

I think I understand....
In my case I do not even register a destructor callback, since I do not need one.
Just some other ideas: maybe it is possible to use the enif_open_resource_type_x function to register a "down" function and then monitor your resource owner process from the thread with enif_monitor_process. I guess that the down callback would be called from within your thread, and then you could clean up in it without the need for mutexes and cond_wait_timeout calls.
Or, if enif_send is called often, you could check whether it returns false, which means that to_pid or the sender is not alive, and perform the cleanup in that case.
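
A rough standalone model of that second idea, under stated assumptions: this is plain C with pthreads and C11 atomics, not erl_nif code. `send_to_owner` is a hypothetical stand-in for `enif_send` that fails once the owner is gone, and `evt_res_t`, `event_thread`, `evt_res_start` and `evt_res_owner_died` are made-up names. The point it tries to show is that the event thread can notice the failed send and close its own handles, so nothing else has to wait on it.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the NIF resource and its owner process. */
typedef struct {
    atomic_bool owner_alive;   /* models "the owner process still exists" */
    atomic_uint delivered;     /* number of events handed to the owner */
    int         open_handles;  /* written only by the event thread after start */
    pthread_t   thread;
} evt_res_t;

/* Stand-in for enif_send(): reports failure once the receiver is gone. */
static bool send_to_owner(evt_res_t *r) {
    if (!atomic_load(&r->owner_alive))
        return false;
    atomic_fetch_add(&r->delivered, 1u);
    return true;
}

/* Stand-in for the library's event thread. */
static void *event_thread(void *arg) {
    evt_res_t *r = arg;
    for (;;) {
        sched_yield();             /* a real thread would block on hardware here */
        if (!send_to_owner(r)) {
            r->open_handles = 0;   /* clean up on the thread that owns the handles */
            return NULL;
        }
    }
}

static int evt_res_start(evt_res_t *r) {
    atomic_init(&r->owner_alive, true);
    atomic_init(&r->delivered, 0u);
    r->open_handles = 2;
    return pthread_create(&r->thread, NULL, event_thread, r);
}

/* Models the owner process dying; no condition variable is involved. */
static void evt_res_owner_died(evt_res_t *r) {
    atomic_store(&r->owner_alive, false);
}
```

The obvious limitation is the one already implied by "called often": the cleanup only happens on the next send attempt, so a thread that rarely has events to report would notice the dead owner late.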

Also, just to be clear: even if you start a thread from an API, it will run within the Erlang VM if you can call enif_send.
It took me some time (and help from an OTP developer) to understand this.


Re: How to properly stop complicated async NIF resource.

Max Lapshin-2
I've refactored my code, and now there is a special killing process next to each NIF resource.

It will properly shut down the hardware when the NIF owner is dead.

The NIF destructor is now a bit useless, because it is called when everything is already done.

Re: How to properly stop complicated async NIF resource.

Mikael Karlsson-7

> I've refactored my code, and now there is a special killing process next to each NIF resource.
>
> It will properly shut down the hardware when the NIF owner is dead.

Nice.

> The NIF destructor is now a bit useless, because it is called when everything is already done.

I guess you can just replace it with NULL when it becomes completely useless.


Re: How to properly stop complicated async NIF resource.

Max Lapshin-2
I'm afraid that I need to replace it with something like:

/* Destructor guard: the Erlang-side cleanup must already have run. */
if (!obj->cleaned) {
    fprintf(stderr, "We should not get here\n");
    abort();
}


because if we get to the destructor without having called the Erlang destructor, it means that we need to restart the whole program and, after 5 such restarts, reboot the whole device with a power cycle.



Re: How to properly stop complicated async NIF resource.

Mikael Karlsson-7

> because if we get to the destructor without having called the Erlang destructor, it means that we need to restart the whole program and, after 5 such restarts, reboot the whole device with a power cycle.

OK, this brings me back a bit to the previous discussion about enif_release_resource. If you do not call enif_release_resource in the same function where you call enif_alloc_resource, but wait until you have cleaned everything up with the killing process ("Erlang destructor"), you make sure that garbage collection and the destructor call will not happen before that.
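
A toy model of that ordering, in plain C rather than erl_nif code. `rc_alloc`, `rc_keep`, `rc_release` and `rc_close` are hypothetical stand-ins for, respectively, `enif_alloc_resource`, the reference a resource term holds, `enif_release_resource`, and a cleanup NIF called by the killing process. It only illustrates why holding the allocation reference keeps the destructor from running before the cleanup; it is a sketch, not a description of how the VM implements it.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    atomic_int refs;   /* reference count of the resource object */
    bool cleaned;      /* set once the hardware has been shut down */
} rc_res_t;

static int  dtor_runs = 0;            /* how many times the destructor ran */
static bool dtor_saw_cleaned = false; /* what the destructor observed */

/* Destructor: in the real code this is where the abort() guard would sit. */
static void rc_dtor(rc_res_t *r) {
    dtor_runs++;
    dtor_saw_cleaned = r->cleaned;
}

/* Allocation hands back one reference owned by the caller. */
static rc_res_t *rc_alloc(void) {
    rc_res_t *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    atomic_init(&r->refs, 1);
    r->cleaned = false;
    return r;
}

/* Models a term taking its own reference to the resource. */
static void rc_keep(rc_res_t *r) {
    atomic_fetch_add(&r->refs, 1);
}

/* Dropping the last reference runs the destructor and frees the object. */
static void rc_release(rc_res_t *r) {
    if (atomic_fetch_sub(&r->refs, 1) == 1) {
        rc_dtor(r);
        free(r);
    }
}

/* Cleanup step driven by the killing process. */
static void rc_close(rc_res_t *r) {
    r->cleaned = true;  /* hardware shutdown would happen here */
    rc_release(r);      /* only now give up the reference held since allocation */
}
```

In this model, garbage collection of the term corresponds to one `rc_release`; because the allocation reference is still held, the count stays above zero and the destructor can only run after `rc_close`.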
