Memory leak related to "driver_alloc"

Florian Odronitz-2
Hi,

I am trying to debug what looks like a memory leak of about 6GB per week (~10kB/s).
The node in question gets data from Kafka and Redis, processes it and puts the result back in Redis (~1MB/s in, 0.9MB/s out).
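As a quick sanity check of the figures above, here is a minimal Python sketch that converts the weekly growth into a per-second rate and compares it with the inbound throughput. It assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes); the function and variable names are illustrative only.

```python
# Convert the reported growth (about 6 GB per week) into bytes per second
# and compare it with the reported inbound data rate (about 1 MB/s).
# Assumption: decimal units, i.e. 1 GB = 1e9 bytes and 1 MB = 1e6 bytes.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def leak_rate_bytes_per_second(leaked_bytes_per_week: float) -> float:
    """Average growth rate in bytes per second for a given weekly growth."""
    return leaked_bytes_per_week / SECONDS_PER_WEEK


leaked_per_week = 6e9   # ~6 GB per week, from the message
inbound_rate = 1e6      # ~1 MB/s inbound, from the message

rate = leak_rate_bytes_per_second(leaked_per_week)
fraction_of_inbound = rate / inbound_rate

print(f"growth rate: {rate / 1e3:.1f} kB/s")
print(f"share of inbound traffic: {fraction_of_inbound:.1%}")
```

Under these assumptions the growth is on the order of one percent of the inbound data, which would be consistent with a small amount of memory being retained per processed message rather than with whole payloads being kept.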

When I log into the node, erlang:memory() reports 14.6 GB of memory, almost all of it attributed to "system".
Digging deeper, recon_alloc:memory(allocated_types) shows that almost all of that memory is allocated by "driver_alloc".

I tried a few things that did not change anything:
- force GC on all processes
- restart the main application
- stop all applications except the bare minimum

I suspected that the NIFs might be leaking memory, so I ran each of them in isolation for around 30 CPU minutes, encoding/decoding or compressing/decompressing. They all seem fine.
Here is a list of them:
snappy: https://github.com/fdmanana/snappy-erlang-nif (c4cd1bb35f3a3a399737d64bd03b479817b90e75)
jsonx:  https://github.com/odo/jsonx (435fc3e9df5c33bf307fa8c95da7a18b07ea49a7)
lz4:    https://github.com/szktty/erlang-lz4.git (bdbfbedc89c073cd154de5a0691a198be43c539b)

I don't fully understand what driver_alloc contains and what could drive its growth.
Any thoughts?

Thanks, Florian
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Re: Memory leak related to "driver_alloc"

Stanislaw Klekot
On Wed, May 02, 2018 at 07:08:34PM +0200, Florian Odronitz wrote:
> Hi,
>
> I am trying to debug what looks like a memory leak of about 6GB per week (~10kB/s).
> The node in question gets data from Kafka and Redis, processes it and puts the result back in Redis (~1MB/s in, 0.9MB/s out).
>
> When I log into the node I see erlang:memory() reporting 14.6 GB of memory basically all attributed to "system".
> Digging deeper recon_alloc:memory(allocated_types) finds that basically all the memory is allocated by "driver_alloc".
[...]
> I thought maybe it is NIFs leaking memory so I tried to run them in
> isolation for around 30 CPU minutes each encoding/decoding or
> compressing/decompressing. They all seem fine.

If it's really driver_alloc(), then I wouldn't search among the NIFs,
but among the port drivers. Do you have any?

> I don't fully understand what driver_alloc contains and what could drive its growth.

driver_alloc() is, according to its documentation, a wrapper for
malloc(3), so it's not its own fault.

--
Stanislaw Klekot
Re: Memory leak related to "driver_alloc"

John Högberg
On Wed, May 02, 2018 at 07:08:34PM +0200, Florian Odronitz wrote:
> When I log into the node I see erlang:memory() reporting 14.6 GB of
> memory basically all attributed to "system".
> Digging deeper recon_alloc:memory(allocated_types) finds that
> basically all the memory is allocated by "driver_alloc".

If you can try out OTP 21-rc1, the new memory instrumentation features
will help you find the culprit:

http://blog.erlang.org/Memory-instrumentation-in-OTP-21/

> I don't fully understand what driver_alloc contains and what could
> drive its growth.
> Any thoughts?

`driver_alloc` is used for general allocations in drivers and NIFs,
e.g. `enif_alloc()`, `erl_drv_thread_create()`, or
`enif_inspect_iovec()`.

Regards,
John Högberg
Re: Memory leak related to "driver_alloc"

Florian Odronitz-2


On 3. May 2018, at 09:38, John Högberg <[hidden email]> wrote:
> On Wed, May 02, 2018 at 07:08:34PM +0200, Florian Odronitz wrote:
> > When I log into the node I see erlang:memory() reporting 14.6 GB of
> > memory basically all attributed to "system".
> > Digging deeper recon_alloc:memory(allocated_types) finds that
> > basically all the memory is allocated by "driver_alloc".
>
> If you can try out OTP 21-rc1, the new memory instrumentation features
> will help you find the culprit:
>
> http://blog.erlang.org/Memory-instrumentation-in-OTP-21/

"Those who have used erlang:memory() are probably familiar with how annoyingly general the system category can be. It’s possible to get a bit more information by using erlang:system_info({allocator, Alloc}) but the most it will do is tell you that it’s (say) driver_alloc that eats all that memory and leave you with no clue which one."

That's my situation, exactly! Once Elixir is compatible, I will run one node under OTP 21 and see if I can get some insight.

Thanks for helping
Florian


PS: In the meantime I also looked at memory fragmentation using recon_alloc:fragmentation(current).
I found some impressively large carrier totals (>4 GB per allocator instance), but their usage is very high:

[{{:driver_alloc, 1},
  [sbcs_usage: 1.0, mbcs_usage: 0.9999680067349549, sbcs_block_size: 0,
   sbcs_carriers_size: 0, mbcs_block_size: 4937881312,
   mbcs_carriers_size: 4938039296]},
 {{:driver_alloc, 2},
  [sbcs_usage: 1.0, mbcs_usage: 0.9998221050872671, sbcs_block_size: 0,
   sbcs_carriers_size: 0, mbcs_block_size: 4819741224,
   mbcs_carriers_size: 4820598784]},
 {{:driver_alloc, 3},
  [sbcs_usage: 1.0, mbcs_usage: 0.9999122383844911, sbcs_block_size: 0,
   sbcs_carriers_size: 0, mbcs_block_size: 3822018976,
   mbcs_carriers_size: 3822354432]},
 {{:driver_alloc, 4},
  [sbcs_usage: 1.0, mbcs_usage: 0.9967773040787262, sbcs_block_size: 0,
   sbcs_carriers_size: 0, mbcs_block_size: 1326648648,
   mbcs_carriers_size: 1330937856]},
 {{:driver_alloc, 5},
  [sbcs_usage: 1.0, mbcs_usage: 0.9775849991437053, sbcs_block_size: 0,
   sbcs_carriers_size: 0, mbcs_block_size: 210428096,
   mbcs_carriers_size: 215252992]},
 {{:ll_alloc, 0},
  [sbcs_usage: 1.0, mbcs_usage: 0.88204345703125, sbcs_block_size: 0,
   sbcs_carriers_size: 0, mbcs_block_size: 34683360,
   mbcs_carriers_size: 39321600]},
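To make the "very good usage" observation concrete, here is a minimal Python sketch that recomputes the figures from the output above. It assumes that mbcs_usage is simply mbcs_block_size divided by mbcs_carriers_size, which is what the pasted numbers suggest; the dictionary layout and names are illustrative and not part of recon.

```python
# Recompute usage from the pasted recon_alloc:fragmentation(current) output.
# Assumption: usage = mbcs_block_size / mbcs_carriers_size.
# Keys are (allocator, instance); values are (mbcs_block_size, mbcs_carriers_size).
FRAGMENTATION = {
    ("driver_alloc", 1): (4937881312, 4938039296),
    ("driver_alloc", 2): (4819741224, 4820598784),
    ("driver_alloc", 3): (3822018976, 3822354432),
    ("driver_alloc", 4): (1326648648, 1330937856),
    ("driver_alloc", 5): (210428096, 215252992),
    ("ll_alloc", 0): (34683360, 39321600),
}


def usage(block_size: int, carriers_size: int) -> float:
    """Fraction of the carrier space that is occupied by live blocks."""
    return block_size / carriers_size


# Totals across all driver_alloc instances shown in the output.
driver_blocks = sum(
    blocks for (name, _), (blocks, _) in FRAGMENTATION.items()
    if name == "driver_alloc"
)
driver_carriers = sum(
    carriers for (name, _), (_, carriers) in FRAGMENTATION.items()
    if name == "driver_alloc"
)
driver_unused = driver_carriers - driver_blocks

for (name, instance), (blocks, carriers) in FRAGMENTATION.items():
    print(f"{name}/{instance}: usage {usage(blocks, carriers):.6f}")

print(f"driver_alloc carriers: {driver_carriers / 1e9:.2f} GB")
print(f"driver_alloc unused:   {driver_unused / 1e6:.2f} MB")
```

With these inputs, the unused space inside the driver_alloc carriers is small compared with the carrier total, so fragmentation does not look like the explanation: the blocks themselves are live.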

 

Re: Memory leak related to "driver_alloc"

Jesper Louis Andersen-2
My immediate reaction to a 4 GB carrier: my goodness, what are you doing to the poor VM to get that? Putting a Blu-ray disc in memory?

I'd definitely try to figure out why that happens, because my intuition has all alarms ringing.

On Fri, May 4, 2018 at 10:16 AM Florian Odronitz <[hidden email]> wrote:

[...]