High Erlang memory binary


High Erlang memory binary

Alin Popa
Hi all,

I've stumbled upon a weird situation (you'll see below what I mean by weird) where my `system` memory is quite high, and the majority of it is assigned to `binary`.
When I do `:erlang.memory()`, I'm getting back the following values:
```
[
  total: 1012506552,
  processes: 84512344,
  processes_used: 83993696,
  system: 927994208,
  atom: 1845473,
  atom_used: 1816962,
  binary: 834439728,
  code: 48673550,
  ets: 11823024
]
```
As you can see, `binary` is ~89% of the system memory (according to the Erlang docs: "binary: The total amount of memory currently allocated for binaries. This memory is part of the memory presented as system memory.").

I wrote a function that gives me the memory taken by all processes' refc binaries; it looks like this:
```
# Sum the sizes of all refc binaries referenced by every live process.
total = fn ->
  :erlang.processes()
  |> Enum.map(fn pid ->
    try do
      # {:binary, [{id, size, ref_count}, ...]} lists the process' off-heap binaries
      {_, bins} = :erlang.process_info(pid, :binary)
      total_size = Enum.reduce(bins, 0, fn {_id, size, _refs}, acc -> size + acc end)
      {pid, total_size, length(bins)}
    rescue
      # :erlang.process_info/2 returns :undefined for dead processes; count those as 0
      _ -> {pid, 0, 0}
    end
  end)
  |> Enum.reduce(0, fn {_pid, total_size, _count}, acc -> acc + total_size end)
end
```
The result returned by this function is 9600693 bytes, which is ~9 MB, nowhere near the `binary` value returned by `:erlang.memory()`.

Now, the weird part: my app runs inside a Docker container, and that container runs within a Kubernetes cluster. In this setup I can see the problem, but if I run it locally in bare-bones Docker, I can't reproduce it. Another thing I've noticed, which is even weirder: when the BEAM starts, the binary memory is ~300 MB, and with each request it bumps up until it reaches ~900 MB, after which it stays constant. The BEAM doesn't crash with out-of-memory or anything similar, but I'm still worried. Other applications I have, running on Erlang 22 with exactly the same Docker image, don't suffer from this problem, which makes me think it's related to the application code...
This happens on both Erlang 21 and 22.

Another thing to mention: I also ran `recon`, and afterwards the binary section stays exactly the same, so it doesn't seem to be a matter of binaries not being garbage collected.
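For completeness, the recon check I had in mind was something along these lines (assuming `recon:bin_leak/1`, which as far as I understand garbage-collects every process and reports the ones that released the most refc binary references):
```
# Force a GC on all processes and list the top 10 that dropped the most refc
# binary references; the `binary` figure from :erlang.memory() barely moved
# afterwards.
:recon.bin_leak(10)
```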

My questions are:
1. What other type of binary (besides refc process binaries) can live in that region of memory? Am I looking for the wrong thing?
2. How can I _look_ at the `binary` memory region to see what I have in there?
3. Any other suggestion on what I might be doing wrong?

Any help will be much appreciated.

Thanks,
Alin

Re: High Erlang memory binary

Lukas Larsson-8


On Fri, Jan 10, 2020 at 8:04 PM Alin Popa <[hidden email]> wrote:
Hi all,

Hello!
 

My questions are:
1. What other type of binary (beside refc process binaries) can live in that zone of memory? Am I looking for the wrong thing?

The main ones are binaries kept by ets tables and ports.
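If it helps, a rough way to spot suspiciously large tables is the sketch below. Note that `:ets.info/2` reports the table's own memory in words, while large binaries stored in tables are refc binaries counted under `binary`, so this is only an indirect hint:
```
# List the ten ETS tables using the most memory (table memory is reported in
# words, so multiply by the word size to get bytes).
word_size = :erlang.system_info(:wordsize)

:ets.all()
|> Enum.map(fn tab -> {tab, :ets.info(tab, :memory)} end)
|> Enum.filter(fn {_tab, words} -> is_integer(words) end)
|> Enum.map(fn {tab, words} -> {tab, words * word_size} end)
|> Enum.sort_by(fn {_tab, bytes} -> -bytes end)
|> Enum.take(10)
```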
 
2. How can I _look_ at the `binary` memory region to see what I have in there?

You can use the `instrument` module in `runtime_tools` to get some more insight.
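For example, something along these lines (a sketch assuming OTP 21 or later, where `instrument:allocations/0` lives in `runtime_tools`; the exact shape of the returned map can differ between releases):
```
# Live allocations grouped by origin and allocation type; the blocks backing
# refc binaries show up under the :binary type.
{:ok, {_hist_start, _unscanned_bytes, allocations}} = :instrument.allocations()

allocations
|> Enum.map(fn {origin, types} -> {origin, Map.take(types, [:binary])} end)
|> Enum.reject(fn {_origin, types} -> types == %{} end)
```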
 
3. Any other suggestion on what I might be doing wrong?

From what you describe above, I would look at the inet ports that are running and make sure that they are closed and that they use sane buffer sizes.
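A quick way to eyeball the socket buffers is something like this (again just a sketch; ports that are not inet sockets should simply return an error from `:inet.getopts/2` and get skipped):
```
# For every open port that is an inet socket, show its configured buffer sizes.
:erlang.ports()
|> Enum.map(fn port ->
  case :inet.getopts(port, [:buffer, :recbuf, :sndbuf]) do
    {:ok, opts} -> {port, :erlang.port_info(port, :name), opts}
    _ -> nil
  end
end)
|> Enum.reject(&is_nil/1)
```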


Re: High Erlang memory binary

Mike Benza
In reply to this post by Alin Popa
Alin,

3. Any other suggestion on what I might be doing wrong?

I ran into a similar problem a few years ago. I was using jiffy to decode JSON blobs. Jiffy gave me back sliced binaries (sub-binaries) of the original binary I passed in. This kept the whole binary in memory, even if I was only using a tiny portion of it and discarding the rest. If you're doing this (unlikely that you've run into the exact same problem), use the `copy_strings` option to jiffy.

If not, consider looking for other cases where you start with a large binary, discard most of it, and keep a small portion of it.  That may be the cause of your binary bloat.  If you find something like that, try copying the binaries in a way that causes the original binary to be released from memory.
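To make the sub-binary issue concrete, here is a tiny, purely hypothetical illustration of both the problem and the copying fix:
```
# Slicing keeps a reference into the original binary: `small` below is a
# 10-byte view into the 1 MB binary, so the whole 1 MB stays allocated for as
# long as `small` is reachable.
big = :crypto.strong_rand_bytes(1_000_000)
<<small::binary-size(10), _rest::binary>> = big

# Copying produces an independent 10-byte binary, which lets the original be
# garbage collected once nothing else references it.
small_copy = :binary.copy(small)
```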

- Mike

Re: High Erlang memory binary

Alin Popa
In reply to this post by Lukas Larsson-8
Thanks so much, Lukas; I'll try to look at the inet ports for now. I already did some experimentation (i.e. killing some processes) and the memory went down a bit (it goes back up, but this shows me that there may be something around that area).

Alin



Re: High Erlang memory binary

Alin Popa
In reply to this post by Mike Benza
Thanks Mike,
In fact that's a good point; even though I was aware of this, I haven't spent much time looking at those things, so I will add it to my list.

Alin
