ETS memory fragmentation after deleting data


Dániel Szoboszlay
Hi,

I would like to understand some things about ETS memory fragmentation after deleting data. My current (probably faulty) mental model of the issue looks like this:
  • For every object in an ETS table a block is allocated on a carrier (typically a multi-block carrier, unless the object is huge).
  • Besides the objects themselves, the ETS table obviously needs some additional blocks to describe the hash table data structure. The size of this data should be small compared to the object data, however (since ETS is not terribly space-inefficient), so I won't consider it any further.
  • If I delete some objects from an ETS table, the corresponding blocks are deallocated. However, the rest of the objects remain in their original location, so the carriers cannot be deallocated (unless all of their objects get deleted).
  • This implies that deleting a lot of data from ETS tables would lead to memory fragmentation.
  • Since there's no way to force ETS to rearrange the objects it already stores, the memory remains fragmented until subsequent updates to ETS tables fill the gaps with new objects.
I wrote a small test program (available here) to verify my mental model. But it doesn't exactly behave as I expected.
  1. I create an ETS table and populate it with 1M objects, where each object is 1027 words large.

    I expect the total ETS memory use to be around 1M * 1027 * 8 bytes ~ 7835 MiB (the size of all other ETS tables on a newly started Erlang node is negligible).

    And indeed I see that the total block size is ~7881 MiB and the total carrier size is ~7885 MiB (99.95% utilisation).
  2. I delete 75% of the objects randomly.

    I expect the block size to go down by ~75% and the carrier size by some smaller amount.

    In practice however the block size goes down by 87%, while the carrier size drops by 48% (resulting in a disappointing 25% utilisation).
  3. Finally, I try to defragment the memory by overwriting each object that was left in the table with itself.

    I expect this operation to have no effect on the block size, but close the gap between the block size and carrier size by compacting the blocks on fewer carriers.

    In practice however the block size goes up by 91%(!!!), while the carrier size comes down very close to this new block size (utilisation is back at 99.56%). All in all, compared to the initial state in step 1, both block and carrier sizes are down by 75%.
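The three steps above can be sketched at a much smaller scale as follows. This is not the original test program (the linked code is not reproduced in this thread); the module name `ets_frag_sketch`, the table name, the payload shape and the use of `rand:uniform/1` are all assumptions made for illustration. It only shows the sequence of ETS calls, and an 8-byte word size is assumed for the size estimate.

```erlang
-module(ets_frag_sketch).
-export([run/1, expected_bytes/2]).

%% Expected payload size in bytes for N objects of Words words each,
%% assuming 8-byte words (64-bit emulator).
expected_bytes(N, Words) ->
    N * Words * 8.

%% Hypothetical small-scale version of the experiment:
%% 1. populate a table with N objects,
%% 2. delete roughly 75% of them at random,
%% 3. overwrite every remaining object with itself.
%% Returns {Deleted, Left, Replaced}.
run(N) ->
    Tab = ets:new(frag_sketch, [set, public]),
    Payload = lists:seq(1, 100),          % illustrative payload, not 1027 words
    true = ets:insert(Tab, [{K, Payload} || K <- lists:seq(1, N)]),
    %% Step 2: each key is deleted with probability 3/4.
    ToDelete = [K || K <- lists:seq(1, N), rand:uniform(4) =/= 1],
    lists:foreach(fun(K) -> ets:delete(Tab, K) end, ToDelete),
    Left = ets:info(Tab, size),
    %% Step 3: the match spec keeps the key ('$1') and writes back the
    %% same value ('$2'), so the table contents are unchanged.
    MatchSpec = [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}],
    Replaced = ets:select_replace(Tab, MatchSpec),
    true = ets:delete(Tab),
    {length(ToDelete), Left, Replaced}.
```

For the figures in step 1, `ets_frag_sketch:expected_bytes(1000000, 1027)` gives 8216000000 bytes, which is about 7835 MiB.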
So here's the list of things I don't understand or know based on this exercise:
  • How could the block size drop by 87% after deleting 75% of the data in step 2?
  • Why did overwriting each object with itself result in almost doubling the block size?
  • Would you consider running a select_replace to compact a table after deletions safe in production? E.g. doing it on a Mnesia table that's several GB in size and is actively used by Mnesia transactions. (I know the replace is atomic on each object, but how would a long-running replace affect the execution time of other operations, for example?)
  • Step 3 helped to reclaim unused memory, but it almost doubled the used memory (the block size). I don't know what caused this behaviour, but is there an operation that would achieve the opposite effect? That is, without altering the contents of the table reduce the block size by 45-50%?
Thanks,
Daniel

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Re: ETS memory fragmentation after deleting data

Sverker Eriksson-4
Hi Dániel 

I looked at your test code and I think the problem may be that the 'mbcs_pool' stats are missing.

They are returned as {mbcs_pool,[{blocks_size,0}]} without carriers_size for some reason
by erlang:system_info({allocator_sizes,ets_alloc}).

Use erlang:system_info({allocator,ets_alloc}) to get mbcs_pool with both block and carrier sizes.

Another thing that might cause confusion is that all binaries larger than 64 bytes are stored in binary_alloc.

/Sverker
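As a rough illustration of the suggestion above, the following sketch sums block and carrier sizes over all ets_alloc instances, including the mbcs_pool section. The module name `ets_alloc_stats` and its helper functions are hypothetical. The exact shape of the term returned by `erlang:system_info({allocator, ets_alloc})` differs between OTP releases, so the code assumes only that each section is a list of tuples whose second element is the current value, and it treats anything it does not recognise as zero.

```erlang
-module(ets_alloc_stats).
-export([sizes/0, utilisation/0]).

%% Returns {BlocksSize, CarriersSize} in bytes, summed over all
%% ets_alloc instances and over the mbcs, mbcs_pool and sbcs sections.
sizes() ->
    case erlang:system_info({allocator, ets_alloc}) of
        Instances when is_list(Instances) ->
            lists:foldl(fun add_instance/2, {0, 0}, Instances);
        _Disabled ->
            {0, 0}
    end.

%% Block size divided by carrier size, or undefined with no carriers.
utilisation() ->
    case sizes() of
        {_Blocks, 0} -> undefined;
        {Blocks, Carriers} -> Blocks / Carriers
    end.

add_instance({instance, _N, Info}, Acc) when is_list(Info) ->
    lists:foldl(
      fun(Section, {B, C}) ->
              Stats = case lists:keyfind(Section, 1, Info) of
                          {Section, S} when is_list(S) -> S;
                          _ -> []
                      end,
              {B + blocks(Stats), C + current(carriers_size, Stats)}
      end, Acc, [mbcs, mbcs_pool, sbcs]);
add_instance(_Other, Acc) ->
    Acc.

%% Assumption: some releases report {blocks_size, Current, ...}, others
%% report {blocks, [{AllocType, [{size, Current, ...}, ...]}]}.
blocks(Stats) ->
    case lists:keyfind(blocks, 1, Stats) of
        {blocks, PerType} when is_list(PerType) ->
            lists:sum([current(size, S) || {_Type, S} <- PerType, is_list(S)]);
        _ ->
            current(blocks_size, Stats)
    end.

%% Second element of the tuple tagged Key, or 0 if it is absent.
current(Key, Stats) ->
    case lists:keyfind(Key, 1, Stats) of
        false -> 0;
        Tuple when is_integer(element(2, Tuple)) -> element(2, Tuple);
        _ -> 0
    end.
```

Calling `ets_alloc_stats:sizes()` before and after each step of an experiment gives the two numbers needed to compute utilisation.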

Re: ETS memory fragmentation after deleting data

Dániel Szoboszlay
Hi Sverker,

Thanks for the tip, I changed my code in the gist to use erlang:system_info({allocator, ets_alloc}) and the weirdest things disappeared. (Also, I intentionally avoided storing binaries in the ETS table in this test, so the binary_alloc couldn't play a role in the results.)

But now I see different "problems":
  • Deleting from the ETS table cannot free up any of the carriers. :(
    After deleting 75% of the objects I could regain 0 memory for the OS and the utilisation is down to a disappointing 25%.
  • Overwriting every object once with itself sometimes has no effect at all on the carrier size either. In this case a second round of overwrites is needed to free up carriers.
  • My memory compaction trick can now only achieve 50% utilisation. So the memory is still fragmented.
  • I tried to repeat the overwrite step a few more times, but once it reaches 50% utilisation it cannot improve on it any more.
My guess was that maybe carrier abandoning causes this problem. I tried playing with +MEacul 0, some different +MEas settings and even with +MEramv true, but none of them helped.
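When experimenting with flags like these, it can help to confirm which settings the running node actually uses. The sketch below reads the `acul`, `as` and `ramv` entries from the options list of each ets_alloc instance. The module name `ets_alloc_opts` is hypothetical, and the presence of these keys under `options` is an assumption about the `erlang:system_info({allocator, ets_alloc})` output; any key that is not found is reported as `undefined`.

```erlang
-module(ets_alloc_opts).
-export([options/0]).

%% Returns [{InstanceNo, [{acul, V1}, {as, V2}, {ramv, V3}]}] for every
%% ets_alloc instance, with undefined for options that are not reported.
options() ->
    case erlang:system_info({allocator, ets_alloc}) of
        Instances when is_list(Instances) ->
            [{N, pick(Info)} || {instance, N, Info} <- Instances,
                                is_list(Info)];
        _Disabled ->
            []
    end.

pick(Info) ->
    Opts = case lists:keyfind(options, 1, Info) of
               {options, O} when is_list(O) -> O;
               _ -> []
           end,
    [{Key, proplists:get_value(Key, Opts, undefined)}
     || Key <- [acul, as, ramv]].
```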

So my new questions are:
  • What may be preventing my overwrite-with-self compactor from going above 50% carrier utilisation?
  • Is there any trick that would help me further reduce the fragmentation and get back to 90%+ utilisation after deleting a lot of objects from ETS?
  • Wouldn't ERTS benefit from some built-in memory defragmentation utility, at least for ets_alloc? (For example, I don't think eheap_alloc would need it: the copying GC effectively performs defragmentation automatically. binary_alloc would also be a potential candidate, but it may be significantly harder to implement, and I guess most systems store less binary data than ETS data.)
Thanks,
Daniel

Re: ETS memory fragmentation after deleting data

Sverker Eriksson-5
I think the table segments are what's keeping the carriers alive.

When a set, bag or duplicate_bag grows, new table segments are allocated.
Each new segment contains 2048 hash buckets and the load limit for growth is 100%.
This means that for every 2048 objects you insert, a new segment is allocated.
The load limit for shrinking is 50%, so after inserting 1 million objects
you have to delete 0.5 million before the table starts to shrink and segments are deallocated.

Increasing the shrink limit will reduce carrier fragmentation in your case,
but it may also cost performance due to more frequent rehashing when the number of objects fluctuates.

The shrink limit is controlled by 
#define SHRINK_LIMIT(NACTIVE) ((NACTIVE) / 2)
in erts/emulator/beam/erl_db_hash.c
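
The arithmetic behind these limits can be sketched as below. This is a simplified model, not the actual logic in erl_db_hash.c: the module name `ets_segment_model` and its functions are hypothetical, and it assumes that the number of active buckets equals the number of objects at the 100% growth limit.

```erlang
-module(ets_segment_model).
-export([segments/1, shrink_limit/1]).

-define(SEGMENT_SIZE, 2048).   % hash buckets per table segment

%% Number of segments needed for NObjects at a 100% load limit,
%% i.e. one bucket per object, rounded up to whole segments.
segments(NObjects) when is_integer(NObjects), NObjects >= 0 ->
    (NObjects + ?SEGMENT_SIZE - 1) div ?SEGMENT_SIZE.

%% Mirrors SHRINK_LIMIT(NACTIVE) ((NACTIVE) / 2): in this model the
%% table starts to shrink once the object count drops below this value.
shrink_limit(NActive) when is_integer(NActive), NActive >= 0 ->
    NActive div 2.
```

In this model, `ets_segment_model:segments(1000000)` is 489 and `ets_segment_model:shrink_limit(1000000)` is 500000, matching the 0.5 million deletions mentioned above.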

/Sverker
