Erlang and Memory management

Erlang and Memory management

Roberto Ostinelli
Dear list,

I'd like some insight into how memory is handled in Erlang.

Let's say that I build up a binary term in a gen_server, store it in that server's state, and then send it to another gen_server, which also stores the term in its state.

My question is: will this binary occupy twice the memory, or is there some kind of pointer mechanism to handle it?

Thank you,

r.

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Erlang and Memory management

Evgeniy Khramtsov-2
31.05.2011 19:13, Roberto Ostinelli wrote:
> My question is: will this Binary term occupy 2 * memory space or is there
> some kind of pointer mechanism to handle it?
>    

As I understand from binary.c, small binaries (up to
ERL_ONHEAP_BIN_LIMIT = 64 bytes) are kept on the process heap, so they
will likely be copied. Bigger binaries are allocated in a shared,
reference-counted area outside the process heaps, so only a pointer
will be copied.
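
The difference can be observed from the receiving side. What follows is a minimal sketch, not something taken from binary.c: the module name bin_share_demo and its functions are made up for illustration. It sends a 4-byte binary and a 1 MiB binary to freshly spawned processes and reports each receiver's memory footprint once the binary has been stored.

```erlang
-module(bin_share_demo).
-export([run/0, receiver_memory/1]).

%% Hypothetical helper: spawns a process that receives Bin, keeps it alive
%% across a garbage collection, and reports its own footprint in bytes.
receiver_memory(Bin) when is_binary(Bin) ->
    Parent = self(),
    Pid = spawn(fun() ->
                    receive
                        {store, B} ->
                            erlang:garbage_collect(),
                            {memory, Bytes} = process_info(self(), memory),
                            %% byte_size/1 is evaluated after the measurement,
                            %% so B stays live while we measure.
                            Parent ! {self(), Bytes, byte_size(B)}
                    end
                end),
    Pid ! {store, Bin},
    receive
        {Pid, ProcBytes, BinBytes} -> {ProcBytes, BinBytes}
    after 5000 ->
        timeout
    end.

run() ->
    Small = <<"tiny">>,                     % under 64 bytes: lives on the heap
    Big = binary:copy(<<0>>, 1024 * 1024),  % 1 MiB: stored outside the heap
    {receiver_memory(Small), receiver_memory(Big)}.
```

On a stock BEAM one would expect the footprint reported for the 1 MiB binary to stay far below 1 MiB, because the receiver's heap only holds a small reference to the off-heap data.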

--
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:[hidden email].


Re: Erlang and Memory management

Pierpaolo Bernardi
In reply to this post by Roberto Ostinelli
On Tue, May 31, 2011 at 11:13, Roberto Ostinelli <[hidden email]> wrote:

> Dear list,
>
> I'd like to know some inner insights on how memory is handled in Erlang.
>
> Let's say that I build up a Binary term() in a gen_server, which I store in
> its state, and then send it over to another gen_server which will also store
> this term into its state.
>
> My question is: will this Binary term occupy 2 * memory space or is there
> some kind of pointer mechanism to handle it?

This is explained here:
http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id58893

P.

Re: Erlang and Memory management

Roberto Ostinelli
2011/5/31 Pierpaolo Bernardi <[hidden email]>

> This is explained here:
> http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id58893

I've probably oversimplified my use case.

What happens in the case of lists?

r.


Re: Erlang and Memory management

Gleb Peregud
On Tue, May 31, 2011 at 12:09, Roberto Ostinelli <[hidden email]> wrote:
> What happens in case of lists?

All other data structures are copied when sent, and so occupy memory
twice, although there are special rules for "constant pool" structures.
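
A rough way to see the copy is to compare the size of a list with the heap of the process that received it. This is only a sketch: list_copy_demo is a made-up module name, and erts_debug:flat_size/1 is an undocumented debugging helper in ERTS whose behaviour could change between releases.

```erlang
-module(list_copy_demo).
-export([run/0]).

run() ->
    List = lists:seq(1, 100000),
    %% Size in words that the term needs when copied in full, which is
    %% what sending it to another process has to allocate.
    Words = erts_debug:flat_size(List),
    Parent = self(),
    Pid = spawn(fun() ->
                    receive
                        {store, L} ->
                            erlang:garbage_collect(),
                            {total_heap_size, Heap} =
                                process_info(self(), total_heap_size),
                            %% length/1 runs after the measurement, so the
                            %% received list is still live while we measure.
                            Parent ! {self(), Heap, length(L)}
                    end
                end),
    Pid ! {store, List},
    receive
        {Pid, ReceiverWords, Len} -> {Words, ReceiverWords, Len}
    after 5000 ->
        timeout
    end.
```

The receiver's heap (in words) should come out at least as large as the flat size of the list, while the sender keeps its own copy as well.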

Re: Erlang and Memory management

Pierpaolo Bernardi
In reply to this post by Roberto Ostinelli
On Tue, May 31, 2011 at 12:09, Roberto Ostinelli <[hidden email]> wrote:
> 2011/5/31 Pierpaolo Bernardi <[hidden email]>
>>
>> This is explained here:
>> http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id58893
>>
>
> I've probably tried to oversimplify my user case.
>
> What happens in case of lists?

What Gleb said.

(You can check here:
http://www.erlang.org/doc/efficiency_guide/processes.html#id67714
if you don't trust Gleb  8^)

P.

Re: Erlang and Memory management

Robert Virding-2
In reply to this post by Roberto Ostinelli
I just want to point out that these rules are BEAM specific and not Erlang specific. So on an implementation with a single heap all data is shared. Now there aren't currently many single heap implementations, to be exact only one, erjang (Erlang on the JVM). :-)

Robert


Re: Erlang and Memory management

Kostis Sagonas-2
Robert Virding wrote:
> I just want to point out that these rules are BEAM specific and not Erlang specific.

Well, this is not correct: the BEAM nowhere specifies that its
instruction set has to be implemented using a private heap architecture.
In fact, we successfully used the BEAM instruction set to implement
both a shared heap and a hybrid heap implementation.  See:

   http://user.it.uu.se/~kostis/Papers/heap.pdf

> So on an implementation with a single heap all data is shared. Now there aren't currently many single heap implementations, to be exact only one, erjang (Erlang on the JVM). :-)

There have been more.  We had an OTP system with a shared heap and the
hybrid heap system was part of OTP for quite a while.  IMO, it's too bad
that it was not maintained when Erlang/OTP was extended to support SMP
architectures.

Also, the ETOS (Erlang to Scheme) system was based on a shared heap
architecture.

Kostis

Re: Erlang and Memory management

Paul Meier
I'm in a similar situation.  I've got a few large tries, implemented
with gb_trees.  I keep them in an orddict, which I'm using as the
State variable of a gen_server.  For example:

------
init(_Args) ->
    %% State is an orddict of 3 giant tries, pre-made and fetched from
    %% the filesystem (loading code not shown).
    {ok, State}.

handle_call({keyed_operation_on_large_data, Key}, _From, State) ->
    %% orddict:find/2 returns {ok, Value} or error, not the bare value.
    {ok, GiantTrie} = orddict:find(Key, State),
    %% .... computations using GiantTrie....
    {reply, Result, State}.

I'm finding that my process is crashing due to inability to mmap, and
the erl_crash.dump file tells me that the gen_server's supervisor (or,
on occasion, the error_logger) is the one with a prohibitively large
Stack+Heap.  Also, I never modify the tries or the orddict containing
them in calls to the gen_server: it's all read-only.
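
Before the node crashes, the same kind of information can be pulled from a live system. A small sketch, assuming a hypothetical helper module called mem_top, that lists the processes currently using the most memory:

```erlang
-module(mem_top).
-export([top/1]).

%% Returns up to N processes, largest first, as
%% {Pid, Bytes, RegisteredNameOrUndefined} tuples.
top(N) when is_integer(N), N >= 0 ->
    Infos = [{Pid, Bytes, name(Pid)}
             || Pid <- erlang:processes(),
                %% process_info/2 returns undefined for a process that has
                %% already exited; such entries are filtered out here.
                {memory, Bytes} <- [erlang:process_info(Pid, memory)]],
    Sorted = lists:reverse(lists:keysort(2, Infos)),
    lists:sublist(Sorted, N).

name(Pid) ->
    case erlang:process_info(Pid, registered_name) of
        {registered_name, Name} -> Name;
        _ -> undefined   % [] when unregistered, undefined when dead
    end.
```

Calling mem_top:top(5) in the shell while the server is under load would show whether the supervisor or the error_logger is really the process that grows.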

Is storing the value in an orddict, or using regular terms as opposed
to binaries causing unnecessary copying?  While _I_ know the data is
being used in a read-only fashion, is there a place where copying
might be occurring because the VM doesn't know this?

I was hoping to play with binaries as a next attempt at fixing this
(the links in this thread suggest that large binaries are handled more
conservatively), but this thread appeared fortuitously, and maybe
someone can spot the very likely obvious error that I'm making.

I'm grateful for your help, whether in this thread or the others I've
enjoyed reading ^_^

-Paul



Re: Erlang and Memory management

Michael Truog
A better trie data structure exists that should consume less memory, here:
https://github.com/okeuday/trie


Re: Erlang and Memory management

Robert Virding-2
In reply to this post by Roberto Ostinelli

----- "Kostis Sagonas" <[hidden email]> wrote:

> Robert Virding wrote:
> > I just want to point out that these rules are BEAM specific and not
> > Erlang specific.
>
> Well, this is not correct: the BEAM nowhere specifies that its
> instruction set has to be implemented using a private heap
> architecture.  In fact, we successfully used the BEAM instruction set
> to implement both a shared heap and a hybrid heap implementation.  See:
>
>    http://user.it.uu.se/~kostis/Papers/heap.pdf

Yes, I know that the instruction set says nothing about process heaps. I was referring to the BEAM *IMPLEMENTATION* which is based on separate process heaps and copying messages. The efficiency guides mentioned relate to the implementation and are BEAM specific.

> > So on an implementation with a single heap all data is shared. Now
> > there aren't currently many single heap implementations, to be exact
> > only one, erjang (Erlang on the JVM). :-)
>
> There have been more.  We had an OTP system with a shared heap and the
> hybrid heap system was part of OTP for quite a while.  IMO, it's too
> bad that it was not maintained when Erlang/OTP was extended to support
> SMP architectures.
>
> Also, the ETOS (Erlang to Scheme) system was based on a shared heap
> architecture.

I quite agree with you, I also think a shared heap architecture is the way to go and doing an SMP version would be very interesting. My only gripe with erjang is that it cannot yet use SMPs properly. But I assume they will fix that. A shared heap architecture could change

I suppose my main gripe is really that it is not always clear what is language specific and what is implementation specific.

Robert

Re: Erlang and Memory management

Jack Moffitt
> I quite agree with you, I also think a shared heap architecture is the way to go and doing an SMP version would be very interesting. My only gripe with erjang is that it cannot yet use SMPs properly. But I assume they will fix that. A shared heap architecture could change

Could you elaborate a bit on why you'd prefer a shared heap? I was
under the impression that individual heaps were one of the ways Erlang
achieved soft real-time performance. Have CPUs (or garbage collectors)
improved enough since Erlang made that decision that it's no longer
worth it?

jack.

Re: Erlang and Memory management

Kostis Sagonas-2
In reply to this post by Kostis Sagonas-2
Kostis Sagonas wrote:

> Robert Virding wrote:
>> ...
>> So on an implementation with a single heap all data is shared. Now
>> there aren't currently many single heap implementations, to be exact
>> only one, erjang (Erlang on the JVM). :-)
>
> There have been more.  We had an OTP system with a shared heap and the
> hybrid heap system was part of OTP for quite a while.  IMO, it's too bad
> that it was not maintained when Erlang/OTP was extended to support SMP
> architectures.

Some clarification because the above was slightly ambiguous and I think
it has been misunderstood.  The "it's too bad that ..." sentence of mine
was referring to the *hybrid* heap system, not the shared heap one.

Kostis

Re: Erlang and Memory management

Gleb Peregud
In reply to this post by Paul Meier

Erlang has generational per-process GC. This means that after many read operations, in which you are likely to create new small terms (e.g. a reply tuple with a value), the GC may kick in. The collector copies all reachable terms from the old heap to a new one before discarding the old heap. While that happens both heaps exist, so memory use temporarily doubles, which might be the reason for the crashes.
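
The growth and later reclamation of a process heap can be watched with process_info/2. This is a sketch under assumptions: gc_demo and its helper functions are invented names, and the exact numbers depend on the ERTS version and its heap sizing table.

```erlang
-module(gc_demo).
-export([run/0]).

%% Reports the total heap size (in words) of a fresh process at three
%% points: at start, after building a large temporary list, and after an
%% explicit full collection.
run() ->
    Parent = self(),
    Pid = spawn(fun() ->
                    Before = words(),
                    _ = make_garbage(),
                    Grown = words(),
                    erlang:garbage_collect(),
                    After = words(),
                    Parent ! {self(), Before, Grown, After}
                end),
    receive
        {Pid, B, G, A} -> {B, G, A}
    after 5000 ->
        timeout
    end.

%% Builds a 200000-element list and immediately drops it.
make_garbage() ->
    length(lists:seq(1, 200000)).

words() ->
    {total_heap_size, W} = process_info(self(), total_heap_size),
    W.
```

The middle figure is typically the largest. While a collection is actually running, the runtime additionally needs room for the new heap; that transient doubling is what is described above, and it is not visible through process_info/2.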

Best,
Gleb


Re: Erlang and Memory management

Ulf Wiger
In reply to this post by Jack Moffitt

On 1 Jun 2011, at 01:01, Jack Moffitt wrote:

>> I quite agree with you, I also think a shared heap architecture is the way to go and doing an SMP version would be very interesting. My only gripe with erjang is that it cannot yet use SMPs properly. But I assume they will fix that. A shared heap architecture could change
>
> Could you elaborate a bit on the reason why you'd prefer a shared
> heap? I was under the impression that individual heaps were one of the
> ways Erlang achieved soft real-time performance. Is CPU power enough
> improved (or garbage collectors) since Erlang made that decision that
> it's no longer worth it?

Actually, one of the innovations of the Erlang team (Robert and Joe [1], I believe) was to realise that when no cyclical data structures are allowed, and data structures are immutable, GC is much helped by the fact that all pointers point from newer data to older data, and GC can easily be made re-entrant. Thus, incremental GC fits Erlang very well, and real-time performance need not be sacrificed.

A problem with the old shared-heap implementation was that the incremental GC was never implemented (this, as it turned out, was good news for Scala [2]). A compromise was to introduce a hybrid-heap system, where only data that is actually shared between processes gets put on a shared heap.

As it turns out, the hybrid-heap system ought to be a good compromise in the long term as well, since a copying GC has some advantages  over (reference-counting) shared-heap GC. In particular, reference counting doesn't add much value for non-shared data, but it does increase memory usage.

I am certain Kostis and Robert will correct me if I've uttered any falsehoods. :)

BR,
Ulf W

[1] http://www.erlang.se/publications/memory1995.ps
     One Pass Real-Time Generational Mark-Sweep Garbage Collection.
    Joe Armstrong and Robert Virding.
    International Workshop on Memory Management.
    Kinross, Scotland, September 27-29, 1995.

[2] Instead of joining Ericsson and implementing the shared-heap GC, Erik Stenman became project manager in Martin Odersky's Scala group and helped bring about Scala 1.0.

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com




Re: Erlang and Memory management

Ulf Wiger

On 1 Jun 2011, at 09:09, Ulf Wiger wrote:

> Actually, one of the innovations of the Erlang team (Robert and Joe [1], I believe) was to realise that when no cyclical data structures are allowed, and data structures are immutable, GC is much helped by the fact that all pointers point from newer data to older data, and GC can easily be made re-entrant. Thus, incremental GC fits Erlang very well, and real-time performance need not be sacrificed.

In fact, the Erlang Processor (ECOMP) made use of this fact and ran garbage collection in parallel with program execution. In that way, no CPU cycles were sacrificed for garbage collection, which ought to be about as real-time as it gets. :)

BR,
Ulf W

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com




Re: Erlang and Memory management

Tony Arcieri-3
In reply to this post by Jack Moffitt
On Tue, May 31, 2011 at 5:01 PM, Jack Moffitt <[hidden email]> wrote:
> Could you elaborate a bit on the reason why you'd prefer a shared
> heap? I was under the impression that individual heaps were one of the
> ways Erlang achieved soft real-time performance. Is CPU power enough
> improved (or garbage collectors) since Erlang made that decision that
> it's no longer worth it?

Erlang has to copy terms whenever they're sent in a message, aside from the exceptions noted earlier regarding binaries. With a shared heap, sending messages could be done by reference instead of actually copying data between process heaps.

There are realtime (and even hard realtime) garbage collectors in the Java world which use a shared heap. Azul systems with 768 cores and 500GB+ heaps managed to keep GC pauses in the 10-20ms range, albeit with hardware memory barriers. Perhaps commodity x64 chips will get them at some point in the future as the number of CPU cores continues to grow.

Shared heaps can scale well and eliminate the need to copy data on message sends.

--
Tony Arcieri



Re: Erlang and Memory management

Michael Turner
"Shared heaps can scale well and eliminate the need to copy data on message sends."

The advantages seem obvious, but ... with cache-per-core, there will be copying anyway: into the core's data cache. Immutable data is nice, because you don't have to worry so much about cache coherence.  But how many CPU architectures are mutability-aware, at least at the right level?

Copying a data structure is also an opportunity to construct a "fresh" data structure that's relatively unfragmented. Lower fragmentation of data structures would make cache fetches more efficient, and cache hits more likely, for the receiver.

Probably most Erlang code would be faster with a shared heap, but some code might slow down -- maybe by a lot. Putting certain smarts in Send might help: e.g., if the receiving process isn't ready to run, copy; if it is ready, just pass the pointer to shared memory. I'm sure lots more strategies might be devised. And I'm about equally sure they'll all have their own limits and pitfalls.

-michael turner


Re: Erlang and Memory management

Jesper Louis Andersen-2
On Wed, Jun 1, 2011 at 15:49, Michael Turner
<[hidden email]> wrote:

> The advantages seem obvious, but ... with cache-per-core, there will be
> copying anyway: into the core's data cache. Immutable data is nice, because
> you don't have to worry so much about cache coherence.  But how many CPU
> architectures are mutability-aware, at least at the right level?

Also, you need a copy when you go to another distributed machine
anyway. So suddenly locality will play a role. From a performance
point of view, locality matters.

> Copying a data structure is also an opportunity to construct a "fresh" data
> structure that's relatively unfragmented. Lower fragmentation of data
> structures would make cache fetches more efficient, and cache hits more
> likely, for the receiver.

Garbage collection will usually defragment data anyway. But the
process heap acts as a crude region-based memory manager: when a
process dies, we can *instantly* reset its memory area and give it
back. And that kind of cleanup is really effective. This gives us a
region of memory over which we have some control, due to the lifetime
of the process. In a shared heap, you lose that control and have to
wait until the next GC for the memory to be reclaimed. The fact that
you can tune the GC of each process rather than for the whole VM is
also a pretty good thing.
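
Per-process tuning is done with options to spawn_opt. A minimal sketch: the module name gc_tuning_demo and the particular values are arbitrary and chosen only for illustration.

```erlang
-module(gc_tuning_demo).
-export([run/0]).

%% Spawns a process with its own GC settings and reads them back.
run() ->
    Parent = self(),
    Opts = [{min_heap_size, 10000},  % in words: start with a larger heap
            {fullsweep_after, 0}],   % every collection is a full sweep
    Pid = spawn_opt(fun() ->
                        {garbage_collection, Info} =
                            process_info(self(), garbage_collection),
                        Parent ! {self(), Info}
                    end, Opts),
    receive
        {Pid, GcInfo} ->
            {proplists:get_value(min_heap_size, GcInfo),
             proplists:get_value(fullsweep_after, GcInfo)}
    after 5000 ->
        timeout
    end.
```

The reported min_heap_size may be somewhat larger than the requested one, since the runtime rounds up to one of its own heap sizes. These settings affect only the spawned process, not the rest of the VM.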


--
J.

Re: Erlang and Memory management

Andy W. Song-2
In reply to this post by Paul Meier
A related question, for code like the following:
test(<<255, _T/binary>>) ->
    ok;
test(<<_, T/binary>>) ->
    test(T).

Does Erlang copy the input binary on each recursion? If the input is huge, that would be very inefficient. If so, what should we do?

Thanks
Andy
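
As far as the binary handling chapter of the Efficiency Guide linked earlier in the thread describes it, matching out the tail of a binary produces a sub-binary (or reuses a match context) that points into the original data rather than copying it. A small sketch to check this, using a made-up module bin_scan_demo and a variant of test/1 that returns the remainder so it can be inspected:

```erlang
-module(bin_scan_demo).
-export([find_255/1, run/0]).

%% Same shape as test/1 above, plus a clause for running out of input.
find_255(<<255, Rest/binary>>) -> {found, Rest};
find_255(<<_, Rest/binary>>) -> find_255(Rest);
find_255(<<>>) -> not_found.

run() ->
    %% 1000 zero bytes, one 255 byte, then 1000 one bytes.
    Bin = iolist_to_binary([binary:copy(<<0>>, 1000), 255,
                            binary:copy(<<1>>, 1000)]),
    {found, Rest} = find_255(Bin),
    %% byte_size/1 is the length of Rest itself; referenced_byte_size/1 is
    %% the size of the underlying binary that Rest keeps alive.
    {byte_size(Rest), binary:referenced_byte_size(Rest)}.
```

If the second number is larger than the first, Rest is a view into the original binary and nothing was copied during the scan. The flip side is that a small sub-binary keeps the whole original alive; binary:copy/1 can be used to detach it when that matters.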
