Design of the Erlang VM

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Design of the Erlang VM

Joel Reymont
Would someone kindly point me to existing resources on the design of  
the Erlang VM?

I would like to know if it's a stack-based VM, for example (think  
Forth) without digging through lots of code.

        Thanks, Joel

--
http://wagerlabs.com





_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Bjorn Gustavsson
Joel Reymont <[hidden email]> writes:

> I would like to know if it's a stack-based VM, for example (think  
> Forth) without digging through lots of code.

It is not stack-based; it is register-based.

/Bjorn

--
Björn Gustavsson, Erlang/OTP, Ericsson AB
_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Tony Finch
On Tue, 7 Aug 2007, Bjorn Gustavsson wrote:
>
> It is not stack-based; it is register-based.

I get the impression from a quick look at Joe's HOPL paper that the
registers are special-purpose and implicit, and the instruction format is
suitable for a threaded code interpreter. This implies to me that it isn't
like a CPU instruction set in the way that Lua 5 VM instructions are (see
http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/jucs05.pdf). This is a guess so
I'm probably wrong...

Tony.
--
f.a.n.finch  <[hidden email]>  http://dotat.at/
IRISH SEA: SOUTHERLY, BACKING NORTHEASTERLY FOR A TIME, 3 OR 4. SLIGHT OR
MODERATE. SHOWERS. MODERATE OR GOOD, OCCASIONALLY POOR.
_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Bjorn Gustavsson
Tony Finch <[hidden email]> writes:

> On Tue, 7 Aug 2007, Bjorn Gustavsson wrote:
> >
> > It is not stack-based; it is register-based.
>
> I get the impression from a quick look at Joe's HOPL paper that the
> registers are special-purpose and implicit, and the instruction format is
> suitable for a threaded code interpreter. This implies to me that it isn't
> like a CPU instruction set in the way that Lua 5 VM instructions are (see
> http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/jucs05.pdf). This is a guess so
> I'm probably wrong...

Yes, but Joe described the original JAM (Joe Armstrong Machine) virtual machine.
JAM has been replaced by BEAM.

The BEAM instructions actually are executed by a threaded code interpreter,
but BEAM does not the stack in the way that FORTH does; instead it uses registers,
only putting values that need to survive function calls into a stack frame.

The first word in each instruction points to the executable C code for the
instruction. That first instruction word is followed by zero or more operand
words. The operands can be constants such as atoms or integers, or the number
of the registers to operate on.

/Bjorn

--
Björn Gustavsson, Erlang/OTP, Ericsson AB
_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Joel Reymont

On Aug 8, 2007, at 6:26 AM, Bjorn Gustavsson wrote:

> The first word in each instruction points to the executable C code  
> for the instruction. That first instruction word is followed by  
> zero or more operand words.

Would it be right to say that Erlang VM instructions are C function  
calls? This is my understanding of the above.

Is there significant overhead of function call setup per instruction?

        Thanks, Joel

--
http://wagerlabs.com





_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Bjorn Gustavsson
Joel Reymont <[hidden email]> writes:

> On Aug 8, 2007, at 6:26 AM, Bjorn Gustavsson wrote:
>
> > The first word in each instruction points to the executable C code
> > for the instruction. That first instruction word is followed by
> > zero or more operand words.
>
> Would it be right to say that Erlang VM instructions are C function
> calls? This is my understanding of the above.

No.

We use the "first-class label" in feature in GCC. All instructions
are in the same huge function (process_main()). The addresses are
address to labels within that function. Each instruction is responsible
for picking up the address for the C code for the next instruction and
*jump* to it.

We use an old plain switch statement if the emulator is compiled with a C compiler
that doesn't support first-class labels.

/Bjorn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB
_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Damien Morton-2
Quite a few years back, I did an analysis of Python bytecode execution
traces and the main interpreter loop.

It turns out that there are some sequences of bytecodes that occur more
frequently than others, and that you can structure the main interpreter
loop so that common traces will result in straight through execution,
that is, you test to see if the next instruction is the same as for the
following code block, and either fall through or jump to the appropriate
label.

In my case I took statistics on bytecode sequences of length 4 and
applied simulated annealing to produce a ordering of bytecodes that
minimised the need for jumps, though a good layout can me made by hand.
The result was a small but measurable increase in performance.

Might the same technique be applicable to the Erlang VM.

> Joel Reymont <[hidden email]> writes:
>
>  
>> On Aug 8, 2007, at 6:26 AM, Bjorn Gustavsson wrote:
>>
>>    
>>> The first word in each instruction points to the executable C code
>>> for the instruction. That first instruction word is followed by
>>> zero or more operand words.
>>>      
>> Would it be right to say that Erlang VM instructions are C function
>> calls? This is my understanding of the above.
>>    
>
> No.
>
> We use the "first-class label" in feature in GCC. All instructions
> are in the same huge function (process_main()). The addresses are
> address to labels within that function. Each instruction is responsible
> for picking up the address for the C code for the next instruction and
> *jump* to it.
>
> We use an old plain switch statement if the emulator is compiled with a C compiler
> that doesn't support first-class labels.
>
> /Bjorn
>  

_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: : Design of the Erlang VM

Raimo Niskanen-2
In reply to this post by Joel Reymont
No, they are not C function calls. C function calls push
the arguments (that may come not only from the calling
code, but also from other places, nevertheless there
is explicit code to place them ---) on the processor stack,
pushes the return addresss on the processor stack and then
jumps to the function address. When the function returns
the return instruction pops the return address from the
stack and jumps back to where it came from. Then the
calling code restores the stack pointer to remove the
call arguments from the processor stack.

BEAM instructions are the instruction word followed by
zero or more operand words. The instruction word is
an address to the processor instruction sequence that
implements the BEAM instruction, as Bjorn said below.

The BEAM instructions are just straight processor code
that do not expect arguments on the processor stack
and do not end in a processor return instruction.

The BEAM instructions themselves increment the BEAM
instruction pointer and pick up operand words into
processor registers. They end by reading the next
instruction and since it is a jump address - jump
to that address and then BEAM is performing the next
BEAM instruction.

If the BEAM instruction is a (conditional) jump or call
et.al, the instruction just sets the BEAM instruction
pointer to point to the next BEAM instruction to
actually perform, reads it and jumps.

So there is no call setup overhead per instruction.


On Wed, Aug 08, 2007 at 08:03:46PM +0100, Joel Reymont wrote:

>
> On Aug 8, 2007, at 6:26 AM, Bjorn Gustavsson wrote:
>
> > The first word in each instruction points to the executable C code  
> > for the instruction. That first instruction word is followed by  
> > zero or more operand words.
>
> Would it be right to say that Erlang VM instructions are C function  
> calls? This is my understanding of the above.
>
> Is there significant overhead of function call setup per instruction?
>
> Thanks, Joel
>
> --
> http://wagerlabs.com
>
>
>
>
>
> _______________________________________________
> erlang-questions mailing list
> [hidden email]
> http://www.erlang.org/mailman/listinfo/erlang-questions

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB
_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: : Design of the Erlang VM

Serge Aleynikov-2
Could you shed some light on how non-tail recursive calls are being
handled in the absence of use of the processor stack?

Serge

Raimo Niskanen wrote:

> No, they are not C function calls. C function calls push
> the arguments (that may come not only from the calling
> code, but also from other places, nevertheless there
> is explicit code to place them ---) on the processor stack,
> pushes the return addresss on the processor stack and then
> jumps to the function address. When the function returns
> the return instruction pops the return address from the
> stack and jumps back to where it came from. Then the
> calling code restores the stack pointer to remove the
> call arguments from the processor stack.
>
> BEAM instructions are the instruction word followed by
> zero or more operand words. The instruction word is
> an address to the processor instruction sequence that
> implements the BEAM instruction, as Bjorn said below.
>
> The BEAM instructions are just straight processor code
> that do not expect arguments on the processor stack
> and do not end in a processor return instruction.
>
> The BEAM instructions themselves increment the BEAM
> instruction pointer and pick up operand words into
> processor registers. They end by reading the next
> instruction and since it is a jump address - jump
> to that address and then BEAM is performing the next
> BEAM instruction.
>
> If the BEAM instruction is a (conditional) jump or call
> et.al, the instruction just sets the BEAM instruction
> pointer to point to the next BEAM instruction to
> actually perform, reads it and jumps.
>
> So there is no call setup overhead per instruction.
>
>
> On Wed, Aug 08, 2007 at 08:03:46PM +0100, Joel Reymont wrote:
>> On Aug 8, 2007, at 6:26 AM, Bjorn Gustavsson wrote:
>>
>>> The first word in each instruction points to the executable C code  
>>> for the instruction. That first instruction word is followed by  
>>> zero or more operand words.
>> Would it be right to say that Erlang VM instructions are C function  
>> calls? This is my understanding of the above.
>>
>> Is there significant overhead of function call setup per instruction?
>>
>> Thanks, Joel
>>
>> --
>> http://wagerlabs.com
>>
>>
>>
>>
>>
>> _______________________________________________
>> erlang-questions mailing list
>> [hidden email]
>> http://www.erlang.org/mailman/listinfo/erlang-questions
>

_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Design of the Erlang VM

Richard Carlsson-2
In reply to this post by Damien Morton-2
Damien Morton wrote:
> It turns out that there are some sequences of bytecodes that occur more
> frequently than others, and that you can structure the main interpreter
> loop so that common traces will result in straight through execution,
> that is, you test to see if the next instruction is the same as for the
> following code block, and either fall through or jump to the appropriate
> label.

The Erlang bytecode loader does a large amount of work rewriting the
generic "transport format" bytecode in the object files into the
concrete internal bytecode operations that are actually executed.
This recognizes common sequences and replaces them with optimized
single-opcode versions. (I don't know if the fallthrough trick is
also used, but it wouldn't surprise me.)

     /Richard

_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: : Design of the Erlang VM

Richard Carlsson-2
In reply to this post by Serge Aleynikov-2
Serge Aleynikov wrote:
> Could you shed some light on how non-tail recursive calls are being
> handled in the absence of use of the processor stack?

Every Erlang process has its own stack, for storing Erlang function
call frames. It's just that it's not the same as the C call stack
(which is used by the emulator itself). The emulator code explicitly
handles the processes' stacks, storing and retreiving local variables,
saving and restoring the Beam-code program counter, incrementing and
decrementing the process' stack pointer, etc. The process stack is
basically just an array of words, to be manipulated at will.

     /Richard

_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions