Tail calls of funs not optimized?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Tail calls of funs not optimized?

Ondřej Adamovský-2
Hello.

Out of curiosity, I was looking at BEAM assembler listings of some of my modules. To my great dismay, I noticed tail fun calls seem to not be optimized. Regular function tail calls are translated to single call_last or call_only instruction, but fun tail call is translated to call_fun instruction followed by deallocate and return instructions. Also, the call_fun instruction is the same regardless of its position in the function body.

Can somebody explain to me why? Or if this still is optimized somehow, then how?

I regularly use a recursive cycle of functions formed by recursive tail calls. With the optimization it effectively uses only one stack frame. It is only natural to vary parts of the cycle using funs depending on options provided to the initial call. If this finding is true, the cycle with funs produces at least one stack frame per cycle which can use up quite substantial portion of memory (or even deplete it in endless cycles).

This should have a big red warning in the documentation, but I did not find any mention of this there or anywhere. Tail call optimization is discussed here https://erlang.org/doc/reference_manual/functions.html#tail-recursion in chapter about functions with no mention of funs. Function call is defined here https://erlang.org/doc/reference_manual/expressions.html#function-calls explicitly stating fun call as a possibility.

Best regards,
Ondřej Adamovský
Reply | Threaded
Open this post in threaded view
|

Re: Tail calls of funs not optimized?

Christophe De Troyer
As a lurker of this mailing list Im interested in these listings. Mind sharing them, too?

- Christophe

On 29 March 2021 18:52:02 CEST, "Ondřej Adamovský" <[hidden email]> wrote:
Hello.

Out of curiosity, I was looking at BEAM assembler listings of some of my modules. To my great dismay, I noticed tail fun calls seem to not be optimized. Regular function tail calls are translated to single call_last or call_only instruction, but fun tail call is translated to call_fun instruction followed by deallocate and return instructions. Also, the call_fun instruction is the same regardless of its position in the function body.

Can somebody explain to me why? Or if this still is optimized somehow, then how?

I regularly use a recursive cycle of functions formed by recursive tail calls. With the optimization it effectively uses only one stack frame. It is only natural to vary parts of the cycle using funs depending on options provided to the initial call. If this finding is true, the cycle with funs produces at least one stack frame per cycle which can use up quite substantial portion of memory (or even deplete it in endless cycles).

This should have a big red warning in the documentation, but I did not find any mention of this there or anywhere. Tail call optimization is discussed here https://erlang.org/doc/reference_manual/functions.html#tail-recursion in chapter about functions with no mention of funs. Function call is defined here https://erlang.org/doc/reference_manual/expressions.html#function-calls explicitly stating fun call as a possibility.

Best regards,
Ondřej Adamovský

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Reply | Threaded
Open this post in threaded view
|

Re: Tail calls of funs not optimized?

Björn Gustavsson-4
In reply to this post by Ondřej Adamovský-2
On Tue, Mar 30, 2021 at 8:46 AM Ondřej Adamovský <[hidden email]> wrote:

> Can somebody explain to me why? Or if this still is optimized somehow, then how?

Calls to funs in the tail of a function will be rewritten by the
loader to make the calls tail-recursive.

In OTP 23 or earlier, call erts_debug:df(Mod) to produce a text file
named Mod.dis with the disassembled loaded instructions.

In the upcoming OTP 24 with the JIT enabled, start the runtime system
like this: "erl +asmdump" The generated assembly code for all loaded
modules will be listed in text files with the extension *.asm.

/Björn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB
Reply | Threaded
Open this post in threaded view
|

RE: Tail calls of funs not optimized?

John Högberg
If you're curious about how it's done you can find a quick overview under "Transformation rules" in beam_makeops.md [1]. The specific rule is in ops.tab [2].

Regards,
John Högberg

[1] https://github.com/erlang/otp/blob/7f37605f8bb24519bd51b15184cc27d5583cddb7/erts/emulator/internal_doc/beam_makeops.md 
[2] https://github.com/erlang/otp/blob/7f37605f8bb24519bd51b15184cc27d5583cddb7/erts/emulator/beam/emu/ops.tab#L1034+L1040

-----Original Message-----
From: erlang-questions <[hidden email]> On Behalf Of Björn Gustavsson
Sent: Tuesday, March 30, 2021 10:14 AM
To: Ondřej Adamovský <[hidden email]>
Cc: erlang-questions <[hidden email]>
Subject: Re: Tail calls of funs not optimized?

On Tue, Mar 30, 2021 at 8:46 AM Ondřej Adamovský <[hidden email]> wrote:

> Can somebody explain to me why? Or if this still is optimized somehow, then how?

Calls to funs in the tail of a function will be rewritten by the loader to make the calls tail-recursive.

In OTP 23 or earlier, call erts_debug:df(Mod) to produce a text file named Mod.dis with the disassembled loaded instructions.

In the upcoming OTP 24 with the JIT enabled, start the runtime system like this: "erl +asmdump" The generated assembly code for all loaded modules will be listed in text files with the extension *.asm.

/Björn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB
Reply | Threaded
Open this post in threaded view
|

Re: Tail calls of funs not optimized?

Ondřej Adamovský-2
Thank you for your explanation and links. It is very interesting peek under the hood. I had no idea how much work the loader does. I completely missed this part of the documentation.
This means the generic instruction set does not really reflect what the machine actually does. Good to know. Are there more of these semantics changing transformations between generic and specific instructions?

Best regards,
Ondřej Adamovský

On Tuesday, March 30, 2021 10:24:04 AM (+02:00), John Högberg wrote:

> If you're curious about how it's done you can find a quick overview under "Transformation rules" in beam_makeops.md [1]. The specific rule is in ops.tab [2].
>
> Regards,
> John Högberg
>
> [1] https://github.com/erlang/otp/blob/7f37605f8bb24519bd51b15184cc27d5583cddb7/erts/emulator/internal_doc/beam_makeops.md
> [2] https://github.com/erlang/otp/blob/7f37605f8bb24519bd51b15184cc27d5583cddb7/erts/emulator/beam/emu/ops.tab#L1034+L1040
>
> -----Original Message-----
> From: erlang-questions <[hidden email]> On Behalf Of Björn Gustavsson
> Sent: Tuesday, March 30, 2021 10:14 AM
> To: Ondřej Adamovský <[hidden email]>
> Cc: erlang-questions <[hidden email]>
> Subject: Re: Tail calls of funs not optimized?
>
> On Tue, Mar 30, 2021 at 8:46 AM Ondřej Adamovský <[hidden email]> wrote:
>
> > Can somebody explain to me why? Or if this still is optimized somehow, then how?
>
> Calls to funs in the tail of a function will be rewritten by the loader to make the calls tail-recursive.
>
> In OTP 23 or earlier, call erts_debug:df(Mod) to produce a text file named Mod.dis with the disassembled loaded instructions.
>
> In the upcoming OTP 24 with the JIT enabled, start the runtime system like this: "erl +asmdump" The generated assembly code for all loaded modules will be listed in text files with the extension *.asm.
>
> /Björn
> --
> Björn Gustavsson, Erlang/OTP, Ericsson AB
>