byte_size/1 vs variable access


Loïc Hoguin-2
Hello,

Been wondering something pretty much insignificant recently.

When I need the size of a binary I tend to create a variable, say:

DataSize = byte_size(Data)

and then I reuse this variable in the function.

But considering byte_size/1 is O(1), I am wondering if perhaps that's a
little pointless. Is it even worth creating a variable for this? Is
the variable perhaps optimized out? Are accessing a variable's
contents and calling byte_size/1 perhaps equivalent operations? Or is
the GC that will follow not worth what little is saved by creating a
variable in the first place?

If someone could shed some light on this perhaps I could stop creating
variables like this and simplify my code a little more.

Thank you.

--
Loïc Hoguin
http://ninenines.eu


Bob Ippolito-2
I don't know about the performance implications, but I've found that the
variable is often necessary to work around limitations of the bit matching
syntax. Here's an example:

1> Foo = <<"asdf">>, FooSize = byte_size(Foo).
4
2> <<Foo:(byte_size(Foo))/binary, Rest/binary>> = <<"asdfghi">>.
* 1: illegal bit size
3> <<Foo:FooSize/binary, Rest/binary>> = <<"asdfghi">>.
<<"asdfghi">>




Björn Gustavsson-3
In reply to this post by Loïc Hoguin-2
On Tue, Feb 25, 2014 at 10:56 PM, Loïc Hoguin <essen> wrote:

> But considering byte_size/1 is O(1) I am wondering if perhaps that's a
> little pointless. Is it even worth creating a variable for this? Is perhaps
> the variable optimized out? Perhaps accessing a variable contents and
> calling byte_size/1 are equivalent operations? Or the GC that will follow is
> not worth what little is saved by creating a variable in the first place?
>
> If someone could shed some light on this perhaps I could stop creating
> variables like this and simplify my code a little more.
>

There is definitely more overhead calling a BIF than accessing a variable.

That said, I doubt that you will be able to notice the difference in a
real program.
So I suggest that you write in the way that you find most readable.

/Bjorn

--
Björn Gustavsson, Erlang/OTP, Ericsson AB
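
For anyone who wants to measure the BIF-versus-variable overhead discussed above on their own machine, a rough micro-benchmark could look like the sketch below. The module name `size_bench`, the function names and the iteration count are all made up for illustration; only `timer:tc/1`, `binary:copy/2` and `byte_size/1` are standard. Results from a tight loop like this will vary by OTP release and hardware and say little about a real program.

```erlang
-module(size_bench).
-export([run/0, run/2]).

%% Hypothetical micro-benchmark; names and counts are illustrative only.
run() ->
    run(binary:copy(<<"x">>, 1024), 10000000).

%% Times both loops and checks that they compute the same sum.
run(Bin, N) ->
    {BifUs, Sum} = timer:tc(fun() -> loop_bif(Bin, N, 0) end),
    {VarUs, Sum} = timer:tc(fun() -> loop_var(byte_size(Bin), N, 0) end),
    {{bif_us, BifUs}, {var_us, VarUs}}.

%% Calls byte_size/1 on every iteration.
loop_bif(_Bin, 0, Acc) -> Acc;
loop_bif(Bin, N, Acc) -> loop_bif(Bin, N - 1, Acc + byte_size(Bin)).

%% Reuses a size that the caller computed once.
loop_var(_Size, 0, Acc) -> Acc;
loop_var(Size, N, Acc) -> loop_var(Size, N - 1, Acc + Size).
```

Calling `size_bench:run().` from the shell returns the two timings in microseconds; running it several times helps smooth out noise.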


Loïc Hoguin-2
Thanks.

Is there any optimization when it's done inside a function clause guard?
For example, say 5 of my 6 clauses need to check byte_size(Bin) to
decide what to do. I am reading some code I wrote a few days ago and I
see I instinctively used a single clause, creating a variable to hold
the size and then using an if inside it. Perhaps the compiler is doing
something about it in this case?

I know it doesn't matter 99% of the time, but I got one or two modules
that deal with binaries where the smallest improvement means I can
handle a bunch more traffic. To be honest it would be really nice if the
compiler would automatically create a variable when it sees me use
byte_size/1 more than once as I can then stop writing all this extra
code. And I'm guessing it could probably do the same with length/1 and
others.


--
Loïc Hoguin
http://ninenines.eu
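
The two code shapes described in the message above might look roughly like this. It is only a sketch: the module name `size_dispatch`, the thresholds and the returned atoms are invented for illustration and are not taken from the code being discussed.

```erlang
-module(size_dispatch).
-export([by_guards/1, by_variable/1]).

%% Shape 1: several clauses, each calling byte_size/1 in its guard.
by_guards(Bin) when byte_size(Bin) =:= 0 -> empty;
by_guards(Bin) when byte_size(Bin) < 16 -> small;
by_guards(Bin) when byte_size(Bin) < 1024 -> medium;
by_guards(Bin) when is_binary(Bin) -> large.

%% Shape 2: a single clause that computes the size once
%% and then branches on the variable with an if.
by_variable(Bin) when is_binary(Bin) ->
    Size = byte_size(Bin),
    if
        Size =:= 0 -> empty;
        Size < 16 -> small;
        Size < 1024 -> medium;
        true -> large
    end.
```

For binary input both functions return the same atom; the difference is only in how many times byte_size/1 may be evaluated before a clause is selected.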


Björn Gustavsson-3
On Wed, Feb 26, 2014 at 1:42 PM, Loïc Hoguin <essen> wrote:
> Thanks.
>
> Is there any optimization when it's done inside a function clause guard? For
> example say, 5 of my 6 clauses need to check byte_size(Bin) to decide what
> to do. I am reading some code I wrote a few days ago and I see I
> instinctively used a single clause, creating a variable to hold the size and
> then used an if inside it. Perhaps the compiler is doing something about it
> in this case?

No. The compiler currently does not do much optimization of guards.

> I know it doesn't matter 99% of the time, but I got one or two modules that
> deal with binaries where the smallest improvement means I can handle a bunch
> more traffic. To be honest it would be really nice if the compiler would
> automatically create a variable when it sees me use byte_size/1 more than
> once as I can then stop writing all this extra code. And I'm guessing it
> could probably do the same with length/1 and others.
>

Is any of that code open-source? I occasionally look
through BEAM assembly code looking for things
that the BEAM compiler could handle better. Pointers
to real-world code that is performance critical
are appreciated.

/Bjorn

--
Björn Gustavsson, Erlang/OTP, Ericsson AB


Erik Søe Sørensen-3
In reply to this post by Loïc Hoguin-2
As far as I recall, guards aren't handled cleverly - common parts do not
result in shared code.

Regarding GC: Variables are on the stack and are released as soon as their
lifetime is done, so any extra GC-effect caused by the heap growing into
the stack sooner would presumably be small except if the variable in
question had a lifetime across a recursive call. I guess you'd have to
measure...


Loïc Hoguin-2
In reply to this post by Loïc Hoguin-2
What I want is to write the most readable code possible and have the
compiler make it run as fast as possible, using hints that could be
provided in the form of compiler flags (possibly local to a module).
That's pretty much what the compiler is already doing with other things.

I understand I can't defer everything to the compiler, but considering
Erlang is immutable it shouldn't be too hard to automate that kind of
optimization, and be allowed to write tighter and cleaner code while
still having these tiny improvements.

On 02/26/2014 02:23 PM, Aidan Hobson Sayers wrote:

> If your performance is on a knife edge such that var lookup vs bif call
> is important then perhaps you wouldn't want the compiler to blindly
> perform this optimisation?
> E.g. if your first clause happens in (say) 99% of cases and the compiler
> tries to be clever, you're doing
>   - bif call
>   - variable store
>   - variable access
> when maybe you'd rather it do the multiple bif calls if it had to
> because 99% of the time it only needs to do one.
>
> But that would depend on how much faster var access is compared to the
> bif call. If it's 2x faster you definitely don't want the 'optimisation'
> above all the time. If it's 1000x faster then you want this optimisation
> for the 99% of the time case above...but not if the first clause happens
> 99.999% of the time.
>
> As with all things performance, benchmarks are the only solution.
>
> Aidan

--
Loïc Hoguin
http://ninenines.eu


Loïc Hoguin-2
In reply to this post by Björn Gustavsson-3
On 02/26/2014 02:30 PM, Björn Gustavsson wrote:

> On Wed, Feb 26, 2014 at 1:42 PM, Loïc Hoguin <essen> wrote:
>> Thanks.
>>
>> Is there any optimization when it's done inside a function clause guard? For
>> example say, 5 of my 6 clauses need to check byte_size(Bin) to decide what
>> to do. I am reading some code I wrote a few days ago and I see I
>> instinctively used a single clause, creating a variable to hold the size and
>> then used an if inside it. Perhaps the compiler is doing something about it
>> in this case?
>
> No. The compiler currently does not do much optimization of guards.

Shame.

>> I know it doesn't matter 99% of the time, but I got one or two modules that
>> deal with binaries where the smallest improvement means I can handle a bunch
>> more traffic. To be honest it would be really nice if the compiler would
>> automatically create a variable when it sees me use byte_size/1 more than
>> once as I can then stop writing all this extra code. And I'm guessing it
>> could probably do the same with length/1 and others.
>>
>
> Is any of that code open-source? I occasionally look
> through BEAM assembly code looking for things
> that the BEAM compiler could handle better. Pointers
> to real-world code that is performance critical
> are appreciated.

That particular byte_size/1 related code isn't online yet. I am
finishing it up. I do have two very performance critical modules online,
one that I optimized to death and then some, here:

  https://github.com/extend/cowboy/blob/master/src/cowboy_protocol.erl

It uses a single binary match context for the whole thing (as far as I
can tell, anyway).

And one where I simply reimplemented my own supervisor here:

  https://github.com/extend/ranch/blob/master/src/ranch_conns_sup.erl

If you can make any of these faster than they currently are, then I
believe you will make a big part of the Erlang community happy.

--
Loïc Hoguin
http://ninenines.eu


Fred Hebert
In reply to this post by Loïc Hoguin-2
Everything is immutable, but it is not side-effect free. Two things to
consider:

- Only allowing it in guards to avoid side-effects that could be
  triggered at the wrong moment. If you allow it with any code, then
  multiple function calls being moved around can change the number of
  messages sent or received within that function
- Messing up tracing of function calls where the file and the resulting
  code no longer match, breaking expectations.
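
The first point can be illustrated with a small sketch. The module `cse_demo` and its functions are hypothetical and exist only to show why merging repeated calls is safe for a pure BIF but not for arbitrary functions.

```erlang
-module(cse_demo).
-export([pure/1, impure/1]).

%% byte_size/1 has no side effects, so evaluating it once and
%% reusing the result would not change what this function does.
pure(Bin) ->
    byte_size(Bin) + byte_size(Bin).

%% notify/1 sends a message each time it is called. Merging the
%% two calls into one would halve the messages the receiver sees.
impure(Pid) ->
    notify(Pid) + notify(Pid).

%% Hypothetical helper with a visible side effect.
notify(Pid) ->
    Pid ! ping,
    1.
```

Calling `cse_demo:impure(self())` leaves two `ping` messages in the caller's mailbox, which is the observable behaviour an automatic rewrite would have to preserve.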



Loïc Hoguin-2
I do not believe there's an issue for the two I mentioned, byte_size/1
and length/1. The only difference is creating an extra variable to hold
the result of the first call, and reusing that afterwards. If the first
call doesn't fail, then all subsequent calls are guaranteed to succeed,
so it doesn't mess up the stacktrace. As far as tracing goes, it could
have the same behavior as inlined functions. It of course wouldn't be
enabled by default either.


--
Loïc Hoguin
http://ninenines.eu