Erlang list equality check question

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Erlang list equality check question

yorda
Dear erlang community,

I’ve been reading the beam book recently, just learned the memory representation of a list is just a con cell which occupy a machine word which 2 bit tag. So It’s make sense that appending a new element on a big list is a small operation. timer:tc/1 shows X = lists:seq(1, 10000) takes 1184us and X2 = [0 | X] only takes 2 us.

But X =:= X1 takes 600us, which shows it walks the whole list to do the equality check, my question is why not just do a cheap con cell immediate comparison since erlang variables don’t vary and tl(X2) points to X1. Is there any method to do the cons equality comparison?


mko_io
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Erlang list equality check question

Pierre Fenoll-2
This sounds like a it could be compiler optimization. 
However I think it would be better to have a lint tool suggesting to rewrite that equality comparison by extracting the head and comparing only that. Maybe code from Dialyzer could be reused to find such cases and others, providing performance tips as well as style and maybe even architectural suggestions. 

I think Erlang needs a linter. Do you guys know any such tool? Kostis has one that rewrites calls to lists:map/2 into list comprehensions but I can’t find it right now. 

On Sat 24 Mar 2018 at 23:47, mko_io <[hidden email]> wrote:
Dear erlang community,

I’ve been reading the beam book recently, just learned the memory representation of a list is just a con cell which occupy a machine word which 2 bit tag. So It’s make sense that appending a new element on a big list is a small operation. timer:tc/1 shows X = lists:seq(1, 10000) takes 1184us and X2 = [0 | X] only takes 2 us.

But X =:= X1 takes 600us, which shows it walks the whole list to do the equality check, my question is why not just do a cheap con cell immediate comparison since erlang variables don’t vary and tl(X2) points to X1. Is there any method to do the cons equality comparison?


mko_io
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
--

Cheers,
-- 
Pierre Fenoll


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Erlang list equality check question

Oliver Korpilla
Hello.

I'm trying to understand this topic as well.

I was under the impression that if you generated two lists independently from each but with identical elements they would not share memory, but let's say you took part of an existing list or append to an existing list they do. If this assumption is true, list comparison could only short circuit for lists indeed sharing memory.

And it seems it does (latest OTP 20.3 used on arm):

8> List1 = lists:seq(1,10000).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
9> List2 = lists:seq(1,10000).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
10> timer:tc(fun() -> List1 =:= List2 end).
{279,true}
11> timer:tc(fun() -> List1 =:= List1 end).
{20,true}
12> List3 = [ 0 | List1 ].
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28|...]
13> timer:tc(fun() -> List1 =:= List3 end).
{22,false}
14> [ _ | List4 ] = List3.
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28|...]
15> timer:tc(fun() -> List1 =:= List4 end).
{23,true}

10> compares two independently generated lists and does a full walk, it seems.
11> compares the same con cell and performs much better.
13> up to 15> compares two lists that share con cells and performs much better

It seems that BEAM already short circuits list walks if it finds a shared con cell.

I'm a bit confused about the example mko_io gives though as te definition of X1 is not given? Also, what system were you testing on with what OTP version? Your numbers seem much worse than even on my Raspberry Pi 3.

Cheers,
Oliver 
 

Gesendet: Sonntag, 25. März 2018 um 14:04 Uhr
Von: "Pierre Fenoll" <[hidden email]>
An: mko_io <[hidden email]>
Cc: [hidden email]
Betreff: Re: [erlang-questions] Erlang list equality check question

This sounds like a it could be compiler optimization. 
However I think it would be better to have a lint tool suggesting to rewrite that equality comparison by extracting the head and comparing only that. Maybe code from Dialyzer could be reused to find such cases and others, providing performance tips as well as style and maybe even architectural suggestions. 
 
I think Erlang needs a linter. Do you guys know any such tool? Kostis has one that rewrites calls to lists:map/2 into list comprehensions but I can’t find it right now. 
 

On Sat 24 Mar 2018 at 23:47, mko_io <[hidden email][mailto:[hidden email]]> wrote:Dear erlang community,

I’ve been reading the beam book recently, just learned the memory representation of a list is just a con cell which occupy a machine word which 2 bit tag. So It’s make sense that appending a new element on a big list is a small operation. timer:tc/1 shows X = lists:seq(1, 10000) takes 1184us and X2 = [0 | X] only takes 2 us.

But X =:= X1 takes 600us, which shows it walks the whole list to do the equality check, my question is why not just do a cheap con cell immediate comparison since erlang variables don’t vary and tl(X2) points to X1. Is there any method to do the cons equality comparison?


mko_io
_______________________________________________
erlang-questions mailing list
[hidden email][mailto:[hidden email]]
http://erlang.org/mailman/listinfo/erlang-questions--

 
Cheers,
-- 
Pierre Fenoll
 _______________________________________________ erlang-questions mailing list [hidden email] http://erlang.org/mailman/listinfo/erlang-questions[http://erlang.org/mailman/listinfo/erlang-questions]
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Erlang list equality check question

Roman Galeev
> I think Erlang needs a linter.

Definitely. Generally speaking, a code rewrite tool (to unquote atoms ;))

> list comprehensions but I can’t find it right now.

It's closed source now, and afaik is not ready for production.

On Sun, Mar 25, 2018 at 2:49 PM, Oliver Korpilla <[hidden email]> wrote:

>
> Hello.
>
> I'm trying to understand this topic as well.
>
> I was under the impression that if you generated two lists independently from each but with identical elements they would not share memory, but let's say you took part of an existing list or append to an existing list they do. If this assumption is true, list comparison could only short circuit for lists indeed sharing memory.
>
> And it seems it does (latest OTP 20.3 used on arm):
>
> 8> List1 = lists:seq(1,10000).
> [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>  23,24,25,26,27,28,29|...]
> 9> List2 = lists:seq(1,10000).
> [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>  23,24,25,26,27,28,29|...]
> 10> timer:tc(fun() -> List1 =:= List2 end).
> {279,true}
> 11> timer:tc(fun() -> List1 =:= List1 end).
> {20,true}
> 12> List3 = [ 0 | List1 ].
> [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>  23,24,25,26,27,28|...]
> 13> timer:tc(fun() -> List1 =:= List3 end).
> {22,false}
> 14> [ _ | List4 ] = List3.
> [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>  23,24,25,26,27,28|...]
> 15> timer:tc(fun() -> List1 =:= List4 end).
> {23,true}
>
> 10> compares two independently generated lists and does a full walk, it seems.
> 11> compares the same con cell and performs much better.
> 13> up to 15> compares two lists that share con cells and performs much better
>
> It seems that BEAM already short circuits list walks if it finds a shared con cell.
>
> I'm a bit confused about the example mko_io gives though as te definition of X1 is not given? Also, what system were you testing on with what OTP version? Your numbers seem much worse than even on my Raspberry Pi 3.
>
> Cheers,
> Oliver
>  
>
> Gesendet: Sonntag, 25. März 2018 um 14:04 Uhr
> Von: "Pierre Fenoll" <[hidden email]>
> An: mko_io <[hidden email]>
> Cc: [hidden email]
> Betreff: Re: [erlang-questions] Erlang list equality check question
>
> This sounds like a it could be compiler optimization.
> However I think it would be better to have a lint tool suggesting to rewrite that equality comparison by extracting the head and comparing only that. Maybe code from Dialyzer could be reused to find such cases and others, providing performance tips as well as style and maybe even architectural suggestions.
>  
> I think Erlang needs a linter. Do you guys know any such tool? Kostis has one that rewrites calls to lists:map/2 into list comprehensions but I can’t find it right now.
>  
>
> On Sat 24 Mar 2018 at 23:47, mko_io <[hidden email][mailto:[hidden email]]> wrote:Dear erlang community,
>
> I’ve been reading the beam book recently, just learned the memory representation of a list is just a con cell which occupy a machine word which 2 bit tag. So It’s make sense that appending a new element on a big list is a small operation. timer:tc/1 shows X = lists:seq(1, 10000) takes 1184us and X2 = [0 | X] only takes 2 us.
>
> But X =:= X1 takes 600us, which shows it walks the whole list to do the equality check, my question is why not just do a cheap con cell immediate comparison since erlang variables don’t vary and tl(X2) points to X1. Is there any method to do the cons equality comparison?
>
>
> mko_io
> _______________________________________________
> erlang-questions mailing list
> [hidden email][mailto:[hidden email]]
> http://erlang.org/mailman/listinfo/erlang-questions--
>
>  
> Cheers,
> --
> Pierre Fenoll
>  _______________________________________________ erlang-questions mailing list [hidden email] http://erlang.org/mailman/listinfo/erlang-questions[http://erlang.org/mailman/listinfo/erlang-questions]
> _______________________________________________
> erlang-questions mailing list
> [hidden email]
> http://erlang.org/mailman/listinfo/erlang-questions




--
With best regards,
     Roman Galeev,
     +420 702 817 968

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Erlang list equality check question

Jesper Louis Andersen-2
In reply to this post by yorda
On Sat, Mar 24, 2018 at 11:47 PM mko_io <[hidden email]> wrote:

But X =:= X1 takes 600us, which shows it walks the whole list to do the equality check, my question is why not just do a cheap con cell immediate comparison since erlang variables don’t vary and tl(X2) points to X1. Is there any method to do the cons equality comparison?


Normally, an equality check runs through three levels:

1. Pointer equality
2. Type analysis. Eterms having different tags must be different.
3. Value analysis, perhaps recursion/congruence checks for the subterms.

So in principle, if your X and X1 are equal at the pointer level, they should stop dong a comparison. What is a bit more likely though is that you are checking in the shell and that the shell is interpreted. So it might react a bit differently. Also, you did not tell us what X1 is, so depending on how you generated that, it might be an entirely new set of data (and this requires a deep comparison with recursion then).

The Erlang system does no hash-consing, so if you generate new data, comparisons will always be expensive.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions