efficiency, deep lists


Ulf Wiger-4

Hi,

I've been playing around with continuation passing style when
processing lists of lists. It does seem to be a bit cheaper than
the alternative styles. If funs are optimized in R8, it should be
even more so.


Example:

I wrote four different ways to increment all elements in a list
of lists of integers (lists:duplicate(50, lists:seq(1,50))).

The outcome on my machine was as follows:

- incr1/1: 964 us
- incr2/1: 814 us
- incr3/1: 951 us
- incr4/1: 914 us


The winner, albeit by a hair, was:

incr2([H|T]) ->
   incr22(H, fun() -> incr2(T) end);
incr2([]) ->
   [].

incr22([H|T], F) ->
   [H+1|incr22(T, F)];
incr22([], F) ->
   F().


So, question to the community. Does this style of programming
strike you as offensive, or do you kind of like it?

Personally, I feel that the other contenders only come close
because ++ has been optimized in C, so I kind of like the
continuation passing style because it is cheap by design.  ;-)

/Uffe
****************************

-module(continuation).
-author('etxuwig').

-compile(export_all).

mk_list() ->
    lists:duplicate(50, lists:seq(1,10)).

incr1([H|T]) when list(H) ->
    incr11(H) ++ incr1(T);
incr1([]) ->
    [].

incr11([H|T]) ->
    [H+1|incr11(T)];
incr11([]) ->
    [].

incr2([H|T]) when list(H) ->
    incr22(H, fun() -> incr2(T) end);
incr2([]) ->
    [].

incr22([H|T], F) ->
    [H+1|incr22(T, F)];
incr22([], F) ->
    F().

incr3(List) ->
    lists:foldl(
      fun(L, Acc) ->
              incr11(L) ++ Acc
      end, [], List).

incr4(List) ->
    incr11(lists:concat(List)).
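
A rough way to reproduce timings like the above is a small timer:tc/3
wrapper; a sketch (time_all/0 is my name and not part of the original
module):

%% Sketch only: time each incrN/1 once on the same input using
%% timer:tc/3; the first element of the result tuple is the time in
%% microseconds. A single run is noisy; averaging several runs gives
%% steadier numbers.
time_all() ->
    L = mk_list(),
    [{F, element(1, timer:tc(?MODULE, F, [L]))}
     || F <- [incr1, incr2, incr3, incr4]].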



--
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB





efficiency, deep lists

Luke Gorrie-3
Ulf Wiger <etxuwig> writes:

> I've been playing around with continuation passing style when
> processing lists of lists. It does seem to be a bit cheaper than
> the alternative styles. If funs are optimized in R8, it should be
> even more so.

I'll throw in another continuation-based version (for any-depth
lists):

  incr5(L) -> incr55(L, []).

  %% incr55(DeepList, [Continuation])
  incr55([H|T], C) when list(H) ->
      incr55(H, [T|C]);
  incr55([H|T], C) ->
      [H+1|incr55(T, C)];
  incr55([], [C|Cs]) ->
      incr55(C, Cs);
  incr55([], []) ->
      [].

The basic difference here is the trick that most of the time you can
use a data structure instead of a fun for a continuation - which is
perhaps a bit easier to read in crash reports :-). I think that if you
did the any-depth implementation with funs, you'd have a chain of them
each calling the next, which is the same basic thing as my list of
continuations - which is fine too.
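
For example (a sketch, assuming incr5/incr55 have been pasted into
Ulf's continuation module):

  1> c(continuation).
  {ok,continuation}
  2> continuation:incr5([1, [2, [3]], 4]).
  [2,3,4,5]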

I don't know about the relative efficiency; timer:tc is giving me wide
ranges of numbers today. I would *guess* it's slightly cheaper since
it conses instead of making a fun, and makes a local call instead of
calling a fun.
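
One way to tame that variance is to time many repetitions under a
single timer:tc/3 call and take the mean. A sketch (avg_tc/4 and
repeat/4 are my names; they assume a module compiled with export_all,
e.g. Ulf's continuation module, so timer:tc can call repeat/4):

  %% Time N repeated calls of apply(M, F, A) under one timer:tc/3 and
  %% return the mean time per call, in microseconds.
  avg_tc(N, M, F, A) ->
      {T, ok} = timer:tc(?MODULE, repeat, [N, M, F, A]),
      T / N.

  repeat(0, _M, _F, _A) -> ok;
  repeat(N, M, F, A) ->
      apply(M, F, A),
      repeat(N - 1, M, F, A).

For instance, avg_tc(1000, continuation, incr2, [continuation:mk_list()])
should give a steadier per-call figure than a single timer:tc run.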

--
Favourite Haiku error: Three things are certain:
                       Death, taxes, and lost data.
                       Guess which has occurred.



efficiency, deep lists

Luke Gorrie-3
Luke Gorrie <luke> writes:

> Ulf Wiger <etxuwig> writes:
>
> > I've been playing around with continuation passing style when
> > processing lists of lists. It does seem to be a bit cheaper than
> > the alternative styles. If funs are optimized in R8, it should be
> > even more so.
>
> I'll throw in another continuation-based version (for any-depth
> lists):

For benchmarking, here's the list-of-lists version, which is just
Ulf's incr2 with the recursive call moved from where the continuation
is created to where the continuation is invoked:

  incr6([H|T]) when list(H) ->
      incr66(H, T);
  incr6([]) ->
      [].

  incr66([H|T], C) ->
      [H+1|incr66(T, C)];
  incr66([], C) ->
      incr6(C).

Ulf, I'd be interested to know how it runs in the benchmark, since you
seem to have a way of timing things better than us (maybe a Solaris
thing?).

--
Favourite Haiku error: Three things are certain:
                       Death, taxes, and lost data.
                       Guess which has occurred.



efficiency, deep lists

Richard Carlsson-4

On 2 Oct 2001, Luke Gorrie wrote:

> Luke Gorrie <luke> writes:

> For benchmarking, here's the list-of-lists version, which is just
> Ulf's incr2 with the recursive call moved from where the continuation
> is created to where the continuation is invoked:

This was interesting, so I did some benchmarks of my own. The results
seem to have little variance, and I think they can be trusted:

                Time   GC     Reclaimed
        incr1:  11100  40000  2.1e7   ;naive
        incr2:  7200   25005  1.5e7   ;fun-continuation
        incr3:  6800   10014  2.0e7   ;bad foldl-version
        incr3': 11900  44000  4.7e7   ;corrected foldl-version
        incr4:  7400   5003   2.1e7   ;flatten first
        incr5:  9000   30001  1.1e7   ;continuation as data
        incr6:  5200   10002  1.0e7   ;list-of-lists only
        incr7:  9200   53342  2.4e7   ;tail recursive version of 5
        incr8:  7600   26668  2.3e7   ;tail recursive version of 6
        incr9:  8000   20738  1.6e7   ;fun-continuation (any depth)

Explanations: incr3 was incorrectly implemented - the returned list is
in the wrong order, and since it did not compute the same function, it
could execute more efficiently than incr1, even though they seem to be
very similar at first glance. The corrected incr3' looks as follows:

        incr3(List) ->
            lists:reverse(lists:foldl(
                    fun(L, Acc) ->
                            lists:reverse(incr11(L)) ++ Acc
                    end, [], List)).

and as can be seen above, it is comparable to incr1.

incr5 and incr6 are those implemented by Luke. As the table shows, incr6
is clearly the fastest implementation, but note that incr2 is not bad in
comparison to incr1 and incr4. incr7 and incr8 are tail-recursive
versions of Luke's implementations, using an accumulator argument and
reversing the final list before returning. The measurements show that
there is apparently no advantage to that technique in this case.

The really interesting version is incr9, which both uses funs as
continuations and handles lists of any depth. The code is very elegant,
and as you can see from the table, it is actually more efficient than
incr5.

        /Richard

Code for incr7-9:

 incr7(L) -> lists:reverse(incr77(L, [], [])).

 %% incr77(DeepList, [Continuation], [Accumulator])
 incr77([H|T], C, A) when list(H) ->
     incr77(H, [T|C], A);
 incr77([H|T], C, A) ->
     incr77(T, C, [H+1|A]);
 incr77([], [C|Cs], A) ->
     incr77(C, Cs, A);
 incr77([], [], A) ->
     A.


 incr8(L) -> lists:reverse(incr88(L, [])).

 incr88([H|T], A) when list(H) ->
     incr888(H, T, A);
 incr88([], A) ->
     A.

 incr888([H|T], C, A) ->
     incr888(T, C, [H+1|A]);
 incr888([], C, A) ->
     incr88(C, A).


 incr9(L) -> incr99(L, fun () -> [] end).

 incr99([H|T], C) when list(H) ->
     incr99(H, fun () -> incr99(T, C) end);
 incr99([H|T], C) ->
     [H+1|incr99(T, C)];
 incr99([], C) ->
     C().
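
For instance (a sketch, assuming incr9/incr99 have been added to a
module such as Ulf's continuation module), incr9 handles arbitrary
nesting, including empty sublists:

  1> continuation:incr9([[1,[2]], [], [3,[4,[5]]]]).
  [2,3,4,5,6]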



Richard Carlsson (richardc)   (This space intentionally left blank.)
E-mail: Richard.Carlsson WWW: http://www.csd.uu.se/~richardc/




efficiency, deep lists

Ulf Wiger-4
On Tue, 2 Oct 2001, Richard Carlsson wrote:

>This was interesting, so I did some benchmarks of my own. The
>results seem to have little variance, and I think they can be
>trusted:

[solid-looking numbers indicating that CPS is not too bad]


What I was getting at was that it appeared as if continuation
passing style actually led to pretty good performance, but I
couldn't recall having seen much of it in Erlang.


The real function I rewrote was not (increment on deep lists),
but the following (slightly modified to protect the innocent):

----------------------------------------------------
get_values(Keys) when list(Keys) ->
    lists:foldl(
      fun(K, Acc) ->
              get_vals(K) ++ Acc
      end, [], Keys).

get_vals(Key) ->
    [{_, Points}] = ets:lookup(mytab, Key),
    lists:map(
      fun({point, Value}) ->
              Value
      end,
      Points).
----------------------------------------------------

which I rewrote thus:


----------------------------------------------------
get_values2([Key|Keys]) ->
   get_vals2(Key, fun() -> get_values2(Keys) end);
get_values2([]) ->
   [].

get_vals2(Key, Cont) ->
    [{_, Points}] = ets:lookup(mytab, Key),
    get_values2(Points, Cont).

get_values2([{point, Value}|Points], Cont) ->
    [Value | get_values2(Points, Cont)];
get_values2([], Cont) ->
    Cont().

----------------------------------------------------

Here's a helper function that I used to populate a table.

----------------------------------------------------
fill_tab() ->
    ets:new(mytab, [set, public, named_table]),
    Ps = lists:seq(1,10),
    Points = [{point, V} || V <- Ps],
    lists:foreach(
      fun(P) ->
              Obj = {P, Points},
              ets:insert(mytab, Obj)
      end, Ps).
----------------------------------------------------


My version seems to be about 50% faster in a sloppy benchmark.

In this case, the input is not a deep list, because we don't want
to read it all onto the heap to begin with. The continuation
passing style still works fine.
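
A sketch of what such a comparison might look like (not from the
original post; it assumes both versions plus fill_tab/0 are in one
module compiled with export_all):

----------------------------------------------------
%% Sketch only. fill_tab/0 creates a named table, so run this once in
%% a fresh node (or delete mytab between runs).
bench(Keys) ->
    fill_tab(),
    {T1, R1} = timer:tc(?MODULE, get_values,  [Keys]),
    {T2, R2} = timer:tc(?MODULE, get_values2, [Keys]),
    true = lists:sort(R1) == lists:sort(R2),  % same values, order aside
    {T1, T2}.
----------------------------------------------------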

/Uffe
--
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB






efficiency, deep lists

Ulf Wiger-4
On Tue, 2 Oct 2001, Ulf Wiger wrote:

>get_values2([Key|Keys]) ->
>   get_vals2(Key, fun() -> get_values2(Keys) end);
>get_values2([]) ->
>   [].
>
>get_vals2(Key, Cont) ->
>    [{_, Points}] = ets:lookup(mytab, Key),
>    get_values2(Points, Cont).
>
>get_values2([{point, Value}|Points], Cont) ->
>    [Value | get_values2(Points, Cont)];
>get_values2([], Cont) ->
>    Cont().


Never mind the weird choice of function names. It was a quick
copy-and-paste job to change the original names to something more
general.... I should have left it as it was.

/Uffe
--
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB




efficiency, deep lists

Richard Carlsson-4
In reply to this post by Ulf Wiger-4

On Tue, 2 Oct 2001, Ulf Wiger wrote:

> What I was getting at was that it appeared as if continuation
> passing style actually led to pretty good performance, but I
> couldn't recall having seen much of it in Erlang.

Precisely. Probably this is because in the dark ages (i.e., up until a
year or two ago), Erlang funs were rather inefficiently implemented; in
particular, a fun application was a lot slower (at least 7-8 times) than
a normal function call, in the early Beam implementations. Today, fun
applications are only ~3 times slower than a local function call (local
calls have also been made faster since then, so this is really quite a
good number). Furthermore, creating a fun value was a bit expensive in
R6 - compared to creating a tuple of comparable size - but in R8, this
is much faster (twice as fast as in R7, actually; too bad I don't have
any figures to compare R6 with R7). Numbers taken from Björn
Gustavsson's home page: http://www.ericsson.com/cslab/~bjorn/.
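
To get a rough feel for those ratios on one's own machine, one could
time a loop of local calls against a loop of fun applications. A
sketch (my code, not from the cited page; the loop overhead is
included in both, so the ratio is only approximate):

        %% In a module compiled with export_all, so timer:tc/3 can
        %% call local_loop/1 and fun_loop/2.
        id(X) -> X.

        local_loop(0) -> ok;
        local_loop(N) -> id(N), local_loop(N - 1).

        fun_loop(0, _F) -> ok;
        fun_loop(N, F) -> F(N), fun_loop(N - 1, F).

        compare_calls(N) ->
            F = fun(X) -> X end,
            {TLocal, ok} = timer:tc(?MODULE, local_loop, [N]),
            {TFun, ok}   = timer:tc(?MODULE, fun_loop, [N, F]),
            {TLocal, TFun}.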


> The real function I rewrote was not (increment on deep lists),
> but the following (slightly modified to protect the innocent):

So the order of Value elements in the final list was not important?
(Since your new version changes the order to strict left-to-right.
Maybe this was also a bug fix?)

And regarding the names - you can eliminate the get_vals2/2 function by
inlining it in get_values2/1.
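
Concretely, the inlined version might read as follows (a sketch, not
code from the post; get_values2/2 stays as it is):

        get_values2([Key|Keys]) ->
            [{_, Points}] = ets:lookup(mytab, Key),
            get_values2(Points, fun() -> get_values2(Keys) end);
        get_values2([]) ->
            [].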

        /Richard

> ----------------------------------------------------
> get_values(Keys) when list(Keys) ->
>     lists:foldl(
>       fun(K, Acc) ->
>               get_vals(K) ++ Acc
>       end, [], Keys).
>
> get_vals(Key) ->
>     [{_, Points}] = ets:lookup(mytab, Key),
>     lists:map(
>       fun({point, Value}) ->
>      Value
>       end,
>       Points).
> ----------------------------------------------------
>
> which I rewrote thus:
>
>
> ----------------------------------------------------
> get_values2([Key|Keys]) ->
>    get_vals2(Key, fun() -> get_values2(Keys) end);
> get_values2([]) ->
>    [].
>
> get_vals2(Key, Cont) ->
>     [{_, Points}] = ets:lookup(mytab, Key),
>     get_values2(Points, Cont).
>
> get_values2([{point, Value}|Points], Cont) ->
>     [Value | get_values2(Points, Cont)];
> get_values2([], Cont) ->
>     Cont().
>
> ----------------------------------------------------
>
> Here's a helper function that I used to populate a table.
>
> ----------------------------------------------------
> fill_tab() ->
>     ets:new(mytab, [set, public, named_table]),
>     Ps = lists:seq(1,10),
>     Points = [{point, V} || V <- Ps],
>     lists:foreach(
>       fun(P) ->
>      Obj = {P, Points},
>      ets:insert(mytab, Obj)
>       end, Ps).
> ----------------------------------------------------
>
>
> My version seems to be about 50% faster in a sloppy benchmark.
>
> In this case, the input is not a deep list, because we don't want
> to read it all onto the heap to begin with. The continuation
> passing style still works fine.
>
> /Uffe
> --
> Ulf Wiger                                    tfn: +46  8 719 81 95
> Senior System Architect                      mob: +46 70 519 81 95
> Strategic Product & System Management    ATM Multiservice Networks
> Data Backbone & Optical Services Division      Ericsson Telecom AB
>
>
>
>

Richard Carlsson (richardc)   (This space intentionally left blank.)
E-mail: Richard.Carlsson WWW: http://www.csd.uu.se/~richardc/




efficiency, deep lists

Ulf Wiger-4
On Wed, 3 Oct 2001, Richard Carlsson wrote:

>On Tue, 2 Oct 2001, Ulf Wiger wrote:
>
>> The real function I rewrote was not (increment on deep lists),
>> but the following (slightly modified to protect the innocent):
>
>So the order of Value elements in the final list was not
>important? (Since your new version changes the order to strict
>left-to-right. Maybe this was also a bug fix?)

Uhm, perhaps you're seeing something I don't...?
Anyway, I did the following:

9> Keys = lists:seq(1,10).
[1,2,3,4,5,6,7,8,9,10]
10> continuation:get_values(Keys).
[1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9|...]
11> continuation:get_values2(Keys).
[1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9|...]
12> v(10) == v(11).
true

and therefore concluded that the functions were equivalent,
without any deeper code analysis.

/Uffe
--
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB




efficiency, deep lists

Richard Carlsson-4

On Wed, 3 Oct 2001, Ulf Wiger wrote:

> Uhm, perhaps you're seeing something I don't...?
> Anyway, I did the following:
>
> 9> Keys = lists:seq(1,10).
> [1,2,3,4,5,6,7,8,9,10]
> 10> continuation:get_values(Keys).
> [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9|...]
> 11> continuation:get_values2(Keys).
> [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9|...]
> 12> v(10) == v(11).
> true
>
> and therefore concluded that the functions were equivalent,
> without any deeper code analysis.

If the input has the form [[a,b,c],[d,e,f],[g,h,i]], the version using
foldl and ++ yields [g',h',i',d',e',f',a',b',c'], since each new sublist
is appended to the left of the accumulator. Appending to the right
instead would preserve the order but give quadratic behaviour; changing
to foldr would solve it.
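
A sketch of that foldr variant (not code from the thread):

        get_values(Keys) when list(Keys) ->
            %% foldr preserves the keys' left-to-right order, and each
            %% ++ only copies the short per-key list, so it stays linear.
            lists:foldr(
              fun(K, Acc) ->
                      get_vals(K) ++ Acc
              end, [], Keys).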

        /Richard

Richard Carlsson (richardc)   (This space intentionally left blank.)
E-mail: Richard.Carlsson WWW: http://www.csd.uu.se/~richardc/




efficiency, deep lists

Ulf Wiger-4

On Wed, 3 Oct 2001, Richard Carlsson wrote:

>If the input has the form [[a,b,c],[d,e,f],[g,h,i]], the
>version using foldl and ++ yields [g',h',i',d',e',f',a',b',c'],
>since it appended to the left. However, appending to the right
>would give quadratic behaviour. Changing to foldr would solve
>it.

You're right, of course. Anyway, the order didn't matter in this
case.  (:

/Uffe
--
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB




Voice Apps

Martin Carlson-2
In reply to this post by Ulf Wiger-4
Hello all,
    I am currently writing voice apps in C++. A large part of this
process involves writing complex state machines.  The library developed
here is quite well designed.  I do, however, believe that this could be
implemented more efficiently by using Erlang. I am also looking to
spread Erlang usage, and I think this would be an ideal way to really
bring it to the forefront at my company.  So my question to all of you
is: what do you think is the best way to go about this?  More
specifically, what would be the best way to handle the creation and
parsing of VoiceXML in an Erlang app?  If anyone knows of a method or a
library that handles this, please help.  I am quite excited about this,
so any help, even the most meager, would be greatly appreciated.
            Cheers,
            Martin