Fwd: About a multicore benchmark

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Fwd: About a multicore benchmark

Iván Carmenates
Hi, I am trying to do a simple benchmark against python, and we decide to use factorial function to the test

fact(0)->1;
fact(N)->N*fact(N-1).

But I wonder if there is a way to implement this using a multicore approach.

Kindest regards,
Ivan.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Fwd: About a multicore benchmark

Dmitry Kolesnikov-2
Hello,

The parallel implementation of Factorial algorithm has been studies for a while.



This algorithm would not give you a representative benchmark vs Python. I am afraid that any math problem would be weaker in Erlang unless you implement it as NIF. You need to design benchmark strategy according to your use-cases.  

Best Regards, 
Dmitry

On 9 Nov 2017, at 20.17, Iván Carmenates <[hidden email]> wrote:

Hi, I am trying to do a simple benchmark against python, and we decide to use factorial function to the test

fact(0)->1;
fact(N)->N*fact(N-1).

But I wonder if there is a way to implement this using a multicore approach.

Kindest regards,
Ivan.

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: Fwd: About a multicore benchmark

Richard A. O'Keefe-2
In reply to this post by Iván Carmenates
> Hi, I am trying to do a simple benchmark against python, and we decide to
> use factorial function to the test
>
> fact(0)->1;
> fact(N)->N*fact(N-1).
>
> But I wonder if there is a way to implement this using a multicore
> approach.

Fairly obviously, yes.
Suppose you have k cores.
Core 1 computes 1 * (k+1) * (2k+1) ...
Core 2 computes 2 * (k+2) * (2k+2) ...
Core 3 computes 3 * (k+3) * (2k+3) ...
...
Core k computes k * (k+k) * (2k+k) ...
and finally you take the product of their answers.

The problem is that it is difficult to see this benchmark as telling
you anything worth knowing.  Factorials get very big very fast.  So
for any N big enough to be worth measuring, what you are really
measuring is the speed of the under-lying bignum library.  I expect
that Erlang and Python use the *same* bignum library (gmp), so what
does that tell you about anything *else* in Erlang and Python?

For what it's worth, here's your benchmark in Smalltalk:
Time millisecondsToRun: [20000 factorial]
System A: 355 msec.
System B: 347 msec (noise level difference).
System C: 189 msec.
What does this actually tell us about these systems?
Very little.  It just so happens that System C (which is mine)
has a carefully written #factorial function which runs, in
System A or System B, about 40% faster than the naive code they
have.  The underlying bignum library in System C is actually
significantly *worse* than the ones in Systems A and B for
large enough products.

If you want some marginally less unrealistic benchmarks,
you could look at "The Computer Languages Benchmarks Game"
http://benchmarksgame.alioth.debian.org/
You should certainly read
http://benchmarksgame.alioth.debian.org/why-measure-toy-benchmark-programs.html

The Chameneos-Redux benchmark has
Erlang at 62.3 seconds and Python3 at 236.84 seconds.

However, when I tried my own implementation in Smalltalk,
it turned out that almost all the time was going into
shared queue operations and a small tweak to shared queues
*doubled* the speed of the program.

If for example you turn to the NBody benchmark, what you
mostly learn is that programming languages that use "boxed"
floating point numbers do much worse than languages with
unboxed ones, unless they have exceedingly clever compilers.
Erlang's performance suffers for that reason, but Python
does worse.  Here again, a relatively simple compiler hack
(turning  x := t * vx + x  into if t, vx, and x are all
FloatD, then BOXD(UNBOXD(t)*UNBOXD(vx)+UNBOXD(x) else ...)
can halve the amount of boxing, thus doubling the speed.
This is something I believe HiPE can do, given suitable
type information.

The lesson is that small benchmarks can be extremely
sensitive to implementation details that are much less
unfluential for large scale systems.

Benchmarking is a wonderful way to ask "where is the time
going? how can I do better?" but a lousy way to compare
languages.



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions