How to split a single huge server?

How to split a single huge server?

semmit mondo-2
Hi,
 
I have a single-process solution for a server that, I feel, could be implemented
in a more concurrent fashion.  Its internal state is a vector that consists of n numbers.
It receives casts without parameters.  Every cast is basically a matrix multiplication.
The matrices are known in advance.  So, if the internal state is v, and the server
process receives a cast, and it has the matrix M associated with that cast, then the
new state will be M*v (M is a matrix, * is matrix multiplication, and v is a column vector
representing the previous internal state of the server process).
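
As a concrete illustration of the update described above, here is a minimal sketch in Python (chosen only for brevity; the server in question is Erlang). The names `CASTS` and `apply_cast`, the cast labels, and the 2x2 toy matrices are all hypothetical and not taken from the original post.

```python
# Toy model of the server state update: each cast selects a matrix M that is
# known in advance, and the new state is M * v (v treated as a column vector).

# Hypothetical cast names mapped to their matrices.
CASTS = {
    "swap": [[0, 1],
             [1, 0]],
    "double": [[2, 0],
               [0, 2]],
}

def apply_cast(cast, v):
    """Return the new state M * v for the matrix associated with `cast`."""
    m = CASTS[cast]
    # Entry i of the result is the dot product of row i of M with all of v.
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

state = [3, 5]
state = apply_cast("swap", state)    # [5, 3]
state = apply_cast("double", state)  # [10, 6]
```

Note that every output entry reads the whole of v, which is the dependency the rest of this post is about.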
 
My problem is that the size of v (and the matrices) can be very large, and my
server runs in a single process.  It would be great to split it apart and use several
smaller processes to calculate the new state.  I would like to use separate
processes for each number in v.  But because of the nature of matrix multiplication,
that's not so easy to achieve, because in order to be able to calculate a single
number in the new state, I need to know all the numbers in the previous state.
The prev state could be shared between processes in advance, but that would
require large messages containing all the old values to all the processes around.
I believe that is a wrong idea and there must be a better one.
 
My question is: how would you crack this problem if efficiency matters?
 
 
I have one possible solution in mind, and would like to know your opinion:  there
could be a main server process and size(v) calculator processes, one for every
number in v.  The server process handles the casts, and it has the whole v vector
in it.  The calculator processes have only one number from v, and they have the
corresponding row of M.  When a cast arrives, the server process builds a fun that has
the whole v captured in it as a closure, and this anonymous function gets sent to the
calculator processes.  Then the calculator processes apply the fun to the
appropriate row of M, and they have the new number that they have to send
back to the server.  I'm not sure if sending a huge function with a large body is
cheaper than sending a large list of numbers, but I hope there's some optimisation
going on in BEAM with funs...  Am I right with all this?
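
The scheme proposed above can be sketched as follows. This is a single-process Python model, not Erlang: `make_dot_fun` and the `Calculator` class are hypothetical stand-ins for the server's fun and the calculator processes, and the 3x3 values are made up.

```python
# Sketch of the proposed scheme: the server captures v in a closure and each
# calculator applies that closure to its own row of M.

def make_dot_fun(v):
    """Build a function that closes over the whole vector v."""
    def dot_with_v(row):
        return sum(a * b for a, b in zip(row, v))
    return dot_with_v

class Calculator:
    """Hypothetical stand-in for one calculator process holding one row of M."""
    def __init__(self, row):
        self.row = row

    def handle(self, fun):
        # The calculator only sees `fun`, but `fun` carries every element of v.
        return fun(self.row)

M = [[1, 2, 3],
     [0, 1, 0],
     [4, 0, 1]]
v = [1, 1, 2]

calculators = [Calculator(row) for row in M]
fun = make_dot_fun(v)
new_v = [c.handle(fun) for c in calculators]

# The closure's captured environment holds the full vector, so shipping the
# closure moves at least as much data as shipping the list itself.
captured = fun.__closure__[0].cell_contents
```

The point of the sketch is the last line: the closure still contains all of v. As far as I understand it, the same holds in Erlang, where a fun sent in a message is copied together with the values it has captured, so the fun is unlikely to be cheaper to send than the list. Treat that as an expectation to measure, not a benchmark result.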
 
THX
 
 

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: How to split a single huge server?

Jon Schneider
How large are these matrices ?

It all sounds like you'd end up wasting resources on inter-process
communication and just about everything else. Remember we've had desktop
PCs with instructions (MMX, SSE) that do maybe 8 FLOPS per cycle for about
twenty years now but you're not going to be able to take advantage of this
by re-inventing matrix multiplication.

Jon



Re: How to split a single huge server?

semmit mondo-2
In reply to this post by semmit mondo-2
 
> How large are these matrices ?
 
They can be huge.  2^16 x 2^16 is an average one but can be much larger.
Does the anonymous function trick help me out?
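
For scale, the size quoted above works out as follows. This is a back-of-the-envelope sketch that assumes dense storage and 8-byte (64-bit float) entries; the thread does not say what the numbers are or whether the matrices are sparse, and the variable names are only illustrative.

```python
# Rough sizes for a dense 2^16 x 2^16 matrix and its state vector.
# Assumption: every entry is an 8-byte float and nothing is stored sparsely.
N = 2 ** 16
BYTES_PER_ENTRY = 8

matrix_bytes = N * N * BYTES_PER_ENTRY
vector_bytes = N * BYTES_PER_ENTRY

matrix_gib = matrix_bytes / 2 ** 30   # 32.0 GiB per matrix
vector_kib = vector_bytes / 2 ** 10   # 512.0 KiB for v
flops_per_update = 2 * N * N          # one multiply and one add per matrix entry
```

Under these assumptions a single matrix is about 32 GiB while v is only 512 KiB, so the cost of distributing v is small next to the cost of storing and reading the rows of M.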
 
So basically what you suggest is that I need an external piece of software written in a low level language that uses low level SIMD instructions on the bare CPU or maybe uses GPU.
 


Re: How to split a single huge server?

Bob Ippolito
There are languages such as Python (numpy, numba) or Haskell (repa, accelerate) that have high-level, fast implementations of these operations in libraries, supporting CPU and/or GPU. I'm not aware of similarly popular bindings for Erlang; most of the ones I've come across don't seem to be maintained. However, cl looks like it might be promising for working with the GPU from Erlang: https://github.com/tonyrog/cl


On Tue, Aug 5, 2014 at 7:12 AM, semmit mondo <[hidden email]> wrote:

> > How large are these matrices ?
>
> They can be huge.  2^16 x 2^16 is an average one but can be much larger.
> Does the anonymous function trick help me out?
>
> So basically what you suggest is that I need an external piece of software written in a low level language that uses low level SIMD instructions on the bare CPU or maybe uses GPU.


Re: How to split a single huge server?

Peer Stritzinger
In reply to this post by semmit mondo-2
You could use

https://github.com/tonyrog/cl


On 2014-08-05 14:12:11 +0000, semmit mondo said:

> > How large are these matrices ?
>
> They can be huge.  2^16 x 2^16 is an average one but can be much
> larger. Does the anonymous function trick help me out? So basically what
> you suggest is that I need an external piece of software written in a
> low level language that uses low level SIMD instructions on the bare
> CPU or maybe uses GPU.




Re: How to split a single huge server?

dmkolesnikov
Hello,

I think the original question was how to run a matrix-vector multiplication efficiently in parallel.
Using C libraries is one way to handle it. However, we have not seen a description of the actual use case.
At least, it was not clear to me… was the question about pure “math” multiplication or something else?

All in all, here is a very good description of how to run matrix-vector multiplication in parallel. I guess you can adapt the technique to your needs.
http://www.hpcc.unn.ru/mskurs/ENG/DOC/pp07.pdf
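
One common way to parallelise a matrix-vector product is to split the matrix into blocks of rows and give each worker one block plus the full vector. The sketch below shows that decomposition in Python; it is a generic illustration under my own assumptions, not a transcription of the linked document, and `block_times_vector` and `parallel_matvec` are hypothetical names.

```python
# Row-block (striped) partitioning of y = M * v: each worker owns a contiguous
# block of rows and needs the full vector v to compute its slice of y.
from concurrent.futures import ThreadPoolExecutor

def block_times_vector(rows, v):
    """Multiply one block of rows by v."""
    return [sum(a * b for a, b in zip(row, v)) for row in rows]

def parallel_matvec(m, v, workers=2):
    """Split m into at most `workers` row blocks and join the results in order."""
    n = len(m)
    size = max(1, (n + workers - 1) // workers)  # about ceil(n / workers) rows
    blocks = [m[i:i + size] for i in range(0, n, size)]
    # Threads are used only to show the structure; in CPython they will not
    # speed up pure-Python arithmetic.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(block_times_vector, blocks, [v] * len(blocks))
        return [y for part in parts for y in part]

M = [[1, 0, 2],
     [0, 3, 0],
     [4, 0, 5],
     [1, 1, 1]]
v = [1, 2, 3]
result = parallel_matvec(M, v, workers=2)
```

With a few workers that each hold many rows, v is sent once per worker rather than once per row, which is the main difference from having one process per element of v.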

- Dmitry


On 05 Aug 2014, at 19:26, Peer Stritzinger <[hidden email]> wrote:

> You could use
>
> https://github.com/tonyrog/cl
>
>
> On 2014-08-05 14:12:11 +0000, semmit mondo said:
>
>> > How large are these matrices ?
>> They can be huge.  2^16 x 2^16 is an average one but can be much larger. Does the anonymous function trick help me out? So basically what you suggest is that I need an external piece of software written in a low level language that uses low level SIMD instructions on the bare CPU or maybe uses GPU.


Re: How to split a single huge server?

Jon Schneider
In reply to this post by semmit mondo-2
> So basically what
> you suggest is that I need an external piece of software written in a low
> level language that uses low level SIMD instructions on the bare CPU or
> maybe uses GPU.

I can't say I'm speaking from experience, but I strongly suspect yes.

Jon


Re: How to split a single huge server?

Jesper Louis Andersen-2
In reply to this post by semmit mondo-2

On Tue, Aug 5, 2014 at 4:12 PM, semmit mondo <[hidden email]> wrote:
> So basically what you suggest is that I need an external piece of software written in a low level language that uses low level SIMD instructions on the bare CPU or maybe uses GPU.

It depends. How much RAM are you going to accept using and how fast are you going to want your computation? If you have a small kernel which has to run fast at any cost, it is often better to move to a lower level language. Note that this affects development time and development cost. So you have to decide if it is worth it. Usually, it's not if you are just toying around.


--
J.
