massive tcp servers

Dustin Sallings

        Does anyone have any experience with massively large TCP servers?  I'm
building something like a chat server with many, many simultaneous
connections, along with some mechanism for addressing those
connections.  Does anyone have any idea what might be required to
support something on the order of 10,000,000 concurrent connections in
a cluster?  Obviously I want as small a cluster as possible to keep
hardware costs down.  I also get the impression that large clusters may
not be that easy to scale.

--
SPY                      My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________



massive tcp servers

Shawn Pearce
Dustin Sallings <dustin> wrote:
>
> Does anyone have any experience with massively large tcp servers?  
> I'm doing something like a chat server where there will be many, many
> connections simultaneously along with some mechanism for addressing
> those connections.  Does anyone have any idea what might be required to
> have something on the order of 10,000,000 concurrent connections in a
> cluster.  Obviously I want as small of a cluster as possible for
> hardware costs.  I also get the impression that large clusters may not
> be that easy to scale.

http://www.sics.se/~joe/apachevsyaws.html

Erlang will easily take 80,000 connections on some OSes and still
keep a pretty good throughput.

10 million connections may take quite a few machines; I'd expect
you would want to be hitting around 100k-200k TCP connections per
physical computer, with about 10k-80k TCP connections per Erlang node.
100k/computer = 100 computers.
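
A rough sketch of that sizing arithmetic, for anyone who wants to plug
in their own numbers. The function name is made up for illustration,
and the per-machine figures are only the estimates quoted above, not
measurements.

```python
def machines_needed(total_connections: int, per_machine: int) -> int:
    """Round up: a partly filled machine still has to be bought."""
    if per_machine <= 0:
        raise ValueError("per_machine must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_connections // per_machine)

# Estimates from the message: 10 million connections at
# 100k-200k connections per physical machine.
for per_machine in (100_000, 200_000):
    print(per_machine, "per machine ->",
          machines_needed(10_000_000, per_machine), "machines")
```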

But what do I know. :)

It's really a function of how much work you need each one to do.
Erlang's IO system on some platforms is quite capable of handling
large numbers of file descriptors.  On others, I think it's lucky to
get 1024 without making the C libraries choke (FD_SET size limits).
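
To see where a given box stands, something like the following may
help. It is a minimal sketch in Python (Unix only, since it uses the
`resource` module). The 1024 default for FD_SETSIZE is an assumption
that matches common Linux builds, and `fits_in_fd_set` is a
hypothetical helper, not a library call.

```python
import resource

def fd_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def fits_in_fd_set(fd: int, fd_setsize: int = 1024) -> bool:
    """select()'s fd_set is a fixed-size bitmap, so a descriptor whose
    number is >= FD_SETSIZE cannot be placed in it.
    The default of 1024 is an assumption, check your platform's headers."""
    return 0 <= fd < fd_setsize

soft, hard = fd_limits()
print("RLIMIT_NOFILE soft/hard:", soft, hard)
print("fd 80000 usable with select():", fits_in_fd_set(80_000))
```

Raising the rlimit does not help code paths that still go through
select(), which is why the event mechanism matters as much as the
limit itself.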

The real limiting factor might just be the CPU load of your
application, requiring you to use even more systems than you would
like: if you do much more than act as a data pump, most of your nodes
will be overwhelmed with 80k connections.

I assume you have at least researched IRC?  It might not entirely fit
with your problem domain, but it does sort of fit into the "building a
very big cluster to support a very large number of connections from a
very large number of clients" arena.  At the very least it offers some
lessons learned, and some good do's and don'ts.

--
Shawn.


massive tcp servers

Luke Gorrie-3
In reply to this post by Dustin Sallings
Dustin Sallings <dustin> writes:

> Does anyone have any experience with massively large tcp
> servers?  I'm doing something like a chat server where there will be
> many, many connections simultaneously along with some mechanism for
> addressing those connections.  Does anyone have any idea what might be
> required to have something on the order of 10,000,000 concurrent
> connections in a cluster.  Obviously I want as small of a cluster as
> possible for hardware costs.  I also get the impression that large
> clusters may not be that easy to scale.

Oh, oh, oh, that sounds like fun!

I'm assuming that most of these users will be idle most of the time,
and each will use a small amount of overall bandwidth.

Entering fantasy land here..

My first thought is that the OS will have trouble with that many
connections. I'm not certain of the per-connection overhead, but
looking at the 'tcp_opt' struct in include/linux/tcp.h at least it
looks significant.

If you wrote your own TCP (or simpler custom protocol) in userspace
then maybe you could pull it off with just one box. You could keep the
TCP control structures in a database on disk and map the active ones
into a large-but-bounded-size pool in memory.
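
One way to picture the bounded pool is as a least-recently-used cache
in front of a slower store. The sketch below is only an illustration
of that idea: the class name, the fields in each control block, and
the use of a plain dict as a stand-in for the on-disk database are all
assumptions.

```python
from collections import OrderedDict

class ControlBlockPool:
    """Keep at most `capacity` control blocks in memory.

    `store` is any dict-like object standing in for the on-disk
    database (an assumption made for this sketch).
    """

    def __init__(self, capacity, store):
        self.capacity = capacity
        self.store = store
        self.active = OrderedDict()   # oldest entry first

    def add(self, conn_id, block):
        """Register a new connection's control block."""
        self._insert(conn_id, block)

    def get(self, conn_id):
        """Return the block, loading it from the store if it was paged out."""
        if conn_id in self.active:
            self.active.move_to_end(conn_id)     # mark as most recently used
            return self.active[conn_id]
        block = self.store.pop(conn_id)          # KeyError if unknown
        self._insert(conn_id, block)
        return block

    def _insert(self, conn_id, block):
        self.active[conn_id] = block
        self.active.move_to_end(conn_id)
        while len(self.active) > self.capacity:
            old_id, old_block = self.active.popitem(last=False)
            self.store[old_id] = old_block       # write the idle block back

pool = ControlBlockPool(capacity=2, store={})
for conn_id in ("a", "b", "c"):
    pool.add(conn_id, {"snd_nxt": 0, "rcv_nxt": 0})
# "a" was least recently used, so it has been written to the store.
pool.get("a")["rcv_nxt"] = 42    # loads "a" back in and pages out "b"
print(sorted(pool.active), sorted(pool.store))
```

With mostly idle users the pool only has to be as large as the set of
connections that are active at any one moment.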

I have only limited experience with hacking TCP/IP, but my impression
is that writing your own TCP is probably no harder than e.g. writing a
good HTTP/1.1 implementation. I recommend the first two volumes of
Douglas Comer's networking series, in turn recommended to me by Tobbe
from when he wrote a TCP in Erlang.

On Linux I think the most practical option would be to use a 'tun'
interface (linux/Documentation/networking/tuntap.txt), which allows
you to write a user-space network interface that operates at IP-level
(rather than ethernet-level). That way you can use Linux's IPv4
implementation and only worry about TCP yourself.
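
For reference, opening a tun device from user space looks roughly like
this. It is a sketch in Python rather than Erlang, Linux only, and it
needs root (or CAP_NET_ADMIN) to actually run. The constants are the
ones from <linux/if_tun.h>; the device name "tun0" and the helper
names are arbitrary choices.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h>.
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001      # IP-level device (IFF_TAP would be ethernet-level)
IFF_NO_PI = 0x1000    # do not prepend the packet-info header to each packet

def build_ifreq(name: str, flags: int) -> bytes:
    """Pack the leading part of struct ifreq: interface name, then flags."""
    return struct.pack("16sH", name.encode("ascii"), flags)

def open_tun(name: str = "tun0") -> int:
    """Open a tun device and return its file descriptor."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, IFF_TUN | IFF_NO_PI))
    except OSError:
        os.close(fd)
        raise
    return fd
```

Once the interface has been given an address and brought up, each
os.read() on the descriptor returns one whole IP datagram and each
os.write() injects one, so the kernel keeps doing IP while the program
does TCP.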

-Luke


massive tcp servers

Joe Williams-2
> If you wrote your own TCP (or simpler custom protocol) in userspace
> then maybe you could pull it off with just one box. You could keep the
> TCP control structures in a database on disk and map the active ones
> into a large-but-bounded-size pool in memory.
>
> I have only limited experience with hacking TCP/IP, but my impression
> is that writing your own TCP is probably no harder than e.g. writing a
> good HTTP/1.1 implementation. I recommend the first two volumes of
> Douglas Comer's networking series, in turn recommended to me by Tobbe
> from when he wrote a TCP in Erlang.

  I suspect it might be pretty easy - Adam Dunkels wrote a TCP/IP
"thingy" in PHP - not a full TCP, but just enough to answer an HTTP
request. See

  http://www.sics.se/~adam/phpstack/

  I talked to Adam and he said it took him "3 hours", including the
time to learn PHP. Now, Adam has admittedly implemented TCP many
times, but staring at his code might provide some inspiration.

  If you wrote "enough" TCP to just handle an HTTP request you might
come up with a pretty high performance, highly concurrent web server -
who knows.

  IMHO the tricky bit is making a tunnel so that a regular Erlang
program can see the raw IP datagrams - the rest is just plain coding.

  Luke seems to have some code that does this bit (am I right, Luke?).
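
To give a feel for the "plain coding" part: once a program can see raw
IP datagrams, the first step is picking the IPv4 and TCP headers
apart. The sketch below is Python rather than Erlang, is not taken
from Adam's or Luke's code, and skips checksum verification and TCP
options entirely. The function name and the returned field names are
made up for illustration.

```python
import struct

def parse_ipv4_tcp(datagram: bytes) -> dict:
    """Extract the fields a minimal TCP needs from one raw IPv4 datagram."""
    (version_ihl, _tos, total_len, _ident, _frag,
     _ttl, proto, _csum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", datagram[:20])
    if version_ihl >> 4 != 4 or proto != 6:      # 6 = TCP
        raise ValueError("not an IPv4 TCP datagram")
    ip_hlen = (version_ihl & 0x0F) * 4           # header length in bytes
    sport, dport, seq, ack, off_flags, window = struct.unpack(
        "!HHIIHH", datagram[ip_hlen:ip_hlen + 16])
    tcp_hlen = (off_flags >> 12) * 4             # data offset in bytes
    return {
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "ack": ack,
        "window": window,
        "fin": bool(off_flags & 0x01),
        "syn": bool(off_flags & 0x02),
        "ack_flag": bool(off_flags & 0x10),
        "payload": datagram[ip_hlen + tcp_hlen:total_len],
    }
```

A real stack would go on to verify both checksums and drive a
per-connection state machine from the SYN/ACK/FIN flags.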

  Cheers

/Joe


>
> On Linux I think the most practical option would be to use a 'tun'
> interface (linux/Documentation/networking/tuntap.txt), which allows
> you to write a user-space network interface that operates at IP-level
> (rather than ethernet-level). That way you can use Linux's IPv4
> implementation and only worry about TCP yourself.
>
> -Luke
>



massive tcp servers

Luke Gorrie-3
Joe Armstrong <joe> writes:

>   IMHO the  tricky bit  is making  a tunnel so  that a  regular Erlang
> program can see the raw IP datagrams - the rest is just plain coding.
>
>   Luke seems to have some code that does this bit - (am I right Luke).

That part is actually easy, and yes, the 'tuntap' application in the
Jungerl does it. We've used that in production code.

-Luke


massive tcp servers

Hal Snyder-2
In reply to this post by Shawn Pearce
Shawn Pearce <spearce> writes:

> Dustin Sallings <dustin> wrote:
>>
>> Does anyone have any experience with massively large tcp servers?  
>> I'm doing something like a chat server where there will be many, many
>> connections simultaneously along with some mechanism for addressing
>> those connections.  Does anyone have any idea what might be required to
>> have something on the order of 10,000,000 concurrent connections in a
>> cluster.  Obviously I want as small of a cluster as possible for
>> hardware costs.  I also get the impression that large clusters may not
>> be that easy to scale.
>
> http://www.sics.se/~joe/apachevsyaws.html
>
> Erlang will easily take 80,000 connections on some OSes and still
> keep a pretty good throughput.
>
> 10 million connections may take quite a few machines; I'd expect
> you would want to be hitting around 100k-200k TCP connections per
> physical computer, with about 10k-80k TCP connections per Erlang node.
> 100k/computer = 100 computers.

Jonathan Lemon's kqueue paper discusses large numbers of TCP
connections, testing HTTP sessions with and without kernel event
delivery.

  ... The unmodified thttpd server runs out of cpu when the number of
  idle connections is around 600, while the modified server still has
  approximately 48% idle time with 10,000 idle connections.

http://www.cs.princeton.edu/courses/archive/fall03/cs518/papers/kqueue.pdf
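
The point of kernel event delivery is that the server only touches
connections that have something to do. A small sketch of that pattern
using Python's standard `selectors` module, which uses kqueue on the
BSDs and epoll on Linux when they are available. The echo behaviour
and the `serve_ready` name are just for illustration; this is not the
thttpd code from the paper.

```python
import selectors
import socket

def serve_ready(sel: selectors.BaseSelector, timeout: float = 0.0) -> int:
    """Echo data on every connection the kernel reports as readable.

    Returns the number of connections serviced. Idle connections are
    never visited, however many are registered.
    """
    handled = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            conn.sendall(data)
        else:                         # peer closed the connection
            sel.unregister(conn)
            conn.close()
        handled += 1
    return handled

sel = selectors.DefaultSelector()
idle_a, idle_b = socket.socketpair()  # a connection with no traffic
busy_a, busy_b = socket.socketpair()  # a connection with traffic
sel.register(idle_a, selectors.EVENT_READ)
sel.register(busy_a, selectors.EVENT_READ)
busy_b.sendall(b"ping")
print("serviced:", serve_ready(sel, timeout=1.0))
print("echoed:", busy_b.recv(4096))
```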



massive tcp servers

Dustin Sallings
In reply to this post by Luke Gorrie-3

On Jun 24, 2004, at 2:34, Luke Gorrie wrote:

> Oh, oh, oh, that sounds like fun!

        Yeah, I think this project is a really stupid idea, but it's a
fascinating problem.  :)

> I'm assuming that most of these users will be idle most of the time,
> and each will use a small amount of overall bandwidth.

        Yep.

> Entering fantasy land here..
>
> My first thought is that the OS will have trouble with that many
> connections. I'm not certain of the per-connection overhead, but
> looking at the 'tcp_opt' struct in include/linux/tcp.h at least it
> looks significant.

        Yep (although I was thinking about using FreeBSD or something).  Does
Erlang use /dev/poll?

> If you wrote your own TCP (or simpler custom protocol) in userspace
> then maybe you could pull it off with just one box. You could keep the
> TCP control structures in a database on disk and map the active ones
> into a large-but-bounded-size pool in memory.

        That is just completely amazing.  It seems like the obvious and likely
only solution to the problem.  You've given me a lot to think about.  
Thanks a lot.

--
SPY                      My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________