clipping of max# of file descriptors by erts when kernel poll is enabled

clipping of max# of file descriptors by erts when kernel poll is enabled

Joel Reymont
I believe there's a "bug" in the current implementation of  
erts_poll_init in erts/emulator/sys/common/erl_poll.c. This bug  
prevents Erlang from using more than 1024 file descriptors regardless  
of whether kernel poll is available and used.

This happens on Mac OSX Leopard but should affect other platforms as  
well.

This is the relevant chunk of code from
erts/emulator/sys/common/erl_poll.c:

#if ERTS_POLL_USE_SELECT && defined(FD_SETSIZE)
     if (max_fds > FD_SETSIZE)
         max_fds = FD_SETSIZE;
#endif

ERTS will successfully grab the maximum # of open files from the  
kernel so long as SYSCONF is present, i.e.

     max_fds = sysconf(_SC_OPEN_MAX);

What happens then is that even if kernel poll is enabled, the maximum
number of file descriptors (max_fds) gets clipped to FD_SETSIZE, which
is 1024 on Mac OS X Leopard.

The code needs to look like this instead:

#if ERTS_POLL_USE_SELECT && defined(FD_SETSIZE) && !ERTS_POLL_USE_KERNEL_POLL
     if (max_fds > FD_SETSIZE)
         max_fds = FD_SETSIZE;
#endif
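
The effect of the clipping step can be reproduced outside ERTS. What follows is only a sketch, not the ERTS source: query_open_max and clip_to_setsize are hypothetical helper names, and falling back to FD_SETSIZE when sysconf() gives no usable answer is an assumption made for the example.

```c
#include <assert.h>
#include <sys/select.h>  /* FD_SETSIZE */
#include <unistd.h>      /* sysconf, _SC_OPEN_MAX */

/* Hypothetical helper: ask the OS for the per-process descriptor limit.
 * Assumption for this sketch: if sysconf() reports an error or an
 * indeterminate limit (-1), use FD_SETSIZE instead. */
static long query_open_max(void)
{
    long n = sysconf(_SC_OPEN_MAX);
    return (n > 0) ? n : (long)FD_SETSIZE;
}

/* Hypothetical helper mirroring the clipping step quoted above:
 * never report more descriptors than an fd_set can represent. */
static long clip_to_setsize(long max_fds, long setsize)
{
    assert(setsize > 0);
    return (max_fds > setsize) ? setsize : max_fds;
}
```

On a system where the OS limit is larger than FD_SETSIZE, clip_to_setsize(query_open_max(), FD_SETSIZE) returns FD_SETSIZE, which is the behaviour described in this post.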

ERTS appears to compile erl_poll.c twice, with and without kernel
poll enabled. It then uses erts_poll_init_kp (kpoll) or
erts_poll_init_nkp, depending on whether +K true was given.

I believe that the lack of && !ERTS_POLL_USE_KERNEL_POLL above is just
an oversight and that no harm is caused by adding it.
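
The "compiled twice, chosen at startup" arrangement described above might be pictured roughly like this. It is a sketch under assumptions: the demo_* names are hypothetical stand-ins for erts_poll_init_kp and erts_poll_init_nkp, and the thread does not show how ERTS actually performs the selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical signature for an init routine; it returns a label so the
 * choice can be observed. */
typedef const char *(*demo_poll_init_fn)(void);

/* Stand-ins for the two builds of the same source file. */
static const char *demo_poll_init_kp(void)  { return "kernel poll"; }
static const char *demo_poll_init_nkp(void) { return "no kernel poll"; }

/* Pick the kernel poll variant only when it was compiled in AND the user
 * asked for it (the +K true case); otherwise use the other variant. */
static demo_poll_init_fn demo_choose_poll_init(bool kp_compiled_in,
                                               bool k_flag_true)
{
    demo_poll_init_fn init =
        (kp_compiled_in && k_flag_true) ? demo_poll_init_kp
                                        : demo_poll_init_nkp;
    assert(init != NULL);
    return init;
}
```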

        Thanks, Joel

--
wagerlabs.com





_______________________________________________
erlang-questions mailing list
[hidden email]
http://www.erlang.org/mailman/listinfo/erlang-questions

Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Matthew Dempsky
On Fri, Aug 22, 2008 at 10:56 AM, Joel Reymont <[hidden email]> wrote:
> This happens on Mac OSX Leopard but should affect other platforms as
> well.

Erlang doesn't use poll(2) on OS X, because it's broken for devices
(e.g., it never reports readability on /dev/null; see erts/configure's
test for "checking for working poll()").

Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Matthew Dempsky
On Fri, Aug 22, 2008 at 5:14 PM, Matthew Dempsky <[hidden email]> wrote:
> Erlang doesn't use poll(2) on OS X, because it's broken for devices
> (e.g., it never reports readability on /dev/null; see erts/configure's
> test for "checking for working poll()").

To be more explicit, because poll(2) is broken, Erlang supports
select(2) for fallback if kernel poll is disabled at run-time, so it
can't support more than FD_SETSIZE files.

Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Joel Reymont
Just to make sure I got my point across...

erl_poll.c will be compiled twice, once with ERTS_POLL_USE_KERNEL_POLL  
and once without, even when kernel poll is present and the build  
detects it.

ERTS should not be clipping max_fds when ERTS_POLL_USE_KERNEL_POLL is
defined, that is, not in the "kernel poll" version of erl_poll.c.

To make sure there's no clipping when kernel poll is present and  
detected, the code should look like this:

#if ERTS_POLL_USE_SELECT && defined(FD_SETSIZE) && !ERTS_POLL_USE_KERNEL_POLL
    if (max_fds > FD_SETSIZE)
        max_fds = FD_SETSIZE;
#endif

I tested it by running erl +Ktrue and erl +Kfalse and it works. The  
original version clips max_fds regardless of kernel poll.
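
The effect of the extra condition can be seen in a single file by defining the same function twice under different macro settings, loosely imitating the two builds. This is a sketch only: the DEMO_* macros and demo_limit_* functions are hypothetical, and nothing here is taken from erl_poll.c beyond the shape of the #if.

```c
#include <assert.h>
#include <sys/select.h>  /* FD_SETSIZE */

#define DEMO_USE_SELECT 1

/* "Kernel poll" build: the guard is false, so no clipping is compiled in. */
#define DEMO_USE_KERNEL_POLL 1
static long demo_limit_kp(long max_fds)
{
    assert(max_fds > 0);
#if DEMO_USE_SELECT && defined(FD_SETSIZE) && !DEMO_USE_KERNEL_POLL
    if (max_fds > FD_SETSIZE)
        max_fds = FD_SETSIZE;
#endif
    return max_fds;
}
#undef DEMO_USE_KERNEL_POLL

/* "No kernel poll" build: the guard is true, so the limit is clipped. */
#define DEMO_USE_KERNEL_POLL 0
static long demo_limit_nkp(long max_fds)
{
    assert(max_fds > 0);
#if DEMO_USE_SELECT && defined(FD_SETSIZE) && !DEMO_USE_KERNEL_POLL
    if (max_fds > FD_SETSIZE)
        max_fds = FD_SETSIZE;
#endif
    return max_fds;
}
#undef DEMO_USE_KERNEL_POLL
```

Whether leaving the limit unclipped is safe depends on whether the kernel poll build can still end up in select(), which is the question the rest of the thread turns on.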

        Thanks, Joel

--
wagerlabs.com






Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Rickard Green
Matthew's answer is correct. If we need to fall back on select() for a file descriptor that is larger than select() can handle, your modification won't work.

BR,
Rickard Green, Erlang/OTP, Ericsson AB.


-----Original message-----
From: [hidden email] on behalf of Joel Reymont
Sent: Sat 2008-08-23 10:32
To: Matthew Dempsky
Cc: Erlang Questions
Subject: Re: [erlang-questions] clipping of max# of file descriptors by erts when kernel poll is enabled

Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Joel Reymont
Rickard,

On Aug 25, 2008, at 2:14 PM, Rickard Green S wrote:

> Matthew's answer is correct. If we need to fall back on select() for
> a file descriptor that is larger than select() can handle, your
> modification won't work.


The decision to use select() is made by ERTS during initialization.

If kernel poll cannot be used then the version of erl_poll.c compiled  
without the ERTS_POLL_USE_KERNEL_POLL flag will be used. This version  
will clip the file descriptors as per the code below.

#if ERTS_POLL_USE_SELECT && defined(FD_SETSIZE) && !ERTS_POLL_USE_KERNEL_POLL
    if (max_fds > FD_SETSIZE)
        max_fds = FD_SETSIZE;
#endif

The kernel poll version of the same erl_poll.c will determine the  
maximum number of file descriptors according to the OS kernel  
configuration, a few lines above the clipping code.  This version does  
NOT need any clipping and adding && !ERTS_POLL_USE_KERNEL_POLL above  
will ensure this.

Again, there's no "fallback on select" at runtime. The decision to  
make use of select is made depending on platform capabilities at build  
time and on the +K setting.

If erl is told not to use kernel poll while the capability is there,  
then the code from erl_poll_nkp.o will be used and this code will  
always clip descriptors.

If kernel poll is available and Erlang is told to use it then there
will be NO fallback to select and NO clipping of descriptors to
FD_SETSIZE is needed or wanted. The way to ensure this is to add
&& !ERTS_POLL_USE_KERNEL_POLL to the clipping code above.

Erlang is supposed to scale and without a fix to the above I cannot  
scale my supposedly scalable poker server above ~300 users on a single  
Erlang VM. This plain and obviously SUCKS!

With my proposed fix I can scale to thousands of users on a single VM  
without a problem.

I insist that I'm right and the OTP team is wrong. There's a bug in  
erl_poll.c and the solution is trivial. Please fix the bug!

        Thanks, Joel

--
wagerlabs.com


Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Rickard Green
Well, you are wrong. The current implementation will fall back on select() at runtime if kqueue cannot handle the file descriptor.

The current behavior is a necessity, not a bug. Introducing your suggested change will introduce a serious bug.
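
A per-descriptor runtime fallback of the kind described here could look roughly as follows. This is a sketch under assumptions and not ERTS code: no real kqueue call is made, the demo_kp_* callbacks merely stand in for a kernel poll registration that may accept or reject a descriptor, and DEMO_FD_IN_FALLBACK is a hypothetical flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/select.h>  /* fd_set, FD_SET, FD_ZERO, FD_SETSIZE */

#define DEMO_FD_IN_FALLBACK 0x1  /* hypothetical "handled by select" flag */

/* Hypothetical registration hook: true if kernel poll accepts the fd. */
typedef bool (*demo_kp_add_fn)(int fd);

static bool demo_kp_accept_all(int fd) { (void)fd; return true; }
static bool demo_kp_reject_all(int fd) { (void)fd; return false; }

struct demo_pollset {
    fd_set fallback_set;              /* descriptors handled by select() */
    unsigned char flags[FD_SETSIZE];  /* per-descriptor state */
};

static void demo_pollset_init(struct demo_pollset *ps)
{
    assert(ps != NULL);
    FD_ZERO(&ps->fallback_set);
    memset(ps->flags, 0, sizeof ps->flags);
}

/* Returns 0 if kernel poll took the fd, 1 if it went to the select()
 * fallback, and -1 if neither mechanism can handle it. */
static int demo_add_fd(struct demo_pollset *ps, int fd, demo_kp_add_fn kp_add)
{
    assert(ps != NULL && kp_add != NULL);
    if (fd < 0)
        return -1;
    if (kp_add(fd))
        return 0;
    if (fd >= FD_SETSIZE)
        return -1;  /* rejected by kernel poll and too large for fd_set */
    FD_SET(fd, &ps->fallback_set);
    ps->flags[fd] |= DEMO_FD_IN_FALLBACK;
    return 1;
}
```

The -1 branch is the case at issue: a descriptor numbered FD_SETSIZE or higher that kernel poll refuses has nowhere to go. Capping max_fds at FD_SETSIZE rules that case out up front, at the cost of the lower limit.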
 
BR,
Rickard Green, Erlang/OTP, Ericsson AB.

________________________________

From: Joel Reymont [mailto:[hidden email]]
Sent: Mon 2008-08-25 15:50
To: Rickard Green S
Cc: Matthew Dempsky; Erlang Questions
Subject: Re: SV: [erlang-questions] clipping of max# of file descriptors by erts when kernel poll is enabled

Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Joel Reymont
Rickard,

Where is this in the code?

I'm willing to bet that the fallback consists of using the _nkp
(non-kernel poll) version of the polling function and that version will be
compiled w/o ERTS_POLL_USE_KERNEL_POLL by definition.

Is that not so?

On Aug 25, 2008, at 6:51 PM, Rickard Green S wrote:

> Well, you are wrong. The current implementation will fall back on
> select() at runtime if kqueue cannot handle the file descriptor.
>
> The current behavior is a necessity, not a bug. Introducing your
> suggested change will introduce a serious bug.

--
wagerlabs.com






Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Joel Reymont
In reply to this post by Rickard Green
Rickard,

On Aug 25, 2008, at 6:51 PM, Rickard Green S wrote:

> Well, you are wrong. The current implementation will fall back on
> select() at runtime if kqueue cannot handle the file descriptor.
>
> The current behavior is a necessity, not a bug. Introducing your
> suggested change will introduce a serious bug.


I took a closer look at erl_poll.c. The current code does indeed use
select as a fallback, but it should not clip descriptors if fallback
is not used.

Fallback to select is currently implemented thusly:

#if ERTS_POLL_USE_FALLBACK
     /* We depend on the wakeup pipe being handled by kernel poll */
     if (ps->fds_status[wake_fds[0]].flags & ERTS_POLL_FD_FLG_INFLBCK)
         fatal_error("%s:%d:create_wakeup_pipe(): Internal error\n",
                     __FILE__, __LINE__);
#endif

No clipping of descriptors should be done then unless  
ERTS_POLL_USE_FALLBACK is defined and the ERTS_POLL_FD_FLG_INFLBCK is  
set in ps->fds_status[...].flags, at least when  
ERTS_POLL_USE_KERNEL_POLL is defined.

Would you agree?

You can only have 1024 descriptors with R12B-3 on Mac OS X right now
since FD_SETSIZE is 1024. ERTS clips descriptors to FD_SETSIZE
regardless of whether it needs to fall back or not, even when kernel
poll is present and enabled.

This is an unwarranted and crippling limitation since I can go to 12k  
file descriptors on my Mac if I remove the file descriptor clipping.

1k and 12k descriptors make a huge difference for network servers and  
scalability!

        Thanks, Joel

--
wagerlabs.com






Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Matthew Dempsky
On Mon, Aug 25, 2008 at 11:12 AM, Joel Reymont <[hidden email]> wrote:
> I took a closer look at erl_poll.c. The current code does indeed use select
> as a fallback, but it should not clip descriptors if fallback is not used.

What should it do with file descriptors >= 1024 later when it does
fall back to select(2)?

> You can only have 1024 descriptors with R12B3 on Mac OSX right now since
> FD_SETSIZE is 1024.

Fortunately, Linux and BSD support poll(2), and no one uses OS X in
server environments.

Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Joel Reymont

On Aug 25, 2008, at 7:24 PM, Matthew Dempsky wrote:

> no one uses OS X in server environments.


Then perhaps they would not care about falling back on select at all?

I would rather NOT fall back to select and have the runtime give me an  
error than be stuck with puny 1024 file descriptors!

--
wagerlabs.com






Re: clipping of max# of file descriptors by erts when kernel poll is enabled

Tony Finch
In reply to this post by Rickard Green
On Mon, 25 Aug 2008, Rickard Green S wrote:

> Matthew's answer is correct. If we need to fall back on select() for a
> file descriptor that is larger than select() can handle, your
> modification won't work.

Note that on Mac OS X 10.4 and later, you can #define FD_SETSIZE to
whatever value you want at compile time. In fact, you can allocate fd
sets dynamically so long as they are big enough to hold the largest file
descriptor you are using. See the comment just above the definition of
FD_SETSIZE in /usr/include/sys/select.h.

This is also true for other BSDs.

Tony.
--
f.anthony.n.finch  <[hidden email]>  http://dotat.at/
HEBRIDES BAILEY: WESTERLY OR SOUTHWESTERLY 5 OR 6, BECOMING VARIABLE 3 OR 4 IN
NORTH. MODERATE OR ROUGH. RAIN OR DRIZZLE. MODERATE OR GOOD, OCCASIONALLY
POOR.