Incoming TCP connections are closed immediately, requiring SO_PRIORITY on newer Linux kernel


Dmitry Simonov
Hello!

I've run into strange behaviour in RabbitMQ on Linux kernel 4.14.46.
The rabbit-users mailing list recommended that I ask here, as it is probably Erlang's behaviour.

Symptoms:
Incoming TCP connections are established and then immediately closed. strace of the /usr/lib/erlang/erts-10.0/bin/beam.smp process shows setsockopt failing with EPERM when setting the SO_PRIORITY socket option on the accepted socket:

[pid  1814] accept(58, {sa_family=AF_INET, sin_port=htons(43054), sin_addr=inet_addr("127.0.0.1")}, [16]) = 13
[pid  1814] epoll_ctl(4, EPOLL_CTL_MOD, 58, {EPOLLONESHOT, {u32=58, u64=20331670804627514}}) = 0
[pid  1814] fcntl(13, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  1814] fcntl(13, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  1814] getsockopt(58, SOL_TCP, TCP_NODELAY, [0], [4]) = 0
[pid  1814] getsockopt(58, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
[pid  1814] getsockopt(58, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1814] getsockopt(58, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1814] getsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1814] getsockopt(13, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1814] setsockopt(13, SOL_IP, IP_TOS, [0], 4) = 0
[pid  1814] setsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], 4) = -1 EPERM (Operation not permitted)
[pid  1814] getsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1814] getsockopt(13, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1814] setsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], 4) = -1 EPERM (Operation not permitted)
[pid  1814] getsockopt(13, SOL_SOCKET, SO_LINGER, {onoff=0, linger=0}, [8]) = 0
[pid  1814] close(13)                   = 0
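
To check whether the kernel rejects this call independently of Erlang, the failing step can be reproduced outside the VM. The following is only a minimal sketch, not what beam.smp does internally: it accepts one loopback connection and tries to set SO_PRIORITY to 0 on the accepted socket, as in the trace above. The function name try_set_priority is hypothetical, and the fallback value 12 for SO_PRIORITY assumes Linux headers.

```python
import errno
import socket

# SO_PRIORITY is Linux-specific. 12 is its value in the Linux headers;
# the fallback is an assumption for Python builds that do not export it.
SO_PRIORITY = getattr(socket, "SO_PRIORITY", 12)


def try_set_priority(priority=0):
    """Accept one loopback connection and call setsockopt(SO_PRIORITY) on it.

    Returns None on success, or the errno name (e.g. "EPERM") on failure.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        listener.listen(1)
        client = socket.create_connection(listener.getsockname())
        try:
            accepted, _ = listener.accept()
            try:
                accepted.setsockopt(socket.SOL_SOCKET, SO_PRIORITY, priority)
                return None
            except OSError as exc:
                return errno.errorcode.get(exc.errno, str(exc.errno))
            finally:
                accepted.close()
        finally:
            client.close()
    finally:
        listener.close()


if __name__ == "__main__":
    result = try_set_priority(0)
    if result is None:
        print("setsockopt(SO_PRIORITY) succeeded")
    else:
        print("setsockopt(SO_PRIORITY) failed:", result)
```

Running this as the same unprivileged user as RabbitMQ on both kernels would show whether EPERM is specific to the Erlang process or comes from the kernel for any caller.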

With an older Linux kernel (4.9.86), connections work fine:

[pid  1665] accept(58, {sa_family=AF_INET, sin_port=htons(48170), sin_addr=inet_addr("127.0.0.1")}, [16]) = 12
[pid  1665] epoll_ctl(4, EPOLL_CTL_MOD, 58, {EPOLLONESHOT, {u32=58, u64=20331670804627514}}) = 0
[pid  1665] fcntl(12, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  1665] fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  1665] getsockopt(58, SOL_TCP, TCP_NODELAY, [0], [4]) = 0
[pid  1665] getsockopt(58, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
[pid  1665] getsockopt(58, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(58, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_IP, IP_TOS, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_IP, IP_TOS, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], [4]) = 0
[pid  1665] getsockopt(12, SOL_IP, IP_TOS, [0], [4]) = 0
[pid  1665] setsockopt(12, SOL_TCP, TCP_NODELAY, [0], 4) = 0
[pid  1665] setsockopt(12, SOL_SOCKET, SO_PRIORITY, [0], 4) = 0
[pid  1665] getsockopt(58, SOL_IPV6, IPV6_TCLASS, 0x7f88c14fc7c8, 0x7f88c14fc7cc) = -1 EOPNOTSUPP (Operation not supported)
[pid  1665] accept(58, 0x7f88c14feaf0, 0x7f88c14feac4) = -1 EAGAIN (Resource temporarily unavailable)
[pid  1665] epoll_ctl(4, EPOLL_CTL_MOD, 58, {EPOLLIN|EPOLLONESHOT, {u32=58, u64=14125640596043333690}}) = 0
[pid  1665] recvfrom(12, 0x7f88c43489e8, 1460, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid  1665] epoll_ctl(4, EPOLL_CTL_ADD, 12, {EPOLLIN|EPOLLONESHOT, {u32=12, u64=14125640596043333644}}) = 0
[pid  1665] futex(0x7f88c3f811d0, FUTEX_WAIT_PRIVATE, 4294967295, {0, 151060429}) = -1 ETIMEDOUT (Connection timed out)
[pid  1665] futex(0x7f88c3f811d0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1665] futex(0x7f88c3f813d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
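
When comparing two long traces like these, it can help to list only the syscalls that returned an error. The sketch below assumes the line layout shown in the strace output above ("[pid N] name(args) = -1 ERRNO (text)"); the helper name failed_syscalls is hypothetical.

```python
import re

# Matches lines such as:
# [pid  1814] setsockopt(13, SOL_SOCKET, SO_PRIORITY, [0], 4) = -1 EPERM (Operation not permitted)
FAILED_CALL = re.compile(r"^\[pid\s+(\d+)\]\s+(\w+)\((.*)\)\s+=\s+-1\s+(E\w+)")


def failed_syscalls(trace_text):
    """Return (pid, syscall, args, errno_name) for every failed call in the trace."""
    failures = []
    for line in trace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            pid, name, args, err = match.groups()
            failures.append((int(pid), name, args, err))
    return failures


if __name__ == "__main__":
    import sys

    # Usage: python failed_syscalls.py < strace.log
    for pid, name, args, err in failed_syscalls(sys.stdin.read()):
        print(f"pid {pid}: {name}({args}) -> {err}")
```

Not every failure is a problem: in the 4.9.86 trace the EOPNOTSUPP, EAGAIN and ETIMEDOUT results appear while the connection works, whereas the EPERM results appear only in the 4.14.46 trace.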

Granting the CAP_NET_ADMIN capability explicitly (setcap cap_net_admin+ep /usr/lib/erlang/erts-10.0/bin/beam.smp) makes RabbitMQ work again (TCP connections are no longer closed).
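
To confirm that the capability actually reaches the running process, the effective capability mask can be read from /proc/<pid>/status. This is a minimal sketch under two assumptions: the status file uses the usual Linux "CapEff:" line with a hexadecimal mask, and CAP_NET_ADMIN is bit 12, as defined in linux/capability.h. The helper name has_cap_net_admin is hypothetical.

```python
CAP_NET_ADMIN = 12  # bit index from linux/capability.h


def has_cap_net_admin(status_text):
    """Return True if the CapEff mask in a /proc/<pid>/status dump has CAP_NET_ADMIN set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)  # the mask is printed as hex
            return bool(mask & (1 << CAP_NET_ADMIN))
    return False  # no CapEff line found


if __name__ == "__main__":
    import sys

    # Usage: python check_cap.py <pid of beam.smp>; defaults to this process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    try:
        with open(f"/proc/{pid}/status") as status_file:
            print("CAP_NET_ADMIN effective:", has_cap_net_admin(status_file.read()))
    except OSError as exc:
        print("could not read status file:", exc)
```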

Could you please help?
Why does this problem occur?

Erlang version is 21 (latest):
# erl -sname test
Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1]

Eshell V10.0  (abort with ^G)

RabbitMQ version: 3.7.7-1 (latest).

--
Best Regards,
Dmitry Simonov

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions