What is the reason for the connection closed error message?

ARUN P
Hi,

I am running my application in a distributed setup, with one server node and multiple client nodes; all nodes are started in visible mode. An SNMP agent runs on the server node, and I have a GUI through which I can access all the nodes in the distributed system. An SNMP query lands on the server node, which then makes an rpc call to the specific client node to serve the query.

The problem I am facing is that, once all the nodes are up, while I am accessing the properties of some client nodes, the server node receives node down messages for most of the client nodes connected to it, with reason "connection_closed". The physical connection is intact and pings between the nodes are uninterrupted.

What could be the reason for this error, and under what circumstances can it occur?

Note: The tick time is set to 5 seconds on all the nodes.
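
To put the 5-second tick time in perspective, here is a minimal Python sketch of the timing it implies. It assumes the behaviour described in the kernel application's documentation for net_ticktime: a tick is sent every TickTime/4 seconds, and an unresponsive node is detected after between TickTime - TickTime/4 and TickTime + TickTime/4 seconds. The function names are made up for illustration.

```python
def tick_interval(net_ticktime):
    """Seconds between ticks sent to each connected node (TickTime / 4)."""
    return net_ticktime / 4


def detection_window(net_ticktime):
    """(min, max) seconds before an unresponsive node is considered down."""
    quarter = net_ticktime / 4
    return (net_ticktime - quarter, net_ticktime + quarter)


if __name__ == "__main__":
    # 5 is the value from this thread; 60 is the documented default.
    for ticktime in (5, 60):
        low, high = detection_window(ticktime)
        print(f"net_ticktime={ticktime}: tick every {tick_interval(ticktime)} s, "
              f"node considered down after {low} to {high} s of silence")
```

With a tick time of 5, a node that stalls for roughly 4 to 6 seconds can already be treated as down by its peers, whereas the default of 60 tolerates stalls of 45 seconds or more. This only shows how tight the window is; it does not by itself explain the "connection_closed" reason.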

Regards,
Arun P

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: What is the reason for the connection closed error message?

Valentin Micic-2
Just a thought… 

1. Check your interface for errors (e.g. collisions).
This usually happens when your local Ethernet interface and the adjacent switch are configured differently (e.g. one is set to half-duplex and the other to full-duplex).
In this case, as the traffic increases, the number of errors (e.g. collisions) on the interface increases as well, causing nodes to get disconnected.

2. Check the disk I/O.
Sometimes, excessive disk I/O operations may cause "uninterruptible sleep". When this happens, the blocked thread can do nothing but wait for the disk.
If this is the case, start the run-time system where the excessive disk I/O is taking place with the +A option (this increases the number of threads responsible for disk I/O).
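
Both checks can be scripted. The following is a rough Python sketch for a Linux host, not part of the original advice: it assumes the interface counters are read from /proc/net/dev and that process states come from /proc/<pid>/stat, and the function names are hypothetical. It only looks at each process's main thread, so a blocked thread inside the Erlang VM may have to be looked up under /proc/<pid>/task/ instead.

```python
import os


def parse_net_dev(text):
    """Return {interface: {"rx_errs", "tx_errs", "colls"}} from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not the usual 8 receive + 8 transmit columns
        counters[name.strip()] = {
            "rx_errs": int(fields[2]),   # receive errors
            "tx_errs": int(fields[10]),  # transmit errors
            "colls": int(fields[13]),    # collisions
        }
    return counters


def process_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line."""
    # The command name is parenthesised and may itself contain spaces or
    # parentheses, so take the first field after the last ')'.
    return stat_line.rsplit(")", 1)[1].split()[0]


def uninterruptible_pids(proc_root="/proc"):
    """List PIDs whose main thread is in state 'D' (uninterruptible sleep)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                if process_state(f.read()) == "D":
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


if __name__ == "__main__":
    with open("/proc/net/dev") as f:
        for name, c in parse_net_dev(f.read()).items():
            print(name, c)
    print("PIDs in uninterruptible sleep:", uninterruptible_pids())
```

Running it a few times while the problem is occurring should show whether the error or collision counters keep climbing, or whether the Erlang VM's PID shows up in the uninterruptible list.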

Kind regards

V/


In reply to Arun's message of 15 May 2017, 6:56 AM.

Re: What is the reason for the connection closed error message?

Ayanda Dube
I agree with Valentin. On his second point, the usual recommendation is to set the +A parameter to at least 12 threads per core on the machine where your node is deployed; e.g. 128 on an 8-core platform should be fine.
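
As a worked example of that rule of thumb, here is a small Python sketch that turns a core count into a +A value. The 12-threads-per-core factor is the figure quoted above, and the cap of 1024 is the documented upper limit for +A; the function names are illustrative only.

```python
import os

THREADS_PER_CORE = 12     # rule of thumb quoted in this thread
MAX_ASYNC_THREADS = 1024  # documented upper bound for erl +A


def async_threads(cores, per_core=THREADS_PER_CORE):
    """Minimum async thread pool size suggested for the given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return min(cores * per_core, MAX_ASYNC_THREADS)


def erl_flags(cores):
    """Emulator flags to pass to erl for the given core count."""
    return ["+A", str(async_threads(cores))]


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print("erl", " ".join(erl_flags(cores)))
```

For 8 cores this gives 96, so the suggested 128 sits comfortably above the minimum.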


Best regards,
Ayanda

Erlang Solutions Ltd.


In reply to Valentin Micic's message of 15 May 2017, 08:51.