Where are the binaries in message queue but not part of process_info(Pid, binary) ?


Where are the binaries in message queue but not part of process_info(Pid, binary) ?

Bekes, Andras G

Hi All,

 

While investigating a system that transfers lots of refc binaries, I tried measuring the amount of data in the message queue of a process. I noticed that the binaries in the message queue are not always included in the result of process_info(Pid, binary).

In particular, when observing the binaries, I see time intervals when each new message appears in the binary list, and time intervals when the binary list stays the same regardless of new messages in the queue.

 

I made a simple module that demonstrates the phenomenon. The output looks like:

 

Msgs sent: 0 Msgs in queue: 0 Binaries of receiver: 0
Msgs sent: 1 Msgs in queue: 1 Binaries of receiver: 1
Msgs sent: 2 Msgs in queue: 2 Binaries of receiver: 2
[...goes well up to 36...]
Msgs sent: 36 Msgs in queue: 36 Binaries of receiver: 36
[then the binary count stays at 36 for thousands of messages, and suddenly starts counting again!]

 

It looks as if there are some magic limits at 36 (sometimes 33), 256, 4781, 12518 and 20255 messages/binaries, and so far I have been unable to find any logic in which limit takes effect.

 

I am aware that this feature of process_info is not meant for production use (“can be changed or removed without prior notice.”), but this misleading behavior is arguably worse than not having the feature at all.

 

May I kindly ask for an explanation of what I observe?

 

Thanks!

 

Andras G. Bekes, Vice President   
Morgan Stanley | Institutional Securities Tech   
Lechner Odon Fasor 6 | Floor 06   
Budapest, 1095   
Phone: +36 1 882-0791   
[hidden email]   

----------------------------------------------------------------------------------

My test module:

 

-module(msgq_len_bin_test).

-export([test_sleeper/0, test_receiver/0]).

%% The receiver never reads its mailbox, so every message stays queued.
test_sleeper() ->
    io:format("Testing sleeper...\n", []),
    PID = spawn_link(fun sleeper/0),
    sender(PID, 0, 10000, x). % no reason to test further, binary count never goes above 36

%% The receiver sleeps 10 ms before taking each message out of its mailbox.
test_receiver() ->
    io:format("Testing receiver...\n", []),
    PID = spawn_link(fun receiver/0),
    sender(PID, 0, 100000, x). % worth testing big numbers

sleeper() -> timer:sleep(100000000).

receiver() ->
    timer:sleep(10),
    receive _ -> ok end,
    receiver().

%% sender(PID, Sent, Max, LastBinLen): before each send, sample the
%% receiver's queue length and binary list. Print a full line only when
%% the binary count differs from the previous sample, otherwise a dot.
sender(_, N, N, _) -> io:format("Sent ~p messages.", [N]);
sender(PID, N, Max, LastBinLen) ->
    [{message_queue_len, MsgQLen}, {binary, BinaryInfoList}] =
        erlang:process_info(PID, [message_queue_len, binary]),

    BinLen = length(BinaryInfoList),
    case BinLen of
        LastBinLen -> io:format(".", []);
        _ -> io:format("Msgs sent: ~p Msgs in queue: ~p Binaries of receiver: ~p \n", [N, MsgQLen, BinLen])
    end,

    %% 4000 bytes, above the 64-byte heap binary limit, so this is a refc binary.
    RefcBinary = binary:copy(<<N:32>>, 1000),
    PID ! RefcBinary,

    sender(PID, N + 1, Max, BinLen).

   





NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent permitted under applicable law, to monitor electronic communications. This message is subject to terms available at the following link: http://www.morganstanley.com/disclaimers  If you cannot access these links, please notify us by reply message and we will send the contents to you. By communicating with Morgan Stanley you consent to the foregoing and to the voice recording of conversations with personnel of Morgan Stanley.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: Where are the binaries in message queue but not part of process_info(Pid, binary) ?

Lukas Larsson-8
Hello,

On Tue, Oct 17, 2017 at 4:00 PM, Bekes, Andras G <[hidden email]> wrote:

> While investigating a system that transfers lots of refc binaries, I tried measuring the amount of data in the message queue of a process. I noticed that the binaries in the message queue are not always included in the result of process_info(Pid, binary).
>
> In particular, when observing the binaries, I see time intervals when each new message appears in the binary list, and time intervals when the binary list stays the same regardless of new messages in the queue.


erlang:process_info(Pid, binary) returns the binaries that are currently on the heap of the process. Binaries in messages that have not yet been received may or may not be part of the heap depending on internal emulator optimizations.
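
For readers who want to total up what that call reports, here is a minimal sketch. The module name heap_bins is made up for illustration, and the {Id, Size, RefCount} tuple layout is an assumption based on what the undocumented binary item has been observed to return; it may change between releases. A binary referenced more than once from the heap may also appear more than once, so treat the sum as an estimate.

```erlang
-module(heap_bins).
-export([heap_binary_bytes/1]).

%% Sum the sizes of the binaries that process_info(Pid, binary) lists
%% for Pid. Returns undefined if the process is no longer alive.
%% Assumption: each list element is a 3-tuple {Id, Size, RefCount}.
heap_binary_bytes(Pid) ->
    case erlang:process_info(Pid, binary) of
        {binary, Bins} ->
            lists:sum([Size || {_Id, Size, _RefCount} <- Bins]);
        undefined ->
            undefined
    end.
```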

The message_queue_data process flag explains this a little more and also gives a way to control it to some extent.
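
As a rough illustration of setting that flag, the sketch below spawns a process with the option and reads the setting back. The module and function names (mqd_demo, start_off_heap, queue_mode) are hypothetical; the spawn option and the process_info item are the documented ones, available from OTP 19 onwards.

```erlang
-module(mqd_demo).
-export([start_off_heap/1, queue_mode/1]).

%% Spawn Fun with its message queue data kept off the process heap.
start_off_heap(Fun) when is_function(Fun, 0) ->
    spawn_opt(Fun, [{message_queue_data, off_heap}]).

%% Report the current setting (on_heap or off_heap) for Pid,
%% or undefined if the process is no longer alive.
queue_mode(Pid) ->
    case erlang:process_info(Pid, message_queue_data) of
        {message_queue_data, Mode} -> Mode;
        undefined -> undefined
    end.
```

A running process can also change the setting for itself with process_flag(message_queue_data, off_heap).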


Lukas



Re: Where are the binaries in message queue but not part of process_info(Pid, binary) ?

Bekes, Andras G

Hi Lukas,

 

Thank you very much for the quick and detailed reply. This explains it.

 

Now I have a few further questions.

 

- It seems the reliable way to measure the amount of data in the message queue is to set the message_queue_data process flag to off_heap beforehand (ideally at spawn time with spawn_opt), and then retrieve the full message queue content with process_info(Pid, messages), which must be fully scanned. Is there a better solution? (I originally aimed at refc binaries only because I just wanted a quick estimation, and in my case anything else was negligible.)

 

- May I suggest improving the documentation of erlang:process_info(Pid, binary) with the facts you explained, also referring to the message_queue_data process flag?

 

- Is it possible to implement a new process_info_item() of erlang:process_info/2 that gives information about binaries currently in the message queue but off-heap?

 

- The documentation suggests using off_heap message_queue_data “if the process potentially can get many messages”, but this is rather vague. Does this refer to processes expected to have many *queued* messages? I guess a process receiving but not queuing many messages performs better with on_heap (“Performance of the actual message passing is however generally better when not using flag off_heap.”).

 

Thanks,

   Andras

 







Re: Where are the binaries in message queue but not part of process_info(Pid, binary) ?

Lukas Larsson-8
Hello,

On Tue, Oct 17, 2017 at 11:38 PM, Bekes, Andras G <[hidden email]> wrote:

> - It seems the reliable way to measure the amount of data in the message queue is to set the message_queue_data process flag to off_heap beforehand (ideally at spawn time with spawn_opt), and then retrieve the full message queue content with process_info(Pid, messages), which must be fully scanned. Is there a better solution? (I originally aimed at refc binaries only because I just wanted a quick estimation, and in my case anything else was negligible.)


The off_heap flag should not affect anything when you are using process_info(Pid, messages). If you want the size of the data in the message queue then there is no better option than process_info(Pid, messages).
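
A minimal sketch of that approach, limited to binaries as in the original question. The module name mq_size and its helper are made up for illustration. The messages item copies the queue content to the caller, so this can be expensive on a long queue, and a binary that occurs in several messages is counted once per occurrence.

```erlang
-module(mq_size).
-export([queue_binary_bytes/1]).

%% Sum the byte sizes of all binaries found in the messages that
%% process_info(Pid, messages) returns for Pid.
%% Returns undefined if the process is no longer alive.
queue_binary_bytes(Pid) ->
    case erlang:process_info(Pid, messages) of
        {messages, Msgs} -> lists:sum([bin_bytes(M) || M <- Msgs]);
        undefined -> undefined
    end.

%% Walk a term and add up the sizes of the binaries inside it.
bin_bytes(B) when is_binary(B) -> byte_size(B);
bin_bytes(T) when is_tuple(T) -> bin_bytes(tuple_to_list(T));
bin_bytes([H | T]) -> bin_bytes(H) + bin_bytes(T);
bin_bytes(M) when is_map(M) -> bin_bytes(maps:to_list(M));
bin_bytes(_) -> 0.
```

Swapping byte_size/1 for a check against the 64-byte heap binary limit would narrow the count to refc binaries only, if that distinction matters.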

Do keep in mind, though, that any message queue inspection tools only show the messages that are part of what we call the internal message queue. Messages that are part of the external queue are considered to be in transit to the process and thus not part of the message queue. In an overloaded process the majority of the messages will most likely be in the external message queue.
 

 

> - May I suggest improving the documentation of erlang:process_info(Pid, binary) with the facts you explained, also referring to the message_queue_data process flag?


Yes, good idea.
 

 

> - Is it possible to implement a new process_info_item() of erlang:process_info/2 that gives information about binaries currently in the message queue but off-heap?


It should be possible to do for messages with off_heap data. However, for an on_heap message queue it is only easy to get the messages that ended up off-heap; the binaries of messages that are on-heap would not be part of the result. This makes the API hard to explain, which is not good.

Also this inspection would again only be for the internal message queue.
 

 

> - The documentation suggests using off_heap message_queue_data “if the process potentially can get many messages”, but this is rather vague. Does this refer to processes expected to have many *queued* messages? I guess a process receiving but not queuing many messages performs better with on_heap (“Performance of the actual message passing is however generally better when not using flag off_heap.”).


Yes, it does. I'll try to clarify that in the documentation.

Lukas
