-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
process_info/2 returns empty messages list #7413
Comments
Thanks for your report! It does return all the messages in the mailbox, it just so happens to be empty after a second even though you have sent messages to said process. As per the documentation, emphasis mine:
Despite occurring a second later, the Note that the latter applies to the other process as well: if it calls |
Thanks for your response! But And there is no mention about the difference in behavior between |
There is no difference in behavior, you can get the same result with an ordinary receive as well. What you're missing is that
|
Thank you for your answer! I tried to call
No matter how the delay is long, it needs about 1198 times to call
|
I researched this a bit on a basis of @door example: http://tryerl.seriyps.ru/#id=dab4. -module(main).
-export([main/0]).
main() ->
Self = self(),
%% WORKAROUND 1: off-heap MQD fixes getting own messages
%process_flag(message_queue_data, off_heap),
%% UN-WORKAROUND 1: io:format removes the positive effect of the off-heap MQD
%io:format("xx\n"),
%% UN-WORKAROUND 2: any non-zero sleep breaks things again as well
%receive after 1 -> broken end,
%timer:sleep(1),
%% External printer to avoid receiving messages
P = spawn(fun Pr() -> receive {Tag, Term} -> io:format("~w ~p~n", [Tag, Term]), Pr() end end),
%% External process which sends messages to this process
spawn(fun() -> Self ! foo, Self ! bar, receive _ -> ok end end),
%% External process calling process_info on this one
spawn(fun() -> timer:sleep(1100), P ! {r, process_info(Self, [messages])} end),
%% After 1 second messages are definitely in the queue
timer:sleep(1000),
%% WORKAROUND 2: Searching in the message queue fixes later process_info
%receive {{{nonsense}}} -> ok after 0 -> ok end,
%% WORKAROUND 3: GC fixes later process_info
%garbage_collect(),
%% process_info on self consistently sees an empty mailbox
P ! {l1, process_info(Self, [ message_queue_len, messages])},
erlang:yield(),
receive after 0 -> ok end,
timer:sleep(30),
P ! {l2, process_info(Self, [message_queue_len, messages])},
%% After an other process gets the process_info of this one, process_info on self works too
timer:sleep(200),
P ! {l3, process_info(Self, [message_queue_len, messages])},
timer:sleep(400),
ok. This example shows that
It may be related to mechanism of moving messages from signal queue to message queue. This may affect code which uses process_info for overload protection:
@jhogberg can you please evaluate again a severity of this issue? |
I found that logger's overload protection uses -module(olptest).
-export([check/0]).
% Callbacks
-export([init/1, handle_info/2, handle_load/2]).
init(#{latency := _, printer := _} = Opts) ->
{ok, Opts}.
handle_info(Msg, #{latency := Lat, printer := Printer} = State) ->
Printer ! {info, Msg},
timer:sleep(Lat),
{load, State}.
handle_load(Msg, #{printer := Printer} = State) ->
Printer ! {load, Msg},
State.
check() ->
% Print via external process to avoid calling receive on synchronous calls
Printer = spawn_link(fun P() -> receive Msg -> io:format("handle ~0p~n", [Msg]) end, P() end),
% When the message queue is larger than 5, OLP must raise an error
{ok, Pid, _} = logger_olp:start_link(test_olp, ?MODULE, #{latency => 200, printer => Printer}, #{overload_kill_qlen => 5, overload_kill_enable => true}),
% To avoid races, first pack is sent while the OLP is suspended.
% This is not necessary on my computer, but the number of messages accepted at first, may vary
erlang:suspend_process(Pid),
[Pid ! {msg, N} || N <- lists:seq(1, 5)],
io:format("Resuming OLP with 5 messages in queue~n"),
erlang:resume_process(Pid),
% When OLP is already processing the first message, add 10 more messages
timer:sleep(100),
[Pid ! {msg, N} || N <- lists:seq(6, 15)],
io:format("Added 10 more messages~n"),
% And when it is processing the second message, add another 10
timer:sleep(100),
[Pid ! {msg, N} || N <- lists:seq(16, 25)],
io:format("Added 10 messages again~n"),
{ok, Pid, Printer}. Here, I configure logger_olp to crash if message_queue_len is larger than 5.
|
It's not a bug. @rickard-green is working on a signal queue monitoring feature that will be better suited for checking overload than polling with |
This is not a bug as @jhogberg stated. I can understand that it is somewhat surprising, though. However, the examples above are good examples of what not to do. That is, not execute However, we will have a look at a solution that will cut the very long times for outstanding messages that can appear when no |
Thank you for the explanation! I do understand the performance penalty of always giving the exact message queue information. As just a user I expect sent message to be observable. Now it just "disappears" (if I use documented API). As for
|
I don't think this would work well. Simplifying things a bit, we'd either have to break the signal order to do this (non-starter since this would affect the target process), or stop at the first non-message signal we encounter (making more or less it useless for processes whose clients tend to use monitors with their requests, e.g.
The reason we haven't done this is because it would be very expensive, particularly for processes that receive messages from many different processes which tend to be those you want overload protection for to begin with. You will have to take 1-64 locks (depending on the parallel signal optimization) to move signals from the outer to the middle queue, then linearly walk through the signals in the middle queue to figure out how many there are, and then do the same for the message queue ("inner queue") itself. Oh, and all this work is done on the target process and not the one making the request. There's lots of software in the wild that calls |
Thanks!
Yep, that was slow and caused heavy lock contention in our project few years ago. |
@door @stolen The PR #7595 address the potentially long message signal delivery times. I've reconsidered this a bit and consider current behavior a bug, since the message signal delivery time potentially could get infinite. We will release a patch on OTP 25 and OTP 26. I cannot say how quickly the patches will be out, though. The code in #7595 is based on OTP 25.3.2 and should cleanly merge on top of all larger OTP versions if you want to try it out on another version. |
The fixes in PR #7595 have been released in OTP 25.3.2.6 and OTP 26.1. |
Describe the bug
process_info(self(),messages)
returns empty list in some cases.To Reproduce
Expected behavior
process_info(self(),messages)
always returns all messages contained in its mailbox.Affected versions
26.0.1
25.3.2 (partially)
Additional context
I checked it on linux, from shell:
rm -f check_messages.beam && erlc check_messages.erl && erl -noinput -run check_messages check_messages -run erlang halt
and from erlang repl:
1> check_messages:check_messages().
Version 26.0.1 fails in both cases, 25.3.2 gives error in erlang repl but not in shell, and 24.3.4.6 has no error in both cases.
The text was updated successfully, but these errors were encountered: