-protocol declartion idea, worth an EEP?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

-protocol declartion idea, worth an EEP?

Richard A. O'Keefe-2
One of the things I've wanted for a long time in Erlang is
a way to check message passing.  Now that we have the
Dialyzer, it occurred to me that there's a simple thing we
could do that wouldn't achieve everything we might want
(it's not UBF) but might be useful in catching typos at least.

Before writing it up as an EEP, I thought I'm check with the
mailing list whether it makes sense to other people.  It seems
like the kind of thing that must have been thought of before,
so maybe there is some flaw in it that I'm not seeing.

-protocol(<type>).
says that the following function and the functions it calls
may only receive messages that conform to the <type>.
Taking the <type> as a union type (counting a type that is
not a union type as a union of one element), each element
of the union must have at least one corresponding receive.
[I know 'corresponding' is vague; this is one of the things
the EEP would spell out.]

What if a function calls another that is not known until
run time?  That requires a change to the type language:

Fun :: fun( OptRcv)                    %% any function
       | fun((...) -> Type OptRcv)     %% any arity, returning Type
       | fun(() -> Type OptRcv)
       | fun((TList) -> Type OptRcv)


OptRcv :: /* empty */
        | receive Type

If you want to specify that a function doesn't receive
anything, use 'receive none()'; any empty OptRcv says nothing
about the function's protocol.

-spec could be extended too.
  -spec Module:Function(ArgType1, ..., ArgTypeN) -> ReturnType OptRcv.

This would not let us check that sends and receives matched up
properly, although part 2 of the EEP would address that.  But
it would provide
(1) a place to document the programmer's intentions
    about the messages a process should receive
(2) a way to check that patterns in receives were supposed to
    be there, catching typos
(3) a way to check that everything that is supposed to be
    received might be received somewhere; this would have
    false negatives, but it's still better than what we have.




_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Joe Harrison
> One of the things I've wanted for a long time in Erlang is
> a way to check message passing. Now that we have the
> Dialyzer, it occurred to me that there's a simple thing we
> could do that wouldn't achieve everything we might want
> (it's not UBF) but might be useful in catching typos at least.

I'm actually working on exactly this at the moment. Though I'm
not relying on Dialyzer, there has been work on using Dialyzer
for this purpose:

"Detection of Asynchronous Message Passing Errors Using Static
Analysis" by M. Christakis and K. Sagonas

> Before writing it up as an EEP, I thought I'm check with the
> mailing list whether it makes sense to other people. It seems
> like the kind of thing that must have been thought of before,
> so maybe there is some flaw in it that I'm not seeing.

The problem I see is simple. If you have an irrefutable pattern,
it covers *all* of your receive specifications:

receive
  A -> ok
end.

> If you want to specify that a function doesn't receive
> anything, use 'receive none()'; any empty OptRcv says nothing
> about the function's protocol.

I disagree. This would mean that all "pure" functions require
annotations. This seems pretty tedious.

> This would not let us check that sends and receives matched up
> properly, although part 2 of the EEP would address that. But
> it would provide
> (1) a place to document the programmer's intentions
>     about the messages a process should receive

I love this style of programming. We should always let the
programmer communicate their intentions to the compiler.

> (2) a way to check that patterns in receives were supposed to
>     be there, catching typos

See my irrefutable pattern comment above.
I do however, think it would be very useful in a classroom
setting. We teach our undergraduate students Erlang and
one of my anecdotal observations is something similar to
the following:

client(Server) ->
  Server ! hello,
  receive
    akc -> ok
  end,
  client(Server).

server() ->
  receive
    {hello, Client} ->
      Client ! ack
  end,
  server().

In these settings I believe the analysis would be useful.
In OTP settings, I think that the headers of the
handle_call, handle_cast, and handle_info functions serve
as informal receive specifications.

> (3) a way to check that everything that is supposed to be
>     received might be received somewhere; this would have
>     false negatives, but it's still better than what we have.

I agree, it's better than nothing. I feel like the biggest
point of contention would be the style of the receive
specifications, and I would like to see this as an EEP :)

- Joe
https://cs.kent.ac.uk/~jrh53
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Richard A. O'Keefe-2
> The problem I see is simple. If you have an irrefutable pattern,
> it covers *all* of your receive specifications:
>
> receive
>   A -> ok
> end.

Or more generally, the "catch-all" idiom
   receive
      good1 -> ...
    ; good2 -> ...
    ; _ -> loop(...) % discard bad messages
   end
I was thinking of having a macro ?CATCHALL for this,
with different translations for ordinary compilation
(_) and checking (something else).  Or perhaps more
simply, treating a single wild-card in receive as a
special case.

>
>> If you want to specify that a function doesn't receive
>> anything, use 'receive none()'; any empty OptRcv says nothing
>> about the function's protocol.
>
> I disagree. This would mean that all "pure" functions require
> annotations. This seems pretty tedious.

The problem is that right now fun() and -spec say *nothing*
about reception.  Backwards compatibility seems to require
not changing that.

The idea was that a checker should *infer* 'receive none()'
for pure functions, and that this should *refine* a given
-spec.

One thing I was after was that you should normally require
just one -protocol declaration per process, with the
receive part of the functions it calls being inferred.



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Joe Harrison
> Or more generally, the "catch-all" idiom
>    receive
>       good1 -> ...
>     ; good2 -> ...
>     ; _ -> loop(...) % discard bad messages
>    end

Yes, definitely. I was assuming that A was unbound.

> I was thinking of having a macro ?CATCHALL for this,
> with different translations for ordinary compilation
> (_) and checking (something else). Or perhaps more
> simply, treating a single wild-card in receive as a
> special case.

You're right. Wildcards are definitely a special case that
can hide a lot of (dangerous) communication problems.
Though I'm not clear where this macro should be: in the
protocol spec or the receive?

> The problem is that right now fun() and -spec say *nothing*
> about reception. Backwards compatibility seems to require
> not changing that.
>
> The idea was that a checker should *infer* 'receive none()'
> for pure functions, and that this should *refine* a given
> -spec.

Oh right. That makes much more sense :)

> One thing I was after was that you should normally require
> just one -protocol declaration per process, with the
> receive part of the functions it calls being inferred.

Yep! It seems like a good idea to make these compositional.

- Joe
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Michael Truog
In reply to this post by Richard A. O'Keefe-2
On 06/05/2017 06:16 AM, [hidden email] wrote:

>>> If you want to specify that a function doesn't receive
>>> anything, use 'receive none()'; any empty OptRcv says nothing
>>> about the function's protocol.
>> I disagree. This would mean that all "pure" functions require
>> annotations. This seems pretty tedious.
> The problem is that right now fun() and -spec say *nothing*
> about reception.  Backwards compatibility seems to require
> not changing that.
>
> The idea was that a checker should *infer* 'receive none()'
> for pure functions, and that this should *refine* a given
> -spec.
>
> One thing I was after was that you should normally require
> just one -protocol declaration per process, with the
> receive part of the functions it calls being inferred.

I believe it is more important to be able to mark functions impure or pure within the Erlang type specification as described at https://bugs.erlang.org/browse/ERL-373 than to clarify the received messages, though it would be nice to have both.  In practice, the code paths taken with a receive won't benefit clearly with the -protocol addition, when considering how receives are normally buried in an Erlang behaviour (it is considered beneficial to use Erlang behaviours in practice, and often explicit receives are typical in academic exercises which may refer back to Erlang in earlier days before OTP existed).  If we could mark functions as impure or pure and have it inferred/checked with dialyzer, there would always be a clear benefit at any level of abstraction.



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

zxq9-2
On 2017年06月05日 月曜日 10:35:40 Michael Truog wrote:

> On 06/05/2017 06:16 AM, [hidden email] wrote:
> >>> If you want to specify that a function doesn't receive
> >>> anything, use 'receive none()'; any empty OptRcv says nothing
> >>> about the function's protocol.
> >> I disagree. This would mean that all "pure" functions require
> >> annotations. This seems pretty tedious.
> > The problem is that right now fun() and -spec say *nothing*
> > about reception.  Backwards compatibility seems to require
> > not changing that.
> >
> > The idea was that a checker should *infer* 'receive none()'
> > for pure functions, and that this should *refine* a given
> > -spec.
> >
> > One thing I was after was that you should normally require
> > just one -protocol declaration per process, with the
> > receive part of the functions it calls being inferred.
>
> I believe it is more important to be able to mark functions impure or pure within the Erlang type specification as described at https://bugs.erlang.org/browse/ERL-373 than to clarify the received messages, though it would be nice to have both.  In practice, the code paths taken with a receive won't benefit clearly with the -protocol addition, when considering how receives are normally buried in an Erlang behaviour (it is considered beneficial to use Erlang behaviours in practice, and often explicit receives are typical in academic exercises which may refer back to Erlang in earlier days before OTP existed).  If we could mark functions as impure or pure and have it inferred/checked with dialyzer, there would always be a clear benefit at any level of abstraction.

I totally agree, to the point that I often annotate pure functions:
https://github.com/zxq9/zuuid/blob/master/src/zuuid.erl#L34-L39

BUT

These are two very different issues. What ROK is getting at is having a way to declare a message protocol (as a verifiable contract) by which a certain process is going to live and abide by (at the "living objects" level of the system), whereas declaring functional purity is strictly functional annotation. These are entirely orthogonal issues, but both absolutely critical to providing a way for a programmer to formally declare the intent of some code and have that intent checked by a tool like Dialyzer.

That said...

OTP and extension of generic behaviors (OTP or otherwise) is an interesting issue because while it is often useful to have one module in src/ equate to one type of process that will be spawned, leveraging gen_server with another layer of abstraction is not uncommon (consider wx) -- and neither is writing OTP-compliant pure Erlang processes that contain their own naked 'receive' clauses when dealing with code that needs to have extremely low latency (tight socket handling code comes to mind). That means that the parts of a process' definition that might include informal declarations of valid incoming and outgoing message forms may be spread across several code modules and shared across processes -- and this gets tricky to determine how to define. Even a good syntax for how to write this is not easy to pull out of a hat.

The traditional way to sort of wish the protocol message adherence issue away has been to use behaviors and abstract all message passing behind public interface functions that are as explicitly typed as one can manage -- and yet this still fails to really give us a solid contract by which processes have to live in a way that dialyzer can check for us, as it forces side-effecty message-involved functions to be conflated with any other functions. (This approach is also one that seems to be heavily used and almost never discussed in these terms.)

Hmmm...

Anyway, I certainly think a -pure declaration in Dialyzer would be much easier to implement in the short-term than a process-level message protocol declaration. But I deeply wish we had both already -- and always have.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Richard A. O'Keefe-2
In reply to this post by Michael Truog

> On 6/06/2017, at 5:35 AM, Michael Truog <[hidden email]> wrote:
> I believe it is more important to be able to mark functions impure or pure

Actually that was something I was working on with SERC just before
funs were introduced into Erlang and completely derailed my ideas.

One issue is that 'pure' or 'impure' is far too coarse a classification.
For example, we might want to distinguish
 - termination
   - terminates in bounded time
   - terminates in finite time
   - not known to terminate
 - process dictionary
   - no dependency on process dictionary
   - reads specified parts of process dictionary
   - writes (and maybe reads) specified parts of process dictionary
 - input/output
   - does no I/O
   - reads
   - writes
   - opens and closes
 - inter-process communication
   - does it create processes or not
   - does it send messages or not
   - does it receive messages (which?) or not
And that isn't meant to be a complete list.

Another issue is higher-order functions.

map(F, [X|Xs]) -> [F(X) | map(F, Xs)];
map(_, [])     -> [].

is itself pure, but if F has any effects, map(F, ...) may
(but need not) have those effects too.  So we need a type
like
map/2 :: (A -> B effect E, list(A)) -> list(B) effect E+finite.

Since then, Erlang has gone through at least three different
type systems (the Wadler one, the Armstrong one, and the one
it now has), and I never did get around to trying to adapt a
proper "effect" system to the Dialyzer approach.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Michael Truog
On 06/05/2017 03:42 PM, Richard A. O'Keefe wrote:
>> On 6/06/2017, at 5:35 AM, Michael Truog <[hidden email]> wrote:
>> I believe it is more important to be able to mark functions impure or pure
> Actually that was something I was working on with SERC just before
> funs were introduced into Erlang and completely derailed my ideas.
>
> One issue is that 'pure' or 'impure' is far too coarse a classification.
It would be nice to have finer granularity than just 'pure' or 'impure', though it seems
very easy to make this purity concept complex enough that people don't use it.
Having the ability to distinguish between "operational purity" and
"mathematical purity" (as they were described in https://bugs.erlang.org/browse/ERL-373)
would be interesting, though it would go beyond the concept of purity that is
present in Haskell, which is already complex, and would require that people
care about the difference (I don't think most people would care, in practice,
since their concerns would be likely focused only on their logic they are
attempting to develop, while they assume the compilation platform remains
(generally) constant).  I would expect that Erlang, as a dynamically typed
language, would favor simplicity in the purity concept, instead of providing
more granularity, though I personally would like more granularity
(i.e., to keep the function types in any Erlang system simple, and able to
handle future changes more easily, even if the changes span separate
module versions and Erlang versions).

> For example, we might want to distinguish
>   - termination
>     - terminates in bounded time
>     - terminates in finite time
>     - not known to terminate
I would assume most analysis of normal source code wouldn't be able to say
anything too useful about termination (due to normal complexity), but I don't
know much about this.  I mainly just consider that dialyzer is already pushing
speed and memory limits which shouldn't be taxed much more (especially
if this would only ever work on toy examples).  I would rather that dialyzer is
always part of compilation (even if it always remains a separate executable)
to get similar lint-ing when compared to statically typed languages.

>   - process dictionary
>     - no dependency on process dictionary
>     - reads specified parts of process dictionary
>     - writes (and maybe reads) specified parts of process dictionary
I like the idea of being able to distinguish between local mutable data
(the process dictionary) and global mutable data (ets, etc.), to show the
scope of the mutability used in the function.  It is likely reasonable to say
global mutable data use permits local mutable data use, where global
mutable data use is considered "more impure" than local mutable data
use (i.e., using ets allows you to use the process dictionary, without
being specific about using the process dictionary).  Having separate
levels of impurity could convey an increasing potential for error
(more impurity requires more testing at a system level to ensure state
is being managed according to expectations).

>   - input/output
>     - does no I/O
>     - reads
>     - writes
>     - opens and closes
>   - inter-process communication
>     - does it create processes or not
>     - does it send messages or not
>     - does it receive messages (which?) or not
I think it is best to group I/O as it relates to the Erlang port type,
so sockets and Erlang port processes would be treated the same.
That I/O is separate from Erlang messages, which relates more to the
-protocol declaration idea.

> And that isn't meant to be a complete list.
>
> Another issue is higher-order functions.
>
> map(F, [X|Xs]) -> [F(X) | map(F, Xs)];
> map(_, [])     -> [].
>
> is itself pure, but if F has any effects, map(F, ...) may
> (but need not) have those effects too.  So we need a type
> like
> map/2 :: (A -> B effect E, list(A)) -> list(B) effect E+finite.
If dialyzer could infer the purity from the anonymous function,
then we should be able to avoid making the syntax for the type
specification more complex.  That may require that functions
never default to being "impure" and just stay ambiguous, until
it can be inferred from type specifications based on the call path
(or things like catching exceptions, local/global mutable state, etc.).
>
> Since then, Erlang has gone through at least three different
> type systems (the Wadler one, the Armstrong one, and the one
> it now has), and I never did get around to trying to adapt a
> proper "effect" system to the Dialyzer approach.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: -protocol declartion idea, worth an EEP?

Richard A. O'Keefe-2

> On 6/06/2017, at 1:55 PM, Michael Truog <[hidden email]> wrote:
>
>>
> It would be nice to have finer granularity than just 'pure' or 'impure', though it seems
> very easy to make this purity concept complex enough that people don't use it.

Fair comment.  I guess my point is not clear.  Here it is explicitly:
if someone calls an Erlang function "pure", I don't know what they mean.
As a specific example, I don't know whether
  X = f(A), g(), Y = f(A)
guarantees that X =:= Y.

Why might it not?  Because while f/1 might not *change* anything,
it might *read* mutable data which g/0 might change.

> I would assume most analysis of normal source code wouldn't be able to say
> anything too useful about termination (due to normal complexity), but I don't
> know much about this.

The Mercury people added termination analysis to their compiler
and were able to automatically show termination for about 70% of
the procedures they checked.  That relied on mode and type analysis,
but Erlang doesn't have modes.

The one difference in Erlang of course is that people deliberately
write functions (the dispatch loops of processes) which aren't
*meant* to terminate (except when told to by a message).

[would be nice to have dialyzer always part of compilation]
agreed.
[simple purity hierarchy]
agreed.
[simple type system desirable]
as long as it does the job, agreed. The first requirement of any
type system is that programmers should understand it well enough
to use it effectively.



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Loading...