Must and May convention


Re: Fwd: Must and May convention

Roman Galeev
> No messaging, no external resource calls, no ets, no nifs, certain functions in io and io_lib are out, etc.

So each function sending a message out is impure by definition, therefore all gen_server calls are impure too. Meaning -pure declarations are applicable only to a narrow subset of all functions defined. I just wonder how many paths there are in a graph of calls not involving the gen_server:* family in a typical Erlang application.


On Thu, Sep 28, 2017 at 11:16 AM, zxq9 <[hidden email]> wrote:
On Thursday 2017-09-28 10:49:21 Karlo Kuna wrote:
> > > -pure([f/1, g/0, h/3]).
> >
> > So, you declare f/1 is pure, but how do you really know it is pure?
> > Meaning it should be a way to prove f/1 is pure, at least with some tool
> > like dialyzer
> >
>
> if f/1 is a NIF there is little hope of proving that,
> so in general i think it should be either provable (meaning it is not a NIF
> and uses only pure functions) or declared as such
>
> i think we should go with "trust" here. If a function is a NIF it should be
> declared as pure; otherwise the tool is to assume it is not

Right. This is actually the core issue with Dialyzer in general, because
it is a permissive typer that assumes anything is acceptable unless it
can prove by inference that something is impossible OR the user has
written a typespec to guide it and specify the wrongness. Those
specifications are essentially arbitrary in most cases.

This is the exact opposite of the approach used by GHC, for example. But for
the way Erlang and OTP work at their core it is necessary to have
permissive type checks guided by annotations.

NOTHING SAYS THOSE ANNOTATIONS MIGHT NOT BE WRONG.

This is also true of -pure declarations.

What you CAN prove, however, is that no definitely IMPURE functions
are called from within a function marked as -pure.

Which means Dialyzer would emit an error on any -pure function calling
a function not also considered -pure.
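As a sketch of what that check could look at (hypothetical: no -pure attribute exists in Erlang/OTP today, and the module name is invented; the compiler simply ignores attributes it does not recognize):

```erlang
%% Hypothetical sketch: -pure is not a real OTP attribute, but Erlang
%% accepts arbitrary module attributes, so this compiles as-is today.
-module(pure_demo).
-export([double_all/1, log_all/1]).
-pure([double_all/1]).

%% Uses only arithmetic and list syntax: a checker could prove this pure.
double_all(Ns) ->
    [N * 2 || N <- Ns].

%% Calls io:format/2, a known-impure function, so a checker would reject
%% any attempt to list log_all/1 in -pure above.
log_all(Ns) ->
    io:format("~p~n", [Ns]),
    double_all(Ns).
```

A Dialyzer-style pass could then walk the call graph and flag any function listed in -pure that reaches a known-impure BIF or stdlib function.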

As noted before, much of the stdlib is pure, so quite a lot of code
could become provably pure quite quickly.

No messaging, no external resource calls, no ets, no nifs, certain
functions in io and io_lib are out, etc.

There may be a way to sneak a side effect in despite this, but I think
you could get CLOSE to having a provable graph of pure function calls.

BUT

That is true ONLY IF they are all actually fully qualified calls.
And this is also the problem with trying to make Dialyzer more strict.
(Also, dynamic upgrades are another impossible case to prove.)

Any time you do something like call apply(M, F, A) or Module:foo() you
suddenly can't know for sure what just happened. That module could have
come in dynamically. The only thing Dialyzer can know about it is that, outside
of the runtime, the module itself is checked and it has a declaration.
But that's not the same thing as proving what the return type or external
effects of the ambiguous call site might be, so its return handler has
to be specced also. This is pretty much what happens throughout every
gen_server you ever write: the dispatching handle_* functions are quite
ambiguous, but the implementation functions below can be specified and
the interface functions that make the gen_*:call or gen_*:cast calls
can be specified as accepting and returning specific types -- so you
constrain the possibilities tremendously.
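A minimal sketch of that constraining pattern (module, function names, and messages are invented for illustration): the dynamic dispatch inside gen_server stays ambiguous, but the specs on the interface and implementation functions pin the types down.

```erlang
-module(kv).
-behaviour(gen_server).
-export([start_link/0, fetch/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Interface functions: their -specs constrain what the otherwise opaque
%% gen_server:call/2 round trip is allowed to mean to callers.
-spec start_link() -> {ok, pid()}.
start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, #{}, []).

-spec fetch(term()) -> {ok, term()} | {error, not_found}.
fetch(Key) -> gen_server:call(?MODULE, {fetch, Key}).

init(Map) -> {ok, Map#{answer => 42}}.

%% Dispatching callback: ambiguous on its own, but it only routes to
%% implementation code whose types are fixed below.
handle_call({fetch, Key}, _From, Map) ->
    {reply, lookup(Key, Map), Map}.

handle_cast(_Msg, Map) -> {noreply, Map}.

%% Implementation function: pure and fully specifiable.
-spec lookup(term(), map()) -> {ok, term()} | {error, not_found}.
lookup(Key, Map) ->
    case Map of
        #{Key := V} -> {ok, V};
        _ -> {error, not_found}
    end.
```

Dialyzer can check every caller of fetch/1 against its spec even though the call site inside gen_server itself is dynamic.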

This happens every time you write a set of callbacks.
Part of what -callback declarations in behavior defining modules do for
you is typespec a callback -- but you can write callbacks without any
of that (and lots of people do) and that leaves Dialyzer blind. And
consider, of course, that the generic callback declarations for things
like gen_server are deliberately and arbitrarily broad. So to constrain
those you have to guard against type explosions in the interface and
implementation functions.

Nothing is going to fix that but hints like -pure and -spec and -callback.

And that means Dialyzer must continue to be permissive by default and
trust the programmer to be accurate about typespecs in their code.

So "proofing" is not attainable, no matter what, if you mean it in the
strict sense. But seriously, this is an industrial language hellbent on
getting code out the door that solves human problems. So forget about
silly notions of type completeness. That's for academia. In the real world
we have confusing problems and convoluted tools -- this one is
dramatically less convoluted than MOST tools in this domain of solutions
and it can AT LEAST give us a warning when it can prove that you've done
something that is clearly in error.

The kind of super late binding that dynamic module loading and dynamic
call sites imply prohibit proving much about them -- at least in Erlang.
Yet they still work quite well.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions



--
With best regards,
     Roman Galeev,
     +420 702 817 968


Re: Fwd: Must and May convention

Karlo Kuna
For me the benefit would be to narrow down what IS pure versus what the developer THINKS is pure;
just with that, IMHO, a lot of common errors could be detected.

Developers would then be guided during debugging toward the parts that are not pure, also a big help!



Re: Fwd: Must and May convention

zxq9-2
In reply to this post by Roman Galeev
On Thursday 2017-09-28 11:38:08 you wrote:
> > No messaging, no external resource calls, no ets, no nifs, certain
> functions in io and io_lib are out, etc.
>
> So each function sending a message out is impure by definition, therefore
> all gen_server calls are impure too. Meaning -pure declarations are
> applicable only to a narrow subset of all functions defined. I just wonder
> how many paths are in a graph of calls not involving calls to gen_server:*
> family in a typical Erlang application.

Yep.

What winds up being pure is your implementation of how you react to all
those messages.

If you write code with the deliberate intention of making as much of
your code pure as possible you wind up with quite a few very happy side
effects (no pun intended). You start building side-effecty message and
IO bussing procedures that involve dispatch to implementation functions
that do the actual work of processing your values -- and all of those
become provable. This naturally separates transport from processing,
and a set of useful outcomes within your own mind and your development
team tend to result.
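A tiny sketch of that transport/processing split (module and names invented): the callback clause only busses messages, while the pure function below it is the part a checker or property test can attack directly.

```erlang
%% Sketch of separating transport from processing.
-module(stats).
-export([handle_call/3, mean/1]).

%% Transport: receives, dispatches, replies. Timing and senders are
%% unprovable, so keep this layer as thin as possible.
handle_call({mean, Ns}, _From, State) ->
    {reply, mean(Ns), State};
handle_call(_Other, _From, State) ->
    {reply, {error, bad_request}, State}.

%% Processing: sequential and pure, so it can be proven and
%% property-tested in isolation.
mean([]) -> {error, empty};
mean(Ns) -> {ok, lists:sum(Ns) / length(Ns)}.
```

stats:mean/1 can be hammered by a property test on its own; only the thin handle_call/3 layer is left unprovable.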

I don't know how to go about quantifying those effects -- but doing things
like onboarding new team members is markedly easier and the time between
them showing up (perhaps not knowing Erlang) and contributing more than
distractions to the effort is MUCH shorter.

The bottom line is that passing stuff around is NOT the main job of your
program in most cases; managing some problem of the real world usually is.
Incident to that, you model the problem in two ways: the path and manner
of how messages move around, and within processes how the values involved
are processed. The value processing parts are sequential, procedural, and
can be proven and made pure. The message passing is messy, unprovable,
in a sense unreliable, subject to random timing, and a bunch of other
effects that derive from the fact that the real world actually works that
way, and these aspects make that half of any concurrent solution chaotic
and unprovable.

You'll get a lot more mileage out of a pure type graph in, say, a business
data processing system (something like 80/20 pure/impure) than a chat or
comment threading system (something more like 20/80). In simulation systems
like game servers you get about 50/50 -- which is still a huge win because
a lot of the processing involved is subtle and really needs to be property
tested lest you invent some wacky emergent behaviors that players will
exploit almost instantly as if by magic.

These are tools to make better software and, most importantly, help us
understand what we are actually trying to do. In this sort of inherently
messy effort the perfect is absolutely the enemy of the good.

-Craig

Re: Fwd: Must and May convention

Roman Galeev
Interesting. Not so strictly speaking, some gen_server calls that do not change its state may be seen as 'pure', albeit relying on messaging, and a tool could probably figure out such calls automatically. And this could definitely help to figure out what's going on in a project.


Re: Fwd: Must and May convention

zxq9-2
On Thursday 2017-09-28 12:32:32 you wrote:
> Interesting. Not so strictly speaking, some of the gen_server calls that do
> not change its state may be seen as 'pure' albeit relying on messaging, and
> a tool could probably figure out such calls automatically. And this
> definitely could help to figure out what's going on in a project.

Perhaps -- but you probably do NOT want a naked call there, because there
is no guarantee the process you called actually exists. Or is on the same
node. Or is calling some external resource not even related to Erlang to
figure out its response.

We could imagine an intermediate category of calls that are a sort of
"pure RPC", but the utility of this distinction may be pretty limited --
we can't (and shouldn't) know what is going on inside another process.

Anyway, it is worth exploring the idea, maybe. At the moment I'm looking
at a very limited goal: getting pure declarations to be a real thing. If
I had more time I would formally propose it and work on implementation --
but I have my hands full and a HUGE backlog of other, slightly higher
priority things I think will improve Erlang tooling at a more immediate
level of impact.

-Craig

Re: Must and May convention

scott ribe
In reply to this post by zxq9-2
On Sep 28, 2017, at 12:14 AM, zxq9 <[hidden email]> wrote:
>
> unless bangs are acceptable characters to include in function names (are they?).

Yes, they are. As with Lisp-derived languages, where it's common to have ! and ?... So it's just a naming convention.

--
Scott Ribe
[hidden email]
(303) 722-0567


Re: Fwd: Must and May convention

Roman Galeev
In reply to this post by zxq9-2
> We could imagine an intermediate category of calls that are a sort of "pure RPC"

Yeah, I've been thinking of something like that. Anyway, this 'pure RPC' notion could give enough clues to a developer, and to code analyzing tools.


Re: Must and May convention

zxq9-2
In reply to this post by scott ribe
On Thursday 2017-09-28 04:40:04 scott ribe wrote:
> On Sep 28, 2017, at 12:14 AM, zxq9 <[hidden email]> wrote:
> >
> > unless bangs are acceptable characters to include in function names (are they?).
>
> Yes, they are. As with Lisp-derived languages, where it's common to have ! and ?... So it's just a naming convention.

In that case, this being a mere convention, seeing

  Blah = foo!()

instead of

  Blah = foo()

actually tells me LESS than seeing

  {ok, Value} = foo()

which will have a concrete effect both at runtime AND during static analysis.

-Craig

Re: Fwd: Must and May convention

Joe Armstrong-2
In reply to this post by Roman Galeev
A few more comments

All code might crash so you have to decide what to do.

I think of code as belonging to different categories:

a) Code that, if incorrect, will totally screw up your system
b) Code that, if incorrect, will screw up part but not all of your system
c) Code that, if you restart it, might work
d) Code that is in production so it should not crash
e) Code that is being developed so you expect the odd crash
    ...

It's part of the design to figure out which code should be in which category.

A few years ago, when I was helping build some pretty complex systems
we did this. What concerned me was the category a) code - this
is what I called the "error kernel" - a goal was to make this as small
as possible. This part was written by our most experienced programmers
and heavily checked.

Other code in the system had far less stringent requirements on quality.
For example, internationalization: error messages were in English, and we made
modules to turn them into different languages.

This code was not failure-critical - the default strategy if the code failed
was "revert to English" - so this code was done by our "beginners"

Part of system design is isolating the error kernel, making it as small as
possible, and getting it implemented by the best programmers.

All this supervision tree stuff is fine - but you do have to trust
that the code that implements supervision trees and gen_servers and so on is correct.

/Joe





Re: Must and May convention

Joe Armstrong-2
In reply to this post by zxq9-2
On Thu, Sep 28, 2017 at 8:14 AM, zxq9 <[hidden email]> wrote:

> On Thursday 2017-09-28 05:43:33 you wrote:
>> I dont think its been mentioned, elixir does this with the ! postfix.
>>
>> {ok, Bin} | {error, Reason} = File:read("my_file")
>>
>> Bin | throw/1 = File:read!("my_file")
>>
>>
>> Exactly as you said Mr. Armstrong, the former is more clear off the bat but the latter gives you a nice error (with the filename!).
>>
>> Which do I prefer?  It seems both are useful in certain cases, and one should not replace the other as the absolute truth. If
>> an absolute truth were to be arrived at, then I would vote for the option to have both! A way to call any STDLIB function and have it return a tuple, or throw/exception.
>
> Elixir may do that, but I think adding a magical glyph to the syntax is almost universally a mistake. Not because this one case is right or wrong, but because the thought process underlying it causes languages to eventually suffocate themselves -- as you may have begun noticing... And no, adding a bang at the end of a function call is NOT the same as naming two separate functions, unless bangs are acceptable characters to include in function names (are they?).
>
> If they are, some fun names might result:
>
>   launch!the!missiles!(BadGuys)
>
>   clean_your_room!dammit!(Kids)
>
> If they are legal characters, though, then what you are referring to is a mere coding convention -- arguably of less utility than matching wrapped values VS receiving naked ones, which has a concrete runtime effect (and is understood by the type system).
>
> If they are not legal characters and are actually a part of the core language (separate from the type system language) then... that's something I think should be contemplated for a HUGE amount of time before dropping into the language (things like that should be argued for years on end before just being picked as some new idea -- unless it is a research language, then, meh).
>
> That said, the point I was making is not that we should have both because everything is all the same and all functions should be treated equally and we really really need more diversity in the way we handle function call returns. The point I was making is that some functions are CLEARLY PURE and that is just an innate part of their nature. Other functions have side effects and, systems using runtime prayers/monadic returns aside, there is just no way to make a side effecty function pure.
>
> On that note, in a practical sense, when we return {Reply, NewState} in a gen_server we are sort of doing what IO monads do for Haskell. There is NO GUARANTEE from the perspective of the called function that the Reply value will actually be returned to the caller. That is obviously the intent, but it is by no means clear. This property also makes it very convenient to hook such functions up to a property tester. On the other hand, the tuple returned itself can be viewed as a naked value.
>
> Returning naked values mandates purity or total crashability of the system. Nothing in between. Unless you want to play "confuse the new guy".
>
> When we have side effects, such as file operations, there is no difference between calling
>
>   {ok, Bin} = file:read_file(Path)
>
> and
>
>   Bin = file!:read_file(Path)
>
> That's a frivolous reason to add a glyph to the language. The assertion at the time of return already crashes on an error.

Actually I half like this, but would prefer Bin = file:read_file!(P) - the problem
is that ! is a weak typographic character - so it's easy to miss, and ! means
send, so we might (or rather will) confuse beginners.

I guess we could say

     Bin = MUST file:read_file(Path)

Easy to implement and it stands out nicely so you can see it from a
long way off.
Loosely scanning many lines of code and seeing MUST would be nice,
it could be used to signify which parts of the code must be correct.
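Something close to MUST can even be faked today with an ordinary macro (a sketch, not a proposal; the module and macro names are invented, though -define and the ??Expr stringification are standard preprocessor features):

```erlang
%% Sketch of MUST as a plain macro. ??Expr expands to the source text of
%% the argument, so the crash report names the expression that failed
%% along with the value it actually returned.
-module(must_demo).
-export([read/1]).

-define(MUST(Expr),
        case Expr of
            {ok, MUST_V__} -> MUST_V__;
            MUST_Other__ -> error({must_failed, ??Expr, MUST_Other__})
        end).

read(Path) ->
    ?MUST(file:read_file(Path)).
```

On failure this raises a {must_failed, ...} error carrying the source text of the failed expression and its actual return value, so the crash at least says what was supposed to succeed.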

I actually quite liked the Elixir pipe operator - so I changed the
Erlang parser and messed with this a bit.

So now I can write

test2(F) ->
    F  |> file:read_file()
        |> ok()
        |> tokenize()
        |> parse().

Instead of

test2(F) ->
    {ok, B} = file:read_file(F),
    T = tokenize(B),
    parse(T).

Where:

ok({ok,B}) ->
    B.

I can't make my mind up about this - the version with pipes is slightly longer
but seems very readable to me. Trouble is, if it crashes due to a non-existent
file, I can't see which file it is ...

/Joe






>
> And no, you don't "need both". That's WHY the return value is matched in the first place -- to leave it up to the caller how they want to handle a surprise {error, Reason}.
>
> Knowing that you are going to get a wrapped value is exactly how you know you are dealing with side effects somewhere, rendering both the return value issue moot and the question of whether the thing you're calling is pure or has a side effect.
>
> -Craig
> _______________________________________________
> erlang-questions mailing list
> [hidden email]
> http://erlang.org/mailman/listinfo/erlang-questions

Re: Fwd: Must and May convention

zxq9-2
In reply to this post by Joe Armstrong-2
On Thursday 2017-09-28 13:21:56 Joe Armstrong wrote:
> Part of system design is isolating the error-kernel making it as small as
> possible and getting it implemented by the best programmers.

Indeed. I wish this were an easy concept to inculcate.

> All this supervision tree stuff is fine - but you do have to trust
> that the code that implements supervision trees and gen_servers and so on
> is correct.

I strongly agree with this. This is also the reason I trust property
testing FAR more than a few hand-written unit tests, and user testing
of an integrated system far more than an automated anything.

-Craig

Re: Must and May convention

zxq9-2
In reply to this post by Joe Armstrong-2
On Thursday, 28 September 2017 13:36:30 Joe Armstrong wrote:

> On Thu, Sep 28, 2017 at 8:14 AM, zxq9 <[hidden email]> wrote:
> > On Thursday, 28 September 2017 05:43:33 you wrote:
> >> I don't think it's been mentioned, Elixir does this with the ! postfix.
> >>
> >> {ok, Bin} | {error, Reason} = File:read("my_file")
> >>
> >> Bin | throw/1 = File:read!("my_file")
> >>
> >>
> >> Exactly as you said Mr. Armstrong, the former is more clear off the bat but the latter gives you a nice error (with the filename!).
> >>
> >> Which do I prefer?  It seems both are useful in certain cases, and one should not replace the other as the absolute truth. If
> >> an absolute truth were to be arrived at, then I would vote for the option to have both! A way to call any STDLIB function and have it return a tuple, or throw/exception.
> >
> > Elixir may do that, but I think adding a magical glyph to the syntax is almost universally a mistake. Not because this one case is right or wrong, but because the thought process underlying it causes languages to eventually suffocate themselves -- as you may have begun noticing... And no, adding a bang at the end of a function call is NOT the same as naming two separate functions, unless bangs are acceptable characters to include in function names (are they?).
> >
> > If they are, some fun names might result:
> >
> >   launch!the!missiles!(BadGuys)
> >
> >   clean_your_room!dammit!(Kids)
> >
> > If they are legal characters, though, then what you are referring to is a mere coding convention -- arguably of less utility than matching wrapped values VS receiving naked ones, which has a concrete runtime effect (and is understood by the type system).
> >
> > If they are not legal characters and are actually a part of the core language (separate from the type system language) then... that's something I think should be contemplated for a HUGE amount of time before dropping into the language (things like that should be argued for years on end before just being picked as some new idea -- unless it is a research language, then, meh).
> >
> > That said, the point I was making is not that we should have both because everything is all the same and all functions should be treated equally and we really really need more diversity in the way we handle function call returns. The point I was making is that some functions are CLEARLY PURE and that is just an innate part of their nature. Other functions have side effects and, systems using runtime prayers/monadic returns aside, there is just no way to make a side effecty function pure.
> >
> > On that note, in a practical sense, when we return {Reply, NewState} in a gen_server we are sort of doing what IO monads do for Haskell. There is NO GUARANTEE from the perspective of the called function that the Reply value will actually be returned to the caller. That is obviously the intent, but it is by no means clear. This property also makes it very convenient to hook such functions up to a property tester. On the other hand, the tuple returned itself can be viewed as a naked value.
> >
> > Returning naked values mandates purity or total crashability of the system. Nothing in between. Unless you want to play "confuse the new guy".
> >
> > When we have side effects, such as file operations, there is no difference between calling
> >
> >   {ok, Bin} = file:read_file(Path)
> >
> > and
> >
> >   Bin = file!:read_file(Path)
> >
> > That's a frivolous reason to add a glyph to the language. The assertion at the time of return already crashes on an error.
>
> Actually I half like this but would prefer Bin=file:read_file!(P) - the problem
> is that ! is a weak typographic character - so it's easy to miss, and ! means
> send so we might (or rather will) confuse beginners.
>
> I guess we could say
>
>      Bin = MUST file:read_file(Path)
>
> Easy to implement and it stands out nicely so you can see it from a
> long way off.
> Loosely scanning many lines of code and seeing MUST would be nice,
> it could be used to signify which parts of the code must be correct.

I still don't see it as more clear to the eye than

  {ok, Bar} = foo()

vs

  Bar = MUST foo()

vs

  Bar = foo!()

The thing I like most about the wrapped value is that it applies to
type checking at static analysis time, runtime effects at runtime, is
visually obvious and yet succinct, doesn't add anything to the grammar
and allows the caller to pick how the value should be handled (crash or
handle the received error).

> I actually quite liked the Elixir pipe operator - so I changed the
> Erlang parser and messed with this a bit.
>
> So now I can write
>
> test2(F) ->
>     F  |> file:read_file()
>         |> ok()
>         |> tokenize()
>         |> parse().
>
> Instead of
>
> test2(F) ->
>     {ok, B} = file:read_file(F),
>     T = tokenize(B),
>     parse(T).
>
> Where:
>
> ok({ok,B}) ->
>     B.
>
> I can't make my mind up about this - the version with pipes is slightly longer
> but seems very readable to me. Trouble is, if it crashes due to a non existent
> file, I can't see which file it is ...

I've thought about this a bit myself, and decided that in the context of
the Erlang runtime and Erlang syntax, I don't like it. Perhaps if we come
up with an Erlang2 with a slightly different semantic base I might like it.

(I went over some of that here, actually.
https://stackoverflow.com/questions/34622869/function-chaining-in-erlang/34626317#34626317
Sometimes Erlang *seems* verbose but I seem to consistently wind up with
*much* shorter programs overall in Erlang than when I write an equivalent
one in Python for anything non-trivial. And Python isn't so bad in terms
of program length, typically.)

It is the same reason I strongly resist the syntax OOP people try to bring
with them in the form of arguing for parameterized modules: it confuses
the concept of functional composition by adding a new syntax for it.

For example:

  test2(F) ->
      parse(tokenize(ok(file:read_file(F)))).

This exhibits no real differences because there is no underlying rule
applied other than simple composition. For example, there is no additive
rule that mandates that every call is actually being passed through
a function that looks like:

  must({ok, Value}) ->
      Value.

This would change the picture a lot, and allow functions to always be
written just one way instead of needing two versions.
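As a sketch only: that must/1 rule written out as a plain helper, relying on Erlang's own clause matching to do the crashing (none of this is existing library code):

```erlang
%% Hypothetical helper: unwrap a tagged result or crash on anything else.
%% There is no must({error, _}) clause, so a bad result raises a
%% function_clause error at exactly the call site that expected success.
must({ok, Value}) -> Value.
```

At a call site the composed example would then read parse(tokenize(must(file:read_file(F)))), with each side-effecty step asserted as it happens.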

Adding syntax for mere convenience is BAD unless we just want to mess
with people's minds. It is a recreation of the C++ and Haskell problem:
with 10 equivalent syntaxes, how many people can be expected to remember
and parse all of them when reading?

-Craig

Re: Must and May convention

Fred Hebert-2
In reply to this post by Joe Armstrong-2
On 09/28, Joe Armstrong wrote:

>
>I actually quite liked the Elixir pipe operator - so I changed the
>Erlang parser and messed with this a bit.
>
> ...
>
>I can't make my mind up about this - the version with pipes is slightly longer
>but seems very readable to me. Trouble is, if it crashes due to a non existent
>file, I can't see which file it is ...
>

The distinction between a function signature that returns `{ok, _} |
{error, _}' and one that raises exceptions lies in the expectation of how
recoverable an error should be to the caller.

First let's look at our 3 exception types:

1. throws: should be used for non-local returns. I argue that throws
should never escape the module they were declared in without very
explicit documentation, and they should be mostly forgettable to your
callers.

2. errors: a condition prevents the code from doing its work, and the
expectation is that a programmer should modify code for things to keep
going. Those are calls you can catch and analyze, but aside from
situations where you're actively trying to work around limitations in
library code (or you're writing tests/logging code), you should want to
avoid that kind of catching.

3. exits: the process should stop running.
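A minimal sketch of how those three classes look from the caller's side (classify/1 is an invented name here, not a real API):

```erlang
%% Invented helper: run a fun and report which exception class escaped.
classify(F) ->
    try
        {ok, F()}
    catch
        throw:T -> {thrown, T};   %% 1. non-local return
        error:E -> {errored, E};  %% 2. "go fix your code", e.g. badarith
        exit:R  -> {exited, R}    %% 3. the process was told to stop
    end.
```

So classify(fun() -> throw(done) end) gives {thrown, done}, while classify(fun() -> 1/0 end) gives {errored, badarith}.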

This sheds some light, IMO, in when to use `{ok, _} | {error, _}' or
just raising an exception: if the condition is something you could
foresee as reasonable as a library writer and for which corrective
action is plausible to undertake at runtime by the caller, then wrapping
results in `{ok, _} | {error, _}' makes sense.

The wrapping of arguments lets the caller make their own decisions
regarding the acceptability of a condition. If they think it's fair to
expect various values, a `case ... of` may be used; they can otherwise
just do a strict match and fail, elevating the unexpected return value
to the exception level (`error') on their own.

The interesting variation between your two code samples comes from the
handling, whether implicit or not, of these conditions:

    test2(F) ->
        F  |> file:read_file()
           |> ok()
           |> tokenize()
           |> parse().
   
    test2(F) ->
        {ok, B} = file:read_file(F),
        T = tokenize(B),
        parse(T).
   
     ok({ok,T}) -> T.

The interesting thing that happens in the latter case is that the
assertion that `B` must be there is very explicit and parseable
visually. In the piped case, you must manually transform the
unrecoverable condition into an exception through an explicit check.  
There's no big difference between that and:

    test2(F) ->
        B = ok(file:read_file(F)),
        T = tokenize(B),
        parse(T).

The awkwardness of either solution, I think, comes from the fact that
you're taking composition and using it to handle control flow. Taken to
a bigger extreme, you could imagine:

    check_a(...) -> {ok, X};
    check_a(...) -> {error, E}.

    check_b(...) -> {ok, X};
    check_b(...) -> {error, E}.
    ...
    check_z(...) -> {ok, X};
    check_z(...) -> {error, E}.

Handling those with a pipe would remain very awkward:

    testN(F) ->
        F |> check_a()
          |> ok()
          |> check_b()
          |> ok()
          |> ...
          |> check_z()
          |> ok()
          |> do_something().

Clearly, the pipe is not the right tool for the conditional job. That's
where the maybe monad and similar tools come in to help. Let's define
a new pipe for the sake of the argument: ||> will either unpack the `X'
from `{ok,X}' and pass it on, or exit as soon as possible:

    testN(F) ->
        F ||> check_a()
          ||> check_b()
          ...
          ||> check_z()
          ||> do_something().

Now it tends to work pretty well. The weakness though is that you may
have a case where *some* conditions can be handled and some can't, or
where only *some* conditions need asserting and in other cases you don't
need them. How workable would it be to have a thing like:

    test3(F) ->
        F ||> file:read_file()
           |> tokenize()
           |> parse().

Where both operators compose within the same flow and fairly
transparently? You of course still lose the property of knowing which
operation failed and caused a bail out, but that is possibly more a
weakness of using a pure opaque flow to handle things.
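No ||> operator actually exists, of course; purely as a sketch, the same short-circuiting can be had today with a plain fold over funs, each of which is expected to return `{ok, _} | {error, _}' (pipe_ok/2 is a made-up name):

```erlang
%% Made-up helper: thread a value through a list of funs, unwrapping
%% {ok, V} between steps and stopping at the first {error, _}.
pipe_ok(Init, Funs) ->
    lists:foldl(fun(F, {ok, V}) -> F(V);
                   (_F, {error, _} = Error) -> Error
                end,
                {ok, Init}, Funs).
```

E.g. pipe_ok(F, [fun file:read_file/1, ...]) -- though, as said, you still can't tell from the result which step bailed out.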

I tried to play a bit with that in fancyflow[1], which failed a bit
because I didn't want to do anything but use parse transforms, and it
doesn't look nearly as pretty:

    test4(F) ->
        [pipe](F,
               [maybe](_, file:read_file(_)),
               tokenize(_),
               parse(_)).

It does, however, have the advantage of being able to use arbitrary
argument positions and even repeat the argument in multiple places at
the call site.

I'm not arguing fancyflow should be a thing people use in their every
day life, but it proved an interesting experimentation device. For
example, I also managed to add

    test5() ->
        [F1,F2|T] = [parallel](f1(),
                               f2(),
                               f3(),
                               2+5).

under a similar form. As opposed to using a given operator, the verbose
format allows very clear composability rules (you can make parallel
validations of pipes using 'maybe's without confusion), and can be
extended for all kinds of operations.
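For illustration only, a bare-bones version of what such a [parallel](...) form might expand to, with each expression wrapped as a fun (parallel/1 is invented here; no error or timeout handling):

```erlang
%% Invented sketch: run each fun in its own process and collect results
%% in call order, pairing each answer with a unique reference.
parallel(Funs) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, F()} end),
                Ref
            end || F <- Funs],
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```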

The interesting aspect of it is the ability to (as with monads) define a
specific execution context around various expressions telling you which
transformations should be applied between each of them.

Therefore, nothing would really prevent us from doing something like:

    testN(F) ->
        [verbose_maybe](F,
                        check_a(_),
                        check_b(_),
                        ...
                        check_z(_),
                        do_something(_)).

Where rather than returning `{ok,X} | {error,X}' with `{error,X}' being
the literal return value of any of the check functions, it instead
returns something like `{error, {verbose_maybe, Line, "check_b(_)"}, X}'
(or whatever other format could be judged useful), allowing to keep
local information about the workflow.

Or why not just turn to exceptions?

    testN(F) ->
        [ok](F,
             check_a(_),
             check_b(_),
             ...
             check_z(_),
             do_something(_)).

This format could, for example, just apply your 'ok/1' function
in-between any function call listed there, yielding full blown exceptions
every time something is off.

The problem is, of course, the cost of letting someone write such
abstractions. With monads in a language like Haskell, it is extremely
cheap; the syntax never changes, only the context definitions.
With parse transforms in Erlang, as in fancyflow, new forms are easy to
read, but a pain in the ass to extend. With custom operators in a
strict language like Erlang, it's years of work.
And even though custom operators in a macro-friendly language like
Elixir (or a language friendly to custom operators like Scala) are easy
to add, they come with a huge cognitive cost to the community, since
anyone can reinvent them and they're always shitty to decipher.

It's fun to think about though!

Regards,
Fred.


[1]: https://github.com/ferd/fancyflow

Re: Must and May convention

Joe Armstrong-2
In reply to this post by zxq9-2
On Thu, Sep 28, 2017 at 2:01 PM, zxq9 <[hidden email]> wrote:

> On Thursday, 28 September 2017 13:36:30 Joe Armstrong wrote:
>> On Thu, Sep 28, 2017 at 8:14 AM, zxq9 <[hidden email]> wrote:
>> > [...]
>>
>> Actually I half like this but would prefer Bin=file:read_file!(P) - the problem
>> is that ! is a weak typographic character - so it's easy to miss, and ! means
>> send so we might (or rather will) confuse beginners.
>>
>> I guess we could say
>>
>>      Bin = MUST file:read_file(Path)
>>
>> Easy to implement and it stands out nicely so you can see it from a
>> long way off.
>> Loosely scanning many lines of code and seeing MUST would be nice,
>> it could be used to signify which parts of the code must be correct.
>
> I still don't see it as more clear to the eye than
>
>   {ok, Bar} = foo()
>
> vs
>
>   Bar = MUST foo()
>
> vs
>
>   Bar = foo!()
>
> The thing I like most about the wrapped value is that it applies to
> type checking at static analysis time, runtime effects at runtime, is
> visually obvious and yet succinct, doesn't add anything to the grammar
> and allows the caller to pick how the value should be handled (crash or
> handle the received error).
>
>> I actually quite liked the Elixir pipe operator - so I changed the
>> Erlang parser and messed with this a bit.
>>
>> So now I can write
>>
>> test2(F) ->
>>     F  |> file:read_file()
>>         |> ok()
>>         |> tokenize()
>>         |> parse().
>>
>> Instead of
>>
>> test2(F) ->
>>     {ok, B} = file:read_file(F),
>>     T = tokenize(B),
>>     parse(T).
>>
>> Where:
>>
>> ok({ok,B}) ->
>>     B.
>>
>> I can't make my mind up about this - the version with pipes is slightly longer
>> but seems very readable to me. Trouble is, if it crashes due to a non existent
>> file, I can't see which file it is ...
>
> I've thought about this a bit myself, and decided that in the context of
> the Erlang runtime and Erlang syntax, I don't like it. Perhaps if we come
> up with an Erlang2 with a slightly different semantic base I might like it.
>
> (I went over some of that here, actually.
> https://stackoverflow.com/questions/34622869/function-chaining-in-erlang/34626317#34626317
> Sometimes Erlang *seems* verbose but I seem to consistently wind up with
> *much* shorter programs overall in Erlang than when I write an equivalent
> one in Python for anything non-trivial. And Python isn't so bad in terms
> of program length, typically.)
>
> It is the same reason I strongly resist the syntax OOP people try to bring
> with them in the form of arguing for parameterized modules: it confuses
> the concept of functional composition by adding a new syntax for it.
>
> For example:
>
>   test2(F) ->
>       parse(tokenize(ok(file:read_file(F)))).
>
> This exhibits no real differences because there is no underlying rule
> applied other than simple composition.

My problem with the above is I almost never write like this.
In my head I'm thinking "first you read the file, then you tokenize it,
then you parse it".

So pipes or breaking the code line by line with temporary variables
is fine by me. Adding temporary variables is also nice because you
can print them when things go wrong.


> For example, there is no additive
> rule that mandates that every call is actually being passed through
> a function that looks like:
>
>   must({ok, Value}) ->
>       Value.
>
> This would change the picture a lot, and allow functions to always be
> written just one way instead of needing two versions.
>
> Adding syntax for mere convenience is BAD unless we just want to mess
> with people's minds. It is a recreation of the C++ and Haskell problem:
> with 10 equivalent syntaxes, how many people can be expected to remember
> and parse all of them when reading?
>
> -Craig

Re: Must and May convention

Richard Bowker
In reply to this post by Joe Armstrong-2

> Adding temporary variables is also nice because you
> can print them when things go wrong.

Elixir has IO.inspect which you can use to tap into the middle of a pipe chain to output the intermediate state. (A form of the Kestrel combinator iirc)



Re: Must and May convention

zxq9-2
In reply to this post by Joe Armstrong-2
On Thursday, 28 September 2017 16:14:39 Joe Armstrong wrote:

> >   test2(F) ->
> >       parse(tokenize(ok(file:read_file(F)))).
>
> My problem with the above is I almost never write like this
> In my head I'm thinking "first you read the file then you tokenize it
> then you parse it"
>
> So pipes or breaking the code line by line with temporary variables
> is fine by me. Adding temporary variables is also nice because you
> can print them when things go wrong.

I absolutely agree with that. The only time I find heavy composition
readable is when you are casting from one form to another just to get
some utility from a particular representation, like pushing a set or
map into a list and back again.

When we need pipes I prefer to go to full-blown pipeline functions
instead of sugary composition operators. What I nearly always really
need is an assertion at each step, and that is why I am so strongly
in favor of the form:

    foo(Resource) ->
        {ok, Data} = get_stuff(Resource),
        {ok, Scrubbed} = scrub(Data),
        process_important_value(Scrubbed).

If I see instead:

    foo(Resource) ->
        {ok, Data} = get_stuff(Resource),
        Scrubbed = scrub(Data),
        process_important_value(Scrubbed).

I know straight away that scrub/1 is a pure function that crashes on
bad input. This is, of course, assuming the idiom that I laid out before.
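A minimal sketch of the kind of full-blown pipeline function I mean (module and function names are mine, not from the thread): each step returns `{ok, Value}` or `{error, Reason}`, and the pipeline stops at the first error.

```erlang
-module(pipe_sketch).
-export([run/2]).

%% Thread Input through Steps, stopping at the first {error, _}.
%% Each step is a fun(Value) -> {ok, Next} | {error, Reason}.
run(Input, []) ->
    {ok, Input};
run(Input, [Step | Rest]) ->
    case Step(Input) of
        {ok, Next}     -> run(Next, Rest);
        {error, _} = E -> E
    end.
```

For example, `pipe_sketch:run(2, [fun(X) -> {ok, X + 1} end, fun(X) -> {ok, X * 10} end])` returns `{ok, 30}`, while a step returning `{error, Reason}` short-circuits the rest.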

-Craig
Re: Must and May convention

Richard Bowker
In reply to this post by Joe Armstrong-2
As Fred already said above, most of these flows are forms of the "Railway Oriented Programming" interpretation of Monads, i.e. do the next step if ok, else shortcut to the end. The "with" concept in Elixir may be useful for comparison.

Re: Must and May convention

zxq9-2
On 2017年09月28日 木曜日 07:43:32 Richard Bowker wrote:
> As Fred already said above, most of these flows are forms of the
> "Railway Oriented Programming" interpretation of Monads. i.e. do
> next step if ok, else shortcut to the end. the "with" concept in
> Elixir may be useful for comparison.

Syntactic sugar is syntactic sugar, and generally undesirable imo.
In this case it is sugar over pipeline functions, and it seems to
exist merely to bewilder those who have yet to encounter its
specially sugary syntactic forms in lang X or Y.

It is nearly always more interesting and self-consistent to build
constructs such as this from the basic tools of the language, IN the
language, and keep that language as small as possible than to cough
up new syntax for each type of thing. It is a particularly bad sign
when a language continues to accumulate new features and syntax well
beyond its first year in production, especially when they are nifty
little "me, too!" sort of sugary cheats.

What if, for example, you want to continue evaluation to the end of
the steps specifically because of some side-effects desired instead
of shortcutting? Building a dead-obvious pipeline that does just that
and says so in the name is not as likely to confuse someone as
introducing new syntax to cover that one case plus the original shortcut
one. Two syntaxes for two types of pipelines... but there are many more
kinds of pipelines. So where do we stop?
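To make that concrete, a dead-obvious pipeline that runs every step for its side effects rather than shortcutting might be sketched like this (the name `run_all/2` is mine, purely illustrative):

```erlang
%% Sketch: run every step even after a failure, threading the last
%% good value along and returning all results for inspection.
run_all(Input, Steps) ->
    {_, Results} =
        lists:foldl(
          fun(Step, {Val, Acc}) ->
                  case Step(Val) of
                      {ok, Next} -> {Next, [{ok, Next} | Acc]};
                      Error      -> {Val, [Error | Acc]}  % keep going
                  end
          end,
          {Input, []}, Steps),
    lists:reverse(Results).
```

The name says exactly what it does, and no new syntax is needed for this second kind of pipeline.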

"Oh, I know! We'll add a new rule that you can 'hook stuff into this'
as it zips along!" Great. One more thing to try to teach and debug...
when the original construct, if left in the form of functions, would have
been as concise, as clear, and would not have wasted anyone's time learning
which non-alphanumeric chars on their keyboard they need to be more
closely aware of.

This is the same argument I have, generally speaking, against use
of the `class` keyword to special-case creation of dispatching closures
and brand them "objects" -- particularly when the language in question
doesn't even bother to actually enforce encapsulation. This is also the
reason I dislike XML -- semantic ambiguity grows from having a variety
of equivalent expressions and ruins the party for everyone.

-Craig
Re: Must and May convention

Loïc Hoguin-3
In reply to this post by Joe Armstrong-2
On 09/28/2017 04:14 PM, Joe Armstrong wrote:

>> For example:
>>
>>    test2(F) ->
>>        F = parse(tokenize(ok(file:read_file()))).
>>
>> This exhibits no real differences because there is no underlying rule
>> applied other than simple composition.
>
> My problem with the above is I almost never write like this
> In my head I'm thinking "first you read the file then you tokenize it
> then you parse it"

I wonder how a native speaker of an RTL language would feel about this?
Perhaps it would make a lot more sense for them.

Personally I don't have an issue with the order in these cases; the
only times I do worry about order is when doing arithmetic or boolean
operations (because I can never remember precedence rules, so I put
parens around everything). So a simplified example like the above feels
perfectly natural. A more complex example with many arguments could of
course be harder to read. Which leads to...

> So pipes or breaking the code line by line with temporary variables
> is fine by me. Adding temporary variables is also nice because you
> can print them when things go wrong.

I think breaking code up has two properties more important than order.

First, they make expressions shorter. Short expressions are easy to read
and to understand.

Second, they make the operations independent. This makes the block
easier to read and to update.

These two properties are not provided by pipes: you still have a big
expression; you have just changed the written order. Worse, this only works
if your subject is always the first argument, so you end up having to do
it the traditional way in some cases anyway. (Unless that changed.)

--
Loïc Hoguin
https://ninenines.eu
Re: Must and May convention

Fred Hebert-2
In reply to this post by zxq9-2
On 09/28, zxq9 wrote:
>Syntactic sugar is syntactic sugar, and generally undesirable imo.
>In this case syntactic sugar over pipeline functions which seem to
>exist merely to bewilder those who have yet to encounter their
>specially sugary syntactic forms in lang X or Y just yet.
>

Unless you have so much syntactic sugar (or sugar so ill-developed) that
the average language user cannot be expected to know all of it, you
have to assume some level of knowledge in your users about the language
they use.

We should be careful about an approach advocating for the lowest common
denominator in the name of least surprise; strict obedience to that
rule would mean C-like syntax and no pattern matching, since those are
probably some of the trickiest things a newcomer can meet in Erlang.

>It is nearly always more interesting and self-consistent to build
>constructs such as this from the basic tools of the language, IN the
>language, and keep that language as small as possible than to cough
>up new syntax for each type of thing. It is a particularly bad sign
>when a language continues to accumulate new features and syntax well
>beyond its first year in production, especially when they are nifty
>little "me, too!" sort of sugary cheats.
>
>What if, for example, you want to continue evaluation to the end of
>the steps specifically because of some side-effects desired instead
>of shortcutting? Building a dead-obvious pipeline that does just that
>and says so in the name is not as likely to confuse someone as
>introducing new syntax to cover that one case plus the original
>shortcut
>one. Two syntaxes for two types of pipelines... but there are many more
>kinds of pipelines. So where do we stop?

The question underlined here is not whether a given abstraction (with or
without syntax) is warranted or not, but whether the given abstraction
is composable enough.

As an example, Elixir has added the 'With' syntax:

    opts = %{width: 10, height: 15}
    with {:ok, width} <- Map.fetch(opts, :width),
         {:ok, height} <- Map.fetch(opts, :height),
      do: {:ok, width * height}

This is now some kind of new fancy syntax. However, Erlang lets you do
something similar to `with' using list comprehensions:

    Opts = dict:from_list([{width,10}, {height,15}]),
    hd([{ok, Width*Height}
        || {ok, Width} <- [dict:find(width, Opts)],
           {ok, Height} <- [dict:find(height, Opts)]])

Even though the basic building blocks provided by list comprehensions
allow the flexibility to do the desired operation provided by the `with'
block, and have done so for longer than Elixir has even existed,
this idiom is nearly never seen in the wild, and I avoid writing it whenever
possible.
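For comparison, what plain Erlang code nearly always writes instead of the comprehension trick is nested matching (a sketch; `area/1` is my naming):

```erlang
%% Plain nested-case version of the same lookup; the common idiom.
area(Opts) ->
    case dict:find(width, Opts) of
        {ok, Width} ->
            case dict:find(height, Opts) of
                {ok, Height} -> {ok, Width * Height};
                error        -> error
            end;
        error ->
            error
    end.
```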

Direct basic flexible syntax is not a given synonym of a superior
approach. The intent behind the list comprehension code is lost because
more often, list comprehensions are about lists, not hijacking fancy
pattern branching to get nifty flow control.

The tradeoff is not on whether the basic formula is the best, but
whether the developers can communicate their intent in an error-free
manner with the least amount of confusion to readers. Sometimes, adding
syntax can be worth it for very common patterns, because in most cases,
they will express things better.

>
>"Oh, I know! We'll add a new rule that you can 'hook stuff into this'
>as it zips along!" Great. One more thing to try to teach and debug...
>when the original construct, if left in the form of functions, would have
>been as concise, as clear, and not have wasted anyone's time learning
>which non alphanumeric chars on their keyboard they need to be more
>closely aware of.
>

A shortcut is a good thing if it saves you time with little risk. As an
example, the need to always return something in Erlang has yielded
shortcuts like the following ones for the cases where users don't need a
return value:

    - case Cond of true -> log(Message); _ -> ok end
    - Cond andalso log(Message)
    - [log(Message) || Cond]
    - maybe_log(Cond, Message)  % repeated in every module

Those are likely rare enough and isolated enough that they are not worth
their own syntax. In fact, some of them are clean enough that they may
be totally acceptable and will sooner or later permeate regular practice,
to the confusion of newcomers.
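As a sketch, the `maybe_log/2` variant from the list above might look like this (`log/1` is a stand-in for any side-effecting call):

```erlang
%% Hypothetical helper matching the fourth form in the list above.
%% log/1 is a stand-in; io:format/2 returns ok, as does the false branch.
log(Message) -> io:format("~p~n", [Message]).

maybe_log(true, Message)   -> log(Message);
maybe_log(false, _Message) -> ok.
```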

The interesting question related to this is when a shortcut becomes
common enough to turn into a desire path. See
https://en.wikipedia.org/wiki/Desire_path

Maps weren't in the original language, but people had such a need for
them that it needed to be added. Even though you could very well make do
with records and dicts and trees, people knew there was a more
convenient path through maps. And hence the previously very common:

    #record{dict = Dict} = Record,
    {ok, Val1} = dict:find(Key1, Dict),
    {ok, Val2} = dict:find(Key2, Dict),
    Val1 + Val2

became:

    #record{dict = #{Key1 := Val1, Key2 := Val2}} = Record,
    Val1 + Val2

The potential for errors has gone down, the expressivity went up, and
user convenience and satisfaction likely went up as well, since a lot of
code now uses maps everywhere.

It is a bit too easy to just say "you don't really need that, here's a
workaround". What's interesting is figuring out where or when it does or
does not apply, and how could it be made composable enough not to stand
out like a sore thumb and be amenable to extension in the future if need
be.

So the |> pipe operator is not really perfect in Elixir:

- it is strict with regards to position of the arguments
- it obscures the original number of arguments to a function (hello
  tracing)
- it may compose or nest funny in some cases or other constructs
- it is limited in what it can do (see my previous post in this thread)
- it brings the language down a tricky avenue since other similar
  control flow constructs would need to add more special operators with
  new semantics
- other language features may be equivalent to it already

Nevertheless, it is hailed as one of the best features of the language
by newcomers. I don't think I'd want it under its existing form in
Erlang, but the tradeoff was deemed worth it by the Elixir community.

The question, to me, is not whether a pipe operator (or a `with' block)
is required at all, but rather that if we wanted to provide one, which
way would be good enough to have all the positives and few or none of
the negatives?

I am able to get along fine without it, but if working without a similar
construct requires 40 posts in a mailing list to make sure other people
get a solution or pattern they are comfortable with when the operator is
*not* available, then maybe there's a need to look at ways to improve
the current state of things.

Regards,
Fred.