"prefix" ++ Rest = Something


Stefan Hellkvist-2
Hi, 

Erlang has this syntactic sugar for matching string prefixes (http://erlang.org/doc/reference_manual/expressions.html#id80508) where you can do:

"prefix" ++ Rest = "prefixsomething"

, which would bind Rest to "something" in this case. 


I'm curious why however it is ok to do:

1> "prefix" ++ Rest = "prefixsomething".
"prefixsomething"
2> Rest.
"something"


but it is not ok to do:

1> Prefix = "prefix".
"prefix"
2> Prefix ++ Rest = "prefixsomething".
* 1: illegal pattern


Is it because this syntactic sugar is transformed more or less as a preprocessing step where the value of Prefix needs to be known, or why else is "Prefix ++ Rest = Something" not allowed even when Prefix is bound?

/Stefan

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Re: "prefix" ++ Rest = Something

Kostis Sagonas-2
On 10/25/2017 08:56 AM, Stefan Hellkvist wrote:
> Is it because this syntactic sugar is transformed more or less as a
> preprocessing step where the value of Prefix needs to be known,

Yes, that's the reason.  It's a purely _syntactic_ thing.

Kostis

Re: "prefix" ++ Rest = Something

Pierre Fenoll-2
It's the same with binaries:

<<"a", Rest/binary>> = <<"abc">>
%%=> Rest = <<"bc">>

a() -> <<"a">>
Prefix = <<(a())/binary>>
PrefixSize = byte_size(Prefix)  %% Compilation fails if size is not provided
<<Prefix:PrefixSize/binary, Rest/binary>> = <<"abc">>
%%=> Rest = <<"bc">>

I'm not sure what differences there may be between compiled code and the REPL.

I never understood, however, why this is "just sugar" or why PrefixSize is explicitly needed. In
      f(Prefix) when is_list(Prefix) ->
          Prefix ++ Rest = "my_list_thing",
          Rest.
why isn't the compiler able to generate a pattern match? Is it missing some concept, some structures?
What are the missing pieces and what can be done to add them?

Same question with binaries:
  since we have a "prefix-match of binaries only providing PrefixSize at runtime" instruction,
      * Why don't we have one for lists?
      * And oh God why do we have to provide that PrefixSize "manually"? (binding the variable ourselves, when the compiler could do that itself)
      * Why isn't suffix-matching of binaries implemented yet? (how different to prefix-matching can it be?)

I could never find the answer to all of these questions.
WRT binaries my thinking is that they are actually a mix of "bytes" and references to binaries,
making some crucial operations O(log n) instead of O(1)... but prefix match exists...

I really want to be able to write things like:
    <<"{", Name/binary, "}">> = PathToken

%%=> Name = <<"id">> given PathToken = <<"{id}">>



Cheers,
-- 
Pierre Fenoll



Re: "prefix" ++ Rest = Something

Raimo Niskanen-2
On Wed, Oct 25, 2017 at 12:54:59PM +0200, Pierre Fenoll wrote:

> It's the same with binaries:
>
> <<"a", Rest/binary>> = <<"abc">>
> %%=> Rest = <<"bc">>
>
> a() -> <<"a">>
> Prefix = <<(a())/binary>>
> PrefixSize = byte_size(Prefix)  %% Compilation fails if size is not provided
> <<Prefix:PrefixSize/binary, Rest/binary>> = <<"abc">>
> %%=> Rest = <<"bc">>

The only time an unspecified size is allowed in the bit syntax is for the
last field in a bit expression.
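
A minimal sketch of the size rule stated above, assuming a hypothetical module bin_prefix_demo (the module and function names are not from this thread). The size of the first segment is bound before the match, so only the last segment is left unsized:

```erlang
-module(bin_prefix_demo).
-export([strip_prefix/2]).

%% Returns {ok, Rest} when Bin starts with Prefix, otherwise nomatch.
%% PrefixSize is bound before the match, so only the last segment
%% (Rest) is left without an explicit size.
strip_prefix(Prefix, Bin) when is_binary(Prefix), is_binary(Bin) ->
    PrefixSize = byte_size(Prefix),
    case Bin of
        <<Prefix:PrefixSize/binary, Rest/binary>> -> {ok, Rest};
        _ -> nomatch
    end.
```

Here bin_prefix_demo:strip_prefix(<<"a">>, <<"abc">>) is expected to give {ok, <<"bc">>}, and a binary shorter than the prefix simply fails the match and gives nomatch.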

>
> Not sure what the differences may be due to compilation v. in the REPL.
>
> I never understood however why this was "just sugar" or why PrefixSize is
> explicitly needed:
>   in
>       f(Prefix) when is_list(Prefix) ->
>           Prefix ++ Rest = "my_list_thing",
>           Rest.
>   why isn't the compiler able to generate a pattern match? Is it missing
> some concept, some structures?
> What are the missing pieces and what can be done to add them?

    "abc" ++ Rest

is compiled to

    [$a, $b, $c | Rest]

The variable Prefix is a complete list, hence there is only a pointer to the
head at runtime, so the only way to append to it is via lists:append/2, i.e.
the actual operator erlang:'++'/2, which creates a new list.  So this can
not be described as a pattern.

The visual appearance of "abc"++Rest vs Prefix++Rest is misleading:

   [$a, $b, $c | Rest] vs lists:append(Prefix, Rest)
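
To make the equivalence concrete, here is a small sketch with two function heads that should accept exactly the same lists. The module name list_prefix_demo and both function names are made up for illustration:

```erlang
-module(list_prefix_demo).
-export([sugar/1, spelled_out/1]).

%% Literal prefix written with the ++ sugar.
sugar("abc" ++ Rest) -> {ok, Rest};
sugar(_) -> nomatch.

%% The same pattern with the list cells written out by hand.
spelled_out([$a, $b, $c | Rest]) -> {ok, Rest};
spelled_out(_) -> nomatch.
```

Both clauses only inspect the first three cells of the argument; neither calls erlang:'++'/2.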




--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

Re: "prefix" ++ Rest = Something

Pierre Fenoll-2
> So this can not be described as a pattern.

I'm saying let's fix that.

Prefix ++ Rest as a pattern does not need to be calling ++/2.
Instead, I am suggesting to allow "partial matches": adding support for these patterns.
What makes this hard? Why is this only sugar?
Why is a similar pattern allowed in binaries but not with lists? (the <<Prefix:PrefixSize/binary, Rest/binary>> pattern)


Cheers,
-- 
Pierre Fenoll



Re: "prefix" ++ Rest = Something

Danil Zagoskin-2
In reply to this post by Stefan Hellkvist-2
There are several ways to match a string against a prefix held in a bound variable:

7> Prefix = "hello ".
"hello "
8> {Prefix, Rest} = lists:split(length(Prefix), "hello world").
{"hello ","world"}
9> Rest.
"world"

10> Rest = lists:foldl(fun(C, [C|RestAcc]) -> RestAcc; (_, _) -> error(badmatch) end, "hello world", Prefix).
"world"
11> lists:foldl(fun(C, [C|RestAcc]) -> RestAcc; (_, _) -> error(badmatch) end, "help world", Prefix).
** exception error: badmatch
     in function  shell:apply_fun/3 (shell.erl, line 900)
     in call from lists:foldl/3 (lists.erl, line 1263)

12> MatchPref = fun MatchPref([], SRest) -> SRest; MatchPref([C|Pref], [C|SRest]) -> MatchPref(Pref, SRest); MatchPref(_, _) -> error(badmatch) end.
#Fun<erl_eval.36.99386804>
13> Rest = MatchPref(Prefix, "hello world").
"world"
14> Rest = MatchPref(Prefix, "help world").
** exception error: badmatch

So it should be possible to implement syntactic sugar for Prefix ++ Rest = String.
I suppose one can do that even for function/case clauses (but that would be trickier).

It seems a parse_transform would be enough for a proof of concept. In that case, no OTP patch is even required.
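
As a sketch of what such sugar could expand to, the MatchPref fun from the shell session can be written as an ordinary module function. The module name list_strip_demo is hypothetical, and unlike the shell version this one returns nomatch instead of raising badmatch:

```erlang
-module(list_strip_demo).
-export([strip_prefix/2]).

%% Walks Prefix and the input list in lockstep.
%% Returns {ok, Rest} when the list starts with Prefix, otherwise nomatch.
strip_prefix([], Rest) -> {ok, Rest};
strip_prefix([C | Prefix], [C | Rest]) -> strip_prefix(Prefix, Rest);
strip_prefix(_, _) -> nomatch.
```

The walk costs time proportional to the length of Prefix and allocates nothing, because Rest is just the tail of the original list.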





--
Danil Zagoskin | [hidden email]


Re: "prefix" ++ Rest = Something

Richard A. O'Keefe-2
In reply to this post by Pierre Fenoll-2


On 26/10/17 3:11 AM, Pierre Fenoll wrote:
>> So this can not be described as a pattern.
>
> I'm saying let's fix that.
>
> Prefix ++ Rest as a pattern does not need to be calling ++/2.
> Instead, I am suggesting to allow "partial matches": adding support for
> these patterns.
> What makes this hard? Why is this only sugar?

(1) Consider Prefix ++ Rest = List where List has N elements.
     If Prefix and Rest are both unbound, there are N+1 solutions.
     If Rest is bound, there is at most one solution.
     If Prefix is bound, there is at most one solution.
     If Prefix is a list of patterns some of which are unbound,
     but the length of Prefix is fixed, there is at most one solution.

> Why is a similar pattern allowed in binaries but not with lists? (the
> <<Prefix:PrefixSize/binary, Rest/binary>> pattern)

     Did you notice the :PrefixSize part?  That has to be known.
     Otherwise it would have the same multiple-solutions problem.
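
The N+1 count in (1) can be checked by enumerating every way to split a list. This is only an illustrative sketch; the module name split_count_demo and the function all_splits/1 are assumptions, not anything from OTP:

```erlang
-module(split_count_demo).
-export([all_splits/1]).

%% Every {Prefix, Rest} pair such that Prefix ++ Rest =:= List.
%% A list of N elements can be cut at N+1 positions (0 through N).
all_splits(List) when is_list(List) ->
    [lists:split(K, List) || K <- lists:seq(0, length(List))].
```

For the two-element list "ab" this should yield three pairs: {"", "ab"}, {"a", "b"} and {"ab", ""}.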

(2) "prefix" ++ Rest as a pattern _doesn't_ call ++/2.
     It is translated to [$p|[$r|[$e|[$f|[$i|[$x|Rest]]]]]]
     by the parser.  (As noted above, the _elements_ of the list
     could perfectly well be any pattern, but the *spine* of the
     list must be visibly present.)

It's rather like the way Haskell used to allow
     f 0 = 1
     f (n+1) = f n * (n+1)
but did not allow n+m as a pattern.


Just how much of a problem _is_ this anyway?

Re: "prefix" ++ Rest = Something

Pierre Fenoll-2
I am not sure I understand your (1): we have been talking about the case where Prefix is bound, just not with a value known at compile time.

Of course I noticed the :PrefixSize part, I have been writing about it for 3 emails now. 
So the :PrefixSize is needed, okay. Why isn't the compiler binding this variable automatically then?
You know, write:
  <<Prefix/binary, Rest/binary>> = Bla (with Prefix bound)
and the compiler should generate:
  PrefixSize = byte_size(Prefix),
  <<Prefix:PrefixSize/binary, Rest/binary>> = Bla

Obviously the compiler can do this. Now why doesn't it do that already?
Then, let's implement the same matching for lists.
And I believe the fact that a list's length is O(n) will not be an issue: you can already pattern match lists today.

FYI your Haskell example is also a valid Erlang pattern match.

> Just how much of a problem _is_ this anyway?

I'm repeating the example I gave before:
    <<"{", Name/binary, "}">> = PathToken
With latest Erlang/OTP (20.1) one has to write:
    <<${, Rest/binary>> = PathToken,
    <<$}, Eman/binary>> = binary:reverse(Rest),
    Name = binary:reverse(Eman)

which copies needlessly, creates garbage, probably has worse complexity, and uses a function that doesn't even exist.
Yes, it is maybe time to add binary:reverse/1, probably based on https://stackoverflow.com/a/43310493/1418165.
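
For this particular shape (one fixed byte on each side), the size of the middle segment can be computed before the match, which avoids the reversal. A minimal sketch, assuming a hypothetical module brace_demo and that the token is wrapped in single-byte braces:

```erlang
-module(brace_demo).
-export([name/1]).

%% Extracts Name from <<"{Name}">>.
%% The middle segment's size is the total size minus the two brace bytes,
%% so every segment in the pattern has a known size.
name(PathToken) when is_binary(PathToken), byte_size(PathToken) >= 2 ->
    NameSize = byte_size(PathToken) - 2,
    case PathToken of
        <<"{", Name:NameSize/binary, "}">> -> {ok, Name};
        _ -> nomatch
    end;
name(_) -> nomatch.
```

This does not cover the general request (a suffix of unknown length), only the case where the suffix length is known up front.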

Further improvement to pattern matching:
How about allowing matching non-local functions?
  case F of
      fun io:format/2 -> blip();
      fun erlang:display/1 -> blop()
  end
Of course this would only really be matching {M,F,Arity}.
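
Something close to this can already be written by binding the external funs to variables first and matching against those, since two external funs with the same module, name and arity compare equal. A sketch under that assumption; the module name fun_match_demo and the atoms blip, blop and other are placeholders:

```erlang
-module(fun_match_demo).
-export([classify/1]).

%% Matches F against already-bound external funs.
%% A bound variable in a pattern matches by equality.
classify(F) when is_function(F) ->
    IoFormat = fun io:format/2,
    Display = fun erlang:display/1,
    case F of
        IoFormat -> blip;
        Display -> blop;
        _ -> other
    end.
```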



Cheers,
-- 
Pierre Fenoll

