Fix for A=<<1>>

James Hague-3
>What I observed in erl_scan.erl is that this kind of
>cheating is already done when matching "<<", ">>", ">=",
>"->", etc.

I thought the same thing, and I wouldn't have even tried to special-case
"=<<" if there hadn't already been support in the same function for
multi-character tokens like "=<".  But I agree with your concern, and I'm
curious how those are handled properly by erl_scan.


Raimo Niskanen-7
I have just rewritten the scanner for R9C to make it twice as fast, and
the reentrancy issues you worry about have been solved in that rewrite
(presumably).

I was just about to fix the '=<<' problem when I checked with the other
guys, and this is the difficulty we found.

How should the scanner interpret '=<<<', as in "if A =< <<1,2>>" written
without spaces? That one is easy: '=<' '<<'.

But sub-binaries are allowed when constructing, so how about '=<<<<<',
as in "if A =< << <<1>>/binary, 2>>"? Here one can see that the scanning
of '=<<' depends on whether the number of '<' characters following is odd
or even, so the scanner might have to scan arbitrarily far ahead. A
lookahead scan of limited small length would be fine, but this is ugly.
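
To see how a given installation actually tokenises the inputs discussed
above, one can feed them to erl_scan:string/1. This is only a probe: the
module name scan_probe is hypothetical, and no particular output is
assumed here, since the result depends on the OTP release in use.

```erlang
-module(scan_probe).
-export([probe/0]).

%% Print the token list that the installed erl_scan produces for each
%% input from the discussion. The inputs are the spaced and unspaced
%% forms of the two examples; no specific result is assumed.
probe() ->
    Inputs = ["A =< <<1,2>>.",
              "A=<<<1,2>>.",
              "A =< << <<1>>/binary, 2>>.",
              "A=<<<<<1>>/binary, 2>>."],
    lists:foreach(fun show/1, Inputs).

%% erl_scan:string/1 returns {ok, Tokens, EndLocation} on success and
%% {error, ErrorInfo, ErrorLocation} on failure.
show(Str) ->
    case erl_scan:string(Str) of
        {ok, Tokens, _EndLoc} ->
            io:format("~s~n  ~p~n", [Str, Tokens]);
        {error, ErrorInfo, _ErrorLoc} ->
            io:format("~s~n  scan error: ~p~n", [Str, ErrorInfo])
    end.
```

Comparing the token lists for the spaced and unspaced variants shows
whether the run of '<' characters was split the way the writer intended.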

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Ulf Wiger-4
On Mon, 5 May 2003, Raimo Niskanen wrote:

>But since sub-binaries are allowed when constructing: how
>about '=<<<<<' as in "if A =< << <<1>>/binary, 2>>", then
>one can see that the scanning of '=<<' depends on if the
>number of '<' characters following is odd or even, so the
>scanner might have to scan infinitely ahead. A look ahead
>scan of limited small length would be fine, but this is
>ugly.

I'm not sure what the upper limit would be for a sequence of
'<' symbols in a program making any claims of still being
usable (one of course has to take into account generated
code, which is usually less readable than hand-written
code).

Perhaps a stupid question, but, so what if the scanner looks
ahead and breaks for safety at, say, 1000 tokens? This won't
cause any big problems as far as memory is concerned, and at
least I find it difficult to envision a program that would
break because of this, that is still worthy of being
compiled.

The following syntactically correct expression would no
longer work (line breaks added for netiquette compliance).
I'm prepared to say "so what?":

A =<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<1>>
/binary,2>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
/binary>>.

/Uffe
--
Ulf Wiger, Senior Specialist,
   / / /   Architecture & Design of Carrier-Class Software
  / / /    Strategic Product & System Management
 / / /     Ericsson AB, Connectivity and Control Nodes



Raimo Niskanen-7
Alright, fixed for any number of '<' characters. It was a rather simple
change. There should be no bad consequences for either execution time
or memory consumption; it was just a matter of counting the '<'
characters and generating the tokens at the end of the sequence.
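
The counting idea can be sketched roughly as follows. This is not the
actual erl_scan change, only an illustration under stated assumptions: the
module lt_count and its functions are hypothetical, tokens are shown as
bare atoms without line information, and other tokens that start with '='
(such as '==', '=:=' and '=/=') are ignored.

```erlang
-module(lt_count).
-export([scan_eq/1]).

%% scan_eq("=" ++ Rest) consumes the '=' and the run of '<' characters
%% that follows it, and returns {Tokens, RemainingChars}.
scan_eq([$= | Cs0]) ->
    {N, Cs} = count_lt(Cs0, 0),
    {tokens(N), Cs}.

%% Count the leading '<' characters; one pass, constant extra memory.
count_lt([$< | Cs], N) -> count_lt(Cs, N + 1);
count_lt(Cs, N) -> {N, Cs}.

%% Odd run:  the first '<' belongs to '=<', the rest pair up into '<<'.
%% Even run: a plain '=' followed by '<<' tokens.
tokens(N) when N rem 2 =:= 1 -> ['=<' | lists:duplicate(N div 2, '<<')];
tokens(N) -> ['=' | lists:duplicate(N div 2, '<<')].
```

With this sketch, lt_count:scan_eq("=<<1>>") gives {['=','<<'],"1>>"} and
lt_count:scan_eq("=<<<1,2>>") gives {['=<','<<'],"1,2>>"}. The tokens are
only emitted once the end of the '<' run is known, so no bound on the
length of the run is needed.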

If the change breaks anything in our daily build and test runs I will
let you know. Otherwise the change will come in R9C.

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Robert Virding-5
Doesn't all this work with a special case just to handle a general case tend to indicate that the original special case was probably wrong?

VERY IMPORTANT
Actually this special case means that tokens are no longer scanned in the same way. Everywhere else tokenising is eager: you collect as many characters as you can to make a token, EXCEPT HERE. Why not add other fantastic special cases:

"caseVar" should be scanned as "case Var"
"ifVar" should be scanned as "if Var"
How about splitting any word starting with receive/case/if automatically, as this is obviously what the programmer intended? The same goes for any word ending in end. Obvious. Actually, any token containing if/case/receive/end should be split around them.

Need I go on? There is no fundamental difference between these examples and the original one!!
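
For reference, the eager ("longest match") rule mentioned above can be sketched for operator tokens like this. It is an illustration only, not code from erl_scan: the module munch, its operator table, and the atom-only token format are all assumptions made for the example.

```erlang
-module(munch).
-export([op/1]).

%% Operator table ordered longest first, so the first prefix that
%% matches is also the longest one (the eager rule).
ops() -> ["=:=", "=/=", "=<", "==", "<<", ">>", ">=", "->", "=", "<", ">"].

%% op(Chars) returns {Token, RemainingChars}, or none if the input does
%% not start with a known operator.
op(Cs) -> op(Cs, ops()).

op(_Cs, []) -> none;
op(Cs, [Op | Ops]) ->
    case lists:prefix(Op, Cs) of
        true  -> {list_to_atom(Op), lists:nthtail(length(Op), Cs)};
        false -> op(Cs, Ops)
    end.
```

Under this rule munch:op("=<<1>>") returns {'=<',"<1>>"}: the scanner takes '=<' because it is the longest operator at that position, and never reconsiders that choice based on what follows.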

So this change should not be added for two main reasons:

1. You are introducing an inconsistency and inconsistencies are always bad.
2. You are trying to guess what the user actually meant, and at the tokenising stage you have no idea of context. When I wrote "A=<<1>>" I might have actually meant to write "A =< <<1>>", so changing it automagically to "A = <<1>>" introduces a fundamental semantic change to my code.

Can you GUARANTEE that this change is always what the programmer intended? ALWAYS? If not then you can't make this change.

Sorry if I sound a bit harsh but someone has to be the devil's advocate.

Robert

P.S. Raimo, how did you get the speed-up? I know the original was coded just as much for clarity as for speed (I wrote it), but how?




Robert Virding-5
In reply to this post by Raimo Niskanen-7
Forgot to say sorry for the delay in answering but I haven't been connected for a while.

Robert

P.S. But the idea is still a bad one.-)

----- Original Message -----
From: "Raimo Niskanen" <raimo.niskanen>
To: <erlang-questions>
Sent: Monday, May 05, 2003 1:24 PM
Subject: Re: Fix for A=<<1>>


> Alright, fixed for any number of '<' characters. It was a rather simple
> change. There should be no bad consequences for neither execution time
> nor memory consumption either, it was just to count the '<' characters
> and generate the tokens at the end of the sequence.
>
> If the change breaks anything in our daily build and test runs I will
> let you know. Otherwise the change will come in R9C.
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>
>
>
> Ulf Wiger wrote:
> > On Mon, 5 May 2003, Raimo Niskanen wrote:
> >
> >
> >>But since sub-binaries are allowed when constructing: how
> >>about '=<<<<<' as in "if A =< << <<1>>/binary, 2>>", then
> >>one can see that the scanning of '=<<' depends on if the
> >>number of '<' characters following is odd or even, so the
> >>scanner might have to scan infinitely ahead. A look ahead
> >>scan of limited small length would be fine, but this is
> >>ugly.
> >
> >
> > I'm not sure what the upper limit would be for a sequence of
> > '<' symbols in a program making any qlaims of still being
> > useable (one of course has to take into account generated
> > code, which is usually less readable than hand-written
> > code.)
> >
> > Perhaps a stupid question, but, so what if the scanner looks
> > ahead and breaks for safety at, say, 1000 tokens? This won't
> > cause any big problems as far as memory is concerned, and at
> > least I find it difficult to envision a program that would
> > break because of this, that is still worthy of being
> > compiled.
> >
> > The following syntactically correct expression would no
> > longer work (line breaks added for nettiquette compliance).
> > I'm prepared to say "so what?":
> >
> > A =<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<1>>
> > /binary,2>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>/binary>>/binary>>/binary>>/binary>>/binary>>
> > /binary>>.
> >
> > /Uffe
>
>




Fix for A=<<1>>

Raimo Niskanen-7
In reply to this post by Robert Virding-5
A non-text attachment was scrubbed...
Name: not available
Type: multipart/mixed
Size: 36170 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20030527/8ab20eb0/attachment.bin>


Fix for A=<<1>>

Robert Virding-5
The problems are at different levels:

1. There is a fundamental difference in scanning atoms/strings and =<<<<<.
You have basically introduced infinite look-ahead (at least to the end of
the file) to try and determine what the first token actually is, for
everything else there is one character look-ahead. (As is the grammar, which
is LALR(1).)

2. There are at least three different "one key-press typos" which can give
"A=<<1>>":

A==<<1>>
A=<<<1>>
A= <<1>>

Two tests and a match. How can you be so certain what I meant as to choose
one? Introducing a match when I really meant a test will introduce a
fundamental semantic change to the code, both as to return value and to the
error case. Also if I had mistyped the variable name then this *really*
changes the meaning of the code. And it does it automagically, completely
silently and in a way which can make it extremely difficult for the
programmer to find!

3. Nothing should try to correct its input, especially not something as
fundamental as the scanner. Especially when the "correction" is not
unambiguous. Just because you now think the chances of meaning something
else are slim doesn't mean other people think the same way. Or that you
might not think otherwise in the future. :-)

4. I personally always (or almost always) use spaces to separate my symbols
so I don't really see what the problem is.
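
As a rough illustration of point 2, the three intended spellings and the
unspaced original can be fed to the stock scanner and compared by token
category. This is only a sketch: the module name scan_demo and its two
functions are hypothetical, and the token lists for the unspaced forms depend
on the erl_scan version in use, so none are asserted here.

```erlang
-module(scan_demo).
-export([categories/1, demo/0]).

%% Return the token categories that erl_scan produces for a source string.
%% erl_scan:string/1 and erl_scan:category/1 are standard library calls;
%% everything else in this module is illustrative.
categories(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    [erl_scan:category(T) || T <- Tokens].

%% Pair each spelling with the categories the scanner reports for it.
demo() ->
    Spellings = ["A==<<1>>.",   % equality test
                 "A=<<<1>>.",   % size comparison, no spaces
                 "A= <<1>>.",   % match, with a space
                 "A=<<1>>."],   % the ambiguous original
    [{S, categories(S)} || S <- Spellings].
```

Calling scan_demo:demo() in a shell shows which operator token each spelling
starts with, which makes it easy to see that a single key-press moves the
expression between a test and a match.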

My basic premise is that you can not add an "improver" which works silently
and is only sometimes correct. Will you take the responsibility when this
generates a serious, invisible error in someone's code?

The trouble with writing code at this level is that you always have to try
and handle the case when users do things which you had assumed were so stupid
or strange, even though legal, that no one would do them; in this case,
mean "A=<<<1>>" when they wrote "A=<<1>>". I remember that a relatively
early version of the JAM compiler could not handle the case when you did a
send as an argument to a function to get the message in as an argument,
(foo(X ! <big evaluation>, ...)). The premise was that send is used for side
effects not return values. Of course someone did just this and it crashed.

Robert

P.S. Liked the new code. You need to fix the comments. Can't the fun be
generated each suspend and include the state? I think it would make things
easier.





Fix for A=<<1>>

Raimo Niskanen-7
You are of course right, but:

Sometimes I and many others do not use spaces to separate the symbols.
The best example I can think of is:
        Rec#rec{length=0,state=init,count=0,buf=<<>>}
where I do _not_ want spaces.

Size comparison of binaries I think is so uncommon that the programmer
who thinks "A =< <<1>>" does not write "A=<<<1>>". It is too ambiguous
to the eye. Size comparisons of binaries in particular must be _very_
uncommon. The programmer who skips as many spaces between tokens as
possible is begging for trouble.

Equality comparisons of binaries I think are also rather uncommon, but
note that "A==<<1>>" is scanned correctly(tm) and "A==<1>>" gives an
error since scanning of '==' is eager. What remains is typing "A=<<<1>>"
when meaning "A==<<1>>", but that is a double fault.

About the code comments: more precisely, which (kind of) comments need
to be fixed?

About including the state in the fun: I guess more/6 could be changed
into something like:

more(Cs, Pos, State, Fun) ->
     {more,{Cs,Pos,State,Fun}}.

And tokens/3 into:

tokens([], Chars, Pos) ->
     tokens({[],Pos,io,
             fun (Cs, State) ->
                 scan(Cs, [], [], Pos, State)
             end}, Chars, Pos);
tokens({Cs,Pos,eof,_Fun}, eof, _) ->
     {done,{eof,Pos},Cs};
tokens({Cs,Pos,_State,Fun}, eof, _) ->
     Fun(Cs++eof, eof);
tokens({Cs,Pos,State,Fun}, Chars, _) ->
     Fun(Cs++Chars, State).

note that Cs, Pos, State and Fun are still needed in tokens/3. The fun
can only contain Stack, Tokens and Pos.


Then the calls to more/6 would have to change into:
scan(">"=Cs, Stack, Toks, Pos, State) ->
     more(Cs, Pos, State, fun (C, S) ->
                              scan(C, Stack, Toks, Pos, S)
                          end);

And these calls are rather many so it will clutter the code compared to:

scan(">"=Cs, Stack, Toks, Pos, State) ->
     more(Cs, Stack, Toks, Pos, State, fun scan/5);

By passing two (as far as I can see there are only 2: Stack and Tokens)
unnecessary arguments to more/6 I can use the "fun scan/5" notation,
which improves readability.
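
To make the closure-based variant concrete, here is a toy re-entrant scanner
that suspends when a trailing ">" could still become ">>" or ">=". It is a
sketch only, not the erl_scan code: the module name cont_demo, the reduced
argument lists and the three-token language are all made up for illustration.

```erlang
-module(cont_demo).
-export([tokens/2]).

%% tokens(Continuation, Chars) -> {more, Continuation1} | {done, Tokens}
%% Start with [] as the continuation and pass eof to finish.
tokens([], Chars) ->
    scan(Chars, [], io);
tokens({Cs, Fun}, eof) ->
    Fun(Cs, eof);
tokens({Cs, Fun}, Chars) ->
    Fun(Cs ++ Chars, io).

%% Suspend: keep the unscanned characters and a fun that resumes the scan.
more(Cs, Fun) ->
    {more, {Cs, Fun}}.

%% Only '>', '>>', '>=' and spaces are recognised; any other character
%% raises function_clause, which is acceptable for a toy.
scan(">>" ++ Cs, Toks, State) -> scan(Cs, ['>>' | Toks], State);
scan(">=" ++ Cs, Toks, State) -> scan(Cs, ['>=' | Toks], State);
scan(">", Toks, io) ->
    %% Cannot decide yet; the fun captures Toks, so more/2 need not carry it.
    more(">", fun (C, S) -> scan(C, Toks, S) end);
scan(">" ++ Cs, Toks, State) -> scan(Cs, ['>' | Toks], State);
scan([$\s | Cs], Toks, State) -> scan(Cs, Toks, State);
scan([], Toks, io) ->
    more([], fun (C, S) -> scan(C, Toks, S) end);
scan([], Toks, eof) ->
    {done, lists:reverse(Toks)}.
```

Every suspension point has to spell out its own fun, which is the clutter
described above; passing the captured values as explicit arguments instead
lets each call site use a plain "fun scan/N" reference.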

Or did you have a smarter change in mind?

/ Raimo






Fix for A=<<1>>

Raimo Niskanen-7
In reply to this post by Robert Virding-5
OK, your argument about the typo "A=<<<1>>" when meaning "A=<<1>>", i.e.
"A = << 1 >>", accidentally becoming the valid syntax "A =< << 1 >>" has
convinced us to remove this change from erl_scan (at least for R9C).

The debate will probably respawn (reflame) after the R9C release. It is
easier to add later than to remove. The final word is certainly not said.

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


