Start Conditions in Lexical Analyzer Generator (leex)

Vance Shipley-2
I've never tried using leex before but since the job at hand is
to parse a language file, where the reference implementation uses
flex, it seemed like porting their lex input file for use with leex
would be the way to go.

However I got stuck right away in that leex doesn't seem to support
"start conditions":

   http://flex.sourceforge.net/manual/Start-Conditions.html#Start-Conditions

Am I missing something?

--
        -Vance

Dmitry Kolesnikov
Hello,

Perhaps I've missed something? The start condition is just the literal part of the lex regex,
e.g.:

Definitions.

WSS = [\x20\x09\x0A\x0D]+
VAR = [a-zA-Z.]+

Rules.

if{WSS}{VAR} : {token, {'if', TokenLine, TokenChars}}.

This matches a token if it starts with "if".

- Dmitry

On Jan 17, 2014, at 4:42 PM, Vance Shipley <vances> wrote:

> I've never tried using leex before but since the job at hand is
> to parse a language file, where the reference implementation uses
> flex, it seemed like porting their lex input file for use with leex
> would be the way to go.
>
> However I got stuck right away in that leex doesn't seem to support
> "start conditions":
>
>   http://flex.sourceforge.net/manual/Start-Conditions.html#Start-Conditions
>
> Am I missing something?
>
> --
> -Vance


Rikard Strömmer
On Fri, Jan 17, 2014 at 10:53 PM, Dmitry Kolesnikov
<dmkolesnikov> wrote:

> Hello,
>
> Perhaps I've missed something? The start condition is just the literal part of the lex regex,
> e.g.:
>
> Definitions.
>
> WSS = [\x20\x09\x0A\x0D]+
> VAR = [a-zA-Z.]+
>
> Rules.
>
> if{WSS}{VAR} : {token, {'if', TokenLine, TokenChars}}.
>
> This matches a token if it starts with "if".

That's probably not what Vance Shipley wants.  Using start conditions
is more like moving between state machines.  See the example below, excerpted
from http://flex.sourceforge.net/manual/Start-Conditions.html#Start-Conditions


         "/*"         BEGIN(comment);

         <comment>[^*\n]*        /* eat anything that's not a '*' */
         <comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
         <comment>\n             ++line_num;
         <comment>"*"+"/"        BEGIN(INITIAL);


From http://erlang.org/doc/man/leex.html rules have the following format:

        <Regexp> : <Erlang code>.

and apparently start conditions are not regular expressions... so
probably there is no support for that.
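
For readers who have not met start conditions, the flex comment example quoted above can be pictured as one rule table per condition, with BEGIN switching between tables. Below is a minimal Python sketch of that idea. It is not leex or flex code: the function name strip_comments, the rules table and the catch-all rule for INITIAL are assumptions made for illustration, and the first comment rule uses + instead of * so that it never matches the empty string.

```python
import re

def strip_comments(source):
    """Return (text outside comments, final line number) using two lexer states."""
    # One rule table per start condition. Each rule is (regex, next_state);
    # next_state None means "stay in the current condition".
    rules = {
        "INITIAL": [
            (re.compile(r"/\*"), "comment"),      # "/*"  BEGIN(comment);
            (re.compile(r"[^/]+|/"), None),       # assumed catch-all: keep the text
        ],
        "comment": [
            (re.compile(r"[^*\n]+"), None),       # anything that's not a '*'
            (re.compile(r"\*+[^*/\n]*"), None),   # '*'s not followed by '/'
            (re.compile(r"\n"), None),            # newline inside a comment
            (re.compile(r"\*+/"), "INITIAL"),     # "*/"  BEGIN(INITIAL);
        ],
    }
    state, pos, line_num, kept = "INITIAL", 0, 1, []
    while pos < len(source):
        best = None
        for pattern, next_state in rules[state]:
            m = pattern.match(source, pos)
            # Longest match wins; on a tie the earlier rule wins, as in lex.
            if m and m.end() > pos and (best is None or m.end() > best[0].end()):
                best = (m, next_state)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        m, next_state = best
        text = m.group()
        line_num += text.count("\n")              # plays the role of ++line_num
        if state == "INITIAL" and next_state is None:
            kept.append(text)                     # only text outside comments survives
        if next_state is not None:
            state = next_state                    # the BEGIN(...) step
        pos = m.end()
    return "".join(kept), line_num
```

The point of the sketch is that the rules in the "comment" table are simply invisible while the lexer is in INITIAL, which is the part that a plain `<Regexp> : <Erlang code>.` rule cannot express by itself.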


--
Regards,
Xiao Jia

Dmitry Kolesnikov
Hello,

Right. I think there is not a straight path from flex to Erlang leex.
Instead of porting the flex input rules, I would look at flex's intermediate results and try to map them to leex.

- Dmitry

On 18 Jan 2014, at 03:25, Xiao Jia <me> wrote:

> From http://erlang.org/doc/man/leex.html rules have the following format:
>
>        <Regexp> : <Erlang code>.
>
> and apparently start conditions are not regular expressions... so
> probably there is no support for that.


Richard A. O'Keefe

On 20/01/2014, at 11:06 PM, Dmitry Kolesnikov wrote:
> Instead of porting flex input rules, I would take a look into flex interim results and try to map them to leex.  

Or better still, take a step _right_ back and look at
the basic problem.  Sometimes programmers using Lex or
Flex use start conditions when they don't need to.
PL/I-style comments are a good example of this.
It's _easier_ to process them using start conditions,
but it's _possible_ to process them without.

Using start conditions:

>         "/*"         BEGIN(comment);
>
>         <comment>[^*\n]*        /* eat anything that's not a '*' */
>         <comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
>         <comment>\n             ++line_num;
>         <comment>"*"+"/"        BEGIN(INITIAL);

Not using them:

"/*"[^*]*"*"+([^/*][^*]*"*"+)*"/" { line_num += count_nls(yytext); }

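One way to convince yourself that the single-rule version accepts the same comments is to try the pattern directly. Below is a small Python sketch. The pattern is a transliteration of the rule above into Python's re syntax (the quoted "/*" and "*" become escaped literals). count_nls is only named in the rule's action, so the body given here is a guess at what it does, and scan_comments is a hypothetical helper added for the demonstration.

```python
import re

# Python spelling of the one-rule comment pattern. Reading left to right:
# "/*", any non-stars, one or more stars, then repeatedly (a char that is
# neither '/' nor '*', any non-stars, one or more stars), and finally "/".
COMMENT = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

def count_nls(text):
    """Assumed behaviour of count_nls(yytext): count the newlines in the match."""
    return text.count("\n")

def scan_comments(source):
    """Return (list of comments found, number of newlines inside them)."""
    comments = []
    newlines = 0
    for m in COMMENT.finditer(source):
        comments.append(m.group())
        newlines += count_nls(m.group())   # mirrors line_num += count_nls(yytext)
    return comments, newlines
```

Note that an unterminated comment simply fails to match here, whereas the start-condition version stays in the comment state until the end of input; which behaviour is wanted depends on the language being lexed.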



Vance Shipley-2
On Mon, Jan 20, 2014 at 12:06:50PM +0200, Dmitry Kolesnikov wrote:
>  I think there is not a straight path from flex to Erlang leex.

I looked way back to Unix 7th Edition in 1978 and see that the original
lex also included start conditions:

   http://cm.bell-labs.com/7thEdMan/v7vol2b.pdf  (second paper)

--
        -Vance