Backreferences with re module

Backreferences with re module

Technion

Hi,


I seem to be having issues with the re module any time a backreference is introduced.


(Yes, ideally I wouldn't use a regex, but I have one place where I really need to.)


Referencing the module manual here:

http://erlang.org/doc/man/re.html


A specific example is given:


Consider, for example:
(.*)abc\1
If the subject is "xyz123abc123", the match point is the fourth character. 

However, I've simplified my code down to that exact example and I can't make it match:


46> {ok, MP} = re:compile("(.*)abc\1").
{ok,{re_pattern,1,0,0,
                <<69,82,67,80,89,0,0,0,0,0,0,0,65,1,0,0,255,255,255,255,
                  255,255,...>>}}
47> re:run("xyz123abc123", MP, [{capture,all_names,binary}]).
nomatch

I have written all manner of patterns without a back reference that return a match fine:


48> {ok, MP2} = re:compile("(.*)abc123").
{ok,{re_pattern,1,0,0,
                <<69,82,67,80,93,0,0,0,0,0,0,0,65,1,0,0,255,255,255,255,
                  255,255,...>>}}
50> re:run("xyz123abc123", MP2, [{capture,all,binary}]).
{match,[<<"xyz123abc123">>,<<"xyz123">>]}

My platform:

Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V9.3  (abort with ^G)

The presented syntax works perfectly fine in Ruby, further suggesting the regex is correct:

irb(main):001:0> /(.*)abc\1/.match("xyz123abc123")
=> #<MatchData "123abc123" 1:"123">


Any assistance on this is appreciated.



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Re: Backreferences with re module

Steve Vinoski-2


On Sun, Apr 22, 2018 at 7:53 AM, Technion <[hidden email]> wrote:

46> {ok, MP} = re:compile("(.*)abc\1").

You need a double backslash here instead:

{ok, MP} = re:compile("(.*)abc\\1"). 

Look at the note near the top of the re man page http://erlang.org/doc/man/re.html :

'The Erlang literal syntax for strings uses the "\" (backslash) character as an escape code. You need to escape backslashes in literal strings, both in your code and in the shell, with an extra backslash, that is, "\\".'

--steve
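
A minimal sketch of why the single-backslash pattern returns nomatch, assuming a stock OTP install. The module name backref_demo and the function run/0 are hypothetical, chosen only for illustration. In an Erlang string literal, "\1" is read as an octal escape, so the pattern ends in the character with code 1 rather than in a backreference:

```erlang
-module(backref_demo).
-export([run/0]).

%% Hypothetical demo module: compares the single- and double-backslash patterns.
run() ->
    %% "\1" is an octal escape in an Erlang string literal:
    %% the pattern ends with one character whose code is 1.
    Single = "(.*)abc\1",
    %% "\\1" is a backslash followed by the digit 1,
    %% which the regex engine reads as a backreference to group 1.
    Double = "(.*)abc\\1",

    8 = length(Single),
    1 = lists:last(Single),
    9 = length(Double),

    Subject = "xyz123abc123",
    %% The subject contains no character with code 1, so this cannot match.
    nomatch = re:run(Subject, Single, [{capture, all, list}]),
    %% With the backreference intact, the match starts at the fourth character.
    {match, ["123abc123", "123"]} = re:run(Subject, Double, [{capture, all, list}]),
    ok.
```

After compiling with c(backref_demo). in the shell, backref_demo:run(). should return ok if every pattern match in the function succeeds.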


Re: Backreferences with re module

Technion

Thanks a heap, Steve. I must have read that man page a dozen times, but that two-liner escaped me every time.


From: [hidden email] <[hidden email]> on behalf of Steve Vinoski <[hidden email]>
Sent: Sunday, 22 April 2018 10:12:02 PM
To: Technion
Cc: [hidden email]
Subject: Re: [erlang-questions] Backreferences with re module
 

