
Bash regex difference between Linux and macOS

Question Detail

The following script uses a regex with a backreference:

pattern="(a)\1"; [[ aa =~ $pattern ]] && echo okay || echo notokay

On Linux:

$ uname -a
Linux #9-Ubuntu SMP Fri Apr 10 21:10:36 UTC 2020
$ bash --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
$ bash -c 'pattern="(a)\1"; [[ aa =~ $pattern ]] && echo okay || echo notokay'
okay
$

On macOS:

$ uname -a
Darwin MacBook-Pro.local 21.2.0 Darwin Kernel Version 21.2.0: Sun Nov 28 20:28:54 PST 2021

$ /bin/bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin21)
$ /bin/bash -c 'pattern="(a)\1"; [[ aa =~ $pattern ]] && echo okay || echo notokay'
notokay

$ /usr/local/bin/bash --version
GNU bash, version 5.1.12(1)-release (x86_64-apple-darwin21.1.0)
$ /usr/local/bin/bash -c 'pattern="(a)\1"; [[ aa =~ $pattern ]] && echo okay || echo notokay'
notokay
$

How can I find out why the backreference works on Linux but not on macOS?
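To compare interpreters side by side, the one-liner above can be wrapped in a small probe and run with each bash binary in turn. This is only a sketch: the function name regex_supports_backref is made up for illustration and is not part of bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): returns 0 if this bash's
# [[ =~ ]] treats \1 as a backreference, non-zero otherwise.
regex_supports_backref() {
  local pattern='(a)\1'
  # Keep the pattern in a variable and leave it unquoted so it is
  # interpreted as a regex rather than as a literal string.
  [[ aa =~ $pattern ]]
}

if regex_supports_backref; then
  echo "backreferences: supported"
else
  echo "backreferences: not supported"
fi
```

Saved to a file (say probe.sh, an arbitrary name), it can be run as /bin/bash probe.sh and /usr/local/bin/bash probe.sh to see which binaries differ.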

Question Answer

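bash passes the pattern of [[ =~ ]] to the system's regcomp/regexec (see regex(3)) as a POSIX extended regular expression, so the result depends on the platform's C library rather than on the bash version. That is why bash 5.1 on macOS behaves like the stock bash 3.2 there and not like bash 5.0 on Linux. Thanks to @user1934428, I found the following notes in the BUGS section of man 3 regex on macOS: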

The back-reference code is subtle and doubts linger about its correctness in complex cases.

...

It is advised to avoid back-references whenever possible.

...

The standard's definition of back references is vague.  For example, does ‘a\(\(b\)*\2\)*d’ match ‘abbbd’?
Until the standard is clarified, behavior in such cases should not be relied on.
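Following that advice, one option is to drop the backreference and do the comparison in the shell itself. The sketch below is an illustration under assumptions, not a general replacement: the helper name has_doubled_letter is hypothetical, and it stands in for a pattern such as ([[:lower:]])\1 (the literal (a)\1 from the question could simply be written as aa).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $1 contains the same lowercase letter
# twice in a row, i.e. what ([[:lower:]])\1 would match where
# backreferences are available. Uses only substring expansion and glob
# matching, so it does not depend on the system regex library.
has_doubled_letter() {
  local s=$1 i c
  for ((i = 0; i < ${#s} - 1; i++)); do
    c=${s:i:1}
    # Unquoted right-hand side is a glob pattern; the quoted one is literal.
    if [[ $c == [[:lower:]] && $c == "${s:i+1:1}" ]]; then
      return 0
    fi
  done
  return 1
}

has_doubled_letter aa && echo okay || echo notokay   # prints "okay"
```

The loop is slower than a single regex match, but it should behave the same on Linux and macOS because no regex is involved.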
