Compiled re should use a raw string with escape sequences

Duncan Macleod requested to merge github/fork/eagoetz/raw-str-regexp into master

Created by: eagoetz

This PR switches the compiled regular expressions to raw strings. When first running gwtrigfind, I saw the following warnings:

/home/evan.goetz/lscrepos/gwtrigfind-eg/gwtrigfind/ SyntaxWarning: invalid escape sequence '\A'
  daily_cbc = re.compile('\Adaily[\s_-]cbc\Z')
/home/evan.goetz/lscrepos/gwtrigfind-eg/gwtrigfind/ SyntaxWarning: invalid escape sequence '\A'
  pycbc_live = re.compile('\Apycbc[\s_-]live\Z')
/home/evan.goetz/lscrepos/gwtrigfind-eg/gwtrigfind/ SyntaxWarning: invalid escape sequence '\A'
  kleinewelle = re.compile('\A(kw|kleinewelle)\Z', re.I)
/home/evan.goetz/lscrepos/gwtrigfind-eg/gwtrigfind/ SyntaxWarning: invalid escape sequence '\A'
  dmt_omega = re.compile('\Admt([\s_-])?omega\Z', re.I)
/home/evan.goetz/lscrepos/gwtrigfind-eg/gwtrigfind/ SyntaxWarning: invalid escape sequence '\A'
  omega = re.compile('\Aomega([\s_-])?(online)?\Z', re.I)

Interestingly, the warnings go away on a second run, so they are easy to miss after the first time the code is run.
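
A likely explanation, offered as an assumption rather than something verified against this repository: Python emits invalid-escape warnings when it compiles a source file to bytecode, and later runs load the cached bytecode from `__pycache__` without recompiling. The sketch below reproduces the compile-time warning with the built-in `compile()`; the `source` string, the `<example>` filename, and the helper `compile_warnings` are illustrative only.

```python
import warnings

# Illustrative source text containing the same kind of non-raw pattern.
source = "import re\npattern = re.compile('\\Adaily[\\s_-]cbc\\Z')\n"

def compile_warnings(src):
    """Compile src (without running it) and return the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(src, "<example>", "exec")
    return [w.category for w in caught]

# Same source, once as written and once with the r prefix added.
non_raw = compile_warnings(source)
raw = compile_warnings(source.replace("re.compile('", "re.compile(r'"))

# DeprecationWarning on Python 3.6 to 3.11, SyntaxWarning on 3.12 and later.
print(non_raw)
print(raw)
```

The code is only compiled, never executed, which shows that the warning comes from parsing the string literal and not from the `re` module.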

According to the documentation for the re module:

If you’re not using a raw string to express the pattern, remember that Python also uses the backslash as an escape sequence in string literals; if the escape sequence isn’t recognized by Python’s parser, the backslash and subsequent character are included in the resulting string. However, if Python would recognize the resulting sequence, the backslash should be repeated twice. This is complicated and hard to understand, so it’s highly recommended that you use raw strings for all but the simplest expressions.

This PR converts the compiled regexp patterns to raw strings, which fixes the warnings.
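
As a minimal sketch of the change (the actual diff is not reproduced here, and the names `PATTERNS` and `match_name` are hypothetical), the five patterns from the warnings above can be written as raw strings:

```python
import re

# Patterns copied from the warnings above, with an r prefix added.
PATTERNS = {
    "daily_cbc": re.compile(r'\Adaily[\s_-]cbc\Z'),
    "pycbc_live": re.compile(r'\Apycbc[\s_-]live\Z'),
    "kleinewelle": re.compile(r'\A(kw|kleinewelle)\Z', re.I),
    "dmt_omega": re.compile(r'\Admt([\s_-])?omega\Z', re.I),
    "omega": re.compile(r'\Aomega([\s_-])?(online)?\Z', re.I),
}

def match_name(text):
    """Return the name of the first pattern that matches text, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.match(text):
            return name
    return None

print(match_name("daily_cbc"))
print(match_name("KW"))
```

Because `\A`, `\s`, and `\Z` are not recognized string escapes, the raw and non-raw literals produce the same string, so matching behavior is unchanged and only the warnings go away.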

