Fix Python – Handling backreferences to capturing groups in re.sub replacement pattern

Question

Asked By – Richard

I want to take the string 0.71331, 52.25378 and return 0.71331,52.25378 – i.e. just look for a digit, a comma, a space and a digit, and strip out the space.

This is my current code:

import re

coords = '0.71331, 52.25378'
coord_re = re.sub("(\d), (\d)", "\1,\2", coords)
print(coord_re)

But this gives me 0.7133,2.25378. What am I doing wrong?


Answer

You should use raw strings for both the regex pattern and the replacement string. Try the following:

coord_re = re.sub(r"(\d), (\d)", r"\1,\2", coords)

With your current code, Python reads the backslashes in the replacement string as octal escapes for the digits that follow, so you are replacing every match with the equivalent of chr(1) + "," + chr(2):

>>> '\1,\2'
'\x01,\x02'
>>> print('\1,\2')   # the two control characters are invisible
,
>>> print(r'\1,\2')   # this is what you actually want
\1,\2
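
To see what the non-raw replacement does to the input from the question, it can help to inspect the result with repr(), which makes the control characters visible. This is a minimal sketch; the names `broken` and `fixed` are illustrative and do not appear in the original post.

```python
import re

coords = '0.71331, 52.25378'

# Non-raw replacement: Python turns "\1" and "\2" into chr(1) and chr(2)
# before re.sub ever sees them, so no backreference is left.
broken = re.sub(r"(\d), (\d)", "\1,\2", coords)
print(repr(broken))   # '0.7133\x01,\x022.25378'

# Raw replacement: re.sub receives the backslashes and expands the groups.
fixed = re.sub(r"(\d), (\d)", r"\1,\2", coords)
print(repr(fixed))    # '0.71331,52.25378'
```

The invisible characters explain why the printed output looked like 0.7133,2.25378: the matched digits on either side of the comma were replaced rather than kept.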

Any time you want to leave the backslash in the string, use the r prefix, or escape each backslash (\\1,\\2).
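
As a quick check of the two spellings mentioned above, the sketch below confirms that they build the same replacement string and give the same result. It also shows the `\g<1>` form of a backreference, which the re module accepts in replacement strings; that form is not part of the original answer and is included only as an alternative. The variable names are illustrative.

```python
import re

coords = '0.71331, 52.25378'
pattern = r"(\d), (\d)"

# Both spellings build the same five-character replacement string.
assert r"\1,\2" == "\\1,\\2"

with_raw = re.sub(pattern, r"\1,\2", coords)
with_escapes = re.sub(pattern, "\\1,\\2", coords)

# \g<N> is an alternative backreference syntax in replacement strings;
# it still needs its backslash preserved, hence the raw string.
with_g = re.sub(pattern, r"\g<1>,\g<2>", coords)

print(with_raw)   # 0.71331,52.25378
```

The raw-string form is usually the easiest to read, since the replacement looks exactly like what re.sub receives.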

This question is answered By – Andrew Clark

This answer was collected from Stack Overflow, reviewed by FixPython community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.