Remove duplicate chars and digits using regex

>>> import re
>>> re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
'fbq'
>>> re.sub(r'([0-9])\1+', r'\1', '11111222222334')
'1234'

The parentheses around [a-z] define a capture group, and the \1 (a backreference) in both the pattern and the replacement refers to the contents of that first capture group.
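To see what the capture group holds, it can help to inspect a match directly. The snippet below is a small sketch using the same pattern; the sample string 'xxaaab' and the variable names are made up for illustration.

```python
import re

# Same pattern as above: one lowercase letter, then one or more copies of it.
pattern = re.compile(r'([a-z])\1+')

# 'xxaaab' is an arbitrary sample string chosen for this illustration.
match = pattern.search('xxaaab')
whole_run = match.group(0)     # the full matched run
captured = match.group(1)      # what the capture group (and so \1) holds

# With a single capture group, findall returns the group contents per match.
repeated = pattern.findall('xxaaab')

print(whole_run)   # xx
print(captured)    # x
print(repeated)    # ['x', 'a']
```

The trailing 'b' is not reported because it appears only once, so there is nothing for \1+ to match after it.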

Thus, the regex reads "find a letter, followed by one or more occurrences of that same letter", and the entire matched portion is then replaced with a single occurrence of that letter. The second example works the same way, with [0-9] matching digits instead of letters.
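If both letters and digits need collapsing in one pass, the two character classes can be merged. This is only a sketch: the helper name collapse_repeats and the combined class [a-z0-9] are assumptions for illustration, not part of the original answer.

```python
import re

# Hypothetical helper: merges the two character classes used in the examples.
_REPEATED_RUN = re.compile(r'([a-z0-9])\1+')

def collapse_repeats(text):
    """Replace each run of a repeated lowercase letter or digit with one copy."""
    return _REPEATED_RUN.sub(r'\1', text)

print(collapse_repeats('ffffffbbbbbbbqqq'))  # fbq
print(collapse_repeats('aaa111bbb22c'))      # a1b2c
print(collapse_repeats('abab'))              # abab (repeats are not adjacent)
print(collapse_repeats('AAbb'))              # AAb (uppercase is outside the class)
```

Only adjacent repeats are collapsed, and the match is case-sensitive; widening the class (for example to [A-Za-z0-9]) would be needed to cover uppercase letters as well.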

Reference: Remove duplicate chars using regex?