How do I check if a string matches this pattern?
Uppercase letter, number(s), uppercase letter, number(s)...
For example, these would match:
A1B2
B10L1
C1N200J1
These wouldn't ('^' points to the problem):
a1B2
^
A10B
    ^
AB400
 ^
^([A-Z]\d+){1,}$ like this?
B and not with A.
A and B are small letters, right? A10b and aB400?
import re
pattern = re.compile(r"^([A-Z][0-9]+)+$")
pattern.match(string)  # `string` is the text to test; returns a match object, or None if it does not match
One-liner: re.match(r"pattern", string) # No need to compile
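As a rough sketch, the check can be wrapped in a small helper and tried against the examples from the question. The name `matches_pattern` is made up for illustration, and `fullmatch` (Python 3.4+) is swapped in here for the explicit `^`/`$` anchors; it is an alternative, not what the answer above uses.

```python
import re

# Hypothetical names for illustration; not part of the re module.
PATTERN = re.compile(r"([A-Z][0-9]+)+")

def matches_pattern(s):
    """Return True if s consists only of uppercase-letter/number pairs."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return PATTERN.fullmatch(s) is not None

for s in ["A1B2", "B10L1", "C1N200J1", "a1B2", "A10B", "AB400"]:
    print(s, matches_pattern(s))
```

One difference worth knowing: `$` also matches just before a trailing newline, whereas `fullmatch` rejects a string such as "A1\n".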
>>> import re
>>> if re.match(r"hello[0-9]+", 'hello1'):
...     print('Yes')
...
Yes
You can evaluate it as a bool if needed:
>>> bool(re.match(r"hello[0-9]+", 'hello1'))
True
re.match in the context of an if, but you have to use bool if you're using it elsewhere?
re.match. It only matches at the start of a string. Have a look at re.search instead.
if checks for the match not being None.
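To illustrate the point about None: re.match returns a match object on success and None otherwise, so the result can be used directly in an `if`, compared with `is not None`, or passed to `bool`. A minimal sketch; the helper name `starts_with_hello_number` is hypothetical.

```python
import re

def starts_with_hello_number(s):
    # Hypothetical helper: an explicit None check instead of bool(...).
    return re.match(r"hello[0-9]+", s) is not None

m = re.match(r"hello[0-9]+", "hello1")
print(m is not None)                    # a match object was returned
print(re.match(r"hello[0-9]+", "bye"))  # no match, so this prints None
```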
re is used in more than one place to improve efficiency. In terms of errors, .match would throw the same error that .compile does. It's perfectly safe to use.
The functions in the re module compile and cache the patterns. Therefore there is essentially no efficiency gain from using compile and then match rather than directly calling re.match. All of these functions call the internal function _compile (including re.compile), which caches compiled patterns in a Python dictionary.
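A small sketch of the two styles discussed in these comments. It only compares results and error behaviour; the caching itself is an internal detail of the re module, so it is not inspected here. The names `check_both` and `raises_re_error` are hypothetical.

```python
import re

PATTERN_TEXT = r"^([A-Z][0-9]+)+$"
compiled = re.compile(PATTERN_TEXT)

def check_both(s):
    """Return (result via compiled pattern, result via module-level call)."""
    via_compiled = compiled.match(s) is not None
    via_module = re.match(PATTERN_TEXT, s) is not None
    return via_compiled, via_module

def raises_re_error(func):
    """Return True if calling func raises re.error (invalid pattern)."""
    try:
        func()
    except re.error:
        return True
    return False

print(check_both("A1B2"))
print(check_both("AB400"))
# An invalid pattern fails the same way through either entry point.
print(raises_re_error(lambda: re.compile("(")))
print(raises_re_error(lambda: re.match("(", "x")))
```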
Please try the following:
import re
name = ["A1B1", "djdd", "B2C4", "C2H2", "jdoi", "1A4V"]
# Match names that start with letter, digit, letter, digit.
for element in name:
    m = re.match(r"(^[A-Z]\d[A-Z]\d)", element)
    if m:
        print(m.groups())
import re
import sys
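prog = re.compile(r'([A-Z]\d+)+$')  # $ added so a valid prefix alone (e.g. A10B) is not accepted
while True:
    line = sys.stdin.readline()
    if not line:
        break
    if prog.match(line):  # $ also matches just before the trailing newline
        print('matched')
    else:
        print('not matched')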
import re
ab = re.compile(r"^([A-Z][0-9]+)+$")  # [0-9]+ allows multi-digit numbers such as B10
ab.match(string)
I believe that should work for an uppercase, number pattern.
Regular expressions make this easy ...
[A-Z] will match exactly one character between A and Z
\d+ will match one or more digits
() groups things (and also captures them, but for now just think of it as grouping)
+ matches one or more of the preceding item
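The pieces above can be put together as one annotated pattern. This is a sketch using `re.VERBOSE`, which allows whitespace and comments inside the pattern; the variable name `pair_pattern` is an assumption for illustration.

```python
import re

pair_pattern = re.compile(
    r"""
    ^            # start of the string
    (            # begin group
      [A-Z]      # exactly one uppercase letter
      \d+        # one or more digits
    )+           # end group; repeat the group one or more times
    $            # end of the string
    """,
    re.VERBOSE,
)

print(bool(pair_pattern.match("C1N200J1")))
print(bool(pair_pattern.match("AB400")))
```

Because the group is repeated, `group(1)` on a successful match holds only the last letter/number pair, which is the "return things" side of the parentheses.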
As stated in the comments, all these answers using re.match implicitly match at the start of the string. re.search is needed if you want to generalize to the whole string.
import re
pattern = re.compile("([A-Z][0-9]+)+")
# finds match anywhere in string
bool(re.search(pattern, 'aA1A1')) # True
# matches on start of string, even though pattern does not have ^ constraint
bool(re.match(pattern, 'aA1A1')) # False
Credit: @LondonRob and @conradkleinespel in the comments.
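A short sketch extending this comparison: without a trailing `$`, re.match accepts any string that merely starts with a valid prefix, so the end of the string needs anchoring too. `re.fullmatch` (Python 3.4+) is one way to do that; the helper name `classify` is hypothetical.

```python
import re

pattern = re.compile(r"([A-Z][0-9]+)+")

def classify(s):
    """Report which of search / match / fullmatch succeed for s."""
    return {
        "search": pattern.search(s) is not None,        # anywhere in the string
        "match": pattern.match(s) is not None,          # at the start only
        "fullmatch": pattern.fullmatch(s) is not None,  # the entire string
    }

print(classify("aA1A1"))
print(classify("A10B"))
print(classify("A1B2"))
```

For "A10B", `match` succeeds on the prefix "A10" even though the string as a whole does not fit the pattern; only `fullmatch` (or an explicit `$`) rejects it.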
re.match: "If zero or more characters at the beginning of string match the regular expression pattern". I just spent like 30 minutes trying to understand why I couldn't match something at the end of a string. Seems like it's not possible with match, is it? For that, re.search(pattern, my_string) works though.
^ at the beginning when you use match. I think it's a bit more complicated than that very simple explanation, but I'm not clear. You are correct that it does start from the beginning of the string though.
search() in this context.
search()". It works perfectly fine with match.