My solution uses 2-step logic. The reason you cannot do this in one go has to do with the limitations of Python's re. We'll save that for later; if you're interested, look at Addendum 1.
The 2 steps: the first step checks that the '-' are in the right places, while the second one checks that no digit occurs 4 times in a row.
I will start with the 2nd step, the more demanding one: a regex that checks that there are no 4 consecutive equal digits. It assumes the dashes have already been removed. The following regex will do:
((\d)(?!\2{3})){16}
Explanation:
( # group 1 start
(\d) # group 2: match a digit
(?!\2{3}) # negative lookahead: not 3 times group 2
){16} # repeat that 16 times.
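To see the lookahead at work, here is a minimal sketch in Python 3. The helper name has_no_run_of_four and the sample numbers are made up for illustration, and fullmatch is used so that exactly 16 digits are required.

```python
import re

# Same regex as above: every digit must NOT be followed by 3 copies of itself.
NO_FOUR_IN_A_ROW = re.compile(r"((\d)(?!\2{3})){16}")

def has_no_run_of_four(digits):
    """True if the dash-free, 16-digit string has no digit 4 times in a row."""
    return NO_FOUR_IN_A_ROW.fullmatch(digits) is not None

print(has_no_run_of_four("5123456789123456"))  # True
print(has_no_run_of_four("5133336789123456"))  # False: 3333
print(has_no_run_of_four("5133366789123456"))  # True: 333 is still allowed
```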
The first step matches groups of 4 digits, optionally separated by '-' (look at example 2). The problem to solve here is to make sure that if the first and second groups of digits are separated by a '-', then all groups are separated by a '-'. We manage that by using a backreference to group 2 in the next regex.
(\d{4})(-?)(\d{4})(\2\d{4}){2}
Explanation:
(\d{4}) # starting 4 digits
(-?) # group 2 contains a '-' or not
(\d{4}) # 2nd group of 4 digits
(\2\d{4}){2} # last 2 groups, starting with a backreference
# to group 2 ( a '-' or not)
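A small Python 3 sketch of how the backreference enforces "all dashes or no dashes". The helper name dashes_ok and the test strings are assumptions for illustration only.

```python
import re

# Group 2 captures '-' or the empty string; \2 must then repeat the same choice.
DASH_LAYOUT = re.compile(r"(\d{4})(-?)(\d{4})(\2\d{4}){2}")

def dashes_ok(card):
    """True if the string is 4 groups of 4 digits with consistent separators."""
    return DASH_LAYOUT.fullmatch(card) is not None

print(dashes_ok("5123-4567-8912-3456"))  # True: dashes everywhere
print(dashes_ok("5123456789123456"))     # True: no dashes at all
print(dashes_ok("5123-45678912-3456"))   # False: mixed
print(dashes_ok("51234567-8912-3456"))   # False: mixed
```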
Example program:
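import re

# Step 1: four groups of 4 digits, either all separated by '-' or none.
pattern1 = r"(\d{4})(-?)(\d{4})(\2\d{4}){2}"
# Step 2: no digit may be followed by three more copies of itself.
pattern2 = r"((\d)(?!\2{3})){16}"

tests = ["5123-4567-8912-3456"]
for elt in tests:
    # fullmatch rejects trailing characters that re.match would ignore
    if re.fullmatch(pattern1, elt):
        print("example has dashes in correct place")
        elt = elt.replace("-", "")
        if re.fullmatch(pattern2, elt):
            print("...and has the right numbers.")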
Addendum 1: Now for dessert. I've put a regex together to do this in one go. Let's think about what can follow each digit, depending on its position in a group:
1st digit: followed by 3 digits
2nd digit: followed by 3 digits OR digit, digit, dash, digit
3rd digit: followed by 3 digits OR digit, dash, digit, digit
4th digit: followed by 3 digits OR dash, digit, digit, digit
So, for the lookahead we used in example 1, we need to spell out all possible follow-ups for each digit. Let's have a look at a pattern for a group of 4 digits:
(
(\d) # the digit at hand
(?! # negative lookahead
\2{3} # digit, digit, digit
|\2{2}-\2 # OR digit, digit, dash, digit
|\2-\2{2} # OR digit, dash, digit, digit
|-\2{3} # OR dash, digit, digit, digit
)
){4} # 4 times, for each digit in a group of 4
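This part of the pattern uses nothing that Python's re lacks, so it can be tried directly. Below is a minimal sketch using re.VERBOSE to keep the commented layout. The helper name first_group_ok is hypothetical, and re.match is used on purpose because the pattern only covers the first group of 4 digits.

```python
import re

GROUP_OF_FOUR = re.compile(r"""
    (
        (\d)            # the digit at hand
        (?!             # negative lookahead
            \2{3}       # digit, digit, digit
          | \2{2}-\2    # OR digit, digit, dash, digit
          | \2-\2{2}    # OR digit, dash, digit, digit
          | -\2{3}      # OR dash, digit, digit, digit
        )
    ){4}                # 4 times, for each digit in a group of 4
""", re.VERBOSE)

def first_group_ok(text):
    """True if none of the first 4 digits starts a run of 4 equal digits."""
    return GROUP_OF_FOUR.match(text) is not None

print(first_group_ok("1222-3456"))  # True
print(first_group_ok("1222-2345"))  # False: 222-2
print(first_group_ok("1233-3345"))  # False: 33-33
print(first_group_ok("1234-4445"))  # False: 4-444
```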
We would like to expand that to 16 digits, of course. We need to define when a '-' may appear before a digit. A simple -? won't do, because a credit card number doesn't start with a dash. Let's use a conditional:
(? # if
(?<=\d{4}) # lookbehind: there are 4 preceding digits
-? # then: '-' or not
| # else: nothing
)
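For experimenting in Python, the same idea can be approximated without a conditional by putting the lookbehind inside an optional group. This is a sketch of a substitute, not the construct shown above; as far as I can tell it accepts the same strings. The names DASH_AFTER_GROUP and SIXTEEN_DIGITS are made up for illustration.

```python
import re

# Optional dash that is only allowed directly after four digits.
DASH_AFTER_GROUP = r"(?:(?<=\d{4})-)?"
# 16 digits, each optionally preceded by such a dash.
SIXTEEN_DIGITS = re.compile(r"(?:" + DASH_AFTER_GROUP + r"\d){16}")

for candidate in ["5123-4567-8912-3456",
                  "5123456789123456",
                  "-5123456789123456",    # leading dash: rejected
                  "51-23456789123456"]:   # dash after only 2 digits: rejected
    print(candidate, SIXTEEN_DIGITS.fullmatch(candidate) is not None)
```

Note that this only checks what precedes each dash; it does not check that dashes are used consistently across all groups.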
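Combined with the digit pattern above, this brings us to the full regex; look at example 3. We need the \b on both sides so that the match cannot begin or end in the middle of a longer run of digits or letters. Note that \b alone does not force the match to cover the complete string; use ^ and $ for that.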
Let's be fair: one may doubt whether this is the way to go. On the upside, we now have a valid reason for doing it in 2 steps: Python's standard re only supports conditionals that test whether a group matched, (?(1)...|...), not conditionals with a lookaround as the condition. You can work around this by using a replacement. Or switch programming language. ;-)
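If a single pattern is wanted in plain Python anyway, one possible route is two lookaheads anchored at the start: one for the dash layout and one that scans for a repeated digit while skipping dashes. This is a different technique from the conditional regex in this addendum, sketched here under my own assumptions; ONE_PASS and is_valid are hypothetical names.

```python
import re

ONE_PASS = re.compile(r"""
    (?= \d{4} (-?) \d{4} (?:\1\d{4}){2} \Z )  # layout: dashes everywhere or nowhere
    (?! .* (\d) (?:-?\2){3} )                 # no digit 4 times in a row, dashes skipped
    [\d-]+                                    # consume the whole string
""", re.VERBOSE)

def is_valid(card):
    """True if the layout is right and no digit repeats 4 times in a row."""
    return ONE_PASS.fullmatch(card) is not None

print(is_valid("5123-4567-8912-3456"))  # True
print(is_valid("5133-3367-8912-3456"))  # False: 33-33 across a dash
print(is_valid("5123-45678912-3456"))   # False: mixed separators
```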
Addendum 2: People asked me where the 16 comes from in example 3. Isn't it true that the complete string can be 19 characters long? The reason is that whenever the inner regex (group 1) matches once, it matches either [0-9] or -[0-9]. That match has to succeed exactly 16 times.
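The counting can be checked with a stripped-down stand-in for the inner group. This sketch keeps only the "digit, optionally preceded by a dash" part and leaves out the lookbehind and lookahead, so it illustrates the arithmetic and nothing else.

```python
import re

# Each repetition consumes one digit, plus a dash when there is one in front of it.
SIXTEEN_STEPS = re.compile(r"(?:-?\d){16}")

with_dashes = SIXTEEN_STEPS.fullmatch("5123-4567-8912-3456")
without_dashes = SIXTEEN_STEPS.fullmatch("5123456789123456")

print(len(with_dashes.group(0)))     # 19 characters: 16 digits + 3 dashes
print(len(without_dashes.group(0)))  # 16 characters: 16 digits
```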