Validate card numbers using regex python #21

Open
JiangWeixian opened this issue Oct 20, 2021 · 0 comments

My solution uses a 2-step logic. The reason you cannot do this in one go has to do with the limitations of Python's re module. We'll save that for later. If you're interested, look at the first addendum.

2 steps: the first step checks whether the '-' are in the right place, while the second checks that no digit occurs 4 times in a row.

I will start with the 2nd step, the most memory-consuming one: a regex that checks that no digit occurs 4 times in a row. The following regex will do:

((\d)(?!\2{3})){16}

Explanation:

(                       # group 1 start
  (\d)                  # group 2: match a digit
  (?!\2{3})             # negative lookahead: not 3 times group 2
){16}                   # repeat that 16 times.

Look at example 1.
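To see the step 2 pattern at work, here is a minimal sketch in Python. The helper name `has_no_four_repeats` and the sample numbers are my own, only for illustration, and the input is assumed to have had its dashes removed already.

```python
import re

# Step 2 pattern from the text: each digit must not be followed
# by 3 more copies of itself.
NO_FOUR_REPEATS = re.compile(r"((\d)(?!\2{3})){16}")

def has_no_four_repeats(digits):
    """Return True if `digits` is exactly 16 digits with no run of 4 equal digits."""
    # fullmatch requires the whole string to be consumed by the pattern
    return NO_FOUR_REPEATS.fullmatch(digits) is not None

print(has_no_four_repeats("5123456789123456"))  # True
print(has_no_four_repeats("5123444489123456"))  # False: "4444"
print(has_no_four_repeats("5123444589123456"))  # True: only three 4s in a row
```

Note that only the first digit of a run triggers the lookahead: for "4444" the first 4 sees "444" ahead and the match fails there.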

The first step matches groups of 4 digits, optionally separated by '-' (look at example 2). The problem to solve here is to make sure that if the first and second groups of digits are separated by a '-', then all groups are separated by a '-'. We manage that by using a backreference to group 2 in the next regex.

(\d{4})(-?)(\d{4})(\2\d{4}){2}

Explanation:

(\d{4})                 # starting 4 digits
(-?)                    # group 2 contains a '-' or not
(\d{4})                 # 2nd group of 4 digits
(\2\d{4}){2}            # last 2 groups, starting with a backreference
                        # to group 2 ( a '-' or not)
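A short Python sketch of how the backreference forces the dashes to be all-or-nothing. The helper name `has_valid_layout` and the test strings are assumptions of mine, not part of the original answer.

```python
import re

# Step 1 pattern from the text: group 2 captures '-' or the empty string,
# and \2 forces the later separators to be the same.
DASH_LAYOUT = re.compile(r"(\d{4})(-?)(\d{4})(\2\d{4}){2}")

def has_valid_layout(card):
    """Return True if `card` is 4 groups of 4 digits, with all dashes or none."""
    return DASH_LAYOUT.fullmatch(card) is not None

print(has_valid_layout("5123-4567-8912-3456"))  # True: dashes everywhere
print(has_valid_layout("5123456789123456"))     # True: no dashes at all
print(has_valid_layout("5123-456789123456"))    # False: only the first dash
print(has_valid_layout("51234567-8912-3456"))   # False: first dash missing
```

When group 2 is empty, \2 matches the empty string, so the remaining groups must follow each other directly.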

Example program:

 import re

 pattern1 = r"(\d{4})(-?)(\d{4})(\2\d{4}){2}"   # step 1: dash layout
 pattern2 = r"((\d)(?!\2{3})){16}"              # step 2: no 4 equal digits in a row

 tests = ["5123-4567-8912-3456"]

 for elt in tests:
     # fullmatch, so trailing characters cannot slip through
     if re.fullmatch(pattern1, elt):
         print("example has dashes in correct place")
         elt = elt.replace("-", "")
         if re.fullmatch(pattern2, elt):
             print("...and has the right numbers.")

Addendum 1: Now for dessert. I've put a regex together to do this in one go. Let's think about what is needed for every digit depending on its position in a group:

  • 1st digit: followed by 3 digits
  • 2nd digit: followed by 3 digits OR digit, digit, dash, digit
  • 3rd digit: followed by 3 digits OR digit, dash, digit, digit
  • 4th digit: followed by 3 digits OR dash, digit, digit, digit

So, for the lookahead we used in example 1, we need to list every possible follow-up for each digit. Let's have a look at a pattern for a group of 4 digits:

(
  (\d)             # the digit at hand
  (?!              # negative lookahead
   \2{3}           # digit, digit, digit
  |\2{2}-\2        # OR digit, digit, dash, digit
  |\2-\2{2}        # OR digit, dash, digit, digit
  |-\2{3}          # OR dash, digit, digit, digit
  )
){4}               # 4 times, for each digit in a group of 4
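This part contains nothing Python's re can't handle, so it can be tried out directly. A minimal sketch, using re.VERBOSE to keep the layout above; the helper name `first_group_ok` and the sample strings are mine.

```python
import re

# The group-of-4 pattern from the text, compiled with re.VERBOSE so the
# whitespace and the '#' comments are ignored by the regex engine.
GROUP_OF_FOUR = re.compile(r"""
(
  (\d)             # the digit at hand
  (?!              # negative lookahead
   \2{3}           # digit, digit, digit
  |\2{2}-\2        # OR digit, digit, dash, digit
  |\2-\2{2}        # OR digit, dash, digit, digit
  |-\2{3}          # OR dash, digit, digit, digit
  )
){4}               # 4 times, for each digit in a group of 4
""", re.VERBOSE)

def first_group_ok(card):
    """True if none of the first 4 digits starts a run of 4 equal digits,
    even when that run continues across a dash."""
    # match() anchors at the start only; the rest of the string is lookahead context
    return GROUP_OF_FOUR.match(card) is not None

print(first_group_ok("1233-3456"))  # True: only three 3s
print(first_group_ok("1233-3345"))  # False: 3, 3, dash, 3, 3
print(first_group_ok("1222-2345"))  # False: 2, 2, 2, dash, 2
```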

We would like to expand that to 16 digits, of course. We need to define whether a '-' may precede the digit. A simple -? won't do, because a credit card number doesn't start with a dash. Let's use a conditional:

(?                 # if
  (?<=\d{4})       # lookbehind: there are 4 preceding digits
  -?               # then: '-' or not
  |                # else: nothing
)

Combined, this brings us to:

\b((?(?<=\d{4})-?|)(\d)(?!\2{3}|\2{2}-\2|\2-\2{2}|-\2{3})){16}\b

Look at example 3. We need the \b on both sides because we want to make sure that, whenever the match succeeds, it cannot begin or end in the middle of a longer run of digits or other word characters.

Let's be fair: one may doubt whether this is the way to go. On the upside, we now have a valid reason for doing it in 2 steps: Python's standard re only supports conditionals that test whether a capture group matched, not conditionals with a lookaround as the condition. You can work around this by using a replacement. Or switch programming language. ;-)
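One possible replacement, as a sketch: write the conditional as an optional non-capturing group, (?:(?<=\d{4})-)?, which Python's re does accept because the lookbehind has a fixed width. This rewrite is my assumption about what such a workaround could look like, and the name `is_valid_one_go` is made up for the example.

```python
import re

# The combined pattern, with the conditional (?(?<=\d{4})-?|) replaced by an
# optional group: "a dash, but only if 4 digits precede it, or nothing".
ONE_GO = re.compile(
    r"\b((?:(?<=\d{4})-)?(\d)(?!\2{3}|\2{2}-\2|\2-\2{2}|-\2{3})){16}\b"
)

def is_valid_one_go(card):
    """Return True if the whole string matches the one-go pattern."""
    return ONE_GO.fullmatch(card) is not None

print(is_valid_one_go("5123-4567-8912-3456"))  # True
print(is_valid_one_go("5123456789123456"))     # True
print(is_valid_one_go("5133-3367-8912-3456"))  # False: 3, 3, dash, 3, 3
print(is_valid_one_go("5123-45678912-3456"))   # True: see the note below
```

Keep in mind that the lookbehind only asks for 4 digits before a dash, so this sketch, like the combined pattern it mirrors, does not check that the dashes are used consistently. The step 1 pattern with its backreference does.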

Addendum 2: People asked me where the 16 comes from in example 3. Isn't it true that the complete string can be 19 characters long? The reason is that each time the inner regex (group 1) matches, it matches either [0-9] or -[0-9]. That match has to succeed exactly 16 times.
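The counting can be made visible with a small sketch that splits a number into the pieces matched by one pass of group 1. The repeat lookahead is left out for brevity, and the optional dash is written as (?:(?<=\d{4})-)? so that Python's re accepts it; both are simplifications of mine, as is the name `iterations`.

```python
import re

# One pass of group 1, simplified: an optional dash (allowed only right
# after 4 digits) followed by exactly one digit.
ONE_ITERATION = re.compile(r"(?:(?<=\d{4})-)?\d")

def iterations(card):
    """Return the pieces consumed by successive passes of the simplified group."""
    return ONE_ITERATION.findall(card)

pieces = iterations("5123-4567-8912-3456")
print(pieces)                        # ['5', '1', '2', '3', '-4', '5', ...]
print(len(pieces))                   # 16 passes
print(sum(len(p) for p in pieces))   # 19 characters consumed in total
```

Three of the 16 pieces are two characters long ('-4', '-8', '-3'), which accounts for the 19 characters.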
