ID Verification Improvements

Matt Bradley
January 26, 2017

We love finding new ways that AI can make banking safer, easier, and more secure. That's why we're excited to announce new updates to our computer vision stack.

Updates include:

  •   Better facial detection
  •   Faster and more flexible text search
  •   Context-specific security levels
  •   More useful feedback 

These changes will help us verify images more accurately, which is especially helpful when validating documents like photo IDs as part of our KYC ("Know Your Customer") process.

Better Facial Detection

Our image processing is unique because we accept imperfect image submissions, which means people anywhere in the world can access banking services with just a cellphone.

The sample ID we are using below is much cleaner than the cellphone images we process every day, but it exhibits a common problem: the edges don't stand out from the background. Low contrast makes computer vision difficult, so our new algorithm automatically detects and corrects this problem. With the contrast adjusted into our ideal range, the algorithm is able to detect the ID and both face photos.
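To make the contrast step concrete, here is a minimal sketch of linear contrast stretching on a grayscale image stored as a 2D list. It is an illustration only: the post does not describe the actual correction method, and `contrast_range`, `stretch_contrast`, and the `min_spread` threshold are hypothetical names and values.

```python
def contrast_range(pixels):
    """Return (darkest, brightest) value in a 2D list of 0-255 grayscale ints."""
    flat = [value for row in pixels for value in row]
    return min(flat), max(flat)


def stretch_contrast(pixels, min_spread=100):
    """Linearly stretch values to 0-255 when the spread is too narrow.

    min_spread is a hypothetical threshold, not a value from the post.
    """
    lo, hi = contrast_range(pixels)
    spread = hi - lo
    if spread == 0 or spread >= min_spread:
        # A flat image cannot be stretched; a wide spread needs no correction.
        return [row[:] for row in pixels]
    scale = 255 / spread
    return [[round((value - lo) * scale) for value in row] for row in pixels]


# A washed-out image: every value sits in a narrow band.
washed_out = [[120, 130, 140], [125, 135, 150]]
corrected = stretch_contrast(washed_out)
print(contrast_range(washed_out), "->", contrast_range(corrected))
```

After stretching, the darkest pixel maps to 0 and the brightest to 255, which gives edge detection more signal to work with.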

Faster and More Flexible Text Search

Because our AI is integrative, improvements in one area have systemic benefits. When IDs are rotated in the image field, having two positive face detections allows us to quickly re-orient the image. Spending less time rotating images means our algorithms can now verify personal information in as few as 5 seconds for our best image submissions. For noisy images, we now have time to run 50% more pre-processing filters than before, leading to more successful auto-verifications.
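One way two face detections could fix orientation is by comparing the direction between them with the direction expected on an upright ID. The sketch below assumes that approach; the post does not say how the re-orientation is actually computed, and `rotation_to_upright`, its arguments, and the `expected_angle` default are hypothetical.

```python
import math


def rotation_to_upright(main_face, second_face, expected_angle=0.0):
    """Return the correction angle in degrees, in the range [0, 360).

    main_face and second_face are (x, y) centers of the two detected faces.
    expected_angle is the direction from the main face to the second face
    on an upright ID (hypothetical: 0 means "directly to the right").
    """
    dx = second_face[0] - main_face[0]
    dy = second_face[1] - main_face[1]
    observed = math.degrees(math.atan2(dy, dx))
    # Angle to add to the observed direction so it matches the expected one.
    return (expected_angle - observed) % 360


# The second face sits directly below the main face in image coordinates,
# so the observed direction is 90 degrees away from the expected one.
correction = rotation_to_upright((100, 100), (100, 300))
print(round(correction))
```

Because the correction comes from a single angle calculation, there is no need to try several rotations and re-run detection on each one.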

The ID has been re-oriented, and our algorithm begins reading. Zooming in now, we can see the extracted text, with fuzzy match ratios. Based on these ratios and other criteria, the ID is either accepted or denied by our algorithm.
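A fuzzy match ratio can be sketched with Python's standard `difflib` module. This is a stand-in: the post does not name the matching algorithm in use, and `fuzzy_ratio` and `field_matches` are hypothetical helpers.

```python
from difflib import SequenceMatcher


def fuzzy_ratio(expected, extracted):
    """Similarity in [0, 1], ignoring case and surrounding whitespace."""
    a = expected.strip().lower()
    b = extracted.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def field_matches(expected, extracted, threshold):
    """Accept the extracted text when its ratio meets the threshold."""
    return fuzzy_ratio(expected, extracted) >= threshold


# OCR commonly confuses similar glyphs, such as "l" and "1".
print(field_matches("Bradley", "Brad1ey", 0.85))
print(field_matches("Bradley", "Smith", 0.85))
```

A ratio threshold tolerates a single misread character in a name while still rejecting text that is clearly different.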

Risk Contexts

For us, tough security doesn’t mean a cookie-cutter approach to compliance. With our latest update, we can build custom risk contexts which reflect our clients’ diverse needs. To accomplish this, we developed a hyper-parameter module, controlled by a global context variable, which can easily change fuzzy match thresholds, select text targets, permute through different pre-processing routines, and much more.

As an example, let’s say a client wants a well-structured image upload template which will control image orientation, lighting, and noise, but the client suspects some users will accidentally upload thumbnail images. The client’s business is highly sensitive to risk. With our new updates, we can provide a structured upload experience for this client’s users. We can also define appropriate matching thresholds, and focus pre-processing on image size correction, with a small amount of de-noising. Let’s call this risk context CLASS 1.

if CLASS == 1:
    parameters = {
        # Pre-processing routines for this context
        "parameter_settings": [
            {'resize': 2, 'blur': False, 'factor': 0, 'lumen': False, 'thres': 5},
            {'resize': 3, 'blur': True, 'factor': 5, 'lumen': False, 'thres': 5},
            {'resize': 4, 'blur': True, 'factor': 5, 'lumen': False, 'thres': 5},
        ],
        # Rotation settings, in degrees
        "degrees_settings": [0],
        # Minimum fuzzy match ratios
        "fuzzy_threshold": {'names': .85, 'other': .75},
    }

Now imagine another client wants their users to have an unstructured image upload, such as a cell phone picture. These images may require heavy pre-processing and re-orientation, so we can define a risk context with pre-processing filters that resize, illuminate, and de-noise their images at different thresholds. Fortunately, this client operates a savings app which allows users to deposit money into a retirement account, but has no debit functionality. This greatly decreases the potential for financial risk. The client still wants to be confident about user names, but is less concerned if a user no longer lives at the address listed on their driver’s license. Let’s call this risk context CLASS 2.

if CLASS == 2:
    parameters = {
        # Pre-processing routines for this context
        "parameter_settings": [
            {'resize': 2, 'blur': False, 'factor': 0, 'lumen': True, 'thres': 0},
            {'resize': 2, 'blur': False, 'factor': 0, 'lumen': False, 'thres': 0},
            {'resize': 3, 'blur': True, 'factor': 5, 'lumen': True, 'thres': 5},
            {'resize': 3, 'blur': True, 'factor': 5, 'lumen': False, 'thres': 10},
            {'resize': 4, 'blur': True, 'factor': 10, 'lumen': True, 'thres': 5},
            {'resize': 4, 'blur': True, 'factor': 10, 'lumen': False, 'thres': 15},
        ],
        # Rotation settings, in degrees
        "degrees_settings": [0, 45, 45, 45, 45, 45, 45],
        # Minimum fuzzy match ratios
        "fuzzy_threshold": {'names': .85, 'other': .60},
    }

You can see that by tuning hyper-parameters, we can easily create two very different processing routines appropriate for different use cases. In the future, our research will be focused on applying machine learning techniques to make this customization automatic.
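As a sketch of how a single context variable could select a whole routine, the two contexts can live in one lookup table instead of separate `if` branches. The values are copied from the examples above (CLASS 2 is trimmed for brevity); `RISK_CONTEXTS`, `get_parameters`, `threshold_for`, and the field names are hypothetical, and the real module may be organized differently.

```python
RISK_CONTEXTS = {
    1: {
        "parameter_settings": [
            {"resize": 2, "blur": False, "factor": 0, "lumen": False, "thres": 5},
            {"resize": 3, "blur": True, "factor": 5, "lumen": False, "thres": 5},
            {"resize": 4, "blur": True, "factor": 5, "lumen": False, "thres": 5},
        ],
        "degrees_settings": [0],
        "fuzzy_threshold": {"names": 0.85, "other": 0.75},
    },
    2: {
        # Trimmed to two of the six routines for brevity.
        "parameter_settings": [
            {"resize": 2, "blur": False, "factor": 0, "lumen": True, "thres": 0},
            {"resize": 4, "blur": True, "factor": 10, "lumen": False, "thres": 15},
        ],
        "degrees_settings": [0, 45, 45, 45, 45, 45, 45],
        "fuzzy_threshold": {"names": 0.85, "other": 0.60},
    },
}


def get_parameters(risk_class):
    """Look up the settings for a risk context, failing loudly if unknown."""
    try:
        return RISK_CONTEXTS[risk_class]
    except KeyError:
        raise ValueError(f"unknown risk context: {risk_class}") from None


def threshold_for(field, risk_class):
    """Pick the fuzzy threshold for a field; name fields use the stricter key."""
    thresholds = get_parameters(risk_class)["fuzzy_threshold"]
    key = "names" if field in ("first_name", "last_name") else "other"
    return thresholds[key]


print(threshold_for("address", 1), threshold_for("address", 2))
```

With this layout, adding a third context means adding one more table entry rather than another branch.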

More Useful Feedback

When images aren’t accepted, it is nice to know why. That is why we now give our clients feedback on every job they invoke. Issues such as low image luminosity or overly compressed files get reported as notifications, but do not halt processing. This lets our clients determine the best follow-up steps for their users. We saw an example of this with the slightly washed-out image above.

Our New Notification Codes:

         N001: could not find first name

         N002: could not find last name

         N003: could not find dob

         N004: could not find address

         N100: image is too small

         N200: image is too dark

         N210: image is too light

         N300: ID is too small within image
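The sketch below shows how image-quality checks can collect notification codes without halting processing. The codes and messages come from the list above; the function name and every threshold (`min_side`, `dark_below`, `light_above`) are hypothetical values chosen for illustration.

```python
NOTIFICATIONS = {
    "N100": "image is too small",
    "N200": "image is too dark",
    "N210": "image is too light",
}


def image_notifications(width, height, mean_brightness,
                        min_side=300, dark_below=60, light_above=200):
    """Return notification codes for an image; never raises on poor quality.

    mean_brightness is the average grayscale value on a 0-255 scale.
    All thresholds are hypothetical defaults.
    """
    notes = []
    if min(width, height) < min_side:
        notes.append("N100")
    if mean_brightness < dark_below:
        notes.append("N200")
    elif mean_brightness > light_above:
        notes.append("N210")
    return notes


# A small, washed-out thumbnail produces two notifications.
for code in image_notifications(120, 80, 230):
    print(code, NOTIFICATIONS[code])
```

Returning a list rather than raising an error lets verification continue, so a usable but imperfect image can still be matched.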

Likewise, the output we give to clients should be readable and relevant. We updated our output to meet these goals. 

Our New Output:

{
    'matches': {
        'dob': 'match',
        'first_name': 'match',
        'last_name': 'match',
        'address': 'match',
        'notes': ['N100'],
    }
}
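On the client side, output in this shape could be reduced to a simple decision plus follow-up notes. This sketch assumes any status other than 'match' counts as a failure, which the post does not specify; `summarize` is a hypothetical helper.

```python
result = {
    "matches": {
        "dob": "match",
        "first_name": "match",
        "last_name": "match",
        "address": "match",
        "notes": ["N100"],
    }
}


def summarize(result):
    """Split the output into a verdict, failed fields, and notification codes."""
    fields = dict(result["matches"])  # copy so the input is not modified
    notes = fields.pop("notes", [])
    # Assumption: any status other than "match" is treated as a failure.
    failed = sorted(name for name, status in fields.items() if status != "match")
    return {"verified": not failed, "failed_fields": failed, "notes": notes}


print(summarize(result))
```

Here the ID verifies, and the N100 note still tells the client to ask for a larger image next time.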

How will these updates affect your platform?

Image verification is a hard problem, and we’re using AI to solve it. With these updates, you should expect to see an increase in the accuracy of image validation. Although accuracy will vary depending on the risk profile of a platform, we’re seeing about 95% or higher accuracy in photo ID verification for most customers, up from ~92%. Platforms can further improve these numbers by giving users more details on image requirements, as well as simple user experience tools that encourage better photo uploads. A good example is online banking apps that help users align checks within a rectangular frame for improved images.