Building identity document verification with Jonathan Chang, Product Manager

The Microblink team has a bold vision to bring the benefits of AI to every person on earth. For a decade, we’ve been developing and delivering diverse products that digitize documents, automate processes, and eliminate data entry and transformation. While you are (hopefully!) familiar with our identity document scanning tech, we’re thrilled to introduce the latest and greatest component of our BlinkID suite: Verify. We sat down with Product Manager Jonathan Chang to discuss the challenges of identity document fraud, Microblink’s approach to solving the problem, and what this means for businesses.

I’ve spent my career in product management, building digital products to solve tough problems that plague end customers – with a special emphasis on utilizing AI/ML to tackle these challenges. Microblink drew me in with a fascinating problem space that matched my own curiosity and desire to bring ML solutions to the world, and with a mission that resonated with me: combating the prevalence of identity fraud in an increasingly digital world.

Why is identity document fraud such a compelling problem to solve?

I think most people today recognize that identity fraud is a widespread problem. The Federal Trade Commission (FTC) in the U.S. reported that consumers lost almost $6 billion to fraud in 2021 – a 70% increase from the previous year. The agency received nearly 3 million fraud reports that year, with imposter scams once again the most commonly reported category.

What’s maybe less known, however, is that over 90% of observed identity fraud involves counterfeit and fraudulent documents. To me, this was a pretty staggering statistic, highlighting just how important identity document verification can be in mitigating the larger problem.

Tell us more about our Verify offering.

BlinkID Verify enables businesses to assess the validity of a presented identity document (ID). Companies that need to verify the identity of their users can leverage our solution to do the heavy lifting of identifying valid IDs and letting them through, while simultaneously flagging IDs that may be fraudulent as risky or for review (or for immediate suspension, depending on the nature of the business). This can help with Know Your Customer (KYC) compliance, customer onboarding, employment authorization, and more.

How should businesses be thinking about identity document fraud and detecting fraud?

There are three high-level buckets when it comes to approaching fraud detection: non-forensic, forensic human-in-the-loop (HITL), and forensic automated. Non-forensic detection relies on external verification, such as querying third-party databases to validate authenticity, and on data checks, such as matching printed data against the data in barcodes or comparing date logic. Forensic detection, on the other hand, relies on visual examination of documents for signs of tampering or anomalies. This can be done by human document experts who manually review documents (the HITL approach), by an automated approach that uses rules and ML models to examine documents and detect abnormalities (such as photoshopped IDs or tampering with the photo on the ID), or by a combination of the two. The benefits of a human-in-the-loop process are the precise analysis, knowledge, and experience of human examiners; however, human examiners are often costly, examination is usually time-consuming, and the approach is challenging to scale.

A forensic automated approach offers speed at a lower price point as its major value proposition. With an excellent forensic automated solution, you can unlock efficiencies, reduce costs, and achieve a similar or greater ability to detect fraud. These three buckets can also be used synergistically to produce an ideal outcome for the business, balancing costs, speed, assurance, and overall end user experience.
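
As a rough illustration of the non-forensic data checks described above (matching printed data against barcode data and comparing date logic), here is a minimal Python sketch. It is not BlinkID Verify's implementation: the function names, field names, and issue labels are hypothetical, and it assumes the printed and barcode fields have already been extracted into plain dictionaries and dates.

```python
from datetime import date


def _normalize(value) -> str:
    """Uppercase and drop all whitespace so formatting differences are ignored."""
    return "".join(str(value).upper().split())


def cross_check(visual: dict, barcode: dict) -> list:
    """Return the fields present in both sources whose values disagree.

    `visual` and `barcode` are hypothetical dictionaries of already-extracted
    fields; the keys are illustrative, not a real BlinkID schema.
    """
    shared = visual.keys() & barcode.keys()
    return sorted(
        key for key in shared
        if _normalize(visual[key]) != _normalize(barcode[key])
    )


def date_logic_issues(birth: date, issued: date, expires: date, today: date) -> list:
    """Return labels for date combinations that cannot occur on a valid ID."""
    issues = []
    if issued < birth:
        issues.append("issued_before_birth")
    if expires <= issued:
        issues.append("expires_not_after_issue")
    if issued > today:
        issues.append("issued_in_future")
    if expires < today:
        issues.append("expired")
    return issues


# Illustrative usage with made-up data.
printed = {"first_name": "Jane", "last_name": "Doe", "date_of_birth": "1990-01-31"}
encoded = {"first_name": "JANE", "last_name": "DOE ", "date_of_birth": "1990-01-13"}
mismatches = cross_check(printed, encoded)  # ["date_of_birth"]

issues = date_logic_issues(
    birth=date(1990, 1, 31),
    issued=date(2020, 5, 1),
    expires=date(2019, 5, 1),
    today=date(2023, 6, 11),
)  # ["expires_not_after_issue", "expired"]
```

Returning a list of named issues rather than a single pass/fail flag is a design choice in this sketch: it leaves room for a business to decide which mismatches mean "risky", which mean "send for review", and which can be ignored.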

Which identity fraud detection techniques does Verify utilize?

BlinkID Verify as it exists today is an automated solution that combines non-forensic and forensic techniques (along with liveness detection, which is an additional bucket that arises when transacting digitally) to perform inspections of a diverse set of identity documents. Roughly half (somewhere between 40% and 60%) of document-based identity fraud can be stopped by running data checks. These non-forensic checks rely on BlinkID’s extraction technology to validate information that’s visually present on an ID or contained in a barcode and/or an MRZ. Depending on the document class and data check performed, our tech correctly identifies genuine and fake IDs up to 99% of the time.
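
One publicly documented example of a data check on an MRZ is the check-digit scheme from ICAO Doc 9303: each character of a field is mapped to a number (digits keep their value, A to Z map to 10 to 35, and the filler `<` counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum modulo 10 must equal the printed check digit. The sketch below assumes the field and its check digit have already been read from the MRZ; the function names are hypothetical, and this is not necessarily how BlinkID Verify performs the check.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value under ICAO Doc 9303."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"character not allowed in an MRZ: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Compute the check digit for an MRZ field using weights 7, 3, 1."""
    weights = (7, 3, 1)
    total = sum(
        mrz_char_value(ch) * weights[i % 3]
        for i, ch in enumerate(field)
    )
    return total % 10


def field_is_consistent(field: str, printed_check_digit: str) -> bool:
    """Return True when the printed check digit matches the computed one."""
    if len(printed_check_digit) != 1 or printed_check_digit not in "0123456789":
        return False
    return mrz_check_digit(field) == int(printed_check_digit)


# Illustrative usage with sample values.
field_is_consistent("L898902C3", "6")  # True: document number matches its digit
field_is_consistent("740812", "2")     # True: date of birth in YYMMDD form
field_is_consistent("740813", "2")     # False: one altered digit breaks the check
```

A check digit only shows that a field is internally consistent. A forger who knows the scheme can recompute it, which is one reason data checks are combined with the forensic and liveness checks described in this answer.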

In addition, our visual forensic checks and anomaly detection capabilities use ML models that have been trained on 450,000+ real images (i.e., both authentic and fraudulent IDs). These images of identity documents, which we’ve collected over the years, are used to analyze fraud trends and prevalence and help us build fraud checks. We rely heavily on our expert, in-house data annotation team and document experts to make sense of the data.

Lastly, passive liveness detection uses ML to determine if the identity document is physically present in front of the camera through methods such as screen detection and photocopy detection, in a way that doesn’t burden end users with extra steps and required actions.

How did you go about the early days of product development?

On the Microblink Product & Engineering side, we tend to nerd out on identity documents and ID fraud. We work closely with our client partners on their user onboarding and authentication workflows, and we knew that identity document fraud was a problem – and a difficult challenge to solve – based on conversations with our clients. They wanted a solution that worked, at scale, to verify genuine identity documents and flag potentially fraudulent IDs, and they wanted it done in a robust, data-driven, mostly automated way. 

After further validating it in the market and with customers, we worked to understand the scope of identity document fraud, including what kind of fraudulent documents exist, where they exist, and who is impacted by identity document fraud. As I mentioned above, it’s a compelling problem set that brought me to this team. 

Understanding what “identity document fraud” actually means and entails took up the bulk of our initial work. There was a lot of cross-functional research into the kinds of fraud vectors that exist and which are most and least commonly observed or used (e.g. photo forgery, showing an identity document on a screen, using sample templates to create fake documents). We leveraged both our internal and external document experts, analyzing our database of both real and fraudulent identity documents. As mentioned above, this helped us to identify trends and prioritize which fraud vectors to solve based on the prevalence of each vector, the impact that solving it has on identifying document fraud, and the amount of effort it takes to build a solution.

From there, we built an “alpha” version of our verification product and tested it alongside our product discovery group, which is composed of existing BlinkID customers who expressed interest in finding a document verification solution. For me as a Product Manager, and for the broader team, that conversation and feedback with our clients has been critical. 

Products are never done, especially one involving ever-changing documents and fraud “attacks”, so we’re continuously testing and iterating to improve our approach and the performance of our core fraud checks. I’m super proud of the team and the progress we’ve made, and it’s only the beginning.

What were some of the early product challenges you faced? What are you most excited about today for BlinkID Verify?

Some of the challenges we faced in the early days are some of the same challenges we face today, which is what makes this work so compelling.

Data is arguably the most difficult aspect of developing a good ML solution – both acquiring it and annotating it. We continue to rely on our data annotation team to identify fraudulent documents and fraud vectors for hundreds of thousands of documents, and we’re constantly exploring new ways to acquire additional data for our models to train on. As with all ML solutions, the more our product is used and the more data our models ingest, the more it learns and improves. We’ve also worked hard to understand identity document fraud in general, building an internal bench of document and fraud experts and upskilling with workshops from document experts.

Lastly, selecting the “right” fraud vectors to go after initially, and staying focused on performance metrics while weighing value against effort, has been crucial. Getting real user feedback on our solution via the product discovery group remains instrumental here, and we’re grateful for the ongoing collaboration. Fraudsters and bad actors will always invent new ways to circumvent detection, but with BlinkID Verify we have the opportunity to help our customers bridge that gap and focus on investing in their core business. Learning about that and hearing about the business impact firsthand has made this work extra rewarding.

How can an organization get their hands on BlinkID Verify to test this tech?

We’ve been testing the technology with some of our existing customers and prospective customers and are continuing to keep our doors open to interested parties. Definitely reach out to our team or connect with me on LinkedIn for more information!

June 11, 2023
