Deep learning approach for image recognition

May 9, 2019

Though probably the buzzword of the century, AI is still a vague concept in many minds. In an attempt to unravel its ambiguity, here’s a gentle introduction featuring two ways businesses adopt and develop AI, and how we chose to do it.

Outside the Westworld realm, AI comes in many forms, has quite a long history, and is more present in your everyday life than you might imagine.

We connect with friends on social media, get an Uber to work, discover new shows on Netflix, and get restaurant recommendations in Maps – all of these are digital extensions of things we have always done. Only now, in large part thanks to AI, the way we interact with those services is faster, easier, and smarter.

Machine learning, computer vision, natural language processing, neural networks, and recommendation engines – research engineers have been developing those for several decades now.

So why does everyone seem to be jumping on the AI bandwagon just now?

In 2012, AI research made a big breakthrough in the shape of deep neural networks. They are far more powerful than earlier learning methods and deliver the best results on numerous real-life problems. Their development is backed by advances in computing power, which made it practical to train models with a huge number of parameters through a complex, data-heavy learning process.

Deep neural networks enable powerful solutions for object detection, image recognition, and segmentation. For companies, they provide an opportunity to build a superior user experience, increase efficiency, and delight their customers.

How are AI products built?

When it comes to building products and services based on AI, companies commonly go with one of two approaches. They either use plug-and-play APIs and machine learning models or develop their own specialized solutions.

Off-the-shelf APIs are the first choice for many businesses because they’re very affordable and easy to use. The leading computing providers make them easily available with the goal of attracting more developers to use their platforms. That’s why Amazon Rekognition, Google Vision, or Microsoft Cognitive Services APIs are continuously becoming more advanced and powerful.

However, these tools are not flawless. They often perform poorly on specific AI problems, and businesses that rely on them miss the opportunity to deliver superior quality and user experience.

Many APIs can recognize text in images, but they are almost certainly not tuned to extract specific types and formats of data, that is, to make sense of the recognized text. For example, they can’t pick out the name and surname on someone’s ID or recognize a particular kind of math problem. At Microblink, we recognized that gap and went a step further to provide significantly better, specialized computer vision software.
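
To make the gap between recognizing text and making sense of it more concrete, here is a minimal Python sketch. It assumes a generic text-recognition API has already returned plain lines of text from an ID. The sample lines, the label patterns, and the extract_name_fields helper are all hypothetical illustrations; this is not Microblink's method or any vendor's actual API.

```python
import re

# Hypothetical raw output from a generic text-recognition API: plain lines,
# with no indication of which line holds which field.
ocr_lines = [
    "REPUBLIC OF EXAMPLE",
    "IDENTITY CARD",
    "Surname: DOE",
    "Name: JANE",
    "Date of birth: 01.02.1990",
]

# Assumed label patterns; real documents vary by country, language, and layout.
FIELD_PATTERNS = {
    "surname": re.compile(r"^\s*surname\s*[:\-]\s*(.+)$", re.IGNORECASE),
    "name": re.compile(r"^\s*(?:given\s+)?names?\s*[:\-]\s*(.+)$", re.IGNORECASE),
}


def extract_name_fields(lines):
    """Map raw OCR lines to structured fields using the label patterns above."""
    fields = {}
    for line in lines:
        for key, pattern in FIELD_PATTERNS.items():
            match = pattern.match(line)
            # Keep only the first match per field.
            if match and key not in fields:
                fields[key] = match.group(1).strip()
    return fields


print(extract_name_fields(ocr_lines))  # {'surname': 'DOE', 'name': 'JANE'}
print(extract_name_fields(["JANE DOE"]))  # {} (no labels, nothing to anchor on)
```

Hand-written rules like these only work while the labels are present, correctly recognized, and in a known language. Once the layout changes or a label is missing, the text is still recognized but the fields are lost, which is the kind of problem a specialized solution has to solve.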

The other approach companies take is building proprietary solutions for a specialized application in a particular product or service. That approach is also becoming easier to start with. The main reason is the growing number of publicly available datasets for training machine learning models. Some may work even for specific ML problems (e.g., the MS COCO dataset for object detection, segmentation, and captioning).

Besides easily accessible datasets, there are also more open-source tools that simplify the learning process. PyTorch and TensorFlow 2.0, the most popular open-source machine learning libraries for research and production, are significantly more user-friendly than they were, let’s say, two years ago.

Sorry to spoil the party again, but there’s a catch in this approach as well. The theory is one thing, but in real-life problems, it can be hard to get the desired results. There are two common issues one should watch out for when building their own AI solutions:

  1. The discrepancy between the dataset that was used to train neural network models and the actual data that the models are applied to. Particularly, models trained on publicly available datasets are rarely suitable for highly specialized products.
  2. The problem of applicability in terms of resource, memory, and time consumption, given that the final solution needs to satisfy efficiency requirements that often don’t exist in the research phase of AI development.
    (In simpler terms, end users have little understanding and even less patience for challenges in ML-powered services. The service always has to work, and it has to work fast.)
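
The second issue can be checked early with a simple latency budget. The sketch below is only an illustration under stated assumptions: the predict callable stands in for any model, the 100 ms budget and the sample latencies are made-up numbers rather than measurements, and the 95th percentile is one common choice of metric, not a requirement from this post.

```python
import math
import time


def percentile(values, q):
    """Return the q-th percentile (0-100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_latencies_ms(predict, inputs):
    """Time each call to `predict` and return the latencies in milliseconds."""
    latencies = []
    for item in inputs:
        start = time.perf_counter()
        predict(item)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def meets_budget(latencies_ms, budget_ms, q=95):
    """True if the q-th percentile latency is within the budget."""
    return percentile(latencies_ms, q) <= budget_ms


# Illustrative numbers only, not real measurements.
sample = [40, 42, 45, 47, 50, 52, 55, 60, 90, 200]
print(percentile(sample, 50))     # 50
print(percentile(sample, 95))     # 200
print(meets_budget(sample, 100))  # False: the slowest requests break the budget
```

Looking at a high percentile rather than the average matters here: the median request in this sample looks fast, yet the slow tail is what an impatient end user will remember.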

Taking the best of both worlds

In the past six years, we’ve become well aware of the benefits and challenges of both approaches. We made AI the core of our technology and turned to developing deep neural networks several years ago to target the following problems:

  • Text recognition and extraction on images of ID documents, credit cards, retail receipts, and even math problems
  • Detection and classification of different document types
  • Image analysis and various other tasks.

We’re dedicated to tackling these computer vision challenges so that companies can deliver superior digital experiences, such as:

  • Effortless bill payments (by scanning payment slips – because who even goes to the bank anymore?)
  • Easy expense tracking (by scanning retail receipts instead of entering each purchased item manually)
  • Remote user onboarding (by scanning identity documents or credit cards)
  • Mastering math with ease (by using an app to scan, solve, and learn math concepts)

We use an integrated approach that combines the best of the two perspectives mentioned above, and our success confirms that we’re on the right track. Our technology has a global reach, saving time and making everyday tasks easier for over 100 million end users around the world. Photomath, today a separate company from Microblink, is the #1 app to learn math, has over 120 million downloads, and has recently been integrated into Snapchat. BlinkReceipt, our product for scanning retail receipts, has so far been used to scan over 180 million retail receipts. BlinkCard, our latest product, is the fastest and most accurate scanning software for various types of credit and bank cards.

If this has sparked your curiosity, great! We’ve sprinkled some more wisdom in another post: Our 5-Step Guide to AI Development.
