Real-life users aren’t perfect

From what we've learned over the past couple of years working on document extraction and text recognition, reading data from IDs can be challenging. If you have a high-quality picture taken at the right angle, with good lighting and a plain background, all is well! In that case, the OCR can do its part without any difficulties.

Unfortunately, in reality, that is often not the case, and we can't really expect it to be.

The REAL challenge

This is where the real challenge for OCR lies: there are so many different kinds of imperfection to consider. What if the document is at an angle, or the user is not holding the phone parallel to the document (so it is captured in perspective) and you don't get a rectangular image? What if a user has shaky hands or is scanning a document on a bus ride to work?

Users want technology to work for them and not the other way around. It is unrealistic to expect the users to adapt their behavior to technology. That rarely happens; what is more likely to happen is frustration and decreased use of tools that don’t perform well.

[Image: Real challenge]

To provide the greatest possible value and experience for the end user, it's imperative to make the technology work in imperfect real-life situations. With this goal in mind, and to tackle one of these challenges, we have developed our own proprietary real-time auto capture technology.

The scanning process

Now, let’s get technical! Here is how the real-time auto capture process works in several steps:

[Image: OCR process]

Keep in mind that this is quite an abstract and simplified overview of the process. It’s important to note that the whole process is done locally on the device (no server-side processing at all!).

While it’s been a lot of work for us to develop, its most important advantage lies on the side of the end user: the entire process takes half a second and consists of a single phone-pointing gesture. Users don't even have to take a photo; they just point the camera at a document and the recognition is done automatically.
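To make the "just point the camera" idea a bit more concrete, here is a minimal sketch of one common way to trigger capture automatically. It is an illustration under our own assumptions, not the actual SDK logic: we assume recognition runs on every camera frame and that a result is accepted once several consecutive frames agree. The names `auto_capture` and `required_agreement` are hypothetical.

```python
from typing import Iterable, Optional


def auto_capture(results: Iterable[Optional[str]],
                 required_agreement: int = 3) -> Optional[str]:
    """Return the first reading seen on `required_agreement` consecutive frames.

    `results` stands in for per-frame recognition output (hypothetical):
    a string when something was read, None when the frame was unusable.
    """
    last: Optional[str] = None
    streak = 0
    for reading in results:
        if reading is not None and reading == last:
            streak += 1          # same reading as the previous frame
        else:
            last = reading       # reading changed (or frame failed): restart
            streak = 1 if reading is not None else 0
        if streak >= required_agreement:
            return last          # stable enough, stop scanning
    return None                  # stream ended without a stable reading


# Simulated stream: one unusable frame, one misread, then three stable frames.
stream = [None, "AB1", "A81", "AB1", "AB1", "AB1"]
captured = auto_capture(stream)
```

Requiring agreement across frames is one way to stay robust to a single blurry or misread frame without asking the user to do anything.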

Behind the curtain, our technology processes multiple frames and picks the best one in just a fraction of a second.
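How might "picking the best frame" look in code? The sketch below is not our proprietary method. It swaps in a well-known stand-in, scoring each frame by the variance of a simple Laplacian filter (a common blur measure) and keeping the highest-scoring one. Frames are assumed to be small grayscale images given as lists of rows, and `sharpness` and `pick_best_frame` are hypothetical names.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale image: rows of pixel intensities


def sharpness(frame: Frame) -> float:
    """Variance of a 4-neighbour Laplacian; higher means more edge detail."""
    height, width = len(frame), len(frame[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            laplacian = (frame[y - 1][x] + frame[y + 1][x]
                         + frame[y][x - 1] + frame[y][x + 1]
                         - 4 * frame[y][x])
            responses.append(laplacian)
    if not responses:            # image too small to have interior pixels
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def pick_best_frame(frames: List[Frame]) -> int:
    """Return the index of the sharpest frame in a short burst."""
    if not frames:
        raise ValueError("need at least one frame")
    return max(range(len(frames)), key=lambda i: sharpness(frames[i]))


# Synthetic burst: a flat frame, a low-contrast frame and a crisp checkerboard.
crisp = [[255.0 if (x + y) % 2 == 0 else 0.0 for x in range(6)] for y in range(6)]
soft = [[(value + 127.5) / 2 for value in row] for row in crisp]
flat = [[127.5] * 6 for _ in range(6)]
best = pick_best_frame([flat, soft, crisp])
```

A production pipeline would likely also weigh glare, framing and whether the whole document is visible; sharpness alone is just the simplest signal to illustrate.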

Check out how it works live in action!
DISCLAIMER: The demo video isn't the best-case scenario show-off. It really works that fast! ;)

Small differences that matter

The time difference between the solution we described above and one that requires the user to take a photo of the document (which is sent to a remote server and then returned with results) is about 2-3 seconds, provided that the document photo is of decent quality.

You might not think of that as a big difference, but in terms of UX, it’s a huge deal. We strongly believe it’s important to recognize these little things and put as much effort into them as possible. Our ultimate goal is to provide mobile OCR that works smoothly and is fully adapted to end users. We also believe that is the reason why our SDKs are better than most other solutions you’ll find on the market.

Many potential use-cases

Besides real-time auto capture, we’ve got more tricks up our sleeve! BlinkOCR can read personal ID documents, payment slips, barcodes, retail receipts, payment data (such as IBANs), and even free-form text.
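Structured payment data is a good example of where a scanned result can be sanity-checked after recognition. The sketch below is not part of BlinkOCR. It only shows the standard IBAN mod-97 check, which catches any single misread digit. The function name `iban_checksum_ok` is hypothetical, and country-specific lengths and formats are deliberately not checked.

```python
def iban_checksum_ok(raw: str) -> bool:
    """Check the standard IBAN mod-97 checksum on a scanned string."""
    iban = "".join(raw.split()).upper()       # drop spaces, normalise case
    # 34 is the maximum IBAN length; per-country lengths are not checked here.
    if not 5 <= len(iban) <= 34:
        return False
    if not (iban.isascii() and iban.isalnum()):
        return False
    # Two-letter country code followed by two check digits.
    if not (iban[:2].isalpha() and iban[2:4].isdigit()):
        return False
    rearranged = iban[4:] + iban[:4]          # move the first four chars to the end
    # Digits stay as they are; letters become numbers: A -> 10, ..., Z -> 35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


good_scan = iban_checksum_ok("GB82 WEST 1234 5698 7654 32")   # widely used sample IBAN
bad_scan = iban_checksum_ok("GB82 WEST 1234 5698 7654 33")    # last digit misread
```

A failed check is a cheap signal to keep scanning further frames instead of returning a wrong number to the user.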

If you’re feeling inspired to improve your business with advanced and user-friendly OCR solutions, feel free to explore all our products and use-cases and contact us below if you have any questions!