Keeping up with trends in OCR: ICDAR 2017

Our research team has come a long way in developing a proprietary, custom-made machine learning system for mobile OCR. That is in large part due to the expertise of our researchers, who continuously keep up with trends in text recognition and explore new ways to optimize our neural network architectures. One such opportunity was this year’s International Conference on Document Analysis and Recognition (ICDAR) in Kyoto, Japan. Our lead research engineer and CTO recently attended the conference and gained great insight into the latest advances in text, document, and graphics recognition and analysis. They were especially interested in areas where deep learning plays an important role. Here is a short review of their experience.


Great workshops

The conference was split into two parts: 4 days of workshops and 3 days of lectures and poster sessions. Workshops are considered optional when attending the conference, while the main program is shaped by lectures. However, after the last day of ICDAR, it was quite clear to us that the workshops were extremely valuable. They offered a lot more face-to-face interaction with researchers, discussions after each lecture, and hands-on topics that are especially interesting for our industry-specific solutions.


Problem complexity and data-driven research

One particular area where deep learning seems to have made significant progress is scene text recognition. The most advanced neural network architectures at the conference were applied to problems in this area. In our opinion, there are two main reasons for this. First, we believe that creativity and innovation in problem solving are driven by the complexity of the problem, and scene text recognition is a very complex problem. In our own experience, the complexity of problems such as performing OCR on handwritten math expressions on device in real time was something that really pushed us forward in our research. The second reason is closely related to the emergence of deep learning as a standard tool in computer vision. To utilize the power of these methods, large amounts of annotated data are needed. There were problems at ICDAR harder than scene text recognition, but there simply wasn’t enough data to build effective solutions or draw conclusions.

Lack of optimization

To our surprise, the one thing the conference was missing was a focus on optimization. Optimization makes research methods applicable to real-world products. To us it seemed like there was a general misconception that deep learning solutions aren’t fast enough for on-device processing. Our proposal to the organizers was to dedicate at least one workshop to this important area, and to take optimization into account when comparing models in competitions.

We would like to point out that optimization is an important part of developing solutions and that deep neural networks can run on a device in real time. Our first ML-based OCR model went into production in September 2016. The model was trained entirely from data to perform OCR on handwritten math expressions in the Photomath app. Afterwards, optimizing the runtime of our OCR models allowed us to offer real-time ID scanning in BlinkID and receipt scanning in BlinkReceipt. We’re planning to continue developing mobile OCR for many other use cases in the future.

All in all, it was a great event and we’re looking forward to ICDAR 2019. In the meantime, our research team is packing their bags again and heading to another conference. See you at NIPS 2017!

