【Toshiba】 ExpressReader 70J OCR Unit

The ExpressReader 70J was a Japanese-language document reader that combined high-speed reading and high recognition accuracy with a compact form factor and a low price. The unit automatically analyzed the layout of a document before character recognition, so no layout specification was necessary. It employed omni-font reading, so it was not limited to particular fonts. The ExpressReader 70J also automatically corrected misrecognized characters using a knowledge process based on Japanese vocabulary and grammar.
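
The automatic correction idea can be illustrated with a small sketch. It assumes that the recognizer returns a ranked candidate list for each character position and that a word list is available. The names `correct_with_lexicon` and `LEXICON` and the sample data are hypothetical, and a plain word lookup stands in for the actual knowledge process, which is not specified in detail here.

```python
from itertools import product

# Hypothetical word list standing in for Japanese word knowledge.
LEXICON = {"東京", "京都", "文字", "認識"}

def correct_with_lexicon(candidates, lexicon=LEXICON):
    """Choose a character sequence for one word-sized span.

    candidates: one ranked candidate list per character position,
                best guess first.
    Returns the lexicon word with the lowest total rank, or the
    top-ranked characters unchanged if no combination is a known word.
    """
    best_word, best_cost = None, None
    # Pair every candidate with its rank (0 = the recognizer's first choice).
    ranked = [list(enumerate(c)) for c in candidates]
    for combo in product(*ranked):
        word = "".join(ch for _, ch in combo)
        cost = sum(rank for rank, _ in combo)
        if word in lexicon and (best_cost is None or cost < best_cost):
            best_word, best_cost = word, cost
    if best_word is None:
        return "".join(c[0] for c in candidates)
    return best_word

# The first-choice string is not a word; a lower-ranked candidate repairs it.
print(correct_with_lexicon([["束", "東"], ["京", "凉"]]))
```

Enumerating every combination is only practical for short spans; it is used here to keep the sketch small.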

One problem in reading printed kanji was the processing load imposed by the large number of character types. For the ExpressReader 70J, this problem was overcome by developing large categorization technology and dedicated IC chips. The large categorization technology narrowed down the recognition candidates in two stages, and the multiple similarity method was used for the final identification. For layout parsing, Toshiba developed and employed a logical structure deciphering technology that first divided the scan area into segments and then automatically estimated the order of those segments. For the knowledge process, Toshiba developed and employed a read-error correction method that used morpheme analysis technology based on Japanese grammar.
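
The two-stage narrowing followed by a final similarity decision can be sketched roughly as follows. This is an illustration under stated assumptions, not the unit's actual design: the cheap features used for narrowing (total ink and left-half ink), the candidate counts, the toy four-element patterns, and all names (`prune`, `recognize`, `CLASSES`) are hypothetical. The final score uses the multiple similarity in its commonly cited form, a weighted sum of squared projections onto orthonormal reference vectors, normalized by the squared norm of the input.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def multiple_similarity(x, basis, weights):
    """Multiple similarity of pattern x to one class.

    basis:   orthonormal reference vectors of the class
    weights: non-negative weights, largest first (e.g. eigenvalues)
    """
    norm_sq = dot(x, x)
    if norm_sq == 0:
        return 0.0
    return sum((w / weights[0]) * dot(x, phi) ** 2
               for w, phi in zip(weights, basis)) / norm_sq

def prune(x, classes, key, keep):
    """Keep the `keep` classes whose cheap feature is closest to that of x."""
    fx = key(x)
    return sorted(classes, key=lambda c: abs(key(c["mean"]) - fx))[:keep]

def recognize(x, classes):
    # Stage 1 (hypothetical feature): total ink in the pattern.
    stage1 = prune(x, classes, key=sum, keep=3)
    # Stage 2 (hypothetical feature): ink in the left half of the pattern.
    stage2 = prune(x, stage1, key=lambda v: sum(v[:len(v) // 2]), keep=2)
    # Final identification: highest multiple similarity among the survivors.
    best = max(stage2,
               key=lambda c: multiple_similarity(x, c["basis"], c["weights"]))
    return best["label"]

s = 1 / math.sqrt(2)
# Toy reference data; a real dictionary would hold thousands of classes.
CLASSES = [
    {"label": "A", "mean": [1, 0, 0, 0],
     "basis": [[1, 0, 0, 0], [0, 0, 1, 0]], "weights": [1.0, 0.2]},
    {"label": "B", "mean": [0, 1, 0, 0],
     "basis": [[0, 1, 0, 0]], "weights": [1.0]},
    {"label": "C", "mean": [0, 0, 1, 1],
     "basis": [[0, 0, s, s]], "weights": [1.0]},
    {"label": "D", "mean": [1, 1, 1, 1],
     "basis": [[0.5, 0.5, 0.5, 0.5]], "weights": [1.0]},
]

print(recognize([0.9, 0.1, 0.0, 0.0], CLASSES))
```

The point of the structure is that the expensive similarity is computed only for the few classes that survive the cheap stages, which is how a narrowing step reduces the load of handling many character types.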

The basic specifications of the ExpressReader 70J were as follows:

  • Reading speed: 70 to 100 characters per second
  • Readable characters: approximately 4,000 printed characters (letters, numbers, symbols, katakana, hiragana, and kanji), omni-font
  • Readable forms: up to A4
  • Scanner: flatbed scanner
  • Recognition accuracy: 99.5 percent (with normal-quality documents)
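
To put the speed and accuracy figures in perspective, the following sketch estimates the recognition time and the expected number of misread characters for one page. The page size of 1,400 characters and the function name `page_estimate` are hypothetical, and the estimate assumes that the quoted speed covers recognition only and that errors are independent of one another.

```python
def page_estimate(num_chars, chars_per_sec_range=(70, 100), accuracy=0.995):
    """Estimate recognition time and expected misreads for one page."""
    slow, fast = chars_per_sec_range
    return {
        "seconds_min": num_chars / fast,   # at the fastest quoted speed
        "seconds_max": num_chars / slow,   # at the slowest quoted speed
        "expected_errors": num_chars * (1 - accuracy),
    }

est = page_estimate(1400)  # hypothetical page of 1,400 characters
print(est)
```

For such a page, the figures work out to roughly 14 to 20 seconds of recognition time and about 7 misread characters.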

  
Photo: ExpressReader 70J

The OCR explanations use terminology from the OCR Catalogue Glossary (Version 2), published by the Japan Electronics and Information Technology Industries Association. Please refer to this glossary for the meanings of the terms used.