【Nippon Telegraph and Telephone Public Corporation (NTT Data)】OCR50 Printed Kanji Character Reader

The OCR50 printed kanji character reader recognized documents printed in Japanese at high speed and with high accuracy. This terminal OCR was used to efficiently input source data for constructing databases of Japanese-language documents.

With advances in Japanese information processing in the 1980s, there was strong demand to construct databases from existing documents printed in Japanese. The obstacle was that entering documents containing several thousand kinds of characters into information processors required massive amounts of manpower and time. The OCR50 was designed to meet this need as an economical input method. It achieved inexpensive input of a wide variety of documents through high-speed recognition and highly flexible syntax editing.

The main features of the OCR50 were as follows:
  • Able to read many kinds of characters with high accuracy: The OCR50 read characters between 7 and 18 points (2.5 to 6.3 mm square) and recognized approximately 4,000 kinds of kanji, hiragana, katakana, alphabetic characters, numerals, and symbols.
  • The OCR50 used a character recognition dictionary built from the typefaces commonly used in printed documents. Knowledge processing technology for word identification and context analysis further improved recognition accuracy.
  • Flexible enough to accept direct input of many kinds of printed documents: The OCR50 was able to directly read printed documents ranging from A6 to B4 in size. Reading parameters could be set on the machine to handle paper size, line spacing, text orientation (vertical or horizontal), and other aspects of text layout.
  • Able to create databases quickly: The OCR50 was able to read approximately 1,500 characters per minute, about 30 times faster than an average typist. The recognized text could be output to a floppy disk or magnetic tape to construct the source data for databases.
  • Equipped with functions to edit the recognized text: A keyboard allowed the operator to delete, correct, and insert text in the recognized documents.
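
The size and speed figures in the list above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original specification: the point-to-millimetre factor (0.3514 mm per point, the JIS point) and the helper names are assumptions introduced for illustration, and the typist speed is simply what the quoted "30 times faster" figure implies.

```python
# Illustrative arithmetic only. The figures 7-18 points, 1,500 characters
# per minute, and "about 30 times faster" come from the text above.
# MM_PER_POINT is an assumption (JIS point: 1 pt = 0.3514 mm).
MM_PER_POINT = 0.3514

OCR_CHARS_PER_MINUTE = 1500   # quoted reading speed
SPEEDUP_OVER_TYPIST = 30      # quoted speed ratio


def points_to_mm(points: float) -> float:
    """Convert a type size in points to millimetres (assumed JIS point)."""
    return points * MM_PER_POINT


def minutes_to_enter(char_count: int, chars_per_minute: float) -> float:
    """Time in minutes needed to enter char_count characters at a given rate."""
    return char_count / chars_per_minute


# Typist speed implied by the two quoted figures.
implied_typist_cpm = OCR_CHARS_PER_MINUTE / SPEEDUP_OVER_TYPIST

if __name__ == "__main__":
    print("7 pt  =", round(points_to_mm(7), 1), "mm")
    print("18 pt =", round(points_to_mm(18), 1), "mm")
    print("implied typist speed:", implied_typist_cpm, "characters/minute")
    # Hypothetical 300,000-character document, chosen only as an example.
    print("OCR minutes:   ", minutes_to_enter(300_000, OCR_CHARS_PER_MINUTE))
    print("typist minutes:", minutes_to_enter(300_000, implied_typist_cpm))
```

Under these assumptions the 7 to 18 point range rounds to the 2.5 to 6.3 mm range quoted above, and the implied typing speed is 50 characters per minute. These times cover reading alone and leave out any time spent correcting rejected characters.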
Main specifications of the OCR50 printed kanji character reader

| Parameter | Item | Specification |
|---|---|---|
| Recognition technique | | Structure feature distribution method |
| Read speeds | Character reading | About 1,500 characters per minute |
| | Form processing | About 20 sheets per minute (maximum) |
| Readable characters | Character types | Approx. 4,700 character types: JIS Level 1, standard JIS Level 2, kana characters, alphabets, numerals, and symbols |
| | Sizes | 7 to 18 points (2.5 to 6.3 mm square) |
| | Character spacing | Supports reading of fixed-pitch and variable-pitch characters, vertically and horizontally oriented text, and paragraph structures |
| Forms | Sizes and thicknesses | From 148 x 105 mm to 364 x 257 mm (l x w), ream weight from 45 to 90 kg |
| | Paper quality | OCR paper, fine paper, plain paper |
| | Paper feed | Continuous (automatic) / One-sheet (manual insertion) |
| Form capacities | Hopper | 100 sheets (with ream weight 70 kg) |
| | Stacker | 100 sheets (with ream weight 70 kg) |
| Read correction functions | | Reject processing with candidate character selection |
| Editing functions | | Kana-kanji conversion, separators, kanji search input; word processing functions such as insertion, deletion, and correction |
| Outputs | | Outputs to floppy disk or magnetic tape (JIS-C-6226 kanji encoding); transfers to other devices possible with an RS-232C interface |
| Dimensions and weight | Recognition unit | 61 x 96 x 82 cm (w x d x h), 150 kg |
| | Scanner | 61 x 71 x 86 cm (w x d x h), 130 kg |
| | Controller | 105 x 71 x 66 cm (w x d x h), 150 kg |
| | Console unit | 60 x 80 x 40 cm (w x d x h) |
| | Magnetic tape unit | 52 x 36 x 40 cm (w x d x h), 42 kg |
| | Printer | 61 x 50 x 90 cm (w x d x h), about 65 kg |
| Power supply | | 100±10 V, approximately 2.8 kVA |
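
The specification names JIS-C-6226 as the kanji encoding for output. As a rough illustration of what a two-byte code in that family looks like, the sketch below converts a row-cell (kuten) position into a two-byte code and decodes it with Python's standard `iso2022_jp` codec. This is an assumption-laden sketch: the actual record layout the OCR50 wrote to floppy disk or tape is not described here, the helper names are hypothetical, and Python's codec follows the later JIS X 0208 revision of the standard, which differs from the 1978 JIS C 6226 table in a small number of code points.

```python
# Sketch only: shows the general shape of a JIS C 6226 / JIS X 0208
# two-byte code. It does not reproduce the OCR50's actual output format,
# which the source does not describe.

ESC_TO_KANJI = b"\x1b$B"   # ISO-2022-JP escape: switch to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"   # ISO-2022-JP escape: switch back to ASCII


def kuten_to_jis(row: int, cell: int) -> bytes:
    """Convert a row-cell (kuten) position, each 1..94, to a two-byte code."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in the range 1..94")
    return bytes([0x20 + row, 0x20 + cell])


def decode_jis_pairs(raw: bytes) -> str:
    """Decode bare two-byte codes by wrapping them in ISO-2022-JP escapes."""
    if len(raw) % 2 != 0:
        raise ValueError("expected an even number of bytes")
    return (ESC_TO_KANJI + raw + ESC_TO_ASCII).decode("iso2022_jp")


if __name__ == "__main__":
    first_kanji = kuten_to_jis(16, 1)   # row 16, cell 1: first Level 1 kanji
    hiragana_a = kuten_to_jis(4, 2)     # row 4, cell 2: hiragana "a"
    print(first_kanji.hex(), decode_jis_pairs(first_kanji))
    print(hiragana_a.hex(), decode_jis_pairs(hiragana_a))
```

Wrapping the bare byte pairs in escape sequences is a convenience for reusing the standard codec, not a claim about how the device framed its data.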

  
NTT Data OCR50
※From left: printer, magnetic tape unit (on desk), console unit (on desk), controller (under the console unit), scanner, recognition unit
  

The explanations of OCR equipment use terminology from the OCR Catalogue Glossary (Version 2) published by the Japan Electronics and Information Technology Industries Association. Please refer to that Glossary for the meanings of the terms used.