【Nippon Telegraph and Telephone Public Corporation (NTT Data)】DT-OCR100CN1 / DT-OCR100CNKX1

The DT-OCR100 series, which could read handwritten katakana characters and numerals, was the first terminal OCR introduced in Japanese national-scale data communication systems. The units were installed for customer-service operations at the Social Insurance Agency in 1980 and the Ministry of Labor in 1981.

Expectations for mechanizing data entry had been rising since the late 1970s. The Social Insurance Agency had started investigating OCRs that handled handwritten numerals for entering receipt data and similar information, and the Ministry of Labor had begun examining OCRs that handled handwritten katakana characters and numerals for entering names, addresses, and other data. The DT-OCR100 series addressed these needs with the capability to read ordinary handwritten katakana characters and numerals (written carefully inside writing boxes). The Nippon Telegraph and Telephone Public Corporation had developed a new recognition method, the Feature Concentration Method, to realize an inexpensive OCR reader with sufficient recognition accuracy and speed for practical applications.

Around 400 DT-OCR100CN1 units with numeral recognition were installed in the Social Insurance Agency’s data communication system, and about 1,200 DT-OCR100CNKX1 units with katakana and numeral recognition were installed in the Ministry of Labor’s system, making it the largest OCR application system in the world.

The main technologies implemented in the DT-OCR100 series were as follows:
  • Achieved sufficient recognition accuracy for practical applications with the newly developed Feature Concentration Method: NTT Data adopted the method because it covered the recognition of printed characters, handwritten katakana characters, and numerals, and because it could be implemented with simple character pattern processing.
  • The method worked as follows. First, for each point in the character pattern, it determined whether a character stroke existed in the up, down, left, and right directions. Then, for each point, this information from the four points lying beyond a character stroke in the up, down, left, and right directions was concentrated. Finally, recognition was carried out by logically comparing the concentrated information (the feature) with a dictionary prepared for each category.
  • An automatic structuring method for the recognition dictionary was developed. With this method, the recognition dictionary was built smoothly and achieved accuracy suitable for practical applications.
  • Developed a method to obtain an appropriate binary character pattern: To scan character patterns without dropouts, blurs, or smears in the character strokes, scanning optics were developed to obtain consistent multi-value (light and shade) patterns from forms, and a preprocessing method was devised that produced an appropriate stroke width by controlling the threshold used to binarize the patterns.
  • Obtaining multi-value patterns also helped speed up the second recognition attempt for rejected characters.
  • Ensured sufficient scanning speed for a terminal OCR unit: The scanner assembly could scan each line at up to about 60 centimeters per second using a rotary drum spool. The recognition unit used a pipeline process that took advantage of the characteristics of the Feature Concentration Method to reach reading speeds of about 100 handwritten characters per second or 256 printed characters per second.
  • Developed a clear dropout color that suited most writing implements: A wavelength band-pass optical system was realized that treated the high-contrast blue and other common ballpoint pen colors as “black” and the low-contrast orange outline boxes as “white.”
  • Developed a method to control reading of mixed character types: When reading a mix of numerals, katakana characters, and symbols, there were many character pairs with similar shapes (such as ク versus 7, or 5 versus S). To distinguish such pairs, the decision focused on the local character shapes or on the syntactic relation to the preceding and following characters. This process ensured high recognition accuracy.
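The three steps of the Feature Concentration Method described in the list above can be illustrated with a minimal Python sketch. It is only one possible reading of that description, not the implementation used in the DT-OCR100: the 4-bit direction encoding, the 20-bit packed feature, the `Rule` type, the two-entry `DICTIONARY`, and the 5 x 5 toy patterns are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

Grid = List[List[int]]  # 1 = stroke pixel, 0 = background

# Assumed encoding: one bit per direction, in the order up, down, left, right.
DIRECTIONS: List[Tuple[int, int]] = [(-1, 0), (1, 0), (0, -1), (0, 1)]
UP, DOWN, LEFT, RIGHT = 1, 2, 4, 8


def in_bounds(grid: Grid, r: int, c: int) -> bool:
    return 0 <= r < len(grid) and 0 <= c < len(grid[0])


def direction_code(grid: Grid, r: int, c: int) -> int:
    """Step 1: 4-bit code saying in which directions a stroke exists."""
    code = 0
    for bit, (dr, dc) in enumerate(DIRECTIONS):
        rr, cc = r + dr, c + dc
        while in_bounds(grid, rr, cc):
            if grid[rr][cc]:
                code |= 1 << bit
                break
            rr, cc = rr + dr, cc + dc
    return code


def point_beyond_stroke(grid: Grid, r: int, c: int,
                        dr: int, dc: int) -> Optional[Tuple[int, int]]:
    """First background point on the far side of the nearest stroke, if any."""
    rr, cc = r + dr, c + dc
    seen_stroke = False
    while in_bounds(grid, rr, cc):
        if grid[rr][cc]:
            seen_stroke = True
        elif seen_stroke:
            return rr, cc
        rr, cc = rr + dr, cc + dc
    return None


def concentrated_feature(grid: Grid, r: int, c: int) -> int:
    """Step 2: pack the point's own code (bits 0-3) with the codes of the
    points beyond a stroke in the up, down, left, and right directions
    (bits 4-7, 8-11, 12-15, 16-19). A missing point contributes 0."""
    feature = direction_code(grid, r, c)
    for i, (dr, dc) in enumerate(DIRECTIONS):
        beyond = point_beyond_stroke(grid, r, c, dr, dc)
        if beyond is not None:
            feature |= direction_code(grid, *beyond) << (4 * (i + 1))
    return feature


class Rule(NamedTuple):
    """Hypothetical dictionary rule: some point must (or must not) satisfy
    (feature & mask) == value."""
    mask: int
    value: int
    present: bool


# Hypothetical two-category dictionary, hand-written for the toy patterns.
DICTIONARY: Dict[str, List[Rule]] = {
    # "0": some point has strokes on all four sides.
    "0": [Rule(0xF, 0xF, True)],
    # "1": no enclosed point, and some point sees a stroke only to its right
    # whose far side sees a stroke only to its left.
    "1": [Rule(0xF, 0xF, False),
          Rule(0xF | (0xF << 16), RIGHT | (LEFT << 16), True)],
}


def classify(grid: Grid) -> List[str]:
    """Step 3: logical comparison of the features with each dictionary entry."""
    features = [concentrated_feature(grid, r, c)
                for r in range(len(grid)) for c in range(len(grid[0]))]
    return [category for category, rules in DICTIONARY.items()
            if all(any((f & rule.mask) == rule.value for f in features)
                   == rule.present for rule in rules)]


RING = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
BAR = [[0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0]]

print(classify(RING), classify(BAR))
```

In this toy setup the ring matches only the "0" entry, because it encloses a point with strokes on all four sides, and the bar matches only the "1" entry. Because every comparison is a mask-and-compare on small bit fields, a scheme of this kind needs only simple pattern processing, which is in line with the reason for adopting the method given above.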
Main specifications of the DT-OCR100 series

| Unit | Parameter | DT-OCR100CN1 | DT-OCR100CNKX1 |
|---|---|---|---|
| Reading unit | Recognition method | Feature Concentration Method | Same as left |
| Reading unit | Character reading speed | Handwritten: 100 characters per second (6 mm pitch); printed: 256 characters per second (10 characters per inch pitch) | Same as left |
| Reading unit | Form processing speed | 55 sheets per minute max. (reading one row) | Same as left |
| Reading unit | Readable handwritten characters (careful writing in boxes) | Numerals and symbols (mixed reading supported) | Katakana characters, numerals, symbols (mixed reading supported) |
| Reading unit | Readable printed characters | Numerals and symbols (OCR-B size I, pseudo 7 x 9 dot characters) | Same as left |
| Reading unit | Form size and thickness | A6 to A4; ream weight 70 to 110 kg | Same as left |
| Reading unit | Paper quality | OCR paper, fine paper (brand specification) | Same as left |
| Reading unit | Scanning method | Line direction: rotary drum; line feed: moving reading head | Same as left |
| Reading unit | Paper feed | Continuous (automatic) / one-sheet (manual feed) | Same as left |
| Reading unit | Hopper capacity | 300 sheets (elevator hopper) | Same as left |
| Reading unit | Stacker capacity | Accept: 300 sheets; reject: 100 sheets | Same as left |
| Reading unit | Form format control | Formats stored on a floppy disk. Formats added or updated by reading a format definition sheet. Capacity for more than 100 formats. | Same as left |
| Reading unit | Sequence number printing | Six-digit numbering (printed on reverse side of the form, print location selectable) | Same as left |
| Reading unit | Read modes | Batch correction mode: ejects rejected forms; immediate correction mode: corrects rejected characters as they occur; verification correction mode: corrects errors by comparing the recognition results after reading with the form | Same as left |
| Reading unit | Dimensions, weight | 100 x 60 x 96 cm (w x d x h), less than 200 kg | Same as left |
| Reading unit | Power supply | 100±10 V (single phase), approximately 1.6 kVA | Same as left |
| Controller | Display monitor | PDP display (60 characters x 3 lines) | 9-inch CRT (40 characters x 16 lines) |
| Controller | Reject display | 64 x 32 dot pattern | 96 x 40 dot pattern |
| Controller | Keyboard | Number keypad, kana characters (in Japanese syllabary order), symbols | Number keypad, symbols |
| Controller | Dimensions, weight | 60 x 52.5 x 75 cm (w x d x h), 60 kg | 72 x 84 x 113 cm (w x d x h), 135 kg |
| Controller | Power supply | 100±10 V, approximately 300 VA | 100±10 V, approximately 900 VA |
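
As a rough consistency check on the reading speeds listed in the specifications, the character rates and pitches can be converted to a linear speed along the line. The short Python sketch below assumes that characters are laid out contiguously at the stated pitch; the function name `line_speed_cm_per_s` is illustrative and does not come from the original documentation.

```python
MM_PER_INCH = 25.4


def line_speed_cm_per_s(chars_per_second: float, pitch_mm: float) -> float:
    """Linear speed along a line needed to read characters at a given pitch."""
    return chars_per_second * pitch_mm / 10.0  # mm/s converted to cm/s


# Handwritten: 100 characters per second at 6 mm pitch.
handwritten = line_speed_cm_per_s(100, 6.0)

# Printed: 256 characters per second at 10 characters per inch (2.54 mm pitch).
printed = line_speed_cm_per_s(256, MM_PER_INCH / 10)

print(f"handwritten: {handwritten:.1f} cm/s, printed: {printed:.1f} cm/s")
```

Under this assumption the handwritten figure works out to 60 cm/s and the printed figure to about 65 cm/s, both close to the scanner speed of about 60 centimeters per second per line stated for the rotary drum.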

 
[Images: NTT Data DT-OCR100CN1; NTT Data DT-OCR100CNKX1]

The OCR descriptions use terminology from the OCR Catalogue Glossary (Version 2) published by the Japan Electronics and Information Technology Industries Association. Please refer to the Glossary for the meanings of the terms used.