The DT-OCR100 series, which were able to read handwritten katakana characters and numerals, were the first terminal OCRs introduced in Japanese national-scale data communication systems. The OCRs were installed for customer-service operations at the Social Insurance Agency in 1980 and the Ministry of Labor in 1981.
Expectations for mechanizing data entry had been rising since the late 1970s. The Social Insurance Agency had started investigating OCRs that handled handwritten numerals for entering receipt data and similar items, and the Ministry of Labor had begun examining OCRs that handled handwritten katakana characters and numerals for entering names, addresses, and other data. The DT-OCR100 series addressed these needs with the capability to read ordinary handwritten katakana characters and numerals (written carefully inside writing boxes). The Nippon Telegraph and Telephone Public Corporation had developed a new recognition method, the Feature Concentration Method, to realize an inexpensive OCR reader with sufficient recognition accuracy and speed for practical applications.
Around 400 DT-OCR100CN1 units with numeral recognition were installed in the Social Insurance Agency’s data communication system, and about 1,200 DT-OCR100CNKX1 units with katakana and numeral recognition were installed in the Ministry of Labor’s system, making it the largest OCR application system in the world.
The main technologies implemented in the DT-OCR100 series were as follows:
- Achieved sufficient recognition accuracy for practical applications with the newly developed Feature Concentration Method: NTT Data adopted the method because it covered the recognition of printed characters, handwritten katakana characters, and numerals, and because it could be implemented with simple character pattern processing.
- The method worked as follows. First, for each point in the character pattern, it determined whether a character stroke existed in the up, down, left, and right directions. Then, for each point, the information of the four points lying beyond a character stroke in the up, down, left, or right directions was concentrated. Finally, recognition was carried out by logically comparing the concentrated information (the feature) with a dictionary prepared for each category.
- An automatic structuring method for the recognition dictionary was developed. With this method, the recognition dictionary could be built smoothly, achieving accuracies suitable for practical applications.
- Development of a method to get an appropriate binary character pattern: To scan character patterns with no dropouts, blurs, or smears in the character strokes, scanning optics were developed to obtain consistent multi-value (light and shade) patterns from forms, and a preprocessing method was devised that gave the character strokes an appropriate width by controlling the threshold used to obtain binary patterns.
- Obtaining multi-value patterns helped speed up second-pass recognition of rejected characters.
- Ensured sufficient scanning speed for a terminal OCR unit: The scanner assembly could scan at about 60 centimeters per second per line using a rotary drum spool. The recognition unit used a pipeline process that took advantage of the characteristics of the Feature Concentration Method to reach reading speeds of about 100 handwritten characters per second or 256 printed characters per second.
- Developed a clear dropout color that suited most writing implements: A wavelength band-pass optical system was realized that treated the high-contrast blue and other common ballpoint pen colors as “black” and the low-contrast orange outline boxes as “white.”
- Developed a method to control reading of mixed character types: When reading a mix of numerals, katakana characters, and symbols, there were many character pairs with similar shapes (such as ク versus 7, or 5 versus S). To distinguish such pairs, a decision was carried out that focused on the local character shapes, or on the syntactic relation to the previous and following characters. This process ensured high recognition accuracy.
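
The first two steps of the Feature Concentration Method described in the list above can be illustrated with a small Python sketch. It is not the DT-OCR100 implementation: the bit encoding, the function names (`directional_codes`, `point_beyond_stroke`, `concentrate`), the choice of the first background point past a stroke as the "point beyond", and the 5 x 5 test patterns are all assumptions made for illustration, and the dictionary comparison is left out.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # 1 = stroke pixel, 0 = background

# Assumed encoding: one bit per direction.
UP, DOWN, LEFT, RIGHT = 1, 2, 4, 8
STEPS = {UP: (-1, 0), DOWN: (1, 0), LEFT: (0, -1), RIGHT: (0, 1)}


def directional_codes(pattern: Grid) -> Grid:
    """Step 1: for every point, set a bit for each direction in which
    a stroke pixel exists somewhere along that ray."""
    h, w = len(pattern), len(pattern[0])
    codes = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for bit, (dr, dc) in STEPS.items():
                rr, cc = r + dr, c + dc
                while 0 <= rr < h and 0 <= cc < w:
                    if pattern[rr][cc]:
                        codes[r][c] |= bit
                        break
                    rr, cc = rr + dr, cc + dc
    return codes


def point_beyond_stroke(pattern: Grid, r: int, c: int,
                        dr: int, dc: int) -> Optional[Tuple[int, int]]:
    """Walk from (r, c) in one direction, cross the first stroke met,
    and return the first background point after it (assumed reading
    of 'the point existing beyond a character stroke')."""
    h, w = len(pattern), len(pattern[0])
    rr, cc = r + dr, c + dc
    seen_stroke = False
    while 0 <= rr < h and 0 <= cc < w:
        if pattern[rr][cc]:
            seen_stroke = True
        elif seen_stroke:
            return rr, cc
        rr, cc = rr + dr, cc + dc
    return None


def concentrate(pattern: Grid) -> List[List[Tuple[int, ...]]]:
    """Step 2: for every point, gather its own code plus the codes of the
    four points beyond a stroke (up, down, left, right); 0 if none."""
    codes = directional_codes(pattern)
    h, w = len(pattern), len(pattern[0])
    features = []
    for r in range(h):
        row = []
        for c in range(w):
            gathered = [codes[r][c]]
            for dr, dc in STEPS.values():
                p = point_beyond_stroke(pattern, r, c, dr, dc)
                gathered.append(codes[p[0]][p[1]] if p else 0)
            row.append(tuple(gathered))
        features.append(row)
    return features


# Hypothetical 5 x 5 test patterns: a closed "O" and an open "C".
O_PATTERN = [[0, 0, 0, 0, 0],
             [0, 1, 1, 1, 0],
             [0, 1, 0, 1, 0],
             [0, 1, 1, 1, 0],
             [0, 0, 0, 0, 0]]
C_PATTERN = [[0, 0, 0, 0, 0],
             [0, 1, 1, 1, 0],
             [0, 1, 0, 0, 0],
             [0, 1, 1, 1, 0],
             [0, 0, 0, 0, 0]]

print(concentrate(O_PATTERN)[2][2])  # feature at the center of the "O"
print(concentrate(C_PATTERN)[2][2])  # feature at the center of the "C"
```

In this toy example the center of the closed "O" sees strokes in all four directions, while the center of the "C" sees none to its right. A logical comparison against per-category dictionaries could then test for exactly this kind of difference.
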
Component | Parameter | DT-OCR100CN1 | DT-OCR100CNKX1
---|---|---|---
Reading unit | Recognition method | Feature Concentration Method | Same as DT-OCR100CN1
Reading unit | Reading speed: character reading | Handwritten: 100 characters per second (6 mm pitch); printed: 256 characters per second (10 characters per inch pitch) | Same as DT-OCR100CN1
Reading unit | Reading speed: form processing | 55 sheets per minute max. (reading one row) | Same as DT-OCR100CN1
Reading unit | Readable character types: handwritten (careful writing in boxes) | Numerals and symbols (mixed reading supported) | Katakana characters, numerals, symbols (mixed reading supported)
Reading unit | Readable character types: printed | Numerals and symbols (OCR-B size I, pseudo 7 x 9 dot characters) | Same as DT-OCR100CN1
Reading unit | Forms: size and thickness | From A6 to A4, ream weight from 70 to 110 kg | Same as DT-OCR100CN1
Reading unit | Forms: paper quality | OCR paper, fine paper (brand specification) | Same as DT-OCR100CN1
Reading unit | Scanning method | Line direction: rotary drum; line feed: reading head moving type | Same as DT-OCR100CN1
Reading unit | Paper feed | Continuous (automatic) / one-sheet (manual feed) | Same as DT-OCR100CN1
Reading unit | Form capacities: hopper | 300 sheets (elevator hopper) | Same as DT-OCR100CN1
Reading unit | Form capacities: stackers | Accept: 300 sheets; reject: 100 sheets | Same as DT-OCR100CN1
Reading unit | Form format control | Format stored on a floppy disk drive. Add / update format by reading a format definition sheet. Capacity for more than 100 formats. | Same as DT-OCR100CN1
Reading unit | Sequence number printing | Six-digit numbering (printed on reverse side of the form, print location selectable) | Same as DT-OCR100CN1
Reading unit | Read modes | Batch correction mode: ejects rejected forms; immediate correction mode: corrects rejected characters as they occur; verification correction mode: corrects errors by comparing the recognition results after reading with the form | Same as DT-OCR100CN1
Reading unit | Dimensions, weight | 100 x 60 x 96 cm (w x d x h), less than 200 kg | Same as DT-OCR100CN1
Reading unit | Power supply | 100±10 V (single phase), approximately 1.6 kVA | Same as DT-OCR100CN1
Controller | Display: monitor | PDP display (60 characters x 3 lines) | 9-inch CRT (40 characters x 16 lines)
Controller | Display: reject display | 64 x 32 dot pattern | 96 x 40 dot pattern
Controller | Keyboard | Number keypad, kana characters (in Japanese syllabary order), symbols | Number keypad, symbols
Controller | Dimensions, weight | 60 x 52.5 x 75 cm (w x d x h), 60 kg | 72 x 84 x 113 cm (w x d x h), 135 kg
Controller | Power supply | 100±10 V, approximately 300 VA | 100±10 V, approximately 900 VA
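
As a rough consistency check, the character-reading speeds and pitches in the table can be converted into a speed along the line and compared with the scanning speed of about 60 centimeters per second quoted in the list of technologies. The sketch below assumes that line speed is simply characters per second multiplied by character pitch; this relation and the function name `line_speed_cm_per_s` are illustrative assumptions, not something the source states.

```python
MM_PER_INCH = 25.4


def line_speed_cm_per_s(chars_per_second: float, pitch_mm: float) -> float:
    """Distance covered along the line per second, in centimeters,
    assuming one character pitch is traversed per character read."""
    return chars_per_second * pitch_mm / 10.0


# Handwritten: 100 characters per second at 6 mm pitch.
handwritten = line_speed_cm_per_s(100, 6.0)
# Printed: 256 characters per second at 10 characters per inch.
printed = line_speed_cm_per_s(256, MM_PER_INCH / 10)

print(f"handwritten: {handwritten:.1f} cm/s")
print(f"printed:     {printed:.1f} cm/s")
```

Under that assumption the handwritten figure comes out at 60.0 cm/s and the printed figure at about 65.0 cm/s, both close to the quoted scanning speed.
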