Limitations Of Optical Character Recognition Technology

Optical Character Recognition, abbreviated as OCR, is the technology through which physical written documents can be converted into digital data. Although OCR systems have been used successfully for decades, they still depend on human intervention and are not fully automated. Even so, they have made data entry less arduous and time-consuming than before.

Suppose a bank wants to process account opening forms from 5,000 people. The bank employs an in-house data entry operator who types at a good speed of about 100 words per minute. Each form can be assumed to contain at least 100 words, including the name, permanent address and so on. That makes 500,000 words in total, which at 100 wpm takes 5,000 minutes, or roughly 83 hours, to type. With a standard office day of 8 hours, entering all 5,000 forms would take more than 10 working days.
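The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `manual_entry_estimate` is made up for illustration, and it assumes the operator types at a constant speed with no breaks or corrections.

```python
def manual_entry_estimate(forms, words_per_form, words_per_minute, hours_per_day):
    """Return (total_hours, working_days) needed to type all forms by hand."""
    total_words = forms * words_per_form            # 5000 * 100 = 500,000 words
    total_hours = total_words / words_per_minute / 60
    working_days = total_hours / hours_per_day
    return total_hours, working_days


# Figures taken from the bank example: 5000 forms, 100 words each,
# 100 words per minute, 8 working hours per day.
hours, days = manual_entry_estimate(5000, 100, 100, 8)
print(f"{hours:.1f} hours, {days:.1f} working days")
# prints "83.3 hours, 10.4 working days"
```

Changing any of the four inputs shows how quickly manual entry grows with volume. Doubling the number of forms, for example, doubles the number of working days.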

In such a situation, it is more convenient to use OCR, which takes only as long as it takes to scan the documents. You can easily scan 5,000 documents within one day. After scanning, the data is entered digitally without any typing, and you get a soft copy of all the information. However, OCR scanning technology also has some inherent limitations, so it is necessary to manually proofread the output it produces. These limitations arise when the software is unable to recognize what is written in the document.

This can happen in the following cases:

The handwriting is illegible or too stylized

The software is set up to recognize only the most common styles of handwriting. As a result, it often confuses letters, misreads characters or cannot make sense of words. In such cases the OCR system highlights all the content it is unsure of. The highlighted portions are then checked and entered manually.
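The highlighting step can be pictured as a simple confidence filter. The sketch below is an assumption about how such a step might work, not the method of any particular OCR product. The function `split_for_review`, the 0.80 threshold and the sample words with their scores are all invented for illustration.

```python
def split_for_review(words, threshold=0.80):
    """Separate recognized words into accepted ones and ones flagged for a human.

    `words` is a list of (text, confidence) pairs, where confidence is a
    hypothetical score between 0 and 1 reported by the OCR engine.
    """
    accepted, flagged = [], []
    for text, confidence in words:
        if confidence >= threshold:
            accepted.append(text)
        else:
            flagged.append(text)   # would be highlighted for manual entry
    return accepted, flagged


# Made-up sample output for a handwritten name and address.
sample = [("John", 0.97), ("Sm1th", 0.42), ("Main", 0.91), ("Stre3t", 0.55)]
accepted, flagged = split_for_review(sample)
print(flagged)
# prints "['Sm1th', 'Stre3t']"
```

A higher threshold sends more words to the human reviewer and lets fewer errors through. A lower threshold saves review time at the cost of accuracy.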

The document is poorly printed or formatted

OCR relies on a good scan to convert the data into digital format. A scan cannot be accurate if the document is faded and there is low contrast between the background and the letters. Also, if the document lacks a suitable layout, the data will be entered in the wrong sequence. A printed document such as a form should therefore be appropriately formatted and printed in good quality.
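One way to catch faded pages before they reach the OCR stage is to measure how far apart the dark and light pixels are. The following is a minimal sketch under stated assumptions: the scan is available as a non-empty list of grayscale values from 0 (black) to 255 (white), and the names `contrast_range` and `is_low_contrast`, the percentile cut-offs and the `min_range` of 80 are illustrative choices, not established standards.

```python
def contrast_range(pixels, low_pct=0.05, high_pct=0.95):
    """Difference between a bright and a dark percentile of grayscale values.

    Percentiles are used instead of min/max so that a few stray specks
    do not hide an otherwise faded page.
    """
    ordered = sorted(pixels)
    last = len(ordered) - 1
    dark = ordered[int(low_pct * last)]
    bright = ordered[int(high_pct * last)]
    return bright - dark


def is_low_contrast(pixels, min_range=80):
    """Flag a scan whose ink and paper are too close in brightness."""
    return contrast_range(pixels) < min_range


# Made-up samples: 30% "ink" pixels and 70% "paper" pixels each.
crisp = [20] * 30 + [235] * 70     # dark ink on bright paper
faded = [150] * 30 + [200] * 70    # washed-out ink on greyish paper
print(is_low_contrast(crisp), is_low_contrast(faded))
# prints "False True"
```

Pages flagged this way could be rescanned at different settings or routed straight to manual entry, rather than being passed to the OCR software.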

The document is a carbon copy or a low-quality photocopy

OCR software generally performs better with original documents. With carbon copies or photocopies, the chances of errors increase. OCR scanning is also problematic if the paper is too thin or crumpled.

So if you want to outsource OCR services, make sure that the back-office service provider has a strategy to overcome these limitations of OCR and promises zero-error back-office solutions.