If the system can't distinguish between accurate and inaccurate data, then only a small percentage of that data will flow straight through, leaving the remainder of the work to be reviewed entirely by hand.
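To make that concrete, the routing decision usually comes down to a confidence threshold: fields the system is sure about go straight through, and everything else is queued for a human. The sketch below is a minimal Python illustration of that idea; the `ExtractedField` structure, the field names, and the 0.95 threshold are assumptions made for the example, not the behavior of any particular product.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical structure for one extracted field and its confidence score.
@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the capture engine

def route_document(fields: List[ExtractedField], threshold: float = 0.95) -> str:
    """Send a document straight through only if every field clears the
    confidence threshold; otherwise queue it for manual review."""
    if fields and all(f.confidence >= threshold for f in fields):
        return "straight_through"
    return "manual_review"

# Example: one low-confidence field pushes the whole document into review.
doc = [
    ExtractedField("invoice_number", "INV-1042", 0.99),
    ExtractedField("total_amount", "1,250.00", 0.97),
    ExtractedField("due_date", "2O24-07-01", 0.62),  # likely an OCR misread
]
print(route_document(doc))  # -> manual_review
```

The better the system is at scoring its own accuracy, the higher you can set that threshold without letting errors slip through, and the more work flows straight through untouched.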
There are many reasons why actual system performance falls short of expectations. While we always want 100% accuracy, the reality is that configuring and setting up an intelligent capture system involves many steps, and each of them has to be done well.
The first step is data preparation, which means understanding the scope of your documents: not only how many documents you'd like to automate, but also the characteristics of those documents:
Is there a high degree of variance in image quality or data layout? (One quick way to check is sketched after these questions.)
Is there anything else that affects how comprehensive your sample set needs to be when you create and configure the system?
Will the system know how to deal with the documents it encounters?
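One practical way to answer these questions is to profile a representative sample set before any configuration work begins. The sketch below assumes the samples are scanned pages stored as image files and uses Pillow to read basic image properties; it simply tallies page sizes and resolutions so you can see how much variance you are dealing with. The folder path is a made-up placeholder, and a real profiling pass would likely also group documents by vendor or layout.

```python
from collections import Counter
from pathlib import Path

from PIL import Image  # Pillow; install with `pip install Pillow`

def profile_samples(sample_dir: str) -> None:
    """Tally basic image properties across a folder of sample scans
    to get a rough sense of variance in the document set."""
    sizes = Counter()
    dpis = Counter()
    for path in Path(sample_dir).glob("*.png"):
        with Image.open(path) as img:
            sizes[img.size] += 1                       # (width, height) in pixels
            dpis[img.info.get("dpi", "unknown")] += 1  # DPI, if the file records it
    print("Page sizes:", sizes.most_common())
    print("Resolutions:", dpis.most_common())

# Hypothetical folder of sample scans gathered during data preparation.
profile_samples("samples/invoices")
```

If the tallies come back tightly clustered, a smaller sample set may be enough; if they are all over the map, you know up front that configuration and testing will take more effort.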
Once you have the wide array of samples that you want to build the system around, you must take the time to configure it. Configuration often calls for technical expertise, and that in turn means a significant investment. The next step is testing and tuning, an iterative process of checking your output and optimizing it. Once you get the system into production, the unfortunate reality is that things change. Documents change. Layouts change. You onboard a new client or customer with a new type of document, for example, and that means going through all of these steps again.
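The testing-and-tuning loop is easier to reason about with a concrete measurement in hand. A minimal version, sketched below, compares the system's output against a hand-labeled ground-truth set and sweeps the confidence threshold to show the trade-off between straight-through rate and error rate. The records and numbers here are illustrative assumptions, not output from a real system.

```python
# Each record pairs the engine's prediction and confidence with the true value.
# The values below are made-up examples for illustration only.
labeled = [
    {"pred": "INV-1042", "conf": 0.99, "truth": "INV-1042"},
    {"pred": "1,250.00", "conf": 0.97, "truth": "1,250.00"},
    {"pred": "2O24-07-01", "conf": 0.62, "truth": "2024-07-01"},
    {"pred": "ACME Corp", "conf": 0.91, "truth": "ACME Corp"},
]

def evaluate(records, threshold):
    """Return (straight-through rate, error rate among auto-accepted fields)."""
    accepted = [r for r in records if r["conf"] >= threshold]
    if not accepted:
        return 0.0, 0.0
    errors = sum(1 for r in accepted if r["pred"] != r["truth"])
    return len(accepted) / len(records), errors / len(accepted)

# Sweep thresholds to see how much automation each level of risk buys you.
for t in (0.6, 0.8, 0.9, 0.95):
    stp, err = evaluate(labeled, t)
    print(f"threshold={t:.2f}  straight-through={stp:.0%}  error rate={err:.0%}")
```

Rerunning a check like this whenever documents, layouts, or clients change is what keeps the tuning loop going after the system is in production.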