A long time ago, in a field far, far away from Documentum, I worked at USDA while I was an undergrad. One of the research projects at the lab was automating the detection of healthy chicken carcasses. This process was (and is) mainly done by human inspectors, who look at color and noticeable markings on the chicken skin to determine whether the chicken was healthy. Since this detection was done by visual inspection, the lab was researching whether we could create a system that would do it automatically, with no human intervention.
I will spare you the gruesome details of collecting data from a chicken slaughterhouse, but in the end, we had data from hundreds of chickens, most of which were healthy and a few of which were deemed not suitable for sale. We took all this data and fed it to a neural network application. For those of you who are not familiar with the terminology, it's a type of artificial intelligence modeling. The ability of the neural network to “learn” and to make correct determinations on whether a chicken was healthy was heavily reliant on the amount of data we fed into the system. The more data we provided, the better we could train the neural network to recognize healthy (vs. sick) chickens. If I remember correctly, the results of the initial testing pointed towards an 85% success rate. This was still below the rate of human detection, but it was a good start.
So how does this relate to Captiva Dispatcher? Dispatcher is designed to intelligently recognize document types, and how it does so is similar to what I worked on at USDA. For Dispatcher to determine a document's type, a customer has to provide a varied sampling of documents to “train” it. Sampling is pretty straightforward if you are trying to identify predefined forms. It becomes a bigger challenge when there is no consistent template or structure that you can provide as a good sample. This is the case for accounts payable: there is no consistent look and feel for invoices from vendor to vendor. This is especially problematic for large companies like Walmart; imagine the number of invoices that Walmart gets from all of its vendors.
Now imagine you are a small or medium-sized business (SMB). You probably have significantly fewer vendor invoices than Walmart, but you still do not have a good sampling of the various permutations of invoices you could get. This is where SaaS can help smaller businesses gain competitive advantages similar to those of larger companies. If Dispatcher could be configured to run as SaaS, you could harness the knowledge (aka form recognition data) from hundreds if not thousands of customers. This data could then be used to train Dispatcher to recognize more and more variants of invoices. Dispatcher's intelligence would get better over time as more variations of invoices are submitted and Dispatcher learns to recognize them.
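To make the idea concrete, here is a toy sketch in Python. It is not how Dispatcher works internally (I am not describing its algorithm or API here); it is just a word-overlap classifier I made up for illustration, and every name and sample in it (`train`, `classify`, `own_samples`, `other_customers_samples`) is hypothetical. It shows a small business whose own samples cannot place a new invoice layout, and a pooled model that can.

```python
from collections import Counter, defaultdict


def train(samples):
    """Build one word-count profile per document type from (text, label) pairs."""
    profiles = defaultdict(Counter)
    for text, label in samples:
        profiles[label].update(text.lower().split())
    return profiles


def classify(profiles, text):
    """Return the label sharing the most distinct words with the text, or None."""
    words = set(text.lower().split())
    scores = {
        label: sum(1 for word in words if word in profile)
        for label, profile in profiles.items()
    }
    best = max(scores, key=scores.get, default=None)
    # No overlapping words at all means the document is unrecognized.
    if best is None or scores[best] == 0:
        return None
    return best


# Hypothetical training samples: what one small business has seen on its own.
own_samples = [
    ("invoice number total due", "invoice"),
    ("purchase order ship to quantity", "purchase_order"),
]

# Hypothetical samples contributed by other customers of a shared service.
other_customers_samples = [
    ("bill amount payable remit to", "invoice"),
    ("statement of account balance forward", "statement"),
]

# A new vendor's invoice that uses wording the small business has never seen.
new_doc = "bill amount payable"

small_model = train(own_samples)
pooled_model = train(own_samples + other_customers_samples)

print(classify(small_model, new_doc))   # unrecognized with only our own samples
print(classify(pooled_model, new_doc))  # recognized once samples are pooled
```

A real recognition engine would look at layout and image features rather than a bag of words, but the effect of pooling is the same: the small business benefits from invoice variants it never had to collect itself.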
The power of many can be used to help the one.