Extracting data from 180,000 pages per day in real time
Dun & Bradstreet is a leading global data provider that is adept at using cutting-edge technology to source timely and accurate data. Shareholder information held at Companies House for all five million UK companies is a critical resource for its customers.
What was the problem?
Companies House documents are often poor-quality scans and PDFs, which makes data extraction expensive and time-consuming. Dun & Bradstreet's previous supplier manually keyed in the data from annual returns and confirmation statements. Automating this process was challenging because of the variations in document layout.
As Patrick Walsh, Dun & Bradstreet’s Public Registry Data Leader, explains, “the main challenge for automation was dealing with exceptions. Data collection from any source will follow general rules [...]. However, it's managing exceptional cases efficiently that defines a successful project.”
What was our solution?
Evolution AI’s proprietary OCR was built in collaboration with the University of Southampton specifically to handle poorly scanned financial documents. Accurately reading poor-quality scans made it possible to process these challenging documents automatically.
Handwriting and complex financial tables presented additional hurdles. Evolution AI's flexible software extracts data accurately from even these recalcitrant elements. AI extraction accuracy reached 99.8%, and any remaining exceptions were handled by a human operator via the software’s QA workflow. This end-to-end approach reduced Dun & Bradstreet's data entry team from 40 staff to just two human operators.
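The idea of sending only exceptional cases to a human operator can be illustrated with a minimal sketch. The case study does not describe how exceptions are detected, so everything here is an assumption made for illustration: the per-field confidence score, the `REVIEW_THRESHOLD` value, and the `ExtractedField` and `route` names are hypothetical and are not Evolution AI's actual implementation or API.

```python
from dataclasses import dataclass

# Hypothetical cutoff chosen for illustration; not taken from the case study.
REVIEW_THRESHOLD = 0.95


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed model score between 0 and 1


def route(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into auto-accepted ones and ones queued for human review."""
    accepted, needs_review = [], []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


# Made-up example data: one clean field and one from an illegible scan.
fields = [
    ExtractedField("shareholder_name", "Jane Example", 0.99),
    ExtractedField("shares_held", "1O0", 0.62),
]
accepted, needs_review = route(fields)
print("accepted:", [f.name for f in accepted])
print("needs review:", [f.name for f in needs_review])
```

In a design like this, most fields pass straight through and only the low-confidence remainder reaches the review queue, which is one plausible way a small team of operators could cover a high daily page volume.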
Patrick concluded, “Dun & Bradstreet has high standards, our customers expect nothing less. We strive for 100% accuracy, with minimal latency. Achieving 99.8% accuracy in a short time frame is impressive. I found the team responsive and innovative when dealing with challenges as we worked towards go-live. Evolution AI approached all requests and feedback from a consumer focused lens.”