Going to production

This page describes the process for taking a project into production and running it there.

Before going to production

After configuring your project completely and performing user acceptance testing, there are a couple of things you should ensure before going to production.

While in production: roles and responsibilities

The following responsibilities need to be taken up when running in production. In the initial weeks of production usage, we recommend promoting data, performing detailed pipeline analyses, creating at least the suggested tasks (and a custom task as well, if time allows), and retraining weekly so that you reach higher automation rates quickly. Once you see accuracy plateau, you can switch to monthly reviews.
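The switch from weekly to monthly reviews can be made less subjective with a small check on the accuracy figures you record after each retraining. The sketch below is only an illustration: the function names, the three-week window and the one-percentage-point threshold are assumptions, not Metamaze settings, and the accuracy values are expected to come from your own tracking.

```python
from statistics import mean


def has_plateaued(weekly_accuracy, window=3, min_gain=0.01):
    """Return True when the mean accuracy of the last `window` weeks
    improved by less than `min_gain` over the `window` weeks before."""
    if len(weekly_accuracy) < 2 * window:
        return False  # not enough history to judge yet
    recent = mean(weekly_accuracy[-window:])
    previous = mean(weekly_accuracy[-2 * window:-window])
    return (recent - previous) < min_gain


def review_cadence(weekly_accuracy):
    """Suggest a review rhythm based on the recorded accuracy history."""
    return "monthly" if has_plateaued(weekly_accuracy) else "weekly"
```

Comparing two averaged windows rather than two single weeks keeps one unusually good or bad week from triggering the switch.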

Debugging bad performance in models

If you see bad performance in your models during production usage, here are some tips for finding the cause:

  • Bad scans that cause OCR issues. These issues have to be fixed at the source and cannot be fixed in Metamaze. You will need to instruct the people scanning to deliver clean, correctly oriented, straight, high-resolution scans (at least 150 dpi, preferably 300 dpi).

  • Irrelevant documents. Inform the people scanning or sending the documents that they should only send relevant documents.

  • Production data annotations are low quality and worsen the model. We recommend creating suggested and/or custom tasks on production data before retraining the model.

  • Document content or layout that differs between production and training data can cause the models to underperform. In that case, we recommend adding a sufficient number of recent production documents to the training data. For example, if you trained a date entity only on values from October but go to production in January, recognition may be poor. This is easily solved by adding some production data.
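For the first point in the list, scan resolution can be screened at the source before documents are sent. The sketch below estimates the effective dpi from the pixel dimensions of a scanned page, using the 150 and 300 dpi thresholds mentioned above. It assumes A4 paper, and the function names and the three result labels are hypothetical. Reading the pixel dimensions from the image file is left to whatever tooling you already use.

```python
A4_INCHES = (8.27, 11.69)  # assumption: pages are A4


def effective_dpi(width_px, height_px, page_inches=A4_INCHES):
    """Estimate scan resolution from pixel size and physical page size."""
    # Match the shorter pixel side to the shorter paper side so that
    # portrait and landscape scans give the same result.
    short_px, long_px = sorted((width_px, height_px))
    short_in, long_in = sorted(page_inches)
    return round(min(short_px / short_in, long_px / long_in))


def scan_quality(width_px, height_px, minimum=150, preferred=300):
    """Classify a scan against the minimum and preferred dpi."""
    dpi = effective_dpi(width_px, height_px)
    if dpi < minimum:
        return "reject"
    if dpi < preferred:
        return "acceptable"
    return "good"
```

A check like this only covers resolution. Skewed, rotated or dirty scans still need to be addressed with the people doing the scanning.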

The analytics module can help you find problematic document types, entities or languages.
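The date example above (training on October values, running in January) can also be checked with a small script outside Metamaze before retraining. The sketch below is a minimal illustration that assumes you have exported the annotated date values from training and production as Python `date` objects. The function name `missing_months` is hypothetical.

```python
from datetime import date


def missing_months(training_dates, production_dates):
    """Return the months (1-12) seen in production but absent from training."""
    trained = {d.month for d in training_dates}
    seen = {d.month for d in production_dates}
    return sorted(seen - trained)


training = [date(2023, 10, 3), date(2023, 10, 17)]
production = [date(2024, 1, 5), date(2024, 1, 9), date(2023, 10, 30)]
gaps = missing_months(training, production)  # months to add to training data
```

Any month returned is a candidate for adding recent production documents to the training data.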
