Going to production

This page describes a recommended process for production usage.

Before going to production

After fully configuring your project and performing user acceptance testing, there are a few things you should verify before going to production.

While in production: roles and responsibilities

The following responsibilities need to be taken up when running in production. During the first weeks of production usage, we recommend promoting data, performing detailed pipeline analyses, creating at least suggested tasks (and, if time allows, a custom task as well), and retraining weekly, so that you reach higher automation rates quickly. Once you see accuracy plateauing, you can switch to monthly reviews.

For each responsibility below, the description is followed by who performs it and how often.

Human validation

Human validation is needed for documents that are not fully processed automatically. Depending on your use case, this may be more than expected at the beginning of the project, but the volume typically decreases rapidly after promoting production data to training data and retraining.

During human validation, always check:

  • Whether documents are correctly split and assigned the correct type. If needed, perform page management and resume the pipeline.

  • The issues section (required entities not found or below threshold, aggregated value not defined, parsing problems, ...).

  • Whether all values for an entity are found, not just one. This is not strictly needed to process most documents, but it reduces the load on managers performing QA tasks after production data is promoted to training data.

  • That no false positives are present.

  • The output of business rules.

Adhering to the training-data labelling guidelines during human validation reduces the load on managers when they add training data and perform QA tasks. Several of these checks can also be automated as a pre-check; a sketch follows below.

Who: Operators. When: hourly or daily.
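
Parts of the checklist above can be pre-computed before a document reaches an operator. The sketch below is a hypothetical pre-check over an extraction result; the record layout and field names (required_entities, entities, confidence) are illustrative assumptions, not the Metamaze API.

```python
# Hypothetical pre-check that flags which checklist items need operator attention.
# The document structure below is an illustrative assumption, not the Metamaze API.

CONFIDENCE_THRESHOLD = 0.8  # assumed per-project threshold

def validation_flags(doc: dict) -> list[str]:
    """Return human-readable flags for an extraction result."""
    flags = []
    found = {e["name"] for e in doc["entities"]}
    for name in doc["required_entities"]:
        if name not in found:
            flags.append(f"required entity missing: {name}")
    for e in doc["entities"]:
        if e["confidence"] < CONFIDENCE_THRESHOLD:
            flags.append(f"low confidence ({e['confidence']:.2f}): {e['name']}")
    return flags

doc = {
    "required_entities": ["invoice_number", "total_amount"],
    "entities": [
        {"name": "invoice_number", "value": "INV-001", "confidence": 0.97},
        {"name": "total_amount", "value": "120.50", "confidence": 0.55},
    ],
}
print(validation_flags(doc))  # ['low confidence (0.55): total_amount']
```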

Promotion

A manager should go through the production data daily and promote documents that were not fully automated and would be valuable as training data. Bad scans that caused mis-recognised OCR, as well as irrelevant or exceptional documents, should not be promoted, so that the training data stays clean. Documents that were automated but still contained mistakes, for example false positives, can still be promoted even though they did not require manual intervention.

There is no point in adding documents that were processed 100% correctly to the training data: doing so only slows down training, and the model will not learn anything new from them. A sketch of this triage logic follows below.

Who: Managers. When: daily.
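
As a rough illustration, the promotion rules above can be written down as a filter. The sketch below uses assumed document attributes (fully_automated, had_mistakes, bad_scan, irrelevant, exceptional); it is not part of the Metamaze data model.

```python
# Hypothetical triage of production documents for promotion to training data.
# All attribute names are illustrative assumptions, not the Metamaze data model.

def should_promote(doc: dict) -> bool:
    # Keep training data clean: never promote bad scans or irrelevant/exceptional docs.
    if doc["bad_scan"] or doc["irrelevant"] or doc["exceptional"]:
        return False
    # 100% correct automated documents add nothing new and slow down training.
    if doc["fully_automated"] and not doc["had_mistakes"]:
        return False
    # Promote documents that needed manual validation, and automated documents
    # that still contained mistakes (e.g. false positives).
    return True

production_batch = [
    {"id": 1, "fully_automated": True,  "had_mistakes": False, "bad_scan": False, "irrelevant": False, "exceptional": False},
    {"id": 2, "fully_automated": True,  "had_mistakes": True,  "bad_scan": False, "irrelevant": False, "exceptional": False},
    {"id": 3, "fully_automated": False, "had_mistakes": True,  "bad_scan": True,  "irrelevant": False, "exceptional": False},
]
print([d["id"] for d in production_batch if should_promote(d)])  # [2]
```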

Pipeline analysis

A manager should thoroughly analyse bad predictions to detect patterns among documents that were not fully automated. Based on that analysis, they should define actions to improve the models (a sketch of such an analysis follows below), such as:

  • adding additional training data or performing new QA checks for specific document types, entities, layouts, suppliers, languages, ...

  • reconfiguring business rules

  • asking the source of the documents to correct them (for bad scans, irrelevant documents, ...)

  • updating the annotation guidelines for human validation or training data. Remember to notify all operators/labellers.

Who: Managers. When: daily or weekly.
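
One way to surface such patterns is to aggregate failures along the dimensions mentioned above. Below is a minimal pandas sketch over hypothetical exported per-document results; the columns doc_type, language and fully_automated are assumptions, not a Metamaze export format. The analytics module mentioned at the end of this page offers similar breakdowns.

```python
# Minimal sketch: find which document types / languages fail most often.
# Assumes hypothetical exported per-document results; not a Metamaze API.
import pandas as pd

df = pd.DataFrame({
    "doc_type": ["invoice", "invoice", "receipt", "invoice", "receipt"],
    "language": ["en", "fr", "en", "fr", "en"],
    "fully_automated": [True, False, True, False, False],
})

failure_rate = (
    df.assign(failed=~df["fully_automated"])
      .groupby(["doc_type", "language"])["failed"]
      .mean()
      .sort_values(ascending=False)
)
print(failure_rate)  # highest failure rates first, e.g. invoice/fr at 100%
```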

Training data QA analysis

Review and correct all annotations on documents that were promoted from production data to training data, to make sure they are of perfect quality and contain no mis-annotations.

It is crucial that this is done as diligently and accurately as during the initial training phase. Bad (or no) QA analysis can easily make models drastically worse.

The annotation guidelines should be revised and updated if necessary.

Who: Managers. When: weekly.

Re-training of models

Once the new training data has been fully checked, you can retrain the model in model management. After retraining, check the resulting accuracy; if the results are good, deploy the new model to production (a sketch of this gate follows below).

Re-training is what makes models improve and automation rates increase over time. Typically, the biggest improvements happen during the first weeks of running in production.

Who: Managers. When: weekly.
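
The "check accuracy, then deploy" step amounts to a simple gate. The sketch below is purely illustrative; the function name and accuracy values are assumptions, and in practice you would read the accuracies from model management.

```python
# Illustrative deployment gate: only promote a retrained model when it does
# not regress. Function name and numbers are hypothetical assumptions.

def should_deploy(current_accuracy: float, retrained_accuracy: float,
                  min_gain: float = 0.0) -> bool:
    """Deploy only if the retrained model is at least as good as the current one."""
    return retrained_accuracy >= current_accuracy + min_gain

print(should_deploy(current_accuracy=0.91, retrained_accuracy=0.94))  # True
print(should_deploy(current_accuracy=0.91, retrained_accuracy=0.88))  # False
```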

Debugging bad performance in models

If you see poor performance from your models during production usage, here are a few tips for finding out what is causing it:

  • Bad scans that cause OCR errors. These have to be fixed at the source and cannot be fixed in Metamaze. You will need to instruct the people scanning that you expect clean, correctly oriented, straight, high-resolution scans (at least 150 dpi, preferably 300 dpi); a quick resolution check is sketched after this list.

  • Irrelevant documents. Inform the people scanning or sending the documents that they should only send relevant documents.

  • Production data annotations that are of low quality and worsen the model. We recommend creating suggested and/or custom tasks on production data before retraining the model.

  • Document content or layout that differs in production from the training data can cause the models to underperform. In that case we recommend adding a sufficient amount of recent production documents to the training data. For example, if a date entity was only trained on values from October but you go to production in January, recognition may suffer; this is easily solved by adding some production data.
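
As a first-pass way to catch low-resolution scans before they hit the pipeline, you can inspect the DPI metadata of incoming images. The sketch below uses the Pillow library and an assumed file path; note that DPI metadata can be missing or unreliable, so treat this as a heuristic, not a guarantee of scan quality.

```python
# First-pass scan-quality check with Pillow: flag images whose embedded DPI
# metadata falls below the recommended minimum from the scanning guidelines.
from PIL import Image

MIN_DPI = 150        # hard minimum
PREFERRED_DPI = 300  # preferred resolution

def check_scan_dpi(path: str) -> str:
    with Image.open(path) as img:
        dpi = img.info.get("dpi")  # usually an (x_dpi, y_dpi) tuple, often absent
    if dpi is None:
        return f"{path}: no DPI metadata, inspect manually"
    if min(dpi) < MIN_DPI:
        return f"{path}: {dpi} below the {MIN_DPI} dpi minimum, ask for a rescan"
    if min(dpi) < PREFERRED_DPI:
        return f"{path}: {dpi} acceptable, but {PREFERRED_DPI} dpi is preferred"
    return f"{path}: {dpi} OK"

print(check_scan_dpi("incoming/invoice_0001.png"))  # hypothetical path
```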

The analytics module can help you find problematic document types, entities or languages.
