After fully configuring your project and completing user acceptance testing, there are a few things you should ensure before going to production.
Make sure your labelling guidelines are complete and adapted for production usage
Train your operators to
use the Metamaze human validation module, including exceptions such as adding handwritten or misrecognized text, performing page management, ...
follow the labelling guidelines accurately
know and understand the business process
Have a clearly defined process and responsibilities for following up on human validation, promotion of production data to training data, pipeline analysis, creation of suggested tasks, model training and model deployment. Below is a recommended overview of roles and responsibilities.
The following responsibilities apply when running in production. In the initial weeks of production usage, we recommend promoting data, performing detailed pipeline analysis, creating at least suggested tasks (and, if time allows, a custom task as well) and re-training weekly, so that you reach higher automation rates fast. When you see that accuracy is plateauing, you can switch to monthly reviews.
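As a rough illustration of when to switch from weekly to monthly reviews, a plateau check could look like the sketch below. The 0.5-percentage-point threshold and the three-retraining window are assumptions for illustration, not Metamaze defaults:

```python
PLATEAU_THRESHOLD = 0.005  # assumed: < 0.5 pp gain counts as a plateau
WINDOW = 3                 # assumed: look back over the last 3 retrainings

def is_plateauing(accuracies, window=WINDOW, threshold=PLATEAU_THRESHOLD):
    """Heuristic: accuracy has plateaued when the total gain over the
    last `window` retrainings stays below `threshold`.

    `accuracies` is the model accuracy recorded after each retraining,
    oldest first.
    """
    if len(accuracies) <= window:
        return False  # not enough retrainings yet to judge
    return accuracies[-1] - accuracies[-1 - window] < threshold

# Accuracy after each weekly retraining:
print(is_plateauing([0.72, 0.80, 0.85, 0.88]))                  # still improving
print(is_plateauing([0.72, 0.85, 0.90, 0.902, 0.903, 0.904]))   # flat: switch to monthly
```

The exact threshold and window should be tuned to your use case and document volume.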
Human validation is needed for documents that are not fully processed automatically. Depending on your use case, this might be more than expected at the beginning of the project, but it typically decreases rapidly after promoting production data to training data and re-training.
In human validation, make sure you always check
By adhering to the labelling guidelines for training data while performing human validation, you reduce the load on managers who add training data and perform QA tasks.
Hourly or daily
A manager should go through the production data daily and promote documents that were not fully automated but would be valuable as training data. Bad scans that caused misrecognized OCR, as well as irrelevant or exceptional documents, should not be promoted, so that the training data stays clean. Documents that were automated but still contained mistakes (for example false positives) can still be promoted, even though they did not require manual intervention.
There is no use in adding documents that were processed 100% correctly to the training data: they only slow down training, and the model does not learn anything new from them.
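The promotion rules above can be sketched as a simple filter. The document flags used here (`fully_automated`, `had_mistakes`, `ocr_quality_ok`, `relevant`) are hypothetical names chosen for illustration, not Metamaze API fields:

```python
def should_promote(doc):
    """Decide whether a production document is worth promoting to training data.

    `doc` is a plain dict with hypothetical flags set during review:
      fully_automated - processed without human intervention
      had_mistakes    - contained errors (e.g. false positives)
      ocr_quality_ok  - scan/OCR quality is acceptable
      relevant        - document belongs to this use case
    """
    # Keep the training data clean: skip bad scans and irrelevant documents.
    if not doc["ocr_quality_ok"] or not doc["relevant"]:
        return False
    # 100% correct documents teach the model nothing new.
    if doc["fully_automated"] and not doc["had_mistakes"]:
        return False
    # Not fully automated, or automated but with mistakes: valuable.
    return True

print(should_promote({"fully_automated": False, "had_mistakes": True,
                      "ocr_quality_ok": True, "relevant": True}))   # promote
print(should_promote({"fully_automated": True, "had_mistakes": False,
                      "ocr_quality_ok": True, "relevant": True}))   # skip
```

In practice the manager applies this judgement manually in the validation module; the sketch only makes the decision criteria explicit.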
A manager should thoroughly analyse bad predictions to detect patterns in documents that were not fully automated. Based on that analysis, they should define actions to improve the models, such as:
Daily or weekly
Training data QA analysis
Review and correct all annotations on documents that were promoted from production data to training data, to make sure they are 100% correct and do not contain mis-annotations.
It is crucial that this is done as diligently and accurately as during the initial training phase. Bad (or no) QA analysis can easily make models drastically worse by giving them bad examples.
The annotation guidelines should be revised and updated if necessary.
Re-training of models
After the new training data is completely checked, you can retrain the model in model management. After retraining, check the resulting accuracy. If the results are good, you can deploy it to production.
Re-training is what makes models improve over time and makes sure your automation rates increase over time. Typically, the biggest improvements happen during the first weeks of running in production.
If you see bad performance in your models during production usage, here are a couple of tips to find out what is causing it:
Bad scans that cause OCR issues. These issues have to be fixed at the source and cannot be fixed in Metamaze. You will need to train the people scanning that you expect clean, correctly oriented, straight, high-resolution scans (minimum 150 dpi, preferably 300 dpi).
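The resolution requirement above can be checked before scans are submitted. A minimal sketch in plain Python, assuming the (horizontal, vertical) dpi tuple has already been read from the scan's metadata by whatever image library you use:

```python
MIN_DPI = 150        # hard minimum from the scanning guidelines
PREFERRED_DPI = 300  # recommended resolution

def assess_scan_dpi(dpi):
    """Classify a scan's resolution against the guidelines.

    `dpi` is the (horizontal, vertical) resolution from the scan's
    metadata; the weaker axis is what counts.
    Returns 'reject', 'warn' or 'ok'.
    """
    effective = min(dpi)
    if effective < MIN_DPI:
        return "reject"  # below minimum: ask for a rescan
    if effective < PREFERRED_DPI:
        return "warn"    # usable, but nudge towards 300 dpi
    return "ok"

print(assess_scan_dpi((120, 120)))  # reject
print(assess_scan_dpi((200, 200)))  # warn
print(assess_scan_dpi((300, 300)))  # ok
```

Note that dpi metadata is not always present or reliable; scans without it should be flagged for manual review rather than assumed to be fine.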
Irrelevant documents. Inform the people scanning or sending the documents that they should only send relevant documents.
Production data annotations are low quality and worsen the model. We recommend creating suggested and/or custom tasks on production data before retraining the model.
Document content or layout that differs in production from the training data can cause the models to underperform. In that case we recommend adding a sufficient amount of recent production documents to the training data. For example, if you have only trained a date entity on values from October but go to production in January, recognition of January dates may be poor. This is easily solved by adding some production data.
The analytics module can help you find problematic document types, entities or languages.