Model Predictions

Metamaze has several AI models that are used when processing a document:

  • OCR model - This model reads words and objects, together with their coordinates, from non-text file formats, so that the document can be used in the software and the text can serve as input for the other models. Metamaze does not develop the OCR model itself but relies on OCR providers. You can choose a provider in the project settings.

  • Page management model - This model ensures that a file consisting of several documents can be split into individual documents. There is one model per document type.

  • Document classification model - This model will determine the document type for each document. There is one model per project.

  • Object detection model - This model can recognise objects such as signatures. There is one object detection model per document type.

  • Entity extraction model - This model can extract information from the document. There is one entity model per document type.

Which models will or will not be used in your project depends on the steps you have activated in the project settings.

Apart from the OCR model, all models are dedicated to your project, and you maintain them yourself. This involves:

  • Upload and annotate training data

  • Do quality reviews of the training data

  • Train and deploy models

  • Expand training data with production data

When a model makes a prediction, it always returns a confidence score. For example: entity A has the value 'BBB' in the document, with a confidence score of X%.

You can use the manual intervention module to add a human validation check. If manual intervention is enabled, a document is flagged for review in the following cases:

  • The document type is recognised by the model with a confidence score lower than the set threshold for that document type.

  • An entity type is recognised by the model with a confidence score lower than the set threshold for that entity type.

  • An entity cannot be validated or converted to the desired format (e.g. type is date but the value found is not a date).

  • An entity is marked as mandatory but was not found by the model.
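The four cases listed above can be sketched as a simple set of checks. This is only an illustration of the logic, not Metamaze's actual implementation or API: the names `EntityPrediction`, `review_reasons` and `is_valid_date`, the 0 to 1 confidence scale, and the ISO date format are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class EntityPrediction:
    """Hypothetical container for one predicted entity (not a Metamaze type)."""
    name: str
    value: Optional[str]       # None when the model did not find the entity
    confidence: float          # assumed scale: 0.0 to 1.0
    threshold: float           # threshold configured for this entity type
    mandatory: bool = False
    expected_format: Optional[str] = None  # e.g. "date"


def is_valid_date(value: str) -> bool:
    """Assumption: dates are expected in ISO format (YYYY-MM-DD)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def review_reasons(doc_confidence: float,
                   doc_threshold: float,
                   entities: List[EntityPrediction]) -> List[str]:
    """Return the reasons a document would be flagged; empty list means none."""
    reasons = []
    # Case 1: document type recognised with too little confidence.
    if doc_confidence < doc_threshold:
        reasons.append("document type below threshold")
    for e in entities:
        if e.value is None:
            # Case 4: mandatory entity was not found.
            if e.mandatory:
                reasons.append(f"{e.name}: mandatory entity not found")
            continue
        # Case 2: entity recognised with too little confidence.
        if e.confidence < e.threshold:
            reasons.append(f"{e.name}: confidence below threshold")
        # Case 3: value cannot be validated against the expected format.
        if e.expected_format == "date" and not is_valid_date(e.value):
            reasons.append(f"{e.name}: value is not a valid date")
    return reasons
```

In this sketch, a document is sent to manual intervention whenever `review_reasons` returns a non-empty list. Returning the reasons, rather than a single yes/no flag, makes it clear to the reviewer what needs checking.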

You can also make manual intervention mandatory for each step (document type prediction and/or entity extraction), so that every document is checked. If you do not have historical documents for training your models, you can start using Metamaze without any trained models by annotating your data during manual intervention. The Metamaze software provides all the tools to quickly label documents, which will always be faster than manual data entry.
