👤Human validation

These settings let you make manual intervention mandatory for document classification, entity extraction, or both. There are two common reasons to do so:

a. Quality control - You always want a human validation check at a certain step in the processing pipeline.

b. No initial training data for your models - If you do not have historical documents to train your document classification and entity extraction models, you can start from the manual intervention module. As you process documents through it, they are labeled, and the training data grows over time until full automation is reached.

Human validation general settings

You can choose whether documents can be marked as "done" while unresolved issues remain. Enabling this option gives your validators more flexibility, but it also allows them to overlook hard validation rules. Keep this in mind when making this selection.

Document classification thresholds

You can set a general threshold (a score from 0 to 100) that applies to all document types, or an individual threshold for each document type. If the document classification model assigns a document type with a confidence score lower than the applicable threshold, the document is placed in the manual intervention module for a human validation check.
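The routing rule above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function name, the parameter names, and the choice to let an individual threshold take precedence over the general one are all assumptions.

```python
from typing import Mapping, Optional


def classification_needs_review(
    doc_type: str,
    confidence: float,
    general_threshold: float,
    per_type_thresholds: Optional[Mapping[str, float]] = None,
) -> bool:
    """Return True when the document should go to manual intervention.

    All scores and thresholds are on the 0 to 100 scale.
    """
    thresholds = per_type_thresholds or {}
    # Assumption: an individual threshold, when set, overrides the general one.
    threshold = thresholds.get(doc_type, general_threshold)
    # A confidence score lower than the threshold triggers a human check.
    return confidence < threshold
```

For example, with a general threshold of 80, an "invoice" classified with a score of 72 would be sent to manual intervention, while the same document would pass if "invoice" had its own threshold of 70.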

Entity extraction thresholds

You can set a general threshold (a score from 0 to 100) that applies to all entities, or an individual threshold for each entity type. If the entity extraction model recognises one or more words in a document as a certain entity type with a confidence score lower than the applicable threshold, the document is placed in the manual intervention module for a human validation check.
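The entity rule differs from the classification rule in that a single low-scoring entity is enough to send the whole document to manual intervention. A minimal sketch, assuming predictions arrive as hypothetical (entity type, score) pairs and that an individual threshold overrides the general one:

```python
from typing import Iterable, Mapping, Optional, Tuple


def entities_need_review(
    predictions: Iterable[Tuple[str, float]],
    general_threshold: float,
    per_entity_thresholds: Optional[Mapping[str, float]] = None,
) -> bool:
    """Return True when at least one entity prediction is below its threshold.

    `predictions` holds (entity_type, confidence) pairs on the 0 to 100 scale.
    """
    thresholds = per_entity_thresholds or {}
    # Assumption: an individual threshold, when set, overrides the general one.
    return any(
        score < thresholds.get(entity_type, general_threshold)
        for entity_type, score in predictions
    )
```

With a general threshold of 85, a document with predictions `[("iban", 97), ("total_amount", 64)]` would be flagged because of the second entity alone.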

OCR score

By including the OCR score in the confidence score, you account for the accuracy of the extracted text: the quality of the text extraction is taken into consideration when determining the confidence level of an entity prediction.

Note that incorporating the OCR score may result in a lower overall confidence score.
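How the two scores are combined is not specified here. Purely as an illustration, the sketch below assumes both scores are on the 0 to 100 scale and are combined by multiplication; the function name and the formula are assumptions, not the product's documented behaviour. Under that assumption, the combined score can never exceed the entity score, which shows why enabling this option may lower the result.

```python
def combined_confidence(entity_score: float, ocr_score: float) -> float:
    """Combine an entity score and an OCR score, both on the 0 to 100 scale.

    Assumption: the scores are multiplied as fractions; the real formula
    used by the product may differ.
    """
    return entity_score * ocr_score / 100


# An entity score of 90 with an OCR score of 80 gives a combined score of 72.
score = combined_confidence(90, 80)
```

In this example, an entity that would pass a threshold of 75 on its own score of 90 would fall below that threshold once the OCR score is included.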
