Entity Extraction

If a document type has entities extracted by an AI model, you can view the metrics for its entity extraction trainings in the Entity Extraction tab under model management. Select the document type you want to inspect in the dropdown menu to visualise graphs of the main metrics over time. The graphs show the number of documents, the Straight-Through-Processing (STP) rate, the F1 score, the Precision and the Recall; the last three graphs break these metrics down per entity type. All metrics in the graphs are reported on the validation set: 10% of the training data that is automatically split off at the start of a training. If your training dataset is very small, interpret the metrics with caution: metrics calculated on a handful of documents do not generalise well.
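For reference, Precision, Recall and F1 are assumed here to follow their standard definitions in terms of true positives (TP), false positives (FP) and false negatives (FN):

$$
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
$$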

The STP rate is calculated using the strictest criteria possible: it is the percentage of documents in the validation set for which there was no mismatch at all between the annotations and the predictions of the trained model. The metric takes into account neither the optional character of entities, nor entity aggregation, nor any business rules. For instance, if an entity is not marked as required and the model fails to predict it on a document where it was annotated, that document counts as non-automated for the STP rate, even though it would be processed automatically in the production workflow. Similarly, if finding an entity once per document is sufficient for production purposes, but it was annotated twice and the model predicted it only once, the document again counts as non-automated, although the production workflow would process it automatically. The STP rate in the graphs can therefore be viewed as a minimum automation rate.
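As an illustration only (this is a minimal sketch, not the platform's actual implementation), a strict STP rate of this kind could be computed as follows. The `strict_stp_rate` helper and the representation of each document as a multiset of (entity type, value) pairs are illustrative assumptions:

```python
from collections import Counter

def strict_stp_rate(documents):
    """Fraction of documents whose predictions match their annotations exactly.

    `documents` is an iterable of (annotations, predictions) pairs, where each
    side is a list of (entity_type, value) tuples. Multisets are compared, so
    an entity annotated twice must also be predicted twice.
    """
    docs = list(documents)
    if not docs:
        return 0.0
    exact = sum(
        1 for annotations, predictions in docs
        if Counter(annotations) == Counter(predictions)
    )
    return exact / len(docs)

# Example: the second document misses one of two "iban" annotations,
# so it counts as non-automated under the strict criterion.
validation = [
    ([("invoice_number", "F-001")], [("invoice_number", "F-001")]),
    ([("iban", "BE71096123456769")] * 2, [("iban", "BE71096123456769")]),
]
print(strict_stp_rate(validation))  # 0.5
```

Note how the multiset comparison captures the second example in the paragraph above: one correct prediction out of two identical annotations still fails the strict criterion.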

The graphs can be exported in several formats by clicking on the hamburger menu.

Select a training date in the dropdown menu at the bottom of the page to view the metrics per entity. For each entity included in the training data (entities that occur too infrequently are removed from the training data to obtain models that are as accurate as possible, see Frequently asked questions), this section details the number of training documents the entity was found on, the number of annotations, the F1 score, the Precision, the Recall and the False Discovery Rate. Keep in mind that during a training the data is split into a training and a validation dataset, with the validation dataset containing approximately 10% of the documents. The document and annotation counts are reported on the training set, while the accuracy metrics are reported on the validation set. This lets you determine whether an entity is recognised poorly because of its low frequency in the dataset or for other reasons, such as poor annotation quality. If a poorly recognised entity is very frequent in the dataset, the annotations for that entity are probably of poor quality.
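Under its standard definition, which the False Discovery Rate shown here is assumed to follow, the FDR is simply the complement of Precision, so a high FDR for an entity means that many of the model's predictions for it are false positives:

$$
\text{FDR} = \frac{FP}{TP + FP} = 1 - \text{Precision}
$$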
