📋 Tasks

Tasks are how you efficiently label and review your training data in Metamaze.

Metamaze contains a task module that makes it possible to quickly examine the documents in the training data and correct them if necessary. For example, you can select all documents for one or more entities, and the module will present, step by step, the pages where those entities occur so you can check them.

This module has two panels: the first one shows the created tasks, the second one shows suggestions for creating new tasks.

Suggested tasks

Metamaze suggests tasks both for data that was uploaded in the training module and for data that was uploaded in the production module and needed human intervention.

The suggested tasks show the following columns:

  • Task type - review or an annotation task

  • Documents - the number of documents in the task

  • Source - documents uploaded in training or production

  • Model type - entity extraction or document classification

  • Document type - corresponds with the document types that are defined in project settings

  • Language - the language of the documents in the task

Suggested annotation tasks

Suggested annotation tasks help speed up the annotation process by choosing the most useful data to annotate from scratch. At the end of a model training, Metamaze will select, among the unlabelled documents present in the training module at that moment, those documents from which the model can learn the most. Annotating these documents will allow the model to become more accurate more quickly (active learning). It is thus recommended to have sufficient unlabelled data present in the training module when triggering a training. You will notice that the documents in the suggested annotation task already have some suggested annotations to speed up labelling.

If you haven't trained a model yet, Metamaze will still bundle unlabelled documents for annotation. However, the resulting task won't include model-assisted labelling or any pre-selection of particularly relevant documents.

By default, the following is enabled:

  • Grouping of similar documents

  • Documents that add the most value are ranked first (based on document confidence score)

  • Model-assisted labelling (predictions are loaded based on the previous training of the model)
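The ranking idea described above can be illustrated with a minimal sketch. It assumes that each unlabelled document carries a document confidence score from the last training and that a lower score means the model can learn more from that document. The names `Document` and `rank_for_annotation` are hypothetical and not part of Metamaze; the actual selection logic is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    confidence: float  # hypothetical document confidence score in [0, 1]


def rank_for_annotation(docs, limit):
    """Return up to `limit` documents, least confident first.

    Assumption: a low confidence score marks a document the model
    would learn the most from (uncertainty sampling).
    """
    return sorted(docs, key=lambda d: d.confidence)[:limit]


# Illustrative data only
unlabelled = [
    Document("invoice-1", 0.93),
    Document("invoice-2", 0.41),
    Document("invoice-3", 0.67),
]
suggested = rank_for_annotation(unlabelled, limit=2)
print([d.doc_id for d in suggested])  # ['invoice-2', 'invoice-3']
```

The more unlabelled documents are available when a training is triggered, the more candidates such a ranking has to choose from.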

Suggested review tasks

Suggested review tasks help you improve the annotation quality of documents that are already annotated. These documents can either already be in the training set or come from production.

Suggested review tasks for training documents

Only documents that had the status PROCESSED before the last training can be included in this task. Out of those documents, only the ones that are likely to contain annotation errors are part of the suggested tasks.

By default, the following is enabled:

  • Grouping of similar documents

  • Misannotation hints (based on annotation confidence score): in this section Metamaze will give suggestions about fields or document types that are likely to be misannotated. As a user you still need to validate those hints since they are merely suggestions.

Suggested review tasks for production documents

The following documents are included:

  • Documents that required human validation and for which the human validation was completed

  • Documents that were manually sent to training from production

In both cases, these documents are included if the production status is PROCESSED but the training status is Input required, because these documents have not yet been approved as training data. By performing this task, you will approve these production documents as training data.

This task resets after a model training, as the results would be outdated. Only documents that were uploaded after the last model training are therefore included.
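The inclusion rule for production documents can be summarised in a short sketch. The class `ProductionDocument`, the function `eligible_for_review` and the dates are hypothetical and only restate the conditions from the text; they are not a Metamaze API. The sketch does not model how a document reached production review (completed human validation or manually sent to training).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProductionDocument:
    doc_id: str
    production_status: str  # e.g. "PROCESSED"
    training_status: str    # e.g. "Input required"
    uploaded_at: datetime


def eligible_for_review(doc, last_training):
    """Apply the three conditions described in the text."""
    return (
        doc.production_status == "PROCESSED"
        and doc.training_status == "Input required"
        and doc.uploaded_at > last_training
    )


# Illustrative data only
last_training = datetime(2024, 3, 1)
docs = [
    ProductionDocument("a", "PROCESSED", "Input required", datetime(2024, 3, 5)),
    ProductionDocument("b", "PROCESSED", "Processed", datetime(2024, 3, 6)),
    ProductionDocument("c", "PROCESSED", "Input required", datetime(2024, 2, 20)),
]
print([d.doc_id for d in docs if eligible_for_review(d, last_training)])  # ['a']
```

Document "b" is excluded because it is already approved as training data, and "c" because it was uploaded before the last training.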

By default, the following is enabled:

  • Grouping of similar documents

Creating a suggested task

Clicking on a suggested task opens a pop-up window.

The documents for the task have already been added. The only thing left to do is to add operators and, if you want, fill in the optional information. Note that you can split a task among several operators to keep each operator's share short.
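To illustrate what splitting a task among operators means, here is a minimal sketch. How Metamaze actually divides documents is not specified in this page; the round-robin distribution, the function name `split_among_operators` and the operator names are assumptions made for the example.

```python
def split_among_operators(doc_ids, operators):
    """Distribute documents over operators as evenly as possible (round-robin)."""
    if not operators:
        raise ValueError("at least one operator is required")
    assignment = {op: [] for op in operators}
    for index, doc_id in enumerate(doc_ids):
        assignment[operators[index % len(operators)]].append(doc_id)
    return assignment


# Illustrative data only: a task of 400 documents and four operators
docs = [f"doc-{n}" for n in range(1, 401)]
shares = split_among_operators(docs, ["ann", "bob", "cleo", "dirk"])
print({op: len(ids) for op, ids in shares.items()})
# {'ann': 100, 'bob': 100, 'cleo': 100, 'dirk': 100}
```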

Custom tasks

When you create a new task via the '+ Task' button, a pop-up window opens.

In this window, you fill in the following fields for the task:

Field - Description

  • Validation type:

      • Entity type - Review or add entity annotations on documents

      • Document type - Review or assign document types to documents

  • Languages - Select which languages you want to allow for the task

  • Filter options to select a group of documents for checking - These allow you to apply additional filters, such as the user who did the annotations or specific document statuses:

      • Annotated by user - Filter based on annotations done by specific users

      • Document status - Filter based on document status

      • Documents in tasks - Filter based on documents in other tasks

      • Entities list - Filter based on whether entities exist or not

      • Occurrences of entity - Filter based on the number of times an entity occurs

      • Source - Filter based on the source of the documents: training or production

      • Upload date - Filter based on the upload date of a document

      • Validation date - Filter based on the validation date of a document

      • Value of entity - Filter based on the value of an entity

As a final step you can calculate how many documents meet these conditions. Just like the suggested tasks, you can assign operators and fill in optional information.
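The effect of combining filters and then counting the matching documents can be sketched as follows. This assumes that all selected filters must hold at the same time (logical AND), which the page does not state explicitly. The `Doc` class, the `count_matching` helper, the entity name `invoice_number` and the sample data are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Doc:
    doc_id: str
    source: str        # "training" or "production"
    status: str        # e.g. "Processed"
    upload_date: date
    entities: dict = field(default_factory=dict)  # entity name -> list of values


def count_matching(docs, filters):
    """Count documents that satisfy every filter (assumed logical AND)."""
    return sum(all(f(d) for f in filters) for d in docs)


filters = [
    lambda d: d.source == "production",                        # Source
    lambda d: d.upload_date >= date(2024, 1, 1),               # Upload date
    lambda d: len(d.entities.get("invoice_number", [])) >= 1,  # Occurrences of entity
]

# Illustrative data only
docs = [
    Doc("a", "production", "Processed", date(2024, 2, 1), {"invoice_number": ["INV-1"]}),
    Doc("b", "training", "Processed", date(2024, 2, 1), {"invoice_number": ["INV-2"]}),
    Doc("c", "production", "Processed", date(2023, 12, 1), {}),
]
print(count_matching(docs, filters))  # 1
```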

Working with active tasks

Once a task is created (suggested or custom), it appears in the active tasks overview, where all tasks are shown with their current progress.

Each task can be expanded to show the progress for each individual operator assigned to the task.

When you have been assigned to a task as an operator, you get an extra link that allows you to start or resume the task.

Document view

When the task is started, the first document will be opened in the labelling view and you can examine the different entities and/or the document type and language.

Task details

The task details give you a high-level overview of the part of the task that is relevant to you. For example, if a task of 400 documents is created and 100 documents are assigned to you, you will only see those 100 documents.

The task section displays information about the task:

Field - Description

  • Progress - Your current progress, shown as a percentage and as a progress bar

  • Source - The source of the documents in the task:

      • Empty for custom tasks

      • Training

      • Production

  • Task type:

      • Review - Review documents that already have annotations

      • Annotation - Annotate documents that have no annotations yet

  • Deadline - The deadline by which the task should be completed

  • Description - The description of the task

  • Assignee - The assignee of the current task

  • Documents, represented with colored and numbered buttons:

      • Grey - Document needs to be processed

      • Green - Document marked as processed

      • Yellow - Parked document

      • Red - Document marked as failed

Metadata document

The 'metadata' shows all the properties of the document:

Field - Description

  • Status:

      • Processing - Document is being processed

      • Input required - Document has been processed by Metamaze AI but requires human input to be completed

      • Processed - Document has been processed

      • Failed - Processing of the document failed

  • Type - The document type

  • Language - The language of the document, which was set manually or predicted by the OCR step

  • Pages - The number of pages in the document

  • Name - The name of the document

  • ID - The unique ID of the document, with a handy copy button

  • Upload date - The date and time the document was uploaded in Metamaze

  • Actions:

      • Copy URL - Copy the URL for easy sharing with colleagues or support when you have a problem

      • Download PDF - Download a PDF version of the document

Edit document settings

The document settings allow you to change the name of the document, the language and the document type. Changing the document type will delete all the annotations because the entities are different for each document type.

Document actions

At the top of the document processing screen you have a number of buttons that help you navigate between documents and uploads.

  • Back - When in human validation, this button puts the document back in the queue. When in training or production data modules, or in tasks, it will go back to the overview of uploads, documents or tasks.

  • Upload done - This button is only available in human validation. You use this button when you have successfully finished processing all documents from an upload. The upload will then go to the next step 'output' to send the result to your system.

  • Done - When you are done with your intervention, you can mark the document as done.

  • Park - This button is only available in human intervention and in the tasks module. Use this button to park the document and handle it at a later time. A popup will be shown where you need to fill in a reason for parking. This reminds you why you decided to park this specific document and helps later when you ask your colleagues or manager for feedback about it.

  • Reject - This button is used if there is a problem with the document. Besides some standard errors like 'bad OCR' or 'irrelevant document', you can define your own errors in the project settings, see Custom errors.

Entities

This list shows each entity defined for this document together with the number of times this entity was found in the document.

Suggestions and misannotations

The list shows the following data:

Field - Description

  • Type - The type of the suggestion, which can be one of: suggested annotation, missing annotations, wrong indices, wrong label or wrong composite group

  • Suggested - The suggested entity with its value in the document

  • Actions:

      • Click on a row - Selects the entity in the document view, giving you the option to apply or ignore

Suggestion detail

The suggestion detail in the document view allows you to edit the annotation, validate or reject it.

The following actions are possible:

  • Edit - Allows you to edit an annotation by dragging the start and end cursor

  • Apply - Applies the suggestion and adds the annotation

  • Reject - The suggestion will be ignored and not added as an annotation

Annotations

The list gives an overview of all entities with their values found in the document which have been predicted by a model or manually added by a person. An entity always has a color but if the entity was found by the model it has a lighter transparent color. If the entity was manually added by a person it has a darker color.

The list shows the following data:

Field - Description

  • Name - The name of the entity

  • Value - The value of the entity

  • Parsed - The parsed value of the entity

  • User - The user who did the annotation. The value AI indicates that the annotation was predicted by the model.

  • Score - The confidence score of entities that were predicted by the model. A red color indicates that the score is lower than the threshold.

  • Page - The page on which the entity appears

  • Actions:

      • Click on a row - Selects the entity in the document view, giving you actions to perform on the entity

You can click a row in this list to see the value in the document. You can add additional entity values in the document by clicking the first and the last word of an entity, or by dragging to select it, and then choosing the correct label in the dropdown menu that opens. More info on how to perform entity annotation can be found in Annotation of training data.

Checkbox actions

Selecting 1 or more entities through the checkboxes allows you to perform the following actions:

  • Delete annotations - Removes the selected annotations

  • Enrich annotation - Can only be used on 1 annotation at a time and allows you to link an enrichment to the entity.

Document preview

The document preview allows you to see what the document looks like. It shows all the entities that were found. This view has 2 modes:

  • Hybrid - the original document is shown

  • Formatted - the text is displayed as the OCR model has recognised it

When certain words are not readable in one mode, you can always switch to the other. Entities can be added in both modes.

It is also possible to enlarge or reduce the display or to open it on a second screen.
