Before you start
What you need to know before starting a new Metamaze project.
Scoping questions for Metamaze
Before creating a Metamaze project, you need a clear view of the process you want to automate and how you want to automate it. Specifically, you need answers to the following questions:
Decision | Typical options |
How many document types are there? Is there an exhaustive list available? Is document type prediction needed? | |
Does text information need to be extracted with entity extraction? | |
Are image detection models needed? | |
Which type of page management is needed? | |
How will the documents be sent to Metamaze? Where should the results be sent to? | |
Which languages are in scope? | |
Do you have annotated data? Describe the format and size of the data you have. | |
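As a rough aid, the questions above can be kept as a checklist so that open items are visible before the scoping meeting. The sketch below is not part of Metamaze: the keys in `SCOPING_QUESTIONS`, the `unanswered` helper, and the sample answers are all hypothetical and only illustrate the idea.

```python
# Hypothetical short keys for the scoping questions listed above.
SCOPING_QUESTIONS = {
    "document_types": "How many document types are there? Is document type prediction needed?",
    "entity_extraction": "Does text information need to be extracted with entity extraction?",
    "image_detection": "Are image detection models needed?",
    "page_management": "Which type of page management is needed?",
    "integration": "How will documents be sent to Metamaze? Where should results be sent?",
    "languages": "Which languages are in scope?",
    "annotated_data": "Do you have annotated data? In what format and size?",
}


def unanswered(answers: dict) -> list:
    """Return the keys of scoping questions that have no non-empty answer yet."""
    return [key for key in SCOPING_QUESTIONS if not answers.get(key, "").strip()]


# Illustrative answers only; replace with your own.
answers = {
    "document_types": "3 types, exhaustive list available, prediction needed",
    "languages": "Dutch, French, English",
}

for key in unanswered(answers):
    print(f"Still open: {SCOPING_QUESTIONS[key]}")
```

Any question still listed as open is a good candidate to raise during the scoping meeting itself.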
What to prepare for a scoping meeting
To prepare for a scoping meeting, it helps to bring an overview of the following:
A list of all document types that you need to extract information from.
Per document type, a list of entities (fields) you want to extract.
Per entity, basic information such as:
  Is it required?
  Minimum number of occurrences
  Maximum number of unique occurrences
  Is parsing / standardisation required?
Annotated examples of all document types and entities, preferably a couple per document type, so that the diversity of the documents can be estimated.
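One way to organise this overview is as a small structured record per document type and entity. The following is a minimal sketch under stated assumptions: `EntitySpec`, `DocumentTypeSpec`, and `overview` are hypothetical names invented for illustration, not a Metamaze format or API, and the invoice example is made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EntitySpec:
    """Basic information about one entity (field) to extract. Hypothetical structure."""
    name: str
    required: bool
    min_occurrences: int = 0
    max_unique_occurrences: Optional[int] = None  # None means "no limit"
    needs_parsing: bool = False  # parsing / standardisation required?


@dataclass
class DocumentTypeSpec:
    """One document type with its entities and the number of annotated examples."""
    name: str
    entities: List[EntitySpec] = field(default_factory=list)
    annotated_examples: int = 0


def overview(doc_types: List[DocumentTypeSpec]) -> List[str]:
    """Render a plain-text overview, one line per document type and per entity."""
    lines = []
    for doc in doc_types:
        lines.append(f"{doc.name} ({doc.annotated_examples} annotated examples)")
        for entity in doc.entities:
            limit = "unlimited" if entity.max_unique_occurrences is None else entity.max_unique_occurrences
            lines.append(
                f"  {entity.name}: required={entity.required}, min={entity.min_occurrences}, "
                f"max_unique={limit}, parsing={entity.needs_parsing}"
            )
    return lines


# Illustrative example only.
invoice = DocumentTypeSpec(
    name="invoice",
    entities=[
        EntitySpec("invoice_number", required=True, min_occurrences=1, max_unique_occurrences=1),
        EntitySpec("due_date", required=False, max_unique_occurrences=1, needs_parsing=True),
    ],
    annotated_examples=5,
)
print("\n".join(overview([invoice])))
```

A spreadsheet with the same columns serves the same purpose; the point is only that each entity carries the four pieces of information listed above.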