What you need to know before starting a new Metamaze project.
Scoping questions for Metamaze
Before you create a Metamaze project, you need a clear view of the process you want to automate and how you want to automate it. Specifically, you need answers to the following questions:
How many document types are there? Is there an exhaustive list available?
Is document type prediction needed?
Not needed because there is only one type of document
Not needed because we already know the incoming document type
Does text information need to be extracted with entity extraction?
If yes, see the section on data requirements for entity extraction
Are image detection models needed?
Needed, but only for standard objects like signatures, stamps, handwriting, "gelezen en goedgekeurd" ("read and approved"), ...
Needed with custom detection models => contact an ADP Engineer.
Which type of page management is needed?
Each file is exactly one document
Splitting and merging files into documents
How will the documents be sent to Metamaze? Where should the results be sent to?
Custom API => technical scoping needed with a Metamaze Engineer
Custom integration => technical scoping needed with a Metamaze Engineer
Which languages are in scope?
Do you have annotated data? Describe the format and size of the data you have.
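The answers to the questions above can be recorded in a small structure so that the follow-up actions are easy to derive. The following is a minimal sketch only: `ScopingAnswers`, its field names, and `follow_ups` are hypothetical names made up for illustration and are not part of any Metamaze API. The rule that document type prediction is "likely needed" when there are several types and the incoming type is unknown is an assumption inferred from the two "not needed" cases listed above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScopingAnswers:
    # Hypothetical field names that mirror the scoping questions above.
    document_type_count: int
    incoming_type_known: bool
    needs_entity_extraction: bool
    needs_custom_image_detection: bool
    uses_custom_api_or_integration: bool


def follow_ups(answers: ScopingAnswers) -> List[str]:
    """Return the follow-up actions implied by a set of scoping answers."""
    notes = []
    # Assumption: prediction matters only with several types of unknown origin.
    if answers.document_type_count > 1 and not answers.incoming_type_known:
        notes.append("document type prediction is likely needed")
    if answers.needs_entity_extraction:
        notes.append("check the data requirements for entity extraction")
    if answers.needs_custom_image_detection:
        notes.append("contact an ADP Engineer about custom detection models")
    if answers.uses_custom_api_or_integration:
        notes.append("plan technical scoping with a Metamaze Engineer")
    return notes


# Example: three document types, type not known in advance, custom integration.
example = ScopingAnswers(
    document_type_count=3,
    incoming_type_known=False,
    needs_entity_extraction=True,
    needs_custom_image_detection=False,
    uses_custom_api_or_integration=True,
)
for note in follow_ups(example):
    print(note)
```

Keeping the answers in one place like this makes it easier to see, before the scoping meeting, which topics need an engineer in the room.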
What to prepare for a scoping meeting
To prepare for a scoping meeting, it helps to bring an overview of:
A list of all document types that you need to process
Per document type, a list of entities (fields) you want to extract.
Per entity, basic information such as:
Minimum number of occurrences
Maximum number of unique occurrences
Parsing / standardisation required?
Annotated examples of all document types and entities. Preferably, bring a couple of examples per document type so that the diversity of the documents can be estimated.
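The overview described above can be written down as a simple structured inventory before the meeting. The sketch below is illustrative only: `EntitySpec`, `DocumentTypeSpec`, and `scoping_gaps` are hypothetical names, not a Metamaze format, and the threshold of two examples is an assumed reading of "a couple of examples".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EntitySpec:
    """One entity (field) to extract; attribute names are illustrative."""
    name: str
    min_occurrences: int = 0                      # minimum number of occurrences
    max_unique_occurrences: Optional[int] = None  # None means no upper bound
    needs_parsing: bool = False                   # parsing / standardisation required?


@dataclass
class DocumentTypeSpec:
    """One document type with its entities and available annotated examples."""
    name: str
    entities: List[EntitySpec] = field(default_factory=list)
    example_count: int = 0


def scoping_gaps(doc_types: List[DocumentTypeSpec]) -> List[str]:
    """List what is still missing from the overview."""
    gaps = []
    for doc_type in doc_types:
        if not doc_type.entities:
            gaps.append(f"{doc_type.name}: no entities listed")
        # Assumption: "a couple of examples" is read as at least two.
        if doc_type.example_count < 2:
            gaps.append(f"{doc_type.name}: fewer than 2 annotated examples")
        for entity in doc_type.entities:
            upper = entity.max_unique_occurrences
            if upper is not None and upper < entity.min_occurrences:
                gaps.append(f"{doc_type.name}.{entity.name}: maximum is below minimum")
    return gaps


# Example inventory with one complete and one incomplete document type.
invoice = DocumentTypeSpec(
    name="invoice",
    entities=[
        EntitySpec("invoice_number", min_occurrences=1, max_unique_occurrences=1),
        EntitySpec("due_date", max_unique_occurrences=1, needs_parsing=True),
    ],
    example_count=3,
)
payslip = DocumentTypeSpec(name="payslip")
gaps = scoping_gaps([invoice, payslip])
for gap in gaps:
    print(gap)
```

An inventory like this does not replace the annotated examples themselves; it only makes gaps in the preparation visible early.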