Before you start

What you need to know before starting a new Metamaze project.

Scoping questions for Metamaze

Before you create a Metamaze project, you need a clear view of the process you want to automate and how you want to automate it. Specifically, you need an answer to each of the following questions.

Each decision below is listed with its typical options.

How many document types are there? Is there an exhaustive list available?

Is document type prediction needed?

  • Not needed because there is only one type of document

  • Not needed because we already know the incoming document type

  • Is needed

Does text information need to be extracted with entity extraction?

  • If yes, see section on data requirements for entity extraction

Are image detection models needed?

  • Needed, but only for standard objects like signatures, stamps, handwriting, "gelezen en goedgekeurd" (Dutch for "read and approved"), ...

  • Needed with custom detection models => contact Metamaze at support@metamaze.eu

Which type of page management is needed?

  • Each file is exactly one document

  • Splitting and merging files into documents

How will the documents be sent to Metamaze? Where should the results be sent to?

  • Standard API

  • E-mail integration

  • Custom API => technical scoping needed with a Metamaze engineer

  • Custom integration => technical scoping needed with a Metamaze engineer

Which languages are in scope?

Do you have annotated data? Describe the format and size of the data you have.
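The answers to the questions above can be written down in a structured form before the scoping meeting. The following is a minimal sketch in Python, not part of any Metamaze API: the class names, field names and the `follow_ups` helper are all hypothetical and only mirror the options listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TypePrediction(Enum):
    SINGLE_TYPE = "not needed: only one document type"
    KNOWN_UPFRONT = "not needed: incoming type already known"
    NEEDED = "needed"


class PageManagement(Enum):
    ONE_FILE_ONE_DOCUMENT = "each file is exactly one document"
    SPLIT_AND_MERGE = "splitting and merging files into documents"


class Integration(Enum):
    STANDARD_API = "standard API"
    EMAIL = "e-mail integration"
    CUSTOM_API = "custom API"
    CUSTOM_INTEGRATION = "custom integration"


@dataclass
class ScopingAnswers:
    """Hypothetical record of the scoping decisions (illustration only)."""
    document_types: List[str]
    type_prediction: TypePrediction
    entity_extraction: bool
    custom_image_detection: bool
    page_management: PageManagement
    integration: Integration
    languages: List[str]
    annotated_data_description: str = ""

    def follow_ups(self) -> List[str]:
        """Collect the follow-up actions that the options above mention."""
        actions = []
        if self.entity_extraction:
            actions.append("Review the data requirements for entity extraction")
        if self.custom_image_detection:
            actions.append("Contact Metamaze at support@metamaze.eu")
        if self.integration in (Integration.CUSTOM_API, Integration.CUSTOM_INTEGRATION):
            actions.append("Plan technical scoping with a Metamaze engineer")
        return actions


# Example values are made up for illustration.
answers = ScopingAnswers(
    document_types=["invoice", "purchase order"],
    type_prediction=TypePrediction.NEEDED,
    entity_extraction=True,
    custom_image_detection=False,
    page_management=PageManagement.ONE_FILE_ONE_DOCUMENT,
    integration=Integration.CUSTOM_API,
    languages=["nl", "en"],
)
for action in answers.follow_ups():
    print(action)
```

Using enumerations for the multiple-choice questions keeps the answers limited to the options listed above, while free-form questions such as the description of annotated data stay plain text.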

What to prepare for a scoping meeting

To prepare for a scoping meeting, it helps to bring the following:

  • A list of all document types that you need to extract information from

  • Per document type, a list of entities (fields) you want to extract.

  • Per entity, basic information like

    • Required?

    • Minimum number of occurrences

    • Maximum number of unique occurrences

    • Parsing / standardisation required?

  • Annotated examples of all document types and entities. Preferably bring a couple of examples per document type, so that the diversity of the documents can be estimated.
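The overview of document types and entities can be drafted as a small data structure. The sketch below is only an illustration in Python: `EntitySpec`, its fields and the `problems` check are hypothetical names, not a Metamaze format, and the rule that a required entity needs at least one occurrence is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EntitySpec:
    """Hypothetical description of one entity (field) to extract."""
    name: str
    required: bool = False
    min_occurrences: int = 0
    max_unique_occurrences: Optional[int] = None  # None means no upper limit
    needs_parsing: bool = False  # parsing / standardisation required?

    def problems(self) -> List[str]:
        """Return basic consistency issues (assumed rules, for illustration)."""
        issues = []
        if self.min_occurrences < 0:
            issues.append(f"{self.name}: minimum occurrences cannot be negative")
        if self.required and self.min_occurrences < 1:
            issues.append(f"{self.name}: a required entity should occur at least once")
        if self.max_unique_occurrences is not None and self.max_unique_occurrences < 1:
            issues.append(f"{self.name}: maximum unique occurrences should be at least 1")
        return issues


# Document type -> entities to extract. All names are made-up examples.
overview: Dict[str, List[EntitySpec]] = {
    "invoice": [
        EntitySpec("invoice_number", required=True, min_occurrences=1,
                   max_unique_occurrences=1),
        EntitySpec("total_amount", required=True, min_occurrences=1,
                   max_unique_occurrences=1, needs_parsing=True),
        EntitySpec("line_item_description"),
    ],
}

# Gather every issue across all document types.
all_problems = [
    issue
    for entities in overview.values()
    for entity in entities
    for issue in entity.problems()
]
print(all_problems)
```

A spreadsheet with the same columns works equally well; the point is that each entity has an explicit answer for every item in the list above.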
