Tutorial for creating a new enrichment
Supplier look-up in Python
In this tutorial, we'll define a new enrichment step by step for looking up a supplier based on a provided VAT number.
The complete code of this tutorial can be found on GitHub.
1. Create and configure a simple test project
For this tutorial, we will assume you are familiar with working in the Python programming language.
Create a new project in Metamaze with a clear name like "Development project for testing enrichments".
In that project, create a new document type and give it a name, like "Purchase Order".
Add an entity to that document type called "VAT number". Make this entity required so that all documents go to human validation, which is helpful for testing purposes.
Next, we will configure the enrichments API.
2. Create a new API endpoint for matching a supplier
First, we define a boilerplate Flask server with some Bearer token authentication.
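A minimal sketch of such a server is shown below. The token value, the token_required decorator and the /api/ping route are hypothetical names chosen for illustration; the original code may use a dedicated authentication library instead of a hand-written decorator.

```python
import hmac
from functools import wraps

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical placeholder: use the same token you configure in Metamaze.
API_TOKEN = "my-secret-token"


def token_required(view):
    """Reject requests without the expected 'Authorization: Bearer <token>' header."""

    @wraps(view)
    def wrapper(*args, **kwargs):
        header = request.headers.get("Authorization", "")
        scheme, _, token = header.partition(" ")
        # compare_digest avoids leaking information through comparison timing.
        token_ok = hmac.compare_digest(token.encode(), API_TOKEN.encode())
        if scheme.lower() != "bearer" or not token_ok:
            return jsonify({"error": "Unauthorized"}), 401
        return view(*args, **kwargs)

    return wrapper


@app.route("/api/ping")
@token_required
def ping():
    # Hypothetical route, only here to show the decorator in use.
    return jsonify({"status": "ok"})
```

Calling app.run(port=5001) at the end of the script would start Flask's development server on the port that is exposed through ngrok later in this tutorial.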
Then, let's add some example data.
In real life, this data would be populated by reading from a database or a reference file.
We'll do some transformations on the data to make it a bit easier to work with. Let's create simple dictionaries from the data and store them in a new SUPPLIERS list.
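As a sketch of this transformation, the snippet below turns raw rows into dictionaries. The rows, the RAW_SUPPLIERS and FIELD_NAMES names, and the vat_number and city keys are invented placeholders; only company_name appears elsewhere in this tutorial.

```python
# Invented placeholder rows: (company name, VAT number, city).
RAW_SUPPLIERS = [
    ("Acme Supplies", "BE 0123.456.789", "Ghent"),
    ("Globex Trading", "NL123456789B01", "Utrecht"),
    ("Initech Parts", "FR 12 345678901", "Lyon"),
]

# Hypothetical keys; they should match the column names configured in Metamaze.
FIELD_NAMES = ("company_name", "vat_number", "city")

# Pair each field name with the value at the same position in the row.
SUPPLIERS = [dict(zip(FIELD_NAMES, row)) for row in RAW_SUPPLIERS]
```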
Great! Now we are ready to define our first API call. We are going to use a GET request to the /api/find-supplier route, and re-use the token-based authentication we defined earlier.
The body of the API call will contain the whole document (reference). In this case, we want to match based on the found VAT number, so let's first find the correct value.
Make sure that the entity name in your code matches the entity name you configured in Metamaze exactly, including the casing.
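The lookup could be sketched as follows. The shape of the request body used here, a "document" object holding a list of "entities" with "name" and "text" keys, is purely an assumption for illustration; check the actual payload Metamaze sends before relying on it. The find_entity_value helper is likewise a hypothetical name.

```python
# Must match the entity name configured in Metamaze exactly, including casing.
ENTITY_NAME = "VAT number"


def find_entity_value(body, entity_name=ENTITY_NAME):
    """Return the text of the first entity with the given name, or None.

    Assumed (hypothetical) body shape:
    {"document": {"entities": [{"name": "...", "text": "..."}]}}
    """
    entities = body.get("document", {}).get("entities", [])
    for entity in entities:
        if entity.get("name") == entity_name:
            return entity.get("text")
    return None


example_body = {
    "document": {"entities": [{"name": "VAT number", "text": "BE 0123.456.789"}]}
}
vat_number = find_entity_value(example_body)
```

Because the comparison is exact, an entity named "vat number" or "VAT Number" would not be found.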
To improve matching rates, it often makes sense to make the lookup a bit more robust instead of matching strings exactly. In real life, you would often even use fuzzy matching on multiple fields to find the correct record. For an example of how you could use fuzzy matching, see the example FuzzyPurchaseOrderEnrichment (TypeScript). In this case, we'll simply ignore all non-alphanumeric characters when comparing VAT numbers.
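One way to do this is a small normalization helper, sketched below. The normalize name is hypothetical, and upper-casing the result is an extra assumption that goes slightly beyond stripping non-alphanumeric characters.

```python
import re


def normalize(value):
    """Remove every character that is not an ASCII letter or digit, then upper-case."""
    return re.sub(r"[^0-9A-Za-z]", "", value or "").upper()
```

With this helper, "BE 0123.456.789" and "be0123456789" both normalize to "BE0123456789", so they compare as equal.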
We'll find the first match in our reference data by using Python's built-in next function.
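The sketch below shows the idea: next takes a generator of candidates and a default that is returned when nothing matches. The supplier rows and the find_supplier and normalize names are placeholders invented for this example.

```python
import re

# Invented placeholder reference data.
SUPPLIERS = [
    {"company_name": "Acme Supplies", "vat_number": "BE 0123.456.789"},
    {"company_name": "Globex Trading", "vat_number": "NL123456789B01"},
]


def normalize(value):
    """Remove non-alphanumeric characters and upper-case the rest."""
    return re.sub(r"[^0-9A-Za-z]", "", value or "").upper()


def find_supplier(vat_number):
    """Return the first supplier whose normalized VAT number matches, or None."""
    wanted = normalize(vat_number)
    if not wanted:
        return None  # nothing usable to match on
    return next(
        (s for s in SUPPLIERS if normalize(s["vat_number"]) == wanted),
        None,  # default when the generator yields no match
    )
```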
Finally, all we need to do is return the found supplier object, if it exists.
We're only returning one match here. Later, you can extend that by returning multiple potential matches, or custom exception codes.
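A hedged sketch of building that response follows. The envelope with "name" and "entries" keys is a hypothetical shape, not the documented Metamaze schema; the only grounded details are that the returned name must match the enrichment name configured in Metamaze and that the enrichment returns full objects (entries). The supplier_response helper is an invented name.

```python
# Must be identical to the enrichment name configured in Metamaze.
ENRICHMENT_NAME = "Find supplier"


def supplier_response(supplier):
    """Build a (payload, status) pair for a single optional match.

    The payload shape is an assumption made for this sketch.
    """
    entries = [supplier] if supplier is not None else []
    return {"name": ENRICHMENT_NAME, "entries": entries}, 200
```

Since Flask 1.1, a view function can return such a (dict, status) tuple directly and Flask serializes the dictionary to JSON.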
Now, let's start up the Flask server by opening a terminal and running the server script.
To make sure Metamaze can access the little debug server running on your local machine, you can use a free tool like ngrok. For example, you could run ngrok http 5001, which prints a public forwarding URL for the tunnel.
Grab the public URL (in this case https://7032-109-135-42-38.ngrok-free.app) and store it somewhere. We'll need it to configure the enrichment endpoint in the next step.
Note that ngrok tunnels are temporary and for debugging purposes only. If the session times out and you restart the tunnel, it will be on a different URL. You will need to change the URL in the enrichment settings too.
We have everything up and running now to start configuring Metamaze. Awesome work 🎉!
Remember, if you want the full code example, you can find it here.
3. Configure the Find Supplier enrichment
In Metamaze, navigate to Project Settings > Enrichments and click on the blue + Create button to add a new enrichment.
Configure the General settings
We'll give it the name "Find supplier". Note that this needs to be exactly the same name as the one you are returning in the API call from before.
Let's enable Human validation, and make the enrichment required. This will make it easier to debug.
We'll link the document type "Purchase Order" we created in the set-up.
Then, we will define the Triggers.
Add a trigger "After entity extraction" of the document type "Purchase Order". This will make sure that the enrichment is triggered when entities are predicted automatically. Note that we didn't train a model in this tutorial.
Add a trigger "After labeling" of the entity "VAT number". This will make sure that when we change an annotation manually, the enrichment will be re-triggered.
In the section "Value types", we will take Entries since we are returning full objects, not just simple strings. We can define the columns that we want to show. On the left side, use exactly the same names as the keys you return in the API. On the right side, we can give them some user-friendly labels.
Finally, we'll define the webhook. If you are using ngrok, make sure you are using the live tunnel URL and appending the route (/api/find-supplier) we defined in our code. Also make sure you are using the same Bearer token as the one the server expects.
Click the Create button to finish your enrichment.
Upload a document and test
Navigate to your Production Uploads, and upload a new file. For example, we can use the test file.
Since we have not trained any model, there will be no predictions at the start, and the document will look empty.
Add an annotation for the VAT number by clicking on the document and choosing the entity VAT number.
The enrichment will be triggered automatically, and you will see the result.
By clicking on the enrichment line, you can see all the details of the object too.
Congratulations, your first enrichment is complete 🎉!
If you look in the "All" tab, you'll notice that you can't search for suppliers here. We'll configure that in the next section.
(Optional) Add a second API call to list all suppliers
Add a new route to the Flask server
We can also add a second API call to list all the suppliers. In the code, add a new route on /api/list-suppliers.
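Such a route could be sketched as below. The token value and supplier rows are invented placeholders, the inline header check stands in for whatever authentication the server already uses, and returning the bare list as JSON is an assumption about the response shape.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical placeholder: use the same token you configure in Metamaze.
API_TOKEN = "my-secret-token"

# Invented placeholder reference data.
SUPPLIERS = [
    {"company_name": "Acme Supplies", "vat_number": "BE 0123.456.789"},
    {"company_name": "Globex Trading", "vat_number": "NL123456789B01"},
]


@app.route("/api/list-suppliers", methods=["GET"])
def list_suppliers():
    """Return every known supplier so users can search and pick one."""
    if request.headers.get("Authorization") != f"Bearer {API_TOKEN}":
        return jsonify({"error": "Unauthorized"}), 401
    return jsonify(SUPPLIERS)
```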
The keys of the dictionaries contained in the suppliers list should be the same as the configured column names. These column names are set in the enrichment settings in the Metamaze platform.
For example, if you have a column called company_name, you should have a key called company_name in the dictionary too.
Here's an example object that we are returning
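As an illustration, a returned object could look like the dictionary below. Its values, and the vat_number and city keys, are invented placeholders. The missing_columns helper is a hypothetical check, not part of Metamaze, for catching keys that do not line up with the configured column names.

```python
# Hypothetical column names as they would be configured in the enrichment settings.
CONFIGURED_COLUMNS = ["company_name", "vat_number", "city"]

# Invented example object; its keys mirror the configured columns.
example_supplier = {
    "company_name": "Acme Supplies",
    "vat_number": "BE 0123.456.789",
    "city": "Ghent",
}


def missing_columns(entry, columns=CONFIGURED_COLUMNS):
    """Return the configured column names that have no matching key in entry."""
    return [column for column in columns if column not in entry]
```

An empty result from missing_columns(example_supplier) means every configured column has a matching key in the object.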
Configure the options in the enrichment
Navigate back to the "Find supplier" enrichment in the Project Settings and go to the "Options" section. Fill in the new API route on the correct URL with the correct Bearer token.
Click Update.
Test the options
Now, in the All tab, you will be able to search and select from a list of all suppliers.