Enrich Data with Elasticsearch 8.x - Part 1: Basic Examples

Published on 2023-01-28

Introduction

Code on GitHub: Elasticsearch Data Enrichment

If you do not have Elasticsearch and Kibana set up yet, then follow these instructions.

This video assumes you are using publicly signed certificates. If you are using self-signed certificates, go here TBD.

Requirements

Process

Definition of Data Enrichment [00:09]

Go to this timestamp in the video for an explanation of what data enrichment means.

How To Enrich Data Based On "Exact Value Match" [05:47]

Step - 1 Setup A Source Index [06:22]

In Kibana, go to Dev Tools > Console and paste the command below. When you run it, Elasticsearch automatically creates the users index, adds the document to it, and dynamically maps each field.

PUT /users/_doc/1?refresh=wait_for
{
  "email": "mardy.brown@asciidocsmith.com",
  "first_name": "Mardy",
  "last_name": "Brown",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "web": "mardy.asciidocsmith.com"
}

After you run it, you should see a result similar to the image below:

[Image: Console result for step 1]

To confirm the index was created successfully, go to Stack Management > Index Management. You should see a result similar to the image below:

[Image: Index created successfully]
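
The automatic mapping in this step follows Elasticsearch's dynamic mapping defaults. Here is a minimal Python sketch of those defaults, assuming the standard rules (strings become text with a keyword sub-field, whole numbers become long) and ignoring date and numeric detection. The helper name infer_dynamic_mapping is hypothetical and is not part of any Elasticsearch client.

```python
def infer_dynamic_mapping(doc):
    """Guess the field types that default dynamic mapping would assign (simplified)."""
    mapping = {}
    for field, value in doc.items():
        # bool must be checked before int because bool is a subclass of int in Python.
        if isinstance(value, bool):
            mapping[field] = {"type": "boolean"}
        elif isinstance(value, int):
            mapping[field] = {"type": "long"}
        elif isinstance(value, float):
            mapping[field] = {"type": "float"}
        elif isinstance(value, str):
            # Strings become text with a keyword sub-field (date detection is ignored here).
            mapping[field] = {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        elif isinstance(value, dict):
            mapping[field] = {"properties": infer_dynamic_mapping(value)}
    return mapping


user = {
    "email": "mardy.brown@asciidocsmith.com",
    "first_name": "Mardy",
    "zip": 70116,
}
mapping = infer_dynamic_mapping(user)
print(mapping["zip"])            # {'type': 'long'}
print(mapping["email"]["type"])  # text
```

To see the mapping Elasticsearch actually produced, run GET /users/_mapping in the console.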

Step - 2 Setup An Enrichment Policy [07:58]

In Kibana, go to Dev Tools > Console and paste the command below. It creates an enrich policy, which tells Elasticsearch how to populate incoming documents with data from the source index.

PUT /_enrich/policy/users-policy
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "city", "zip", "state"]
  }
}

Use the command below to execute the policy, which creates an enrich index for it.

POST /_enrich/policy/users-policy/_execute?wait_for_completion=false

After you run it, you should see a result similar to the image below:

[Image: Console result for step 2]

To confirm the enrich index was created successfully, go to Stack Management > Index Management, toggle Include hidden indices on, then reload the indices. You should see a result similar to the image below:

[Image: Index enriched successfully]
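
To make the effect of executing the policy concrete, here is a simplified Python model of it. It is a sketch and not how Elasticsearch implements enrich indices. It builds a lookup table keyed by match_field and keeps only the match field plus the listed enrich_fields. The function name execute_match_policy is hypothetical.

```python
def execute_match_policy(source_docs, match_field, enrich_fields):
    """Build a lookup table keyed by match_field, keeping only the policy's fields."""
    enrich_index = {}
    for doc in source_docs:
        if match_field not in doc:
            continue  # documents without the match field cannot be looked up
        kept = {match_field: doc[match_field]}
        kept.update({f: doc[f] for f in enrich_fields if f in doc})
        enrich_index.setdefault(doc[match_field], []).append(kept)
    return enrich_index


users = [
    {
        "email": "mardy.brown@asciidocsmith.com",
        "first_name": "Mardy",
        "last_name": "Brown",
        "city": "New Orleans",
        "county": "Orleans",
        "state": "LA",
        "zip": 70116,
        "web": "mardy.asciidocsmith.com",
    }
]
enrich_index = execute_match_policy(
    users, "email", ["first_name", "last_name", "city", "zip", "state"]
)
entry = enrich_index["mardy.brown@asciidocsmith.com"][0]
print(sorted(entry))  # ['city', 'email', 'first_name', 'last_name', 'state', 'zip']
```

The county and web fields are absent from the entry because the policy does not list them in enrich_fields.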

Step - 3 Setup An Ingestion Pipeline [11:05]

Create an ingest pipeline with an enrich processor. Use the command below to do so:

PUT /_ingest/pipeline/user_lookup
{
  "processors": [
    {
      "enrich": {
        "description": "Add 'user' data based on 'email'",
        "policy_name": "users-policy",
        "field": "email",
        "target_field": "user",
        "max_matches": "1"
      }
    }
  ]
}

After you run it, you should see a result similar to the image below:

[Image: Console result for step 3]

To confirm the ingest pipeline was created successfully, go to Stack Management > Ingest Pipelines. You should see a result similar to the image below:

[Image: Ingest pipeline successfully created]

Step - 4 Insert Document Using The Ingestion Pipeline [13:16]

Use the ingest pipeline to index a new document. The incoming document should include the field specified in your enrich processor.

PUT /my-index-000001/_doc/my_id?pipeline=user_lookup
{
  "email": "mardy.brown@asciidocsmith.com"
}

After you run it, you should see a result similar to the image below:

[Image: Console result for step 4]

When you perform a GET request, you will see the enriched document:

GET /my-index-000001/_doc/my_id

[Image: Console result for documents]
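
The lookup the enrich processor performs on each incoming document can be sketched in Python as follows. This is a simplified model under stated assumptions, not the processor's real implementation. The enrich index is represented as a plain dictionary, and enrich_processor is a hypothetical name. The field, target_field, and max_matches parameters mirror the pipeline options from Step 3.

```python
import copy


def enrich_processor(doc, enrich_index, field, target_field, max_matches=1):
    """Return a copy of doc with matching enrich data added under target_field."""
    result = copy.deepcopy(doc)
    matches = enrich_index.get(doc.get(field), [])[:max_matches]
    if not matches:
        return result  # no match: the document passes through unchanged
    # A single allowed match yields an object; more than one yields a list.
    result[target_field] = matches[0] if max_matches == 1 else matches
    return result


enrich_index = {
    "mardy.brown@asciidocsmith.com": [
        {
            "email": "mardy.brown@asciidocsmith.com",
            "first_name": "Mardy",
            "last_name": "Brown",
            "city": "New Orleans",
            "zip": 70116,
            "state": "LA",
        }
    ]
}
incoming = {"email": "mardy.brown@asciidocsmith.com"}
enriched = enrich_processor(incoming, enrich_index, field="email", target_field="user")
print(enriched["user"]["first_name"])  # Mardy
```

A document whose email has no entry in the enrich index is returned without a user field.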

Let's insert another document into the source index:

POST /users/_doc
{
  "email": "test@test.com",
  "first_name": "Test",
  "last_name": "Brown",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "web": "mardy.asciidocsmith.com"
}

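[Image: Console result for inserting a new document into the source index]

The enrich index is a snapshot of the source index taken when the policy was executed, so re-run the _execute command from Step 2 to pick up the new document. Then run the command below: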

POST /my-index-000001/_doc?pipeline=user_lookup
{
  "email": "test@test.com"
}

After you run it, you should see a result similar to the image below:

[Image: Console result for using the POST command to insert a document]
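
One detail worth keeping in mind here: an enrich index is a point-in-time copy of the source index, so documents added to the source afterwards are only matched once the policy has been executed again. The toy Python model below illustrates that behaviour. The snapshot function is a hypothetical stand-in for executing the policy.

```python
def snapshot(source_docs, match_field):
    """Copy the source documents into a lookup table, as executing a policy does."""
    return {doc[match_field]: dict(doc) for doc in source_docs}


users = [{"email": "mardy.brown@asciidocsmith.com", "first_name": "Mardy"}]
enrich_index = snapshot(users, "email")  # policy executed once

# A new document is added to the source index after the policy ran.
users.append({"email": "test@test.com", "first_name": "Test"})
found_before = "test@test.com" in enrich_index
print(found_before)  # False: the snapshot does not contain the new document

enrich_index = snapshot(users, "email")  # policy executed again
found_after = "test@test.com" in enrich_index
print(found_after)  # True
```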

How To Enrich Data Based On "Range Value Match" [17:25]

Step - 1 Setup A Source Index [17:33]

In Kibana, go to Dev Tools > Console and paste the command below:

PUT /networks
{
  "mappings": {
    "properties": {
      "range": { "type": "ip_range" },
      "name": { "type": "keyword" },
      "department": { "type": "keyword" }
    }
  }
}

You should get output similar to the image below:

[Image: Console result for setting up an index]

Step - 2 Insert Document Into The Source Index [20:10]

Run the command below to insert a document into the source index you just created:

PUT /networks/_doc/1?refresh=wait_for
{
  "range": "10.100.0.0/16",
  "name": "production",
  "department": "OPS"
}

You should get a result like this:

[Image: Console result for inserting documents into the source index]

Step - 3 Setup An Enrichment Policy [21:20]

Set up the enrichment policy with the command below:

PUT /_enrich/policy/networks-policy
{
  "range": {
    "indices": "networks",
    "match_field": "range",
    "enrich_fields": ["name", "department"]
  }
}

Use the command below to execute the policy, which creates an enrich index for it.

POST /_enrich/policy/networks-policy/_execute?wait_for_completion=false

After you run it, you should see a result similar to the image below:

[Image: Console result for step 3]

To confirm the enrich index was created successfully, go to Stack Management > Index Management, toggle Include hidden indices on, then reload the indices.

Step - 4 Create An Ingestion Pipeline [22:58]

Create an ingest pipeline and add an enrichment processor:

PUT /_ingest/pipeline/networks_lookup
{
  "processors": [
    {
      "enrich": {
        "description": "Add 'network' data based on 'ip'",
        "policy_name": "networks-policy",
        "field": "ip",
        "target_field": "network",
        "max_matches": "10"
      }
    }
  ]
}

Use the ingest pipeline to index a new document. The incoming document should include the field specified in your enrich processor.

PUT /my-index-000001/_doc/my_id?pipeline=networks_lookup
{
  "ip": "10.100.34.1"
}

After you run it, you should see a result similar to the image below:

[Image: Console result for step 4]

When you perform a GET request, you will see the enriched document:

GET /my-index-000001/_doc/my_id

[Image: Console result for documents]
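
The range match used in this section can be illustrated with Python's standard ipaddress module. This is a sketch, not Elasticsearch's implementation. It assumes every source range is written in CIDR notation, as in the example document, and range_match is a hypothetical helper. Because max_matches is greater than 1 in the pipeline above, the matches are returned as a list.

```python
import ipaddress


def range_match(ip, source_docs, match_field, enrich_fields, max_matches=10):
    """Return enrich data for every source document whose CIDR range contains ip."""
    address = ipaddress.ip_address(ip)
    matches = []
    for doc in source_docs:
        if address in ipaddress.ip_network(doc[match_field]):
            kept = {match_field: doc[match_field]}
            kept.update({f: doc[f] for f in enrich_fields})
            matches.append(kept)
    return matches[:max_matches]


networks = [{"range": "10.100.0.0/16", "name": "production", "department": "OPS"}]

network = range_match("10.100.34.1", networks, "range", ["name", "department"])
print(network)
# [{'range': '10.100.0.0/16', 'name': 'production', 'department': 'OPS'}]

outside = range_match("10.200.0.1", networks, "range", ["name", "department"])
print(outside)  # []
```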

How To Enrich Data Based On "Geolocation Match" [25:23]

Step - 1 Setup A Source Index [26:10]

In Kibana, go to Dev Tools > Console and paste the command below:

PUT /postal_codes
{
  "mappings": {
    "properties": {
      "location": { "type": "geo_shape" },
      "postal_code": { "type": "keyword" }
    }
  }
}

You should get output similar to the image below:

[Image: Console result for setting up an index]

Step - 2 Insert Document Into The Source Index [27:00]

Index the enrich data into the source index using the command below:

PUT /postal_codes/_doc/1?refresh=wait_for
{
  "location": {
    "type": "envelope",
    "coordinates": [[13.0, 53.0], [14.0, 52.0]]
  },
  "postal_code": "96598"
}

You should get a result like this:

[Image: Console result for inserting documents into the source index]

Step - 3 Setup An Enrichment Policy [27:38]

Set up the enrichment policy with the command below:

PUT /_enrich/policy/postal_policy
{
  "geo_match": {
    "indices": "postal_codes",
    "match_field": "location",
    "enrich_fields": ["location", "postal_code"]
  }
}

Use the command below to execute the policy, which creates an enrich index for it.

POST /_enrich/policy/postal_policy/_execute?wait_for_completion=false

After you run it, you should see a result similar to the image below:

[Image: Console result for step 3]

To confirm the enrich index was created successfully, go to Stack Management > Index Management, toggle Include hidden indices on, then reload the indices.

Step - 4 Create An Ingestion Pipeline [28:51]

Create an ingest pipeline and add an enrichment processor:

PUT /_ingest/pipeline/postal_lookup
{
  "processors": [
    {
      "enrich": {
        "description": "Add 'geo_data' based on 'geo_location'",
        "policy_name": "postal_policy",
        "field": "geo_location",
        "target_field": "geo_data",
        "shape_relation": "INTERSECTS"
      }
    }
  ]
}

Use the ingest pipeline to index a new document. The incoming document should include the field specified in your enrich processor.

PUT /users2/_doc/0?pipeline=postal_lookup
{
  "first_name": "Mardy",
  "last_name": "Brown",
  "geo_location": "POINT (13.5 52.5)"
}

After you run it, you should see a result similar to the image below:

[Image: Console result for step 4]

When you search the index, you will see the enriched document:

GET /users2/_search

[Image: Console result for documents]
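
To see why the point in this example matches the postal code, here is a small Python sketch of the INTERSECTS check for the simplest case, a point against an envelope. It assumes the envelope is given as [[min_lon, max_lat], [max_lon, min_lat]] (top-left corner, then bottom-right corner) and that the point is WKT in longitude-latitude order. Both function names are hypothetical, and real geo_shape matching covers far more shape types than this.

```python
import re


def parse_wkt_point(wkt):
    """Parse 'POINT (lon lat)' into a (lon, lat) tuple of floats."""
    pattern = r"\s*POINT\s*\(\s*([-+.\d]+)\s+([-+.\d]+)\s*\)\s*"
    match = re.fullmatch(pattern, wkt, re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def point_intersects_envelope(point, envelope):
    """Check a (lon, lat) point against [[min_lon, max_lat], [max_lon, min_lat]]."""
    lon, lat = point
    (min_lon, max_lat), (max_lon, min_lat) = envelope
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


envelope = [[13.0, 53.0], [14.0, 52.0]]  # "location" of postal code 96598
point = parse_wkt_point("POINT (13.5 52.5)")  # "geo_location" of the incoming document

print(point)                                       # (13.5, 52.5)
print(point_intersects_envelope(point, envelope))  # True
```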

If you need any assistance, email us through our Contact Form.