Detecting & Extracting Data

In Gnatta, data can be detected and extracted from any body of text using regular expressions ('RegEx') in the ‘Find Text’ action. This has a huge range of applications in a contact centre. For example, suppose your customer has sent you this message:

Received at 2.39pm: Hi! My name is john. I don't remember my order number, but it's not arrived. I placed the order with john.doe@gmail.com. Please can you find out what's happening?

It’d be really useful if Gnatta could automatically extract John’s email, and store it in a custom data field ready for your agent, right? Well, that’s exactly what this article is going to explore!

What is Regex?

Regexes are used to look for patterns in text. That means you can use them to identify and store any key data you might need. This is a Regex that will search for an email address: \b\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*\b
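To make that concrete, here is a minimal sketch in Python that runs the email regex above against the example customer message. It uses Python's standard re module purely as an illustration; it is not how Gnatta runs the ‘Find Text’ action internally, and the variable names are our own.

```python
import re

# The email pattern quoted above, written as a Python raw string.
EMAIL_PATTERN = re.compile(r"\b\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*\b")

message = (
    "Received at 2.39pm: Hi! My name is john. I don't remember my order "
    "number, but it's not arrived. I placed the order with "
    "john.doe@gmail.com. Please can you find out what's happening?"
)

# search() scans the whole message and stops at the first match.
match = EMAIL_PATTERN.search(message)
email = match.group(0) if match else ""

print(email)  # john.doe@gmail.com
```

Note that the full stop at the end of the sentence is not included in the result, because the pattern requires the address to end on a word character.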

There are two typical uses for this in Gnatta:

  1. Detecting keywords to understand reason for contact (loss, damage, delivery etc.) so you can route appropriately. We have a flow template for that here.

  2. Finding data you’ll need to resolve the query or identify the customer. Common examples include:

    1. email addresses

    2. order numbers

    3. customer ID numbers

    4. phone numbers

    5. postcodes

There’s no limit to what you can regex!
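As a rough illustration of the first use (detecting keywords to understand the reason for contact), the sketch below checks a message against a few keyword patterns. The reasons, keywords and the detect_reason function are hypothetical examples written in Python, not part of Gnatta or its flow template.

```python
import re

# Hypothetical contact reasons and keyword patterns; adjust to your own queues.
REASON_PATTERNS = {
    "damage": re.compile(r"\b(damaged?|broken|faulty)\b", re.IGNORECASE),
    "delivery": re.compile(r"\b(deliver(y|ed)?|arrived?|courier)\b", re.IGNORECASE),
    "loss": re.compile(r"\b(lost|missing)\b", re.IGNORECASE),
}


def detect_reason(text: str) -> str:
    """Return the first reason whose pattern appears in the text, or "" if none do."""
    for reason, pattern in REASON_PATTERNS.items():
        if pattern.search(text):
            return reason
    return ""


print(detect_reason("My order has not arrived"))  # delivery
print(detect_reason("The box was broken"))        # damage
```

The \b markers stop a keyword from matching inside a longer word, and re.IGNORECASE means "Broken" and "broken" are treated the same.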

Getting Started

First, head to the builder in your domain. Create a new action with the ‘+’ button on the map, then select Find Text. If you have your Regex to hand, paste it into the RegEx field. We have a selection of common Regexes stored in templates - like this one, which will search for an email address: \b\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*\b

 

[Image: Create a ‘Find Text’ action and paste in your Regex!]

If you need to find a custom string of text (for example, if your order numbers look like ‘SOX-3253’), you’ll need to write a custom regex to extract that exact pattern. Writing a custom regex can be tricky, so we suggest using a third-party tool like ChatGPT to help you write it.
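As a sketch of what such a custom regex might look like, the Python example below assumes that every order number is the prefix SOX- followed by exactly four digits. That format is only a guess based on the single example ‘SOX-3253’, so adjust the pattern to your real order numbers. The find_order_number function is a hypothetical helper for illustration.

```python
import re

# Assumed format: the literal prefix "SOX-" followed by exactly four digits.
ORDER_PATTERN = re.compile(r"\bSOX-\d{4}\b")


def find_order_number(text: str) -> str:
    """Return the first order number found in the text, or "" if there is none."""
    match = ORDER_PATTERN.search(text)
    return match.group(0) if match else ""


print(find_order_number("My order SOX-3253 hasn't arrived"))  # SOX-3253
```

The trailing \b means a longer number such as SOX-32534 is not accepted as a match, which helps avoid storing a truncated order number.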

Regexes can look scary and complex - for the most part, they’ll look like gibberish! The best way to validate a regex is to test it against some sample text in a tool like Regex101: https://regex101.com/

[Image: Testing Regex in Regex101]

Source

The source tells Gnatta where to look for this regex. The source you need depends on what you’re searching for, but in most cases you’ll want to use Echo.Body (which will look through the customer's message). You can search in any string of text Gnatta has stored, which you can select in the contextual data explorer under ‘Explore’.

Output

So, we’ve defined the Regex, and we’ve told Gnatta where we want it to look. The last thing we need to do is set an output for it - so you can store what you find, and use that extracted piece of information later in your flow.

If Gnatta finds a match for your Regex in the search, it’ll store it in the Output. If it doesn’t find a match, the Output will be blank. You can then use this Output field to insert your findings elsewhere, such as in an automated response or a custom data field.

Outputs can be given any name. We suggest using a naming convention that’s standard and easy to remember. We usually follow a format like this:

output.OrderNumber (if we’re storing an order number)

output.EmailAddress (if we’re storing an email address)

So, here’s our full ‘Find Text’ action, ready to go:

Regex: Looking for an email address

Source: Echo.Body (the message the customer sent us)

Output: output.EmailAddress

You can now use output.EmailAddress to get Gnatta to insert the email address anywhere else in the flow! The most common use case is to add this to an Email Address custom field in your interaction, so the advisor doesn’t need to go looking for it. More on that in this article.
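If it helps to picture how the three settings fit together, the sketch below models them in Python. It is only an analogy: the find_text function and the flow_data dictionary are hypothetical stand-ins, and we are assuming the action keeps the first match it finds. The blank result when nothing matches mirrors the behaviour described under Output.

```python
import re

EMAIL_REGEX = r"\b\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*\b"


def find_text(regex: str, source_text: str) -> str:
    """Return the first match of regex in source_text, or "" when nothing matches."""
    match = re.search(regex, source_text)
    return match.group(0) if match else ""


# A plain dict standing in for the data available to the flow (hypothetical).
flow_data = {"Echo.Body": "I placed the order with john.doe@gmail.com."}

# Regex + Source -> Output
flow_data["output.EmailAddress"] = find_text(EMAIL_REGEX, flow_data["Echo.Body"])

print(flow_data["output.EmailAddress"])  # john.doe@gmail.com
```

Because a missing match produces an empty string rather than an error, later steps can check whether the output is blank before using it.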

Groups (Advanced)

Groups in regex allow you to capture parts of the matched text for further processing or extraction. Here's an explanation of groups and how to use them. We’d suggest not using groups if you’re new to regex, but if you read through and can see value in them, we can help you set them up.

What Are Groups Good For?

Groups are good for either pulling out numerous pieces of key data from one regex, or for taking part of a matched regex and focussing on that. Some examples would be:

  • Splitting out dates by day/week/month/year to be able to use those in autoresponses later.

  • Matching part of a unique identifier (order numbers, email addresses, names).

Using Groups

Groups are portions of a regex pattern enclosed in parentheses (). They serve two main purposes:

  1. Capturing: They capture the text matched by the portion of the regex inside the parentheses.

  2. Organising: They help organise and manage complex patterns by treating parts of the regex as a single unit.


Basic Grouping

To create a group, simply enclose part of your regex in parentheses. For example:

  • Regex: (cat)

  • This matches the word "cat" and captures it as a group.

You can reference captured groups later in your pattern or use them in replacements.

Example

Consider the regex to match and capture parts of a date in the format dd/mm/yyyy:

  • Regex: (\d{2})/(\d{2})/(\d{4})

  • Here, (\d{2}) captures the day, (\d{2}) captures the month, and (\d{4}) captures the year.

If you apply this regex to the string "15/08/2024", the captured groups are:

  • Group 1: "15"

  • Group 2: "08"

  • Group 3: "2024"
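The same example can be tried out in Python, shown here as a sketch using the standard re module (Gnatta's own regex engine may behave slightly differently). It reads the three groups from a sample sentence of our own, then uses them in a replacement to rewrite the date in year-month-day order.

```python
import re

# The date pattern from the example: day, month and year as groups 1, 2 and 3.
DATE_PATTERN = re.compile(r"(\d{2})/(\d{2})/(\d{4})")

text = "Delivery was promised for 15/08/2024."

match = DATE_PATTERN.search(text)
if match:
    day, month, year = match.groups()
else:
    day = month = year = ""

# In a replacement, \1, \2 and \3 refer back to the captured groups,
# so the date can be rewritten in year-month-day order.
rewritten = DATE_PATTERN.sub(r"\3-\2-\1", text)

print(day, month, year)  # 15 08 2024
print(rewritten)         # Delivery was promised for 2024-08-15.
```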

Named Groups

Some languages and regex flavours support named groups, which can make your patterns more readable and your code easier to maintain. A named group gives each captured part a label (such as day or year) rather than a number.
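As an illustration, here is the date pattern rewritten with named groups in Python. The syntax is flavour-specific: Python writes a named group as (?P<name>...), while several other flavours use (?<name>...), so check which form your tool accepts before relying on it. The group names day, month and year are our own choice.

```python
import re

# Python spells a named group (?P<name>...); some other flavours use (?<name>...).
DATE_PATTERN = re.compile(r"(?P<day>\d{2})/(?P<month>\d{2})/(?P<year>\d{4})")

match = DATE_PATTERN.search("15/08/2024")

# groupdict() returns every named group as a name -> text mapping.
parts = match.groupdict() if match else {}

print(parts["day"], parts["month"], parts["year"])  # 15 08 2024
```

Reading parts["year"] is easier to follow than remembering that the year happens to be group 3, especially once a pattern grows.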

Non-Capturing Groups

Sometimes you want to group parts of your regex without capturing them. You can use (?:...) to create a non-capturing group. For example:

  • Regex: (?:cat|dog)

  • This matches either "cat" or "dog" without capturing the match.
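The difference is easiest to see side by side. In this Python sketch (sample text and variable names are our own), both patterns match the same word, but only the capturing version stores it as a group. The last line shows a typical use: grouping alternatives without capturing them, so that the only captured group is the number you care about.

```python
import re

text = "I have a dog"

capturing = re.search(r"(cat|dog)", text)
non_capturing = re.search(r"(?:cat|dog)", text)

print(capturing.group(0), capturing.groups())          # dog ('dog',)
print(non_capturing.group(0), non_capturing.groups())  # dog ()

# Group the alternatives without capturing them; group 1 is just the digits.
reference = re.search(r"(?:order|ref) (\d+)", "ref 4821")
print(reference.group(1))  # 4821
```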

Summary

Groups in regular expressions allow you to capture specific parts of the matched text and organise your patterns better. By using parentheses (), you can create capturing groups, which are useful for extracting data and structuring complex patterns. Named groups and non-capturing groups offer additional flexibility and readability in your regex patterns.
