Garreth Hanley:
This is Excel Tips, a podcast for accounting and finance professionals, brought to you by CPA Australia.
Neale Blackwood:
Welcome to the Excel Tips podcast. My name is Neale Blackwood, and in this episode, we're going to look at three new functions that have been added to Excel. These functions are based on what are called Regular Expressions, which is usually abbreviated to REGEX.
So the idea behind Regular Expressions is that when you're working with messy data, which out in the real world you can often have some messy text, you need ways to identify patterns within the text. And so REGEX patterns allow you to identify types of text within text.
Now, if you've got structured data, Excel has TEXTBEFORE and TEXTAFTER and TEXTSPLIT. So those three functions work really well with structured data. But if you have to handle messy data, you can do that now with these REGEX functions. So there's three of them as well. Each of them has the REGEX, so R-E-G-E-X, prefix, and there's REGEXEXTRACT, REGEXTEST, and REGEXREPLACE.
So the REGEXEXTRACT enables you to use a pattern to extract some text from other text. In the example video, which I highly recommend you check out because it's got examples of all of these, we can extract the email address from a sentence no matter where the email appears in the sentence.
The REGEXTEST function will return true or false depending on whether the pattern is found within the text. And then the REGEXREPLACE allows you to replace a pattern with some other text. That one, I can't really see a lot of use for, but it's there if you need it.
Now, the patterns that you use have to be enclosed in quotation marks, and they use a whole lot of different codes to represent characters. I won't go into them because explaining them won't make a lot of sense.
The video goes through them. So suffice to say that you need to build up the pattern using the codes surrounded by the quotation marks and that allows you then to identify the specific pattern within a text string.
So you can use patterns to identify words, numbers, dates, and as I mentioned, email addresses. And the example in the video uses the email pattern for a number of its examples.
Now, one concept you need to get your head around is something called white space. So when you're working with data that comes from systems or data that's coming from the internet, so you'll often find that there's extra characters that aren't standard. So the space character is pretty straightforward, but there are a number of other characters that can be used to separate text.
And that can be the tab, could be a line feed, a form feed, a carriage return. All of these different characters can split up text. And white space is the terminology that they use to describe all of those characters. And you can identify the white space within the pattern.
Once you start using these REGEX patterns, consider creating a centralised table that you can capture the patterns. The patterns can get quite long, so it can be a good idea to capture those so you can reuse them. Just as a heads up, AI is really good at creating REGEX patterns.
REGEX patterns have been around for a long time. So programmers have been using them for decades, but they've only just come to Excel. So there's lots of internet resources to help you build REGEX patterns.
So all three REGEX functions within Excel default to being case sensitive, okay? So uppercase and lowercase. Now, each of them has an option that you can change so that if you don't want it to be case sensitive, you can just change that setting and it will be case insensitive.
So as well as the three new functions, Excel has updated two existing functions to use REGEX patterns. They are XLOOKUP and XMATCH. Both of those can take a pattern as its first argument, and then under the MATCH mode argument, you can use the number three to specify a REGEX match.
So what that allows you to do is to look something up based on a pattern rather than an exact sort of match. So that's in XLOOKUP and XMATCH.
Now, as I mentioned, the REGEXTEST function returns true or false when a pattern is found, and that's really useful to use with the FILTER function. So in the companion video, I actually showed you how you can use the FILTER function to look through text strings and identify when a text string has an email address in that text string.
And the follow on from that was to show how you can use then the hyperlink function to create a link that when you click on it, it'll actually open your email client and allow you to create an email based on that email address.
So data can be messy and the REGEX functions provides the tools to enable you to handle that messy text and make some sense of it. Hope you found that useful. Thanks for listening.
Jackie Blondell:
If you're enjoying this podcast, you should check out our in-depth business and finance show, INTHEBLACK. Search for INTHEBLACK on your favourite podcast app today.