Garreth Hanley:
This is Excel Tips, a podcast for accounting and finance professionals brought to you by CPA Australia.
Neale Blackwood:
Welcome to the Excel Tips Podcast. My name is Neale Blackwood. And in this episode we're going to be looking at the final part in a three-part series on Python in Excel. So Python in Excel is a new feature that allows you to use Python code. So Python is a full-blown programming language. And you can write Python code in a cell within Excel. Now, in part one, I introduced the terminology and also the changes that have been made to incorporate Python in Excel.
Part two covered some of the things that you can do with an external library that's loaded called the Pandas library, and I'll discuss more about that in this episode as well. So the Pandas library handles data within Python. So tables, lists, those sort of things. And the Pandas library is automatically loaded when Python loads in Excel.
What I wanted to focus on in this episode is some of the other things that you can do with Python. Now, if you want to see any of this and how it works in practice, check out the companion video which covers all of this and goes through step-by-step in how you can build the solutions. So the first thing I wanted to cover is the fact that Python can interact with a connection-only Power Query.
So Power Query is the best practice way to import data into Excel. And Power Query can either bring the data into a table within Excel, so you can actually see the data in the spreadsheet. Or it can use what's called a connection-only query, which still imports the data, but the data is held in memory. Now, this data can also be loaded into what's called the data model, but I'm not going to be discussing that.
So for a connection-only query, you can't really get to that with an Excel formula. It's very difficult. But Python can extract the data from a connection-only query, and it's very easy. It uses the xl() function. And the XL is the letter X and the letter L(). And then, you can put in the query name in quotation marks. As soon as you type the quotation marks, it'll actually list all of the queries and all of the tables that you can access with that XL function.
So you can bring the query in. Now, one of the advantages with using the Pandas library is that there are some built-in ways to preview that query. So to see the data you can use, there's a head() function. So that shows you the top five rows. There's a tail() function that shows you the last five rows. And there's also a sample() function.
And with the sample() function you can specify how many rows to preview. And this just brings in random rows, so you can get a feel for what the data looks like. With the head() and the tail() function as well, the default is five rows, but you can enter a number to specify how many rows to show. So the Pandas functionality also allows you to filter the data. So in the companion video, I filtered the connection-only query using some code.
There's another library that's very useful for data visualisation. It's called Matplotlib. It's a fairly common library, and this is automatically loaded when Python is loaded in Excel. In the companion video, I created a horizontal bar chart based on the filtered list that had been created earlier in three lines of code. So three lines of code created a horizontal bar chart that you could move around the spreadsheet, and it was dynamic.
Something else that's dynamic and is an advantage of Python over Power Query is that Python through the Pandas functionality has the ability to do some data cleansing which, again, Power Query can do. But the beauty with Python is that you don't need a refresh to do it. So I use an example where there's horizontal data. So this is data that's entered column by column going across the page. Each column represents another day.
So this type of layout is really easy to input data, but it's difficult to extract data from that type of layout. So what you can do is you can convert the horizontal data into vertical data. Now in Power Query, that's called Unpivot. So instead of having Y data, which as I mentioned is hard to work with, you can have long data, which is also called Normalised Data, which is a lot easier to work with in things like pivot tables.
Python has the melt() function, and that melt() function is designed to take horizontal data and turn it into vertical data. There are options to fill down as well. So we fixed one of the problems that we had a branch name, but that wasn't filled down in the column. So there is an ffill() functionality in Pandas that allows you to fill entries down. There was also a text-based date that you can convert to a proper date, and this is all within the Python code.
There was about six lines of Python code that converted the horizontal data into the vertical data. And as I demonstrated in the video, it is automatically updated. There is no refresh required which, again, Power Query, if the data changes, Power Query requires a refresh to update the data.
Okay, so just a reminder how Python works. Python is entered in a cell. So when you press Control Enter, which is how you enter the code, what happens is that code is processed through the internet in an Azure, which is an online database container that runs the code. The result is returned to the cell. So you're actually working across the internet for the processing.
Now, that has one advantage is that everyone's using the same version of Python. The disadvantage is you are using some web-based resource, and it's Microsoft's resource. And so, they are providing a certain amount of what they call Premium Python Speed. Every month when you start using Python, you have access to a certain amount of this Premium Python Speed. If you use a lot of resources, then you will eventually hit a limitation, and you'll drop down to what's called Standard Python.
So that's just a bit slower. I don't know. I hadn't really compared speed, but it's a slower calculation time. Now, there is also another limitation, and it's sort of like a reasonable use limitation. If you use too much Python, there is the possibility that you could end up being blocked. And so, that's something to consider. So that's all within the free use of Python and within the standard normal Excel subscription.
You can purchase an add-in for Python that allows you to have the premium speed for longer. So it's not premium speed for the whole month no matter how much you use it. So I still think there is some sort of limitation there. There isn't much on the net about how the speed is calculated or how the resource is measured. So it's still all fairly new. But there is the possibility to have longer access to the Premium Python with a fee-based add-in.
So have a play with Python. It's a fairly safe environment. Python also works with the web-based version of Excel. There are lots of resources for Python on the internet. Python is a full-blown programming language that has been around for a very long time, and so there's lots of free resources for Python on the net.
Think about the fact that the speed will be impacted if you use a lot of Python in a month. Just realise that. And if you are using a lot of Python, it might be worthwhile investing in the add-in to keep the premium speed going for longer. I hope you've enjoyed learning about Python in Excel, and thanks for listening.
Garreth Hanley:
If you're enjoying this podcast, you should check out our in-depth business and finance show, In The Black. Search for In The Black on your favourite podcast app today.