Garreth Hanley:
This is Excel Tips, a podcast for accounting and finance professionals, brought to you by CPA Australia.
Neale Blackwood:
Welcome to the Excel Tips podcast. My name is Neale Blackwood and in this episode we are looking at Part Two of a three-part series on Python in Excel. Now, Part One introduced the idea, this is brand-new in Excel. You have the ability to write Python code in a cell in Excel, and this episode is going to cover a little bit more detail and actually give you a practical application of using Python in Excel. Now, when you open up Excel, Python is already loaded. So when you click on the Formulas tab, the Python icon is there and there's an initialization icon as well.
Now, if you click that, you'll see that there are a number of external libraries already installed. If you've worked with Python as the programming language, you'll know that you need to load up the libraries that you need to use. There are a number of common libraries automatically installed for use in Excel. They are the NumPy Library, which handles the scientific and mathematical computing, so you have a few more options than multiply and divide. The Pandas library.
Now, Pandas is probably the most famous Python library and it has a lot of functionality associated with data, so lists, tables in Pandas.
Now, Pandas stands for something, so have a guess at what it stands for. I'll let you know at the end of the podcast. But Pandas are designed to work with tables. Now, they don't call them tables in Pandas, they're called data frames. Excel, we call them tables, but Pandas call them data frames, so I'll probably use those two terms interchangeably during the podcast. And so the example that I'm going to talk about today actually uses the Panda functionality.
Now, when they're loaded, they're actually given a variable name, so NumPy library you can access using NP, and the Pandas library you can access using PD. So it just basically shortens the name for you. A few other of the libraries that are loaded are the Matplotlib and the Seaborne libraries. Now, both of those are associated with data visualisation. Part Three will actually cover some of that.
There's also the stats model and a couple of other ones in Excel and a warnings one, so all of those are loaded. If you click on the initialization icon in the Formulas tab under Python, you will see a separate task pane open on the right-hand side, and that lists all of the libraries with their shortcut name. So as I mentioned, Pandas, we can access via PD.
Okay, so the example that's in the companion video. Now, if you want to see how this all works, check out the companion video on the INTHEBLACK website, and that shows you all the code, goes through it. I just want to talk about sort of high level what we're going to be doing and what's now possible using Python in Excel.
So I've got two tables. I've got an order table, so that's got an order number, a product code, and a quantity. And then I've got a products table and that has a code whether or not we charge GST, and also the unit price. Now, they're two separate tables. They are formatted tables, so they've been named, and we want to merge them together. Now, Power Query in Excel, very powerful. It can do table mergers. Well, so can Python, and you can do that in a cell in Excel. Now, just a reminder for Python, if you want to enter Python code, a quick way is to type equals PY, and then hit the tab key.
And when you hit enter, when you are editing in the Python code, it will give you a new line, which is different to Excel's normal functions. And also different is when you want to accept the final code, you press control, enter hold control, and press enter. There's a slight difference there when you are entering Python code as opposed to Excel formulas and functions.
Now, also, when you are working with Python, you can work in the formula bar. And by the way, the keyboard shortcut to open up the formula bar is control shift and the letter U. When you are working in Python, you do tend to need to make your formula bar a little bit higher to see the code, but there is also an editor. So under the Formulas tab under Python, there's an editor icon and that opens up a task pane on the right-hand side so you can actually edit your code in that editor. I must admit, I haven't made that switch yet. I still use the formula bar. I've been using it for so long, I'm quite used to it, but there is a separate editor if you want to use that as well.
Now, one cool thing about Python is it does recognise Excel's formatted tables. So even though the code looks quite long, a lot of it is actually entered by Python and you don't have to type it all. There's not a lot of typing required. Basically, you can just use your mouse to select the table, and Python will enter all of the code that it needs to refer to that table. And basically we just need to type three lines of code. And those three lines will merge the products table with the order table and create a new table that now has five columns in it, and we can use that to then figure out our sales based on the quantity times the unit price, and also figure out whether or not we charge GST.
So to do all of that takes another five lines of code. So in the final table, we end up with a single table that has all of the order information as well as the sales column, as well as the GST column calculated. So Pandas are powerful. As I said, they're built to handle lists and also what it calls data frames, which to excel people is a table. One of the things that you get used to with Python is the fact that they use a fairly standard way to refer to a data frame. They use the variable DF, short for data frame obviously.
Some things to keep in mind when you are coding in Python. Python code tends to be all in lower case, and so variable names and a lot of the functions are all in lower case. Some words you will notice have some capitalization, but most are in lower case. And if you want to refer to, let's say, a two-word variable, and in the example I use GST amount, you separate the two words with the underscore character. It's actually called Snake Case. So they're all in lower case, it was GST_amount.
Now, something else that Python recommends is that you don't use a lot of abbreviation. Now, I know that GST is abbreviated, but rather than use, let's say AMT as short for amount, they would prefer to use the word amount. So the variables should be as descriptive as possible and try and avoid abbreviation. Obviously, sometimes you have to use it. Now, Python is case-sensitive, so as I mentioned, most things are entered in lower case.
Now, one thing that was unusual in the code that I did share, or it's not unusual in Python, but it is unusual in Excel, is when you are comparing one thing to another and seeing if one thing equals another. Now, in Excel we just use the equal sign, but in Python code you use two equal signs together to do that. Now, when you use one equal sign in Python, it basically means something equals something else, so a variable equals something else. You're setting a variable. If you need to do comparison to see if something does equal something else, then you need to use two equal signs and that returns true or false. So just a heads-up there, that's something that is a little bit different between Excel and Python.
So once you create a table or a data frame in Python, when you hit control enter, it won't actually display. It'll have PY and then followed by the word data frame, so it won't display automatically. So what you have to do is click the little icon on the left of the PY that defines the Python cell and click Excel value, and that will display the table. Now, once you've displayed the table, you can then use that table or data frame in another Excel calculation. So, so far I've just been talking about Python code, but we can go back to good old Excel. And once you've got a table, you can refer to that table using the same symbol as referring to a dynamic array. So dynamic arrays have spill ranges, so there's a single formula that spills down and across or can spill down and across.
To refer to that spill range, you refer to the top left cell and you put the hash after it. So it might be something like A 1 hash. Now, that's the same symbol that will refer to a Python table that's spilling, so a data frame, you just refer to the top left cell and use the hash after the reference, and then you can do all your Excel calculations on that Python table. So in the next episode, Part Three of our Python series, we're going to look at some data visualisation using the libraries that were installed.
Oh, by the way, Pandas. Okay, so Pandas stands for panel data, but Pandas sounds a little bit more exotic than panel data. Thanks for listening. Hope you've enjoyed learning about Python in Excel.
Garreth Hanley:
If you're enjoying this podcast, you should check out our in-depth business and finance show INTHEBLACK. Search for INTHEBLACK on your favourite podcast app today.