Neale Blackwood:
Welcome to the Excel Tips podcast. My name is Neale Blackwood, and in this episode we're going to look at two brand new functions in Excel which may change the way you build reports. You do need the subscription version of Excel to access these new functions.
Pivot tables have been around in Excel for a long time, and they allow you to summarise a large data set without any formulas. So you can take a data table and convert it into a summary report. The downside with pivot tables is that they need to be refreshed, so if the data changes, your pivot table won't change until you refresh the pivot table, which tends to be a manual process.
Microsoft have just released two brand new functions, group by and pivot by, which allow you to mimic what a pivot table can do. You can take a data set and use one single formula and summarise that whole data set into a two-dimensional report, and you don't need to refresh because it's a formula, and so it will automatically update when the data changes.
Group by will produce a row based report. Pivot by allows you to have rows and columns in your report so that you can, as I mentioned, mimic what a pivot table looks like. So you will be able to replace some of your pivot tables with group by and pivot by.
It is best practise to use a formatted table as a data source for your pivot tables and for the group by and the pivot by. Formatted tables have been covered in a separate podcast as well as the companion articles. The group by function has eight arguments, but only three of them are required.
So you provide the column that you want to report on, the second argument is the column that has the values that you want to summarise, and then the third argument is the function that you want to use to do the calculation, which 99 times out of a hundred is probably going to be the sum function. There is another function that you will probably want to use as well, which I'll get to a little later.
You are allowed to include subtotals as well, so the report can have multiple columns. So to do multiple columns you do need to use other functions to combine the columns. If the columns are next to each other, it's fairly easy to include them. If they're not next to each other in the data set, then there's two functions that allow you to combine columns. One is HSTACK and the other is.
They have advantages and disadvantages. The advantage with HSTACK is that you do specify the columns that you want to bring in. You don't actually type in the name of the column, you type in the number function, the index number of the column. Now the problem with that is that if columns get inserted into your data set, then that can throw out the choose column, so I prefer the HSTACK because you actually specify the column name.
Now again, I am assuming that you're using formatted tables as the data source, and so that allows you to specify the column name using what are called structured references in Excel. A structured reference is basically the table name followed by an opening square bracket with the column name, and then a closing square bracket. So that's a structured reference that refers to a column in a formatted table.
The beauty with formatted tables and the reason that it's best practise to use them as the data source is that formatted tables automatically expand as data is added to them, which means that once you add the data to the table, the report, the group by report is ready to go, you don't have to refresh it.
Now, when you add subtotals to your report, you might want to format the subtotal and the total rows slightly differently. And if you check out the companion video, I take you through a conditional format that will allow you to format those rows differently. So the example uses a bold format with a top and a bottom line so that they are differentiated from the rest of the report.
You can also filter and sort in group by, so the filtering and sorting allows you to get the layout that you're after. The sorting is based on alphanumeric or values. The filtering is sort of a standard filtering where you can check if a column equals something, for example. There's also a more complex filter in the companion video, which allows you to use a slicer to actually do the filtering in your report. Just like a pivot table, you can set it up so that a slicer can do the filtering.
With the sorting, you can sort in ascending order or descending order, and the way you do that is you provide the column number that you want to sort by. If you use the positive number, so let's say two, then that's going to sort the second column in ascending order. If you put minus two, it will sort that column in descending order. So you can control the order by putting a negative in there, and that switches it to a descending order.
There is another new function that's been added that I think will prove popular. It's called percent of, and the idea is that you can include the sum function as one of your calculations, and then you can include the percent of, and what that does is that gives you the percentage of total, so you can return the value as well as the percentage of the total. So percent of is a new function that works really well with group by and pivot by.
Check out the companion video as well because I do show you a way to simplify using both of those together. You can actually set them up using a range name to do those, otherwise you've got to use the HSTACK function to combine the sum and the percent of to include that in the report.
Now, when you do use multiple calculations in your report, you will get headings in the report. To remove those headings, which will have the word sum and percent of, or whatever calculation you're doing, you can use the drop function. So the drop function allows you to remove the first row fairly easily. So again, check out the companion video to see how you can do that.
Now, as I mentioned, pivot by has 11 arguments, and pretty much all of the stuff that you do in a group by you can do in a pivot by. So pivot by has the added argument of columns, so you can design your report to have rows and columns. In the group by what you see is basically just a row listing of your entries and then different calculations going across the page.
With pivot by, you can summarise multiple columns, one of the columns can be a row based, and then the other one can be going across the page in columns. And again, if you want to do multiple columns, you can use the HSTACK function to combine those.
Now dates. Dates are a little bit tricky when you are working with group by and pivot by. Unfortunately, if you sort of summarise your data by, let's say, month name, then it is sorted in alphanumeric order, so your April and August are at the top, which is not what you want.
So I've shown in the companion video how you can use the text function to split up your dates so that they do work as you're expecting, so for a financial year you'll have July through to June. So again, check out the companion video to see how to use the text function to get your dates in the proper order.
Now, pivot tables have the advantage of allowing you to drag and drop to change the pivot table. Now, you can't really do that with the group by or pivot by, but you can allow the user to change the report and create flexible reports if you use the indirect function.
Now, the indirect function does split the Excel community, some people don't like it, I'm a big fan of it. It is my function of last resort, so I will try to find solutions without indirect, but sometimes indirect is the only way to do something, and this is one of those examples. What we can do is we can have the user use a drop down to select the columns that they want to include in the report.
We can then build the structured reference that I mentioned earlier, which is the table name, and then square brackets around the column name, we can build that using our text string formulas. And then once we've got that, we can pass that structured reference as a text string to the indirect function.
Now, it's actually the worst named function in Excel because it does not tell you what it does. What the indirect function does is it converts everything between the parentheses into either a cell reference, a range reference, a structured reference, which is how you refer to table names, or range names. You can set up a structure that will allow the user to select the column, and that will flow into THE pivot by or group by, by using the indirect function. So again, check out the companion video, it goes through in a lot of detail in how to do that.
Now, when you're building your reports, you may want to do some extra calculations. So the example in the companion video and article is doing a margin percentage. Now, you can't really do calculations within the group by and the pivot by, but what you can do is capture the reports so you could bring in the margin and bring in the sales, and then do a calculation based on that report.
So the let function, which I've covered in previous podcasts, allows you to use variables in a formula, and those variables, for example, can capture that whole group by or pivot by report. And once you've got that report, then you could do calculations with it.
In the example, we take a group by report that's bringing in the margin and the sales, and what we do is then we do a margin calculation, so we can refer to various columns inside the report and then create the calculation for the margin percentage. And once you've got that, you can just combine the two reports together using the HSTACK function, and so you get a final report which has the sales, the margin, and the margin percentage.
So the group by and pivot by functions do offer a lot of solutions to reporting, they are single cell formulas that they spill down and across, and they allow you to create two-dimensional reports that normally you would create with pivot tables, but now you can create with a single formula. I hope you found that useful, thanks for listening.