As product owners or developers, we probably have a good handle on which core assets we need to make a website work. But rarely is that the whole picture. How well do we know every last thing that loads on our sites?
An occasional web performance audit, done by hand, does make us aware of every last thing. What’s so great about that? Well, for starters, the process increases our mindfulness of what we are actually asking of our users. Furthermore, a bit of spreadsheet wizardry lets us shape our findings in a way that has more meaning for stakeholders. It allows us to speak to our web performance in terms of purpose, like so:
Want to be able to make something like that? Follow along.
Wait, don’t we have computers for this sort of thing?
A manual audit may seem like pointless drudgery. Why do this by hand? Can’t we automate this somehow?
That’s the whole point. We want to achieve mindfulness—not automate everything away. When we take the time to consider each and every thing that loads on a page, we get a truer picture of our work.
It takes a human mind to look at every asset on a page and assign it a purpose. This in turn allows us to shape our data in such a way that it means something to people who don’t know what acronyms like CSS or WOFF mean. Besides, who doesn’t like a nice pie chart?
Here’s the process, step by step:
- Get your performance data in a malleable format.
- Extract the information necessary.
- Go item by item, assigning each asset request a purpose.
- Calculate totals, and modify data into easily understood units.
- Make fancy pie charts.
The audit may take half an hour to an hour the first time you do it this way, but with practice you’ll be able to do it in a few minutes. Let’s go!
Gathering your performance data
To get started, figure out what URL you want to evaluate. Look at your analytics and try to determine which page type is your most popular. Don’t just default to your home page. For instance, if you have a news site, articles are probably your most popular page type. If you’re analyzing a single-page app, determine what the most commonly accessed view is.
You need to get your network activity at that URL into a CSV/spreadsheet format. In my experience, the easiest way to do this is to use WebPagetest, whose premise is simple: give it a URL, and it will do an assessment that tries to measure perceived performance.
Head over to WebPagetest and pop your URL in the big field on the homepage. However, before running the test, open the Advanced Settings panel. Make sure you’re only running one test, and set Repeat View to First View Only. This will ensure that you don’t have duplicate requests in your data. Now, let the test run—hit the big “Start Test” button.
Once you have a results page, click the link in the top right corner that says “Raw object data”.
A CSV file will download with your network requests set out in a spreadsheet that you can manipulate.
Navigating & scrubbing the data
Now, open the CSV file in your favorite spreadsheet editor: Excel, Numbers, or (my personal favorite) Google Sheets. The rest of this article will be written with Google Sheets in mind, though a similar result is certainly possible with other spreadsheet programs.
At first it will probably seem like this file contains an unwieldy amount of information, but we’re only interested in a small amount of this data. These are the three columns we care about:
- Host (column F)
- URL (column G)
- Object Size (column N)
The other columns you can just ignore, hide, or delete. Or even better: select those three columns, copy them, and paste them into a new spreadsheet.
Auditing each asset request
With your pared-down spreadsheet, insert a new first column and label it “Purpose”. You can also include a Description/Comment column, if you wish.
Next, go down each row, line by line, and assign each asset request a purpose. I suggest something like the following:
- Content (e.g., the core HTML document, images, media—the stuff users care about)
- Analytics (e.g., Google Analytics, New Relic, etc.)
- Ads (e.g., Google DFP, any ad networks, etc.)
Your Purpose names can be whatever you want. What matters is that your labels for each purpose are consistent—capitalization and all. They need to group neatly in order to generate the fancy charts later. (Pro tip: use data validation on this column to ensure consistency in your spreadsheet.)
So how do you determine the purpose? Typically, the biggest clue is the “Host” column. You will, very quickly, start to recognize which hosts provide what. Your root URL will be where your document comes from, but you will also find:
- Analytics URLs like googletagservices.com, googletagmanager.com, google-analytics.com, or js-agent.newrelic.com.
- Ad URLs like doubleclick.net or googlesyndication.com.
If you’re ever unsure of a URL, either try it out yourself in your browser, or literally google the URL. (Hint: if you don’t recognize the URL right away, it’s most likely ad-related.)
Just doing the steps above will likely be eye-opening for you. Stopping to consider each asset on a page, and why it’s there, will help you be mindful of every single thing the page loads.
You may be in for some surprises the first time you do this. A few unexpected items might turn up. A script might be loaded more than once. That social widget might be a huge page weight. Requests coming from ads might be more numerous than you thought. That’s why I suggested a Description/Comment column—you can make notes there like “WTF?” and “Can we remove this?”
Augmenting your data
Before you can generate fancy pie charts, you’ll need to do a little more spreadsheet wrangling. Forewarned is forearmed—extreme spreadsheet nerdery lies ahead.
First, you need to translate the request sizes to kilobytes (KB), because they are initially supplied in bytes, and no human speaks in terms of bytes. Next to the column “Object Size,” insert another column called “Object Size (KB).” Then enter a formula in the first cell, something like this:
Translation: you’re simply dividing the amount in the cell from the previous column (E2, in this case) by 1000. You can highlight this new cell, then drag the corner down the entire column to do the same for each row.
Now, to figure out how many HTTP requests are related to each Purpose, you need to do a special kind of count. Insert two more columns, one labeled “Purpose Labels” and the second “Purpose Reqs.” Under Purpose Labels, in the first row, enter this formula:
This assumes that your purpose assessment is column B. If it’s not, swap out the “B” in this example for your column name. This formula will go down column B and output a result if it’s unique. You only need to enter this in the first cell of the column. This is one reason why having consistency in the Purpose column is important.
Now, under the second column you made (Purpose Reqs) in the first cell, enter this formula:
This formula will also go down column B, and do a count if it matches with something in column G (assuming column G is your Purpose Labels column). This is the easiest way to total how many HTTP requests fall into each purpose.
Totaling download size by purpose
Finally, you can now also total the data (in KB) for each purpose. Insert one more column and call it Purpose Download Size. In the first cell, insert the following formula:
This will total the data size in column F if its purpose in column B matches G2 (i.e., your first Purpose Label from the section above). In contrast to the last two formulas, you’ll need to copy this formula and modify it for each row, making the last part (“G2”) match the row it’s on. In this case, the next one would end in “G3”.
Make with the fancy charts
With your assets grouped by purpose, data translated to KB, number of requests counted, and download size totaled, it will be pretty easy to generate some charts.
The HTTP request chart
To make an HTTP request chart, select the columns Purpose Label and Purpose Reqs (columns G and H in my example), and then go to Insert > Chart. Scroll down the list of possible charts, and choose a pie chart. Be sure to check the box marked “Use column G as labels.”
Under the “Customization” tab, edit the Title to say “HTTP Requests”; under “Slice,” be sure “Value” is selected (the default is “Percentage”). We do this because the number of requests is what you want to convey here.
Go ahead—tweak the colors to your liking. And ditch Arial while you’re at it.
The download-size-by-purpose pie chart is very similar. Select the columns Purpose Label and Purpose Download Size (columns G & I in my example); then go to Insert > Chart. Scroll down the list of possible charts and choose a pie chart. Be sure to check the box marked “Use column G as labels”.
Under the “Customization” tab, edit the Title to say “Download Size”; under “Slice,” be sure “Value” is selected as well. We do this so we can indicate the total KB for each purpose.
Or, you can grab a ready-made template. If you want to see a completed assessment, check out the one I did on an A List Apart article. I’ve also made a blank file with a lot of the trickier spreadsheet stuff already done. Feel free to go to File > Make a Copy so you can play around with it. You just need to get your page data from WebPagetest and paste in the three columns. After that, you can start your line-by-line assessment.
Telling the good from the bad
If you show your data to a stakeholder, they may be surprised by how much page weight goes to things like ads or analytics. On the other hand, they might respond by asking what we should be aiming for. That question is a little harder to answer.
Some benchmarks get bandied about—1 MB or less, a WebPagetest Score of 1000, a Google PageSpeed score of over 90, and so on. But those are very arbitrary parameters and, depending on your project, unattainable ideals.
My suggestion? Do an assessment like this on your competitors. If you can come back to your stakeholders and show how two or three competitors stack up, and show them what you’re doing, that will go much further in championing performance.
Remember that performance is never “done”—it can only improve. What might help your organization is doing assessments like this over time and presenting page performance as an ongoing series of bar charts. With a little effort (and luck), you should be able to demonstrate that the things your organization cares about are continually improving. If not, it will present a much more compelling case for why things need to change for the better.
So you have some pretty charts. Now what?
Your charts’ usefulness will vary according to the precise business needs and politics of your organization.
For instance, let’s say you’re a developer, and a project manager asks you to add yet another ad-metrics script to your site. After completing an assessment like the one above, you might be able to come back and say, “Ads already constitute 40 percent of our page weight. Do you really want to pile on more?”
Because you’ve ascribed purpose to your asset requests, you’ll be able to offer data like that. I once worked with a project manager who started pushing back on such requests because I was able to give them easy-to-understand data of this sort. I’m not saying it will always turn out this way, but you need to give decision makers information they can grasp.
Remember, too, that you are in charge of the Purpose column. You can make up any purpose you want. Interested in the impact that movie files have on your site relative to everything else? Make one of your purposes “Movies.” Want to call out framework files versus files you personally author? Go for it!
I hope that this article has made you want to consider, and reconsider, each and every thing you download on a given page. Each and every request. And, in the process of doing this, I hope you are equipped to call out by purpose every item you ask your users to download. That will allow you to talk with your stakeholders in a way that they understand, and will help you make the case for better performance choices.