order in the sort!

When you’re visualizing categorical data, sorting the bars in your chart is usually a straightforward task. Or is it?

In most cases, you probably take the category with the largest value and stick that in the prime spot, the leftmost slot on the horizontal axis. Then, you proceed from left to right in descending order of value. Easy peasy.

 

OK, sure, sometimes you want to emphasize a metric where a lower value is better or more important, and in that case you might sort by ascending bar values instead.

 

Or perhaps you have a large number of categories, and you want to make it easy for a wide audience to find any particular category quickly. In that case, you may choose to sort your bars alphabetically by category, rather than based on any particular value.
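If you build your charts in code rather than by hand, these three sort orders are each a one-liner. Here's a minimal Python sketch; the fruit categories and values are invented purely for illustration.

```python
# Hypothetical category totals, invented for illustration only.
sales = {"Pears": 12, "Apples": 30, "Cherries": 7, "Bananas": 21}

# Descending by value: the largest bar takes the leftmost slot.
by_value_desc = sorted(sales, key=sales.get, reverse=True)

# Ascending by value: handy when a lower value is better or more important.
by_value_asc = sorted(sales, key=sales.get)

# Alphabetical by category: easy lookup when there are many categories.
alphabetical = sorted(sales)

print(by_value_desc)  # ['Apples', 'Bananas', 'Pears', 'Cherries']
print(by_value_asc)   # ['Cherries', 'Pears', 'Bananas', 'Apples']
print(alphabetical)   # ['Apples', 'Bananas', 'Cherries', 'Pears']
```

Whichever list you choose becomes the left-to-right order of the bars.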

Be careful, though! All of these sorting options presume that each of your categories is of equal weight, and that there’s no inherent natural or implicit order to them. When that isn’t the case, you can unintentionally confuse an audience by sorting your charts in a way that feels unnatural.

For example, which of these two charts feels “correct” to you?

Even though there’s a descending sort in the left chart, the categories themselves are ordinal—they represent a position in a sequence. It’s strange to see time, or ages, presented in a way that doesn’t follow their usual sequence. In the Western world, we typically perceive time as flowing from left to right, so an audience will likely find it challenging to consume a visual that depicts it otherwise.

What about these two charts? Which one feels more correct?

Again, there’s a natural sequence in these height ranges. A reader would expect them to be in that order in the graph, regardless of the values encoded in each bar. The version on the left is jarring to read, while the version on the right, with heights in a consecutive order moving from left to right, feels fully intuitive.
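When the categories are ordinal, one way to protect their sequence is to spell out the order explicitly instead of letting a value sort decide it. A small Python sketch, with height ranges and counts that are made up for illustration:

```python
# Hypothetical counts per height range, invented for illustration only.
counts = {
    "170-179 cm": 41,
    "150-159 cm": 18,
    "180-189 cm": 22,
    "160-169 cm": 35,
}

# State the natural sequence explicitly rather than inferring it from values.
natural_order = ["150-159 cm", "160-169 cm", "170-179 cm", "180-189 cm"]

# Map each label to its position in the sequence, then sort by that position.
position = {label: index for index, label in enumerate(natural_order)}
ordered = sorted(counts.items(), key=lambda item: position[item[0]])

ordered_labels = [label for label, _ in ordered]
print(ordered_labels)
```

The bars then run from the shortest range to the tallest, no matter which range has the biggest count.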

How about these?

This one is a bit more of a gray area. Does it make more sense to depict these cities in geographic order, from westernmost to easternmost? It might, if the data itself were related in some way to longitude or geography in general. While this dataset, sunny days per year, is somewhat related to geography, it doesn’t have much of a west-to-east bias. Sorting these bars by value, rather than location, makes more sense here.

Finally, what about these graphs?

We might sort the actors who have played James Bond in the order they first took the role, rather than by the number of appearances. Is that preexisting sequence important enough to take precedence over ordering the chart by movie count? Perhaps you want to show that after the iconic Sean Connery, finding the next long-term Bond took a couple of tries, or that something similar happened after Roger Moore as well. It all depends on your audience and the particular message you want to deliver.
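To compare the two candidate orders side by side, here's a short Python sketch. I'm assuming counts of Eon-produced films and the year of each actor's first appearance; the charts in this post may count appearances differently (for example, by including non-Eon films).

```python
# (actor, year of first Eon film, number of Eon films).
# Assumption: Eon-produced films only; the post's charts may count differently.
bond_actors = [
    ("Roger Moore", 1973, 7),
    ("Sean Connery", 1962, 6),
    ("Daniel Craig", 2006, 5),
    ("Pierce Brosnan", 1995, 4),
    ("Timothy Dalton", 1987, 2),
    ("George Lazenby", 1969, 1),
]

# Order 1: the sequence in which each actor first took the role.
by_debut = [name for name, _, _ in sorted(bond_actors, key=lambda a: a[1])]

# Order 2: descending by number of films.
by_films = [
    name for name, _, _ in sorted(bond_actors, key=lambda a: a[2], reverse=True)
]

print(by_debut)
print(by_films)
```

Neither list is "the" right answer; the first supports a story about succession, the second a story about longevity.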

Although it’s often suggested that categorical data can be sorted and displayed in any order, it’s worth taking the time to think through our chosen layout. By considering any natural or commonly-understood ordered relationships among our categories, and weighing those against the message we want our audience to come away with, we’ll be able to select the ideal sort order for our chart, and avoid any unnecessary or unintentional confusion.

how does this graph make you feel?

Using color strategically and sparingly is often the quickest and easiest change to improve your data communications. Today’s quick post is a cautionary tale about not using color strategically—both in quantity and color choice. 

I recently encountered the following graph. At first glance, how does it make you feel?

 
 

If you’re like me, you feel alarmed. I feel even worse after examining the chart title and legend: warehouse accuracy rates, encoded in red. This doesn’t seem very positive!

I feel on alert because of cultural associations with the color red. Western audiences often interpret red as a signal of danger, anger, or alarm. It can also be associated with love, excitement, or passion, as we explored in a past SWD challenge. In this example, my brain didn’t immediately associate “accuracy rate” with “passionate love,” so I assumed that this chart was delivering some bad news.

As it turns out, I was mistaken. This data actually represents positive performance of warehouse fulfillment. These warehouses are filling orders accurately about 90% of the time. (A caveat here: without knowing more about the underlying context, we can’t be certain that 90% accuracy should be considered a good score, but for illustrative purposes, let’s assume that it is.)

To avoid the knee-jerk reaction of alarm (and improve the visual’s overall effectiveness by bringing the data to life), I’m going to make a few simple changes to this graph:

  1. Utilize a different color palette. There is a positive/negative connotation to this data, so I’ll elect to use blue to signal the positive (accurate) and its complement on the color wheel—orange—to accentuate the negative (errors).

  2. Eliminate clutter. The original graph has many elements (gridlines, harsh bolding, rotated x-axis labels, legend placement) that make it appear more complicated to process than it really is. I’ll strip away these non-essential elements, leaving only those that add enough value to make up for their presence.

  3. Use words effectively. If I want my audience to understand that this data is positive, I shouldn’t assume that they will come to that conclusion on their own. I’ll not only state it in words, but tie the words to the data it describes using similarity of color. 

Check out the difference! Does the revised graph still evoke feelings of alarm? Likely not.

The “after” graph still has room for improvement. The data could be sorted differently and there's an opportunity to add additional context and a call to action. You may see other things you’d approach differently as well. Join me Thursday, March 24 at 11AM Eastern time for a live chat on our YouTube channel, where I’ll continue to transform this visual into a data-driven story. See you there—click the link above to set a reminder and access the event!

how to do it in Excel: adjusting bar width

Today’s post is a tactical one: how to adjust the widths of bar charts in Excel (and why you should). 

Before we get into the step-by-step, I should mention that there aren’t any strict rules for optimal spacing between bars. Rather, it’s a matter of personal preference, similar to wearing white after Labor Day (in the U.S., that’s the first Monday in September). As a resident of the muggy Southeast, I’ll be rocking white until fall temperatures arrive in mid-October. However, if you live in cooler climes and consider Labor Day the symbolic end of summer, your preference might be to say sayonara to white until Memorial Day.

The same gray area goes for optimal spacing between bars. The actual width is not set in stone. Our goal is to enable our audiences to compare the lengths of the bars (instead of the area between them), so general guidance is to thicken the bars to minimize the surrounding white space.

Let’s turn now to how to accomplish this in Excel. In the spirit of Labor Day, I’ll use some data from the Bureau of Labor Statistics (BLS) showing the top ten occupations in the U.S. as of May 2020. 

Compare the bar spacing in the two visuals shown below:

optimal bar chart spacing.png

On the left, the gaps are attention-grabbing and create an unnecessary shimmer in the visual. The adjusted version puts emphasis on the length of the bars. Download the Excel file to follow along with these steps to manually adjust:

  1. Highlight all the bars, right-click and choose Format Data Series:

how to adjust bar chart spacing.png

2. In the Format Data Series menu, under Series Options, adjust the Gap Width value (a lower percentage produces thicker bars):

 
how to adjust bar chart width.png
 

The result is this:

bar chart example.png
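If you're curious how the Gap Width percentage translates into bar thickness, here's a rough Python sketch. It assumes Excel's definition of Gap Width as the gap between categories expressed as a percentage of the bar width, and a chart with a single data series; the function name and the sample percentages are mine, chosen only for illustration.

```python
def bar_share(gap_width_pct: float) -> float:
    """Fraction of each category slot filled by the bar (single series).

    Assumption: the gap equals gap_width_pct percent of the bar width,
    so one slot is bar + gap = bar * (1 + gap_width_pct / 100).
    """
    if gap_width_pct < 0:
        raise ValueError("Gap Width cannot be negative")
    return 100 / (100 + gap_width_pct)


# Illustrative settings only; pick whatever looks right for your chart.
for gap in (200, 100, 50, 30):
    print(f"Gap Width {gap}% -> bar fills {bar_share(gap):.0%} of its slot")
```

Under that assumption, a Gap Width of 100% splits each slot evenly between bar and white space, and anything lower tips the balance toward the bar.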

Another benefit of doing this is that now there’s enough space to pull the long data labels into the ends of the bars. This is just one of the decluttering steps we can take to reduce perceived cognitive burden. Here’s how to achieve this:

3. Click on any data label to highlight them all, then right-click and choose Format Data Labels:

how to reformat bar charts.png

4. In the Format Data Labels menu, select Label Options, and in the Label Positions section, choose Inside End. (While you’re at it, in the Label Contains section, uncheck “Show Leader Lines.” These are almost never necessary.)

 
bar chart example.png
 

My graph renders like this, due to my color scheme. I’ll adjust the font color so that the labels have sufficient contrast against the dark blue bars. (TIP: you can use the online WebAIM contrast checker to see if your text is sufficiently readable against your background color.)
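If you'd rather check contrast yourself, the WCAG contrast ratio behind checkers like that one is simple to compute. Below is a Python sketch of the WCAG 2.x formula; the dark blue hex code is a stand-in I picked for illustration, not the actual color from my chart.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


DARK_BLUE = "#1F3864"  # hypothetical bar color, for illustration only
for label_color in ("#FFFFFF", "#000000"):
    print(label_color, round(contrast_ratio(label_color, DARK_BLUE), 2))
```

WCAG AA asks for a ratio of at least 4.5:1 for normal-size text, so whichever label color clears that bar against your fill is the safer pick.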

bar chart example.png

5. To adjust the font color, click to select all the labels, choose the Font options dropdown arrow, and then select a different hue (you can also do this in the Format Data Labels menu if you still have it open):

bar chart example.png

The final visual looks like this:

bar chart example.png

I might even choose to further format the numbers by displaying the units in millions.
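If the labels were generated in code, that formatting could look something like the Python sketch below. The helper name and the sample values are hypothetical, not figures from the BLS data.

```python
def in_millions(value: float, decimals: int = 1) -> str:
    """Format a raw count as millions, e.g. 3500000 -> '3.5M'."""
    return f"{value / 1_000_000:.{decimals}f}M"


# Hypothetical employment counts, for illustration only.
print(in_millions(3_500_000))     # 3.5M
print(in_millions(2_000_000, 0))  # 2M
```

Shorter labels leave more room inside the bars and are quicker to scan.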

Just as the modern-day guidelines for wearing white after Labor Day are subjective, so too are the rules for the exact spacing between bars. As the designer of your own graphs, experience and personal preference will help you find your own “Goldilocks” of spacing: too thin, too thick or just right.

bar chart example.png


connecting the slide title to the graph

Today’s post outlines one approach to get your message across more clearly: use color to connect the slide title to the graph. 

First, a bit of background. When communicating with data in PowerPoint, your slide title is precious real estate. Your audience is typically looking there first to understand what the slide will display, so we should be using active slide titles to help set their expectations.

Let’s look at an example, adapted from Exercise 5.7 of storytelling with data: Let’s Practice! The following visual shows a competitive landscape overview for an on-demand printing company.

datastorytellingpracticeexercise.png

Consider the slide title. I’d categorize it as an active title because it primes me for what I should see in the forthcoming data. The designer was thoughtful both to put the main point into words and to make the words stand out via their size and placement at the top of the page.

If you’re like me, then you’re probably now searching for evidence of an increase in XBX Business in the graph. You’ll find it eventually, but there are ways to eliminate the need for this tedious search process altogether.

One option is to use the same color between the data and the text, while simultaneously de-emphasizing the rest of the visual with gray. Check out what a difference this makes:

data storytelling example.png

This simple change—the power pairing of color and words—ensures that the audience is more likely to immediately understand the results of all the hard work we’ve done. All we have to do is make it easier for them to see in the first place. 
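If you're assembling a chart like this in code, the pairing of color and words boils down to a simple rule: one highlight color for the series named in the title, gray for everything else. Here's a minimal Python sketch; the hex codes and the competitor names other than XBX Business are placeholders of my own.

```python
HIGHLIGHT = "#1F77B4"  # hypothetical blue, shared by the title text and focal series
MUTED = "#BFBFBF"      # hypothetical gray for everything else


def assign_colors(series_names: list[str], focus: str) -> dict[str, str]:
    """Give the focal series the highlight color and push the rest to gray."""
    if focus not in series_names:
        raise ValueError(f"{focus!r} is not one of the series")
    return {name: HIGHLIGHT if name == focus else MUTED for name in series_names}


# Placeholder series names, apart from the one called out in the slide title.
series = ["Competitor A", "XBX Business", "Competitor B"]
colors = assign_colors(series, focus="XBX Business")
print(colors)
```

Apply `HIGHLIGHT` to the matching words in the slide title as well, so the eye can jump straight from the claim to its evidence.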

data storytelling before and after.png

Are there more improvements we can make to this slide? Absolutely—you can download the data and practice improving this visual with me in the SWD community exercise from good to great.