teaching data storytelling

At SWD, our mission—to inspire positive change through the stories we tell with data—is one we share with professional educators, the schools and universities where they teach, and students at all stages of their academic journey. If you're an educator who is teaching from or would like to teach from our resources, please let us know so we can keep you informed!

While our shared interests have inspired a number of collaborations over the years (our team happily guest lectures when our schedules allow), our books have primarily been the mechanism through which we’ve reached the most students. More than 400 universities worldwide (and counting) use our resources as part of their curricula, with both instructors and students expressing their appreciation for the books’ straightforward content, practical application to real-world examples, and affordable price.

A sampling of universities teaching from the storytelling with data books.

A sampling of universities teaching from the storytelling with data books

We are committed to helping educators teach their students how to drive positive change with data. If you’re assessing how you can incorporate the storytelling with data books as textbooks into your course, evaluation copies can be requested here.  If you’re already using them, we have some accompanying resources to support you in teaching.

  • Real-world datasets

  • Exercises and case studies

  • Connections with instructors, students, and professionals

  • Short-form videos

  • Personalized support from the SWD team

You can find more details about these resources below. Are there other resources that you would find helpful? Complete this form to share your feedback and connect with the SWD team. We always enjoy hearing how instructors use the storytelling with data lessons. We’re eager to do what we can to help enable you and your students.

 

Real-world datasets

You are welcome to download the examples from the books containing data, graphs, and solutions and use them to teach (with proper attribution) from our books page. Most solutions are Excel-based, with select exercises demonstrated in other tools including Tableau, PowerBI, Python, R, Datawrapper, Flourish, Google Data Studio and more. All of the examples are anonymized from real organizations, giving your students insight into what they’ll encounter in the workplace.

 

Exercises and case studies

Many instructors create assignments using the exercise bank in the online SWD community. It’s free to sign up for this low-risk, supportive environment where the data, scenario and steps are provided with solutions. Each of these practice exercises focuses on a specific data storytelling skill that will have practical application in students’ careers, and is based on a real, de-identified industry scenario.

 

Connections with instructors, students, and professionals 

In the SWD community, you can engage in conversations with other educators to ask questions or share experiences, ask your students to weigh in on a topic of discussion, or hear from working professionals on practical applications of communication techniques.

 

Short-form videos

If you like sharing video examples of effective data storytelling, quick tips, tool tutorials and in-depth case studies as presented by Cole and the SWD team, subscribe to the SWD YouTube channel

 

Personalized support from the SWD team 

Need to brainstorm ideas for developing your course? Subscribe to premium for direct access to Cole and the SWD team through weekly office hours, our full video library and in regular virtual events where you can experience first-hand how the SWD team teaches lessons in a live, interactive setting. Subscribing to premium also gets you additional perks like exclusive content, early access to special events, discounts and more.

 
 
 

The newest book—storytelling with YOU

If you'd like to take your students beyond data, Cole’s latest book, storytelling with YOU, explains how to plan, create and deliver a stellar presentation that makes the audience want to listen and act. You can get sample content from the book now, in advance of its release in September 2022.

Thank you very much for your support. Here’s to driving positive change by instilling this important skill set in your students!

how to create a line chart in Excel

A line chart is a simple graph that is familiar to most audiences. Lines are great for showing continuous data, such as plotting how the value of something changes over time. In this post, we will cover how to create a line chart in Excel, using a sample dataset from a community exercise: table takeaways. The information is about an annual corporate fundraiser to provide meals to those in need. You can download the file here to follow along as we build the line chart. 

The dataset includes two columns—a column for the campaign year and a column for the number of meals served in a given year. Let’s assume we want to make a line chart with the years going along the x-axis and the meals served along the y-axis.

Creating a single line graph in Excel is a relatively straightforward process, as it is a default chart type.

Insert a line chart

To begin, highlight the data table, including the column headers. To do this, click cell B7 and drag your cursor to C18. Next, navigate to the Insert ribbon and select the line chart icon. (Note that you can also use the Insert menu at the very top, then choose Chart -> Line to achieve a similar result.)

Excel offers several different variations of the line graph. For our purpose, we will go with the first option available—a 2-D line chart without markers

Now, we have a simple line chart, but it is not exactly what we had imagined. Excel made the Campaign Year data points a series instead of the x-axis labels that we wanted.

To adjust the x-axis to show the years, right-click on the chart and go to Select Data… in the pop-up menu. Campaign Year and Meals Served are in the list in the series box in the middle. Pick Campaign Year and then click the minus (-) sign box below the list to remove Campaign Year as a series.

Next, add the Campaign Year as the horizontal axis labels by clicking the cell selector icon to the right of the box and then highlighting the data cells with the Campaign Year (cells B8:B17). Finally, click the OK button.

With this adjustment, we have our desired line chart with the years going along the x-axis and the meals served along the y-axis. Excel has automatically provided a ‘Meals Served’ title for us. We can improve upon this title, and some of the other formatting decisions Excel has made for us. We'll cover these steps in detail in an upcoming post. In the meantime, check out other Excel how-to articles below.

More Excel how-to’s 

For more tactical Excel instructions, check out these other resources: 

how to do it in Excel: emphasizing a data point

Today’s post is a tactical one: how to highlight a data point in Excel. 

When we craft visualizations for explanatory purposes—that is, when there’s a specific finding or recommendation that we want to communicate to someone specific—our goal is to drive action. In those cases,  our visuals should emphasize what’s interesting in the data and what requires attention. Highlighting key points in our graph is an important step in creating successful explanatory communications.

Here are some charts that use this technique effectively, making it clear where an audience is meant to focus. You can check out many more examples by using the Discover feature in the SWD community. 

Let’s get right to how to accomplish this in Excel. The example we’ll be working with today is a graph that shows average revenue by event type for a venue that hosts various public functions. You can download the accompanying data file to follow along with these steps.

Here’s our starting point, which Excel created by default.

First, I'm going to create a clean, uncluttered version of this graph. (You can reference this prior post for steps to reformat Excel graphs.)

Let’s assume my audience is a marketing leader who’s interested in knowing which event type is expected to have the highest average revenue in the coming year. Since the answer is “Fundraisers,” I’ll direct attention to the 2020 forecast for that event type with these steps:

  1. Click once anywhere on the Fundraisers line, then click on just the 2020 data point. Right click on the selected point and choose Format Data Point

2. The Format Data Point menu opens. Under the Fill & Line tab (with the paint bucket icon), click on the button that says “Marker,” and then navigate to Marker Options. Click the Built-in option, then choose a filled circle in the drop-down Type menu. I’ll also adjust the size to 9 (the size you choose should depend on the existing thickness of your lines and your individual needs for using the marker as emphasis).

3. In the Fill menu, I can adjust the color by selecting my preferred choice under Solid fill:

4. I’ll also adjust the outline color in the Border menu below it. I chose no line, but this is my personal preference.

My resulting visual looks like this:

Even if you just quickly glance at this graph, notice how your eyes are drawn to that blue dot. By adding and formatting this particular data point, I’ve made it clear to my audience that the Fundraiser value in 2020 is where I want them to look.

Since the 2020 data point is a forecast, I’d want to visually distinguish it from actual data by varying the line style (making it dotted instead of solid) and the marker style (making it open rather than filled). I’ll also include the numeric data labels for specificity’s sake.

Here’s how we can apply those changes, as we continue to refine our graph:

5. To vary the line style, highlight the data point of interest the same way you did earlier in Step 1. Right-click and choose Format Data Point.

6. In the Fill & Line tab again, instead of clicking on Marker, click into the Line menu, where you can adjust the various settings of the line immediately preceding the data point. To achieve the effect in my example above, my selections are a solid blue line with a width of 2.5 pt and a Dash type of Square dot. You can follow these same steps for the other event types in the graph as well so that all of the lines for 2020 are dashed to indicate a forecast (without changing the color to blue).

7. To add the numeric labels, I’m opting to include them as text boxes so that I have greater control over the formatting. You can read more about the considerations for text vs embedded labels in this prior post.

These formatting changes require a bit of extra effort, but they make your charts far more effective. Don’t assume that your audience knows where to look in your visuals—make that task easier for them through your design choices.

But wait…there’s more! What about all the other data? Isn’t that interesting to the audience? It absolutely could be—I invite you to practice how you’d approach this task in the community exercise emphasize multiple takeaways. You can also check out how I’d walk my audience through a story in the video so what? using story to drive action (open to premium subscribers). 


More Excel how-to’s:

how to make a scatter plot in Excel

 
 

Scatter plots are excellent charts for showing a relationship between two numerical variables across a number of unique observations. We see them in business communications from time to time, although they’re much more commonly used in the “exploration” part of the process—when we’re still trying to understand our data and find the important insights. 

If you’re unfamiliar with scatter plots, their common use cases, or their benefits and drawbacks in a range of scenarios, check out the what is a scatter plot? article in our SWD Chart Guide. There, we explore some of the basics of scatter plots via an example, share tips for designing them more effectively, and discuss common variations (bubble charts, connected scatter plots, and more).

In this article we’ll walk through the steps of creating a scatter plot in Microsoft Excel. We’ll use a small dataset to:

  • create a simple scatter plot with a single data series;

  • modify that graph to show multiple data series in one scatter plot; 

  • learn how to add contextual elements to our view (like averages, quadrant lines, and trendlines);

  • add data labels to all, or just a few, points in our graph; and

  • create custom labels using other fields in our dataset.

The scenario

Four years ago, our organization wanted to find a way to make newly-hired junior analysts more successful and effective. We launched a small, competitive pilot program that would start new employees with a full year of dedicated and comprehensive training. All other junior hires would continue to receive the on-the-job, course-based and ad-hoc learning experience that we have traditionally provided. 

We currently have 20 individuals who have completed the program. Each month, they and all the analysts in the organization have their performance rated as below-average, average, or above-average in comparison to their peers across our three global locations.

The following table shows our pilot program graduates, the number of months they’ve been with the organization, and how many months their performance has been rated “above average.”

This data table is the source for our scatterplot.

This data table is the source for our scatter plot.

We’ll use this dataset to create and refine our scatter plots, reshaping it and adding to it as needed.

How to create a single-series scatter plot

The simplest way to create a scatter plot in Excel is to highlight the cells in the two columns that contain your two numeric variables—in this case, the “MONTHS OF EMPLOYMENT” and “MONTHS WITH ABOVE-AVERAGE PERFORMANCE” columns. 

XY columns highlighted in our data table.

Highlight the two columns you want to include in your scatter plot.

Then, go to the “Insert” tab of your Excel menu bar and click on the scatter plot icon in the “Recommended Charts” area of your ribbon.

Select scatter plot from the menu.

Select “Scatter” from the options in the “Recommended Charts” section of your ribbon.

Excel will automatically create a scatter plot for you in the same sheet as your data, using the first column of your dataset as the horizontal (X) axis, and the second column as your vertical (Y) axis. 

A quick note here: in creating scatter plots, a common practice is to make the horizontal axis your “independent variable” and the vertical axis your “dependent variable” (that is, the number that is likely to change based on the value of our independent variable).

For our scenario, the number of months a person has been employed is more likely to affect the number of “above average” ratings they receive, rather than vice versa. That’s why our independent variable—months of employment—is in our data table’s left-hand column, and our dependent variable is in the right hand column.

 
Excel creates a scatter plot with these default settings.

Excel creates a scatter plot with these default settings.

 

This automatically-generated graph could use some formatting and cleanup. Taking time to strengthen the skeleton of your graph—everything that isn’t the actual data points—will help make your insights and information stand out.

After removing unnecessary lines, and cleaning up our axes and titles, our graph looks like this:

 
Fixing the graph skeleton to create a cleaned-up scatter plot.

Fixing the graph skeleton to create a cleaned-up scatter plot.

 

Now we have a nicely-formatted, single-series scatter plot that uses an identical black circle as a marker for each of our unique data points. From here, we can continue to make modifications and refinements to our graph.

How to create a scatter plot with multiple series

In the scatter plot we’ve just created, there is only one data series, consisting of our entire cadre of pilot program participants. Each participant’s length of employment is plotted on the horizontal axis, and their total of above-average monthly ratings is on the vertical axis.

Let’s assume we wanted to subdivide this data series into multiple series. For instance, our participants are assigned to three different offices worldwide (United States, Canada, and Japan); what if we wanted to color our data markers to represent that person’s location?

In Excel, creating a scatter plot with multiple data series can be done several ways. The easiest is to have a single column in your data containing the X values for all of your data series, and then have a separate column for the Y values of each individual data series.

Let’s take a look at how we could modify our existing data table to do this.

Our original data table, which we’ll modify in order to make a multi-series scatter plot.

 

In this table, the “OFFICE” is its own column, and it contains three unique values: United States; Japan; and Canada. Instead, add three new columns to the right of the existing table, and make each OFFICE value the name of one of the columns:

Add a column to the right for each of the three different offices.

Add a column to the right for each of the three different offices.

In the cells of those columns, we’ll write a formula that says “If the value of [OFFICE] in this specific row matches the header of this column, then give this cell the same value as the [MONTHS WITH ABOVE AVERAGE PERFORMANCE] column; otherwise, give it a value of #N/A.”

In cell G5, add a formula to decide if the cell should be empty, or should contain the value from cell F5.

In cell G5, add a formula to decide if the cell should be empty, or should contain the value from cell F5.

When we propagate this formula across our new columns and down all of our rows, the table will look like this:

Copy the Y values from column F into the appropriate column G, H, or I, based on if the OFFICE value in column A matches the header.

Copy the Y values from column F into the appropriate column G, H, or I, based on if the OFFICE value in column A matches the header.

As you can see, our “United States” column only has numeric values if the “OFFICE” column value in that row is “United States.” 

When you click on the existing scatter plot, you’ll see purple and blue highlighting around the X and Y columns that Excel is currently displaying in that graph.

Click on the scatter plot to highlight the columns Excel is using for the X (purple) and Y (blue) values.

We’d like this graph to show the Y values in the three new columns we’ve just created. To do that, hold your cursor over the edge of the blue rectangle until it becomes a hand, and then drag that rectangle right by a single column, so that it’s highlighting the data underneath “United States.”

Click and hold the blue column, and drag it to the right by a single column.

You might notice that a lot of your data points are now missing! That’s because now, Excel is only using the “United States” column for our Y axis, and Excel won’t draw a data point if there’s an “#N/A” as a Y value.

Not to worry, though: we’ll get all our data points back now, by clicking on the bottom right corner of that blue rectangle and dragging it to the right, so that the rectangle covers all three new columns we’ve created.

Click on the bottom right corner of the blue rectangle, and drag that corner to the right so that all three new columns are highlighted.

All of our data has returned, hooray! And, as you can see, Excel is now using a different color for each of our data series. 

Let’s add a legend so that our viewers know what these different colors represent. First, we’ll name each of our data series: right-click on the chart, choose “Select Data,” and add the data series names manually in the pop-up window.

Type in the name of each series, or select a cell from the Excel sheet that contains the name.

 

Then, you can fine-tune the look of your graph—perhaps you add a legend as a subheader to your title, and pick specific colors for your series—and your multi-series scatter plot is ready to go.

 

This scatter plot contains the same data as our first one, but the data is now subdivided into three series—one for each office—and is color-coded to reflect that. We’ve also added a legend beneath our chart title to identify which color goes with which office.

 

How to add averages, reference lines, and trendlines to a scatter plot

Whenever we present data to an audience who might be unfamiliar with it, it’s a good idea to include contextual information to help make it easier to understand. In a scatter plot, we can  add context like:

  • What was the average X value? 

  • What was the average Y value?

  • Was there a goal for either variable?

  • Is there a trend worth emphasizing?

Depending on the insights or the data itself, you might use one (or several) of these techniques, so let’s go through them one at a time.

How to add an average point to a scatter plot

We’ll start by adding another row to the bottom of our data table, where we’ll calculate what the average X and Y values were. For simplicity’s sake, we’re only going to look at the average across ALL of our participants, rather than separate averages for each office (although the same techniques would apply). 

Since columns E and F contain our X and Y values, we’ll write formulas to average the values in each of those columns.

Row 24 will contain the averages of our X values (in column E) and our Y values (in column F). Use the AVERAGE formula, as shown in this screen capture, to generate the correct values.


Then, we’ll right-click on our chart, choose “Select Data…” from the menu that pops up, and add another data series just for our average values. Click the “+” button below the “Legend entries (Series):” window to add a new series, which you can then rename and set the range of X and Y values for in that pop-up.

The “average” series will consist of a single point, showing the average of all X values and the average of all Y values.

 

That will put a single point on our chart that marks our X and Y averages, as you can see in the updated graph below.

I’ve deleted the new “average” series from the legend (by clicking once to highlight the legend, then a second time to highlight the average, and hitting “delete”), and formatted the single point to be gray with a black outline.

 

The average of X and Y values across all three data series is now shown in a single gray marker, outlined in black.

 

How to add reference lines and create quadrants in a scatter plot

Now that we have an average point, we can visually break our scatter plot into four quadrants: 

  1. above average in both X and Y variables; 

  2. below average in both;

  3. high X but low Y; and 

  4. low X but high Y.


We’ll use our average point as the basis for drawing the lines that define those quadrants. Specifically, we’ll add a chart element to that point, and maybe not the one that you’d immediately expect: error bars.

Error bars are typically intended to show additional statistical context around a data point. Instead, we’ll be using them to draw both a vertical and a horizontal line, each of which will connect our axes to the edges of our graph, running directly through our average point.

First, we’ll need to know how long to make these error bars.

  • We know that each of our axes has a maximum value of 50, so we’ll make sure our error bars cover that total distance. 

  • Vertically, we’ll draw one bar from the baseline to our average point, and one bar from our average point to the top of our graph. 

  • Horizontally, we’ll draw one bar from our Y-axis to our average point, and then one bar from the average point to the right-hand edge of our graph.

To define the exact length of these error bar segments, we’ll add one more row of data to our table. Below the AVERAGE row, we’ll add a row called “Upper bound.” The value in those cells is calculated as “50-[average value]”.

Add an “Upper bound” row to the data set in preparation for building the quadrant lines.

Now, to add our error bars to the graph: click on the “average” data point in the chart, and then go to the Chart Design > Add Chart Element option in the ribbon, and select “Error Bars > More Error Bars Options...

Here’s where to find the “More Error Bars Options” item in the drop-down menus.

This will open up a Format Error Bars pane on the right side of the screen. (At this point, Excel defaults to having you format the vertical error bars; we’ll get to the horizontal bars in a minute.) You’ll see options for “Direction” (select Both), “End Style” (select No Cap), and “Error Amount” (select Custom). 

This screen shows the correct settings under “Format Error Bars.”

By selecting a Custom option under Error Amount, you’ll have to then click on the “Specify Value” button. In that popup window, tell Excel how far above and below your “average” point you want your vertical error bar to be drawn. That’s why we created that “upper bound” row of data earlier. For the “Positive Error Value” in the popup box, use the value from the “upper bound” row, and for the “Negative Error Value,” use the “AVERAGE” row value.

 

The vertical error bars are sized correctly, but the horizontal ones look like low-res wagon wheels. That’s what we’ll fix next.

 

Then, click directly on the horizontal error bars in the plot and follow the same steps to modify those bars, using the AVERAGE and Upper bound rows for the MONTHS OF EMPLOYMENT column. 

(Note: these bars might be really small by default; if you can’t click on them, then click anywhere in your graph, then go to the “Format” menu next to “Chart Design”, and in the drop down menu on the far left of the ribbon, select “Series ‘average’ X Error Bars”.)

 

Quadrant lines are drawn, but could still use some re-formatting.

 

Once that’s done, you can modify the format of the lines and the average point however you like. I prefer to push these reference lines visually toward the background, and to make the average point itself invisible by turning off its data marker. (Careful! Don’t delete the data marker entirely, because that will also make the error bars disappear.)

 

With some formatting changes, our error bars have become perfectly sized quadrant lines, faded nicely into the background.

 

By drawing quadrants, I can see right away that we have two employees from the Japan office in the top left who have been above average on performance frequently, even though they’re among the newest 50% of program participants. Conversely, the yellow mark in the bottom right shows me that one participant has been here for a long time, but is far below the 50th percentile in terms of above-average performance periods. I don’t know if this is an interesting story, or the most important insight, but simply drawing the quadrants on the graph makes it easier for me to analyze, and talk about, some of the data points within it.

How to add trendlines to a scatter plot

Unlike drawing quadrant lines, drawing trendlines in Excel is fairly straightforward. Perhaps it’s TOO straightforward, actually; it’s awfully easy to put a trendline in a graph that doesn’t have any particular basis in reality, or doesn’t describe the actual trend in a way that would be useful for setting future expectations. Trendlines can also be attention-grabbing, adding visual clutter to a graph type that is already challenging for some viewers to accurately interpret. Typically, if there’s a trend worth highlighting, it’s already visible in the data without drawing an additional line on the graph itself.

Those concerns aside, a trendline can provide helpful context in certain situations. The graph we’re looking at, for instance, may benefit from some visual guidance showing us how well we should expect our pilot program cadre to perform, based on their experience at the company. A linear trendline could use historical data to provide some of that insight.

To create a trendline that uses all of our data points, we’ll create a fifth data series in addition to “United States,” “Canada,”  “Japan,” and “average.” Right-click on the chart and choose “Select Data…”

In the popup window, add another data series that uses columns E and F (our original single-series data) as our X and Y values. I named it “for creating trendline.”

Add another data series to the view so that Excel can use it to calculate a trendline across all data points.

This data series will be plotted on top of all the existing series, using Excel’s default settings for size, color, and so on. In the view below, I’ve made the “for creating timeline” series markers larger and purple just so you can see them more clearly.

 

This purple series will be the basis for our trendline; eventually we’ll make the markers invisible.

 

Believe it or not, we’re almost done. All you have to do is right-click on any one of these markers and then select “Add Trendline…” Excel will automatically add a linear trendline to your chart, and show you the Format Trendline menu on the right hand side of the screen. 

The red line in the view is the trendline Excel draws as its default option.

If you know what you’re doing with trendlines, here’s where you’d make your preferred adjustments; for our purposes today, all I’m going to do is mute the color of the line, and then turn off the fill and border settings for my “for creating trendline” data markers.

 

Using color, size, and line style to de-emphasize the default linear trendline.

 

How to add data labels to a scatter plot

It’s great that we’ve put our data on the graph…but what does that data actually represent? If we care about more than just the overall distribution of data points, we should add data labels to some, if not all, of our points. 

To add data labels to a scatter plot, just right-click on any point in the data series you want to add labels to, and then select “Add Data Labels…” Excel will open up the “Format Data Labels” pane and apply its default settings, which are to show the current Y value as the label. (It will turn on “Show Leader Lines,” which I usually turn off.) 

 

By default, Excel will add the Y values as the data label, and will align it to the right of each data point.

 

Using only the checkboxes in the “Format Data Labels” pane, you can get your labels to include other values or combinations of values, including the Series name, the X value, and/or the Y value. In many cases, though, we’ll want to customize our data labels to show other information than just those fields. For that, we’ll use the “Value From Cells” option.

How to customize labels in a scatter plot

You can tell Excel to use any cell or series of cells as its source for Data Label information. For instance, let’s label each of the markers in this chart with the first initial and last name of the pilot program participants. We can add a “Data Label” column to the right of the data table and use the CONCATENATE formula to create a cell in each row with this information.

You can use formulas to fill in your new “Date Label” column, or you can just manually add the data. In this version, we used the CONCATENATE formula to combine the first initial from the FIRST NAME column, a period, and the full LAST NAME column value for every row.

Then, click on a data label in the existing scatter plot. That brings up the “Format Data Labels” pane, where you can change the settings in the “Label Contains” section to use “Value From Cells”. Clicking on the “Select Range…” button brings up a popup window asking what cells to use for the data label information, and you can highlight all the cells in that brand-new column.

Select any specific range of cells to use as the source for your custom labels.

When you click OK, every data marker will be labeled..

 

All marks are now labeled with our custom values.

 

To reduce the amount of text in the graph, let’s only keep labels for those participants who were far above or far below the trendline—in other words, overachievers and underachievers.

Double-click on one of the labels you want to remove, and then either delete it or set its Text Fill to “No Fill.” Do the same to several other unwanted labels, until you’re left with just high- and low-performing participants. (You can also click-and-drag labels slightly out of their default positions, to make labels with close-by neighbors easier to read.)

 

Only high and low performers and lableled in this view.

 


How to put it all together

With the techniques described above, you should be well prepared to create a scatter plot in Excel that can be designed and formatted to support the specific story you intend to tell. Use the power pairing of color and words to help your audience see what you want them to notice and learn from your scatter plot. Feel free to include annotations inside or near the chart area itself, as well as large headline or takeaway text above your visualization to emphasize the key messages.

The final version of our visualization includes a takeaway title, annotations, a reference line, and custom labels on selected elements in our mult-series scatter plot.

Now, you’re ready to create your own scatter plots in Excel! The data and the graphs you see in this post (along with some bonus content) can be downloaded here.

Check out our chart guide for more about scatter plots and other graph types; subscribe to our YouTube channel for Excel tutorials and lots of other videos; and follow our blog to get the latest in how-to instructions, graph makeovers, and other tips and tricks for communicating more effectively with data.

how to do it in Excel: adjusting bar width

Today’s post is a tactical one: how to adjust the widths of bar charts in Excel (and why you should). 

Before we get into the step-by-step, I should mention that there aren’t any strict rules for optimal spacing between bars. Rather, it’s personal preference similar to wearing white after Labor Day (in the U.S., that’s the first weekend in September). As a resident of the muggy Southeast, I’ll be rocking white until fall temperatures arrive in mid-October. However, if you live in cooler climes and consider Labor Day the symbolic end of summer, your preference might be to say sayonara to white until Memorial Day. 

The same gray area goes for optimal spacing between bars. The actual width is not set in stone. Our goal is to enable our audiences to compare the lengths of the bars (instead of the area between them), so general guidance is to thicken the bars to minimize the surrounding white space.

Let’s turn now to how to accomplish this in Excel. In the spirit of Labor Day, I’ll use some data from the Bureau of Labor Statistics (BLS) showing the top ten occupations in the U.S. as of May 2020. 

Compare the bar spacing in the two visuals shown below:

optimal bar chart spacing.png

On the left, the gaps are attention-grabbing and create an unnecessary shimmer to the visual. The adjusted version puts emphasis on the length of the bars.  Download the Excel file to following along with these steps to manually adjust:

  1. Highlight all the bars, right-click and choose Format Data Series:

how to adjust bar chart spacing.png

2. In the Format Data Series menu, under Series Options, adjust the Gap Width dialog box:

 
how to adjust bar chart width.png
 

The result is this:

bar chart example.png

Another benefit of doing this is that now there’s enough space to pull the long data labels into the ends of the bars. This is just one of the decluttering steps we can take to reduce perceived cognitive burden. Here’s how to achieve this:

3. Click on any data label to highlight them all, then right-click and choose Format Data Labels:

how to reformat bar charts.png

4. In the Format Data Labels menu, select Label Options, and in the Label Positions section, choose Inside End. (While you’re at it, in the Label Contains section, uncheck “Show Leader Lines.” These are almost never necessary.)

 
bar chart example.png
 

My graph renders like this, due to my color scheme. I’ll adjust the font color so that the labels have sufficient contrast against the dark blue bars. (TIP: you can use the online WebAIM contrast checker to see if your text is sufficiently readable against your background color.)

bar chart example.png

5. To adjust the font color, click to select all the labels, choose the Font options dropdown arrow, and then select a different hue (you can also do this in the Format Data Labels menu if you still have it open):

bar chart example.png

The final visual looks like this:

bar chart example.png

I might even choose to further format the numbers by displaying the units in millions.

Just as the modern-day guidelines for wearing white after Labor Day are subjective, so too are the rules for the exact spacing between bars. As the designer of your own graphs, experience and personal preference will help you find your own “Goldilocks” of spacing: too thin, too thick or just right.

bar chart example.png

More Excel how-to’s:

exploring data is different than explaining data

Today’s topic was inspired by a discussion in a recent client workshop. At the beginning of our sessions, we ask attendees to share what they want to learn during our time together. Their answers usually include being more concise, creating better presentations, getting stakeholders to act, ways to visualize data differently, and so on. 

Recently Mike and I were conducting a session where these two responses came in back-to-back:

Participant 1: I want to learn to create interactive data visualizations that enable my stakeholders to find the story.

Participant 2: I want to develop a sharper editorial eye for what data to include to tell the right story. 

While these two responses might seem at odds with one another, they really aren’t. They’re just describing two different stages in the analytic process: the exploratory stage and the explanatory stage.

Exploratory analysis is what you do to get familiar with the data. You might begin with a hypothesis or question you want to answer, or you might just look at the data from different angles to find what’s interesting about it. We’re often using dashboards (or interactive data visualizations, as Participant 1 described them) in this step of the process. When designed well, they can lead us to insights faster. 

Compare that to explanatory analysis, which is when you’ve identified something specific, and are ready to communicate it to someone specific. At this stage, we’re all about driving action and positive change: we want to make it abundantly clear to a decision-maker that something demands attention and (hopefully) action. This requires—as Participant 2 put it—developing a sharper editorial eye for what data to include.

The challenge with using the same output from our initial exploratory analysis for our final explanatory communication is that it will often include not just the important, actionable insights, but also a bunch of other irrelevant details. For our audience, that makes the story harder to absorb. 

To be effective data communicators, we should strive toward Participant #2’s goal: developing a sharper editorial eye for how to weave data into an overarching story.

Illustration by Catherine Madden

Illustration by Catherine Madden

I’ll use a recent client makeover to illustrate why the visual you use in exploratory analysis might—and should—differ from what you use to communicate. 

In this example, imagine you work for a pharmaceutical company that is conducting market research on your newly launched hypertension therapy, Vinsulfan. A third party aggregates and provides the data to you for analysis. One metric—physician's likelihood to prescribe—is shown in the graph below. (Details, including the names of the drugs, have been modified to protect client confidentiality.) 

Picture1.png

This view might suffice for you during your initial analysis. However, once you’ve identified what’s interesting about the data, you should modify this visual so that those insights stand out more for your intended audience. 

For example, let’s assume we want to draw attention to the favorable responses—”likely” and “extremely likely.” I can use color sparingly to emphasize the proportion of these actions. Here I’m electing to use green to focus attention because it’s the drug’s brand color—a different scenario might call for a different color palette.

bar chart example.png

Or, I might even modify the intensity of color to draw attention to one particular insight—in this case, that more physicians responded favorably to initiating our brand than the other options.

bar chart example.png

If I really want to make sure my audience doesn’t miss the point, I should state that in words:

Picture4.png

Or, I could summarize multiple takeaways with a single slide like this:

data storytelling example.png

Compare my original exploratory visual with the one I’m using to explain my findings. It’s the same data in both cases, as well as the same chart type, and many of the same words. But in the end, some minor modifications to the color and words I use make my actual insight unmistakable, and my recommended action clear. In the final visual, I’ve honed my editorial eye towards highlighting how this data fits into an overarching story in my organization. 

data storytelling example.png

Here are some additional resources to help you and others you work with practice differentiating between exploratory and explanatory analysis: