how to create a line chart in Excel

A line chart is a simple graph that is familiar to most audiences. Lines are great for showing continuous data, such as plotting how the value of something changes over time. In this post, we will cover how to create a line chart in Excel, using a sample dataset from a community exercise: table takeaways. The information is about an annual corporate fundraiser to provide meals to those in need. You can download the file here to follow along as we build the line chart. 

The dataset includes two columns—a column for the campaign year and a column for the number of meals served in a given year. Let’s assume we want to make a line chart with the years going along the x-axis and the meals served along the y-axis.

Creating a single line graph in Excel is a relatively straightforward process, as it is a default chart type.

Insert a line chart

To begin, highlight the data table, including the column headers. To do this, click cell B7 and drag your cursor to C18. Next, navigate to the Insert ribbon and select the line chart icon. (Note that you can also use the Insert menu at the very top, then choose Chart -> Line to achieve a similar result.)

Excel offers several different variations of the line graph. For our purpose, we will go with the first option available—a 2-D line chart without markers

Now, we have a simple line chart, but it is not exactly what we had imagined. Excel made the Campaign Year data points a series instead of the x-axis labels that we wanted.

To adjust the x-axis to show the years, right-click on the chart and go to Select Data… in the pop-up menu. Campaign Year and Meals Served are in the list in the series box in the middle. Pick Campaign Year and then click the minus (-) sign box below the list to remove Campaign Year as a series.

Next, add the Campaign Year as the horizontal axis labels by clicking the cell selector icon to the right of the box and then highlighting the data cells with the Campaign Year (cells B8:B17). Finally, click the OK button.

With this adjustment, we have our desired line chart with the years going along the x-axis and the meals served along the y-axis. Excel has automatically provided a ‘Meals Served’ title for us. We can improve upon this title, and some of the other formatting decisions Excel has made for us. We'll cover these steps in detail in an upcoming post. In the meantime, check out other Excel how-to articles below.

More Excel how-to’s 

For more tactical Excel instructions, check out these other resources: 

eight simple ways to edit a legend in Excel

One essential element of our charts and graphs rarely gets the attention it deserves: the legend. 

Without a clear and thoughtfully-incorporated legend, viewers of our data communications will struggle to understand exactly what we’re presenting to them. Any additional effort an audience needs to devote to solving the mystery of “which data series is green?” or “what’s the difference between square data markers and circles?” is energy they won’t have to put towards grasping your visual’s important insights. A well-designed legend will remove that cognitive burden. 

With every graph you create, plan to spend some time crafting a legend that supports and clarifies the data you’re presenting. 

Almost every chart template in Excel will place a legend into a new graph by default. For bar charts, it’s usually set off to the right:

Most Excel graphs have the legend on the right hand side as a default.

Did you know, however, that you can edit your legends? In fact, there are many things that you can do directly in Excel (or directly in PowerPoint) to customize this essential element of your communication.  

But first, here are a couple of things you CAN’T do to your legend in Excel.

  • Reorder the elements in your legend. In some visualization tools, you can drag entries in the legend to change the way the series are sorted, but in Excel the order is determined in the data source itself by the series order.

  • Modify the text of the entries. Unlike data labels, into which you can re-type or add new text, legend entries are also fully determined by the data source. If you wanted this graph’s legend to be Title Case or ALL CAPS, you’d have to change that in the underlying worksheet.

Here are eight simple ways to customize your Excel legend.

1. Place it somewhere else.

As long as you haven’t resized your graph’s plot area (the space reserved for the data itself), you can use the “Format Legend” pane in Excel to move your legend to the top, left, bottom, or top right corner of your chart area. When you do this, your plot area will resize to make room for the relocated legend.

In this graph, the legend position is set to "Top" rather than "Right." Notice that the plot area automatically gets shorter and wider to make room for the legend.

In this graph, the legend position is set to "Top" rather than "Right." Notice that the plot area automatically gets shorter and wider to make room for the legend.

How to do it in Excel:

Click on your chart, and then click the “Format” tab in your Excel ribbon at the top of the window. From the very right of the ribbon, click “Format Pane.” Once that pane is open, click on the legend itself within your chart.  In your Format Pane, the options will then look something like this:

The "Format Legend" pane will let you position your legend in one of five pre-defined locations in your Chart Area.

Then, set that legend position to be whichever location you want. My tendency is to put the legend at the top, because of people’s tendency to scan any new visual from top-to-bottom; that way, a reader will know what dataset corresponds to each color or marker type before they start to evaluate the information in the graph.

2. Overlay the legend atop your data.

Excel prefers not to have the legend and the plot area use the same physical space…but if you want them to overlap each other, you can make that happen.

How to do it in Excel:

That bottom option in the Format Legend pane above—the one that reads “Show the legend without overlapping the c…”—is the key item here.

If that box is checked, the legend will not overlap the plot area at all. By unchecking it, the legend will then float on top of the plot area (and any data series that happen to be there).

By default, the available plot area for a graph stops short of the space that Excel reserves for the legend. Unchecking one box in the “Format Legend” pane will extend your graph’s plot area all the way to the edge of your available space, and will place the legend such that it overlaps with the visual.

3. Hand-drag it to a precise location of your choice.

You’re not limited just to the options Excel’s Format Pane provides by default. You can actually put your legend anywhere you want in your chart area. For instance, I put mine slightly closer to the bars in this example:

For this view, I hand-dragged the legend as far to the left, within the available plot area, as I could without it overlapping actual data bars or feeling too crowded.

How to do it in Excel:

If you click on your legend, you’ll see it highlighted with squares at each corner and at the midpoint of each line.

Clicking on the legend directly will select it, and make these little squares (called “handles”) visible.

Once the legend is highlighted like this, click and hold your cursor down in the middle of the outlined legend (make sure your cursor looks like a plus sign with arrows on the ends—that means you’ll be able to move the legend without distorting it). Then, drag-and-drop it anywhere you want in your chart area. 

4. If, for some reason, you insist on doing so: add outlines, shading, or other visual effects.

99 times out of 100, you won’t need to do this. Decorative effects such as these primarily serve only to clutter up a visual. However, since it’s impossible to predict every possible scenario in which you’ll be creating a chart, there may be a time when you’ll want to visually set off your legend from surrounding elements. In that unlikely event, it’s worth knowing that Excel offers you the ability to add fills, borders, and more to your legend.

I made mine super pretty! Look at my beautiful shaded, outlined, and glowing legend: 

Just because you can put a gray background, a thick dotted border, and a purple glow around your legend, doesn’t mean you should. In fact, you probably shouldn’t…but it’s important to know how to make these kinds of changes on the VERY rare occasion that it becomes necessary to do so.

How to do it in Excel:

To reiterate: I am not encouraging you to bling out your legends on a regular basis. But if for some reason you find it’s necessary to do so…you can click on the legend in your graph, and then change the Fill, Stroke, and other options from the Format Pane: 

From the “Format Legend” pane, you can change the fill, border, and other appearance traits of your legend.

Please apply this knowledge with the utmost restraint. With great power comes great responsibility.

5. Resize the dimensions of the legend and the text inside it.

You can make your legend taller, wider or both. At the same time, you can increase or decrease the font size for your legend without changing the font size of the rest of the graph.

This legend is manually resized to be taller.

This legend’s font size has been increased.

How to do it in Excel:

To resize the dimensions of the legend, highlight it so that you get this outline again:

Then, click on any one of the squares (they’re called “handles”) to make the legend larger in that direction. The corner handles let you change width and height at the same time; the handles in the middle of straight lines only let you change one or the other. 

Resizing the fonts in the legend is pretty straightforward: you click on your legend, then click the “Home” tab in your Excel ribbon, and change the font size, family, weight, or anything else you want just the way you’d change any other text.

6. Use the legend to change the look of a data series.

It’s not surprising that when you change the color of a data series in your chart, your legend updates to show the correct new color. What’s less well-known is that you can do this operation in reverse as well. For instance, I used the legend to reset my “Fruit smoothie” color to red.

You can use the legend itself to change the color of any or all data series in your view.

How to do it in Excel:

Click once on the legend (to get the “handles” to show up around the whole thing), and then click a second time on one specific item in the legend itself. I clicked on “Fruit smoothie,” and it looked like this:

Click on the legend once, and then again on a specific entry, to select just one part of the legend (and its related data series.)

Then, I changed the fill color of that item to red in the Format Pane, which had changed to show the header “Format Legend Entry:” 

The “Format Legend Entry” pane will appear if you only select a single element within your legend.

That’s a handy thing to remember: “Format Legend” refers to everything in the entire legend, but “Format Legend Entry” refers only to the item in the legend that describes a particular data series.

7. Remove unwanted legend entries.

This would be favorable when you’re doing some creative Excel brute-forcing—perhaps there are some series you need on the back end but don’t want displayed in the legend itself. You can actively delete entries from your legend directly, without removing the data from the chart itself. As you can see here, “Flavored milk” is gone from the legend, but the data still persists in the view.

Even though “Flavored milk” has been removed from the legend, the data series it referred to is still present in the view.

How to do it in Excel:

Click once in the legend and then click a second time on the specific legend entry you’d like to remove, and then just press the “Delete” key. 

8. Remove the legend entirely, and do it yourself!

This, of all the techniques we’ve covered here, is my preferred option. Although you can use Excel settings and menus to create quick-and-dirty legends, and even customize them somewhat, your communications will benefit from a bit more thoughtfulness and polish. Either delete or hide the Excel legend, and instead add some text in useful places.

You could incorporate your legend as a subheader, so your readers see it right away and know which data series goes with which color.

Using appropriately colored text as a subheader can be an elegant approach to incoporating a legend into the graph view.

How to do it in Excel:

Click on your graph, and then in your Excel ribbon, select the “Format” tab next to the “Chart Design” tab. Towards the left hand side of your ribbon, click on the icon of a text box. That will place a blank text box in your Chart Area, which you can move, edit, and format just like a regular text box.

Type out the names of your data series, format them as you see fit, and place the subheader you’ve created just below the title of your graph. 

Or, you could directly label the first instance of each of your data series, either at the ends of the bars or inside the base of them:

Label the first instance of each series at the ends…

…or at the bases of the bars as another legend option.

How to do it in Excel:

You can use the same technique for this as for adding a subheader, or you could use Excel’s ability to add data labels by interacting with the bars themselves. Either of these approaches also would work for line graphs, or most any other chart type you’d use on a regular basis.

Whichever approach you choose, let your decision be based on what will best serve your audience, rather than on what your graphing tool offers you as a default setting.


More Excel how-to’s:

plan & practice

 
 

If you’ve been following our work at storytelling with data, you might be aware of our penchant for planning. Taking time to understand our audience, articulate the message, sketch graphs and other visuals, storyboard potential content, and craft the story in a low-tech manner—each of these activities helps us think and reflect in important ways that make our communications stronger.

I’ll go so far as to say that one of the primary reasons that presentations fail is when time and energy isn’t spent planning.

This is such a critical—and oft-overlooked—part of the process, that I’ve devoted an entire section to it in my new book. The book is titled storytelling with you: plan, create, and deliver a stellar presentation. I’ll tell you much more about it soon.

In the meantime, if you’d like insight into how to plan your next presentation with pen and paper before turning to your tools, consider attending our virtual craft the narrative practice session on May 11th and 12th. In it, we’ll guide participants to practice a variety of low-tech strategies using a project of their choice. Attendees will leave the session with a robust plan of attack for their next presentation—plus a process to use in the future to communicate more effectively. Learn more about this and other upcoming workshops.

I’ll close with a practical tip: once you have your low-tech plan, talk through it out loud. This practice will help you vet the flow, find your words, form effective transitions, and ultimately, create content that will better support you. Hear more about the various benefits of articulating your thoughts aloud in the SWD podcast episode 6 “say it out loud.”

how to do it in Excel: emphasizing a data point

Today’s post is a tactical one: how to highlight a data point in Excel. 

When we craft visualizations for explanatory purposes—that is, when there’s a specific finding or recommendation that we want to communicate to someone specific—our goal is to drive action. In those cases,  our visuals should emphasize what’s interesting in the data and what requires attention. Highlighting key points in our graph is an important step in creating successful explanatory communications.

Here are some charts that use this technique effectively, making it clear where an audience is meant to focus. You can check out many more examples by using the Discover feature in the SWD community. 

Let’s get right to how to accomplish this in Excel. The example we’ll be working with today is a graph that shows average revenue by event type for a venue that hosts various public functions. You can download the accompanying data file to follow along with these steps.

Here’s our starting point, which Excel created by default.

First, I'm going to create a clean, uncluttered version of this graph. (You can reference this prior post for steps to reformat Excel graphs.)

Let’s assume my audience is a marketing leader who’s interested in knowing which event type is expected to have the highest average revenue in the coming year. Since the answer is “Fundraisers,” I’ll direct attention to the 2020 forecast for that event type with these steps:

  1. Click once anywhere on the Fundraisers line, then click on just the 2020 data point. Right click on the selected point and choose Format Data Point

2. The Format Data Point menu opens. Under the Fill & Line tab (with the paint bucket icon), click on the button that says “Marker,” and then navigate to Marker Options. Click the Built-in option, then choose a filled circle in the drop-down Type menu. I’ll also adjust the size to 9 (the size you choose should depend on the existing thickness of your lines and your individual needs for using the marker as emphasis).

3. In the Fill menu, I can adjust the color by selecting my preferred choice under Solid fill:

4. I’ll also adjust the outline color in the Border menu below it. I chose no line, but this is my personal preference.

My resulting visual looks like this:

Even if you just quickly glance at this graph, notice how your eyes are drawn to that blue dot. By adding and formatting this particular data point, I’ve made it clear to my audience that the Fundraiser value in 2020 is where I want them to look.

Since the 2020 data point is a forecast, I’d want to visually distinguish it from actual data by varying the line style (making it dotted instead of solid) and the marker style (making it open rather than filled). I’ll also include the numeric data labels for specificity’s sake.

Here’s how we can apply those changes, as we continue to refine our graph:

5. To vary the line style, highlight the data point of interest the same way you did earlier in Step 1. Right-click and choose Format Data Point.

6. In the Fill & Line tab again, instead of clicking on Marker, click into the Line menu, where you can adjust the various settings of the line immediately preceding the data point. To achieve the effect in my example above, my selections are a solid blue line with a width of 2.5 pt and a Dash type of Square dot. You can follow these same steps for the other event types in the graph as well so that all of the lines for 2020 are dashed to indicate a forecast (without changing the color to blue).

7. To add the numeric labels, I’m opting to include them as text boxes so that I have greater control over the formatting. You can read more about the considerations for text vs embedded labels in this prior post.

These formatting changes require a bit of extra effort, but they make your charts far more effective. Don’t assume that your audience knows where to look in your visuals—make that task easier for them through your design choices.

But wait…there’s more! What about all the other data? Isn’t that interesting to the audience? It absolutely could be—I invite you to practice how you’d approach this task in the community exercise emphasize multiple takeaways. You can also check out how I’d walk my audience through a story in the video so what? using story to drive action (open to premium subscribers). 


More Excel how-to’s:

how do i avoid reworking my entire presentation if i have to share slides?

Through virtual and in-person workshops around the globe, we have taught tens of thousands of people how to communicate effectively with data. This series captures some of the noteworthy questions we hear during those sessions—and our answers.

Often a graph that makes sense during a live presentation loses meaning when distributed as a PowerPoint later. How can we retain context when transitioning between audiences without having to rework the entire presentation?

This is a relevant question in the time constraints we constantly are under in a real organization. We want to avoid using so-called slideuments in live presentations because it is difficult to both listen to a presenter and read text-heavy slides at the same time.  

But if you’re crunched for time, there are a few time-saving strategies that will help you avoid having to create a completely separate deliverable.

Let’s illustrate with a business example. Take a few minutes to watch Cole deliver a live data story (starting at 6:08): 

This presentation was notable because:

  •  The slides were well-designed, with effective graph choices, minimal clutter, and smart use of pre-attentive attributes to focus attention.

  • The visuals were paired with a strong narrative. 

If you flip through the same slides on your own—would the story still be as clear? Likely not.

Back to the original question: if we must send out the entire slide deck, then let’s look at two ways we could retain Cole’s narrative without having to rework the entire presentation.

  1. Write active slide titles. When you’re not presenting live, strong takeaway titles on your slides make it easy for the reader to understand the main point. The title bar is usually the first place your audience looks when consuming a slide deck on their own. Order your slides logically so that the reader can read just consecutive slide titles to get the overarching story you want to communicate. This is called horizontal logic.

  2. Add a fully annotated summary slide at the beginning of the deck. In this Craveberry video, Cole created this single summarized slide for the product that gets sent around after the meeting, as shown below. The audience gets the salient information without having to hear a live presenter or having to flip through the deck, because all of the information on the slide is self-reinforcing (an example of vertical logic). Adding it at the beginning means the audience could obtain the relevant details without having to flip through all the slides.

data storytelling and data visualization example

Here’s another example of adapting a live progression for written consumption.  

Employing either of these two approaches goes a long way in effective data storytelling because they allow us to tailor our mode of delivery to how our audiences are consuming it. 


Build your data storytelling skills in the community with these related exercises:

how to make a scatter plot in Excel

 
 

Scatter plots are excellent charts for showing a relationship between two numerical variables across a number of unique observations. We see them in business communications from time to time, although they’re much more commonly used in the “exploration” part of the process—when we’re still trying to understand our data and find the important insights. 

If you’re unfamiliar with scatter plots, their common use cases, or their benefits and drawbacks in a range of scenarios, check out the what is a scatter plot? article in our SWD Chart Guide. There, we explore some of the basics of scatter plots via an example, share tips for designing them more effectively, and discuss common variations (bubble charts, connected scatter plots, and more).

In this article we’ll walk through the steps of creating a scatter plot in Microsoft Excel. We’ll use a small dataset to:

  • create a simple scatter plot with a single data series;

  • modify that graph to show multiple data series in one scatter plot; 

  • learn how to add contextual elements to our view (like averages, quadrant lines, and trendlines);

  • add data labels to all, or just a few, points in our graph; and

  • create custom labels using other fields in our dataset.

The scenario

Four years ago, our organization wanted to find a way to make newly-hired junior analysts more successful and effective. We launched a small, competitive pilot program that would start new employees with a full year of dedicated and comprehensive training. All other junior hires would continue to receive the on-the-job, course-based and ad-hoc learning experience that we have traditionally provided. 

We currently have 20 individuals who have completed the program. Each month, they and all the analysts in the organization have their performance rated as below-average, average, or above-average in comparison to their peers across our three global locations.

The following table shows our pilot program graduates, the number of months they’ve been with the organization, and how many months their performance has been rated “above average.”

This data table is the source for our scatterplot.

This data table is the source for our scatter plot.

We’ll use this dataset to create and refine our scatter plots, reshaping it and adding to it as needed.

How to create a single-series scatter plot

The simplest way to create a scatter plot in Excel is to highlight the cells in the two columns that contain your two numeric variables—in this case, the “MONTHS OF EMPLOYMENT” and “MONTHS WITH ABOVE-AVERAGE PERFORMANCE” columns. 

XY columns highlighted in our data table.

Highlight the two columns you want to include in your scatter plot.

Then, go to the “Insert” tab of your Excel menu bar and click on the scatter plot icon in the “Recommended Charts” area of your ribbon.

Select scatter plot from the menu.

Select “Scatter” from the options in the “Recommended Charts” section of your ribbon.

Excel will automatically create a scatter plot for you in the same sheet as your data, using the first column of your dataset as the horizontal (X) axis, and the second column as your vertical (Y) axis. 

A quick note here: in creating scatter plots, a common practice is to make the horizontal axis your “independent variable” and the vertical axis your “dependent variable” (that is, the number that is likely to change based on the value of our independent variable).

For our scenario, the number of months a person has been employed is more likely to affect the number of “above average” ratings they receive, rather than vice versa. That’s why our independent variable—months of employment—is in our data table’s left-hand column, and our dependent variable is in the right hand column.

 
Excel creates a scatter plot with these default settings.

Excel creates a scatter plot with these default settings.

 

This automatically-generated graph could use some formatting and cleanup. Taking time to strengthen the skeleton of your graph—everything that isn’t the actual data points—will help make your insights and information stand out.

After removing unnecessary lines, and cleaning up our axes and titles, our graph looks like this:

 
Fixing the graph skeleton to create a cleaned-up scatter plot.

Fixing the graph skeleton to create a cleaned-up scatter plot.

 

Now we have a nicely-formatted, single-series scatter plot that uses an identical black circle as a marker for each of our unique data points. From here, we can continue to make modifications and refinements to our graph.

How to create a scatter plot with multiple series

In the scatter plot we’ve just created, there is only one data series, consisting of our entire cadre of pilot program participants. Each participant’s length of employment is plotted on the horizontal axis, and their total of above-average monthly ratings is on the vertical axis.

Let’s assume we wanted to subdivide this data series into multiple series. For instance, our participants are assigned to three different offices worldwide (United States, Canada, and Japan); what if we wanted to color our data markers to represent that person’s location?

In Excel, creating a scatter plot with multiple data series can be done several ways. The easiest is to have a single column in your data containing the X values for all of your data series, and then have a separate column for the Y values of each individual data series.

Let’s take a look at how we could modify our existing data table to do this.

Our original data table, which we’ll modify in order to make a multi-series scatter plot.

 

In this table, the “OFFICE” is its own column, and it contains three unique values: United States; Japan; and Canada. Instead, add three new columns to the right of the existing table, and make each OFFICE value the name of one of the columns:

Add a column to the right for each of the three different offices.

Add a column to the right for each of the three different offices.

In the cells of those columns, we’ll write a formula that says “If the value of [OFFICE] in this specific row matches the header of this column, then give this cell the same value as the [MONTHS WITH ABOVE AVERAGE PERFORMANCE] column; otherwise, give it a value of #N/A.”

In cell G5, add a formula to decide if the cell should be empty, or should contain the value from cell F5.

In cell G5, add a formula to decide if the cell should be empty, or should contain the value from cell F5.

When we propagate this formula across our new columns and down all of our rows, the table will look like this:

Copy the Y values from column F into the appropriate column G, H, or I, based on if the OFFICE value in column A matches the header.

Copy the Y values from column F into the appropriate column G, H, or I, based on if the OFFICE value in column A matches the header.

As you can see, our “United States” column only has numeric values if the “OFFICE” column value in that row is “United States.” 

When you click on the existing scatter plot, you’ll see purple and blue highlighting around the X and Y columns that Excel is currently displaying in that graph.

Click on the scatter plot to highlight the columns Excel is using for the X (purple) and Y (blue) values.

We’d like this graph to show the Y values in the three new columns we’ve just created. To do that, hold your cursor over the edge of the blue rectangle until it becomes a hand, and then drag that rectangle right by a single column, so that it’s highlighting the data underneath “United States.”

Click and hold the blue column, and drag it to the right by a single column.

You might notice that a lot of your data points are now missing! That’s because now, Excel is only using the “United States” column for our Y axis, and Excel won’t draw a data point if there’s an “#N/A” as a Y value.

Not to worry, though: we’ll get all our data points back now, by clicking on the bottom right corner of that blue rectangle and dragging it to the right, so that the rectangle covers all three new columns we’ve created.

Click on the bottom right corner of the blue rectangle, and drag that corner to the right so that all three new columns are highlighted.

All of our data has returned, hooray! And, as you can see, Excel is now using a different color for each of our data series. 

Let’s add a legend so that our viewers know what these different colors represent. First, we’ll name each of our data series: right-click on the chart, choose “Select Data,” and add the data series names manually in the pop-up window.

Type in the name of each series, or select a cell from the Excel sheet that contains the name.

 

Then, you can fine-tune the look of your graph—perhaps you add a legend as a subheader to your title, and pick specific colors for your series—and your multi-series scatter plot is ready to go.

 

This scatter plot contains the same data as our first one, but the data is now subdivided into three series—one for each office—and is color-coded to reflect that. We’ve also added a legend beneath our chart title to identify which color goes with which office.

 

How to add averages, reference lines, and trendlines to a scatter plot

Whenever we present data to an audience who might be unfamiliar with it, it’s a good idea to include contextual information to help make it easier to understand. In a scatter plot, we can  add context like:

  • What was the average X value? 

  • What was the average Y value?

  • Was there a goal for either variable?

  • Is there a trend worth emphasizing?

Depending on the insights or the data itself, you might use one (or several) of these techniques, so let’s go through them one at a time.

How to add an average point to a scatter plot

We’ll start by adding another row to the bottom of our data table, where we’ll calculate what the average X and Y values were. For simplicity’s sake, we’re only going to look at the average across ALL of our participants, rather than separate averages for each office (although the same techniques would apply). 

Since columns E and F contain our X and Y values, we’ll write formulas to average the values in each of those columns.

Row 24 will contain the averages of our X values (in column E) and our Y values (in column F). Use the AVERAGE formula, as shown in this screen capture, to generate the correct values.


Then, we’ll right-click on our chart, choose “Select Data…” from the menu that pops up, and add another data series just for our average values. Click the “+” button below the “Legend entries (Series):” window to add a new series, which you can then rename and set the range of X and Y values for in that pop-up.

The “average” series will consist of a single point, showing the average of all X values and the average of all Y values.

 

That will put a single point on our chart that marks our X and Y averages, as you can see in the updated graph below.

I’ve deleted the new “average” series from the legend (by clicking once to highlight the legend, then a second time to highlight the average, and hitting “delete”), and formatted the single point to be gray with a black outline.

 

The average of X and Y values across all three data series is now shown in a single gray marker, outlined in black.

 

How to add reference lines and create quadrants in a scatter plot

Now that we have an average point, we can visually break our scatter plot into four quadrants: 

  1. above average in both X and Y variables; 

  2. below average in both;

  3. high X but low Y; and 

  4. low X but high Y.


We’ll use our average point as the basis for drawing the lines that define those quadrants. Specifically, we’ll add a chart element to that point, and maybe not the one that you’d immediately expect: error bars.

Error bars are typically intended to show additional statistical context around a data point. Instead, we’ll be using them to draw both a vertical and a horizontal line, each of which will connect our axes to the edges of our graph, running directly through our average point.

First, we’ll need to know how long to make these error bars.

  • We know that each of our axes has a maximum value of 50, so we’ll make sure our error bars cover that total distance. 

  • Vertically, we’ll draw one bar from the baseline to our average point, and one bar from our average point to the top of our graph. 

  • Horizontally, we’ll draw one bar from our Y-axis to our average point, and then one bar from the average point to the right-hand edge of our graph.

To define the exact length of these error bar segments, we’ll add one more row of data to our table. Below the AVERAGE row, we’ll add a row called “Upper bound.” The value in those cells is calculated as “50-[average value]”.

Add an “Upper bound” row to the data set in preparation for building the quadrant lines.

Now, to add our error bars to the graph: click on the “average” data point in the chart, and then go to the Chart Design > Add Chart Element option in the ribbon, and select “Error Bars > More Error Bars Options...

Here’s where to find the “More Error Bars Options” item in the drop-down menus.

This will open up a Format Error Bars pane on the right side of the screen. (At this point, Excel defaults to having you format the vertical error bars; we’ll get to the horizontal bars in a minute.) You’ll see options for “Direction” (select Both), “End Style” (select No Cap), and “Error Amount” (select Custom). 

This screen shows the correct settings under “Format Error Bars.”

By selecting a Custom option under Error Amount, you’ll have to then click on the “Specify Value” button. In that popup window, tell Excel how far above and below your “average” point you want your vertical error bar to be drawn. That’s why we created that “upper bound” row of data earlier. For the “Positive Error Value” in the popup box, use the value from the “upper bound” row, and for the “Negative Error Value,” use the “AVERAGE” row value.

 

The vertical error bars are sized correctly, but the horizontal ones look like low-res wagon wheels. That’s what we’ll fix next.

 

Then, click directly on the horizontal error bars in the plot and follow the same steps to modify those bars, using the AVERAGE and Upper bound rows for the MONTHS OF EMPLOYMENT column. 

(Note: these bars might be really small by default; if you can’t click on them, then click anywhere in your graph, then go to the “Format” menu next to “Chart Design”, and in the drop down menu on the far left of the ribbon, select “Series ‘average’ X Error Bars”.)

 

Quadrant lines are drawn, but could still use some re-formatting.

 

Once that’s done, you can modify the format of the lines and the average point however you like. I prefer to push these reference lines visually toward the background, and to make the average point itself invisible by turning off its data marker. (Careful! Don’t delete the data marker entirely, because that will also make the error bars disappear.)

 

With some formatting changes, our error bars have become perfectly sized quadrant lines, faded nicely into the background.

 

By drawing quadrants, I can see right away that we have two employees from the Japan office in the top left who have been above average on performance frequently, even though they’re among the newest 50% of program participants. Conversely, the yellow mark in the bottom right shows me that one participant has been here for a long time, but is far below the 50th percentile in terms of above-average performance periods. I don’t know if this is an interesting story, or the most important insight, but simply drawing the quadrants on the graph makes it easier for me to analyze, and talk about, some of the data points within it.

How to add trendlines to a scatter plot

Unlike drawing quadrant lines, drawing trendlines in Excel is fairly straightforward. Perhaps it’s TOO straightforward, actually; it’s awfully easy to put a trendline in a graph that doesn’t have any particular basis in reality, or doesn’t describe the actual trend in a way that would be useful for setting future expectations. Trendlines can also be attention-grabbing, adding visual clutter to a graph type that is already challenging for some viewers to accurately interpret. Typically, if there’s a trend worth highlighting, it’s already visible in the data without drawing an additional line on the graph itself.

Those concerns aside, a trendline can provide helpful context in certain situations. The graph we’re looking at, for instance, may benefit from some visual guidance showing us how well we should expect our pilot program cadre to perform, based on their experience at the company. A linear trendline could use historical data to provide some of that insight.

To create a trendline that uses all of our data points, we’ll create a fifth data series in addition to “United States,” “Canada,”  “Japan,” and “average.” Right-click on the chart and choose “Select Data…”

In the popup window, add another data series that uses columns E and F (our original single-series data) as our X and Y values. I named it “for creating trendline.”

Add another data series to the view so that Excel can use it to calculate a trendline across all data points.

This data series will be plotted on top of all the existing series, using Excel’s default settings for size, color, and so on. In the view below, I’ve made the “for creating timeline” series markers larger and purple just so you can see them more clearly.

 

This purple series will be the basis for our trendline; eventually we’ll make the markers invisible.

 

Believe it or not, we’re almost done. All you have to do is right-click on any one of these markers and then select “Add Trendline…” Excel will automatically add a linear trendline to your chart, and show you the Format Trendline menu on the right hand side of the screen. 

The red line in the view is the trendline Excel draws as its default option.

If you know what you’re doing with trendlines, here’s where you’d make your preferred adjustments; for our purposes today, all I’m going to do is mute the color of the line, and then turn off the fill and border settings for my “for creating trendline” data markers.

 

Using color, size, and line style to de-emphasize the default linear trendline.

 

How to add data labels to a scatter plot

It’s great that we’ve put our data on the graph…but what does that data actually represent? If we care about more than just the overall distribution of data points, we should add data labels to some, if not all, of our points. 

To add data labels to a scatter plot, just right-click on any point in the data series you want to add labels to, and then select “Add Data Labels…” Excel will open up the “Format Data Labels” pane and apply its default settings, which are to show the current Y value as the label. (It will turn on “Show Leader Lines,” which I usually turn off.) 

 

By default, Excel will add the Y values as the data label, and will align it to the right of each data point.

 

Using only the checkboxes in the “Format Data Labels” pane, you can get your labels to include other values or combinations of values, including the Series name, the X value, and/or the Y value. In many cases, though, we’ll want to customize our data labels to show other information than just those fields. For that, we’ll use the “Value From Cells” option.

How to customize labels in a scatter plot

You can tell Excel to use any cell or series of cells as its source for Data Label information. For instance, let’s label each of the markers in this chart with the first initial and last name of the pilot program participants. We can add a “Data Label” column to the right of the data table and use the CONCATENATE formula to create a cell in each row with this information.

You can use formulas to fill in your new “Date Label” column, or you can just manually add the data. In this version, we used the CONCATENATE formula to combine the first initial from the FIRST NAME column, a period, and the full LAST NAME column value for every row.

Then, click on a data label in the existing scatter plot. That brings up the “Format Data Labels” pane, where you can change the settings in the “Label Contains” section to use “Value From Cells”. Clicking on the “Select Range…” button brings up a popup window asking what cells to use for the data label information, and you can highlight all the cells in that brand-new column.

Select any specific range of cells to use as the source for your custom labels.

When you click OK, every data marker will be labeled..

 

All marks are now labeled with our custom values.

 

To reduce the amount of text in the graph, let’s only keep labels for those participants who were far above or far below the trendline—in other words, overachievers and underachievers.

Double-click on one of the labels you want to remove, and then either delete it or set its Text Fill to “No Fill.” Do the same to several other unwanted labels, until you’re left with just high- and low-performing participants. (You can also click-and-drag labels slightly out of their default positions, to make labels with close-by neighbors easier to read.)

 

Only high and low performers and lableled in this view.

 


How to put it all together

With the techniques described above, you should be well prepared to create a scatter plot in Excel that can be designed and formatted to support the specific story you intend to tell. Use the power pairing of color and words to help your audience see what you want them to notice and learn from your scatter plot. Feel free to include annotations inside or near the chart area itself, as well as large headline or takeaway text above your visualization to emphasize the key messages.

The final version of our visualization includes a takeaway title, annotations, a reference line, and custom labels on selected elements in our mult-series scatter plot.

Now, you’re ready to create your own scatter plots in Excel! The data and the graphs you see in this post (along with some bonus content) can be downloaded here.

Check out our chart guide for more about scatter plots and other graph types; subscribe to our YouTube channel for Excel tutorials and lots of other videos; and follow our blog to get the latest in how-to instructions, graph makeovers, and other tips and tricks for communicating more effectively with data.