#SWDchallenge: upskill with UpSets
While our general advice in business communication is to keep your chart choices as simple and clear as possible, it’s always useful to stay aware of uncommon or specialized visualizations. You never know when you’ll have a particular need that is ideally addressed by an unusual chart type (or when you’ll have to explain how to read one to your colleagues).
One such graph innovation came up during a recent office hour (a benefit available to premium members of our SWD community)—the UpSet plot. This visual combines elements of bar charts, dot plots, and Venn diagrams to showcase relationships among items belonging to one or more of several related sets.
Here’s an example from Jake Conway and Nils Gehlenborg, the authors of an UpSet package built for use in R, that shows the number of movies in a dataset that can be classified into one or more of nine distinct genres.
Focusing first on the dots themselves: that part of the UpSet plot defines, for each column, what specific intersection of possible genres is being counted in the vertical bar above it.
In the first column, only Drama is lit up, so the vertical bar shows the 950 movies in the database that can ONLY be classified as dramas.
The fourth column, however, shows both Comedy and Drama lit, with those two marks connected by a solid black line. The bar above it, therefore, tells us that 171 movies in the dataset are classified as both Comedy and Drama. In set terminology, this is the intersection of Comedy and Drama, not the union.
If we keep moving rightward, eventually we find an intersection of three genres: Comedy, Drama, and Romance. Thirty-four movies fit that bill.
Imagine trying to visualize nine different genres and all the possible intersections of them in a Venn diagram, or an Euler diagram. It would be so geometrically complex that it may well be impossible. But here, we have a crisp, orderly, compact way of communicating that information.
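To make the counting logic concrete, here is a minimal Python sketch of what each column in the plot represents. The movie titles and genre assignments are made-up toy data, not the dataset from the example above, and the function names are purely illustrative. It groups items by their exact combination of genres, then prints a rough, transposed text version of the dot matrix (one line per combination rather than one column).

```python
from collections import Counter

# Hypothetical toy data: each movie mapped to the set of genres it belongs to.
movies = {
    "Movie A": {"Drama"},
    "Movie B": {"Drama"},
    "Movie C": {"Comedy"},
    "Movie D": {"Comedy", "Drama"},
    "Movie E": {"Comedy", "Drama", "Romance"},
    "Movie F": {"Drama", "Romance"},
}


def exclusive_intersections(memberships):
    """Count items per exact combination of sets (one UpSet column each)."""
    return Counter(frozenset(genres) for genres in memberships.values())


def render(counts, set_names):
    """Return a rough text version of the plot, one line per combination."""
    lines = []
    # Largest combinations first; ties broken alphabetically for stable output.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for combo, n in ordered:
        # "x" marks a lit dot, "-" an unlit one.
        dots = " ".join("x" if name in combo else "-" for name in set_names)
        lines.append(f"{dots}  {n}")
    return lines


counts = exclusive_intersections(movies)
genres = ["Comedy", "Drama", "Romance"]
for line in render(counts, genres):
    print(line)
```

Because every movie lands in exactly one combination, the counts add up to the total number of movies. That is why the Drama-only bar excludes movies that are also comedies or romances.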
The challenge
It’s a new year, so try a new graph! Using real-world or training data as you deem appropriate, create and share an UpSet plot of your own. In keeping with traditional SWD guidance, however, please also try to find some meaningful insight in your analysis and make that clear, with words and design choices, in your challenge submission.
As an esoteric chart type, the UpSet plot does not come standard in many business software applications, so learning how to create one is an additional aspect of this challenge.
The R package mentioned above can help you, as can the UpSetR Shiny app from the same creators; other packages I’ve seen recommended include ComplexUpset and ggupset.
There are Python libraries for it as well (I can’t vouch for their relative plusses and minuses, but upsetplot probably does what it says on the label).
With some effort, you can create one in Tableau, as this video from Sean Miller demonstrates.
Excel will probably take even more handcrafting; I’d suggest creating three separate graphs (a connected dot plot for the switchboard-looking part, plus two bar charts), and carefully placing them together on one sheet, slide, or canvas.
Of course, you can always go the analog route and simply hand-draw or illustrate part or all of your UpSet plot…it doesn’t matter how you make it, as long as the audience sees what you want them to see. (It’s true for professional magicians, and it’s true for data visualizers just the same.)
Submit your UpSet plot and any commentary here in the SWD community by 5pm PT on Wednesday, January 31, 2024.
Related resources
There’s nothing wrong with checking out the UpSet Plot entry on Wikipedia to get a fuller understanding of its history and potential applications.
If you prefer not to rely on anonymous sources, here’s a nice explainer from Kieran Healy on UpSet plots.