Healthcare Analytics Challenge: Summary and Week 2

Greg Nelson

Welcome to week 2 of the Healthcare Analytics Challenge—our ongoing series designed to engage you in critical thinking and problem solving with new activities and insight from our healthcare analytics experts.

The first challenge was to look at an infographic published in a popular car magazine, provide some “thot-ful” commentary on the piece, and think of ways in which the graphic could be improved. Today, we’re summarizing last week’s challenge and looking ahead to week 2.

This challenge had the reader consider three primary areas when evaluating a graphical representation of data:

  • Purpose and objectives
  • Interpretability, chart selection and design choices
  • Improvements and recommendations

Understanding the Purpose of a Data Visualization

Data visualizations can play a role in self-guided, experiential learning

We started the challenge by asking about the intended use. Visualizations can have many goals (being informational, entertaining or thot-provoking, supporting a decision, etc.). In the case of Nicolas Rapp’s graphic for Car and Driver magazine, he provided a lot of “facts and figures.” While it is true that a reader of Car and Driver may want to use this to support a decision (e.g., buy a really fast car with few insurance claims), we can only assume the purpose of the piece was informational: to let people explore the data on their own by studying the graphics, just as a child might explore the realities of money through experiences. See for example

Creating a Shared Understanding

We use data and the power of visualization as a communication tool that bridges understanding and action (or transformation). For informational graphics, we want to move our audience from ignorance (a lack of knowledge or awareness) to a state where they remember the information, comprehend it, and begin to use what they have learned in new ways.

In the diagram below, we depict this as a cycle that shows we move from awareness to action.

The journey to analytic transformation

Data Visualization and how it supports audience transformation

So if data visualizations are used to create a shared understanding, do you think Week 1’s challenge infographic was effective?

As we evaluated Rapp’s infographic, we asked you what your favorite chart was and why. This process helps you think critically about whether you have achieved your goal of transformation: are the visuals helping us do that, given what we know about our audience, where they are, and what action is desired? To my earlier comment that this infographic was informational, I would add that the author wanted to spark self-discovery with the data, letting you make your own associations among the data presented.

Visual self-discovery as a learning modality can be stronger than just outlining the facts and figures; presenting the information in various ways is what drives that self-discovery.

Evaluating the selection of the charts

While responses to the challenge varied, my favorite chart was the one that depicted recalls.

Automobile Recalls (Car and Driver, 2017)

In addition to using a novel graphical representation (the Sankey diagram), it made me think (and I like things that make me think!). The graphic allowed me to go on a visual journey of recalls that included rankings, the dollar value of those recalls, and how they related to the types of risk around air bag impact. While it was an interesting chart, it also raised questions about the unit of analysis. What I mean by this is that the ten biggest recalls were depicted with their values, and then dissected by air bag risk type, without clarifying whether the left side described only air-bag-related recalls or it just so happened that all of the big recalls in 2016 were air-bag related.

In the narrative below the chart, they highlight the smallest recalls, none of which relate to air bags. It would be useful to ensure complete interpretability so that we don’t leave our audience asking those fundamental questions. In practice, once executives or other leaders begin to question the data, we tend to lose our audience quickly!

The most straightforward graphic was the sideways bar chart (I refer to this as a tornado chart). This chart is perhaps the easiest to understand, as most people have been exposed to bar charts at some point in their careers. One thing that I do appreciate about the construction of this chart is that the artist didn’t try to compensate for the varied scales (a common cheat!). That is, in the top chart we see the cars with the highest insurance claims dollars per vehicle, which range from $214 to $397 (for the Audi A8 and Bentley respectively), while the “normal people” cars range from $38 to $51. I would love to see an adjustment for the cost of the car, or a comparison of normal people cars versus crazy expensive cars.
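To make the idea of a cost adjustment concrete, here is a minimal sketch that expresses claims as dollars per $1,000 of vehicle price. The claims figures are the ones quoted above; the vehicle prices are made-up placeholders (the infographic does not supply them), and the function name is my own, so treat the printed numbers as illustration only.

```python
def claims_per_1000_dollars(claims_per_vehicle: float, vehicle_price: float) -> float:
    """Insurance claims dollars per $1,000 of vehicle price."""
    if vehicle_price <= 0:
        raise ValueError("vehicle_price must be positive")
    return claims_per_vehicle / vehicle_price * 1000.0


# (label, claims dollars per vehicle, vehicle price)
# Claims values come from the chart as described in the text.
# Prices are HYPOTHETICAL placeholders, not real list prices.
cars = [
    ("Bentley", 397.0, 200_000.0),
    ("Audi A8", 214.0, 80_000.0),
    ("'Normal people' car", 51.0, 25_000.0),
]

for label, claims, price in cars:
    adjusted = claims_per_1000_dollars(claims, price)
    print(f"{label}: ${claims:.0f} raw, ${adjusted:.2f} per $1,000 of price")
```

With real prices in place of the placeholders, a ranking on the adjusted value could look quite different from the raw ranking, which is the point of asking for the adjustment.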

Effectiveness of the visualization in support of the narrative

The narrative underlying the analytics process is key to guiding the audience through the transformation that we discussed above.

Narrative is the essence of all storytelling. It is an essential part of stories whether they are fiction intended to entertain or nonfiction to communicate and share facts, events, ideas, beliefs, and opinions about business, society, science, technology, etc. The quality of a story is judged by the quality of the narrative—the connective tissue that binds together all the elements of a well told story

Source: TDWI “Ten Mistakes to Avoid In Data Storytelling” by David Wells.

In Rapp’s infographic, I am not certain that he was effective in relating each fact back to an overall narrative. In part, that may reflect my limited understanding of what he was trying to portray in this work. Also, the scope of the data ranged from acceleration to noise levels to fuel economy to relative rankings of insurance claims to sales to recalls. There doesn’t seem to be an underlying narrative that ties all of these things together, other than that this is 2016 data about cars.

As we craft our data story, it is critical to ensure that the narrative is clear, that it provides the binding for our audience, and that it helps move them along the transformation journey.

Interpretability and Design

While we discussed some of the chart selections above in reflecting on the value that they bring to the overall purpose and narrative of the data product, it is worth noting that design and “voice” can help to connect the data to your audience. For example, the use of color, iconography, fonts and choice of vernacular is critically important as we engage our readers. The choice of light blue and orange was interesting in that Rapp used orange to depict the “bad” throughout (e.g., worst, languishers, losers) and light blue to depict the best. I would guess that the “average” reader is male and 40 to 60 years old (but that is only a guess). Think about the color choices in the infographic and whether they were appropriate for the target audience. Did they consider the prevalence of color blindness in men (blue and yellow; red and green)? What about color contrast (the light blue provides little contrast and can be difficult to read)?
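The contrast question can be checked with a little arithmetic. The sketch below uses the contrast-ratio formula from the WCAG 2.x accessibility guidelines, which recommend a ratio of at least 4.5:1 for normal-size text. I don’t know the exact colors Rapp used, so the RGB values here are stand-ins chosen for illustration, and the white background is also an assumption.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple) -> float:
    """Relative luminance of an (R, G, B) color: 0.0 is black, 1.0 is white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: tuple, color_b: tuple) -> float:
    """WCAG contrast ratio between two colors, from 1.0 (none) to 21.0 (max)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


WHITE = (255, 255, 255)        # assumed page background
LIGHT_BLUE = (173, 216, 230)   # placeholder, NOT sampled from the infographic
DARK_NAVY = (0, 51, 102)       # placeholder for a darker alternative

AA_NORMAL_TEXT = 4.5  # WCAG AA minimum for normal-size text

for name, color in [("light blue", LIGHT_BLUE), ("dark navy", DARK_NAVY)]:
    ratio = contrast_ratio(color, WHITE)
    verdict = "passes" if ratio >= AA_NORMAL_TEXT else "fails"
    print(f"{name} on white: {ratio:.2f}:1 ({verdict} the 4.5:1 guideline)")
```

A check like this says nothing about color blindness, which needs a separate simulation, but it is a quick way to catch pale text or thin marks that will be hard to read in print.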

Similarly, novel charts such as 3D cubes (Road Holding) and squares rotated on their side (sales gainers and losers) may be visually cute, but they are definitely not universally accepted data visualization techniques that help promote understanding (and are by definition a great example of what Edward Tufte calls “chartjunk”).

Further, it has been demonstrated that people don’t perform well when comparing the relative sizes of circles (computing the relative size or area of a shape). See for example these two articles:

Chart confusion

If one of our motivations is to engage users in the process of self-discovery with data, then the easiest way to fail is to create a graphic that is confusing and can be interpreted in multiple ways. In the Rapp infographic, the first chart in the series (for us Westerners, this is the upper left) was confusing to me (and evidently to many of you!). I get that the four worst-performing cars for acceleration included the 2016 Chevy Colorado (crew cab diesel) and the Chevy Camaro. Wait, what? That can’t be true! Are you saying that the Camaro was worse at acceleration than a diesel truck? Furthermore, the 2016 Nissan Sentra SL must have been appalling, since it was ostensibly listed twice. The asterisk points to a footnote indicating that there must have been two tests, at 30 to 50 mph and again at 50 to 70 mph, but the graphic had angles (0, 10, 20, 30) and I had no idea what those meant. Upon further reflection, I see that there were, in fact, four tests (0 to 60, rolling start, top gear and ¼ mile).

The only reason that I spent the 5 minutes on this chart was to describe it here – otherwise, I would have given up long ago!

As some of you replied in your challenge responses, there are a lot of choices for visually depicting the same information that would have been far more useful. In this case, I think we can point to design as getting in the way of function.

Improvements and Recommendations

We have outlined several areas that deserve attention if we were to improve upon this infographic (or things to keep in mind as you do your own work). The selection of chart, the colors we use, the visual “tone” that we set and the linkage to an underlying narrative are all critical components of great visual design.

Of course, it is easy to sit on the sidelines and talk about what we could have done better. I am in awe of those visual artists who bridge the worlds of data and visual design.

It was intentional that we started with a visualization, because by starting with the end (or near end, as you will soon understand) we begin to structure the story that we want to tell with data. The danger of starting with the presentation of data is that sometimes we let the visuals drive the story without understanding the thing that we are trying to explain. I view this process of crafting the story as a back-and-forth process where we view our output as our goal (in this case, Nicolas Rapp wanted to create a print-based piece as the “deliverable”) and then go back to the data to truly understand the underlying phenomenon and see whether our theories hold true.

As an experimental social psychologist and an academic researcher by training, I was taught that we start with a theoretical framework, develop a testable hypothesis, conduct an empirical study and only then begin our data investigation.

But what do you do if you are faced with data first? The order of operations is very different, in that we start with the data investigation to begin to understand what’s hidden within. As we start to understand the data, we begin the process of “sense making” by proposing hypotheses that may explain what we are seeing. As we think through the potential hypotheses that explain what’s in the data, we frame them against what we know about the context of the data (people, processes, business, culture, etc.), and our empirical study becomes data-driven in that we begin to slice and dice to understand the conditions under which each hypothesis holds.

These very different processes for approaching data discovery are depicted below. The Alternative View is currently in vogue and is often termed “visual analytics,” whereas the Historical View is common to a traditional academic perspective.

This week’s challenge is intended to build on our first challenge by having you think through the implications of these two approaches to data discovery. Start the challenge below now!

Using visuals to tell a compelling story with your data can give your healthcare organization the competitive edge to succeed. Learn more by registering for the Healthcare Data Visualization Best Practices workshop today!