Why does mixing apples and oranges work for juice but not for a chart?

An illustration of orange and apple.
Created using Freepik graphic

If data cleaning is the most time-consuming part of the data analyst’s job, understanding the chart’s purpose is the most important one. Before we design the visuals, we must know what question the chart should answer and why it is relevant. We need to know ‘What’ and ‘Why’ to choose ‘How’ — which is selecting the most efficient chart for the job. Easy as it sounds, it’s way more complicated than one might think.

I’ll use the Eurostat chart on EU trade in goods with Russia as an example. At first glance, the graph seems fine: the line chart shows change over time (which is correct), the design is neat (also desirable), and the layout supports the analysis. But somehow, after looking at the chart, we are left thinking ‘wait, what?’. On closer examination, we notice that the chart doesn’t present the data optimally. Showing data on two different aggregation levels is like comparing apples to oranges. Or, I should say, apples to apple pies. Technically we can do this, but only with a lot of additional and unnecessary work. Let’s see how small formatting changes make the data coherent and the comparison effortless.

The combination of two charts showing the EU trade in goods with Russia in 2021 and 2022. On the top, the line chart shows the percentage share. On the bottom, the bar chart shows the trade balance in billion euros.
Chart recreated by the author. Source of the original infographic: EU trade in goods with Russia, 2021–2022, Eurostat

Elements that work in this chart

Graph layout

The main advantage of this graph is the layout. Placing one chart above the other makes the data easier to analyze. We can compare the change over time for each dimension independently or analyze both simultaneously.

Different perspectives

Another advantage is including both the percentage share and trade balance value. This enriches the analysis by providing a different angle. Thanks to the layout, we can analyze two dimensions at once, comparing percentage change with the value change.

Elements that don’t work in this chart

Merging two scales

This graph is an extreme variant of a dual-axis chart in which the two axes are… merged despite having different units. The top part of the axis is in percent, whereas the bottom part is in billions of euros. Such a solution is not only misleading but also unsustainable. If export exceeded import and the balance turned positive, the bar would appear on the other side of the zero line, in the part of the scale that currently holds a different unit. An alternative is switching to a single unit and showing import and export as values rather than shares. If implemented, it would look similar to the example published by the Federal Statistical Office of Germany (Destatis).

Comparing apples to oranges

There is another issue, harder to notice but just as impactful. The top chart compares the actual import and export shares over time. So we know what the shares were at any given point, but we must estimate the difference between them visually. Because we compare the distance between two changing lines, there is no common baseline, which makes the task challenging. The bottom part of the chart, by contrast, is already aggregated (shown as the difference between the export and import values) and can be read without any complex visual assessment. In effect, we are comparing apples to… apple pies.

Step-by-step improvements

Material created by the author. Incremental Improvement #27: Step-by-step

Remove dual axis

The first step should be separating the two charts. The easiest way is to change the location of the x-axis. Putting it between the charts splits them visually while emphasizing the dimension they share. The new placement also makes it easier to analyze the two charts independently because the axis sits close to the data in both of them.

Adjust chart type

Even though using a line chart to show import and export is technically not a mistake, there is a more insightful way to present this data. The U.S. Energy Information Administration provides an interesting alternative that was my inspiration. Instead of using two separate lines, we can switch to bars placed on both sides of the axis, with export above and import below the zero line. This change makes the logic behind the balance calculation self-explanatory: a negative balance means that more was imported than exported, which matches the chart above. Lastly, reducing the gap between the bars makes scanning easier, and using the same width simplifies the layout.
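
The arithmetic behind this layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly values are made-up placeholders in billion euros, not Eurostat figures, and the helper name diverging_bars is hypothetical. Export is kept positive (plotted above the zero line), import is negated (plotted below it), and the net balance is simply the sum of the two bars.

```python
# Hypothetical placeholder values in billion euros, NOT real Eurostat data.
exports = {"Jan": 7.0, "Feb": 6.5, "Mar": 4.0}
imports = {"Jan": 18.0, "Feb": 17.5, "Mar": 20.0}


def diverging_bars(exports, imports):
    """Build one row per period for a bar chart split by the zero line.

    Export stays positive (bar above zero), import is negated (bar below
    zero), and the net balance is the sum of both bars.
    """
    rows = []
    for period in exports:
        export_bar = exports[period]    # drawn above the zero line
        import_bar = -imports[period]   # drawn below the zero line
        rows.append({
            "period": period,
            "export_bar": export_bar,
            "import_bar": import_bar,
            "net": export_bar + import_bar,  # negative: more imported than exported
        })
    return rows


for row in diverging_bars(exports, imports):
    print(row["period"], row["export_bar"], row["import_bar"], row["net"])
```

With import and export expressed in the same unit and measured from the same zero line, the balance no longer has to be estimated from the gap between two lines. It can be read directly from how the bars compare.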

Calming down colors

Switching from a line chart to a bar chart and widening the bars increases the amount of data-ink and makes the data more prominent. We can calm the layout by reducing the color intensity: using a less saturated variant of pink and removing the color coding for import and export. The latter is already encoded by the position above or below the zero line. Therefore we can use neutral grey with slightly different brightness on each side to make the zero-line division more prominent. An additional benefit of this solution is that it leaves room for information about the net trade. Choosing two light shades of grey provides enough contrast with the added line.

Working on formatting

The cherry on the cake is removing the vertically oriented axis labels by incorporating them into the chart. This cleans up the layout and makes reading easier, as we no longer have to twist our necks. And lastly, we can create a visual hierarchy by making the scale less prominent (using a lighter shade), emphasizing the zero lines, and separating the two charts even further by coloring the x-axis.

A comparison of the before and after versions of the redesigned graph showing the EU trade in goods with Russia in 2021 and 2022. The redesign switches from different aggregation levels to a unified layout based on bar charts.
Created by the author. Incremental Improvement #27: Before and After