Data visualization

Data visualization is the graphical representation of information and data. Visualizations such as charts, graphs and maps make data more accessible to non-technical audiences and facilitate a deeper understanding of patterns and trends within the data. They also contribute to improved data quality by highlighting outliers, missing values or unusual distributions. However, visualizations can confuse readers if the message is not clear, or mislead them if the data are inaccurately represented. A key consideration is therefore whether the selected visualization lends itself to the message being conveyed. For example:

Bar charts are useful for comparing frequencies or percentages between different groups. They are the simplest type of chart to draw and read, and groups are easier to compare when the bars are ordered by size.

Figure 1: Bar chart

(source: Eurostat, 2023)
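
The advice to order bars by size can be sketched in a few lines of Python. This is a minimal illustration using only the standard library: the function name `order_bars` and the group percentages are hypothetical, and the ordered pairs could be passed to whichever plotting tool is in use.

```python
def order_bars(values_by_group, descending=True):
    """Return (group, value) pairs sorted by value, ready to draw in size order."""
    return sorted(values_by_group.items(),
                  key=lambda item: item[1],
                  reverse=descending)


# Hypothetical percentages per group, for illustration only
shares = {"Group A": 12.5, "Group B": 30.1, "Group C": 7.4, "Group D": 21.0}

ordered = order_bars(shares)
for group, value in ordered:
    # Crude text rendering: one '#' for every two percentage points
    print(f"{group:<8} {'#' * round(value / 2)} {value}")
```

Sorting before plotting, rather than relying on the order in which the data arrive, keeps the chart readable when groups are added or values change.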

Stacked bar charts are useful for comparing the segments of different groups. However, comparison may be difficult if there are too many segments in each stack or if the segments are quite similar in size.

Figure 2: Stacked bar chart

(source: Haner and Lopez, 2023)

Line charts are useful for showing changes over time and can be adjusted to better communicate a specific message. However, it is recommended to indicate when the y axis (vertical axis) does not start at zero. When presenting multiple line charts, use similar scales, i.e., set the same intervals across all charts, to facilitate accurate comparison between different datasets.

Figure 3: Line chart

(source: IOM, 2022)
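
The two recommendations for line charts (a common scale across charts, and a note when the vertical axis does not start at zero) can be sketched as follows. This is one possible approach, not a prescribed method: the helper names `shared_y_limits` and `axis_note` and the two series are hypothetical.

```python
def shared_y_limits(series_list, start_at_zero=True):
    """Compute one (ymin, ymax) pair to apply to every line chart being compared."""
    all_values = [value for series in series_list for value in series]
    lo, hi = min(all_values), max(all_values)
    # Anchor the axis at zero when requested and when no value is negative
    if start_at_zero and lo >= 0:
        lo = 0
    return lo, hi


def axis_note(ymin):
    """Return a caption note when the vertical axis is truncated, else ''."""
    return "" if ymin == 0 else f"Note: y axis starts at {ymin:g}, not zero."


# Hypothetical yearly values for two datasets shown in separate charts
series_a = [120, 135, 150]
series_b = [40, 42, 55]

limits = shared_y_limits([series_a, series_b])
truncated = shared_y_limits([series_a, series_b], start_at_zero=False)
print(limits, axis_note(limits[0]))
print(truncated, axis_note(truncated[0]))
```

Applying the same limits to every chart means that a steeper line always reflects a larger change in the data, not a difference in scaling.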

Pie charts are useful for showing part-to-whole composition. Their use should be limited to situations where the primary objective is to highlight the relative importance of one or two categories within the dataset. To optimize clarity and comprehension, the number of slices should be limited to six and the slices should be ordered from largest to smallest and coloured in distinct, vivid colours.

Figure 4: Pie chart

(source: Buterin et al., 2022)
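
The slice limit and ordering for pie charts can be prepared before plotting. The sketch below is illustrative only: grouping the smallest categories into a single "Other" slice is an assumption made here as one way to respect the six-slice limit, and the function name `pie_slices` and the counts are hypothetical.

```python
def pie_slices(values_by_category, max_slices=6, other_label="Other"):
    """Order categories from largest to smallest and cap the number of slices."""
    ordered = sorted(values_by_category.items(),
                     key=lambda item: item[1],
                     reverse=True)
    if len(ordered) <= max_slices:
        return ordered
    # Keep the largest categories and merge the rest into one final slice
    kept = ordered[:max_slices - 1]
    remainder = sum(value for _, value in ordered[max_slices - 1:])
    return kept + [(other_label, remainder)]


# Hypothetical counts for eight categories, for illustration only
counts = {"A": 40, "B": 25, "C": 12, "D": 9, "E": 6, "F": 4, "G": 3, "H": 1}

slices = pie_slices(counts)
total = sum(value for _, value in slices)
for label, value in slices:
    print(f"{label:<6} {value / total:.0%}")
```

In this sketch the merged slice is always placed last, even when it is larger than the smallest named slice, so that it is not mistaken for a category of its own.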

Stacked area charts are useful for showing changes in part-to-whole composition over time. However, these charts should be avoided when the objective is to show fluctuations within each category over time or to demonstrate how one category overtakes another (which would be better illustrated using a line chart).

Figure 5: Stacked area chart

(source: Bratsberg et al., 2017)

Scatter plots are useful for showing a relationship between two (or more) variables. However, a visual relationship does not imply statistical significance, which can be established through correlation analysis. If there is a significant linear correlation between the variables, consider including a trend line to enhance understanding. Furthermore, while a relationship may be observed, it does not imply causation, as there may be other confounding factors.

Figure 6: Scatter plot

(source: Rodriguez-Sanchez, 2022)

Proportional symbol maps use symbols that vary in size to represent differences in frequencies or percentages, usually between geographic units. Providing an intuitive visual representation of spatial patterns and disparities, proportional symbol maps can be a useful tool to compare migrant stocks between cities, states or countries.

Figure 7: Proportional symbol map

(source: IOM, Migration Data Portal)

Spider charts are useful for comparing how one or more groups score across multiple dimensions or variables. These can be employed, for example, to compare how countries score on different migration governance policies or how different migrant groups score on the sub-components of migrant integration.

Figure 8: Spider chart

(source: Yagmur and van de Vijver, 2022)

Alluvial diagrams are flow charts showing changes in the structure of a network over time and across locations. These diagrams are particularly useful to illustrate complex patterns of movement or transition between various entities. For example, they can effectively visualize migration corridors by showing movements from one set of locations to another set of locations over several time points.

Figure 9: Alluvial diagram

(source: Brooks, 2021)

Chord diagrams are also useful for visualizing migration corridors as they show flows between multiple locations, with the size of the arc corresponding to the importance of the flow. The group order around the circle is key to minimizing the number of arc crossings, which enhances readability.

Figure 10: Chord diagram

(source: IOM, 2020b)

Checklist for developing good data visualizations (UNECE, 2009)

- Avoid visualizations if…
  - Data are dispersed (too many outliers)
  - There are too few or too many values
  - There is not enough variation
- Pick the appropriate visualization depending on the intended message
- Do not rely on colour
- Ensure consistency across visualizations, i.e., similar scales, colours and symbols
- Reduce the possibility of misinterpretation by piloting the visual among colleagues
- Include a legend identifying symbols, patterns or colours
- Label the x and y axes and their measurement units
- Add gridlines to bar and line charts to help readers compare values between groups
- Include a short and concise chart title
- Mention the source of the data (especially if collected by another entity)
- Sort the data from smallest to largest values to facilitate comparison between groups
- Avoid unnecessary graphic features, such as 3-dimensional charts
