ColorArchive
Data Visualization Guide

Color in Data Visualization: Encoding Information Without Misleading Your Audience

Color in data visualization is not decoration — it is an encoding channel that carries quantitative and categorical information. Using color poorly in charts and graphs misleads readers, creates accessibility barriers, and undermines the credibility of the data being presented. Using color well produces visualizations that communicate at a glance, remain legible across display conditions, and serve the full range of users including those with color vision deficiencies.

Data Visualization, Accessibility, Color Theory
Key points
Data visualization uses three distinct types of color encoding, each with different design requirements. Categorical encoding uses color to distinguish unordered groups (product categories, countries, species) — the colors should be maximally distinguishable from each other while avoiding any implication of ordering. Sequential encoding uses color lightness or saturation to represent a continuous ordered variable (temperature, population density, probability) — the color progression should be perceptually uniform so that equal data differences produce equal perceived color differences. Diverging encoding represents variables with a meaningful midpoint (positive vs. negative deviation, comparison to average) — two hue sequences meet at a neutral midpoint color, showing direction as well as magnitude.
The most pervasive data visualization color mistake is applying categorical colors to ordered data, or sequential colors to categorical data. A bar chart showing five product categories should use five distinguishable, perceptually equal colors — not a five-step gradient from light to dark, which implies that the last category is 'more' than the first. A choropleth map showing population density should use a sequential color scale — not five arbitrary categorical colors, which obscures the underlying ordering. The choice of encoding type is a data structure decision, not an aesthetic one: it should be determined by whether the variable being represented is nominal (unordered), ordinal (ordered), or continuous (numeric).
Colorblind accessibility in data visualization requires designing for the roughly 8% of men and 0.5% of women with some form of color vision deficiency. Red-green deficiencies (the deutan and protan types, including deuteranopia and protanopia) are the most common forms. The most common visualization mistake for these users is using red for 'bad' and green for 'good' without any other distinguishing encoding. The solution is not to eliminate color but to add redundant encoding: shape, pattern, position, or text labels that convey the same information the color is encoding. A well-designed accessible visualization uses color as one of multiple encoding channels, not the sole encoding channel for critical information.
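The key points above can be sketched as two small helpers: one that picks the encoding type from the variable's measurement level, and one that pairs every category with a color, a shape, and a text label so that color is never the only channel. This is a minimal sketch; the function names, the marker names, and the two example colors are assumptions made for illustration, not part of any library.

```python
def choose_encoding(level, has_midpoint=False):
    """Pick a color encoding type from the variable's measurement level.

    level: "nominal", "ordinal", or "continuous"
    has_midpoint: True when values are read relative to a meaningful
                  center such as zero, an average, or a target.
    """
    if level == "nominal":
        return "categorical"
    if level in ("ordinal", "continuous"):
        return "diverging" if has_midpoint else "sequential"
    raise ValueError(f"unknown measurement level: {level!r}")


# Hypothetical marker names; any set of clearly different shapes works.
MARKERS = ["circle", "square", "triangle", "diamond", "cross"]


def redundant_marks(categories, colors):
    """Give each category a color plus a shape and a text label."""
    if len(categories) > min(len(colors), len(MARKERS)):
        raise ValueError("not enough distinct colors or markers")
    return {
        cat: {"color": color, "marker": marker, "label": str(cat)}
        for cat, color, marker in zip(categories, colors, MARKERS)
    }


if __name__ == "__main__":
    print(choose_encoding("nominal"))                        # product categories
    print(choose_encoding("continuous"))                     # population density
    print(choose_encoding("continuous", has_midpoint=True))  # deviation from average
    print(redundant_marks(["good", "bad"], ["#009E73", "#D55E00"]))
```

Because 'good' and 'bad' differ in shape and label as well as hue, a reader who cannot separate the two colors can still tell them apart.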

Choosing a categorical color palette for visualization

A good categorical color palette for data visualization has three properties. (1) Distinguishability: each color is visually distinct from every other color in the set. This is harder than it sounds for large sets (10+ categories) because the human eye can only reliably distinguish a limited number of colors simultaneously. Palettes of 8-12 colors are near the practical limit; beyond that, supplementary encoding (shape, pattern, texture) becomes necessary. (2) Perceptual equality: no color should appear more important or 'louder' than the others. Highly saturated colors appear more prominent than muted colors of the same hue, and on a white background bright yellows appear lighter and less prominent than dark blues at the same saturation level. A good categorical palette is balanced across perceived lightness and saturation. (3) Colorblind safety: the palette should remain distinguishable for deuteranopes and protanopes. Tools like Viz Palette let you preview a palette under simulated color vision deficiencies, and the ColorBrewer palette library can be filtered to colorblind-safe sets.
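The first two properties can be checked numerically. The sketch below converts each hex color to CIELAB (D65 white point), then reports the closest pair of colors and the spread in lightness. The helper names and the report layout are assumptions for illustration; the example palette is the widely published Okabe-Ito set, and no threshold is implied for what counts as 'distinct enough'.

```python
import math
from itertools import combinations


def hex_to_lab(hex_color):
    """Convert an sRGB hex string such as '#E69F00' to CIELAB (D65)."""
    h = hex_color.lstrip("#")
    rgb = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve to get linear-light values.
    r, g, b = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
               for c in rgb]
    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def audit_palette(palette):
    """Report the least distinguishable pair and the lightness spread."""
    labs = {c: hex_to_lab(c) for c in palette}

    def delta_e(pair):
        # CIE76 color difference: Euclidean distance in CIELAB.
        return math.dist(labs[pair[0]], labs[pair[1]])

    closest = min(combinations(palette, 2), key=delta_e)
    lightness = [lab[0] for lab in labs.values()]
    return {
        "closest_pair": closest,
        "min_delta_e": delta_e(closest),
        "lightness_range": max(lightness) - min(lightness),
    }


if __name__ == "__main__":
    okabe_ito = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
                 "#0072B2", "#D55E00", "#CC79A7"]
    print(audit_palette(okabe_ito))
```

A small minimum distance points at two colors that are likely to be confused; a large lightness range suggests that some categories will look 'louder' than others. CIE76 is used here only because it is short to write; it is a rough measure, and it says nothing about colorblind safety, which needs a separate simulation check.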

Sequential and diverging scales: designing for perceptual uniformity

Sequential and diverging color scales should be perceptually uniform: equal steps in data value should produce equal steps in perceived color. Perceptually non-uniform scales mislead because they make some data ranges appear to vary more than others for reasons of color perception rather than data patterns. The color models used in many legacy visualization tools (HSL and raw sRGB values) are not perceptually uniform. CIELAB and OKLCH are significantly more uniform. Palette libraries like ColorBrewer were designed with perceptual considerations in mind; the viridis, inferno, and magma scales in matplotlib were specifically engineered for perceptual uniformity and colorblind safety. When designing custom sequential scales, test by converting to grayscale: the grayscale version should show a smooth, gradual transition with no abrupt jumps or flat regions. Abrupt jumps or flat regions in grayscale indicate uneven lightness steps, which is a perceptual uniformity problem. The grayscale test covers lightness only, so it complements a check in a uniform color space rather than replacing it.
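Both checks described above can be approximated in a few lines. The sketch below converts a ramp to Oklab (the rectangular form of OKLCH), measures the distance between consecutive steps, and tests whether lightness moves in one direction only. The function names are assumptions for illustration, and the blue ramp is a hand-entered example rather than values read from any library at runtime.

```python
import math


def hex_to_oklab(hex_color):
    """Convert an sRGB hex string to Oklab (L, a, b)."""
    h = hex_color.lstrip("#")
    rgb = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    r, g, bl = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
                for c in rgb]
    # Linear sRGB -> cone-like LMS response, then cube-root compression.
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * bl
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * bl
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * bl
    l_, m_, s_ = l ** (1 / 3), m ** (1 / 3), s ** (1 / 3)
    return (
        0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
        1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
        0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_,
    )


def step_sizes(ramp):
    """Oklab distance between each pair of neighboring ramp colors."""
    labs = [hex_to_oklab(c) for c in ramp]
    return [math.dist(p, q) for p, q in zip(labs, labs[1:])]


def uniformity_ratio(ramp):
    """Largest step divided by smallest step; 1.0 means perfectly even."""
    steps = step_sizes(ramp)
    return math.inf if min(steps) == 0 else max(steps) / min(steps)


def lightness_is_monotonic(ramp):
    """Grayscale-style check: lightness only rises or only falls."""
    ls = [hex_to_oklab(c)[0] for c in ramp]
    diffs = [b - a for a, b in zip(ls, ls[1:])]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)


if __name__ == "__main__":
    blue_ramp = ["#F7FBFF", "#C6DBEF", "#6BAED6", "#2171B5", "#08306B"]
    print(step_sizes(blue_ramp))
    print(uniformity_ratio(blue_ramp))
    print(lightness_is_monotonic(blue_ramp))
```

A ratio close to 1 suggests evenly spaced steps; a ratio well above 1 means some data ranges will look as if they change faster than others. What counts as 'close enough' is a judgment call that this sketch does not settle.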

The problem with rainbow color scales

Rainbow color scales (running through the spectrum from red to violet) are the most commonly misused visualization palette. They are perceptually non-uniform: the transition between yellow and green appears sharper than the transition between blue and indigo, creating artificial visual features that do not correspond to real discontinuities in the data. They are not colorblind-safe. They imply no natural ordering (which direction is 'more'?). And they are hard to print accurately and to read under different display conditions. Despite these problems, rainbow scales persist because they are visually striking and readily create an impression of data richness. For any visualization where the goal is accurate data communication rather than visual spectacle, replace rainbow scales with perceptually uniform sequential or diverging scales. For visualizations where the goal is to show maximum detail in a continuous field (satellite imagery, topography, certain scientific data), rainbow scales can be effective if their limitations are understood.
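One way to see the ordering problem is to look at how brightness behaves along a rainbow. The sketch below builds a fully saturated hue sweep with the standard library's colorsys module, computes the relative luminance of each step, and counts how often brightness switches between rising and falling. The rainbow and lightness_reversals helpers, and the choice of hue 0.75 as the 'violet' end, are assumptions made for illustration.

```python
import colorsys


def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as floats in 0..1."""
    def linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def rainbow(n, hue_end=0.75):
    """n fully saturated colors from red (hue 0) toward violet (hue_end)."""
    return [colorsys.hsv_to_rgb(hue_end * i / (n - 1), 1.0, 1.0)
            for i in range(n)]


def lightness_reversals(colors):
    """Count how often luminance switches between rising and falling."""
    lum = [relative_luminance(c) for c in colors]
    diffs = [b - a for a, b in zip(lum, lum[1:]) if b != a]
    return sum(1 for d1, d2 in zip(diffs, diffs[1:]) if (d1 > 0) != (d2 > 0))


if __name__ == "__main__":
    sweep = rainbow(7)
    print([round(relative_luminance(c), 3) for c in sweep])
    print("reversals:", lightness_reversals(sweep))
```

A sequential scale should report zero reversals. The hue sweep reports several, which is one reason a rainbow gives the reader no consistent sense of which end means 'more'.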

Contextual color: reference lines, annotations, and emphasis

Beyond data encoding, visualizations use color contextually: to highlight specific data points, mark reference lines, annotate outliers, or show uncertainty ranges. These contextual uses require a different color strategy than the encoding palette. Reference lines (averages, targets, thresholds) should use neutral colors (medium gray) that do not compete with the data encoding but remain legible. Highlighted emphasis colors (to draw attention to a specific data series or point) should contrast with the general data palette — typically achieved by making all non-highlighted elements gray and using the brand's primary action color for the highlighted element. Uncertainty representation (confidence intervals, error bars, probability ranges) benefits from lower saturation and transparency, visually signaling 'less certain' compared to the primary data marks. A visualization with a clean color hierarchy — where encoding, reference, emphasis, and uncertainty use distinct and non-competing color treatments — communicates significantly more efficiently than one where all color uses compete at the same visual weight.
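The hierarchy described above can be written down as a small style table before any chart is drawn. The sketch below is one possible layout, not a prescribed one: the function name, the gray values, the default accent color (standing in for a brand's primary action color), and the 0.25 alpha for uncertainty are all hypothetical choices.

```python
# Hypothetical neutral tones; pick values that suit the chart background.
DEEMPHASIS_GRAY = "#BDBDBD"   # non-highlighted data series
REFERENCE_GRAY = "#757575"    # averages, targets, thresholds


def build_color_hierarchy(series_names, highlight, accent="#0072B2"):
    """Assign non-competing color treatments to each role in a chart.

    series_names: names of all data series
    highlight: the one series that should draw attention
    accent: emphasis color (assumed here; e.g. a brand action color)
    """
    if highlight not in series_names:
        raise ValueError(f"{highlight!r} is not one of the series")
    series = {
        name: {
            "color": accent if name == highlight else DEEMPHASIS_GRAY,
            "alpha": 1.0,
        }
        for name in series_names
    }
    return {
        "series": series,
        # Reference lines stay neutral so they do not compete with data.
        "reference_line": {"color": REFERENCE_GRAY, "alpha": 1.0,
                           "linestyle": "dashed"},
        # Uncertainty reuses the accent hue at low opacity to read as
        # 'less certain' than the solid data mark.
        "uncertainty_band": {"color": accent, "alpha": 0.25},
    }


if __name__ == "__main__":
    styles = build_color_hierarchy(["North", "South", "East"], "South")
    for role, style in styles.items():
        print(role, style)
```

Keeping the roles in one table makes it easy to confirm that only one element carries the accent color at full strength.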
