Is Your Data Highly Skewed? Try Visualizing It with a Log Scale

Visualizing highly skewed data is tricky because a few very large values dwarf everything else on the chart. To fix that, try using a log scale.

Here’s an example of a skewed data set, the total votes for each presidential candidate in the 2020 election:

As you can see, once you get past the two main presidential candidates, you can’t see the bars for anybody else.

To solve that, change the axis to a log scale. A log scale compresses the range: instead of adding a fixed amount at each step, the axis multiplies by a constant factor, usually a power of 10. So the gridlines on a log chart might go from 1 to 100 to 10,000 to 1,000,000, with each step 100 times the one before it. Each of those intervals gets the same amount of space on the axis, which keeps the small values visible next to the large ones.
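To put rough numbers on that compression, here is a small sketch in plain Python. The vote totals are made-up placeholders chosen only to show the skew (they are not the real 2020 counts), and the helper names `linear_share` and `log_share` are hypothetical. It compares how tall each bar is relative to the tallest one on a linear axis versus a log10 axis that starts at 1.

```python
import math

# Hypothetical vote totals, chosen only to illustrate skew (NOT real election data).
votes = {
    "Candidate A": 80_000_000,
    "Candidate B": 1_000_000,
    "Candidate C": 50_000,
}

largest = max(votes.values())


def linear_share(value, top):
    """Bar height as a fraction of the tallest bar on a linear axis."""
    return value / top


def log_share(value, top):
    """Bar height as a fraction of the tallest bar on a log10 axis starting at 1."""
    return math.log10(value) / math.log10(top)


for name, count in votes.items():
    print(
        f"{name}: linear {linear_share(count, largest):.4f}, "
        f"log {log_share(count, largest):.4f}"
    )
```

With these placeholder numbers, the smallest bar is less than 0.1% of the tallest on the linear axis but more than half its height on the log axis, which is why the minor candidates suddenly show up.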

The data above was visualized using Plotly and aggregated using Kaggle. You can check out the notebook by clicking on the link.

Thanks for reading!

