Towards Data Science

The Dangers of Deceptive Data Part 2–Base Proportions and Bad Statistics

  • The article follows up on a previous piece about deceptive data and explores statistical concepts that are commonly misread, leading to misinterpreted data.
  • It emphasizes the distinction between correlation and causation, highlighting the need for controlled trials to establish causation.
  • An example involving cigarette consumption and life expectancy illustrates the correlation vs. causation fallacy.
  • The base rate fallacy is discussed through a medical school example that demonstrates why base proportions matter in statistics.
  • The article mentions the Datasaurus Dozen dataset, showing how summary statistics alone may not reveal the full picture of the data.
  • It stresses the importance of understanding uncertainty and explains that error bars represent uncertainty in an estimate, not mistakes in the data.
  • The article cautions that misreading uncertainty can lead people to mistrust correct data, and it notes the societal implications of such misunderstandings.
  • Key takeaways include being cautious of deceptive data, considering base rates, exploring data beyond summary statistics, and interpreting uncertainty accurately.
  • Understanding these concepts equips individuals to approach data science problems effectively and make informed decisions.
  • The article concludes by encouraging readers to stay vigilant against deceptive data and misinformation.
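
The base rate point above can be made concrete with a small Bayes' theorem calculation. This is a minimal sketch, not the article's own medical school example: the function name `posterior_given_positive` and all the numbers (a 1% base rate, 99% sensitivity, 5% false positive rate) are hypothetical and chosen only for illustration.

```python
def posterior_given_positive(base_rate: float,
                             sensitivity: float,
                             false_positive_rate: float) -> float:
    """Return P(condition | positive result) using Bayes' theorem.

    base_rate           -- P(condition) in the whole population
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers (assumptions, not taken from the article).
p = posterior_given_positive(base_rate=0.01,
                             sensitivity=0.99,
                             false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")  # prints 0.167
```

Even with a test that looks highly accurate, only about one in six positive results corresponds to a real case under these assumed numbers, because the condition is rare to begin with. Ignoring that 1% base proportion is what produces the fallacy.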
