Correlational Research

Alex Gray
3 min read · Jun 30, 2020

When analyzing data, there are a number of research methods you can use. Today I’m going to share one of them: correlational research. I will explain what correlation is using real-world examples from COVID-19 data. Hopefully, this gives you ideas for how to use it in your own practice.

Correlational Research

As said by Adi Bhat in his article, Correlational Research: Definition with Example, “Correlational research is a type of non-experimental research method, in which a researcher measures two variables, understands and assesses the statistical relationship between them with no influence from any extraneous variable.”

Correlation is measured using the correlation coefficient, written “r”. It ranges from -1 to +1. A perfect positive correlation is exactly +1, meaning that when one variable moves up or down, the other moves in the same direction in exact proportion. A perfect negative correlation is exactly -1, meaning that when one variable goes up or down, the other moves in the opposite direction. Lastly, a value of 0 means there is no linear relationship at all, although the variables may still be related in a non-linear way.
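To make the three cases concrete, here is a minimal sketch that computes Pearson’s r by hand. The function name `pearson_r` and the small number lists are my own illustrations, not part of any dataset used in this article.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each point from its mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of products of deviations (covariance, up to a constant factor)
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

# Illustrative values only
x = [1, 2, 3, 4, 5]
same_direction = [2, 4, 6, 8, 10]  # rises exactly in step with x
opposite = [10, 8, 6, 4, 2]        # falls exactly as x rises
curved = [4, 1, 0, 1, 4]           # (x - 3)^2: related to x, but not linearly

r_positive = pearson_r(x, same_direction)
r_negative = pearson_r(x, opposite)
r_none = pearson_r(x, curved)
print(r_positive, r_negative, r_none)
```

The last case is worth a second look: `curved` is completely determined by `x`, yet r comes out at 0 because the relationship is not a straight line.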

Here, we have a heatmap of United States coronavirus data. We can see a strong positive correlation between confirmed cases and deaths in the United States. When pulling public records from all states and comparing deaths with confirmed cases, the correlation coefficient is high. From this correlation, we can conclude that as confirmed cases rise, deaths tend to rise as well. This is just one of many cases where correlation acts as a useful tool for data analysis.
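A heatmap like this is a colored view of a correlation matrix, where every cell holds the r value for one pair of columns. Below is a rough sketch of how such a matrix can be built. The column names and numbers are made up for illustration and are not real COVID-19 figures; `correlation_matrix` is a hypothetical helper, not the code used to produce the heatmap above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def correlation_matrix(columns):
    """Return {name_a: {name_b: r}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

# Made-up toy values, one entry per hypothetical state. NOT real data.
toy_data = {
    "confirmed": [100, 250, 400, 800, 1500],
    "deaths": [2, 6, 9, 20, 35],
    "other": [7, 3, 9, 4, 6],  # an arbitrary unrelated column
}

matrix = correlation_matrix(toy_data)

# Print the matrix as a small table; a heatmap colors these same cells.
names = list(toy_data)
print(" " * 10 + "".join(f"{n:>11}" for n in names))
for a in names:
    print(f"{a:<10}" + "".join(f"{matrix[a][b]:>11.2f}" for b in names))
```

The diagonal is always 1 because every column is perfectly correlated with itself, and the matrix is symmetric, which is why heatmaps often show only one triangle.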

When discussing correlation, the phrase “correlation does not imply causation” often comes up. In the case above, we can make an educated guess that the relationship is real, since there is a plausible mechanism linking confirmed cases to deaths. In other cases, however, variables with no verifiable connection can still show a correlation that arouses curiosity.

One example of this is a graph made by Mark Wilson:

This graph, upon immediate inspection, may lead you to believe that the “Divorce rate in Maine” and the “Per capita consumption of margarine” are somehow connected. The two lines do move together, but common sense and a more in-depth analysis suggest there is no direct connection between them. This example illustrates that it is important not to assume that if two things happen in tandem, one must cause the other. You have to know when to apply some skepticism to the data you are analyzing.
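One common reason unrelated series look correlated is that both simply trend in the same direction over time. The sketch below uses two invented series (not the divorce or margarine figures) that both decline. Their raw values are strongly correlated, but their year-to-year changes are not. Comparing changes instead of levels is just one possible sanity check, offered here as an assumption on my part rather than the analysis behind the graph above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def changes(series):
    """Step-to-step differences: series[i + 1] - series[i]."""
    return [b - a for a, b in zip(series, series[1:])]

# Invented series that both happen to decline over five periods
series_a = [20, 19, 16, 15, 12]
series_b = [20, 19, 18, 15, 12]

r_levels = pearson_r(series_a, series_b)
r_changes = pearson_r(changes(series_a), changes(series_b))

print(f"r of raw values:        {r_levels:.2f}")
print(f"r of period-to-period changes: {r_changes:.2f}")
```

Here the shared downward trend alone produces a high r on the raw values, while the period-to-period movements show no linear relationship.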

Safety Procedures/Conclusion

I hope these examples helped illustrate something of value about correlation. Perhaps they will help you decide when to take a second look at your data. To finish this article, I’d like to leave you with some things you can do to protect yourself and others, as recommended by the CDC during the COVID-19 pandemic. Avoid crowds of more than 10 people. Wash your hands for at least 20 seconds or use an alcohol-based hand sanitizer. Avoid touching your nose, eyes, and mouth with unwashed hands.

Sources

  1. Adi Bhat, “Correlational Research: Definition with Example”: https://www.questionpro.com/blog/correlational-research/
  2. Kaggle dataset (CORD-19 research challenge): https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
