Data is a major driving force behind many strategic changes within modern businesses. As data becomes more available and abundant, it is becoming easier for companies to analyze it and act on the insights they derive. This concept, data-driven decision making, is not new. Companies have used these practices in several ways over the years, whether by gathering qualitative data on a new product or by improving their sales team’s performance. In this post, I hope to break down a few common pitfalls companies can face when using data for decision making.
Gathering data can be time-consuming and often expensive. On the inexpensive end of the spectrum is gathering internal data, while on the more expensive end is running qualitative focus groups around a new or existing product. The medium I am going to focus on falls roughly between these two: surveys.
A number of issues can arise when creating surveys. One of the most prominent is writing leading questions, that is, questions worded in a way that steers respondents toward a particular answer. This can happen intentionally or unintentionally, but either way it has enormous potential to skew the results of the survey.
Here is a good example of a leading question in action:
How do you feel about the negative impact deforestation has on the environment?
Of course, multiple studies back up the claim that deforestation is harmful, but stating that in the question itself predisposes respondents to that line of thought. Most respondents will then answer the question already thinking of deforestation as negative.
For an unbiased response, these two questions would be better:
-What impact has deforestation had on the environment?
-On a scale of 1 through 7, with 1 being very negative and 7 being very positive, how would you describe the impact of deforestation on the environment?
Now, these two questions still aren’t perfect, but both avoid the leading wording of the original question. It can help to break a question into two in order to get a full, unbiased look at responses.
Data is not infallible
I cannot stress this point enough. Research studies typically have an acceptable margin of error set by the company, generally between 4% and 8% at a 95% confidence level. This means that the results derived from the sample are not perfectly indicative of the total population; getting there would mean surveying nearly everyone, and the cost would be astronomical. Knowing this, it is important to realize that although using data to drive business decisions is an excellent way to reduce risk, every decision will still carry some risk. This leads me to another common source of mistakes when conducting research studies and analyzing data: bias.
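To make the trade-off between margin of error and cost concrete, here is a minimal sketch of the standard margin-of-error formula for a survey proportion. It rests on assumptions that are mine rather than anything stated above: a simple random sample, a large population, the normal approximation, and the worst-case proportion of 50%. The function names are made up for illustration.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(n: int, confidence: float = 0.95, p: float = 0.5) -> float:
    """Margin of error for a proportion from a simple random sample of size n.

    p = 0.5 is the worst case (widest margin), an assumption for illustration.
    """
    return z_score(confidence) * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin: float, confidence: float = 0.95, p: float = 0.5) -> int:
    """Smallest sample size whose margin of error is at most `margin`."""
    z = z_score(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for target in (0.08, 0.04, 0.01):
    print(f"margin {target:.0%}: about {required_sample_size(target)} respondents")
```

Under these assumptions, an 8% margin needs about 151 respondents, a 4% margin about 601, and a 1% margin about 9,604. Halving the margin roughly quadruples the sample, which is why a near-perfect result gets so expensive.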
Be careful of Bias
I could spend an entire blog post and more on the concept of bias in research. It can exist in every stage of the research process, from early design to analysis. Here I would like to focus on the bias that can exist in data analysis. One practice, known as data dredging, can be extremely harmful to a company and its product, and it is often done unintentionally. Data dredging, roughly put, is slicing and re-analyzing the same data in many different ways until some relationship shows up as statistically significant. This carries a very high risk of producing false positives.
FiveThirtyEight, a data journalism site, ran a study in which they sent out a survey that 54 people completed. They then ran 27,716 regressions on that data. Some of the results were hilarious.
Here are just a few of the many correlations they found to be statistically significant:
-Drinking lemonade is positively correlated with the belief that “Crash” deserved to win best picture
-Eating shellfish is positively correlated with being right-handed
-Eating table salt is positively correlated with having a positive relationship with your internet service provider
Now clearly, these are ridiculous correlations, but what is even more important is that some of them may be real correlations in the data. However, correlation does not equal causation, and when thousands of tests are run on one small data set, some will come out statistically significant by chance alone. With these examples, it is easy to notice that something is wrong with the analysis. It can be much harder to notice the same issue when analyzing data related to your business or a new product your company is working on.
It is important to take a step back when reviewing or analyzing data, recognize when a bias, intentional or unintentional, may exist, and be wary of it.
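The false-positive risk described above is easy to reproduce. The sketch below is not FiveThirtyEight's method or data. It is a small simulation under my own assumptions: 54 imaginary respondents, 60 made-up variables that are pure random noise, and an approximate p-value based on the Fisher z-transformation. Because every column is noise, any "significant" correlation it finds is a false positive.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def approx_p_value(r, n):
    """Approximate two-sided p-value for r via the Fisher z-transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_spurious_hits(n_people=54, n_variables=60, alpha=0.05, seed=0):
    """Test every pair of noise columns and count the 'significant' ones."""
    rng = random.Random(seed)
    # Every column is independent noise, so no real relationship exists.
    columns = [[rng.gauss(0, 1) for _ in range(n_people)] for _ in range(n_variables)]
    pairs = list(combinations(range(n_variables), 2))
    hits = sum(
        1
        for i, j in pairs
        if approx_p_value(pearson_r(columns[i], columns[j]), n_people) < alpha
    )
    return hits, len(pairs)


hits, tests = count_spurious_hits()
print(f"{hits} of {tests} pairs look 'significant' at p < 0.05")
print(f"expected by chance alone: about {0.05 * tests:.0f}")
```

With 60 variables there are 1,770 pairs to test, so at a 5% threshold roughly 88 of them should look significant even though nothing real is there. By the same arithmetic, if none of the relationships were real and a 5% threshold were used, 27,716 tests would be expected to produce well over a thousand false positives.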
Data is Beneficial
I know I have spent most of this post talking about different ways data can be misused or misunderstood, but I want to add that data analysis, overall, is an excellent backbone for decision making. It can greatly reduce risk and keep employees better informed and more productive. I would highly recommend using data to drive business decisions, but it is also important that data is not the only deciding factor.
If you are interested in reading more about the study by FiveThirtyEight above please use the link below: