During which stage of the research process is primary and secondary data gathered?

Business analysts will conduct many analyses in their careers. Many of these analyses will be made possible by the hard work someone else did to collect the necessary data used in the analysis. In this post you'll learn the difference between primary and secondary data analysis.

What is the difference between primary and secondary data analyses?

Primary data is data collected by a researcher or group of researchers for a specific analysis.

Let's say for example that you wanted to run an analysis in your company to determine the health levels of your employees. You create a survey and hand it out to each employee. The data you collect would be classified as primary data.

Now let's say that after you've collected this data another analyst in your company uses it for a totally different analysis. In that case the second analyst would be using secondary data in a secondary data analysis.

Advantages of primary data analysis

There are a number of key advantages to using primary data for an analysis.

You can validate the reliability of the data

One of the biggest advantages of conducting a primary data analysis is that you know the source of the data. You collected the data yourself, so you have detailed knowledge of the collection methodology and of whether the data is reliable.

You have what you need for your analysis

Before you conduct any analysis you need to know what information is necessary to complete the analysis. In the case of primary data, you specifically gather the exact information that you need for your analysis.

The same can be said for volume. If you prefer a large sample size or a very specific mix of data then you can go out into the world and gather the exact volume and mix that you need.
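If you want a rough idea of how large "large enough" is, the textbook sample size formula for estimating a proportion is a reasonable starting point. The sketch below is an illustration under stated assumptions, not a method this post prescribes: it assumes a simple random sample, a 95% confidence level (z = 1.96) and the most conservative proportion of 0.5. The function name `required_sample_size` and the company size of 500 employees are made up for the example.

```python
import math


def required_sample_size(margin_of_error, population=None, z=1.96, proportion=0.5):
    """Estimate how many responses are needed to measure a proportion.

    Uses n0 = z^2 * p * (1 - p) / e^2, then applies the finite population
    correction when the total population size is known.
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Small populations (e.g. one company) need fewer responses.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


# Hypothetical scenario: +/- 5% margin of error, with and without a known headcount.
print(required_sample_size(0.05))
print(required_sample_size(0.05, population=500))
```

Because you are collecting the data yourself, you can keep surveying until you reach whatever number a calculation like this suggests.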

Disadvantages of primary data analysis

Primary data analyses are expensive

Going out and manually collecting data, or paying a research company to do it for you, can be very expensive.

There will be specific analyses where you'll have no choice but to gather the data yourself. You'll need to plan well and budget accordingly so that you gather exactly what you need.

You will need to do your own data prep

Once you've gathered the primary data you'll need to go through a multi-stage process to clean and organize the data and check its statistical validity.

This is a difficult and time-consuming process which will add to the costs of your analysis.
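To make the prep work concrete, here is a minimal sketch of a cleaning pass over the employee health survey from earlier. Everything in it is hypothetical: the field names, the sample rows and the `clean_responses` helper are invented for illustration, and a real survey would need checks tailored to its own questions.

```python
# Hypothetical raw survey rows; field names and values are invented.
raw_responses = [
    {"employee_id": "101", "age": "34", "weekly_exercise_hours": "3.5"},
    {"employee_id": "102", "age": " 41 ", "weekly_exercise_hours": ""},    # missing answer
    {"employee_id": "101", "age": "34", "weekly_exercise_hours": "3.5"},   # duplicate submission
    {"employee_id": "103", "age": "29", "weekly_exercise_hours": "400"},   # implausible value
    {"employee_id": "104", "age": "52", "weekly_exercise_hours": "1"},
]

MAX_HOURS_PER_WEEK = 168  # a week only has 168 hours


def clean_responses(rows):
    """Return (clean_rows, rejected), where rejected pairs each bad row with a reason."""
    seen_ids = set()
    clean, rejected = [], []
    for row in rows:
        # Strip stray whitespace from every answer.
        row = {key: value.strip() for key, value in row.items()}

        # Keep only the first submission per employee.
        if row["employee_id"] in seen_ids:
            rejected.append((row, "duplicate"))
            continue
        seen_ids.add(row["employee_id"])

        # Drop rows where the key question was left blank.
        if not row["weekly_exercise_hours"]:
            rejected.append((row, "missing answer"))
            continue

        # Reject values that cannot be true.
        hours = float(row["weekly_exercise_hours"])
        if not 0 <= hours <= MAX_HOURS_PER_WEEK:
            rejected.append((row, "out of range"))
            continue

        clean.append({
            "employee_id": row["employee_id"],
            "age": int(row["age"]),
            "weekly_exercise_hours": hours,
        })
    return clean, rejected


clean_rows, rejected_rows = clean_responses(raw_responses)
print(len(clean_rows), "usable rows")
for row, reason in rejected_rows:
    print("rejected", row["employee_id"], "-", reason)
```

Keeping the rejected rows along with a reason, instead of silently dropping them, makes it easier to explain later how the final sample was arrived at.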

In the case of secondary data, often this work has already been done by the researchers that gathered the data in the first place.

Advantages of secondary data analysis

Secondary data analyses are quicker and (often) cheaper

Secondary data analyses are for the most part significantly cheaper and quicker to complete than primary data analyses because you're not collecting the data yourself. The data has often been prepped and validated statistically and can be used immediately. These two skipped steps will save you many hours of work.

Depending on the scope and subject matter of the analysis, it may be significantly cheaper to purchase a data source than to collect the data yourself. In these cases a secondary data analysis will not only be quicker (time is money) but the costs involved in getting your hands on the necessary data will be lower.

In the case of an online business, you may want to purchase a piece of software which will gather the data for you. Such a tool may allow your analysts to conduct dozens of ad hoc secondary data analyses.

Wide range of data sources available for secondary data analysis

More and more countries, organizations and companies are publishing large studies and useful data sources that can be used freely for secondary data analyses.

My favorite site for finding useful data sources is Kaggle. Kaggle hosts a large collection of resources and data sets for data scientists and analysts. Don't believe me? Check out this up-to-date dataset of nearly 40,000 international football matches.

Disadvantages of secondary data analyses

Data validity and coverage

If you're not collecting the data yourself, how will you know for sure that it's valid? With a secondary dataset there is always a risk that the data is unreliable: it may have been faked, manipulated (for political reasons, for example) or collected using a flawed methodology such as a biased sample.

The other issue with secondary data is that it may not contain exactly what you need. You may be forced to come up with a proxy method for measuring a specific variable or combine a number of datasets to get around a missing variable. This may result in a higher than acceptable margin of error or the entire analysis being scrapped.

In the case of primary data, you know exactly what you're getting since you collected it yourself.

You don't control the structure of the data in a secondary data analysis

Since you're using a dataset that you didn't construct yourself, you won't necessarily have the data in the format that you'd like.

You may want to analyze your data by city but all you have is the latitude and longitude of each point. Now you'll need to find a way to turn those coordinates into city names.
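One rough way to do this, assuming you only need the closest match from a short list of cities you care about, is a nearest-neighbour lookup using the haversine great-circle distance. The sketch below is illustrative only: the `REFERENCE_CITIES` table and the `nearest_city` helper are hypothetical names, and the three cities are just sample entries. A proper reverse-geocoding service or gazetteer would be far more accurate.

```python
import math

# Hypothetical lookup table: city name -> (latitude, longitude) in degrees.
REFERENCE_CITIES = {
    "London": (51.5074, -0.1278),
    "Paris": (48.8566, 2.3522),
    "Berlin": (52.5200, 13.4050),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_city(lat, lon, cities=REFERENCE_CITIES):
    """Return the name of the reference city closest to the given point."""
    return min(cities, key=lambda name: haversine_km(lat, lon, *cities[name]))


# A point just west of Paris and a point north-west of London.
print(nearest_city(48.80, 2.13))
print(nearest_city(51.75, -1.26))
```

Note that this always returns some city, even for a point in the middle of the ocean, so in practice you would probably also want a maximum distance cut-off.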

Another example is grouped data. For example, you may have salary ranges for employees instead of their exact salaries. This would prevent you from accurately calculating the average, minimum and maximum salaries in your sample.
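With grouped salaries the best you can do is put bounds on the true average. Here is a minimal sketch, using made-up salary bands and a hypothetical `mean_salary_bounds` helper: it computes the lowest and highest average that the bands allow, plus a midpoint estimate that assumes salaries are spread evenly within each band.

```python
# Hypothetical grouped data: (band lower bound, band upper bound, number of employees).
salary_bands = [
    (30_000, 40_000, 12),
    (40_000, 50_000, 20),
    (50_000, 70_000, 8),
]


def mean_salary_bounds(bands):
    """Return (lowest possible mean, midpoint estimate, highest possible mean)."""
    total = sum(count for _, _, count in bands)
    # Everyone at the bottom of their band gives the lowest possible mean.
    lowest = sum(low * count for low, _, count in bands) / total
    # Everyone at the top of their band gives the highest possible mean.
    highest = sum(high * count for _, high, count in bands) / total
    # Midpoint estimate: assumes salaries are evenly spread within each band.
    midpoint = sum((low + high) / 2 * count for low, high, count in bands) / total
    return lowest, midpoint, highest


lowest, midpoint, highest = mean_salary_bounds(salary_bands)
print(f"True average lies between {lowest:,.0f} and {highest:,.0f}")
print(f"Midpoint estimate: {midpoint:,.0f}")
```

The gap between the two bounds shows how much precision was lost when the data was grouped, which is exactly the kind of error you would not have with exact salaries.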

What is better, primary or secondary data analysis?

Primary and secondary data analyses have different pros and cons.

If you have a tight budget and need to deliver the analysis quickly then secondary data would be the way to go.

If you have a very specific analysis in mind that needs to have a very high degree of accuracy then primary data would serve you best.

As an analyst, you'll need to determine which approach makes sense considering all the variables involved.

In which step of the market research process is secondary and primary data collected?

The fifth phase in a marketing research process is to identify the data sources. The researcher decides whether to collect secondary data or primary data. Researchers tend to collect secondary data first because it is readily available and affordable.

What is primary and secondary data in research?

Primary data is first-hand data gathered by the researchers themselves, typically in real time. Secondary data is data that someone else collected earlier.

What is primary and secondary research called?

The key difference between these two types of research is that primary research gathers data first-hand whilst secondary research draws on pre-existing studies. Primary research is also referred to as field research. It involves original research, which is carried out first-hand, often for a specific purpose.

What phase of research is data gathering?

The next phase of the research process is the empirical phase. This involves the collection of data and the preparation of data for analysis.