Understanding Secondary Data in Research

Back in June 2016, I had a great opportunity to be part of an organized session at Knowledge Without Boundaries (KWB) in Phoenix, Arizona.

Of course, I struggled to come up with a topic, but everyone has that problem. While I was at the Pennsylvania Economic Association (PEA) meetings about a week before, the topic came to me as I gave my presentation. I decided to focus on the use of secondary data in research because I use it all the time. During my presentation at Knowledge Without Boundaries, one member of the audience stated that the use of secondary data is not research. I almost dropped to the floor, as did some other members of the audience. Many business disciplines (e.g., finance, economics, marketing, and operations research) often use secondary data in their research. Pick up any journal in economics or finance and you will see that most of the data is from secondary sources. Some of these papers are, in fact, quite good and are relevant to scholars and policy-makers. So let’s explore secondary data: what it is, why it matters, and when to use it!

What is Secondary Data?  

Let’s start from the ground up. Secondary data analysis can be literally defined as “second-hand” analysis. It is the analysis of data or information that was gathered by someone else (e.g., researchers, institutions, or government agencies), gathered for some purpose other than the one currently being considered, or often a combination of the two. In other words, someone else has already collected the data. In economics at least, the common adage is, “Where is the data at?” Telling an economist or a finance person that there is no secondary data available will send shivers down their spine.

Learn more about the differences between primary and secondary data

The Rationale for Secondary Data

Secondary research and data analysis can provide a cost-effective way of addressing research questions. Secondary data are also helpful in designing subsequent primary research and can provide a baseline against which to compare the results of any primary data collection you undertake. Therefore, it is wise to begin any research with a review of the secondary data that currently exists.

As a researcher, you will need to keep the focus of the research in perspective; your first step should be to develop a statement of purpose and provide clear research questions. Without clear research questions, the research may meander, and you may not be quite sure what secondary data you will need. It’s also important to make sure your research question is in alignment with your methodology and design. Once the research questions and purpose have been defined, you will need to explore the availability of secondary data. Learn more about this at https://research.phoenix.edu/blog/constructing-study-design-aligning-research-question-methodology-design-and-degree-program

The Availability of Secondary Data

Finding the appropriate secondary data can be difficult, time consuming, and even painful at times. Of course, this may also depend on the source of the secondary data. Let’s keep the following points in mind:

The specific types of information and/or data needed to conduct a secondary analysis will depend on the research questions. Bad research questions mean confusion and poor use of the secondary data;

Secondary data analysis is usually conducted to gain a more in-depth understanding of the proposed research;

Secondary data review and analysis involves examining the relevant data at various levels of aggregation to see how they can answer the research questions.

Sources of Secondary Data

There is a plethora of secondary data sources. I present several of them below:

  • Official Statistics from Government Agencies: These include, but are not limited to, the Bureau of Labor Statistics, the Census Bureau, the Federal Reserve Bank, the Bureau of Economic Analysis, the National Center for Health Statistics, and many other government agencies. Learn more about Public Use Data Sets
  • Technical Reports: Technical reports are accounts of work done on research projects and sometimes include data, often in appendices. They are written to provide research results to colleagues, research institutions, governments, and other interested researchers. You may need to request their data.
  • Scholarly Journals: Articles in scholarly journals usually undergo peer review, in which other experts in the same field review the content of the article for accuracy, originality, and relevance. In general, the author(s) should make their secondary data available. Some journals require that the secondary data used in a paper be posted on the journal’s website or made available by the authors when other researchers ask for it.
  • Literature Review: Review articles discuss and list all the relevant publications from which the information is derived and often provide sources for their data.

Obtaining Proprietary Secondary Data

Asking for proprietary data that is not publicly available can be tricky, but it can be done.

I was in a session at the Midwest Economics Association (MEA) several years ago in which the presenter had obtained proprietary data from a regional movie chain in St. Louis. No, it was not AMC!! To be honest, I forgot the name of this chain. He needed their data to estimate a demand function for movies. The chain was not at all reluctant to provide him the data because he was only going to present his results from the demand functions. The audience was wowed by his ability to get such data.

So yes, you can use secondary data in your research! If the secondary data is relevant and answers your research questions, that is great.


James Rice | September 16, 2016 11:40 pm MST

Dr. Sloboda,

Thank you for the excellent post on secondary data. I appreciate your description of the importance of deriving new insights from data previously collected. In business, it is common to look for sources of data that are relevant and can enable analysis that might otherwise be costly or take significant time to complete. Furthermore, the fact that data is secondary in nature does not invalidate the analysis, nor does it negate the researcher's obligation to understand the validity and quality of the data collection instrument. Thus, in my opinion, research using secondary data is just as valuable as research enabled through primary data collection.

Thank you for sharing your insight and experience,

Dr. Jim Rice

Santosh Sambare | November 2, 2016 8:06 am MST

Dear Dr. Sloboda,


You provide an interesting perspective on the use of secondary data in research. You have also provided some very interesting thoughts on the use of secondary data to provide insights on research questions. I have worked in various industries and been involved in new product launches, where secondary data is extremely important for coming up with “business hypotheses”. I have also used it in forecasting new products, where you examine analogs or similar products to estimate the uptake of a new brand. In the pharmaceutical and consumer packaged goods industries, there is a wealth of secondary data that lends itself to various analyses and can help in formulating business and brand questions. In many instances, secondary data has helped in formulating qualitative and quantitative research for new products as well as for brands that are already in the market.


Santosh Sambare, PhD

Fiona Sussan | November 21, 2016 2:58 am MST

Indeed, other than what you mentioned, marketers use secondary data frequently in order to understand past behavior and predict future consumer behavior. Supermarket panel data is an example. The bonus card that supermarkets give you is there to track your consumption (in return, you get loyalty member prices). Based on supermarket panel data, researchers found that sales of diapers correlate with sales of beer on Saturdays! Panel data analyses, as a result, guide supermarkets on what to stock, what to promote, and what to pull off the shelves!

New product sales prediction also uses secondary data from past sales to predict future sales. The classic example is the diffusion model by Frank Bass, which uses a mere 10 sales data points and a few parameters to derive the diffusion curve. More details can be found here: https://en.wikipedia.org/wiki/Bass_diffusion_model. BTW, I am a third-generation Bass doctoral student.
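
For readers who have not seen the Bass model, here is a minimal Python sketch of its standard closed-form curve. The function names and the values of m (market size), p (innovation coefficient), and q (imitation coefficient) are assumptions chosen purely for illustration; they are not estimated from any real sales data, and in practice one would fit them to the observed sales points.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Fraction of eventual adopters who have adopted by time t (Bass model)."""
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_sales_curve(m, p, q, periods):
    """Expected new adopters in each of periods 1..periods for market size m."""
    cumulative = [m * bass_cumulative_fraction(t, p, q) for t in range(periods + 1)]
    return [cumulative[t] - cumulative[t - 1] for t in range(1, periods + 1)]


# Hypothetical parameters, for illustration only (not fitted to data).
curve = bass_sales_curve(m=1000.0, p=0.03, q=0.38, periods=10)
print([round(x, 1) for x in curve])
```

With parameters like these, per-period sales rise, peak, and then decline, which is the familiar bell-shaped adoption pattern behind the S-shaped cumulative curve.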

In marketing, consumer sentiment is increasingly being measured using secondary data as well. While I was a postdoc at the National Geospatial-Intelligence Agency a decade ago, I used PNNL's proprietary software IN-SPIRE to analyze reams of publicly available data (e.g., newspapers, websites) on various topics (e.g., election, films, Muslims) and derive various dimensions of the emotional pulse of the people (e.g., a nation, the globe, specific groups). Now we have NVivo, which can perform similar analyses.

Digitization has made it even more important to use secondary data! Big data, of course, is the buzz. If we accept the trend of research interests or philosophies moving away from "causality" toward "correlations", then putting a bunch of secondary data together will give us holistic insights into "what's going on" instead of "what causes what". In an increasingly complex world with terabytes of data generated by consumers every day (remember, everyone has a voice on the Net if they care to speak), it is not a bad idea to look into big data analyses - and yes, that means using secondary data.

Having said all this, the choice between secondary and primary data is laden with philosophical bias. Unfortunately, researchers are often influenced by their mentors, resulting in confirmation bias toward one method over another. I have seen nasty fights between the two camps among faculty members within business schools. In my view, neither is superior. It just depends on what research questions one wants to answer. I am fortunate to be a quantitative secondary data user and, at the same time, to have used the 'introspection' method to publish an article in marketing! The introspection method is perhaps one of the most extreme qualitative, primary data collection methods!


