# Correlational Design

Design Introduction and Focus – Correlational research design can be relational (leading to correlation analysis) or predictive (leading to regression analysis). A correlational (relational) design is used when the aim is to identify the existence, strength, and direction of a relationship between two variables. A correlational predictive design is used when the aim is to identify a predictive relationship between a predictor and an outcome (criterion) variable. A synonym for correlation is “association”, which refers to the direction and magnitude of the relationship between two variables. This association cannot be used to draw conclusions about a cause-effect relationship between the variables. To illustrate, consider a hypothetical example in which individuals’ salary level is associated with their test scores, with a correlation of -.50, which tells us that people with higher salaries tend to score lower on the test. Does this mean that making more money makes you less smart, or that doing well on tests will make you earn less money? The answer is no.

"Correlation does not tell us anything about causation, which is a mistake frequently made when interpreting [the results of the analysis] … Some other variables (time available to study, relevance of the material to their job, etc.) probably explain the relationship. And in order to interpret the results of the analysis we need to know the context. Correlation only tells us that a relationship exists, not whether it is a causal relationship" (Holton & Burnett, 2005, p. 41).

When do we use the design? – This design is appropriate for exploring problems about the relationships between constructs, construct dimensions, and items on a scale. For example, a child's age may be related to his/her height, and an adult's occupation may be related to his/her income level (Cohen, Cohen, West, & Aiken, 2003).

Type of problem appropriate for this design – Problems that call for the identification of relationships or predictive relationships are appropriate for correlational designs.

Theoretical framework/discipline background – Correlational research is supported by relational theories that attempt to test relationships between dimensions or characteristics of individuals, groups, situations, or events. These theories explain how phenomena, or their parts, are related to one another. The theory about the relationship between constructs was first introduced by Karl Pearson, an English statistician, and then expanded by Charles Spearman, who developed a method to compute correlation for ranked data (Salkind, 2010).

In general terms, this type of research seeks an answer to a question such as: To what extent do two (or more) characteristics tend to occur together?

Specific Characteristics – Correlational design is an umbrella term, and there are multiple correlational designs under it. For this reason, it is impossible to make a single statement about their specific characteristics. The most general statement that can be made is that this design allows researchers to identify relationships or predictive relationships between variables. The characteristics of the different correlational designs need to be studied and discussed individually.

Sample Size – A number of suggestions have been made for sample size. For correlation analysis, a sample size of 30 or more has been suggested. The results of regression analysis are also affected by sample size; Green (1991) recommended a sample size of 50 + 8k, where k is the number of predictors. Some of those suggestions did not take into consideration the effect size (a medium effect size is typical in social science) and statistical power (.80 in social science). G*Power, a program designed for calculating sample size, does use these values and calculates sample size for different types of analysis.
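The rules of thumb above can be sketched in a few lines of Python. The first function is Green's (1991) 50 + 8k rule as quoted in the text. The second is only an assumption of this sketch: it uses the Fisher z approximation to estimate the sample size needed to detect a given correlation at a chosen alpha and power, and it is not a description of how G*Power works internally, so dedicated software may return a slightly different number. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def green_minimum_n(num_predictors: int) -> int:
    """Green's (1991) rule of thumb for regression: N >= 50 + 8k."""
    return 50 + 8 * num_predictors


def approx_n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate N needed to detect a population correlation r (two-tailed).

    Assumption: Fisher z approximation, N = ((z_alpha + z_power) / atanh(r))^2 + 3.
    Exact methods used by power-analysis software can differ by a case or two.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    fisher_z = math.atanh(abs(r))                  # Fisher transform of r
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


print(green_minimum_n(3))              # three predictors
print(approx_n_for_correlation(0.30))  # medium correlation, alpha .05, power .80
```

Note how the required sample size grows as the expected correlation shrinks, which is why a fixed "30+" rule can leave a study underpowered for small effects.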

Sampling Method – Both random and non-random sampling procedures can be used with correlational designs.

Data Collection – Data are usually collected with self-report measures. The data collection instruments are usually multiple-choice Likert-scale questionnaires that allow the collection of interval data. However, secondary data, such as student exam scores or other data collected in the past, can also be used in a correlational design.

Data Analysis – In relational correlational design, several variables can be entered into the analysis, but the analysis itself is bivariate: with multiple variables, the output presents the bivariate relationship between each pair of variables entered. In predictive correlational design (which is sometimes also called regression design or, with more than one independent variable, multiple regression design), the possible predictive relationship between the outcome and the predictor(s) is identified.

The Pearson product-moment correlation is used to determine the direction and strength of the relationship. This test generates a coefficient called the Pearson correlation coefficient, denoted by the lower-case r, which can range from -1 to +1 (a perfect negative correlation to a perfect positive correlation).
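As a minimal sketch of how the coefficient is obtained, the following Python computes Pearson r from its definition (the sum of cross-products divided by the square root of the product of the sums of squares) and then prints the coefficient for each pair of variables, mirroring the bivariate output described above. The variable names and values are hypothetical and made up purely for illustration.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # cross-products
    sxx = sum((a - mean_x) ** 2 for a in x)                       # sum of squares, x
    syy = sum((b - mean_y) ** 2 for b in y)                       # sum of squares, y
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for five students (illustration only).
data = {
    "hours_studied": [2, 4, 6, 8, 10],
    "gpa": [2.0, 2.5, 3.1, 3.4, 3.9],
    "absences": [9, 7, 6, 3, 1],
}

# One coefficient per pair of variables, as in a correlation matrix.
for name_a, name_b in combinations(data, 2):
    print(f"{name_a} vs {name_b}: r = {pearson_r(data[name_a], data[name_b]):.2f}")
```

The sign of each coefficient gives the direction of the relationship and its absolute value gives the strength.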

Judgments about the results of correlation and regression analysis are made based on the guidelines below.

Pearson r

• Less than .20 – Slight, almost negligible relationship
• .20 – .40 – Low correlation; definite but small relationship
• .40 – .70 – Moderate correlation; substantial relationship
• .70 – .90 – High correlation; marked relationship
• .90 – 1.00 – Very high correlation; very dependable relationship

Coefficient of Determination – R²

• Below .05 – Too small an effect to be considered meaningful
• .05 and above – A small but meaningful effect
• Above .10 – A moderate effect
• Above .25 – A large effect
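The guidelines above can be expressed as two small lookup functions. This is only a sketch: the function names are hypothetical, the Pearson r bands are assumed to apply to the absolute value of r, and, because the listed bands share their end points, a value that falls exactly on a boundary (for example .40) is assigned to the higher band here.

```python
def describe_r(r: float) -> str:
    """Label the strength of a Pearson r using the bands listed above."""
    size = abs(r)  # assumption: the bands describe strength regardless of sign
    if size < 0.20:
        return "slight"
    if size < 0.40:
        return "low"
    if size < 0.70:
        return "moderate"
    if size < 0.90:
        return "high"
    return "very high"


def describe_r_squared(r_squared: float) -> str:
    """Label a coefficient of determination using the thresholds listed above."""
    if r_squared < 0.05:
        return "too small to be meaningful"
    if r_squared <= 0.10:   # ".05 and above" up to, and including, .10
        return "small"
    if r_squared <= 0.25:   # "above .10" up to, and including, .25
        return "moderate"
    return "large"          # "above .25"


r = -0.50  # the salary and test score example from the introduction
print(describe_r(r), "/", describe_r_squared(r ** 2))
```

Such labels are conventions, so the context of the study still matters when interpreting a coefficient.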

Other types of correlation analysis in use include the Kendall rank correlation, the Spearman correlation, and the point-biserial correlation.
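To illustrate how a rank-based coefficient differs from Pearson r, the sketch below computes the Spearman correlation as the Pearson correlation of the ranked data, with tied values given the average of their ranks. The helper names and the data are hypothetical. The example uses a relationship that is perfectly monotonic but not linear, so the two coefficients differ.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation (assumes non-zero variance)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x cubed: monotonic but not linear
print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```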

More than 20 types of regression analysis exist ranging from simple regression that uses one predictor and one dependent variable to multivariate multiple regression that uses more than one predictor and more than one outcome variable.  While the list below might not be exhaustive, it will provide some idea about the multiplicity of the types of analysis:

• Simple Regression
• Bayesian Regression
• Curvilinear Regression
• Ecologic Regression
• ElasticNet Regression
• Jackknife regression
• Lasso Regression
• Linear Regression
• Logic regression
• Logistic Regression
• Multiple Multivariate Regression
• Polynomial Regression
• Quantile Regression
• Regression in Unusual Spaces
• Ridge Regression
• Stepwise Regression
• Regression Discontinuity
• Multiple Regression
  • Simultaneous Multiple Regression
  • Sequential Multiple Regression
  • Stepwise Multiple Regression
  • Multivariate Multiple Regression

Write up Results – Results are typically reported as in the following examples.

Correlation Analysis – Hours spent studying and GPA were strongly positively correlated, r(123) = .61, p = .011.
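The reporting format in this example can be produced with a small helper. This sketch assumes that the number in parentheses is the degrees of freedom, N − 2, as is common in APA-style reporting (so r(123) corresponds to 125 participants), and that values which cannot exceed 1 are written without a leading zero. The function names are hypothetical, and the p-value is passed in rather than computed.

```python
def format_stat(value: float, decimals: int) -> str:
    """Format a statistic without a leading zero (e.g. 0.61 -> '.61')."""
    text = f"{value:.{decimals}f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def report_correlation(r: float, n: int, p: float) -> str:
    """Build a string such as 'r(123) = .61, p = .011'."""
    df = n - 2  # degrees of freedom for a Pearson correlation
    p_text = "p < .001" if p < 0.001 else f"p = {format_stat(p, 3)}"
    return f"r({df}) = {format_stat(r, 2)}, {p_text}"


print(report_correlation(0.61, 125, 0.011))
```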

Regression Analysis – Selected values from the Coefficients table of the output are reported in a table, such as B, SE B (the standard error of B), β, R², and the p-value. For more detailed information and suggestions, please check Field (2009, p. 252).
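To make these quantities concrete, the following sketch computes B, SE B, β, and R² for a simple regression with one predictor, using the ordinary least squares formulas. The data and the function name are hypothetical. The p-value is left out because it requires the t distribution, which statistical software provides; the t statistic that it is based on (B divided by SE B) is included.

```python
import math


def simple_regression(x, y):
    """Ordinary least squares with one predictor; returns reportable values."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    b = sxy / sxx                          # unstandardized slope, B
    intercept = mean_y - b * mean_x
    ss_residual = sum((yi - (intercept + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(ss_residual / (n - 2) / sxx)  # standard error of B
    beta = b * math.sqrt(sxx / syy)        # standardized slope, beta
    r_squared = 1 - ss_residual / syy      # proportion of variance explained
    return {
        "B": b,
        "SE B": se_b,
        "beta": beta,
        "R2": r_squared,
        "t": b / se_b,                     # tested against t with n - 2 df
    }


# Hypothetical predictor and outcome scores (illustration only).
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]
for name, value in simple_regression(predictor, outcome).items():
    print(f"{name:>5} = {value:.3f}")
```

With a single predictor, β equals the Pearson correlation between the predictor and the outcome, so β squared equals R².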

Resources

Allen, J., & Le, H. (2008). An additional measure of overall effect size for logistic regression models. Journal of Educational and Behavioral Statistics, 33(4), 416-441.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Field, A. (2009). Discovering statistics using SPSS. Sage Publications.

Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral Research, 26, 499–510.

Holton, E. F., & Burnett, M. F. (2005). The Basics of Quantitative Research. In R. A. Swanson & E. F. Holton (Eds.), Research in organizations: Foundations and methods of Inquiry (pp. 29-44). San Francisco: Berrett-Koehler.

Miles, J. N. V., & Shevlin, M. (2001). Applying regression and correlation: a guide for students and researchers. London: Sage. (This is an extremely readable text that covers regression in loads of detail but with minimum pain – highly recommended.)

Salkind, N. J. (2010). Encyclopedia of research design. Thousand Oaks, CA: SAGE Publications. doi: 10.4135/9781412961288

Stevens, J. (2002). Applied multivariate statistics for the social sciences (4th ed.). Hillsdale, NJ: Erlbaum. Chapter 3.

Video clips

How to work out required sample size for a correlation and a regression using G*Power