# Correlation Coefficient (r)

Match the vocabulary words in the table below with the corresponding definitions.

| | | | |
|---|---|---|---|
| Slope | Histogram of the Residuals | Correlation Coefficient (r) | Contingency Table |
| Conditional Percentage (Conditional Proportion) | Explanatory Variable | Scatterplot | R-squared |
| Response Variable | Sampling Variability | Significance Level | Type II Error |
| Standard Deviation of the Residual Errors | Quantitative Data | y-intercept | Categorical Data |
| Critical Value | Regression | Regression Line | Census |
| Type I Error | Residual | Correlation | Beta Level |
| Marginal Percentage (Marginal Proportion) | Residual Plot | P-value | Joint Percentage (Joint Proportion) |

a. A number we compare our test statistic to in order to determine significance. In a sampling distribution or a theoretical distribution approximating the sampling distribution, the critical value shows us where the tail or tails are. The test statistic must fall in the tail to be significant.

b. Also called the Alpha Level. If the P-value is lower than this number, then the sample data significantly disagrees with the null hypothesis and is unlikely to have happened by random chance. This is also the probability of making a type 1 error.

c. A percentage or proportion involving two variables being true about the person or object, but without a condition. There are generally two types (AND, OR).

d. The vertical distance between the regression line and a point in the scatterplot.

e. Statistical analysis that determines if there is a relationship between two different quantitative variables.

f. When biased sample data leads you to support the alternative hypothesis when the alternative hypothesis is actually wrong in the population.

g. A graph for visualizing the relationship between two quantitative variables recorded as ordered pairs. The ordered pairs (𝑥, 𝑦) are plotted on the rectangular coordinate system.

h. Data in the form of numbers that measure or count something. They usually have units and taking an average makes sense.

i. Also called the line of best fit or the line of least squares. This line minimizes the sum of the squared vertical distances between it and all the points in the scatterplot.

j. Collecting data from everyone in a population.

k. Statistical analysis that involves finding the line or model that best fits a quantitative relationship, using the model to make predictions, and analyzing error in those predictions.

l. The probability of getting the sample data or more extreme because of sampling variability (by random chance) if the null hypothesis is true.

m. The predicted y-value when the x-value is zero.

n. A statistic between −1 and +1 that measures the strength and direction of linear relationships between two quantitative variables.

o. Data in the form of labels that tell us something about the people or objects in the data set.

p. Another name for the y-variable or dependent variable in a correlation study.

q. A single percentage or proportion without any conditions. In a contingency table, this can be found with numbers in the margins.

r. Also called the coefficient of determination. This statistic measures the percent of variability in the y-variable that can be explained by the linear relationship with the x-variable.

s. When biased sample data leads you to fail to reject the null hypothesis when the null hypothesis is actually wrong in the population.

t. Another name for the x-variable or independent variable in a correlation study.

u. Also called a two-way table. This table summarizes the counts when comparing two different categorical data sets each with two or more variables.

v. The probability of making a type 2 error.

w. The amount of increase or decrease in the y-variable for every one-unit increase in the x-variable.

x. Random sample values and sample statistics are usually different from each other and usually different from the population parameter.

y. A statistic that measures how far points in a scatterplot are from the regression line on average and measures the average amount of prediction error.

z. The percentage or proportion calculated from a particular group or if a particular condition was true. These are very important when studying categorical relationships.

aa. A graph that pairs the residuals with the x values. This graph should be evenly spread out and not fan-shaped.

bb. A graph showing the shape of the residuals. This graph should be nearly normal and centered close to zero.
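
Several of the regression terms in this exercise are quantities computed from paired (x, y) data, so a small worked computation may help when checking the definitions. The following is a minimal Python sketch, not part of the original worksheet: the function name `fit_line`, the five data points, and the choice to divide by n − 2 in the residual standard deviation (a common convention for simple linear regression) are all assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Least-squares line and summary statistics for paired data (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx                      # change in y per one-unit increase in x
    intercept = mean_y - slope * mean_x    # predicted y when x is zero
    r = sxy / math.sqrt(sxx * syy)         # correlation coefficient, between -1 and +1

    # Signed residuals: observed y minus the y predicted by the line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)

    r_squared = 1 - sse / syy              # share of y-variability explained by the line
    # Assumption: n - 2 in the denominator, a common convention for one predictor.
    resid_sd = math.sqrt(sse / (n - 2))

    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r_squared": r_squared,
        "residuals": residuals,
        "resid_sd": resid_sd,
    }


# Made-up data, used only to exercise the function.
result = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in result.items():
    print(name, value)
```

For a simple linear regression like this one, `r_squared` equals the square of `r`, which is a quick consistency check on the output. Plotting `residuals` against the x values, or drawing a histogram of them, gives the two diagnostic graphs described in the last two definitions.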