Dummy and Contrast Codings in Linear Regression

This tutorial explains the differences between dummy coding and contrast coding in linear regression using R code examples. It focuses on the case where the categorical independent variable has 3 levels.
Short Note
In R, the default reference group in dummy coding is the first item in an alphabetical … Read more

Changing Reference Level in Dummy Coding in R

You can change the reference level in dummy coding in R with the following R code: contr.treatment(total_levels, base = Number_reference_level)
Step 1: Prepare Data
The following R code generates sample data.
  X           Y
1 1 -0.56047565
2 2 -0.23017749
3 3  1.55870831
4 1  0.07050839
5 2  0.12928774
6 3  1.71506499
7 1 … Read more
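As a rough sketch of how the contr.treatment() call above might be used in a regression, the example below fits the same model twice, once with the default reference level and once with level 2 as the reference. The data frame df, the seed, and the sample size are assumptions made for illustration and are not taken from the tutorial.

```r
# Hypothetical sample data: a 3-level factor X and a numeric outcome Y.
set.seed(123)  # assumed seed, used only to make the sketch reproducible
df <- data.frame(
  X = factor(rep(1:3, times = 10)),
  Y = rnorm(30)
)

# Default dummy coding: the first level (1) is the reference group.
fit_default <- lm(Y ~ X, data = df)

# Switch the reference group to level 2 via base = 2.
contrasts(df$X) <- contr.treatment(3, base = 2)
fit_base2 <- lm(Y ~ X, data = df)

# The intercept is the mean of the reference group in each fit.
print(coef(fit_default))
print(coef(fit_base2))
```

Both fits give the same fitted values; only the interpretation of the coefficients changes, because each slope is now a difference from level 2 rather than from level 1.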

Dummy and Contrast Codings in R

“Dummy” or “treatment” coding creates dichotomous variables in which each level of the categorical variable is contrasted with a specified reference level.
Basic Syntax of Dummy and Contrast Coding
1. Dummy Coding
The following is the syntax for dummy coding in R: contr.treatment( number_of_level_of_X )
  2 3
1 0 0
2 1 0
3 … Read more
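To make the syntax above concrete, the sketch below prints the dummy (treatment) coding matrix for a 3-level variable. The contr.sum() call is included only as one example of a contrast coding available in base R; the excerpt does not show which contrast function the tutorial uses, so that choice is an assumption.

```r
# Dummy (treatment) coding for a 3-level categorical variable.
# Rows are levels, columns are the dummy variables; level 1 is the reference,
# so its row contains only zeros.
dummy <- contr.treatment(3)
print(dummy)

# One possible contrast coding from base R (an assumption, not necessarily
# the coding the tutorial goes on to use): sum-to-zero (deviation) coding.
deviation <- contr.sum(3)
print(deviation)
```

In the dummy matrix each column picks out one non-reference level, while in the sum-to-zero matrix every column sums to zero across the levels.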

Quartile: Definition and Example

Definition of Quartile
A quartile is a statistic describing how a set of data points is divided into 4 groups. Quartiles split a set of data by using 3 points: the lower quartile (Q1), the median (Q2), and the upper quartile (Q3). Together with the minimum and maximum values, the 3 quartiles split the data set … Read more
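The three quartile points described above can be computed in R roughly as follows. The data vector x is hypothetical, and the sketch relies on R's default interpolation rule for quantile(); other conventions for computing quartiles by hand can give slightly different values.

```r
# Hypothetical data set of 9 values (not from the tutorial).
x <- c(2, 4, 6, 8, 10, 12, 14, 16, 18)

# quantile() with these probabilities returns Q1, the median (Q2), and Q3.
# R's default rule (type = 7) is used here.
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
print(q)

# The five-number summary adds the minimum and maximum to the quartiles.
print(fivenum(x))
```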

Difference between Descriptive Statistics and Inferential Statistics

Descriptive statistics aim to summarize the characteristics of a given data set. In contrast, inferential statistics aim to use a sample of data to draw inferences about the whole population (e.g., hypothesis testing).
Types of Descriptive Statistics
1. Measures of Central Tendency
Central tendency describes where the center of a dataset is located. Mean, … Read more

Difference between Sample and Population

A population is the entire group of individuals about whom you want to draw conclusions. In contrast, a sample is a subset of that group.
Example 1 of sample and population
You would like to study whether students like online courses at your university. Suppose your university has 10K students; thus, these 10K students … Read more

Calculate Population Variance in Excel

You can use the VARP, VAR.P, or VARPA functions in Excel to calculate population variance.
Data Example
The following is the data example for population variance.
Example of VARP()
Type =VARP(B2:B12) in a cell in Excel to calculate population variance. The population variance is 46.23.
Example of VAR.P()
Type =VAR.P(B2:B12) in a cell in Excel to calculate population variance. The population … Read more

Population Variance Formula and Calculation by Hand

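This tutorial shows the formula for population variance and the steps for calculating population variance by hand.
Formula
Population variance is a measure of the variability of a population. The following is the formula for population variance.
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
where $\sigma^2$ is the population variance, $x_i$ is the $i$-th value, $\mu$ is the population mean, and $N$ is the population size.
Population vs. Sample Data
The following is the population of a set of data. It has 11 … Read more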
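The by-hand steps can be mirrored in a few lines of R: take each value's deviation from the population mean, square it, sum the squares, and divide by the population size. The vector x below is a hypothetical population, not the 11-value data set from the tutorial.

```r
# Hypothetical population of 8 values (not the tutorial's data).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

N  <- length(x)                  # population size
mu <- mean(x)                    # population mean
pop_var <- sum((x - mu)^2) / N   # divide by N, not N - 1

print(pop_var)

# R's var() divides by N - 1 (sample variance), so rescale it to compare.
print(var(x) * (N - 1) / N)
```

Dividing by N rather than N - 1 is what distinguishes the population variance from the sample variance.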

Calculate Sample Variance in Excel

You can use the VAR, VAR.S, or VARA functions to calculate sample variance.
Data Example
We are going to use the following sample of 6 students with math scores. The following is how the data looks in Excel.
Example of VAR()
You can type =VAR(B2:B7) in cell E2 to calculate sample variance. You can type it anywhere other than E2 … Read more

How to Calculate Mean in Excel

You can use =AVERAGE() to calculate the mean in Excel.
Example 1: One column
Type =AVERAGE(B2:B12) in a cell. You can type it in any cell; I chose E2 as an example.
Example 2: Two columns
For data in two different columns, you need … Read more