Dummy and Contrast Codings in R

“Dummy” or “treatment” coding creates dichotomous variables in which each level of a categorical variable is contrasted with a specified reference level. Basic Syntax of Dummy and Contrast Coding 1. Dummy Coding The following is the syntax for dummy coding in R. contr.treatment( number_of_level_of_X )

  2 3
1 0 0
2 1 0
3 …
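As a minimal sketch of the dummy coding syntax above, the following R code prints the treatment contrast matrix for a three-level variable. The factor `x` and the choice of level 2 as an alternative reference are hypothetical and used only for illustration.

```r
# Treatment (dummy) contrast matrix for a 3-level categorical variable.
# By default level 1 is the reference level, so its row is all zeros.
contr.treatment(3)

# Assumption for illustration: use level 2 as the reference level instead.
contr.treatment(3, base = 2)

# Hypothetical factor, to show how the coding enters a model matrix.
x <- factor(c("a", "b", "c", "a"))
contrasts(x) <- contr.treatment(3)
model.matrix(~ x)
```

Each non-reference level gets its own 0/1 column, so a regression coefficient on that column compares the level with the reference level.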

How to Plot Multiple t-distribution Bell-shaped Curves in R

This tutorial shows how to plot multiple t-distribution bell-shaped curves in R. In short, it combines dt(), plot(), and lines(). Example 1 of plotting multiple t-distribution bell-shaped curves The following R code plots two t-distribution bell-shaped curves, with 1 and 5 degrees of freedom. Example 2 of plotting …
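The combination of dt(), plot(), and lines() described above might look like the sketch below. The x range, colors, and line types are arbitrary choices, not taken from the tutorial.

```r
# Grid of x values over which to evaluate the densities.
x <- seq(-4, 4, length.out = 200)

# Densities of t-distributions with 1 and 5 degrees of freedom.
y_df1 <- dt(x, df = 1)
y_df5 <- dt(x, df = 5)

# Draw the df = 5 curve first because it has the taller peak,
# then overlay the df = 1 curve on the same axes with lines().
plot(x, y_df5, type = "l", col = "blue",
     xlab = "x", ylab = "Density",
     main = "t-distributions with df = 1 and df = 5")
lines(x, y_df1, col = "red", lty = 2)
legend("topright", legend = c("df = 5", "df = 1"),
       col = c("blue", "red"), lty = c(1, 2))
```

Plotting the taller curve first lets plot() choose a y-axis range that fits both curves.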

Categories R

How to Simulate a Dataset for Logistic Regression in R

This tutorial shows the steps for simulating a dataset for logistic regression in R. Logistic regression is based on the following link function. The steps for simulating such a dataset in R are as follows. Step 1: Generate Xs Suppose that we have 2 Xs in the logistic regression, and the following …
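One way such a simulation could be sketched is shown below. The sample size, the seed, the normal distribution of the two Xs, and the coefficient values are all assumptions chosen for illustration; the tutorial's own steps beyond Step 1 are not shown in the excerpt.

```r
set.seed(123)  # arbitrary seed, for reproducibility only
n <- 1000      # hypothetical sample size

# Step 1: generate two predictors (assumed standard normal here).
x1 <- rnorm(n)
x2 <- rnorm(n)

# Assumed "true" coefficients, chosen only for illustration.
b0 <- -0.5
b1 <- 1.0
b2 <- -2.0

# Linear predictor on the log-odds scale, then the inverse logit.
eta <- b0 + b1 * x1 + b2 * x2
p <- plogis(eta)  # same as 1 / (1 + exp(-eta))

# Draw binary outcomes with success probability p.
y <- rbinom(n, size = 1, prob = p)
sim_data <- data.frame(y, x1, x2)

# Fit a logistic regression to check that the estimates
# land near the assumed coefficients.
fit <- glm(y ~ x1 + x2, family = binomial, data = sim_data)
coef(fit)
```

Comparing coef(fit) with the assumed values is a quick sanity check that the simulated data follow the intended model.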


Poisson Regression in R

You can set family = poisson in the glm() function to fit a Poisson regression in R. glm(model_statement, family = poisson, data = data_file_name) Data Example This tutorial uses a dataset for Poisson regression. The following shows the key variables in this dataset. We are going to see whether age can predict the number of people in …
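The glm() call above can be tried on simulated data, as in this sketch. The data frame `dat`, the variables `age` and `count`, and the coefficient values are hypothetical and are not the tutorial's dataset.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
n <- 500      # hypothetical sample size

# Hypothetical predictor and a count outcome whose log-mean is linear in age.
age <- runif(n, min = 20, max = 70)
lambda <- exp(0.2 + 0.02 * age)  # assumed coefficients, for illustration
count <- rpois(n, lambda)
dat <- data.frame(age, count)

# Poisson regression with the default log link.
fit <- glm(count ~ age, family = poisson, data = dat)
summary(fit)

# Exponentiated coefficients: multiplicative change in the expected
# count per one-unit increase in the predictor.
exp(coef(fit))
```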


Difference between Logit and Probit

This tutorial explains the difference between logit and probit in statistics, with formulas and examples. Formula and Example for Logit We can start with the following formula. Thus, \( \beta_0+\beta_1x_1+\dots+\beta_nx_n \) can range from \( -\infty \) to \( +\infty \), while \( p(y=1) \) always stays within the range \( (0,1) \). We …
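As a small numeric illustration of that range property, the sketch below maps a few values of the linear predictor to probabilities with the standard inverse logit (plogis()) and, for comparison, the standard normal CDF used by probit (pnorm()). The values of `eta` are arbitrary, and this is not necessarily the formula the tutorial presents.

```r
# Arbitrary values of the linear predictor, for illustration.
eta <- c(-5, -2, 0, 2, 5)

# Logit model: p = 1 / (1 + exp(-eta)), the logistic CDF.
p_logit <- plogis(eta)

# Probit model: p = Phi(eta), the standard normal CDF.
p_probit <- pnorm(eta)

# Both columns stay strictly between 0 and 1.
data.frame(eta, p_logit, p_probit)

# qlogis() and qnorm() invert the two transformations.
qlogis(p_logit)
qnorm(p_probit)
```

Both functions squeeze the whole real line into (0,1); they differ in how quickly the probabilities approach 0 and 1 in the tails.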

Remove Rows with NA in One Column in R

This tutorial shows how to remove rows with NA in one column in R. There are two R syntaxes for removing rows with NA in a single column. # Use is.na() function dataframe_name[!is.na(dataframe_name$column_name),] # Use na.omit() function (the cols argument is honored only by the data.table method; on a plain data frame it is ignored and rows with NA in any column are dropped) na.omit(dataframe_name, cols = "column_name") Sample Data Frame Being Used The following R code generates a sample …
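The is.na() approach can be sketched in base R as follows. The data frame `df` and its columns are hypothetical. The sketch also shows how na.omit() behaves on a plain data frame, where extra arguments such as cols are silently ignored.

```r
# Hypothetical sample data frame with NAs in two different columns.
df <- data.frame(id = 1:5,
                 score = c(10, NA, 30, 40, NA),
                 group = c("a", "b", NA, "d", "e"))

# Keep rows where 'score' is not NA; NAs in other columns are left alone.
df_score_ok <- df[!is.na(df$score), ]
nrow(df_score_ok)

# na.omit() on a plain data frame drops rows with NA in ANY column.
df_complete <- na.omit(df)
nrow(df_complete)

# The base data.frame method ignores 'cols', so this gives the same
# result as na.omit(df) rather than filtering on 'score' only.
df_cols <- na.omit(df, cols = "score")
nrow(df_cols)
```

If only one column should drive the filtering on a plain data frame, the is.na() subsetting is the safer choice.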


How to Change Column Names in R

There are two ways to change column names in R: changing a specific column name and changing all column names. The following is the R syntax for each. Method 1: Change a specific column name: colnames(df_name)[which(names(df_name) == 'old_col_name')] <- 'new_col_name' Method 2: Change all column names: colnames(dataframe_name) <- c("New_col_name_1", "New_col_name_2", …) Example …
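Both methods can be tried on a small data frame, as in this sketch. The data frame `df` and all column names are hypothetical.

```r
# Hypothetical data frame with three columns.
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Method 1: rename one specific column by matching its current name.
colnames(df)[which(names(df) == "b")] <- "beta"
names(df)

# Method 2: replace all column names at once.
# The new vector must have one name per column, in column order.
colnames(df) <- c("col_1", "col_2", "col_3")
names(df)
```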


How to Read SPSS Files in R

You can use the read_sav() function from the haven package to read SPSS files in R. The syntax of the function is as follows. read_sav("file_name.sav") Step 1: Install the haven package The first step is to install the haven package on your local computer. install.packages('haven') Step 2: Load it in RStudio As with other packages, you need …
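A self-contained way to try read_sav() is sketched below. Since no particular SPSS file is assumed to exist, the sketch first writes a small hypothetical data frame to a temporary .sav file with haven's write_sav() and then reads it back. The example is skipped if haven is not installed.

```r
# install.packages("haven")  # run once if the package is not installed

if (requireNamespace("haven", quietly = TRUE)) {
  # Hypothetical example data written to a temporary .sav file,
  # so the sketch does not depend on any particular SPSS file.
  tmp <- tempfile(fileext = ".sav")
  haven::write_sav(data.frame(id = 1:3, score = c(2.5, 3.0, 4.5)), tmp)

  # Read the SPSS file back into R.
  spss_data <- haven::read_sav(tmp)
  print(spss_data)
} else {
  message("Package 'haven' is not installed; skipping the example.")
  spss_data <- NULL
}
```

With a real file, the call reduces to read_sav("file_name.sav") after loading the package with library(haven).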

How to Add Density Line on Histogram in R

This tutorial shows how to add a density line to a histogram in R. The following is the key part of the syntax, which sets freq = FALSE and adds lines() on top of the histogram. hist(data_name, freq = FALSE) lines(density(data_name)) Example 1 We are going to use the New Haven temperature data to plot the histogram and the …
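The two-line syntax above can be run on a built-in dataset, as in this sketch. It assumes that the New Haven temperature data is R's built-in `nhtemp` series; the title, axis label, and line styling are arbitrary.

```r
# Assumption: 'nhtemp' (built-in datasets package) is the New Haven
# temperature data referred to in the text.
# freq = FALSE puts the histogram on the density scale, so that the
# kernel density estimate is drawn on a comparable y-axis.
hist(nhtemp, freq = FALSE,
     main = "New Haven temperatures",
     xlab = "Temperature")

# Overlay the kernel density estimate on the existing histogram.
lines(density(nhtemp), col = "red", lwd = 2)
```

Without freq = FALSE the bars show counts, and the density line would sit almost flat along the bottom of the plot.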


How to Increase Bin Density in Histogram in R

This tutorial shows how to increase the bin density of a histogram in R. You can use the breaks argument of hist() to do so. Below is the basic R syntax. hist(dataset_name, breaks = 50) Example 1 We are going to use the “Average Yearly Temperatures in New Haven” data for the first example to show how to …
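The effect of breaks can be checked by counting bins, as in the sketch below. It assumes that the “Average Yearly Temperatures in New Haven” data is R's built-in `nhtemp` series; the title and axis label are arbitrary.

```r
# Assumption: 'nhtemp' (built-in datasets package) is the New Haven data.
# plot = FALSE returns the histogram object without drawing it.
h_default <- hist(nhtemp, plot = FALSE)
h_fine <- hist(nhtemp, breaks = 50, plot = FALSE)

# Compare the number of bins under the default and with breaks = 50.
length(h_default$counts)
length(h_fine$counts)

# Draw the finer histogram.
hist(nhtemp, breaks = 50,
     main = "New Haven temperatures, breaks = 50",
     xlab = "Temperature")
```

When breaks is a single number, hist() treats it as a suggestion and rounds the breakpoints to convenient values, so the actual number of bins may differ from 50.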
