What is One-way ANOVA? Formula and Example

One-way ANOVA compares the means of different groups to see whether the differences between them are statistically significant. For instance, suppose you would like to compare the average household size of three cities. You can collect one sample from each of the three cities and conduct a one-way ANOVA to check the difference. Formulas of One-way ANOVA … Read more
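As a rough illustration of the three-city example, the sketch below computes the one-way ANOVA F statistic by hand using only the standard library. The household sizes are made-up numbers and `one_way_anova_f` is a hypothetical helper name, not something taken from the post; the formulas are the usual between-group and within-group sums of squares.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical household sizes sampled from three cities (invented data).
city_a = [2, 3, 3, 4, 3]
city_b = [3, 4, 4, 5, 4]
city_c = [5, 5, 6, 6, 8]

f_stat, df_b, df_w = one_way_anova_f([city_a, city_b, city_c])
print(f_stat, df_b, df_w)
```

For these invented samples the F statistic works out to 14 (up to floating-point rounding) with 2 and 12 degrees of freedom. Turning F into a p-value requires the F distribution; if SciPy is available, `scipy.stats.f_oneway` does both steps in one call.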

LaTeX Formula Cheatsheet

This page includes statistics formulas in raw LaTeX code. Writing a complex formula can be painful, so I hope this page is useful for those who need to write them. In case you need to find symbols in LaTeX, this linked PDF could be useful. LaTeX Code for Correlation Formula The following are … Read more
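The excerpt cuts off before the formula itself, so the following is only a sketch of what raw LaTeX for a correlation formula can look like. It assumes the formula meant is the sample Pearson correlation coefficient; the symbols $x_i$, $y_i$, $\bar{x}$, $\bar{y}$, and $n$ are the conventional ones, not names confirmed by the post.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```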

When to Use t-test versus Correlation in Data Analysis

Since both correlation and t-test are about relationships between X and Y, what is the difference between them and when do you use t-test (or correlation)? This tutorial aims to answer these two questions. The following figure presents the difference between t-test and correlation. In particular, t-test deals with situations where X is a binary … Read more

How to Do Scatter Plots in Python

This tutorial shows how to use Pandas, Matplotlib, and Seaborn for scatter plots in Python with examples, code, and charts. There are two methods of doing scatter plots in Python. The following shows the core syntax. Pandas: df.plot(kind="scatter", x="column_x", y="column_y") Seaborn: sns.lmplot(x="column_x", y="column_y", data=df, fit_reg=True) Example 1: Use Pandas for scatter plots in … Read more

Plot Histogram in Python

Introduction We can use Matplotlib, pandas, and seaborn to plot histograms in Python. The following is the basic syntax for plotting a histogram with each of these 3 modules. Method 1: Using matplotlib plt.hist(data, bins='auto') Method 2: Using pandas df.hist() Method 3: Using seaborn sns.histplot(data=dataset, x='column_name') Sample Data for Histogram We generate a random sample of … Read more

How to Calculate Mean in Python (NumPy)

This short tutorial shows how you can calculate the mean in Python using NumPy. First, we generate random data with a mean of 5 and a standard deviation (SD) of 1. Then, you can use NumPy's mean() function. As you can see, the mean of the sample is close to 5. 4.943504497663466 Regarding how … Read more
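The two steps described here can be sketched as follows. The sample size of 1000 and the fixed seed are assumptions added for repeatability, so the printed value will not match the figure quoted in the excerpt.

```python
import numpy as np

np.random.seed(0)  # assumed seed, only so the run is repeatable

# Draw a sample from a normal distribution with mean 5 and SD 1.
# The size of 1000 is an assumption; the excerpt does not state it.
data = np.random.normal(loc=5, scale=1, size=1000)

# np.mean returns the arithmetic mean of the array.
sample_mean = np.mean(data)
print(sample_mean)
```

With a sample this large, the result should land close to 5, though not exactly on it.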

Generate Sample of Normal Distribution in Python NumPy

This tutorial shows how to generate a sample from a normal distribution using NumPy in Python. The following shows the syntax of two methods. Method 1: It can change the default values (Default: mu=0 and sd=1; NumPy names these arguments loc and scale). np.random.normal(loc=0, scale=1, size=size) Method 2: It can only generate numbers from the standard normal (mu=0 and sd=1). But, it can have different … Read more
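A short sketch of the two approaches follows. In NumPy's signature the mean is passed as `loc` and the SD as `scale`. The excerpt is cut off before it names the function used for Method 2, so `np.random.standard_normal` is an assumption here: it is one NumPy function that fits the description (always mean 0 and SD 1), not necessarily the one the tutorial uses. The seed, sizes, and parameter values are also illustrative.

```python
import numpy as np

np.random.seed(1)  # assumed seed, only so the run is repeatable

# Method 1: override the defaults (loc is the mean, scale is the SD).
custom = np.random.normal(loc=10, scale=2, size=5)

# Method 2 (assumed function): standard normal only, mean 0 and SD 1.
# The size argument may be a tuple to request a multi-dimensional array.
standard = np.random.standard_normal(size=(2, 3))

print(custom.shape, standard.shape)
```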

How to Plot Bar Charts in Python

This tutorial will show how you can plot bar charts using Python with detailed examples. Similar to line charts, bar charts show the relationship between X (on the x-axis) and Y (on the y-axis). I will first use the same data as in line charts to illustrate how to plot bar charts. Then, I will use another … Read more