Pandas: Concat Two Dataframes

This tutorial shows how to use the concat() function in Python, with examples. pd.concat() can be used to concatenate two dataframes in Pandas. The syntax for combining two dataframes (df_1 and df_2) by adding columns: pd.concat([df_1, df_2], axis=1). The syntax for combining two dataframes by adding rows: pd.concat([df_1, df_2], axis=0). Example … Read more
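The two forms of the syntax above can be sketched as follows. The contents of df_1 and df_2 are made up for illustration, and ignore_index=True is an optional extra that is not part of the syntax shown in the excerpt.

```python
import pandas as pd

# Hypothetical sample data; only the names df_1 and df_2 come from the text.
df_1 = pd.DataFrame({"Brand": ["BrandA", "BrandB"], "Number": [20, 30]})
df_2 = pd.DataFrame({"Brand": ["BrandC", "BrandD"], "Number": [25, 20]})

# axis=1 places the dataframes side by side (adds columns).
by_columns = pd.concat([df_1, df_2], axis=1)

# axis=0 stacks the dataframes on top of each other (adds rows).
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1.
by_rows = pd.concat([df_1, df_2], axis=0, ignore_index=True)

print(by_columns.shape)  # (2, 4)
print(by_rows.shape)     # (4, 2)
```

Note that with axis=1 the result has duplicate column names (Brand and Number appear twice), because rows are aligned on the index and columns are kept as they are.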

How to Sum Rows in Dataframe in Python

This tutorial shows how to sum rows in a dataframe using Pandas in Python.

Example 1: Use sum() for all rows

    Brand   Location  Number
0   BrandA  CA        20
1   BrandB  CA        30
2   BrandA  NY        25
3   BrandC  MA        20
4   BrandA  CA        20

115
115

Example 2: Use sum() for selected rows The following … Read more
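A minimal sketch of both examples, using the sample data shown in the excerpt. The row filter in the second part (Brand equal to "BrandA") is an assumption chosen for illustration, since the excerpt is cut off before it shows which rows are selected.

```python
import pandas as pd

# Sample data reconstructed from the excerpt above.
df = pd.DataFrame({
    "Brand": ["BrandA", "BrandB", "BrandA", "BrandC", "BrandA"],
    "Location": ["CA", "CA", "NY", "MA", "CA"],
    "Number": [20, 30, 25, 20, 20],
})

# Sum the Number column over all rows.
total = df["Number"].sum()
print(total)  # 115

# Sum over selected rows only; this condition is a hypothetical choice.
brand_a_total = df.loc[df["Brand"] == "BrandA", "Number"].sum()
print(brand_a_total)  # 65
```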

How to Reorder Columns in DataFrame in Pandas Python

This tutorial shows how to reorder columns in a dataframe using Pandas in Python. It includes 3 examples, using reindex(), double brackets [[]], and pop(). Sample Dataframe The following is the sample dataframe in Python, which will be used later in all examples in this tutorial. The following is the final sample dataframe. … Read more
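The three methods named above might look like the following sketch. The dataframe and its column names (A, B, C) are hypothetical, because the excerpt does not show the tutorial's sample data, and pairing pop() with insert() is one possible way to use pop() for reordering.

```python
import pandas as pd

# Hypothetical dataframe; the tutorial's own sample data is not shown.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
new_order = ["C", "A", "B"]

# Method 1: reindex() with the desired column order.
by_reindex = df.reindex(columns=new_order)

# Method 2: double brackets select the columns in the listed order.
by_brackets = df[new_order]

# Method 3: pop() removes a column and returns it; insert() puts it back
# at the chosen position. Work on a copy, since both modify in place.
by_pop = df.copy()
col_c = by_pop.pop("C")
by_pop.insert(0, "C", col_c)

print(list(by_pop.columns))  # ['C', 'A', 'B']
```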

How to Use Pandas Melt() Function

This short tutorial shows how you can use the melt() function in Pandas. It is often used when we need to change the format of a dataframe to fit certain statistical functions. Example 1 of Using melt() City1 City2 City3 0 6 2 4 1 2 1 1 2 3 3 2 3 4 … Read more
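As a rough sketch, melt() turns a wide dataframe into a long one with one row per value. Only the first three rows of the sample data are used here, because the fourth row is cut off in the excerpt. The output column names City and Value are illustrative choices.

```python
import pandas as pd

# First three complete rows of the sample data shown in the excerpt.
df = pd.DataFrame({
    "City1": [6, 2, 3],
    "City2": [2, 1, 3],
    "City3": [4, 1, 2],
})

# Wide to long: each original column name becomes a value in "City",
# and each cell becomes a value in "Value".
long_df = df.melt(var_name="City", value_name="Value")

print(long_df.shape)  # (9, 2)
print(long_df.head(3))
```

The long format is what many statistical functions expect: one column identifying the group and one column holding the measurement.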

What is One-way ANOVA? Formula and Example

One-way ANOVA compares the means of different groups to test whether the differences between them are statistically significant. For instance, suppose you would like to compare the average household size of three cities. You can collect a sample from each of the three cities and conduct a one-way ANOVA to check the difference. Formulas of One-way ANOVA … Read more
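The household-size example can be sketched with the standard one-way ANOVA F statistic (between-group mean square divided by within-group mean square). The numbers below are invented for illustration, and the variable names are not taken from the tutorial.

```python
import numpy as np

# Hypothetical household sizes sampled from three cities.
groups = [
    np.array([1, 2, 3]),
    np.array([2, 3, 4]),
    np.array([3, 4, 5]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)        # number of groups
n = all_values.size    # total number of observations

# Between-group sum of squares: group size times squared distance
# of each group mean from the grand mean.
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: squared distance of each value
# from its own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n - k)
f_stat = ms_between / ms_within

print(f_stat)  # 3.0
```

If SciPy is available, scipy.stats.f_oneway computes the same F statistic and also returns a p-value.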

When to Use t-test versus Correlation in Data Analysis

Since both correlation and the t-test are about the relationship between X and Y, what is the difference between them, and when should you use one rather than the other? This tutorial aims to answer these two questions. The following figure presents the difference between the t-test and correlation. In particular, the t-test deals with situations where X is a binary … Read more

How to Calculate Mean in Python (NumPy)

This short tutorial shows how you can calculate the mean in Python using NumPy. First, we generate random data with a mean of 5 and a standard deviation (SD) of 1. Then, you can use NumPy's mean() function. As you can see, the mean of the sample is close to 5. 4.943504497663466 Regarding of how … Read more
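The two steps described above can be sketched as follows. The sample size and the seed are assumptions, since the excerpt does not state them, so the printed value will not match the figure quoted above.

```python
import numpy as np

# Fix the seed so the run is reproducible (the seed value is arbitrary).
np.random.seed(0)

# Step 1: random data with mean 5 and standard deviation 1.
# The sample size of 1000 is an assumption.
sample = np.random.normal(loc=5, scale=1, size=1000)

# Step 2: the sample mean, which should be close to 5.
sample_mean = np.mean(sample)
print(sample_mean)
```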

Generate Sample of Normal Distribution in Python NumPy

This tutorial shows how to generate a sample from a normal distribution using NumPy in Python. The following shows the syntax of two methods. Method 1: It can change the default values (Default: mu=0 and sd=1). np.random.normal(loc=0, scale=1, size=None), where loc is the mean (mu) and scale is the standard deviation (sd). Method 2: It can only generate numbers of standard normal (mu=0 and sd=1). But, it can have different … Read more
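A short sketch of Method 1, first with the defaults and then with a custom mean and standard deviation. The specific values (10, 2, and the shapes) are arbitrary choices for illustration. Method 2 is not shown because the excerpt is cut off before it names the function.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, for reproducibility only

# Defaults: loc=0 (mean) and scale=1 (standard deviation).
default_sample = np.random.normal(size=5)

# Custom mean and SD; size may also be a tuple to get a 2-D array.
custom_sample = np.random.normal(loc=10, scale=2, size=(2, 3))

print(default_sample.shape)  # (5,)
print(custom_sample.shape)   # (2, 3)
```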