How to Subset Rows in Pandas Dataframes

There are at least four methods to subset rows in Pandas dataframes.
Method 1: loc[[Comma]]: df.loc[[row_number1, row_number_2]]
Method 2: loc[Colon]: df.loc[row_number1: row_number_2]
Method 3: iloc[[Comma]]: df.iloc[[row_number1, row_number_2]]
Method 4: iloc[Colon]: df.iloc[row_number1: row_number_2]
Example 1 for Method 1: The following uses loc[[Comma]] (i.e., loc[[0,2]]) to subset rows in a Pandas dataframe. The following shows the original dataframe … Read more
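As a minimal sketch of the four forms listed above, the snippet below uses a small made-up dataframe (the column names and values are assumptions, not the data from the original post). With the default integer index, labels and positions coincide, which makes the one real difference easy to see: a loc slice includes its end label, while an iloc slice excludes its end position.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"brand": ["A", "B", "C", "D"], "price": [10, 20, 30, 40]})

rows_loc_list = df.loc[[0, 2]]     # rows labeled 0 and 2
rows_loc_slice = df.loc[0:2]       # rows labeled 0 through 2 (end label included)
rows_iloc_list = df.iloc[[0, 2]]   # rows at positions 0 and 2
rows_iloc_slice = df.iloc[0:2]     # rows at positions 0 and 1 (end position excluded)

print(rows_loc_slice)
print(rows_iloc_slice)
```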

Check if Any Value is NaN in a DataFrame

You can check if any value is NaN in a Pandas dataframe by using the following two methods.
Method 1: Check if any value is NaN by column: df.isnull().any()
Method 2: Check if any value is NaN in the whole dataframe: df.isnull().any().any()
Example for Method 1: The following checks if any value … Read more
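The two checks can be sketched as follows. The dataframe here is a made-up example (its column names and values are assumptions) with one NaN placed in column "a".

```python
import numpy as np
import pandas as pd

# Hypothetical data: only column "a" contains a NaN
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

nan_by_column = df.isnull().any()       # one boolean per column
nan_anywhere = df.isnull().any().any()  # a single boolean for the whole dataframe

print(nan_by_column)
print(nan_anywhere)
```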

How to Replace NaN with Blank/Empty Cells in Pandas

You can replace NaN with blank/empty cells using either fillna() or replace() in Python.
Single column:
Method 1: df['Column_name'].fillna(' ')
Method 2: df['Column_name'].replace(np.nan, ' ', regex=True)
Whole dataframe:
Method 1: df.fillna(' ')
Method 2: df.replace(np.nan, ' ', regex=True)
Example 1: single column. The following uses fillna() to replace NaN with empty cells in a single column. The updated dataframe has … Read more
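A minimal sketch of the fillna() variant is shown below, first on a single column and then on the whole dataframe. The data is hypothetical (both columns hold text so that filling with a string does not change the column type), and the replace() variant is left out of this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two text columns, each with one missing value
df = pd.DataFrame({"name": ["x", np.nan], "city": [np.nan, "y"]})

single_column = df["name"].fillna(" ")  # fill NaN in one column with a blank
whole_frame = df.fillna(" ")            # fill NaN everywhere with a blank

print(single_column)
print(whole_frame)
```

Note that fillna() returns a new object here; assign the result back (for example, df = df.fillna(" ")) if the original dataframe should be updated.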

Difference between NumPy Random and Python Random

NumPy Random refers to the numpy.random module in NumPy, whereas Python Random refers to the random module in Python's standard library. That is, Python's random is NOT part of NumPy. This tutorial uses two examples to show the difference between NumPy Random and Python Random. Example 1: Python Random's randint only has parameters for the range, whereas NumPy Random's randint has the additional … Read more
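As a rough illustration of the two randint functions (not the original post's example), the sketch below calls both with the same range. Two well-documented differences show up: random.randint includes both endpoints and returns a single int, while np.random.randint excludes the upper bound and can return an array via its size argument.

```python
import random

import numpy as np

# Python's standard library: both endpoints are included, so 1, 2, or 3
py_value = random.randint(1, 3)

# NumPy: the upper bound is excluded, so only 1 or 2; size asks for an array
np_values = np.random.randint(1, 3, size=5)

print(py_value)
print(np_values)
```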

Examples of random.seed() in Python

The random.seed() function initializes the random number generator with a given seed. Thus, by calling seed() with the same value, the random functions generate the same numbers across multiple code executions. Example 1: Example 1 shows how to use random.seed() and how it impacts the generated numbers. Note that random.random() generates a floating point number in the range 0.0 <= X < 1.0. The following … Read more
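The effect of seeding can be sketched as follows. The seed value 42 is an arbitrary choice for illustration; any fixed value behaves the same way.

```python
import random

random.seed(42)  # arbitrary seed chosen for this sketch
first_run = [random.random() for _ in range(3)]

random.seed(42)  # re-seeding with the same value restarts the same sequence
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)
```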

When to Use ddof=1 in np.std()

The following are the rules for using ddof in np.std() in NumPy.
Rule 1: If you are calculating the standard deviation for a sample, set ddof = 1 in np.std(): np.std(sample_name, ddof=1)
Rule 2: If you are calculating the standard deviation for a population, set ddof = 0 in np.std(): np.std(population_name, ddof=0)
Example of ddof = 1 … Read more
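To see what ddof changes, the sketch below computes both versions on a small made-up list of numbers (the data is an assumption). With n values, np.std divides the sum of squared deviations by n - ddof, so ddof=1 divides by n - 1 and gives a slightly larger result.

```python
import numpy as np

# Hypothetical data: mean is 5, sum of squared deviations is 32
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_std = np.std(data, ddof=0)  # sqrt(32 / 8)
sample_std = np.std(data, ddof=1)      # sqrt(32 / 7)

print(population_std)
print(sample_std)
```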

Generate Random Numbers in Python

This tutorial shows how you can use NumPy to generate random numbers in Python. The following is the basic syntax summarizing three functions.
1. Integers: np.random.randint()
2. Standard normal distribution: np.random.randn()
3. Uniform distribution over [0, 1): np.random.rand()
Example 1: Integer
np.random.randint(low, high=None, size=None, dtype=int)
np.random.randint() returns random integers. Given that there are quite a few parameters in randint(), it is … Read more
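A short sketch of the three functions is given below. The ranges and sizes are arbitrary choices for illustration, and the printed values will differ from run to run because no seed is set.

```python
import numpy as np

ints = np.random.randint(0, 10, size=5)  # 5 integers from 0 to 9 (10 is excluded)
normals = np.random.randn(5)             # 5 draws from the standard normal distribution
uniforms = np.random.rand(5)             # 5 draws from the uniform distribution on [0, 1)

print(ints)
print(normals)
print(uniforms)
```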

How to Round Numbers in Pandas

You can use round() and apply() to round numbers, round them up, or round them down in Pandas.
Method 1: Round to specific decimal places: df.round(decimals = number of specific decimal places)
Method 2: Round up values: df['DataFrame column'].apply(np.ceil)
Method 3: Round down values: df.apply(np.floor)
Data being used: The following is a column of numbers that we are going to use in this … Read more
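The three rounding approaches can be sketched on a small made-up column (the column name and values are assumptions, not the data from the original post):

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"value": [1.234, 2.567, 3.5]})

rounded = df.round(decimals=1)           # round to one decimal place
rounded_up = df["value"].apply(np.ceil)  # round each value in one column up
rounded_down = df.apply(np.floor)        # round every column down

print(rounded)
print(rounded_up)
print(rounded_down)
```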

How to Create an Empty Pandas Dataframe

You can use DataFrame() to create an empty Pandas dataframe. The following is the basic syntax as well as two examples.
import pandas as pd
df = pd.DataFrame()
Example 1: The following creates an empty dataframe in Pandas and prints it out. The following is the output. As we can see, both columns and indexes … Read more
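A minimal sketch of creating and inspecting an empty dataframe is shown below. The second dataframe, with column names but no rows, is an illustrative variation and not necessarily one of the original post's two examples; its column names are hypothetical.

```python
import pandas as pd

# An empty dataframe: no columns and no rows
df = pd.DataFrame()
print(df)
print(df.shape)

# Illustrative variation: column names defined up front, but still no rows
df_with_columns = pd.DataFrame(columns=["name", "price"])
print(df_with_columns.empty)
```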

How to Replace NaN with Zero in Pandas

You can replace NaN with zero in Pandas using either fillna(0) or replace(np.nan, 0), where np.nan comes from NumPy.
Single column:
Method 1: df['Column_name'].fillna(0)
Method 2: df['Column_name'].replace(np.nan, 0)
Whole dataframe:
Method 1: df.fillna(0)
Method 2: df.replace(np.nan, 0)
Example 1: single column. The following Python code first creates a dataframe with NaN in both columns and then replaces NaN in the first … Read more
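Both approaches can be sketched as follows, using a made-up dataframe with one NaN in each column (the column names and values are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one NaN in each column
df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})

column_filled = df["a"].fillna(0)    # single column, using fillna()
frame_filled = df.fillna(0)          # whole dataframe, using fillna()
frame_replaced = df.replace(np.nan, 0)  # whole dataframe, using replace()

print(column_filled)
print(frame_filled)
```

Both calls return new objects and leave df unchanged, so assign the result back if the original dataframe should be updated.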