You might encounter the following error when trying to convert NumPy arrays to a pandas DataFrame:
Exception: Data must be 1-dimensional
1. Reproduce the Error
# import numpy and pandas
import numpy as np
import pandas as pd
# Create a numpy array of data:
X = np.array([5, 2, 3, 4, 10, 11, 14]).reshape(-1, 1)
Y = np.array([3, 1, 2, 5, 14, 15, 16]).reshape(-1, 1)
# Convert the arrays to a DataFrame (this line raises the error)
df_1 = pd.DataFrame({'X': X, 'Y': Y}, index=range(1, 8))
print(df_1)
Output:
Exception: Data must be 1-dimensional
2. Why the Error Happens
It happens because pd.DataFrame expects each column in the dictionary to be a 1-D NumPy array or list, since a DataFrame column is one-dimensional. However, when you use reshape(-1, 1), the 1-D array becomes a 2-D array with a single column.
We can print the array with and without reshape(-1, 1) to see the difference.
X_without = np.array([5, 2, 3, 4, 10, 11, 14])
print('X without reshape(-1,1):\n', X_without)
X_with = np.array([5, 2, 3, 4, 10, 11, 14]).reshape(-1,1)
print('X with reshape(-1,1):\n', X_with)
Output:
X without reshape(-1,1):
 [ 5  2  3  4 10 11 14]
X with reshape(-1,1):
 [[ 5]
 [ 2]
 [ 3]
 [ 4]
 [10]
 [11]
 [14]]
We can also check the number of dimensions and the shape of each array to confirm this.
print('X_without dimension is:\n',np.ndim(X_without))
print('X_without shape is: \n',np.shape(X_without))
print('X_with dimension is:\n',np.ndim(X_with))
print('X_with shape is: \n',np.shape(X_with))
Output:
X_without dimension is:
 1
X_without shape is:
 (7,)
X_with dimension is:
 2
X_with shape is:
 (7, 1)
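To tie the shape check back to the error itself, here is a minimal sketch that tries to build a one-column DataFrame from each array and reports whether pandas accepts it. The helper name can_build_column is hypothetical (it is not part of pandas), and it catches a broad Exception on purpose, because the exact exception class and message may differ between pandas versions.

```python
import numpy as np
import pandas as pd


def can_build_column(values):
    """Hypothetical helper: return True if pandas accepts `values`
    as a single column inside a dictionary, False if it raises."""
    try:
        pd.DataFrame({'col': values})
        return True
    except Exception:  # broad on purpose: the class varies by pandas version
        return False


X_without = np.array([5, 2, 3, 4, 10, 11, 14])
X_with = X_without.reshape(-1, 1)

# Report the number of dimensions next to whether the column is accepted
for name, arr in [('X_without', X_without), ('X_with', X_with)]:
    print(name, '| ndim =', arr.ndim, '| accepted =', can_build_column(arr))
```

The 1-D array should be accepted and the 2-D array rejected, which suggests that the number of dimensions, not the values, is what triggers the error.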
3. How to Fix the Error
Method 1: Remove reshape(-1,1)
To fix the “Data must be 1-dimensional” error, you can remove reshape(-1, 1) so that X and Y stay 1-D arrays. The following is the code.
# import numpy and pandas
import numpy as np
import pandas as pd
# Create a numpy array of data:
X = np.array([5, 2, 3, 4, 10, 11, 14])
Y = np.array([3, 1, 2, 5, 14, 15, 16])
# Convert the 1-D arrays to a DataFrame
df_1 = pd.DataFrame({'X': X, 'Y': Y}, index=range(1, 8))
print(df_1)
Output:
    X   Y
1   5   3
2   2   1
3   3   2
4   4   5
5  10  14
6  11  15
7  14  16
Method 2: Use np.ravel()
np.ravel() returns a contiguous flattened (1-D) array. Thus, we can use it to convert the 2-D arrays back to 1-D arrays before building the DataFrame.
# import numpy and pandas
import numpy as np
import pandas as pd
# Create a numpy array of data:
X = np.array([5, 2, 3, 4, 10, 11, 14]).reshape(-1, 1)
Y = np.array([3, 1, 2, 5, 14, 15, 16]).reshape(-1, 1)
# Flatten each 2-D array with ravel() and convert to a DataFrame
df_1 = pd.DataFrame({'X': X.ravel(), 'Y': Y.ravel()}, index=range(1, 8))
print(df_1)
Output:
    X   Y
1   5   3
2   2   1
3   3   2
4   4   5
5  10  14
6  11  15
7  14  16
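As a quick check of what ravel() does to the shape, and as one possible alternative when the data already arrives as column vectors, the sketch below stacks the two (7, 1) arrays side by side with np.hstack and passes the resulting 2-D array to pd.DataFrame with explicit column names. This alternative is not one of the two methods above, and the name df_2 is only illustrative.

```python
import numpy as np
import pandas as pd

# Same column-vector data as in the examples above
X = np.array([5, 2, 3, 4, 10, 11, 14]).reshape(-1, 1)
Y = np.array([3, 1, 2, 5, 14, 15, 16]).reshape(-1, 1)

# ravel() turns the (7, 1) array into a (7,) array
flat_X = X.ravel()
print(X.shape, flat_X.shape)  # (7, 1) (7,)

# Alternative sketch: join the column vectors into one (7, 2) array
# and let pandas treat each array column as a DataFrame column
stacked = np.hstack([X, Y])
df_2 = pd.DataFrame(stacked, columns=['X', 'Y'], index=range(1, 8))
print(df_2)
```

Passing a single 2-D array works because pandas maps its columns to DataFrame columns, whereas each value in a dictionary has to be 1-D on its own.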