Real-world data often contains missing values, and Pandas provides several functions to handle them.
1. Detecting Missing Data
You can check for missing values using isnull() or notnull():
print(df.isnull()) # Shows a DataFrame with True for missing values
print(df.notnull()) # Shows a DataFrame with True for non-missing values
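On anything larger than a toy table, the full True/False DataFrame is hard to read, so it often helps to summarize it. The sketch below assumes a small made-up DataFrame (the column names and values are illustrative only) and counts missing values per column, then selects the rows that have at least one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", None, "Dee"],
    "Age": [25, np.nan, 31, np.nan],
})

# True counts as 1 when summed, so this gives the number of missing values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# any(axis=1) is True for each row that contains at least one missing value.
rows_with_missing = df[df.isnull().any(axis=1)]
print(rows_with_missing)
```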
2. Dropping Missing Data
To drop rows containing missing values:
df = df.dropna() # Drop rows with missing values
To drop columns with missing values:
df = df.dropna(axis=1) # Drop columns with missing values
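By default dropna() removes a row (or column) if any value in it is missing, which can discard more data than intended. The how, subset, and thresh parameters of dropna() give finer control. The following is a minimal sketch on a made-up DataFrame; the data and variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row is entirely missing.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", None, None],
    "Age": [25, np.nan, 31, np.nan],
    "City": ["Oslo", "Rome", "Lima", None],
})

# Drop a row only if every value in it is missing.
all_missing_dropped = df.dropna(how="all")

# Drop rows that are missing a value in the 'Age' column; other columns are ignored.
age_present = df.dropna(subset=["Age"])

# Keep only rows that have at least 2 non-missing values.
at_least_two = df.dropna(thresh=2)

print(all_missing_dropped)
print(age_present)
print(at_least_two)
```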
3. Filling Missing Data
You can replace missing values with a specific value using fillna():
df['Age'] = df['Age'].fillna(30) # Fill missing 'Age' values with 30
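A hard-coded constant is not always a sensible replacement. One common alternative is to fill a numeric column with a statistic computed from the data itself, such as the median, and to pass fillna() a dict so each column gets its own fill value. The sketch below uses a made-up DataFrame, and the "Unknown" placeholder is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "Age": [20.0, np.nan, 40.0, np.nan],
    "City": ["Oslo", None, "Lima", "Rome"],
})

# median() skips missing values by default, so this is the median of 20 and 40.
age_median = df["Age"].median()

# A dict maps each column name to the value used to fill that column.
filled = df.fillna({"Age": age_median, "City": "Unknown"})
print(filled)
```

Whether the mean, the median, or another value is appropriate depends on the data and on what the column will be used for.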
You can also forward-fill or backward-fill using ffill() and bfill() (the method argument of fillna() is deprecated in recent pandas versions):
df = df.ffill() # Forward-fill: propagate the previous valid value forward
df = df.bfill() # Backward-fill: propagate the next valid value backward
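To see what each direction does, and what it leaves unfilled, here is a small sketch on a made-up Series using the dedicated ffill() and bfill() methods. Forward-fill cannot fill a leading gap because there is no earlier value to copy, and backward-fill cannot fill a trailing gap for the same reason. The optional limit parameter caps how many consecutive missing values are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a leading gap, an interior gap, and a trailing gap.
s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])

# Forward-fill: each gap takes the last valid value before it.
forward = s.ffill()

# Backward-fill: each gap takes the next valid value after it.
backward = s.bfill()

# Fill at most one consecutive missing value per gap.
limited = s.ffill(limit=1)

print(forward)
print(backward)
print(limited)
```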