Pandas DataFrame Manipulation Issue: Calculating Monthly Average from Daily Data


elvish11

New Member
Aug 9, 2023
I'm working on a data analysis project using Python and Pandas, and I'm facing an issue with manipulating a DataFrame that contains daily data. I have a DataFrame with two columns: date and value. I want to calculate the monthly average of the value column based on the daily data.

Here's a simplified version of my code:

Python:
import pandas as pd

# Sample data
data = {'date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-02-01', '2023-02-02'],
        'value': [10, 15, 20, 5, 8]}

df = pd.DataFrame(data)
df['date'] = pd.to_datetime(df['date'])

# Calculate monthly average
monthly_avg = df.resample('M', on='date').mean()
When I run this code, the monthly_avg DataFrame seems to have NaN values for all the rows. I suspect this is because I don't have data for every day in a month. Is there a way to calculate the monthly average even if I have missing days within a month? Or do I need to preprocess the data differently before calculating the monthly average?
I have searched other websites as well as this one, but I could not find an answer. I would appreciate any advice on how to correctly compute the monthly average from this daily data using Pandas. Thanks in advance for your help.
 

Sean Ho

seanho.com
Nov 19, 2019
Vancouver, BC
Works for me on Python 3.11. The average is calculated within each month bin, across however many observations fall in that bin. You do not need to have data for every day.
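To illustrate the point about month bins, here is a minimal sketch. It reuses the sample data from the question and adds one made-up April row (my assumption, not from the original post) so that March has no observations at all. It uses the 'MS' (month start) alias rather than 'M', since recent pandas releases deprecate 'M' in favor of 'ME'; only the bin labels differ, not the averages.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical April row
# (added only to leave March completely empty).
data = {
    "date": ["2023-01-01", "2023-01-02", "2023-01-03",
             "2023-02-01", "2023-02-02", "2023-04-10"],
    "value": [10, 15, 20, 5, 8, 12],
}

df = pd.DataFrame(data)
df["date"] = pd.to_datetime(df["date"])

# Average within each calendar month, labeled by the first day of the month.
monthly_avg = df.resample("MS", on="date")["value"].mean()
print(monthly_avg)
```

January and February are averaged over the few days they contain. Only March comes out as NaN, because that bin has no rows at all. If every row is NaN, the cause is more likely in the input (for example, values that are not numeric) than in the missing days.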
 

ItwasTheAi

New Member
Sep 10, 2023
Do you clean your data (take out unneeded rows)? Also, you might need to change the date to binary. I just finished something similar, and that was the problem I ran into: I had to change the date format. Hope this helps, and good luck.
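One possible reading of the cleaning and date-format advice above, sketched under assumptions: the raw columns arrive as strings and contain a few unusable entries. The `raw` data below is hypothetical and only stands in for a messy export. `pd.to_datetime` and `pd.to_numeric` with `errors="coerce"` turn unparseable entries into NaT/NaN so they can be dropped before averaging.

```python
import pandas as pd

# Hypothetical messy input: everything is a string, with two bad entries.
raw = {
    "date": ["2023-01-01", "2023-01-02", "2023-01-03", "not a date", "2023-02-02"],
    "value": ["10", "15", "n/a", "5", "8"],
}

clean = pd.DataFrame(raw)

# Convert to proper dtypes; entries that cannot be parsed become NaT / NaN.
clean["date"] = pd.to_datetime(clean["date"], errors="coerce")
clean["value"] = pd.to_numeric(clean["value"], errors="coerce")

# Drop rows where either the date or the value could not be parsed.
clean = clean.dropna(subset=["date", "value"])

# Group by calendar month and average the remaining values.
monthly = clean.groupby(clean["date"].dt.to_period("M"))["value"].mean()
print(monthly)
```

Checking `df.dtypes` before resampling is a quick way to see whether a step like this is needed at all.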