Calculating monthly averages of daily temperature data including negative and positive values - group-by

I am trying to calculate monthly average temperatures for a dataset with daily temperature values that spans three years. The data.frame looks like this example:
Date Month Temperature
12-2-2016 December -10
12-3-2016 December -12
01-2-2017 January -15
01-3-2017 January -14
02-3-2017 February 3
02-4-2017 February 7
03-2-2017 March 8
03-3-2017 March 9
I tried running the following code in order to create a new dataframe with the Month and the average temperature:
group_by(df$month) %>%
summarise(mean_airtemp = mean(Temperature))
However, when running this code I get NA for certain months which I believe is attributed to negative values. I have tried to figure it out but have only found solutions that seem to separate the values based on whether they are negative or positive.

You can group by Month and take the mean of Temperature. In pandas, for example:
df.groupby(['Month'])['Temperature'].mean().reset_index(name = 'avg')
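As a side note on the NA months: negative numbers average like any other numbers, whereas R's mean() returns NA whenever its input contains an NA unless na.rm = TRUE is passed. The question does not show the full data, so missing readings being the cause is only an assumption. A minimal plain-Python sketch of the grouping, with a hypothetical skip_missing flag playing the role of na.rm and None standing in for NA:

```python
from collections import defaultdict

def monthly_means(rows, skip_missing=True):
    """Average temperature per month.

    rows: iterable of (month, temperature); temperature is None when missing.
    skip_missing is a hypothetical flag that mimics na.rm = TRUE in R.
    """
    groups = defaultdict(list)
    for month, temp in rows:
        groups[month].append(temp)

    result = {}
    for month, temps in groups.items():
        if skip_missing:
            temps = [t for t in temps if t is not None]
        if not temps or any(t is None for t in temps):
            result[month] = None  # a missing value poisons the mean, as NA does in R
        else:
            result[month] = sum(temps) / len(temps)
    return result

# The sample rows from the question
rows = [
    ("December", -10), ("December", -12),
    ("January", -15), ("January", -14),
    ("February", 3), ("February", 7),
    ("March", 8), ("March", 9),
]
means = monthly_means(rows)

# Hypothetical extra row with a missing reading, to show the NA-like result
rows_with_gap = rows + [("January", None)]
strict = monthly_means(rows_with_gap, skip_missing=False)
lenient = monthly_means(rows_with_gap, skip_missing=True)
```

With the sample rows, the negative months average without any special handling; only the month containing the made-up missing reading comes back as None when skip_missing is False.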

Related

How can I show cumulative percentage or sum of an attribute in a line chart power bi

I have a business requirement to show the trend of the cumulative percentage of an attribute over the last 2 years. I used a measure to create a running total of the attribute for this year, and the results are correct. However, when I tried to apply the same logic to the past 2 years, it gave me incorrect values. I have created a date table that spans from 2021 to Dec 2023, and the X axis of the chart displays the week number. Could someone help me with this issue?
Thanks in Advance!
The measure for this year :
Cumulative2023 =
CALCULATE(
    DIVIDE(
        'Table'[CY CountNum2023],
        'Table'[CY CountDen2023],
        0
    ),
    FILTER(
        ALLSELECTED('Table'),
        'Table'[CreatedDate] <= MAX('Table'[CreatedDate])
    )
)
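To make the intended calculation concrete outside Power BI, here is a rough Python sketch of a cumulative percentage per week that restarts for every year. The question does not say what the incorrect values look like, so whether the running total failing to restart per year is the actual problem is only an assumption. The function name, the row layout (year, week, numerator, denominator) and the sample numbers are all made up for illustration.

```python
from collections import defaultdict

def cumulative_ratio_by_year(rows):
    """rows: iterable of (year, week, numerator, denominator) counts.

    Returns {(year, week): running numerator / running denominator},
    with the running totals restarting for each year.
    """
    # Total the counts per (year, week) first
    weekly = defaultdict(lambda: [0, 0])
    for year, week, num, den in rows:
        weekly[(year, week)][0] += num
        weekly[(year, week)][1] += den

    running = {}  # year -> [running numerator, running denominator]
    result = {}
    for year, week in sorted(weekly):
        num, den = weekly[(year, week)]
        totals = running.setdefault(year, [0, 0])
        totals[0] += num
        totals[1] += den
        # Fall back to 0 on an empty denominator, like DIVIDE(..., 0)
        result[(year, week)] = totals[0] / totals[1] if totals[1] else 0.0
    return result

# Made-up sample counts
rows = [
    (2022, 1, 1, 4), (2022, 2, 3, 4),
    (2023, 1, 2, 5), (2023, 2, 1, 5),
]
cumulative = cumulative_ratio_by_year(rows)
```

Keying the result by (year, week) keeps week 1 of 2022 separate from week 1 of 2023, which matters when the chart axis shows only the week number.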

How to calculate averages for a 3D matrix for the same day of data over multiple years in Matlab?

I am interested in calculating GPH anomalies in Matlab. I have a 3D matrix of lat, lon, and data, where the data (3rd dimension) is a daily GPH value in one-day increments for 32 years (from Jan. 1st 1979 to Jan. 1st 2011). The matrix is 95x38x11689. How do I compute a daily average across all years for each day of data, when the matrix is 3D?
In other words, how do I compute the average of Jan. 1st dates for all years to compute the climatological mean of all Jan. 1st's from 1979-2010 (where I don't have time information, but a GPH value for each day)? And so forth for each day after. The data also includes leap years. How do I handle that?
Example: Sort, and average all Jan. 1st GPH values for indices 1, 365, 730, etc. And for each day of all years after that in the same manner.
First let's take out all the Feb 29ths, because these days sit in the middle of the data, do not appear in every year, and would throw off the averaging:
Feb29=60+365*(1:4:32)+(0:7); % +(0:7) counts the leap days already passed, so each index lands on a Feb 29th
mean_Feb29=mean(GPH(:,:,Feb29),3); % A matrix of 95x38 with the mean of all February 29th
GPH(:,:,Feb29)=[]; % omit Feb 29th from the data
Last_Jan_1=GPH(:,:,end); % for Jan 1st you have additional data set, of year 2011
GPH(:,:,end)=[]; % omit Jan 1st 2011
re_GPH=reshape(GPH,95,38,365,[]);
av_GPH=mean(re_GPH,4);
Now av_GPH is a matrix of 95x38x365, where each slice in the 3rd dimension is the average of one day of the year, starting with Jan 1st.
If you want to include the last Jan 1st (Jan 1st 2011), run this line after the previous code. It weights the 32-year mean by 32 so that all 33 Jan 1st values count equally:
av_GPH(:,:,1)=(32*av_GPH(:,:,1)+Last_Jan_1)/33;
To make it easy to see which slice number corresponds to each date, you can make an array of all the dates in the year:
t1 = datetime(2011,1,1,'format','MMMMd')
t2 = datetime(2011,12,31,'format','MMMMd')
t3=t1:t2;
Now, for example :
t3(156)=
datetime
June5
So av_GPH(:,:,156) is the average of June 5th.
Regarding your comment, if you want to subtract the daily average from each day:
sub_GPH=GPH-repmat(av_GPH,1,1,32);
And for February 29th, you will need to do that BEFORE you erase those days from the data (the third line of the code above):
sub_GPH_Feb_29=GPH(:,:,Feb29)-repmat(mean_Feb29,1,1,8);
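As a cross-check of the leap-year bookkeeping, here is a small Python sketch that walks the calendar instead of computing indices by hand. It is simplified to a single 1-D series (one grid point) rather than the 95x38 grid, and the series values are placeholders, not real GPH data. The names daily_climatology and feb29_positions are made up for this example.

```python
from collections import defaultdict
from datetime import date, timedelta

def daily_climatology(values, start):
    """Mean per calendar day; values[i] is the reading for start + i days."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for offset, value in enumerate(values):
        day = start + timedelta(days=offset)
        key = (day.month, day.day)  # Feb 29th simply gets its own key
        sums[key] += value
        counts[key] += 1
    means = {key: sums[key] / counts[key] for key in sums}
    return means, dict(counts)

start = date(1979, 1, 1)
n_days = (date(2011, 1, 1) - start).days + 1  # both end dates included
series = [float(i % 7) for i in range(n_days)]  # placeholder values, not real GPH
clim, counts = daily_climatology(series, start)

# 1-based positions of every Feb 29th, comparable to MATLAB indices
dates = [start + timedelta(days=i) for i in range(n_days)]
feb29_positions = [i + 1 for i, d in enumerate(dates) if (d.month, d.day) == (2, 29)]
```

Here n_days comes out as 11689, matching the third dimension in the question, Jan 1st is averaged over 33 values, and Feb 29th over 8. The first two entries of feb29_positions are 425 and 1886: each later leap day is shifted by the leap days that came before it.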

MATLAB: Find all values on one date, then filter down to an hour and find average [duplicate]

I have a year's worth of data, recorded at one-minute intervals each day of the year.
The date and time were imported from Excel (in the form 243.981944; adding 42004 to place it in 2015 and formatting it as a date gives 31.8.15 23:34:00).
Importing to MATLAB it becomes
'31/08/2015 23:34:00'
I require the data for each day of the year to be at hourly intervals, so I need to sum the data recorded in each hour and divide that by the number of data recorded for that hour, giving me the hourly average.
For some reason the data in August actually increments in 2 minute intervals, data for every other month increments in one minute intervals.
ie
...
31/07/2015 23:57:00
31/07/2015 23:58:00
31/07/2015 23:59:00
31/08/2015 00:00:00
31/08/2015 00:02:00
31/08/2015 00:04:00
...
I'm not sure how I can find all the values for a specific date and hour in order to work out the averages. I was thinking of using a for loop to find the values on each day, but when I got down to writing code realised this wouldn't work the way I was thinking.
I presume there must be some kind of functions available that would allow for data to be filtered by the date and time?
edit:
So I tried the following but I get these errors.
dates is a 520000x1 cell array containing the dates in the form given by formatIn.
formatIn = 'DD/MM/YYYY HH:MM:SS';
[~,M,D,H] = datevec(dates, formatIn);
Error using cnv2icudf (line 131) Unrecognized minute format.
Format string: DD/MM/YYYY HH:MM:SS.
Error in datevec (line 112) icu_dtformat = cnv2icudf(varargin{isdateformat});
Assume your date strings are in a matrix or cell array of strings called A, and your other data is in a vector X. Let's say all the data is in the same year (so we can ignore years):
[~,M,D,H] = datevec(A, 'dd/mm/yyyy HH:MM:SS');
mean_A = accumarray([M, D, H+1], X, [], @mean);
Then data from February will be in
mean_A(2,:,:)
To look at the data, you may find the squeeze() function useful, e.g.
squeeze(mean_A(2,1:10,13:24))
shows the average for the hours after midday (by column) for the first ten days (by row) of February.
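For comparison, the same grouping idea can be sketched in plain Python: parse each timestamp, use (month, day, hour) as the key, and divide each hour's sum by its own count. The function name hourly_means and the readings are made up for illustration; the timestamps are the ones quoted in the question.

```python
from collections import defaultdict
from datetime import datetime

def hourly_means(stamps, readings, fmt="%d/%m/%Y %H:%M:%S"):
    """Average readings per (month, day, hour), assuming a single year of data."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stamp, reading in zip(stamps, readings):
        t = datetime.strptime(stamp, fmt)
        key = (t.month, t.day, t.hour)
        sums[key] += reading
        counts[key] += 1
    # Each hour is divided by its own sample count
    return {key: sums[key] / counts[key] for key in sums}

stamps = [
    "31/07/2015 23:57:00",
    "31/07/2015 23:58:00",
    "31/07/2015 23:59:00",
    "31/08/2015 00:00:00",
    "31/08/2015 00:02:00",
    "31/08/2015 00:04:00",
]
readings = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]  # made-up values
means = hourly_means(stamps, readings)
```

Because every hour is divided by the number of samples that actually fall in it, the two-minute spacing in August needs no special treatment.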
See also:
Counting values by day/hour with timeseries in MATLAB

daily to monthly sum in matlab of multiple years

My data is in an Excel sheet. One column contains the dates for the last 50 years (no missing dates; dd/mm/yyyy format) and another column contains the daily rainfall (last 50 years; no blanks).
I want to calculate the total rainfall for every month of the last 50 years in Matlab. Remember, there are four possible month-end dates: 30, 31, 28 and 29. So far I am able to read the dates and rainfall values from the Excel file like below:
filename = 'rainfalldate.xlsx';
% Extracts the data from each column of textData
[~,DateString ]= xlsread(filename,'A:A')
formatIn = 'dd/mm/yyyy';
DateVector = datevec(DateString,formatIn)
rainfall = xlsread(filename,'C:C');
What is the next step so that I can see the rainfall sum for every month of the last fifty years?
For example, the rainfall sum for July 1986. Any new solutions? Code- or loop-based approaches in Matlab 2014a are welcome.
As @Daniel suggested, you could use accumarray for this.
Since you want the monthly rainfall for multiple years, you could combine year and month into one single variable, ym.
It is then possible to identify which year and month a value belongs to at once, and "tag" it using the histc function. Values with the same "tag" are summed by the accumarray function.
As a bonus, you do not have to worry about the number of days in a month. Take a look at this example:
% Variables initialization
y=[2014 2014 2015 2015 2015 2014 2014 2015 2015];
m=[1 1 1 2 2 3 3 3 3]; % jan, feb and march
values = 1:9;
% Identifier creation and index generation
ym = y*100+m;
[~,subs]=histc(ym,unique(ym));
% Sum
total = accumarray(subs',values);
The total variable is the monthly rainfall already sorted by year and month (yyyymm).
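The same tagging idea can be sketched in plain Python with a dictionary keyed by yyyymm. This uses the example vectors from the answer; the function name monthly_totals is made up for illustration.

```python
from collections import defaultdict

def monthly_totals(years, months, values):
    """Sum values per combined year-month key (yyyymm), sorted by key."""
    totals = defaultdict(float)
    for y, m, v in zip(years, months, values):
        totals[y * 100 + m] += v  # same identifier as ym = y*100+m
    return dict(sorted(totals.items()))

y = [2014, 2014, 2015, 2015, 2015, 2014, 2014, 2015, 2015]
m = [1, 1, 1, 2, 2, 3, 3, 3, 3]
values = list(range(1, 10))
total = monthly_totals(y, m, values)
```

For the example vectors this gives 3 for 201401, 13 for 201403, 3 for 201501, 9 for 201502 and 17 for 201503, in that order.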

How to create a Scatter Plot chart in SSRS 2008?

I am trying to create a scatter plot chart in SSRS 2008 (not 2008 R2). Is this possible? I want to map military time of day for each day that an event occurs. So my data looks like:
Hour of Day Day of the Week
8 3
14 5
2 1
5 1
10 7
But right now it just sums the Hour of Day values. I want each of them as discrete values, however. (Day of the week is 1-7 = Sunday-Saturday.)
If this is possible, then how do I set this up?