I'm having some issues creating a function with the following parameters:
Ndata = extperiod(data, year, month,time)
The data is a table with 3 columns, which from left to right are:
year/month/date, time, temperature
My goal is to create a function which can select rows by time and by year/month, irrespective of the day, and return the corresponding temperatures.
I need to avoid using for loops.
I've been advised to use floor and find, where floor(YYYYMMDD/100) = YYYY*100 + MM, which I somehow want to integrate into my function.
I've previously found a way to extract all temperatures from the data for a given day, as follows:
k = find(data(:,1)==19750101);
data(k(1):k(end),3)
I'm trying to incorporate this method, but I think that the hint "floor(YYYYMMDD/100)" throws me off a little.
I have tried with find(data(:,1)==floor(YYYYMMDD/100)), where I would think that I'd be given all dates with a specific year and month. For example:
find( data(:,1) == floor(19660101/100) )
I thought this would give me all points in the column vector where the value is 196601. But it doesn't.
What could I try differently?
From your explanation, you want to get all temperatures for a given month, no matter the time and day.
So you want to find dates that fall in the range [YYYYMM ; YYYY{MM+1}[ or [YYYYMM ; {YYYY+1}01[ in the case of selecting December.
Recall that you store the complete date in your table. So you need to apply floor to both sides of the comparison, not only to the query value, because no stored date equals floor(YYYYMMDD/100)!
As a result, try the following:
find( floor(data(:,1)/100) == floor(19660101/100) )
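The question also asks to match on the time column. A minimal sketch of how the same idea could be extended, assuming data is a numeric matrix with columns [YYYYMMDD, time, temperature]; the sample values and the time encoding (hour of day) are made up for illustration, and a logical mask is used in place of find:

```matlab
% Hypothetical sample data: [YYYYMMDD, time (hour of day), temperature]
data = [19660101  0  -3.1;
        19660101 12   1.4;
        19660115 12   2.0;
        19660201 12   4.5;
        19670115 12   0.7];

% Keep rows whose year/month AND time both match, then return column 3.
% floor(YYYYMMDD/100) drops the day, leaving YYYYMM = year*100 + month.
extperiod = @(data, year, month, time) ...
    data(floor(data(:,1)/100) == year*100 + month & data(:,2) == time, 3);

Ndata = extperiod(data, 1966, 1, 12);   % temperatures at time 12 in January 1966
```

Here Ndata contains the two temperatures recorded at time 12 on 1966-01-01 and 1966-01-15. Wrapping the mask in find would give the row indices instead, if those are needed.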
I want to compare how different campaigns are progressing based on number of days into the campaign rather than by date (see day1, day2, etc... on the x-axis below).
Here is my DAX code, but I can't get it to work. Any help would be much appreciated...
Normalised Campaign Metrics =
VAR DateReached = CALCULATE(MIN(Days[Day]),db[PAYMENT_DATE]<> BLANK(), KEEPFILTERS(db[PRODUCT_CODE SWITCH]))
VAR MaxDate = CALCULATE(MAX(db[PAYMENT_DATE]),KEEPFILTERS(db[PRODUCT_CODE SWITCH]))
VAR DayNo = SELECTEDVALUE(Days[Day])
RETURN CALCULATE(count(db[PAYMENT_DATE]),
FILTER(ALL(db[PAYMENT_DATE]),
DateReached+DayNo && DateReached+DayNo<=MaxDate))
Many thanks!
I would recommend solving this through manipulating your actual data rather than a complex DAX measure. If you are familiar with star schema modelling, I would solve this problem by adding a new column to your fact table that calculates how many days from the start date the payment occurred and then connect this column to a new "Days Passed" dimension that is simply a list of numbers from 1 to however many days you need. Then, you can use this new dimension as the source data for your x axis and use a standard payment amount measure for your y axis.
I recommend creating a dimension table to serve as the relative basis for comparison, connected through an inactive relationship. Here is a video about it:
https://youtu.be/knXFVf2ipro
My input file consists of column Actual Exp and Actual Min.
Now I want to create a measure for calculating the average for the current calendar year having the latest month of data for Actual Exp and Actual Min.
I want to calculate the average from Jan'21 to Mar'21, and later if data gets added for Apr or May I would like to calculate the average from Jan'21 to May'21 or Apr'21.
Similarly, I want the average from Jan'21 to Dec'21 in Dec'21, and the average from Jan'22 to Feb'22 in Feb'22. I also have a date filter, and I don't want it to affect the average.
I tried using TOTALYTD and MAX(Date), but it's not working.
Thanks.
I've not fully understood your question, but hopefully this helps:
AVERAGEX ( ALL(TABLENAME), TABLENAME[COLUMNNAME])
This should give you an average of COLUMNNAME no matter what filters/slicing you have in place.
If you wanted to further restrict this, you can try creating a second measure such as
CALCULATE(AVERAGEX ( ALL(TABLENAME), TABLENAME[COLUMNNAME]), datetable[monthcolumn] IN {"Jan", "Feb", "Mar"})
I have two time series x and y which roughly cover the same period of time. The data is in daily form however there are some days that have data in one dataset but no data in the other. I wish to use matlab to create two data-sets of equal size with matching dates. Essentially I wish to remove the days that don't have data in both x and y. Is there a simple way to do this? Thanks.
You could use an inner join (see help join) if you are able to convert your time series into datasets. If not, you could use the ismember function, applied only to the dates.
Something like this will work:
a = {'2015-01-01', '2015-02-02', '2015-03-03'};
b = {'2015-01-01', '2015-03-03', '2015-04-04'};
newA = a(ismember(a,b));
newB = b(ismember(b,a));
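To carry the measured values along with the dates, intersect is one option, since it also returns the positions of the common dates in each input. A small sketch, assuming the dates are stored as cell arrays of strings next to numeric value vectors; all variable names and sample values here are hypothetical:

```matlab
% Hypothetical sample data: dates and values for the two series
datesX = {'2015-01-01', '2015-02-02', '2015-03-03'};
valsX  = [1.0 2.0 3.0];
datesY = {'2015-01-01', '2015-03-03', '2015-04-04'};
valsY  = [10 30 40];

% common: dates present in both series (sorted);
% ix, iy: positions of those dates in datesX and datesY respectively
[common, ix, iy] = intersect(datesX, datesY);

newX = valsX(ix);   % values of x on the shared dates
newY = valsY(iy);   % values of y on the same dates, in the same order
```

After this, newX and newY have equal length and line up date by date with common.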
I've been stuck on a MATLAB coding problem where I need to create market weights for many stocks from a large data file with multiple days and portfolios.
I received help from an expert the other day using nested loops. It worked, but I don't understand what he has done in the final line. I was wondering if anyone could shed some light and explain the last line of code.
xp = x;  % where x = market value
dates=unique(x(:,1));  % finds the unique dates in the data set (dates are column 1)
for i=1:size(dates,1)  % (creates an empty matrix to fill the data in)
    for j=5:size(xp,2)
        xp(xp(:,1)==dates(i),j)=xp(xp(:,1)==dates(i),j)./sum(xp(xp(:,1)==dates(i),j));  % help???
    end
end
Any comments are much appreciated!
To understand the code, you have to understand the colon operator, logical indexing and the difference between / and ./. If any of these is unclear, please look it up in the documentation.
The following code does the same, but is easier to read because I separated each step into a single line:
dates=unique(x(:,1));
% iterate over all dates; size(dates,1) returns the number of dates
for i=1:size(dates,1)
    % iterate over the fifth to the last column, which contain the data to be normalised
    for j=5:size(xp,2)
        % mdate is a logical vector that selects the rows with the currently processed date
        mdate=(xp(:,1)==dates(i));
        % s is the sum of all values in column j that share this date
        s=sum(xp(mdate,j));
        % divide all values in column j with this date by s, so that they sum to 1
        xp(mdate,j)=xp(mdate,j)./s;
    end
end
With this code, I suggest using the debugger and stepping through it line by line.
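A tiny worked example may also help. The matrix below is made up: column 1 holds the date, columns 2 to 4 are placeholders, and column 5 holds the market values. The loop is the same normalisation, with the date mask computed once per date:

```matlab
% Hypothetical data: two stocks on each of two dates
xp = [20200101 0 0 0 10;
      20200101 0 0 0 30;
      20200102 0 0 0  5;
      20200102 0 0 0 15];

dates = unique(xp(:,1));
for i = 1:size(dates,1)
    mdate = (xp(:,1) == dates(i));           % rows belonging to this date
    for j = 5:size(xp,2)
        % element-wise division of each value by the total for that date
        xp(mdate,j) = xp(mdate,j) ./ sum(xp(mdate,j));
    end
end

weights = xp(:,5);   % 10/40, 30/40, 5/20, 15/20
```

Within each date the weights now sum to 1, which is what the single long line in the question does in one step.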
I'd like to know if there's a way to operate on the values of one matrix, based separately on the values of each row from another matrix, without using a for loop.
One specific example below.
data is a ~500k row matrix with three columns: the first is a list of hours/dates (I have them as serial numbers) covering 24 hours a day for several years, the second column is a location ID, and the third one is the electricity cost in that location.
hours has two columns: the first one is a list of dates/hours as a serial number and the second one is a certain upper limit specific for that hour.
What I need to do is find, for every date and hour in hours, the largest value of electricity cost that is below the upper limit set in hours, and save the ID of the corresponding location in a third column of hours if the ID is unique, or zero if no location satisfies the condition or if there is more than one.
A simplified version of the code I'm using:
# Add a third column to the hours matrix
ids=zeros(rows(hours),1);
hours=horzcat(hours,ids);
for i=1:rows(hours)
  # Get the data for all locations in that hour
  idx=(data(:,1)==hours(i,1));
  hourlydata=data(idx,:);
  # Keep the locations below the limit, then find the maximum among them
  idx=(hourlydata(:,3)<hours(i,2));
  below=hourlydata(idx,:);
  idlimit=(below(:,3)==max(below(:,3)));
  dataid=below(idlimit,2);
  # Check if the data exists and is unique
  if(rows(dataid)==1)
    id=dataid(1,1);
  else
    id=0;
  endif
  # Save the ID
  hours(i,3)=id;
endfor
Is there any way to do this without using the for loop?
(I already optimized the data matrix to make it as small as possible, but it's still pretty big, so I might encounter memory constraints when trying to implement a solution)
You can work with arrayfun. Suppose your data is
data = [datenum('2014-01-17'), 1, 13;datenum('2014-01-18'), 2, 7]
hours = [datenum('2014-01-17'), 17; datenum('2014-01-18'), 3]
Then define a selector function
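function id = select( d, limit, data )
    % rows of data recorded at time d
    hourlydata = data(data(:,1)==d, :);
    % keep only the rows whose cost is below the limit
    below = hourlydata(hourlydata(:,3)<limit, :);
    % rows that attain the largest remaining cost
    idlimit = (below(:,3)==max(below(:,3)));
    dataid = below(idlimit,2);
    % return the ID only if exactly one location qualifies
    if length(dataid)==1
        id = dataid(1,1);
    else
        id = 0;
    end
end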
Now we compute a vector with one ID for each row of hours:
arrayfun(@(d, limit) select(d, limit, data), hours(:,1), hours(:,2))
ans =
1
0
You can easily merge this vector with hours.
I doubt that this is much faster, but it avoids an explicit loop. It works with MATLAB; I haven't checked with Octave.
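If a fully vectorised version is wanted, one possible approach is to map every data row to its row in hours with ismember and then aggregate per hour with accumarray. This is only a sketch under the column layout described in the question; the sample values are made up, plain integers stand in for the date serial numbers, and it has not been profiled against the loop:

```matlab
% Hypothetical sample: data = [time, location ID, cost], hours = [time, limit]
data  = [1 10 5; 1 11 8; 1 12 20; 2 10 7; 2 11 7; 3 10 50];
hours = [1 10; 2 9; 3 40];
n = size(hours,1);

% loc(k) is the row of hours matching data row k (0 if there is none)
[tf, loc] = ismember(data(:,1), hours(:,1));

% keep the data rows whose cost is below the limit of their hour
keep = false(size(tf));
keep(tf) = data(tf,3) < hours(loc(tf),2);
d = data(keep,:);
g = loc(keep);

% largest admissible cost per hour (-Inf where nothing qualifies)
best = accumarray(g, d(:,3), [n 1], @max, -Inf);

% rows that attain that maximum, and how many there are per hour
isbest = d(:,3) == best(g);
cnt = accumarray(g(isbest), 1, [n 1]);

% where exactly one row attains the maximum, the sum of IDs is that ID
ids = accumarray(g(isbest), d(isbest,2), [n 1]);
ids(cnt ~= 1) = 0;   % no candidate, or a tie: store zero
```

For this sample, ids is 11 for the first hour (cost 8 is the largest below 10), and 0 for the second (two locations tie at 7) and the third (no cost below the limit). The temporary vectors are the same length as data, so memory use is roughly a few extra columns of data.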