Loop over fieldnames in a MATLAB structure

I have a MATLAB struct with several levels of nested sub-structures. Printed out, the data contained in the struct looks like this:
report.COUNTRY.SOURCE.SCENARIO.CATEGORY.ENTITY = YEAR YEAR ...;
As a minimal example:
report.HUN.CRF2014.BASEYEAR.CAT0.CO2 = 1991 1992 1993 1994
report.HUN.CRF2014.BASEYEAR.CAT0.CH4 = 1995 1996 1997
report.HUN.CRF2014.BASEYEAR.CAT0.H2S = 1990 1991 1992
report.HUN.CRF2014.BASEYEAR.CAT1.N2 = 1991 1992 1993
report.HUN.CRF2014.BASEYEAR.CAT1.FGASES = 1990 1991 1992
In order to produce tables listing the different variables combinations, I would like to loop over the fieldnames contained within the "struct".
I am currently trying to write a function able to do that:
fields=fieldnames(struct);
for categoryidx=1:length(fields)
    categoryname=fields{categoryidx};
    if isstruct(struct.(categoryname))
        category=fieldnames(struct.(categoryname));
        for entityidx = 1:length(category)
            entityname = category{entityidx};
            if isstruct(struct.(categoryname).(entityname))
                gases=fieldnames(struct.(categoryname).(entityname));
            end
        end
    end
end
Unfortunately, this is not producing anything! Does anyone have an idea how to loop over the fieldnames in such a MATLAB structure? Thank you!

You might want to check out struct2tabler. This is a MATLAB function that recursively goes through a structure and converts it into a table.
For example:
a.a = 5
a.b.c = 10
a.b.d = 15
Would return a table:
a_a    a_b_c    a_b_d
---------------------------
5      10       15
Disclaimer: I wrote struct2tabler, so I might be a little biased; however, it was created out of a requirement that is, I think, very similar to yours.
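The recursive idea behind such a flattening can be sketched in a few lines. This is only an illustration in Python, not the struct2tabler implementation: a nested dict stands in for the MATLAB struct, and the function name `flatten` is made up for this example. In MATLAB the same recursion would use fieldnames and isstruct, as in the question.

```python
def flatten(node, prefix=""):
    """Return {joined_path: leaf_value} for a nested dict (stand-in for a struct)."""
    flat = {}
    for name, value in node.items():
        # Build the column name from the path walked so far, e.g. "a_b_c".
        path = f"{prefix}_{name}" if prefix else name
        if isinstance(value, dict):
            # A sub-structure: recurse one level deeper.
            flat.update(flatten(value, path))
        else:
            # A leaf: record its value under the full path.
            flat[path] = value
    return flat


# Same data as the example above: a.a = 5, a.b.c = 10, a.b.d = 15
a = {"a": {"a": 5, "b": {"c": 10, "d": 15}}}
columns = flatten(a)
print(columns)
```

Because the recursion does not hard-code the number of levels, it works for any nesting depth, which is what the nested for loops in the question cannot do.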


CDO - Resample netcdf files from monthly to daily timesteps

I have a netcdf file that has monthly global data from 1991 to 2000 (10 years).
Using CDO, how can I modify the netcdf from monthly to daily timesteps by repeating the monthly values each day of each month?
For example,
convert from
Month 1, value = 0.25
to
Day 1, value = 0.25
Day 2, value = 0.25
Day 3, value = 0.25
....
Day 31, value = 0.25
convert from
Month 2, value = 0.87
to
Day 1, value = 0.87
Day 2, value = 0.87
Day 3, value = 0.87
....
Day 28, value = 0.87
Thanks
##############
Update
My monthly netcdf does not have its values on the first day of each month but on varying days, e.g. on the 15th, 7th, 9th, etc. There is, however, exactly one value for each month.
The question is perhaps ambiguously worded. Adrian Tompkins' answer is correct for interpolation. However, you are actually asking to set the value for each day of the month to that for the first day of the month. You could do this by adding a second CDO call as follows:
cdo -inttime,1991-01-01,00:00:00,1day in.nc temp.nc
cdo -monadd -gtc,100000000000000000 temp.nc in.nc out.nc
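Just set the value after gtc to something much higher than anything in your data. The comparison is then false everywhere, so gtc yields a daily field of zeros, and monadd adds each month's value from in.nc to every day of that month.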
You can use inttime which interpolates in time at the interval required, but this is not exactly what you asked for as it doesn't repeat the monthly values and your series will be smoothed by the interpolation.
If we assume your dataset starts on the 1st January at time 00:00 (you don't state in the question) then the command would be
cdo inttime,1991-01-01,00:00:00,1day in.nc out.nc
This performs a simple linear interpolation between steps.
Note: This is fine for fields like temperature and seems to be what you asked for, but readers should note that one has to be more careful with flux fields such as rainfall, where one might want to scale and/or change the units appropriately.
I could not find a solution with CDO but I solved the issue with R, as follows:
library(dplyr)
library(ncdf4)
library(reshape2)
## Read ncfile
ncpath="~/my/path/"
ncname="my_monthly_ncfile"
ncfname=paste(ncpath, ncname, ".nc", sep="")
ncin=nc_open(ncfname)
var=ncvar_get(ncin, "nc_var")
## melt the array into a long data frame
var=melt(var)
names(var)=c("lon", "lat", "time", "value") ## assumes the dimension order is lon, lat, time
var=var[complete.cases(var), ] ## remove any NA
## split by gridpoint (lat and lon) into a list
var=split(var, list(var$lat, var$lon))
var=var[sapply(var, nrow)>0] ## remove any empty list element
## create new list and replicate, for each gridpoint, each monthly value n=30 times
var_rep=list()
for (i in seq_along(var)) {
  var_rep[[i]]=data.frame(value=rep(var[[i]]$value, each=30))
}
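Repeating every value 30 times treats all months as equally long. As a rough sketch of the same replication with real month lengths (in Python rather than R or CDO, for a single gridpoint, with the hypothetical helper name `monthly_to_daily`, and assuming the series has one value per consecutive month):

```python
import calendar


def monthly_to_daily(start_year, start_month, monthly_values):
    """Repeat each monthly value once per calendar day of its month."""
    daily = []
    year, month = start_year, start_month
    for value in monthly_values:
        # monthrange returns (weekday of the 1st, number of days in the month).
        n_days = calendar.monthrange(year, month)[1]
        daily.extend([value] * n_days)
        # Advance to the next month, rolling over the year after December.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return daily


# January and February 1991 from the question: 31 + 28 daily values
daily = monthly_to_daily(1991, 1, [0.25, 0.87])
print(len(daily))
```

Leap years are handled by calendar.monthrange, so February 1992 and 1996 would get 29 values each.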

SAS Placeholder value

I am looking to have a flexible importing structure into my SAS code. The import table from excel looks like this:
data have;
input Fixed_or_Floating $ asset_or_liability $ Base_rate_new;
datalines;
FIX A 10
FIX L Average Maturity
FLT A 20
FLT L Average Maturity
;
run;
The original dataset I'm working with looks like this:
data have2;
input ID Fixed_or_Floating $ asset_or_liability $ Base_rate;
datalines;
1 FIX A 10
2 FIX L 20
3 FIX A 30
4 FLT A 40
5 FLT L 30
6 FLT A 20
7 FIX L 10
;
run;
The placeholder "Average Maturity" exists in the Excel file only when the new interest rate is determined by the average maturity of the bond. I have a separate function for this, which searches for the closest interest rate and then left joins the new base rate accordingly. For example, if the bond matures in 10 years, I'll use the 10-year interest rate.
So my question is, how can I perform a simple merge, using similar code to this:
proc sort data = have;
by fixed_or_floating asset_or_liability;
run;
proc sort data = have2;
by fixed_or_floating asset_or_liability;
run;
data have3 (drop = base_rate);
merge have2 (in = a)
have (in = b);
by fixed_or_floating asset_or_liability;
run;
The problem at the moment is that my placeholder value doesn't read in. I need it to be a word because that is how the Excel lookup table works; I then use an if statement such as
if base_rate_new = "Average Maturity" then do;
(Insert existing Function Here)
end;
So I only need help with importing the Excel table together with the placeholder value. Thank you.
TIA.
I'm not 100% sure if this behaviour corresponds with how your data appears once you import it from excel but if I run your code to create have I get:
NOTE: Invalid data for Base_rate_new in line 145 7-13.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
145 FIX L Average Maturity
Fixed_or_Floating=FIX asset_or_liability=L Base_rate_new=. _ERROR_=1 _N_=2
NOTE: Invalid data for Base_rate_new in line 147 7-13.
147 FLT L Average Maturity
Fixed_or_Floating=FLT asset_or_liability=L Base_rate_new=. _ERROR_=1 _N_=4
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.HAVE has 4 observations and 3 variables.
Basically, it's saying that SAS could not read the character strings as numeric values, so it set them to missing. If we print the table, we can see the missing values (shown as a period):
proc print data=have;
run;
Result:
Fixed_or_    asset_or_    Base_
Floating     liability    rate_new
FIX          A            10
FIX          L            .
FLT          A            20
FLT          L            .
Assuming this truly is what your data looks like then we can use the coalesce function to achieve your goal.
data have3 (drop = base_rate);
merge have2 (in = a)
have (in = b);
by fixed_or_floating asset_or_liability;
base_rate_new = coalesce(base_rate_new,base_rate);
run;
The result of doing this gives us this table:
      Fixed_or_    asset_or_    Base_
ID    Floating     liability    rate_new
1     FIX          A            10
3     FIX          A            10
2     FIX          L            20
7     FIX          L            20
4     FLT          A            20
6     FLT          A            20
5     FLT          L            30
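The coalesce function returns the first non-missing value among the arguments you pass to it. So when base_rate_new already has a value, it uses that; otherwise it uses the base_rate field instead. Note that ID 7 shows 20 although its own base_rate is 10: in a one-to-many merge, SAS reads the row from have only once per BY group and retains the overwritten base_rate_new for the remaining rows of that group. Assign the result to a new variable if each row should fall back to its own base_rate.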
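For readers who have not met coalesce before, its logic is small enough to spell out. The following is a Python illustration, not SAS code: None stands in for a SAS missing value, and the function is written by hand for this example.

```python
def coalesce(*values):
    """Return the first argument that is not missing (None), or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


# base_rate_new is present: it wins over base_rate.
print(coalesce(10, 30))
# base_rate_new is missing: fall back to base_rate.
print(coalesce(None, 20))
```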

Filtering two datasets of different sizes by matching timestamps in MATLAB

I have two very large datasets in MATLAB. Each dataset holds different parameters; the only common one is the timestamp, since every parameter is measured at 10-minute intervals. For example:
Dataset 1 contains a timestamp (YYYY-MM-DD, HH:MM:SS format) and power.
Dataset 2 contains a timestamp (same format) and speed.
I want a new dataset that holds power and speed synchronized by timestamp. For example:
TimeStamp             P      S
2014-01-01, 00:10     100    5
            00:20            7
            00:30     150    10
            00:40     200
            00:50     145    12
            01:00     50     7
            01:10            6
etc.
So in short the output of the final dataset must be like :
TimeStamp    P      S
00:10        100    5
00:30        150    10
00:50        145    12
So basically, a row should be kept only if both power and speed are available for the same timestamp; all other rows should be filtered out.
Will this also work if the two datasets have a different number of observations? Even if the sizes differ, I want only those rows in my final database whose P and S share a timestamp; everything else should be excluded.
Can anyone help me do this in MATLAB? Thanks in advance.
You could try something like this:
% type "help ismember" in the command window to see what the function does
% logical index of the timestamps in dataset1 that also exist in dataset2
indexPinS = ismember(dataset1(:,1),dataset2(:,1));
% logical index of the timestamps in dataset2 that also exist in dataset1
indexSinP = ismember(dataset2(:,1),dataset1(:,1));
% combine the matching rows into the final database; the rows only line up
% if timestamps are unique and both datasets are in the same time order
finalDatabase = [dataset1(indexPinS,1), dataset1(indexPinS,2), dataset2(indexSinP,2)];
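The matching step can also be pictured as a lookup keyed on the timestamp, which does not depend on the two datasets being in the same order. Below is a small Python sketch of that idea, not MATLAB code; the values are taken from the example table in the question and the variable names are invented for the illustration.

```python
# Each dataset maps a timestamp to its measurement.
power = {"00:10": 100, "00:30": 150, "00:40": 200, "00:50": 145, "01:00": 50}
speed = {"00:10": 5, "00:20": 7, "00:30": 10, "00:50": 12, "01:00": 7, "01:10": 6}

# Keep only the timestamps that occur in both datasets.
common = sorted(power.keys() & speed.keys())

# Look up both values by timestamp, so the row order of the inputs does not matter.
final = [(t, power[t], speed[t]) for t in common]

for row in final:
    print(row)
```

The two datasets have different sizes here (5 and 6 observations), and only the four shared timestamps survive.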

Take out date values between two dates from matrix variable, Matlab

I'm trying to take out two separate years from a date table.
% Date table
Datez = [2001 2;2001 5;2001 9;2001 11;2002 3;2002 5;2002 7;2002 9;2002 11;...
2003 2;2003 4;2003 6;2003 8;2003 10;2003 12;2004 3;2004 5;2004 7;...
2004 9;2004 11; 2005 10;2005 12]
I want to flag every row with 1 or 0. The rows to flag with 1 are the dates from 2001-11 to 2002-11 plus all dates from 2004-11 to 2005-11.
In total I should get a new vector, called test:
test = [0;0;0;1;1;1;1;1;1;0;0;0;0;0;0;0;0;0;0;1;1;0] % final result
I tried these combinations, but I don't know how to combine these four statements into a vector that looks like "test" or if there are any better solutions?
xjcr = 1:length(Datez)
(Datez(xjcr,1) >= 2001 & Datez(xjcr,2) >= 11) % greater than 2001-11
(Datez(xjcr,1) <= 2002 & Datez(xjcr,2) <= 11) % smaller than 2002-11
(Datez(xjcr,1) >= 2004 & Datez(xjcr,2) >= 11) % greater than 2004-11
(Datez(xjcr,1) <= 2005 & Datez(xjcr,2) <= 11) % smaller than 2005-11
Any ideas are much appreciated, thanks in advance!
Your issue is that you do not want to filter on the two items independently (years of at least 2001 and months of at least November). That would give you December 2001 but not January 2002. The solution, I believe, is to combine the year and month into a single number so that the comparison operators act on them as a pair. Here is an easy method:
Datez2 = Datez(:,1)*100 + Datez(:,2);
test = (Datez2>=200111 & Datez2<=200211) | (Datez2>=200411 & Datez2<=200511)
Multiplying the year by 12 and adding (month - 1) might be better, depending on whether you are building something that needs to be very robust or just hacking something together: it gives a continuous month count, so differences between two values are a number of months.
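As a cross-check of the single-number idea, here is the year * 12 + (month - 1) variant written out in Python rather than MATLAB. The data and the expected vector are copied from the question; the names `month_index` and `windows` are made up for this sketch.

```python
datez = [
    (2001, 2), (2001, 5), (2001, 9), (2001, 11), (2002, 3), (2002, 5),
    (2002, 7), (2002, 9), (2002, 11), (2003, 2), (2003, 4), (2003, 6),
    (2003, 8), (2003, 10), (2003, 12), (2004, 3), (2004, 5), (2004, 7),
    (2004, 9), (2004, 11), (2005, 10), (2005, 12),
]


def month_index(year, month):
    """Map (year, month) to one continuous month count."""
    return year * 12 + (month - 1)


# Inclusive (start, end) windows from the question.
windows = [((2001, 11), (2002, 11)), ((2004, 11), (2005, 11))]

test = [
    int(any(month_index(*start) <= month_index(y, m) <= month_index(*end)
            for start, end in windows))
    for y, m in datez
]
print(test)
```

The printed vector matches the "test" vector requested in the question.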

displaying dates of values - time series data

I am trying to maintain a table using some panel data. I have all the data outputting fine, but I am having difficulty getting the correct dates to display. The method I am using is the following:
gen ymdny = date(date,"MDY"); /*<- date var from panel dataset that i import*/
sort name ymdny;
summ ymdny;
local lastdate : disp %tdM-D r(max);
local lastdate2 : disp %tdM-D (r(max)-1);
local lastw : disp %tdM-D (r(max)-7);
This would work fine if the data were daily, but the dataset I have is actually business daily (i.e. there are no observations for weekends and national bank holidays). It seems silly, but I have not been able to figure out a workaround that does the job. Ideally, there is a function that I can use to print the date corresponding to a particular value.
For example:
gen resbal_1d = round(l1.resbal,0.1);
gen dateOf = dateOf(resbal_1d); /* <- pseudocode example of what I would like */
I'm not sure what you're asking for but my guess is that you want to see a human readable form date as the output, given a numerical input. (This is your last sentence.) So simply try something like:
display %td 10
The format is important as the following shows (see help format):
display %tq 10
Same numerical input, different format, different output.
Two other examples from the manual:
* string to integer
display date("5-12-1998", "MDY")
* string to date format
display %td date("5-12-1998", "MDY")
As for your example code, I don't get what you're aiming for. In effect, you can summarize the date variable because in Stata, dates are just integers. It's legal, but I couldn't say if it's good form. Below is a simple example.
clear all
set more off
set obs 10
gen date = _n // create the data
format date %td // give date format
list
summarize date
local onedate = r(max)
display %td `onedate'
Some references:
[U] 24 Working with dates and time
help datetime
help datetime business calendars
http://www.stata.com/support/faqs/data-management/creating-date-variables/
http://www.ats.ucla.edu/stat/stata/modules/dates.htm
(Maybe you can explain with more detail and context what it is you want.)
Edit
Your comment
"I do not see how this helps with the date output. For example, displaying r(max) - 1 on a Monday will still display the Sunday date."
does not explain, at all, the problems you're having with Stata's business calendars.
I'm adding what is basically an example taken from the help file I already referenced. I do this with the hope of convincing you that (re)-reading the help files is worthwhile.
*clear all
set more off
* import string dates
infile str10 sdate float x using http://www.stata-press.com/data/r13/bcal_simple
list
*----- Regular dates -----
* create elapsed dates - Stata's way of managing dates
generate rdate = date(sdate, "MD20Y")
format rdate %td
drop sdate x
list
* compute previous and next dates
generate tomorrow1 = rdate + 1
format tomorrow1 %td
generate yesterday1 = rdate - 1
format yesterday1 %td
list
*----- Business dates -----
* convert regular date to business dates
generate bdate = bofd("simple", rdate)
format bdate %tbsimple
* compute previous and next dates
generate tomorrow2 = bdate + 1
format tomorrow2 %tbsimple
generate yesterday2 = bdate - 1
format yesterday2 %tbsimple
order yesterday1 rdate tomorrow1 yesterday2 bdate tomorrow2
list
/*
The stbcal-file for simple, the calendar shown below,
        November 2011
Su  Mo  Tu  We  Th  Fr  Sa
---------------------------
         1   2   3   4   X
 X   7   8   9  10  11   X
 X  14  15  16  17  18   X
 X  21  22  23   X   X   X
 X  28  29  30
---------------------------
*/
Notice that if you add or subtract 1 from a regular date, business days are not taken into account. If you do the same with a business calendar date, you get what you want. Business calendars are defined by .stbcal files; the example uses a built-in calendar called simple. You may need to make your own .stbcal file, but it is not difficult. Again, the details are in the help files.
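To make the difference between the two kinds of date arithmetic concrete, here is a language-neutral sketch in Python (not Stata). It assumes the closed days are Saturdays, Sundays and the two weekdays marked X in the November 2011 calendar above (the 24th and 25th); the function names are invented for the illustration.

```python
from datetime import date, timedelta

# Weekday closures read off the calendar above (an assumption of this sketch).
HOLIDAYS = {date(2011, 11, 24), date(2011, 11, 25)}


def is_business_day(d):
    """Monday to Friday and not a listed holiday."""
    return d.weekday() < 5 and d not in HOLIDAYS


def shift_business_days(d, n):
    """Move n business days forward (n > 0) or backward (n < 0)."""
    step = timedelta(days=1 if n > 0 else -1)
    remaining = abs(n)
    while remaining:
        d += step
        # Only days on which the calendar is open count towards the shift.
        if is_business_day(d):
            remaining -= 1
    return d


monday = date(2011, 11, 28)
print(monday - timedelta(days=1))        # regular date arithmetic: a Sunday
print(shift_business_days(monday, -1))   # business-day arithmetic
```

Subtracting one calendar day from Monday the 28th lands on Sunday the 27th, while stepping back one business day skips the weekend and both holidays and lands on Wednesday the 23rd.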