Computing the value range for a netCDF 3D variable - MATLAB

I have a large series of netCDF files representing daily snapshots of data. I am hoping to hook these up to a piece of software that asks me to add the maximum and minimum values of a variable to its namelist. How can I query the maximum and minimum values stored in a variable?
My variable is depth (here is an excerpt from an ncdump to give an idea of the size of that variable):
...
dimensions:
        z = 40 ;
        lat = 224 ;
        lon = 198 ;
        time = 1 ;
variables:
        float depth(z, lat, lon) ;
                depth:long_name = "cell centre depth" ;
                depth:units = "m" ;
...
I'm still a beginner at handling these files and have been using NCO operators and/or MATLAB for netCDF handling to date. Is there an easy way to perform this max/min query using either of these tools?
Before now I have had netCDF files where the value range was helpfully displayed in the attributes, or there was little enough data that I could display the values with a simple ncdump -v, or load the variable into MATLAB, which displays the max and min automatically. Now I have too many values for these quick-and-dirty methods.
Any help is gratefully received.
All the best,
Bex

One NCO method would be to use the ncrng command, which is simply a "filter" for a longer ncap2 command:
zender#roulee:~/nco/data$ ncrng three_dmn_rec_var in.nc
1.000000 to 80.000000
So, it's a three-word command. Documentation on filters is here.

If you have a newer version of MATLAB, try using the ncread function.
% Update with your filename and variable name below.
% This reads the full variable into MATLAB.
variableData = ncread(filename,varname);
% Query the max and min values (no trailing semicolon, so they are displayed).
minValue = min(variableData(:))
maxValue = max(variableData(:))
% You could also write this information back to the file for future reference;
% see https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Attribute-Conventions.html
ncwriteatt(filename, varname, 'valid_range', [minValue, maxValue]);
% check result
ncdisp(filename, varname);
You could add two loops around this: one to iterate over all your files and another to iterate over all the variables in a file (see ncinfo) to automate the whole thing.
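For readers working outside MATLAB, the flatten-then-query idea above can be sketched in plain Python. The nested list below is a made-up stand-in for the (z, lat, lon) depth array; with real files you would first load the variable through a netCDF reader (not shown here).

```python
# A nested list stands in for a small (z, lat, lon) depth array;
# the sizes and values are invented for illustration only.
depth = [[[z * 100.0 + j * 10.0 + i for i in range(3)]
          for j in range(4)]
         for z in range(2)]

# Flatten all three dimensions, mirroring variableData(:) in MATLAB,
# then query the range.
flat = [v for plane in depth for row in plane for v in row]
print(min(flat), max(flat))
```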

The CDO method would be
cdo vertmax -fldmax in.nc max.nc
cdo vertmin -fldmin in.nc min.nc
The advantage is that you can calculate the min/max over x-y space only (fldmax/fldmin), vertically (vertmax/vertmin), over time (timmax/timmin), or any combination of the three.
To dump the values from the netCDF file to ASCII you can use ncks:
ncks -s '%13.9f\n' -C -H -v depth max.nc
To construct a namelist you could therefore write, for example:
echo min=`ncks -s '%13.9f\n' -C -H -v depth min.nc` >> namelist.txt
echo max=`ncks -s '%13.9f\n' -C -H -v depth max.nc` >> namelist.txt

Related

Matlab: reading large netCDF files

I have a 17 GB netCDF file that I am trying to use for analysis. Each variable in the netCDF file is laid out as variable(x,y,z,time). I would like to read in and analyze the variables one 'time' at a time in MATLAB; in other words, I want all x, y, and z points at a single time. In the past I have had smaller files, so reading in a variable has been set up like
fid=netcdf.open('filename/location','NC_NOWRITE');
var_id=netcdf.inqVarID(fid,'varname');
var=netcdf.getVar(fid,var_id);
Is it possible to read in only one time step of a variable at the point where the variable is read in? It would essentially look like this (I know the syntax is incorrect):
var=netcdf.getVar(fid,var_id,[:,:,:,time_index]);
Yes, the MATLAB netCDF interface supports this, almost the way you wrote it:
data = netcdf.getVar(fid,var_id,start,count)
where start and count give the starting index (zero-based in the low-level interface) and the number of elements to read along each dimension. See the MATLAB documentation for more information. You can also use the high-level MATLAB commands instead of the netCDF library functions.
For example, if varname is a 100x4 array, you could get row 7 with:
% read 4 columns from 1 row of data starting at row 7, column 1
v = ncread('filename/location','varname',[7 1],[1 4]);
or a four-dimensional array, as in the question:
% read all data from dim. 1-3 at dim 4 = 27
v = ncread('filename/location','varname',[1 1 1 27],[Inf Inf Inf 1]);
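The start/count convention that ncread uses can be spelled out in plain Python for readers who want to see the indexing logic itself. The hyperslab function below is a hypothetical helper for illustration only; note that it uses Python's 0-based indexing rather than ncread's 1-based arguments, with None playing the role of Inf.

```python
def hyperslab(a, start, count):
    """Slice a nested list along each dimension: take count[d] elements
    starting at start[d] (0-based); a count of None means "to the end"."""
    if not start:
        return a
    s, c = start[0], count[0]
    stop = len(a) if c is None else s + c
    return [hyperslab(x, start[1:], count[1:]) for x in a[s:stop]]

# A 5x4 array; the analogue of ncread(f, v, [3 1], [1 4]) (1-based)
# is hyperslab(data, [2, 0], [1, 4]) here (0-based).
data = [[r * 10 + c for c in range(4)] for r in range(5)]
row3 = hyperslab(data, [2, 0], [1, 4])
print(row3)
```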

Matlab: latex command to output with 4 significant digits or a certain number of decimals?

I want to generate a table for the results in Matlab. I use the symbolic box's latex command. How can I specify the amount of significant digits?
Problem profile
>> results
results =
0.0025 0.0024 0.0024
>> latex(vpa(sym(results),4)) % this should print with 4 digits - how?
ans =
\left(\begin{array}{ccc} 0.0025401858540021748922299593687057 & 0.0023686521873358401535369921475649 & 0.0023649304185866526495374273508787 \end{array}\right)
>> vpa(sym(results),4)
ans =
[ 0.00254, 0.002369, 0.002365]
I think that syntax for vpa sets the minimum number of significant figures. Did you try setting the variable-precision accuracy?
d1 = digits(4); % records and sets accuracy
latex(vpa(sym(results)))
digits(d1); % restore previous accuracy
THANK YOU!
I just created the following nifty function using this (to output transfer functions directly in latex format):
function tf2latex(tf)
    [num,den] = tfdata(tf);
    syms s
    d1 = digits(4); % records and sets accuracy
    latex(vpa(poly2sym(cell2mat(num),s)/poly2sym(cell2mat(den),s)))
    digits(d1); % restore previous accuracy
end
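As an aside for non-MATLAB readers, the save-and-restore precision pattern used with digits(4) above has a close stdlib analogue in Python's decimal module; this sketch is offered purely as a comparison, not as part of the MATLAB workflow.

```python
from decimal import Decimal, localcontext

x = Decimal("0.0025401858540021748922299593687057")
with localcontext() as ctx:
    ctx.prec = 4      # like d1 = digits(4); restored on exiting the block
    y = +x            # unary plus rounds x to the active precision
print(y)
```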

Extracting netCDF time series for each lat/long in Matlab

I'm currently working with netCDF output from climate models and would like to obtain a text file of the time series for each latitude/longitude combination in the netCDF. For example, if the netCDF has 10 latitudes and 10 longitudes I would obtain 100 text files, each with a time series in a column format. I'm fairly familiar with the Matlab/netCDF language, but I can't seem to wrap my head around this. Naming the text files is not important; I will rename them "latitude_longitude_PCP.txt", where PCP is precipitation at the latitude and longitude location.
Any help would be appreciated. Thanks.
--Darren
There are several ways this problem could be solved.
Method 1. If you were able to put your netcdf file on a THREDDS Data Server, you could use the NetCDF Subset Service Grid as Point to specify a longitude/latitude point and get back the data in CSV or XML format. Here's an example from Unidata's THREDDS Data Server: http://thredds.ucar.edu/thredds/ncss/grid/grib/NCEP/GFS/Global_0p5deg/best/pointDataset.html
Method 2. If you wanted to use Matlab to extract a time series at a specific longitude/latitude location you could use the "nj_tseries" function from NCTOOLBOX, available at: http://nctoolbox.github.io/nctoolbox/
Method 3. If you really want to write an ASCII time series at every i,j location in your [time,lon,lat] grid using Matlab, you could do something like this (using NCTOOLBOX):
url = 'http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Global_2p5deg/best';
nc = ncgeodataset(url);
nc.variables
var = 'Downward_Short-Wave_Radiation_Flux_surface_12_Hour_Average';
lon = nc.data('lon');
lat = nc.data('lat');
jd = nj_time(nc,var);
ncvar = nc.variable(var);
for j = 1:length(lat)
    for i = 1:length(lon)
        v = ncvar.data(:,j,i);
        outfile = sprintf('%6.2flon%6.2flat.csv',lon(i),lat(j));
        fid = fopen(outfile,'wt');
        data = [datevec(jd) v];
        fprintf(fid,'%2.2d %2.2d %2.2d %2.2d %2.2d %2.2d %7.2f\n',data');
        fclose(fid);
        disp([outfile ' created.'])
    end
end
If you had enough memory to read all the data into MATLAB, you could move the read outside the double loop, which would be a lot faster. But writing ASCII is slow anyway, so it might not matter that much.
%% Create demo data
data = reshape(1:20*30*40,[20 30 40]);
nccreate('t.nc','data','Dimensions',{'lat', 20, 'lon', 30, 'time', inf});
ncwrite('t.nc','data',data);
ncdisp('t.nc');
%% Write time series to ASCII files
% Giving an idea of the size of your data can help people recommend
% approaches tailored to the data size. For smaller data, it might be
% faster to read the full 3D variable into memory first.
varInfo = ncinfo('t.nc','data');
disp(varInfo);
for latInd = 1:varInfo.Size(1)
    for lonInd = 1:varInfo.Size(2)
        fileName = ['t_ascii_lat',num2str(latInd),'_lon',num2str(lonInd),'.txt'];
        tSeries = ncread('t.nc','data',[latInd, lonInd, 1],[1, 1, varInfo.Size(3)]);
        dlmwrite(fileName,squeeze(tSeries));
    end
end
%% Spot check
act = dlmread('t_ascii_lat10_lon29.txt');
exp = squeeze(data(10,29,:));
assert(isequal(act,exp));
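The one-file-per-grid-cell loop can also be sketched with nothing but the Python standard library. Here a made-up nested list stands in for the [lat][lon][time] variable that would come from the netCDF file, and the file naming loosely mirrors the latitude_longitude_PCP.txt scheme from the question.

```python
import os
import tempfile

nlat, nlon, ntime = 2, 3, 4
# Invented data standing in for the variable read from the netCDF file.
data = [[[lat * 100 + lon * 10 + t for t in range(ntime)]
         for lon in range(nlon)]
        for lat in range(nlat)]

outdir = tempfile.mkdtemp()
for i in range(nlat):
    for j in range(nlon):
        path = os.path.join(outdir, "lat%d_lon%d_PCP.txt" % (i, j))
        with open(path, "w") as f:
            # One time step per line, in column format.
            f.write("\n".join(str(v) for v in data[i][j]) + "\n")
```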

How to use data from .m file into another?

In the following code, I need 1 lakh samples in the array mydata. I don't know what I am getting in mydata. Do I have to increase the range of t to get that? And how can I use the data in mydata in another .m file for further processing?
t = [1 : 1 : 500]; % Time Samples
f1 = 10000; % Input Signal Frequency
f2 = 20000;
f3 = 30000;
f4 = f2+f3;
f5 = f1+f3;
f6 = f4+f2;
f7 = f1+f6;
f8 = 45000;
f9 = 55000;
f10 = 35000;
mydata = [1:100000];
Fs = 100000; % Sampling Frequency
for i = 1:100000
    if (i <= 10000)
        mydata = sin(2*pi*f1/Fs*t);
    elseif ((i > 10000) && (i <= 20000))
        mydata = sin(2*pi*f2/Fs*t);
    elseif ((i > 20000) && (i <= 30000))
        mydata = sin(2*pi*f3/Fs*t);
    elseif ((i > 30000) && (i <= 40000))
        mydata = sin(2*pi*f4/Fs*t);
    elseif ((i > 40000) && (i <= 50000))
        mydata = sin(2*pi*f5/Fs*t);
    elseif ((i > 50000) && (i <= 60000))
        mydata = sin(2*pi*f6/Fs*t);
    elseif ((i > 60000) && (i <= 70000))
        mydata = sin(2*pi*f7/Fs*t);
    elseif ((i > 70000) && (i <= 80000))
        mydata = sin(2*pi*f8/Fs*t);
    elseif ((i > 80000) && (i <= 90000))
        mydata = sin(2*pi*f9/Fs*t);
    elseif ((i > 90000) && (i <= 100000))
        mydata = sin(2*pi*f10/Fs*t);
    end
end
stem(mydata)
stem(mydata)
Your code doesn't do very much; you know that, right? If we don't know or understand what you want, we can't help.
And for anyone else: 1 lakh = 100 000 (http://en.wikipedia.org/wiki/Lakh).
Edit: are you trying to produce an array of 100000 samples, consisting of a fixed number of points from different sine waves? That is:
[sin(1.0*pi*[0:10]) sin(2.0*pi*[0:10]) sin(1.5*pi*[0:10]) (etc.)]
Edit 2: you repeated your earlier question (which was already answered): How can I generate a sine wave with different frequencies using matlab?
I couldn't understand what you want to do with mydata; please be more specific, because your code is wrong and I can't figure out what you want to create.
As for using the data in another script, one simple way would be to write mydata to disk. In your script:
save path_for_mydata/file_name.mat mydata
And in the other script:
load path_for_mydata/file_name.mat
Another way would be to create a function and pass the data as a parameter.
Finally, you could just run the first script and then the second script at the command line, or from a third script that calls them both; the variables from the first script remain in the workspace while the second script runs.
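For comparison, the same save-to-disk/load hand-off between two scripts looks like this in Python with the stdlib pickle module (the file name and path here are illustrative):

```python
import os
import pickle
import tempfile

mydata = [0.0, 0.5, 1.0]
path = os.path.join(tempfile.mkdtemp(), "mydata.pkl")

# Script 1: save mydata to disk (analogue of MATLAB's save).
with open(path, "wb") as f:
    pickle.dump(mydata, f)

# Script 2: load it back (analogue of MATLAB's load).
with open(path, "rb") as f:
    restored = pickle.load(f)
```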

How can I save a row vector to HDF in MATLAB?

For some reason, the hdf5write method in MATLAB is automatically converting my row vectors to column vectors when I re-read them:
>> hdf5write('/tmp/data.h5','/data',rand(1,10));
>> size(hdf5read('/tmp/data.h5','/data'))
ans =
10 1
However, for a row vector in the third dimension, it comes back just fine:
>> hdf5write('/tmp/data.h5','/data',rand(1,1,10));
>> size(hdf5read('/tmp/data.h5','/data'))
ans =
1 1 10
How can I get hdf5write to do the right thing for row vectors? They should be coming back as 1 x 10, not 10 x 1.
edit: the problem is slightly more complicated because I am using C-based MEX to actually read the data later, instead of hdf5read. Moreover, the problem really is in hdf5write, and this is visible in the HDF5 files themselves:
>> hdf5write('/tmp/data.h5','/data',randn(1,10));
>> ! h5ls /tmp/data.h5
data Dataset {10}
that is, the data is saved as a 1-dimensional array in the hdf5 file. For comparison, I try the same thing with an actual 2-d matrix (to show what it looks like), a 1-d column vector, a 1-d vector along the third dimension, and, for kicks, try the V71Dimensions trick which is in the help for both hdf5read and hdf5write:
>> hdf5write('/tmp/data.h5','/data',randn(10,1)); %1-d col vector
>> ! h5ls /tmp/data.h5
data Dataset {10}
>> hdf5write('/tmp/data.h5','/data',randn(1,1,10)); %1-d vector along 3rd dim; annoying
>> ! h5ls /tmp/data.h5
data Dataset {10, 1, 1}
>> hdf5write('/tmp/data.h5','/data',randn(2,5)); %2-d matrix. notice the reversal in dim order
>> ! h5ls /tmp/data.h5
data Dataset {5, 2}
>> hdf5write('/tmp/data.h5','/data',randn(1,10),'V71Dimensions',true); %1-d row; option does not help
>> ! h5ls /tmp/data.h5
data Dataset {10}
So, the problem does seem to be in hdf5write. The 'V71Dimensions' flag does not help: the resultant hdf5 file is still a Dataset {10} instead of a Dataset {10,1}.
It's the reading that's an issue. From the help:
[...] = hdf5read(..., 'V71Dimensions', BOOL) specifies whether to change the majority of data sets read from the file. If BOOL is true, hdf5read permutes the first two dimensions of the data set, as it did in previous releases (MATLAB 7.1 [R14SP3] and earlier). This behavior was intended to account for the difference in how HDF5 and MATLAB express array dimensions. HDF5 describes data set dimensions in row-major order; MATLAB stores data in column-major order. However, permuting these dimensions may not correctly reflect the intent of the data and may invalidate metadata. When BOOL is false (the default), the data dimensions correctly reflect the data ordering as it is written in the file: each dimension in the output variable matches the same dimension in the file.
Thus:
hdf5write('/tmp/data.h5','/data',rand(1,10));
size(hdf5read('/tmp/data.h5','/data','V71Dimensions',true))
ans =
1 10
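The permutation that V71Dimensions applies can be sketched in plain Python: HDF5 describes dimensions in row-major order while MATLAB is column-major, so a 1x10 row vector read back without permuting comes out 10x1. The transpose function here is a tiny illustrative helper, not part of any HDF5 API.

```python
def transpose(m):
    # Swap the first two dimensions of a nested list.
    return [list(row) for row in zip(*m)]

row = [[float(i) for i in range(10)]]   # 1 x 10, as written from MATLAB
col = transpose(row)                     # 10 x 1, as hdf5read returns it
back = transpose(col)                    # what 'V71Dimensions', true restores
print(len(col), len(col[0]))
```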
I'm afraid for this you will have to use the low-level HDF5 API of MATLAB.
In MATLAB, the low-level API is available via, for instance, H5.open(...), H5D.write(...) and so on. The names correspond exactly to those of the C library (see the HDF5 documentation). There is a slight difference in the arguments they take, but the MATLAB help function will tell you everything you need to know.
The good news is that the MATLAB version of the API is still less verbose than the C version. For instance, you don't have to close the datatypes, dataspaces, etc. manually, since MATLAB closes them for you when the variables go out of scope.