Dump netcdf variable within a range

I am using the following command to dump the content of the variable 'tas' within a netcdf file tas_EUR-44_historical.nc
ncdump -v tas tas_EUR-44_historical.nc
tas is a three-dimensional variable with the dimensions time, latitude and longitude: tas(time, rlat, rlon).
Now I need to dump the first time step (index 0), for rlat ranging from 0 to 5 and rlon ranging from 0 to 5.
Does anyone know how this can be done?
Thanks!

You can use ncks
ncks -d time,0 -d rlat,0,5 -d rlon,0,5 in.nc out.nc

It strongly depends on which tools you want to use. This is a trivial task in most programming languages (Python, R, ...); if you want a command-line tool, look at NCO and especially its ncks (NetCDF Kitchen Sink) command.
For example, if I have a NetCDF file (output of ncdump -h):
netcdf u.xz {
dimensions:
xh = 256 ;
y = 1 ;
z = 160 ;
time = UNLIMITED ; // (481 currently)
variables:
float time(time) ;
string time:units = "Seconds since start of experiment" ;
float xh(xh) ;
float y(y) ;
float z(z) ;
float u(time, z, xh, y) ;
}
I can extract for example the first time record using:
ncks -d time,0,0 u.xz.nc test.nc
Or, something closer to your question, select the first time record and slice the spatial dimensions:
ncks -d time,0,0 -d xh,0,5 -d z,0,5 u.xz.nc test.nc
Each time the manipulated NetCDF file is written to a new file. You can leave out the last argument test.nc to dump the output to screen, or simply dump the output of test.nc with ncdump.
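If the extraction has to be scripted, the -d arguments can be assembled programmatically. The following is only a sketch in Python, assuming NCO is installed and ncks is on the PATH; the helper name ncks_command is made up for illustration, and out.nc is a placeholder output name.

```python
import shlex

def ncks_command(slices, infile, outfile=None):
    """Build an ncks argument list from {dimension: (first, last)} index ranges.

    Hypothetical helper for illustration. Indices are zero-based and
    inclusive, matching the -d dim,min,max form used above.
    """
    args = ["ncks"]
    for dim, (first, last) in slices.items():
        args += ["-d", f"{dim},{first},{last}"]
    args.append(infile)
    if outfile is not None:
        args.append(outfile)
    return args

cmd = ncks_command(
    {"time": (0, 0), "rlat": (0, 5), "rlon": (0, 5)},
    "tas_EUR-44_historical.nc",
    "out.nc",  # placeholder output file name
)
print(shlex.join(cmd))
# To actually run it (requires NCO): subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a file name contains spaces.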

Related

How can I dump a file of float values to the command-line?

I have a file containing a raw sequence of 4-byte floating-point values - no headers, no formats, nothing else. I want to print some of the float values in human-readable format, say from the n_1'th float to the n_2'th float, inclusive.
What's a simple way of doing this in a Unix-like command-line environment?
Use the octal dump utility, named od on Unix systems; it supports parsing floating-point binary representations as well.
Specifically, counting floats from zero and letting N1 = n_1 * 4 and N3 = (n_2 - n_1 + 1) * 4 (the +1 makes the range include the n_2'th float), invoke:
od -An -w4 -tf4 -j N1 -N N3 file_with_floats.bin
to get something like:
-123.456
inf
111.222
for a file with three 4-byte float values.
Explanation:
-w4 : Print the dumped values of 4 bytes per line (i.e. one float value)
-An : No prefixing each line with the starting offset into the file
-tf4 : Floating-point format, 4 bytes per value
-j N1 : Skip N1 bytes
-N N3 : Dump N3 bytes
If you want to print your file in C columns use 4*C as the argument to -w, e.g. -w20 will give you 5 columns per line.
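Where od is not available, or its -w option is missing (it is a GNU extension), the same seek-and-read idea can be sketched in Python. This is an illustration only: the function name read_floats is made up, the indices are assumed to be zero-based and inclusive, and the floats are assumed to be stored in the machine's native byte order.

```python
import os
import struct
import tempfile

def read_floats(path, first, last):
    """Return floats first..last (zero-based, inclusive) from a raw float32 file."""
    count = last - first + 1
    with open(path, "rb") as fh:
        fh.seek(first * 4)        # like od -j: skip whole 4-byte values
        raw = fh.read(count * 4)  # like od -N: limit the number of bytes read
    n = len(raw) // 4             # tolerate a short read at end of file
    # "=" means native byte order, standard sizes (an assumption about the file)
    return list(struct.unpack(f"={n}f", raw[: n * 4]))

# Demo with a throwaway file holding five values.
values = [1.5, -2.25, 3.0, 4.75, 8.0]
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(struct.pack("=5f", *values))
selected = read_floats(tmp.name, 1, 3)
os.remove(tmp.name)

for value in selected:
    print(value)
```

If the file was written on a machine with a different byte order, "=" would need to become "<" (little-endian) or ">" (big-endian).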

Slow regexprep with a very long string

I have simulation data in an ASCII file with a lot of data points. I'm trying to extract variable names and their values from it. Below is an example of what the file format looks like:
*ESA
*COM on Tue Sep 27 15:23:02 2016
*COM C:\Users\vi813c\Documents\My Matlab\
*COM The pathname to the ESB file was: C:\Users\vi813c\Documents\My Matlab
Case013
*RTITLE
Run Date/Time = 20-SEP-2016 13:29:00
MSC.EASY5 time-history plot with 20001 data points
*EOD
*FLOAT
TIME FDLB(1) FSLB(1) FVLB(1) MXLB(1) \
MYLB(1) MZLB(1) FDLB(2) FSLB(2) FVLB(2) \
MXLB(2) MYLB(2) MZLB(2) FDLB(3) FSLB(3) \
FVLB(3) MXLB(3) MYLB(3) MZLB(3)
0 884.439 -0 53645.8 -972.132
-311780 207.866 5403.68 1981.49 327781
258746 -1.74898E+006 84631.4 5384.25 -1308.47
326538 -97028.6 -1.74013E+006 -61858.1
0.002 882.616 0.008033 53661.1 -972.4
-311702 207.779 5400.42 1982.11 327784
258726 -1.74906E+006 84628.3 5381.01 -1308.44
326541 -97040.1 -1.74021E+006 -61858.8
0.004 876.819 0.031336 53705.6 -973.183
-311683 207.661 5391.19 1983.9 327795
258693 -1.74935E+006 84624 5371.85 -1309.63
326552 -97040.6 -1.74051E+006 -61858.8
0.006 869.491 0.061631 53763.3 -974.213
-311806 207.618 5377.45 1986.76 327813
258659 -1.74995E+006 84621.7 5358.2 -1312.04
326569 -97040.3 -1.7411E+006 -61861
0.008 861.718 0.095625 53828.1 -975.379
-312039 207.648 5360.82 1990.12 327834
A summary of data format characteristics is as follows:
Everything above "*FLOAT" is a header and I need to get rid of it
Stuff between "*FLOAT" and the first numeric value are the variable names
The variable names and the values are delimited by space(s) and '\'
The data are "lumped". Each lump has values for the variables at a given simulation time step. In the example above, there are 19 variables so that there are 19 numeric values in each lump
There can be multiple data sets; each preceded with "*FLOAT" and a variable name section
The following is how I am currently handling this data:
0. fileread the file --> one big string of characters
1. regexprep {'\s+','\','\n'} with ',' --> comma delimited for strsplit
2. strfind "*FLOAT"
3. strsplit by ',' --> now becomes a cell
4. find the first numeric value by isnan(str2double(parse))
The variable names then lie between the index from step 2 and the index from step 4, and the numeric data lie between the index from step 4 and the next "*FLOAT".
This scheme is sort of working, but I can't stop thinking that there has to be a better way to do this. For one, step 1 is extremely slow. I guess that is because regexprep has to work on one big string with multiple things to replace.
How can I improve my script?
I gave this a shot with the string class, which is new in R2016b.
str = string(fileread('file.txt'));
fileNewline = [13 newline]; % This data has carriage returns
str = extractAfter(str, ['*FLOAT' fileNewline]);
str = erase(str, ['\' fileNewline]);
str = splitlines(str);
% Get the variable names
varNames = split(str(1))';
% Get the data
data = reshape(str(2:end), 4, [])';
data = strip(data);
data = join(data);
data = split(data);
data = double(data);
I'm not sure about how to load the file faster.
As mentioned in another comment, textscan could probably help. It might end up being the fastest solution. With the correct format specified and using the 'HeaderLines' option, I think you can make it work.
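For comparison, the format described in the question can also be handled without any regular expression, by splitting on the "*FLOAT" marker and on whitespace. The following Python sketch is not the MATLAB solution above; it assumes that every data set contains only variable names followed by numbers, and that no variable name parses as a number (a name such as NAN would break it). The function names are made up for illustration.

```python
def is_number(token):
    """True if the token can be read as a floating-point value."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def parse_float_blocks(text):
    """Return a list of (names, rows) pairs, one per *FLOAT data set."""
    datasets = []
    # Everything before the first *FLOAT marker is header and is discarded.
    for block in text.split("*FLOAT")[1:]:
        # The backslash is only a line-continuation mark, so treat it as space.
        tokens = block.replace("\\", " ").split()
        # Variable names run up to the first token that parses as a number.
        n_names = next(
            (i for i, tok in enumerate(tokens) if is_number(tok)), len(tokens)
        )
        names = tokens[:n_names]
        values = [float(tok) for tok in tokens[n_names:]]
        # One "lump" per time step: as many values as there are names.
        width = len(names)
        n_rows = len(values) // width if width else 0
        rows = [values[r * width:(r + 1) * width] for r in range(n_rows)]
        datasets.append((names, rows))
    return datasets

# Shortened sample in the same layout as the file in the question.
sample = """*ESA
*COM header line
*FLOAT
TIME FDLB(1) FSLB(1) \\
FVLB(1)
0 884.439 -0
53645.8
0.002 882.616 0.008033
53661.1
"""
datasets = parse_float_blocks(sample)
names, rows = datasets[0]
print(names)
print(rows)
```

An incomplete trailing lump, like the one at the end of the excerpt in the question, is dropped by the integer division.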

MATLAB reading CSV file with timestamp and values

I have the following sample from a CSV file. Structure is:
Date ,Time(Hr:Min:S:mS), Value
2015:08:20,08:20:19:123 , 0.05234
2015:08:20,08:20:19:456 , 0.06234
I then would like to read this into a matrix in MATLAB.
Attempt :
Matrix = csvread('file_name.csv');
Also tried an attempt formatting the string.
fmt = '%u:%u:%u %u:%u:%u:%u %f';
Matrix = csvread('file_name.csv',fmt);
The problem is when the file is read the format is wrong and displays it differently.
Any help or advice given would be greatly appreciated!
EDIT
When using @Adriaan's answer the result is
2015 -11 -9
8 -17 -1
So it seems that MATLAB thinks the '-' is the delimiter(separator)
Matrix = csvread('file_name.csv',1,0);
csvread does not support a format specifier. Just enter the number of header rows to skip (I took it to be one, as per the example) and the number of header columns, 0.
Your file, however, contains non-numeric data, so import it with importdata:
data = importdata('file_name.csv')
This gives you a structure, data, with two fields: data.data contains the numeric data, i.e. a vector containing your values, and data.textdata is a cell array containing the rest of the data. Take its first two columns and convert them to numeric dates, i.e.
for ii = 2:size(data.textdata,1)
tmp1 = data.textdata{ii,1};
Date(ii,1) = datenum(tmp1,'yyyy:mm:dd'); % lowercase mm is the month; uppercase MM means minutes
tmp2 = data.textdata{ii,2};
Date(ii,2) = datenum(tmp2,'HH:MM:SS:FFF');
end
Thanks to @Excaza, it turns out milliseconds are supported.
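The same parsing steps (skip the header row, trim the padding around each field, convert date and time with an explicit format) can be sketched outside MATLAB as well. The Python below is an illustration, not a translation of the answer; the function name read_timestamped_csv is made up, and the sample text simply repeats the rows from the question.

```python
import csv
import io
from datetime import datetime

sample = """Date ,Time(Hr:Min:S:mS), Value
2015:08:20,08:20:19:123 , 0.05234
2015:08:20,08:20:19:456 , 0.06234
"""

def read_timestamped_csv(fileobj):
    """Return a list of (datetime, value) pairs from the colon-separated layout."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the single header row
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # The fields carry stray spaces around the commas, so strip them.
        date_s, time_s, value_s = (field.strip() for field in row)
        # %f reads the trailing 1-6 digits as a fraction of a second,
        # so "123" becomes 123 milliseconds.
        stamp = datetime.strptime(f"{date_s} {time_s}", "%Y:%m:%d %H:%M:%S:%f")
        records.append((stamp, float(value_s)))
    return records

records = read_timestamped_csv(io.StringIO(sample))
for stamp, value in records:
    print(stamp.isoformat(timespec="milliseconds"), value)
```

For a real file, open it with open('file_name.csv', newline='') and pass the file object in place of the StringIO wrapper.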

Different value when using fprintf or sprintf

I've written a function (my first, so don't be too quick to judge) in MATLAB, which is supposed to write a batch file based on 3 input parameters:
write_BatchFile(setup,engine,np)
Here setup consists of one or more strings, engine consists of one string only and np is a number, e.g.:
setup = {'test1.run';'test2.run';'test3.run'};
engine = 'Engine.exe';
np = 4; % number of processors/cores
I'll leave out the first part of my script, which is a bit more extensive, but in case necessary I can provide the entire script afterwards. Anyhow, once all 3 parameters have been determined, which it does successfully, I wrote the following, which is the last part of my script:
%==========================================================================
% Start writing the batch file
%==========================================================================
tmpstr = sprintf('\nWriting batch file batchRunMPI.bat...');
disp(tmpstr); clear tmpstr;
filename = 'batchRunMPI.bat';
fid = fopen(filename,'w');
fprintf(fid,'set OMP_NUM_THREADS=1\n');
for i = 1:length(setup);
fprintf(fid,'mpiexec -n %d -localonly "%s" "%s"\n',np,engine,setup{i});
fprintf(fid,'move %s.log %s.MPI_%d.log\n',setupname{i},setupname{i},np);
end
fclose all;
disp('Done!');
NOTE setupname follows using fileparts:
[~,setupname,setupext] = fileparts(setup);
However, when looking at the resulting batch file I end up getting the value 52 where I indicate my number of cores (= 4), e.g.:
mpiexec -n 52 -localonly "Engine.exe" "test1.run"
mpiexec -n 52 -localonly "Engine.exe" "test2.run"
mpiexec -n 52 -localonly "Engine.exe" "test3.run"
Instead, I'd want the result to be:
mpiexec -n 4 -localonly "Engine.exe" "test3.run", etc
When I check the value of np it returns 4, so I'm confused where this 52 comes from.
My feeling is that it's a very simple solution which I'm just unaware of, but I haven't been able to find anything on this so far, which is why I'm posting here. All help is appreciated!
-Daniel
It seems that at some stage np is being converted to a string. The character '4' has the integer value 52, which explains what you're getting. You've got a few options:
a) Figure out where np is being converted to a string and change it.
b) Change the %d to a %s, so you get '4' instead of 52.
c) Change the np part of the fprintf statement to str2double(np).
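The relationship between the character '4' and the number 52 is not specific to MATLAB; it is simply the character code. A small Python illustration (the variable names are made up for the example):

```python
np_text = "4"   # what np apparently holds: text, not a number
np_value = 4    # what was intended

code = ord(np_text)        # character code of '4'
restored = int(np_text)    # converting the text back gives the number

print(code)      # 52
print(restored)  # 4
print(chr(52))   # 4
```

This is why option c) works: converting the text back to a number before formatting it with %d yields 4 rather than the code of the character.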

Finding number of lines of a data file using command line

The conventional way is to read each line one by one and check at every read whether iostat becomes nonzero. However, I would like to call the system(command) routine and use the wc -l command to count the number of lines, and then allocate the array in which I want to put the data. As an example, I am printing the number of lines in both ways:
Program Test_reading_lines
integer:: count,ios, whatever
character(LEN=100):: command
Print*,'Reading number of lines in a standard way'
count=0
open (10, file='DATA_FILE')
Do
read (10,*,iostat=ios) whatever
if (ios/=0) exit
count=count+1
End Do
close(10)
Print*,'Number of lines =', count
Print*,'Reading number of lines using shell command'
command='cat DATA_FILE | wc -l'
call system(command)
Print*,'Number of lines =','< ? >'
End Program Test_reading_lines
In the latter case, how can I assign the result to a variable like count, as in the standard case? That is, I want to print a variable instead of '< ? >' in the last print command.
This is not possible in a straightforward way. You could redirect the output of the command to a file, then open that file and read it: http://compgroups.net/comp.lang.fortran/how-to-get-the-output-of-call-system-in-a-v/216294
Or use some more sophisticated features of the Unix functions and call their C API (see the first answer in that thread).
EXECUTE_COMMAND_LINE() does not have any feature to read the output of the command directly either.
If you want to use the Unix command wc -l, you could call the Fortran subroutine execute_command_line, which is part of the Fortran 2008 standard and is supported by many compilers, gfortran included.
Here is a working example which computes the number of lines, nlines, of a file called style.gnuplot and then uses nlines to append some rows to style.gnuplot by overwriting the last one.
PROGRAM numLines
IMPLICIT NONE
integer, parameter :: n = 100
integer :: i, nLines
real, parameter :: x0 = -3.14, xEnd = 3.14
real :: dx
real, dimension (:), allocatable :: x, fun
allocate(x(0:n)) ! Allocate the x array
allocate(fun(0:n)) ! Allocate the fun array
dx = abs(xEnd-x0)/n
x(0:n) = [(x0+i*dx, i = 0,n)] ! Create the x array
fun(0:n) = [(sin(x0+i*dx), i = 0,n)] ! Create the fun array
open(unit=1,file="plotFunction.dat")
DO i=0,size(x)-1
write(1,*) x(i), ' ', fun(i) ! Save the function to a file to plot
END DO
close(unit=1)
deallocate(x) ! Deallocate the x array
deallocate(fun) ! Deallocate the fun array
open(unit=7, file="style.gnuplot")
write(7,*) "set title 'y = sin(x)' font 'times, 24'"
write(7,*) "set tics font 'times, 20'"
write(7,*) "set key font 'times,20'"
write(7,*) "set grid"
write(7,*) "set key spacing 1.5"
write(7,*) "plot '<cat' u 1:2 w l lw 2 linecolor rgb 'orange' notitle "
close(unit=7)
CALL execute_command_line("wc -l style.gnuplot | cut -f1 -d' ' > nlines.file") ! Count the lines
open(unit=1,file='nlines.file')
read(1,*) nlines ! Here the number of lines is saved to a variable
close(unit=1)
CALL execute_command_line("rm nlines.file") ! Remove nlines.file
CALL execute_command_line("cat plotFunction.dat | gnuplot -p style.gnuplot") ! Show the plot within the executable
open(unit=7,file="style.gnuplot")
DO i = 1,nLines-1
read(7,*) ! Read the file until the penultimate row,
END DO ! then append the other rows
write(7,*) "set object rectangle at -3.14,0 size char 1, char 1", &
" fillcolor rgb 'blue' fillstyle solid border lt 2 lw 1.5"
write(7,*) "set object rectangle at 0,0 size char 1, char 1", &
" fillcolor rgb 'blue' fillstyle solid border lt 2 lw 1.5"
write(7,*) "set object rectangle at 3.14,0 size char 1, char 1", &
" fillcolor rgb 'blue' fillstyle solid border lt 2 lw 1.5"
write(7,*) "plot 'plotFunction.dat' u 1:2 w l lw 2 linecolor rgb 'orange' notitle"
close(unit=7)
CALL execute_command_line("gnuplot -p 'style.gnuplot'") ! Load style.gnuplot again with the appended lines
END PROGRAM numLines
My code might not be elegant, but it seems to work!