Output from large .csv files generated by Matlab

Output from large .csv files generated by Matlab - matlab

I am downloading some data in static and time series format using Matlab codes (and toolboxes). The codes give out the results by default in a .csv file for static and time series separately. Although the static data turns out fine, the time series data is huge and the .csv file doesn't load all the data. I tried changing the output file extension to .dta and then also .mat in order to view the output in Stata or Matlab. Also tried writing a little loop where the data loaded into large .csv file can be split into two worksheets within the same file. But none of it has worked. Although I am used to some basic coding in Matlab, I am new to dealing with such large datasets. Any help on this would be very much appreciated.
Thank you- Veronica

How about using these two bits of VBScript to split your CSV file into 2 pieces. There will be 599,999 lines in the first output file (part1.csv) and the rest in the second file (part2.csv).
Save this as part1.vbs
Set fso = CreateObject ("Scripting.FileSystemObject")
Set stdout = fso.GetStandardStream (1)
LineNum=1
Do While Not WScript.StdIn.AtEndOfStream
REM Read in next line of input
Line = WScript.StdIn.ReadLine()
If LineNum<600000 Then
stdout.WriteLine(Line)
End If
LineNum=LineNum+1
Loop
Save this as part2.vbs
Set fso = CreateObject ("Scripting.FileSystemObject")
Set stdout = fso.GetStandardStream (1)
LineNum=1
Do While Not WScript.StdIn.AtEndOfStream
REM Read in next line of input
Line = WScript.StdIn.ReadLine()
If LineNum>=600000 Then
stdout.WriteLine(Line)
End If
LineNum=LineNum+1
Loop
Then you can do this at the Command Prompt to split your file in two:
cscript /nologo part1.vbs YourFile.CSV > part1.csv
cscript /nologo part2.vbs YourFile.CSV > part2.csv

Related

Edit an Abaqus input file and Run it from Matlab

I need perform 50 Abaqus simulations, each simulation analyses a certain material property and each differs by changing one parameter. So the idea is to write a Matlab script that:
opens the .inp file
edits the material parameter of interest
prints it into a new file which will be the new .inp file
runs it to perform the simulation
This is what I accomplished so far in a very simplified version:
f= fopen('PRD8_30s.inp');
c = textscan(f,'%s %s %s %s %s ','delimiter',',');
fclose(f) ;
S = [c{1}];
A = {'5e-08'} ;
S(12496) = A ;
fid = fopen('file.inp','w') ;
fprintf(fid,'%s \n',S{:} );
fclose(fid) ;
PRD_8_30s.inp
I manually found out the position of the parameter of interest (A at 12496 hence below the line *Viscoelastic). The code actually changes the parameter I need but there are major problems: it prints a new file with additional lines with respect to the original .inp (12552 vs 8737) and it doesn't print the entire .inp but only the first column.
How can I edit the .inp changing the parameter and obtaining a new .inp with the edited parameter that can be used to run the new simulation?
Thank you in advance for your help!

If your input file is not multiple Gb in size, The following might help.
create a template input and mark the parameter you want to change as, for example para_xxxx
Use the following script:
text=fileread('template.inp');
newtext=replace(text,'para_xxxx',newParameter);
fid=fopen('newcase.inp','w');
fprintf(fid,newtext);
fclose(fid);
The file name 'newcase.inp' should be updated each time in the loop.

Reading a file from a different directory in matlab

My matlab function is in a folder that contains the main project and the other functions of the code. However, the data is stored in a folder withing the main one named "data" and inside the specific dataset that i want, for example "ded4" in this example. I can't figure out how to open the text file that I want without changing the file to the main folder. The code I have so far is:
function[Classify] = Classify(logDir)
%%%%logDir='ded014a04';
Directory = ['data/' logDir '/']
Filename = [logDir '-fixationsOffSet']
File_name = fullfile(Directory,Filename)
File = fopen(File_name,'r')
end
The code is in the 'dev' folder, I think my path is correct because when I do
open(File_name)
it opens.
Thanks for the help

If you want to open the file in the editor, use
open(File_name)
If you want to read data from the file, you can use
dlmread(File_name) % Read ASCII delimited file.
or
C = textscan(File,'FORMAT') % Read formatted data from text file or string.
or more low-level using fscanf, e.g., if the file contains three columns of integers you do the following: Read the values in column order, and transpose to match the appearance of the file: (from the help of fprintf)
fid = fopen('count.dat');
A = fscanf(fid,'%d',[3,inf])';
fclose(fid);

How to batch rename files to 3-digit numbers?

I apologize in advance that this question is not specific. But my goal is to take a bunch of image files, which are currently named as: 0.tif, 1.tif, 2.tif, etc... and rename them just as numbers to 000.tif, 001.tif, 002.tif, ... , 010.tif, etc...
The reason I want to do this is because I am trying to load the images into matlab and for batch processing but matlab does not order them correctly. I use the dir command as dir(*.tif) to get all the images and load them into an array of files that I can iterate over and process, but in this array element 1 is 0.tif, element 2 is 1.tif, element 3 is 10.tif, element 4 is 100.tif, and so on.
I want to keep the ordering of the elements as I process them. However, I do not care if I have to change the order of the elements BEFORE processing them (i.e. I can make it work to rename, for example, 2.tif to 10.tif if I had to) but I am looking for a way to convert the file names the way I initially described.
If there is a better way to get matlab to properly order the files when it loads them into the array using dir please let me know because that would be much easier.
Thanks!!

You can do this without having to rename the files, if you want. When you grab the files using dir, you'll have a list of files like so:
files =
'0.tif'
'1.tif'
'10.tif'
...
You can grab just the numeric part using regexp:
nums = regexp(files,'\d+','match');
nums = str2double([nums{:}]);
nums =
0 1 10 11 12 ...
regexp returns its matches as a cell-array, the second line converts it back to actual numbers.
We can now get an actual numeric order by sorting the resulting array:
[~,order] = sort(nums);
and then put the files in the correct order:
files = files(order);
This should (I haven't tested it, I don't have a folder full of numerically labelled files handy) produce a list of files like so:
files=
'0.tif'
'1.tif'
'2.tif'
'3.tif'
...

this is partially dependent on the version of matlab you have. If you have a version with findstr this should work well
num_files_to_rename = numel(name_array);
for ii=1:num_files_to_rename
%in my test i used cells to store my strings you may need to
%change the bracket type for your application
curr_file = name_array{ii};
%locates the period in the file name (assume there is only one)
period_idx = findstr(curr_file ,'.');
%takes everything to the left of the period (excluding the period)
file_name = str2num(curr_file(1:period_idx-1));
%zeropads the file name to 3 spaces using a 0
new_file_name = sprintf('%03d.tiff',file_name)
%you can uncomment this after you are sure it works as you planned
%movefile(curr_file, new_file_name);
end
the actual rename operation movefile is commented out for now. make sure the output names are as you expect before uncommenting it and renaming all the files.
EDIT there is no real error checking in this code, it just assumes every file name has one and only one period, and an actual number as the name

The Batch file below do the rename of the files you want:
#echo off
setlocal EnableDelayedExpansion
for /F "delims=" %%f in ('dir /B *.tif') do (
set "name=00%%~Nf"
ren "%%f" "!name:~-3!.tif"
)
Note that this solution preserve the same order of your original files, even if there are missing numbers in the sequence..

Abaqus *.inp file created using Matlab

I was trying to do parametric studies in ABAQUS. I created an *.inp file (master) using GUI in abaqus, then wrote a matlab code to create a new *.inp file using the master. Master *.inp file can be found here and will be required to run the code.
In the new *.inp file everything was same as the master except a few specific lines which I am changing for parametric studies, code is given below. I am getting the files nicely but the problem is ABAQUS can't read the file and gives error messages. By visual inspection I don't find any faults. I guess matlab is writing the *.inp file in some other format which ABAQUS can't interpret.
clc;
%Number of lines to be copied
total_lines=4538; %total number of lines
lines_b4_RP1=4406; % lines before reference point 1
%creating new files
for A=0
for R=[20 30 40 50 100 200 300 400 500]
fileroot = sprintf('P_SHS_120X120X1_NLA_I15_A%dR%d.inp', A,R);
main_inp=fopen('P_SHS_120X120X1_NLA_I15_A0R10.inp','r'); %inputting the main inp file to be copied
wfile=fopen(fileroot,'w+'); %wfile= writing the new file
for i=1:total_lines
data=fgets(main_inp);
if i<lines_b4_RP1
fprintf(wfile,'%s\n', data);
elseif i==lines_b4_RP1
formatline1=('%s\n');
txtline='*Node';
fprintf(wfile, formatline1 ,txtline);
elseif i==(lines_b4_RP1+1)
formatline2=('%d%s%d%s%d%s%d\r\n');
comma=',';
refpt1=1;
xcoord1=R*cosd(A);
ycoord1=R*sind(A);
zcoord1=-20;
fprintf(wfile, formatline2, refpt1,comma,xcoord1,comma,ycoord1,comma,zcoord1);
elseif i==(lines_b4_RP1+2)
fprintf(wfile, formatline1 ,txtline);
elseif i==(lines_b4_RP1+3)
refpt2=2;
xcoord2=R*cosd(A);
ycoord2=R*sind(A);
zcoord2=420;
fprintf(wfile, formatline2 ,refpt2,comma,xcoord2,comma,ycoord2,comma,zcoord2);
elseif i>(lines_b4_RP1+3)
fprintf(wfile,'%s\n', data);
else break;
end
end
fclose(main_inp);
fclose(wfile);
end
end
Thanks in advance.
N.B. A sample *.dat file containing the error message is given here.

You are using fgets to get each line of the input file. From the matlab help:
fgets: Read line from file, keeping newline characters
You then print each line using
fprintf(wfile,'%s\n', data);
This creates two newlines at the end of each data line in the file. A second problem in your file is that you use \r\n in your format specifier. In matlab (unlike C) this will give you two newlines. e.g.
>> fprintf('Hello\rWorld\nFoo\r\nBar\n')
Hello
World
Foo
Bar
>>
I would suggest in future to test this approach with a much simpler format that you can use. Also there is a
*preprint
option that allows you to echo the contents of the input file back into the dat file. This creates big dat files, but it is useful for debugging.

Batch script to show multiple filenames in one call

Im trying to create a batch script to call a .exe to carry out analysis on multiple data files in a same folder. The syntax should be like:
"D:\Softwares\Analyzer.exe" [DataFile1].dat [DataFile2].dat ... [DataFileN].dat AnalysisFile.pdo"
Currently I have tried to use a FOR loop to scan each *.dat file in a specified folder. (I don't know how many data files in that folder, so I cannot type the filenames directly in the command line)
For example:
#ECHO OFF
FOR /r %%i in (*.dat) DO (
"D:\Softwares\Analyzer.exe" %%~ni.dat TestAnalysis.pdo
)
PAUSE
However, the analysis is carried out on seperate datafiles, and the .exe file will pop-up and open every time when a new .dat file is detected. Is there any way I could use *.dat or any other methods to represent [DataFile1].dat [DataFile2].dat ... [DataFileN].dat in one line seperated by a space (not a new line)?
I have also tried to use #tilte, which does not work as well. Since the .exe window keep pop-up whenever a new .dat file is detected and I have to close each of them in order to continue to next .dat file.
In general, I would like to do an automatic scan in a folder, get the names of the datafiles, and write a command line to call these .dat files in one line.
Any ideas/helps appreciated!!!

try this, remove the word echo if the output is OK:
#echo off &setlocal enabledelayedexpansion
set "line="
for %%i in (*.dat) do set "line=!line! "%%i""
echo "D:\Softwares\Analyzer.exe" %line% TestAnalysis.pdo
The code doesn't work with *dat files with exclams ! in the file name. This may be fixed if needed.