Matlab: str2double works on Windows PC but not on Apple Mac - matlab

Say I have the following example file names: file_0250.pdf, file_0251.pdf, file_0252.pdf. I would like to get the following cell array:
'250 251 252'.
On the Windows PC at work I can run the following code with no problems, but on my MacBook at home str2double returns NaN instead of the numeric values:
folder_name = '/User/....';
file_name = 'file_';
extension = '.pdf';
%//' files pattern with absolute paths
filePattern = fullfile(folder_name, [file_name '*' extension] );
old_filename = cellstr(ls(filePattern)) ;
%// Get numbers associated with each file
file_ID = strrep(strrep(old_filename, file_name ,''), extension,'');
file_ID_doublearr = str2double(file_ID);
I tried 'cell2mat', 'str2mat', but they do not go well with the rest of the code:
file_ID_doublearr = file_ID_doublearr - min(file_ID_doublearr)+ start_number;
file_ID = strtrim(cellstr(num2str(file_ID_doublearr)));
%// Get zeros string to be pre-appended to each new_name
str_zeros = arrayfun(@(t) repmat('0',1,t), 4-cellfun(@numel,file_ID),'uni',0) ;
%// Generate new filenames
new_name = get(handles.new_name, 'string');
new_extension = get(handles.new_extension, 'string');
new_filename = strcat(new_name,str_zeros,file_ID,new_extension) ;
%// Finally rename files with the absolute paths
cellfun(@(m1,m2) movefile(m1,m2),fullfile(folder_name,old_filename),fullfile(folder_name,new_filename)) ;

Your issue has to do with the different ways that *nix (Linux and Mac) and Windows treat ls, as mentioned in the documentation. As you've found out, ls returns a 2D character array of filenames. On the PC, these filenames will be returned one per row.
file_001.pdf
file_002.pdf
file_003.pdf
file_004.pdf
When you call cellstr on the result, it will place each filename into its own cell array element, after which you can successfully extract the number portion of each name and convert it to a number.
On *nix-based systems though, ls will typically yield a multi-column output. For example:
file_001.pdf file_002.pdf file_003.pdf
file_004.pdf
When you call cellstr on this, you will get one cell array element per row, but as you can see the first row actually contains three filenames. Then once you extract the number portion you would get something like this:
'001 002 003'
'004'
When you try to convert to a number, you're trying to convert a string of numbers to a single number and you get a NaN.
str2double({'001 002 003'; '004'})
% NaN 4
The best way to fix this is to not use the OS-dependent ls and use dir instead which is guaranteed to have consistent behavior across operating systems.
files = dir(fullfile(folder_name, [file_name, '*', extension]));
numbers = regexp({files.name}, '[0-9]*', 'match');
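As a minimal sketch of the dir-based approach carried through to numeric values: the scratch folder and the three file names below are made up for the demo, and the 'once' option is an addition (not part of the answer above) so that regexp returns one character vector per file name, which str2double can convert directly.

```matlab
% Hypothetical scratch folder and file names, created only for this demo
demo_dir = tempname;
mkdir(demo_dir);
demo_names = {'file_0250.pdf', 'file_0251.pdf', 'file_0252.pdf'};
for k = 1:numel(demo_names)
    fclose(fopen(fullfile(demo_dir, demo_names{k}), 'w'));
end

% dir returns one struct per file on every platform
files = dir(fullfile(demo_dir, 'file_*.pdf'));

% 'once' yields one char vector per file name instead of a nested cell,
% so the result can go straight into str2double
id_strings = regexp({files.name}, '\d+', 'match', 'once');
file_ids = str2double(id_strings);

disp(file_ids)

rmdir(demo_dir, 's');   % remove the scratch folder again
```

Because each element of id_strings holds exactly one number, str2double should no longer return NaN here, regardless of the operating system.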
The other option is to make sure that file_ID does not contain any space-separated numbers.
file_IDs = {'001 002 003'; '004'};
% Break each element up into multiple elements if it contains spaces
file_IDs = cellfun(@(x)strsplit(x), file_IDs, 'UniformOutput', 0);
file_IDs = cat(2, file_IDs{:});
% Now convert to a number
str2double(file_IDs);
% 1 2 3 4

Related

joining arrays in Matlab and writing to file using dlmwrite( ) adds extra space

I am generating 2500 values in Matlab in format (time,heart_rate, resp_rate) by using below code
numberOfSeconds = 2500;
time = 1:numberOfSeconds;
newTime = transpose(time);
number0 = size(newTime, 1)
% generating heart rates
heart_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intHeartRate = int64(heart_rate);
number1 = size(intHeartRate, 1)
% hist(heart_rate)
% generating resp rates
resp_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intRespRate = int64(resp_rate);
number2 = size(intRespRate, 1)
% hist(heart_rate)
% joining time and sensor data
joinedStream = strcat(num2str(newTime),{','},num2str(intHeartRate),{','},num2str(intRespRate))
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', joinedStream,'delimiter','');
The data shown in the console looks right, but when I save it to a .txt file, each line starts with extra spaces, so I am not able to parse the .txt file to generate an input stream. Please help.
Replace the last two lines of your code with the following. No need to use strcat if you want a CSV output file.
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', [newTime intHeartRate intRespRate]);
The solution suggested by PKo is the simplest for your case. This answer explains why you get the unexpected output.
The data written in the file is exactly what is shown in the console.
>> joinedStream(1) %The exact output will differ since 'rand' is used
ans =
cell
' 1,60,63'
num2str converts a matrix into a character array, so every row must contain the same number of characters. For each column of the original matrix, the entry with the most characters sets the width for that column, and shorter entries are padded with leading spaces. Columns are separated by 2 spaces. Take a look at the following smaller example to understand:
>> num2str([44, 42314; 4, 1212421])
ans =
2×11 char array
'44    42314'
' 4  1212421'
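If the comma-joined text form is still wanted, one way to avoid the padding is to format the values with sprintf instead of num2str. This is only a sketch: the three sample rows are made up, and the variable names simply mirror those in the question.

```matlab
% Small made-up sample in the same (time, heart_rate, resp_rate) layout
newTime      = (1:3).';
intHeartRate = int64([60; 57; 66]);
intRespRate  = int64([63; 51; 70]);

% num2str pads every row to the width of the longest number
padded = num2str([1; 10; 100]);   % 3-by-3 char array: '  1', ' 10', '100'

% sprintf formats each value on its own, so nothing is padded.
% The matrix is transposed because sprintf consumes elements column by column.
rows = double([newTime, intHeartRate, intRespRate]);
csv_text = sprintf('%d,%d,%d\n', rows.');

disp(csv_text)
```

The same format string can be passed to fprintf with a file identifier to write the lines to disk directly.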

Matlab - How can I find the lowest common directory of an arbitrary group of files?

I have a cell array of full file names and I want to find the lowest common directory where it makes sense to store accumulated data and what not.
Here is an example hierarchy of test data:
C:\Test\Run1\data1
C:\Test\Run1\data2
C:\Test\Run1\data3
C:\Test\Run2\data1
C:\Test\Run2\data2
.
.
.
In Matlab, the paths are stored in a cell array as follows (each run shares a row):
filePaths = {...
'C:\Test\Run1\data1','C:\Test\Run1\data2','C:\Test\Run1\data3'; ...
'C:\Test\Run2\data1','C:\Test\Run2\data2','C:\Test\Run2\data3'};
I want to write a routine that outputs the common path C:\Test\Run1 so that I can store relevant plots in a new directory there.
C:\Test\Run1\Accumulation_Plots
C:\Test\Run2\Accumulation_Plots
.
.
.
Previously, I was only concerned with two files in an x-by-2 cell, so the regimen below worked; however, strcmp lost its appeal since I can't (AFAIK) compare across the whole cell array at once.
d = 1;
while strcmp(filePaths{1}(1:d),filePaths{2}(1:d))
d = d + 1;
end
common_directory = filePaths{1}(1:d-1);
mkdir(common_directory,'Accumulation_Plots');
As suggested by @nekomatic, I'm posting my comment as an answer.
filePaths = {...
'C:\Test\Run1\data1','C:\Test\Run1\data2','C:\Test\Run1\data3'; ...
'C:\Test\Run2\data1','C:\Test\Run2\data2','C:\Test\Run2\data3'};
% Sort the file paths
temp = sort(filePaths(:));
% Take the first and the last one, and split by '\'
first = strsplit(temp{1}, '\');
last = strsplit(temp{end}, '\');
% Compare them up to the depth of the smallest. Find the 'first N matching values'
sizeMin = min(numel(first), numel(last));
N = find(~[cellfun(@strcmp, first(1:sizeMin), last(1:sizeMin)) 0], 1, 'first') - 1;
% Get the smallest common path
commonPath = strjoin(first(1:N), '\');
You just need to compare the first d characters of any path in the array - e.g. path 1 - with the first d characters of the other paths. The longest common base path can't be longer than path 1 and it can't be shorter than the shortest common base path between path 1 and any other path.
There must be several ways you could do that, but a concise one is using strfind to match the strings and cellfun with isempty to check which ones didn't match:
% filePaths should contain at least two paths
filePaths = {...
'C:\Test\Run1\data1','C:\Test\Run1\data2','C:\Test\Run1\data3'; ...
'C:\Test\Run2\data1','C:\Test\Run2\data2','C:\Test\Run2\data3'};
path1 = filePaths{1};
filePaths = filePaths(2:end);
% find longest common left-anchored substring
d = 1;
while ~any(cellfun(@isempty, strfind(filePaths, path1(1:d))))
d = d + 1;
end
% find common base path from substring
[common_directory, ~, ~] = fileparts(path1(1:d));
Your code leaves d containing the length of the longest common left-anchored substring between the paths, but that might be longer than the common base path; fileparts extracts the actual base path from that substring.
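The same idea (longest common left-anchored substring, then cut back to a directory boundary) can also be sketched without a loop by comparing all paths column by column. This is an alternative illustration rather than either answer's method; it assumes Windows-style '\' separators, and a path that is itself a strict prefix of another path (a folder listed next to its own contents) would need extra handling.

```matlab
% Example paths from the question, flattened to a column
filePaths = {'C:\Test\Run1\data1'; 'C:\Test\Run1\data2'; 'C:\Test\Run1\data3'; ...
             'C:\Test\Run2\data1'; 'C:\Test\Run2\data2'; 'C:\Test\Run2\data3'};

% char pads shorter paths with trailing spaces so all rows have equal width
c = char(filePaths);

% A column "agrees" when every row holds the same character as row 1
agree = all(c == repmat(c(1, :), size(c, 1), 1), 1);

% Length of the common left-anchored substring
n = find(~agree, 1, 'first') - 1;
if isempty(n)
    n = size(c, 2);          % all paths are identical
end
prefix = c(1, 1:n);          % 'C:\Test\Run' for the example

% Cut back to the last separator so a partial folder name is dropped
sep = find(prefix == '\', 1, 'last');
common_directory = prefix(1:sep - 1);

disp(common_directory)
```

For the example data the common substring ends in the partial folder name Run, which is why the final step trims back to the last separator.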

Extract specific column information from table in MATLAB

I have several *.txt files with 3 columns information, here just an example of one file:
namecolumn1 namecolumn2 namecolumn3
#----------------------------------------
name1.jpg someinfo1 name
name2.jpg someinfo2 name
name3.jpg someinfo3 name
othername1.bmp info1 othername
othername2.bmp info2 othername
othername3.bmp info3 othername
I would like to extract from "namecolumn1" only the entries that start with name.
My code looks like this:
file1 = fopen('test.txt','rb');
c = textscan(file1,'%s %s %s','Headerlines',2);
tf = strcmp(c{3}, 'name');
info = c{1}{tf};
The problem is that when I do disp(info) I get only the first entry from the table, name1.jpg, and I would like to have all of them:
name1.jpg
name2.jpg
name3.jpg
You're pretty much there. What you're seeing is an example of MATLAB's Comma Separated List, so MATLAB is returning each value separately.
You can verify this by entering c{1}{tf} in the command line after running your script, which returns:
>> c{1}{tf}
ans =
name1.jpg
ans =
name2.jpg
ans =
name3.jpg
Though sometimes we'd want to concatenate them, I think in the case of character arrays the result is more difficult to work with than retaining the cell array:
>> info = [c{1}{tf}]
info =
name1.jpgname2.jpgname3.jpg
versus
>> info = c{1}(tf)
info =
'name1.jpg'
'name2.jpg'
'name3.jpg'
The former would require you to reshape the result (and whitespace pad, if the strings are different lengths), whereas you can index the strings in a cell array directly without having to worry about any of that (e.g. info{1}).
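A self-contained sketch of the cell-array approach, using made-up stand-ins for the columns that textscan returns (names plays the role of c{1}, labels the role of c{3}):

```matlab
% Made-up stand-ins for c{1} and c{3} as returned by textscan
names  = {'name1.jpg'; 'name2.jpg'; 'name3.jpg'; 'othername1.bmp'};
labels = {'name'; 'name'; 'name'; 'othername'};

tf = strcmp(labels, 'name');    % logical mask of the matching rows

% Parentheses keep the cell container, so info is a 3-by-1 cell array
info = names(tf);

% Each entry is then reachable by index
for k = 1:numel(info)
    disp(info{k})
end
```

Indexing with parentheses returns a smaller cell array, whereas curly braces unpack the contents into a comma-separated list.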

Matlab | How to load/use files with consecutive names (abc1, abc2, abc3) and then pass on to the next series (cba1, cba2, cba3)?

I have a folder containing a series of data with file names like this:
abc1
abc2
abc3
bca1
bca2
bca3
bca4
bca5
cba1
... etc
My goal is to load all the relevant files for each file name, so all the "abc" files, and plot them in one graph. Then move on to the next file name, and do the same, and so forth. Is there a way to do this?
This is what I currently have to load and run through all the files, grab the data in them and get their name (without the .mat extension) to be able to save the graph with the same filename.
dirName = 'C:\DataDirectory';
files = dir( fullfile(dirName,'*.mat') );
files = {files.name}';
data = cell(numel(files),1);
for i=1:numel(files)
fname = fullfile(dirName,files{i});
disp(fname);
files{i} = files{i}(1:length(files{i})-4);
disp(files{i});
[Rest of script]
end
You already found out about the cool features of dir, and have a cell array files, which contains all file names, e.g.
files =
'37abc1.mat'
'37abc2.mat'
'50bca1.mat'
'50bca2.mat'
'1cba1.mat'
'1cba2.mat'
The main task now is to find all prefixes, 37abc, 50bca, 1cba, ... which are present in files. This can be done using a regular expression (regexp). The Regexp Pattern can look like this:
'([\d]*[\D]*)[\d]*\.mat'
i.e. take any number of digits ([\d]*), then any number of non-numeric characters ([\D]*), and keep those as a token (by putting them in parentheses). Next, there will be any number of digits ([\d]*), followed by the literal text .mat (the dot is escaped as \. so that it does not match an arbitrary character).
We call the regexp function with that pattern:
pre = regexp(files,'([\d]*[\D]*)[\d]*\.mat','tokens');
resulting in a cell array (one cell for each entry in files), where each cell contains another cell array with the prefix of that file. To convert this to a simple not-nested cell array, we call
pre = [pre{:}];
pre = [pre{:}];
resulting in
pre =
'37abc' '37abc' '50bca' '50bca' '1cba' '1cba'
To remove duplicate entries, we use the unique function:
pre = unique(pre);
pre =
'1cba' '37abc' '50bca'
which leaves us with all prefixes that are present (unique returns them in sorted order). Now you can loop through each of these prefixes and apply your stuff. Everything put together is:
% Find all files
dirName = 'C:\DataDirectory';
files = dir( fullfile(dirName,'*.mat') );
files = {files.name}';
% Find unique prefixes
pre = regexp(files,'([\d]*[\D]*)[\d]*\.mat','tokens');
pre = [pre{:}]; pre = [pre{:}];
pre = unique(pre);
% Loop through prefixes
for ii=1:numel(pre)
% Get files with this prefix
curFiles = dir(fullfile(dirName,[pre{ii},'*.mat']));
curFiles = {curFiles.name}';
% Loop through all files with this prefix
for jj=1:numel(curFiles)
% Here the magic happens
end
end
Sorry, I misunderstood your question, I found this solution:
file = dir('*.mat')
matching = regexp({file.name}, '^[a-zA-Z_]+[0-9]+\.mat$', 'match', 'once'); %// Match once on file name, must be a series of A-Z a-z chars followed by numbers.
matching = matching(~cellfun('isempty', matching));
str = unique(regexp(matching, '^[a-zA-Z_]*', 'match', 'once'));
str = str(~cellfun('isempty', str));
group = cell(size(str));
for is = 1:length(str)
ismatch = strncmp(str{is}, matching, length(str{is}));
group{is} = matching(ismatch);
end
Answer came from this source: Matlab Central

Modify the value of a specific position in a text file in Matlab

AIR, ID
AIR.SIT
50 1 1 1 0 0 2 1
43.57 -116.24 1. 857.7
Hi, All,
I have a text file like the one above. In Matlab, I want to create 5000 text files, replacing the number "2" (the specific number in the 3rd row) with a value from 1 to 5000 in each file, while keeping the other contents the same. In every loop iteration, the replaced number equals the loop number, and the output is saved to a new text file with a name like AIR_LoopNumber.SIT.
I've spent some time on this, but it is kind of difficult for a newbie. Here is what I have:
% - Read source file.
fid = fopen ('Air.SIT');
n = 1;
textline={};
while (~feof(fid))
textline(n,1)={fgetl(fid)};
end
FileName=Air;
% - Replace characters when relevant.
for i = 1 : 5000
filename = sprintf('%s_%d.SIT','filename',i);
end
Can anybody help me finish the program?
Thanks,
James
If your file is this short, you do not have to read it line by line. Just read the whole file into one variable, modify only the necessary part of it before each write, then write the full variable back in one go.
%% // read the full file as a long sequence of 'char'
fid = fopen ('Air.SIT');
fulltext = fread(fid,Inf,'char') ;
fclose(fid) ;
%% // add a few blank placeholders (3 exactly) to hold the 4 digits needed when counting up to 5000
fulltext = [fulltext(1:49) ; 32 ; 32 ; 32 ; fulltext(50:end) ] ;
idx2replace = 50:53 ; %// save the index of the characters which will be modified each loop
%% // Go for it
baseFileName = 'AIR_%d.SIT' ;
for iFile = 1:1000:5000
%// build filename
filename = sprintf(baseFileName,iFile);
%// modify the string to write
fulltext(idx2replace) = num2str(iFile,'%04d').' ; %//'
%// write the file
fidw = fopen( filename , 'w' ) ;
fwrite(fidw,fulltext) ;
fclose(fidw) ;
end
This example works with the text in your example; you may have to slightly adjust the indices of the characters to replace if your real case is different.
Also, I set a step of 1000 for the loop to let you try it and see if it works without writing thousands of files. When you are satisfied with the result, remove the 1000 step in the for loop.
Edit:
The format specifier %04d I gave in the first solution ensures the output will take 4 characters, padding any smaller number with zeros (e.g. 23 => 0023). It is sometimes desirable to keep the length constant, and in your particular example it made things easier because the output string would be exactly the same length for all the files.
However, it is not mandatory at all. If you do not want the loop number to be padded with zeros, you can use the simple format %d. This will only use the required number of digits.
The side effect is that the output string will have a different length for different loop numbers, so we cannot reuse one string for all the iterations; we have to recreate the string at each iteration. The modifications are as follows. Keep the first paragraph of the solution above as is, and replace the last 2 paragraphs with the following:
%% // prepare the block of text before and after the character to change
textBefore = fulltext(1:49) ;
textAfter = fulltext(51:end) ;
%% // Go for it
baseFileName = 'AIR_%d.SIT' ;
for iFile = 1:500:5000
%// build filename
filename = sprintf(baseFileName,iFile);
%// rebuild the string to write
fulltext = [textBefore ; num2str(iFile,'%d').' ; textAfter ]; %//'
%// write the file
fidw = fopen( filename , 'w' ) ;
fwrite(fidw,fulltext) ;
fclose(fidw) ;
end
Note:
A constant number of characters for the number may not be important inside the file, but it can be very useful for your file names to be named AIR_0001 ... AIR_0023 ... AIR_0849 ... AIR_4357 etc., because in a list they will appear properly ordered in any explorer window.
If you want your files named with constant-length numbers, then just use:
baseFileName = 'AIR_%04d.SIT' ;
instead of the current line.
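To illustrate the difference between the two format specifiers and its effect on how file names sort, here is a small sketch. The loop numbers in ids are arbitrary values chosen for the demo, and sort is used as a stand-in for the alphabetical ordering a file explorer might apply.

```matlab
% Zero-padded to four characters versus only the digits needed
padded   = sprintf('%04d', 23);      % '0023'
unpadded = sprintf('%d', 23);        % '23'

% Arbitrary loop numbers, used only to compare the sort order of the names
ids = [2 10 1];
names_plain  = arrayfun(@(k) sprintf('AIR_%d.SIT', k),   ids, 'UniformOutput', false);
names_padded = arrayfun(@(k) sprintf('AIR_%04d.SIT', k), ids, 'UniformOutput', false);

% Character-wise sorting puts AIR_10 before AIR_2 unless the numbers are padded
sorted_plain  = sort(names_plain);
sorted_padded = sort(names_padded);

disp(sorted_plain)
disp(sorted_padded)
```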