I have a series of DICOM Images which I want to anonymize, I found few Matlab codes and some programs which do the job, but none of them export a .txt file of removed personal information. I was wondering if there is a function which can also save removed personal information of a DICOM images in .txt format for features uses. Also, I am trying to create a table which shows the corresponding new images ID to their real name.(subjects real name = personal-information-removed image ID)
Any thoughts?
Thanks for considering my request!
I'm guessing you only want to output to your text file the fields that are changed by anonymization (either modified, removed, or added). First, you may want to modify some dicomanon options to reduce the number of changes, in particular passing the arguments 'WritePrivate', true to ensure private extensions are kept.
First, you can perform the anonymization, saving structures of pre- and post-anonymization metadata using dicominfo:
preAnonData = dicominfo('input_file.dcm');
dicomanon('input_file.dcm', 'output_file.dcm', 'WritePrivate', true);
postAnonData = dicominfo('output_file.dcm');
Then you can use fieldnames and setdiff to find fields that are removed or added by anonymization, and add them to the post-anonymization or pre-anonymization data, respectively, with a nan value as a place holder:
preFields = fieldnames(preAnonData);
postFields = fieldnames(postAnonData);
removedFields = setdiff(preFields, postFields);
for iField = 1:numel(removedFields)
postAnonData.(removedFields{iField}) = nan;
end
addedFields = setdiff(postFields, preFields);
for iField = 1:numel(addedFields)
preAnonData.(addedFields{iField}) = nan;
end
It will also be helpful to use orderfields so that both data structures have the same ordering for their field names:
postAnonData = orderfields(postAnonData, preAnonData);
Finally, now that each structure has the same fields in the same order we can use struct2cell to convert their field data to a cell array and use cellfun and isequal to find any fields that have been modified by the anonymization:
allFields = fieldnames(preAnonData);
preAnonCell = struct2cell(preAnonData);
postAnonCell = struct2cell(postAnonData);
index = ~cellfun(#isequal, preAnonCell, postAnonCell);
modFields = allFields(index);
Now you can create a table of the changes like so:
T = table(modFields, preAnonCell(index), postAnonCell(index), ...
'VariableNames', {'Field', 'PreAnon', 'PostAnon'});
And you could use writetable to easily output the table data to a text file:
writetable(T, 'anonymized_data.txt');
Note, however, that if any of the fields in the table contain vectors or structures of data, the formatting of your output file may look a little funky (i.e. lots of columns, most of them empty, except for those few fields).
One way to do this is to store the tags before and after anonymisation and use these to write your text file. In Matlab, dicominfo() will read the tags into a structure:
% Get tags before anonymization
tags_before = dicominfo(file_in);
% Anoymize
dicomanon(file_in, file_out); % Need to set tags values where required
% Get tags after anonymization
tags_after = dicominfo(file_out);
% Do something with the two structures
disp(['Patient ID:', tags_before.PatientID ' -> ' tags_after.PatientID]);
disp(['Date of Birth:', tags_before.PatientBirthDate ' -> ' tags_after.PatientBirthDate]);
disp(['Family Name:', tags_before.PatientName.FamilyName ' -> ' tags_after.PatientName.FamilyName]);
You can then write out the before/after fields into a text file. You'd need to modify dicomanon() to choose your own values for the removed fields, since by default they are set to empty.
Related
%% File Names reading and label generation
dataFolder= 'allcontent';
fileNames = dir([dataFolder 'c*.*']);
lbl = sscanf(cat(1,'fileNames.name'),'co2%c%d.rd.%d');
status = lbl(1:3:end);
id = lbl(2:3:end);
ids = unique(id);
trial = lbl(3:3:end);
I want to concatenate the names of all the files in the folder titled all content , at the moment, matlab doesn't understand what allcontent is. Can someone help me get the contents of the folder ' all content' which are of the form 'c*.*' and then concatenate them?
You can use fullfile to concatenate paths in Matlab, i.e.
fileNames = dir(fullfile(dataFolder, 'c*.*'));
Also, I don't think fileNames.name should be in quotes. As #Wolfie mentioned, you can concatenate the filenames into a cell array using {fileNames.name}
filenames_array = {fileNames.name}
Then you can iterate over filenames_array using for or cellfun
this question about matlab:
i'm running a loop and each iteration a new set of data is produced, and I want it to be saved in a new file each time. I also overwrite old files by changing the name. Looks like this:
name_each_iter = strrep(some_source,'.string.mat','string_new.(j).mat')
and what I#m struggling here is the iteration so that I obtain files:
...string_new.1.mat
...string_new.2.mat
etc.
I was trying with various combination of () [] {} as well as 'string_new.'j'.mat' (which gave syntax error)
How can it be done?
Strings are just vectors of characters. So if you want to iteratively create filenames here's an example of how you would do it:
for j = 1:10,
filename = ['string_new.' num2str(j) '.mat'];
disp(filename)
end
The above code will create the following output:
string_new.1.mat
string_new.2.mat
string_new.3.mat
string_new.4.mat
string_new.5.mat
string_new.6.mat
string_new.7.mat
string_new.8.mat
string_new.9.mat
string_new.10.mat
You could also generate all file names in advance using NUM2STR:
>> filenames = cellstr(num2str((1:10)','string_new.%02d.mat'))
filenames =
'string_new.01.mat'
'string_new.02.mat'
'string_new.03.mat'
'string_new.04.mat'
'string_new.05.mat'
'string_new.06.mat'
'string_new.07.mat'
'string_new.08.mat'
'string_new.09.mat'
'string_new.10.mat'
Now access the cell array contents as filenames{i} in each iteration
sprintf is very useful for this:
for ii=5:12
filename = sprintf('data_%02d.mat',ii)
end
this assigns the following strings to filename:
data_05.mat
data_06.mat
data_07.mat
data_08.mat
data_09.mat
data_10.mat
data_11.mat
data_12.mat
notice the zero padding. sprintf in general is useful if you want parameterized formatted strings.
For creating a name based of an already existing file, you can use regexp to detect the '_new.(number).mat' and change the string depending on what regexp finds:
original_filename = 'data.string.mat';
im = regexp(original_filename,'_new.\d+.mat')
if isempty(im) % original file, no _new.(j) detected
newname = [original_filename(1:end-4) '_new.1.mat'];
else
num = str2double(original_filename(im(end)+5:end-4));
newname = sprintf('%s_new.%d.mat',original_filename(1:im(end)-1),num+1);
end
This does exactly that, and produces:
data.string_new.1.mat
data.string_new.2.mat
data.string_new.3.mat
...
data.string_new.9.mat
data.string_new.10.mat
data.string_new.11.mat
when iterating the above function, starting with 'data.string.mat'
I am new to matlab and image analysis. I would really appreciate some insight/help into the following problem. I am trying to rename images (jpg) in a folder that have a random name into specific (new) names. I made the an excel file with two columns the first column contains the old names and the second column the new names. I found the next code on stack overflow (Rename image file name in matlab):
dirData = dir('*.jpg'); %# Get the selected file data
fileNames = {dirData.name}; %# Create a cell array of file names
for iFile = 1:numel(fileNames) %# Loop over the file names
newName = sprintf('image%05d.jpg',iFile); %# Make the new name
movefile(fileNames{iFile},newName); %# Rename the file
end
The code gives all the photos a new name based on the old one but that is not what I want, the new names I use are not linked to the old ones. I tried the following code :
g= xlsread('names.xlsx') % names.xlsx the excel file with old and new names
for i=1:nrows(g)
image=open(g(i,1));
save(g(i,2),image);
end
It doesn't work. I get the error message :using open (line 68)
NAME must contain a single string. I don't think open is the right function to use.
Thank you !
How about this:
% Get the jpeg names
dir_data = dir('*.jpg');
jpg_names = {dir_data.name};
% Rename the files according to the .xlsx file
[~,g,~] = xlsread('names.xlsx'); % Thanks to the post of Sean
old_names = g(:,1);
new_names = g(:,2);
for k = 1:numel(old_names)
% check if the file exists
ix_file = strcmpi(old_names{k}, jpg_names);
if any(ix_file)
movefile(old_names{k}, new_names{k});
end;
end;
It seems like g = xlsread(file) reads only numeric data from file (see here). The third output argument of xlsread returns raw data including strings; does the following work? (i don't have excel and can't test it)
[~, ~, g] = xlsread('names.xlsx')
for i = 1:nrows(g)
movefile(g{i,1}, g{i,2})
end
Here is my code. I have 59 CSV files in my directory, I have 11 variables in each, with first one date in quarter format. I need to tell MATLAB that first column has date format, because the code below imports it as a string variable.
ext = '.csv';
countries = dir(['*', ext]);
countryFiles = {countries.name};
countriesNames = strrep(countryFiles, ext, '');
%%
a = cell(length(countriesNames), 2);
a(:,1) = countriesNames(:)';
a(:,2) = cellfun(#(file) readtable(file, 'TreatAsEmpty', '.','Format'?), countryFiles(:), 'uni', 0);
As far as I understand this is option 'Format' with datenum in readtable... However, I can`t find useful information on it in helpfiles and whatever I try I get errors... Below how my data looks for each country. Data is 1980Q1-2014Q4.
So I have 59*2 cell array with first column representing countryNames and second column contains 59 140*11 tables. First I do not know how to access variable names within cell array`s table. Second problem is that if you try to define x to a column in that table and then datenum(x,'YYYYQQ') I get "Input expected to be a cell array, was table instead."
So, I need to extract from cell a tables in a{1,2}, a{2,2}, a{3,2} ... then convert those tables to cellarray using table2cell option. How to do it using loop for each country and then write it back to the cell?
I have an array which I read into Matlab with importdata. It has 5 header lines
file = 'aoao.csv';
s = importdata(file,',', 5);
Matlab automatically treats the last line as the column header. I can then call up the column number that I want with
s.data(:,n); %n is desired column number
I want to be able to load up many similar files at once, and then call up the columns in the different files which have the same column header name (which are not necessarily the same column number). I want to be able to write and export all of these columns together into a new matrix, preferably with each column labelled with its file name,
what should I do?
samp = 'len-c.mp3'; %# define desired sample/column header name
file = dir('*.csv');
have the directory ready in main screen current folder. This creates a detailed description of file,
for i=1:length(file)
set(i) = importdata(file(i).name,',', 5);
end
this imports the data from each of the files (comma delimited, 5 header lines) and transports it to a cell array called 'set'
for k = 1:14;
for i=1:length(set(k).colheaders)
TF = strcmp(set(k).colheaders(i),samp); %compares strings for match
if TF == 1; %if match is true
group(:,k) = set(k).data(:,i); %save matching column# to 'group'
end
end
end
this retrieves the data from the named colheader within each file