I am trying to write out a matrix whose elements are not all printed with the same number of digits.
Say element_1 = 0.1234567 and element_2 = 0.1; I need element_2 to be written as 0.1000000 so that both have the same width.
clc;clear all
a = rand(4,12);
COL_Names ={'This_is_Colu_No_1','This_is_Colu_No_2','This_is_Colu_No_3','This_is_Colu_No_4','This_is_Colu_No_5','This_is_Colu_No_6','This_is_Colu_No_7','This_is_Colu_No_8','This_is_Colu_No_9','This_is_Colu_No_10','This_is_Colu_No_11','This_is_Colu_No_12'};
rowNames = {'ROW1';'ROW2';'ROW3';'ROW4'};
T = array2table(a,'VariableNames',COL_Names,'RowNames',rowNames);
writetable(T,'Data.txt','Delimiter','\t','WriteRowNames',true);
type Data.txt ;
The output looks like this:
Row This_is_Colu_No_1 This_is_Colu_No_2 This_is_Colu_No_3 This_is_Colu_No_4 This_is_Colu_No_5 This_is_Colu_No_6 This_is_Colu_No_7 This_is_Colu_No_8 This_is_Colu_No_9 This_is_Colu_No_10 This_is_Colu_No_11 This_is_Colu_No_12
ROW1 0.139740979774291 0.231035232035157 0.347778782863186 0.279682446566279 0.060995054119542 0.233212699943628 0.507599581908539 0.833087779293817 0.552819386888535 0.43251811668393 0.342580158122272 0.420574544492339
ROW2 0.00459708931895875 0.703626845695885 0.33064632159971 0.85782393462353 0 0.935097755896966 0.582441521353621 0.155241648807001 0.163717355897126 0.48985529896707 0.0134551766978835 0.810989133317225
ROW3 0.791254563282513 0.650747335567064 0.293769172888192 0.15110222627643 0.962791661993452 0.842147123142386 0.586462512126695 0.109349751268813 1 0.00525695361457879 0.700826048054212 0.989915984093474
ROW4 0.513993416249574 0.868158891144176 0.293769172888 0.552496163682282 0.301098948730568 0.779790450269442 0.420527994140777 0.523231514251179 0.0602548802340035 0.261436547849062 0.84923648156472 0.433189006269314
I think you need to do it manually, using fprintf with a formatSpec:
clc, clear, rng(3);
a = rand(4, 3);
colNames = {'This_is_Colu_No_9', 'This_is_Colu_No_10', 'This_is_Colu_No_11'};
rowNames = {'ROW98'; 'ROW99'; 'ROW100'; 'ROW101'};
formatSpecHead = '%-6s %-22s %-22s %-22s\n';
formatSpecRow = '%-6s %.20f %.20f %.20f\n';
fid = fopen('a.fwf', 'w');
fprintf(fid, formatSpecHead, 'Row', colNames{:}); % write header
for row = 1:size(a, 1)
fprintf(fid, formatSpecRow, rowNames{row}, a(row, :)); % write row
end
fclose(fid);
Then a.fwf looks like:
Row This_is_Colu_No_9 This_is_Colu_No_10 This_is_Colu_No_11
ROW98 0.55079790257457550418 0.89294695434765469777 0.05146720330082987793
ROW99 0.70814782261810482744 0.89629308893343806464 0.44080984365063646813
ROW100 0.29090473891294432729 0.12558531046383625274 0.02987621087856695556
ROW101 0.51082760519766301499 0.20724287813818675907 0.45683322439471107934
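For an arbitrary number of columns, the format string can be built with repmat instead of being typed out by hand. The following is a minimal sketch, assuming seven decimals are wanted (as in the question's 0.1000000 example); the small matrix and the variable names are made up for illustration.

```matlab
% Illustrative data: elements that would otherwise print with different widths.
a = [0.1234567 0.1; 0 1];
nCols = size(a, 2);

% One '%.7f' per column, tab-separated, with a newline closing each row.
fmt = [repmat('%.7f\t', 1, nCols - 1) '%.7f\n'];

% sprintf consumes values column by column, so pass the transpose
% to emit the matrix row by row. Use fprintf(fid, fmt, a.') to write a file.
s = sprintf(fmt, a.');
disp(s)
```

A double holds only about 15 to 17 significant decimal digits, so a precision such as %.20f pads the output with digits that carry no information.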
I'm trying to read a complex CSV file in MATLAB and edit/rearrange it. I used readtable to pull out the numerical data, which I then manipulated. Afterwards I wanted to take the first 52 rows of text metadata, add three more rows, combine them with the numerical data, and write the result out as a CSV file.
However, I can't find a way to work with the text metadata rows that doesn't end up adding new lines between them in the output.
For example, this is how my CSV currently comes out:
#HEADER
,,,,,,
Instrument, Data Processed Analysis
,,,,,
Version,1.3.1,1.25
,,,,
Created,Monday, August 16, 2021 3:13:38 PM
,,,
Filename,D5-116.NGXDP
,,,,,
When it should look like:
#HEADER
NGX, Data Processed Analysis
Version,1.3.1,1.25
Created,Monday, August 16, 2021 3:13:38 PM
Filename,D5-116.NGXDP
Here is my code:
close
clear
fileName = 'D5-116.NGXDP';
fileOut = fileName(1:end-6);
fileOut = append(fileOut, '.csv');
eEnergy = 1.602176634e-19; %electron energy in coulombs
resisTor = 1e11; %Resistor value for EM
%% Reading and Edit Mass Data - this is too hardcoded right now
T = readtable(fileName, 'FileType', 'text');
T = T{:,:}; %convert from table to matrix
dataOrg(:,1:3) = T(1087:1236, 1:3); %cycle:time:Ar
dataOrg(:,4) = T(935:1084, 3); %39
dataOrg(:,5) = T(783:932, 3); %38
dataOrg(:,6) = T(631:780, 3); %37
dataOrg(:,7) = T(479:628, 3); %36
dataOrg(:,7) = dataOrg(:,7) * eEnergy * resisTor; %36Ar - convert CPS to V
dataOrg(:,3:7) = dataOrg(:,3:7) * 1000; %Convert all to mV
%% Reading the metadata from the file
T2 = fopen(fileName);
metaData = fread(T2, '*char')'; %read content
fclose(T2);
results = strsplit(metaData, '\n'); %split the char data by entry spaces
metaNeat = results(:,1:52)'; %Takes the metadata prior to #collectors
metaNeat(end+1, 1) = {'Blocks, 1'}; %added data
metaNeat(end+1, 1) = {'Cycles, 150'}; %added data
metaNeat(end+1, 1) = {'Cycle, Time, 40, 39, 38, 37, 36'};
for n = 1:length(metaNeat) %scan each row, if there is a comma split them
if contains(metaNeat(n,1), ',') == 1
tempMeta = split(metaNeat(n,1), ',')';
metaNeat(n,1:size(tempMeta, 2)) = tempMeta;
clear tempMeta
end
end
outPut = metaNeat; %metadata section
outPut(end+1:(end+150), 1:7) = num2cell(dataOrg); %mass value section
writecell(outPut,fileOut,'Delimiter',',', 'FileType', 'text')
I'm sure there is a way to do this in three lines of course. My guess is that the strsplit function I'm using to separate the char cell into multiple rows is preserving the \n value. The added lines (e.g. Blocks, 1) do not show gaps in the csv file. Any tips or advice would be appreciated.
Thanks for your time,
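One possible cause, offered as an assumption since the file itself is not shown: if the file uses Windows line endings (CRLF), splitting on '\n' alone leaves a carriage return at the end of every piece, and that stray character can show up as an extra line break when the cells are written out. The sketch below uses a made-up three-line header to show the difference between splitting on '\n' and splitting on an optional '\r' followed by '\n'.

```matlab
% Made-up header text with CRLF line endings (assumption about the real file).
raw = sprintf('#HEADER\r\nVersion,1.3.1,1.25\r\nFilename,D5-116.NGXDP\r\n');

% Splitting on '\n' only: every piece still ends with a carriage return.
naive = strsplit(raw, '\n');
hasCR = naive{1}(end) == char(13);   % true for CRLF input

% Splitting on an optional '\r' followed by '\n' removes both characters.
clean = regexp(raw, '\r?\n', 'split');
clean = clean(~cellfun('isempty', clean));   % drop the empty trailing piece

disp(clean(:))
```

If hasCR turns out to be false for the real file, the extra lines have a different origin and this sketch does not apply.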
I have two arrays that I plot against each other: A and B, both 1x101 vectors.
A = [0.140673450903833 0.143148937279028 0.145430952171596 0.147474938627147 0.149581060870114 0.151187105347571 0.152646348246015 0.153892222566265 0.154913060187075 0.155701930397674 0.156253328260122 0.156562551841967 0.156625533585493 0.156438787610539 0.155999394209637 0.155304997895017 0.154353810555534 0.153144616301392 0.151676776486360 0.149950234280632 0.147965519042205 0.145723755995511 0.143229676241241 0.140476287800831 0.137475884805212 0.134228713435530 0.130738812449387 0.127010778531129 0.123049766057659 0.118861478234099 0.114452155847321 0.109828564345449 0.104997979409803 0.0999681710919947 0.0947473865690700 0.0893443315667412 0.0837681505026921 0.0780284054055798 0.0721350536699391 0.0660984247128685 0.0599291956061627 0.0536383657701255 0.0472372308396036 0.0407373558685518 0.0341505481893574 0.0274888307202605 0.0207644183921863 0.0139897104812690 0.00717740846258673 0.000358034181698980 0.00651349557333709 0.0133637955715171 0.0202018404602034 0.0270147225489001 0.0337898204971252 0.0405146415132138 0.0471768260462406 0.0537641715916784 0.0602646603043279 0.0666664873507057 0.0729580891146200 0.0791281709097673 0.0851657340195109 0.0910601019446384 0.0968009457657087 0.102378308539557 0.107782628657363 0.113004762097380 0.118036003510261 0.122868106079509 0.127493300104313 0.131904310257409 0.136094371477732 0.140057243469708 0.143787223810476 0.147279159770258 0.150528459504324 0.153531108836772 0.156280444813554 0.158783035106175 0.161027296288627 0.163014562505352 0.164743731117677 0.166214276765471 0.167426257040343 0.168380310331524 0.169077651806683 0.169520068571722 0.169709914896378 0.169650109087113 0.169344135453180 0.168796059816963 0.168010582212876 0.166993205517562 0.165750858213848 0.164295206012858 0.162692813100379 0.160590402150861 0.158550181408264 0.156271984944015 0.153800366335689]
B = [-2 -1.96000000000000 -1.92000000000000 -1.88000000000000 -1.84000000000000 -1.80000000000000 -1.76000000000000 -1.72000000000000 -1.68000000000000 -1.64000000000000 -1.60000000000000 -1.56000000000000 -1.52000000000000 -1.48000000000000 -1.44000000000000 -1.40000000000000 -1.36000000000000 -1.32000000000000 -1.28000000000000 -1.24000000000000 -1.20000000000000 -1.16000000000000 -1.12000000000000 -1.08000000000000 -1.04000000000000 -1 -0.960000000000000 -0.920000000000000 -0.880000000000000 -0.840000000000000 -0.800000000000000 -0.760000000000000 -0.720000000000000 -0.680000000000000 -0.640000000000000 -0.600000000000000 -0.560000000000000 -0.520000000000000 -0.480000000000000 -0.440000000000000 -0.400000000000000 -0.360000000000000 -0.320000000000000 -0.280000000000000 -0.240000000000000 -0.200000000000000 -0.160000000000000 -0.120000000000000 -0.0800000000000001 -0.0400000000000000 0 0.0400000000000000 0.0800000000000001 0.120000000000000 0.160000000000000 0.200000000000000 0.240000000000000 0.280000000000000 0.320000000000000 0.360000000000000 0.400000000000000 0.440000000000000 0.480000000000000 0.520000000000000 0.560000000000000 0.600000000000000 0.640000000000000 0.680000000000000 0.720000000000000 0.760000000000000 0.800000000000000 0.840000000000000 0.880000000000000 0.920000000000000 0.960000000000000 1 1.04000000000000 1.08000000000000 1.12000000000000 1.16000000000000 1.20000000000000 1.24000000000000 1.28000000000000 1.32000000000000 1.36000000000000 1.40000000000000 1.44000000000000 1.48000000000000 1.52000000000000 1.56000000000000 1.60000000000000 1.64000000000000 1.68000000000000 1.72000000000000 1.76000000000000 1.80000000000000 1.84000000000000 1.88000000000000 1.92000000000000 1.96000000000000 2];
Plotting them with plot(B,A) gives a curve with two maxima, at B = -1.52 and B = +1.52.
I want to automatically add a marker at each of the two maxima, a horizontal line above the highest point, and a two-way arrow pointing from that line to the second peak.
I tried sorting A to find the positions of the two maxima:
[val ind] = sort(A,'descend');
max_values = val(1:2)
index = ind(1:2)
r_max = A(ind(1:2))
but the second peak is not in the second position of val, because the sorted values look like this:
Columns 1 through 13
0.1697 0.1697 0.1695 0.1693 0.1691 0.1688 0.1684 0.1680 0.1674 0.1670 0.1662 0.1658 0.1647
Columns 14 through 26
0.1643 0.1630 0.1627 0.1610 0.1606 0.1588 0.1586 0.1566 0.1566 0.1564 0.1563 0.1563 0.1563
The first value, 0.1697 (in this case), is the correct one, but the second peak is not in the second position; it is in the 22nd.
How can I easily get the two maximum points that are visible in the plot?
Once I know the two coordinates, I can easily add all the objects that I need.
Using findpeaks (requires the Signal Processing Toolbox), yline (introduced in R2018b) and annotation:
A = [0.140673450903833 0.143148937279028 0.145430952171596 0.147474938627147 0.149581060870114 0.151187105347571 0.152646348246015 0.153892222566265 0.154913060187075 0.155701930397674 0.156253328260122 0.156562551841967 0.156625533585493 0.156438787610539 0.155999394209637 0.155304997895017 0.154353810555534 0.153144616301392 0.151676776486360 0.149950234280632 0.147965519042205 0.145723755995511 0.143229676241241 0.140476287800831 0.137475884805212 0.134228713435530 0.130738812449387 0.127010778531129 0.123049766057659 0.118861478234099 0.114452155847321 0.109828564345449 0.104997979409803 0.0999681710919947 0.0947473865690700 0.0893443315667412 0.0837681505026921 0.0780284054055798 0.0721350536699391 0.0660984247128685 0.0599291956061627 0.0536383657701255 0.0472372308396036 0.0407373558685518 0.0341505481893574 0.0274888307202605 0.0207644183921863 0.0139897104812690 0.00717740846258673 0.000358034181698980 0.00651349557333709 0.0133637955715171 0.0202018404602034 0.0270147225489001 0.0337898204971252 0.0405146415132138 0.0471768260462406 0.0537641715916784 0.0602646603043279 0.0666664873507057 0.0729580891146200 0.0791281709097673 0.0851657340195109 0.0910601019446384 0.0968009457657087 0.102378308539557 0.107782628657363 0.113004762097380 0.118036003510261 0.122868106079509 0.127493300104313 0.131904310257409 0.136094371477732 0.140057243469708 0.143787223810476 0.147279159770258 0.150528459504324 0.153531108836772 0.156280444813554 0.158783035106175 0.161027296288627 0.163014562505352 0.164743731117677 0.166214276765471 0.167426257040343 0.168380310331524 0.169077651806683 0.169520068571722 0.169709914896378 0.169650109087113 0.169344135453180 0.168796059816963 0.168010582212876 0.166993205517562 0.165750858213848 0.164295206012858 0.162692813100379 0.160590402150861 0.158550181408264 0.156271984944015 0.153800366335689];
B = [-2 -1.96000000000000 -1.92000000000000 -1.88000000000000 -1.84000000000000 -1.80000000000000 -1.76000000000000 -1.72000000000000 -1.68000000000000 -1.64000000000000 -1.60000000000000 -1.56000000000000 -1.52000000000000 -1.48000000000000 -1.44000000000000 -1.40000000000000 -1.36000000000000 -1.32000000000000 -1.28000000000000 -1.24000000000000 -1.20000000000000 -1.16000000000000 -1.12000000000000 -1.08000000000000 -1.04000000000000 -1 -0.960000000000000 -0.920000000000000 -0.880000000000000 -0.840000000000000 -0.800000000000000 -0.760000000000000 -0.720000000000000 -0.680000000000000 -0.640000000000000 -0.600000000000000 -0.560000000000000 -0.520000000000000 -0.480000000000000 -0.440000000000000 -0.400000000000000 -0.360000000000000 -0.320000000000000 -0.280000000000000 -0.240000000000000 -0.200000000000000 -0.160000000000000 -0.120000000000000 -0.0800000000000001 -0.0400000000000000 0 0.0400000000000000 0.0800000000000001 0.120000000000000 0.160000000000000 0.200000000000000 0.240000000000000 0.280000000000000 0.320000000000000 0.360000000000000 0.400000000000000 0.440000000000000 0.480000000000000 0.520000000000000 0.560000000000000 0.600000000000000 0.640000000000000 0.680000000000000 0.720000000000000 0.760000000000000 0.800000000000000 0.840000000000000 0.880000000000000 0.920000000000000 0.960000000000000 1 1.04000000000000 1.08000000000000 1.12000000000000 1.16000000000000 1.20000000000000 1.24000000000000 1.28000000000000 1.32000000000000 1.36000000000000 1.40000000000000 1.44000000000000 1.48000000000000 1.52000000000000 1.56000000000000 1.60000000000000 1.64000000000000 1.68000000000000 1.72000000000000 1.76000000000000 1.80000000000000 1.84000000000000 1.88000000000000 1.92000000000000 1.96000000000000 2];
plot(B,A)
% Find peaks.
[maxValuesY,isMaxY]=findpeaks(A);
maxValuesX = B(isMaxY);
% Plot horizontal line.
yline(maxValuesY(2));
% Create arrow.
ar = annotation('arrow');
ar.Parent = gca;
ar.X = [maxValuesX(1), maxValuesX(1)];
ar.Y = [maxValuesY(2), maxValuesY(1)];
ar.Color = 'black';
ar.HeadLength = 3;
Thanks to marsei for tip on the position of annotation.
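If the Signal Processing Toolbox is not available, local maxima can also be found by comparing each sample with its two neighbours. This is a minimal sketch on a short made-up vector, not a replacement for findpeaks: it ignores the end points and misses flat-topped peaks, and all variable names are illustrative.

```matlab
% Short made-up signal with two local maxima (at positions 2 and 6).
A = [1 3 2 0 1 5 4];

% An interior sample is a peak if it is strictly larger than both neighbours.
isPeak = [false, A(2:end-1) > A(1:end-2) & A(2:end-1) > A(3:end), false];
peakIdx = find(isPeak);

% Order the peaks by height and keep the two highest.
[~, order] = sort(A(peakIdx), 'descend');
topTwo = peakIdx(order(1:2));

disp(topTwo)
```

With the vectors from the question, B(topTwo) would then give the x-coordinates of the two peaks.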
If you specifically have such plots, you can go with the following solution, which just excludes n neighbours around the first found maximum.
% Input (copy from above...)
A = [ .. ];
B = [ .. ];
% Index of max value.
[max_val, max_idx] = max(A);
% Find second max value by excluding n neighbourhood.
n = 10;
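AA = A;
lo = max(1, max_idx - n); % clamp the window to the array bounds
hi = min(numel(A), max_idx + n);
AA(lo:hi) = -Inf; % mask the neighbourhood, keeping indices aligned with A
[sec_max_val, sec_max_idx] = max(AA);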
% Output.
figure(1);
hold on;
% Graph.
plot(B, A);
% Black line.
plot([B(1) B(end)], [max_val max_val], 'k');
% Black arrow.
p1 = [B(sec_max_idx) B(sec_max_idx)];
p2 = [max_val sec_max_val];
dp = p2 - p1;
quiver(p1(1), p2(1), p1(2) - p1(1), p2(2) - p2(1), 0, 'k');
hold off;
This gives the plot with the horizontal line and the arrow.
I have a text file with a data structure like this:
30,311.263671875,158.188034058,20.6887207031,17.4877929688,0.000297248129755,aeroplane
30,350.668334961,177.547393799,19.1939697266,18.3677368164,0.00026999923648,aeroplane
30,367.98135376,192.697219849,16.7747192383,23.0987548828,0.000186387239864,aeroplane
30,173.569274902,151.629364014,38.0069885254,37.5704650879,0.000172595537151,aeroplane
30,553.904602051,309.903320312,660.893981934,393.194030762,5.19620243722e-05,aeroplane
30,294.739196777,156.249740601,16.3522338867,19.8487548828,1.7795707663e-05,aeroplane
30,34.1946258545,63.4127349854,475.104492188,318.754821777,6.71026540999e-06,aeroplane
30,748.506652832,0.350944519043,59.9415283203,28.3256549835,3.52978979379e-06,aeroplane
30,498.747009277,14.3766479492,717.006652832,324.668731689,1.61551643174e-06,aeroplane
30,81.6389465332,498.784301758,430.23046875,210.294677734,4.16855394647e-07,aeroplane
30,251.932098389,216.641052246,19.8385009766,20.7131652832,3.52147743106,bicycle
30,237.536972046,226.656692505,24.0902862549,15.7586669922,1.8601918593,bicycle
30,529.673400879,322.511322021,25.1921386719,21.6920166016,0.751171214506,bicycle
30,255.900146484,196.583847046,17.1589355469,27.4430847168,0.268321367912,bicycle
30,177.663650513,114.458488464,18.7516174316,16.6759414673,0.233057001606,bicycle
30,436.679382324,273.383331299,17.4342041016,19.6081542969,0.128449092153,bicycle
I want to index that file against a label file, so that the result looks something like this:
60,509.277435303,284.482452393,26.1684875488,31.7470092773,0.00807665128377,15
60,187.909835815,170.448471069,40.0388793945,58.8763122559,0.00763951029512,15
60,254.447280884,175.946624756,18.7212677002,21.9440612793,0.00442053096776,15
However, some classes may not appear in the label file, and I need to filter those lines out so that I can read the result with load() (load() cannot handle a text file that contains characters).
Here is my implementation:
function test(vName,meta)
f_dt = fopen([vName '.txt'],'r');
f_indexed = fopen([vName '_indexed.txt'], 'w');
lbls = loadlbl()
count = 1;
while(true),
if(f_dt == -1),
break;
end
dt = fgets(f_dt);
if(dt == -1),
break
else
parts = strsplit(dt,','); % chained indexing such as f(x){7} is not valid MATLAB
dt_cls = parts{7};
dt_cls = regexprep(dt_cls, '\s+', '');
cls_idx = find(strcmp(lbls,dt_cls));
if(~isempty(cls_idx))
dt = strrep(dt,dt_cls,int2str(cls_idx));
fprintf(f_indexed,'%s',dt); % do not use the data itself as the format string
end
end
end
fclose(f_indexed);
if(f_dt ~= -1),
fclose(f_dt);
end
end
However, it runs very slowly because the text file contains about 100 thousand lines. Is there a smarter and faster way to do this?
You can use textscan to read all lines at once and get the indices (line numbers) of the labels you want. Once you know the line numbers, you can extract what you need.
fid = fopen('data.txt') ;
S = textscan(fid,'%s','delimiter','\n') ;
S = S{1} ;
fclose(fid) ;
%% get bicycle lines
idx = strfind(S, 'bicycle');
idx = find(not(cellfun('isempty', idx)));
S_bicycle = S(idx)
%% write this to text file
fid2 = fopen('text.txt','wt') ;
fprintf(fid2,'%s\n',S_bicycle{:});
fclose(fid2) ;
From S_bicycle, you can extract your numbers.
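The same idea can be extended to map every label to its index in one pass, without a per-line loop. The following is a sketch under assumptions: S stands for the cell array of lines read by textscan, lbls is a hypothetical label list (the question's loadlbl() is not shown), and the class name is taken to be the last comma-separated field.

```matlab
% Stand-in for the cell array S of text lines returned by textscan.
S = {'30,1.5,2.5,aeroplane'; '30,3.5,4.5,bicycle'; '30,5.5,6.5,unicorn'};
% Hypothetical label list; a label's position is used as its class index.
lbls = {'aeroplane', 'bicycle'};

% Last comma-separated field of every line, trimmed of stray whitespace.
cls = strtrim(regexp(S, '[^,]+$', 'match', 'once'));

% known(k) is true if line k has a listed label; idx(k) is that label's index.
[known, idx] = ismember(cls, lbls);

% Drop lines with unknown classes, then swap the label text for its index.
prefix = regexprep(S(known), '[^,]+$', '');
numStr = arrayfun(@(k) sprintf('%d', k), idx(known), 'UniformOutput', false);
out = strcat(prefix, numStr);

disp(out)
```

The result can be written with fprintf(fid, '%s\n', out{:}), after which the file contains only numbers.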
My code looks like this:
PosHotspot = dataset('file', 'PositiveHotspotpos.txt', 'Delimiter', '\t');
a = 2;
exon_end = PosHotspot.total_exon;
exonposition = PosHotspot.ExonPos;
Isoformnumber = PosHotspot.Isoform;
fileID = fopen('PosHotspot_results.txt', 'w')
for j = 1:660
exon = exonposition(j:j);
Isoform = Isoformnumber(j:j);
b = exon_end(j:j) - 1;
rng(0, 'twister');
r=randi([a b],1,1000);
less = sum(exon>r);
greater = sum(exon<r);
equal = sum(exon==r);
fprintf(fileID, '%s %4f %4f\n',Isoform,less,greater)
end
fclose(fileID)
However, I keep getting this error:
Error using fprintf
Function is not defined for 'cell' inputs.
Error in PositiveHotspotttest (line 24)
fprintf(fileID, '%s %4f %4f\n',Isoform,less,greater)
I'm certain that it has to do with writing my information from Isoforms to the file.
Here's an example of what my file looks like:
chrom Gene Isoform exon_start ExonPos total_exon exonpos_exontotal
chr20 ADA NM_000022 43255096 4 13 0.307692307692
chr9 ALDOB NM_000035 104187734 7 10 0.7
chr5 ARSB NM_000046 78077674 7 9 0.777777777778
chr5 ARSB NM_000046 78135178 6 9 0.666666666667
chr5 ARSB NM_000046 78181406 5 9 0.555555555556
I want to write the isoform names to my new file along with the greater-than and less-than counts. Is there a way to do this?
It's probably pretty simple, but I'm new to MATLAB.
Change:
Isoform = Isoformnumber(j:j);
to the more natural:
Isoform = Isoformnumber{j};
This way you retrieve the contents of cell j (a char array) instead of a 1x1 cell array, which fprintf cannot format.
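The difference between the two kinds of indexing can be checked on a small example. This is an illustrative sketch with made-up values; only the format string is taken from the question.

```matlab
% Made-up cell array of isoform names.
names = {'NM_000022'; 'NM_000035'};

asCell = names(2);   % parentheses: a 1x1 cell array, rejected by fprintf
asChar = names{2};   % braces: the char array stored inside the cell

% The char array works with the %s conversion.
txt = sprintf('%s %4f %4f\n', asChar, 3, 7);
disp(class(asCell))
disp(class(asChar))
disp(txt)
```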
I want the output to look like this:
phraseblanks
phrasemat = Hello and how are you?
Hi there everyone!
How is it going?
WHazzup?
Phrase 1 had 4 blanks
Phrase 2 had 3 blanks
Phrase 3 had 2 blanks
Phrase 4 had 0 blanks
New phrasemat is :
Hello&and&how&are&you?
Hi&there&everyone!
How&is&it&going?
WHazzup?
So I wrote the script "phraseblanks.m":
phrasemat = char('Hello and how are you?', ...
'Hi there everyone!', 'How is it going?', 'WHazzup?')
[r, c] = size(phrasemat);
for i = 1:r
phrasemat_new = cell(r, c);
howmany = countblanks(phrasemat(i, :));
fprintf('Phrase %d had %d blanks\n', i, howmany);
phrasemat(i,:) = strrep(phrasemat(i,:),' ','&')
phrasemat_new{i,:} = [phrasemat(i,:)];
end
fprintf('Changing one is %s\n', eval('phrasemat_new'));
and the function file "countblanks.m":
function num = countblanks(phrase)
% countblanks returns the # of blanks in a trimmed string
% Format: countblanks(string)
num = length(strfind(strtrim(phrase), ' '));
end
but I keep getting errors.
Please help me.
I slightly modified your phraseblanks.m so that it works.
phrasemat = {'Hello and how are you?', ...
'Hi there everyone!', ...
'How is it going?', ...
'WHazzup?'};
r = numel(phrasemat);
phrasemat_new = cell(1, r);
for i = 1:r
howmany = countblanks(phrasemat{i});
fprintf('Phrase %d had %d blanks\n', i, howmany);
phrasemat{i} = strrep(phrasemat{i}, ' ', '&');
phrasemat_new(i) = phrasemat(i);
end
fprintf('Changing one is %s\n', phrasemat_new{:});
Obviously it could be written in a nicer, more "matlaby" way, but I didn't want to stray too far from your original version. You could also consider using regular expressions if two or more spaces can appear next to each other and you want to treat them as one blank.
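A minimal sketch of that regular-expression variant, assuming a run of consecutive spaces should count as a single blank; the phrase and the variable names are made up for illustration.

```matlab
% Made-up phrase with runs of two and three spaces.
phrase = 'Hello  and how   are you?';
trimmed = strtrim(phrase);

% regexp returns the start index of every run of one or more spaces,
% so the number of indices is the number of blanks.
numBlanks = numel(regexp(trimmed, ' +'));

% Replace each run of spaces with a single '&'.
replaced = regexprep(trimmed, ' +', '&');

fprintf('Phrase had %d blanks\n', numBlanks);
disp(replaced)
```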