Perl matching array across file - perl

I have a file which looks like this.
In the Perl code, I am using an array: @query = ('A+80', 'A+40', 'A+202', 'B+130', 'B+268', 'B+211', 'A+35');
What I want to do is: for each element of the array, scan the lines in the file shown below and print out something like this:
A+80 - HELIX
A+40 - SHEET
A+202 - HELIX
B+130 - HELIX
B+268 - SHEET
B+211 - SHEET
A+35 - LOOP
The logic behind this output is to split each array entry into its first part (the letter A or B) and its second part (the number that follows). Consider the first entry in the array: A+80. On the third line of the file, the number 80 lies between 78 (the 6th column) and 90 (the 9th column), and the letter A matches in both places. Hence the program prints HELIX for this query.
Consider the 2nd element: A+40. Its number lies in the range given on this line:
SHEET 2 B 3 ARG A 37 VAL A 43 1
i.e. between the numbers listed in columns 7 and 10, and the letter matches too. Hence for this entry the output is: SHEET
For other cases, like B+211, the line given below matches both the number and the letter associated with it:
SHEET 3 F 4 ILE B 211 VAL B 214 1
Hence the output for this entry is: SHEET
Also, for entries whose letter and number do not match any line in the file, the code outputs LOOP, e.g.: A+35 - LOOP
What is an efficient way to do this in Perl?
Since I am a beginner with Perl, I am currently splitting each entry of @query into its parts and comparing the letter and number against the relevant columns of each line. But somehow I am unable to get the desired output.
Please help...
HELIX 1 1 GLY A 9 GLN A 30 1
HELIX 2 2 ASP A 47 ILE A 63 1
HELIX 3 3 GLU A 78 GLU A 90 1
HELIX 4 4 THR A 111 ALA A 117 1
HELIX 5 5 PRO A 120 LYS A 122 5
HELIX 6 6 SER A 129 ARG A 137 1
HELIX 7 7 CYS A 147 THR A 159 1
HELIX 8 8 GLY A 178 ASN A 188 1
HELIX 9 9 LEU A 202 LYS A 208 1
HELIX 10 10 GLY A 224 TRP A 226 5
HELIX 11 11 TYR A 258 GLU A 263 1
HELIX 12 12 VAL A 275 PHE A 294 1
HELIX 13 13 GLY B 9 GLN B 30 1
HELIX 14 14 ASP B 47 ILE B 63 1
HELIX 15 15 GLU B 78 GLU B 90 1
HELIX 16 16 THR B 111 ALA B 117 1
HELIX 17 17 PRO B 120 LYS B 122 5
HELIX 18 18 SER B 129 ARG B 137 1
HELIX 19 19 CYS B 147 THR B 159 1
HELIX 20 20 GLY B 178 TRP B 187 1
HELIX 21 21 LEU B 202 LYS B 208 1
HELIX 22 22 GLY B 224 TRP B 226 5
HELIX 23 23 TYR B 258 GLU B 263 1
HELIX 24 24 GLY B 276 PHE B 294 5
SHEET 1 A 2 GLU A 5 LEU A 7 0
SHEET 2 A 2 PHE A 267 THR A 269 1
SHEET 1 B 3 LYS A 66 LEU A 72 0
SHEET 2 B 3 ARG A 37 VAL A 43 1
SHEET 3 B 3 GLY A 96 VAL A 99 1
SHEET 1 C 4 THR A 191 CYS A 195 0
SHEET 2 C 4 HIS A 167 VAL A 171 1
SHEET 3 C 4 ILE A 211 VAL A 214 1
SHEET 4 C 4 ILE A 232 ASP A 235 1
SHEET 1 D 2 GLU B 5 LEU B 7 0
SHEET 2 D 2 PHE B 267 THR B 269 1
SHEET 1 E 3 LYS B 66 LEU B 72 0
SHEET 2 E 3 ARG B 37 VAL B 43 1
SHEET 3 E 3 GLY B 96 VAL B 99 1
SHEET 1 F 4 THR B 191 CYS B 195 0
SHEET 2 F 4 HIS B 167 VAL B 171 1
SHEET 3 F 4 ILE B 211 VAL B 214 1
SHEET 4 F 4 ILE B 232 ASP B 235 1
SHEET 1 G 2 ASN B 239 PRO B 242 0
SHEET 2 G 2 ARG B 250 VAL B 253 -1

The program below seems to do what you need. It reads data from within the source file using the DATA file handle for convenience: you must arrange to open and read the appropriate data file.
The entirety of the file is read into memory for straightforward access. If the file is enormous (say, hundreds of megabytes) then this approach may be inappropriate and you will have to come back for more help.
The records vary in length, so the algorithm locates the relevant fields relative to the first three-letter field found.
The hash %categories contains all of the required file data. It is indexed by the key letter - A or B here - and the value of each element is an array of anonymous hashes containing the label (column 1), the letter, and the start and end of the range covered by each record.
Building the output is straightforward, and uses map and grep to find the 'label' of all the relevant entries in the hash. If none are found the text "LOOP" is added as a default.
use strict;
use warnings;
my @query = qw/ A+80 A+40 A+202 B+130 B+268 B+211 A+35 /;
my %categories;
while (<DATA>) {
  next unless /\S/;
  my @data = split;
  # Positions of the two three-letter residue fields
  my @indices = grep $data[$_] =~ /^[A-Z]{3}$/, 0 .. $#data;
  my %info;
  @info{qw/ label letter start end /} = @data[ 0, $indices[0]+1, $indices[0]+2, $indices[1]+2 ];
  push @{ $categories{$info{letter}} }, \%info;
}
for my $item (@query) {
  my ($letter, $value) = split /\+/, $item;
  # Labels of every record for this letter whose range contains the value
  my @matches = map $_->{label},
    grep { $value >= $_->{start} and $value <= $_->{end} }
    @{ $categories{$letter} };
  @matches = ('LOOP') unless @matches;
  warn qq(Multiple categories for query "$item") unless @matches == 1;
  printf "%s - %s\n", $item, $_ for @matches
}
__DATA__
HELIX 1 1 GLY A 9 GLN A 30 1
HELIX 2 2 ASP A 47 ILE A 63 1
HELIX 3 3 GLU A 78 GLU A 90 1
HELIX 4 4 THR A 111 ALA A 117 1
HELIX 5 5 PRO A 120 LYS A 122 5
HELIX 6 6 SER A 129 ARG A 137 1
HELIX 7 7 CYS A 147 THR A 159 1
HELIX 8 8 GLY A 178 ASN A 188 1
HELIX 9 9 LEU A 202 LYS A 208 1
HELIX 10 10 GLY A 224 TRP A 226 5
HELIX 11 11 TYR A 258 GLU A 263 1
HELIX 12 12 VAL A 275 PHE A 294 1
HELIX 13 13 GLY B 9 GLN B 30 1
HELIX 14 14 ASP B 47 ILE B 63 1
HELIX 15 15 GLU B 78 GLU B 90 1
HELIX 16 16 THR B 111 ALA B 117 1
HELIX 17 17 PRO B 120 LYS B 122 5
HELIX 18 18 SER B 129 ARG B 137 1
HELIX 19 19 CYS B 147 THR B 159 1
HELIX 20 20 GLY B 178 TRP B 187 1
HELIX 21 21 LEU B 202 LYS B 208 1
HELIX 22 22 GLY B 224 TRP B 226 5
HELIX 23 23 TYR B 258 GLU B 263 1
HELIX 24 24 GLY B 276 PHE B 294 5
SHEET 1 A 2 GLU A 5 LEU A 7 0
SHEET 2 A 2 PHE A 267 THR A 269 1
SHEET 1 B 3 LYS A 66 LEU A 72 0
SHEET 2 B 3 ARG A 37 VAL A 43 1
SHEET 3 B 3 GLY A 96 VAL A 99 1
SHEET 1 C 4 THR A 191 CYS A 195 0
SHEET 2 C 4 HIS A 167 VAL A 171 1
SHEET 3 C 4 ILE A 211 VAL A 214 1
SHEET 4 C 4 ILE A 232 ASP A 235 1
SHEET 1 D 2 GLU B 5 LEU B 7 0
SHEET 2 D 2 PHE B 267 THR B 269 1
SHEET 1 E 3 LYS B 66 LEU B 72 0
SHEET 2 E 3 ARG B 37 VAL B 43 1
SHEET 3 E 3 GLY B 96 VAL B 99 1
SHEET 1 F 4 THR B 191 CYS B 195 0
SHEET 2 F 4 HIS B 167 VAL B 171 1
SHEET 3 F 4 ILE B 211 VAL B 214 1
SHEET 4 F 4 ILE B 232 ASP B 235 1
SHEET 1 G 2 ASN B 239 PRO B 242 0
SHEET 2 G 2 ARG B 250 VAL B 253 -1
output
A+80 - HELIX
A+40 - SHEET
A+202 - HELIX
B+130 - HELIX
B+268 - SHEET
B+211 - SHEET
A+35 - LOOP
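For readers who want to check the lookup logic outside Perl, here is a minimal Python sketch of the same idea. It is only an illustration, not a translation of the program above: the names build_index and classify are made up for this example, and RECORDS holds just a handful of lines copied from the data file rather than the whole file.

```python
import re
from collections import defaultdict

# A handful of records copied from the data file above (not the full file).
RECORDS = """\
HELIX 1 1 GLY A 9 GLN A 30 1
HELIX 3 3 GLU A 78 GLU A 90 1
HELIX 9 9 LEU A 202 LYS A 208 1
HELIX 18 18 SER B 129 ARG B 137 1
SHEET 2 B 3 ARG A 37 VAL A 43 1
SHEET 2 D 2 PHE B 267 THR B 269 1
SHEET 3 F 4 ILE B 211 VAL B 214 1
"""


def build_index(text):
    """Map each chain letter to a list of (label, start, end) ranges."""
    index = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Locate the two three-letter residue names; the chain letter and
        # residue number follow each of them.
        pos = [i for i, f in enumerate(fields) if re.fullmatch(r"[A-Z]{3}", f)]
        letter = fields[pos[0] + 1]
        start = int(fields[pos[0] + 2])
        end = int(fields[pos[1] + 2])
        index[letter].append((fields[0], start, end))
    return index


def classify(query, index):
    """Return the labels whose range contains the query, or ['LOOP'] if none does."""
    letter, value = query.split("+")
    value = int(value)
    hits = [label for label, start, end in index.get(letter, [])
            if start <= value <= end]
    return hits or ["LOOP"]


index = build_index(RECORDS)
queries = ["A+80", "A+40", "A+202", "B+130", "B+268", "B+211", "A+35"]
results = {q: classify(q, index) for q in queries}
for q in queries:
    for label in results[q]:
        print(f"{q} - {label}")
```

Grouping the ranges by chain letter first means each query only scans the records for its own letter, which is the same design choice the %categories hash makes.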


Calculate the mean of some rows of a column in a table

I have a .csv file with a table, which I imported as follows:
mydata = readtable('datafile1.csv');
The table has 2549 rows, and 28 columns. Here is one part of the table, with all columns but some rows, to give an example:
ID subject A B C D E F G H I J K L M N O P Q R S T U V W X Y
'sbj05100' 'sbj05' 6.22316646575928 85 -2.31806182861328 339 14 100022 'tf' 48401 100 2 2 'no' 'h' 339 322.507000000000 339 'sbj05' 100 100021 286 1 419 1.95000000000000 2 1 662
'sbj05102' 'sbj05' 7.60787820816040 65 3.00547647476196 405 17 102012 'tf' 59201 102 1 2 'yes' 'h' 405 385.367000000000 405 'sbj05' 102 102011 283 1 283 1.89000000000000 1 1 364
'sbj05104' 'sbj05' -3.71897959709167 81 3.80262303352356 429 19 104012 'tf' 66401 104 1 2 'yes' 'h' 429 408.228000000000 429 'sbj05' 104 104011 266 1 266 2.19000000000000 2 1 244
'sbj09152' 'sbj09' 0.181026369333267 88 -0.0696721449494362 87 4 152042 'tf' 12401 152 4 2 'no' 'l' 87 82.8280000000000 87 'sbj09' 152 152041 297 1 297 1.25000000000000 1 1 354
'sbj09157' 'sbj09' 0.309507131576538 116 0.226024463772774 51 2 157042 'tf' 5201 157 4 2 'no' 'l' 51 48.4870000000000 51 'sbj09' 157 157041 273 1 273 1.45000000000000 1 1 279
'sbj10151' 'sbj10' 6.99367523193359 90 4.86872243881226 345 20 151022 'tf' 70001 151 2 2 'no' 'h' 345 328.224000000000 345 'sbj10' 151 151021 198 1 198 3 1 1 310
'sbj10167' 'sbj10' 2.25431561470032 152 -0.200379326939583 129 7 167012 'tf' 23201 167 1 2 'yes' 'h' 129 122.675000000000 129 'sbj10' 167 167011 110 1 110 2.32000000000000 2 1 276
'sbj10168' 'sbj10' 3.22731518745422 147 4.72183227539062 93 3 168042 'tf' 8801 168 4 2 'no' 'l' 93 88.3230000000000 93 'sbj10' 168 168041 179 1 179 2.38000000000000 2 1 132
I need to calculate the mean of column B, and separately of column C, for each subject (column subject) and each condition (column I).
What I would like to obtain is:
for sbj05 column B --> cond 1 = (65+81)/2
cond2 = 85
column C --> cond 1 = (3.005476475+3.802623034)/2
cond2 = -2.3180618
and so on...
I tried to follow the approach from this MATLAB question, "calculate mean in a part of one column where another column satisfies a condition":
[R, I, J] = unique(mydata(:,2));
% count the repeating entries: now we have integer indices!
counts = accumarray(J, 1, size(R));
% sum the 2nd column for all entries
sums = accumarray(J, mydata(:,4), size(R)); %for column B
% compute means
means = sums./counts;
but I get this error:
Undefined function 'accumarray' for input arguments of type 'table'.
Any suggestions?
Conveniently, MATLAB has a function for calculating statistics on tables (it is part of the Statistics and Machine Learning Toolbox). Instead of accumarray, you may therefore want to use grpstats:
meanPerSubjectAndCondition = grpstats(mydata,{'subject','I'},'mean','DataVars',{'B','C'})
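To make the grouping explicit, here is a small Python sketch of the computation the question asks for: the mean of B and of C for every (subject, condition) pair. It is not MATLAB and does not reproduce grpstats' output format. The helper name group_means is invented for this illustration, and the three rows are the sbj05 values quoted in the question, with the condition numbers taken from the asker's "cond 1" and "cond2".

```python
from collections import defaultdict
from statistics import mean

# (subject, condition, B, C) for the three sbj05 rows quoted in the question.
rows = [
    ("sbj05", 2, 85, -2.31806182861328),
    ("sbj05", 1, 65, 3.00547647476196),
    ("sbj05", 1, 81, 3.80262303352356),
]


def group_means(rows):
    """Average B and C separately within each (subject, condition) group."""
    groups = defaultdict(lambda: {"B": [], "C": []})
    for subject, cond, b, c in rows:
        groups[(subject, cond)]["B"].append(b)
        groups[(subject, cond)]["C"].append(c)
    return {
        key: {name: mean(values) for name, values in cols.items()}
        for key, cols in groups.items()
    }


means = group_means(rows)
for key, cols in sorted(means.items()):
    print(key, cols)
```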

reducing matrices under certain conditions - Part 2 -

This question is more complex than my previous question because here V is a cell array.
M is a 4x2000000 matrix composed of several submatrices Ai such that Ai(1:3,j) is the same vector for j = 1,...,size(Ai,2), and the values Ai(4,j) lie between 1 and 100.
V = {V1,V2,...,Vn} (V1 or V2 or ...Vn)
V1, V2, ..., Vn have different sizes.
My goal is to eliminate every submatrix Ai of M whose row Ai(4,:) does not contain all the values of at least one of V1, V2, ..., Vn.
The only initial data for this problem are M and V.
I wanted to use a for loop with the answer to the question here, but I noticed that the calculation time increases with the size of V.
Example:
M = [1022 3001 4451 1022 1022 3001 1022 3001 3001 1022 1055 1055 1055 1055 1055 1055;
112 45 10 112 112 45 11 45 99 112 11 11 11 11 11 11;
500 11 55 500 500 11 88 11 1 500 45 45 45 45 45 45;
2 6 3 5 71 2 2 71 5 88 8 15 21 94 10 33]
A1 = [1022 1022 1022 1022;
112 112 112 112;
500 500 500 500;
2 5 71 88]
A2 = [3001 3001 3001;
45 45 45;
11 11 11;
6 2 71]
A3 = [4451;
10;
55;
3]
A4 = [1055 1055 1055 1055 1055 1055;
11 11 11 11 11 11;
45 45 45 45 45 45;
8 15 21 94 10 33]
A5 =[3001;
99;
1;
5]
if V = {[2 71],[3],[15 94 33 10]}
The expected output (order of columns is not important):
[1022 1022 1022 1022 3001 3001 3001 4451 1055 1055 1055 1055 1055 1055;
112 112 112 112 45 45 45 10 11 11 11 11 11 11;
500 500 500 500 11 11 11 55 45 45 45 45 45 45;
2 5 71 88 6 2 71 3 8 15 21 94 10 33]
See if this works for you -
%// ID columns of M based on the uniqueness of the first three rows
[~,~,idx] = unique(M(1:3,:).','rows') %//'
%// Lengths of each V cell
lens = cellfun('length',V)
%// Setup ID array for use with ACCUMARRAY later on
id = zeros(1,sum(lens))
id(cumsum(lens(1:end-1))+1) = 1
id = cumsum(id)+1
%// Collect all cells of V as a 1D numeric array
Vn = [V{:}]
%// Counts of number of elements for each cell/groups of V
counts_V = histc(id,1:numel(V))
%// Function handle to detect whether the input satisfies the criterion
%// that it contains all the values of either V1 or V2 or ...Vn
func1 = @(x) any(counts_V == histc(id(ismember(Vn,x)),1:numel(V)))
%// For each ID in "idx", see if it satisfies the above mentioned criteria
matches = accumarray(idx(:),M(4,:)',[], func1 ) %//'
%// Use the "detections" for selecting the valid columns from M
out = M(:,ismember(idx,find(matches)))
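As a cross-check of the selection rule, the following Python sketch restates it with plain sets: group the columns by their first three rows, then keep a group only if its fourth-row values contain every value of at least one cell of V. This is a readable reference, not a vectorised replacement for the MATLAB code above, and the function name filter_columns is made up for the example. M and V are the ones from the question.

```python
from collections import defaultdict

M = [
    [1022, 3001, 4451, 1022, 1022, 3001, 1022, 3001, 3001, 1022, 1055, 1055, 1055, 1055, 1055, 1055],
    [112, 45, 10, 112, 112, 45, 11, 45, 99, 112, 11, 11, 11, 11, 11, 11],
    [500, 11, 55, 500, 500, 11, 88, 11, 1, 500, 45, 45, 45, 45, 45, 45],
    [2, 6, 3, 5, 71, 2, 2, 71, 5, 88, 8, 15, 21, 94, 10, 33],
]
V = [[2, 71], [3], [15, 94, 33, 10]]


def filter_columns(M, V):
    """Keep the columns of M whose group covers at least one cell of V."""
    columns = list(zip(*M))
    # Collect the 4th-row values of every group, keyed by rows 1 to 3.
    groups = defaultdict(set)
    for a, b, c, value in columns:
        groups[(a, b, c)].add(value)
    targets = [set(v) for v in V]
    # A group survives if some target set is a subset of its values.
    keep = {key for key, values in groups.items()
            if any(t <= values for t in targets)}
    kept = [col for col in columns if col[:3] in keep]
    return [list(row) for row in zip(*kept)]


out = filter_columns(M, V)
for row in out:
    print(row)
```

The kept columns appear in their original order, which is fine here since the question states that column order does not matter.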

Functional addition of Columns in kdb+q

I have a q table in which the number of non-keyed columns is variable. Also, these column names contain an integer. I want to apply some function to these columns without using their actual names.
How can I achieve this?
For Example:
table:
a | col10 col20 col30
1 | 2 3 4
2 | 5 7 8
// Assume that I have the numbers 10, 20, 30 obtained from the column names
I want something like: update NewCol:10*col10+20*col20+30*col30 from table
except that the number of columns is not fixed, and neither are the numbers they include.
We want to use a functional update (simple example shown here: http://www.timestored.com/kdb-guides/functional-queries-dynamic-sql#functional-update)
For this particular query we want to generate the computation tree of the select clause, i.e. the last part of the functional update statement. The easiest way to do that is to parse a similar statement then recreate that format:
q)/ create our table
q)t:([] c10:1 2 3; c20:10 20 30; c30:7 8 9; c40:0.1*4 5 6)
q)t
c10 c20 c30 c40
---------------
1 10 7 0.4
2 20 8 0.5
3 30 9 0.6
q)parse "update r:(10*c10)+(20*c20)+(30*c30) from t"
!
`t
()
0b
(,`r)!,(+;(*;10;`c10);(+;(*;20;`c20);(*;30;`c30)))
q)/ notice the last value, the parse tree
q)/ we want to recreate that using code
q){(*;x;`$"c",string x)} 10
*
10
`c10
q){(+;x;y)} over {(*;x;`$"c",string x)} each 10 20
+
(*;10;`c10)
(*;20;`c20)
q)makeTree:{{(+;x;y)} over {(*;x;`$"c",string x)} each x}
/ now write as functional update
q)![t;();0b; enlist[`res]!enlist makeTree 10 20 30]
c10 c20 c30 c40 res
-------------------
1 10 7 0.4 420
2 20 8 0.5 660
3 30 9 0.6 900
q)update r:(10*c10)+(20*c20)+(30*c30) from t
c10 c20 c30 c40 r
-------------------
1 10 7 0.4 420
2 20 8 0.5 660
3 30 9 0.6 900
I think functional select (as suggested by @Ryan) is the way to go if the table is quite generic, i.e. the column names might vary and the number of columns is unknown.
Yet I prefer the way @JPC uses vectors to solve the multiplication and summation problem, i.e. update res:sum 10 20 30*(col10;col20;col30) from table
Let's combine both approaches, with some extreme cases:
q)show t:1!flip(`a,`$((10?2 3 4)?\:.Q.a),'string 10?10)!enlist[til 100],0N 100#1000?10
a | vltg4 pnwz8 mifz5 pesq7 fkcx4 bnkh7 qvdl5 tl5 lr2 lrtd8
--| -------------------------------------------------------
0 | 3 3 0 7 9 5 4 0 0 0
1 | 8 4 0 4 1 6 0 6 1 7
2 | 4 7 3 0 1 0 3 3 6 4
3 | 2 4 2 3 8 2 7 3 1 7
4 | 3 9 1 8 2 1 0 2 0 2
5 | 6 1 4 5 3 0 2 6 4 2
..
q)show n:"I"$string[cols get t]inter\:.Q.n
4 8 5 7 4 7 5 5 2 8i
q)show c:cols get t
`vltg4`pnwz8`mifz5`pesq7`fkcx4`bnkh7`qvdl5`tl5`lr2`lrtd8
q)![t;();0b;enlist[`res]!enlist({sum x*y};n;enlist,c)]
a | vltg4 pnwz8 mifz5 pesq7 fkcx4 bnkh7 qvdl5 tl5 lr2 lrtd8 res
--| -----------------------------------------------------------
0 | 3 3 0 7 9 5 4 0 0 0 176
1 | 8 4 0 4 1 6 0 6 1 7 226
2 | 4 7 3 0 1 0 3 3 6 4 165
3 | 2 4 2 3 8 2 7 3 1 7 225
4 | 3 9 1 8 2 1 0 2 0 2 186
5 | 6 1 4 5 3 0 2 6 4 2 163
..
You can create a functional form query as @Ryan Hamilton indicated, and overall that will be the best approach since it is very flexible. But if you're just looking to add these up, multiplied by some weight, I'm a fan of going through other avenues.
EDIT: I missed that you said the number in the column names could vary, in which case you can easily adjust this. If the column names are all prefixed by the same number of letters, just drop those and then parse the remainder into an int or what have you. Otherwise, if the numbers are embedded within text, check out this other question.
//Create our table with a random number of columns (up to 9 value columns) and 1 key column
q)show t:1!flip (`$"c",/:string til n)!flip -1_(n:2+first 1?10) cut neg[100]?100
c0| c1 c2 c3 c4 c5 c6 c7 c8 c9
--| --------------------------
28| 3 18 66 31 25 76 9 44 97
60| 35 63 17 15 26 22 73 7 50
74| 64 51 62 54 1 11 69 32 61
8 | 49 75 68 83 40 80 81 89 67
5 | 4 92 45 39 57 87 16 85 56
48| 88 34 55 21 12 37 53 2 41
86| 52 91 79 33 42 10 98 20 82
30| 71 59 43 58 84 14 27 90 19
72| 0 99 47 38 65 96 29 78 13
q)update res:sum (1+til -1+count cols t)*flip value t from t
c0| c1 c2 c3 c4 c5 c6 c7 c8 c9 res
--| -------------------------------
28| 3 18 66 31 25 76 9 44 97 2230
60| 35 63 17 15 26 22 73 7 50 1551
74| 64 51 62 54 1 11 69 32 61 1927
8 | 49 75 68 83 40 80 81 89 67 3297
5 | 4 92 45 39 57 87 16 85 56 2582
48| 88 34 55 21 12 37 53 2 41 1443
86| 52 91 79 33 42 10 98 20 82 2457
30| 71 59 43 58 84 14 27 90 19 2134
72| 0 99 47 38 65 96 29 78 13 2336
q)![t;();0b; enlist[`res]!enlist makeTree 1+til -1+count cols t] ~ update res:sum (1+til -1+count cols t)*flip value t from t
1b
q)\ts do[`int$1e4;![t;();0b; enlist[`res]!enlist makeTree 1+til 9]]
232 3216j
q)\ts do[`int$1e4;update nc:sum (1+til -1+count cols t)*flip value t from t]
69 2832j
I haven't tested this on a large table, so caveat emptor
Here is another solution which is also faster.
t,'([]res:(+/)("I"$(string tcols) inter\: .Q.n) *' (value t) tcols:(cols t) except keys t)
With some more effort we could shorten it further. The logic goes like this:
a:"I"$(string tcols) inter\: .Q.n
Here I first extract the integers from the column names and store them in a vector. The variable tcols is assigned at the end of the query (q evaluates right to left) and is simply the columns of the table except the key columns.
b:(value t) tcols:(cols t) except keys t
Here I extract each column vector.
c:(+/) a *' b
This multiplies each column vector (b) by its integer (a) and adds the corresponding values from each resulting list.
t,'([]res:c)
Finally, the result is stored in a temporary table and joined to t.
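The same three steps can be written out in Python for anyone who finds the q one-liner dense. This is only an illustrative sketch: the dictionary stands in for the q table and uses the c10, c20 and c30 columns from the first answer, and the function name weighted_sum is made up for the example.

```python
import re

# Stand-in for the q table: column name -> column vector.
table = {"c10": [1, 2, 3], "c20": [10, 20, 30], "c30": [7, 8, 9]}


def weighted_sum(table):
    """Weight each column by the integer in its name and add up row-wise."""
    names = list(table)
    # Step a: keep only the digit characters of each name and parse them.
    weights = [int("".join(re.findall(r"\d", name))) for name in names]
    # Steps b and c: multiply each column by its weight, then sum per row.
    n_rows = len(table[names[0]])
    return [
        sum(w * table[name][i] for w, name in zip(weights, names))
        for i in range(n_rows)
    ]


res = weighted_sum(table)
print(res)  # [420, 660, 900], matching the res column shown earlier
```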

Shifting rows of matrix in matlab

I have to shift certain rows in MATLAB. Let's say I have a matrix of size 50x50, and I have to move certain rows, say 15, 18, 45, ..., to the top, with the remaining rows below them. How can I accomplish this in MATLAB?
Have you tried the circshift function? Something like this could help:
A = [1:8; 11:18; 21:28; 31:38; 41:48]
A =
1 2 3 4 5 6 7 8
11 12 13 14 15 16 17 18
21 22 23 24 25 26 27 28
31 32 33 34 35 36 37 38
41 42 43 44 45 46 47 48
B = circshift(A, [3, 0])
B =
21 22 23 24 25 26 27 28
31 32 33 34 35 36 37 38
41 42 43 44 45 46 47 48
1 2 3 4 5 6 7 8
11 12 13 14 15 16 17 18
This is a problem that can be quite easily solved with the help of some simple indexing:
Matrix = [ 1 101 201 301
2 102 202 302
3 103 203 303
4 104 204 304
5 105 205 305
6 106 206 306
7 107 207 307
8 108 208 308
9 109 209 309
10 110 210 310];
rowsOnTop = [1 8 4];
rowsBelow = true(size(Matrix,1),1);
rowsBelow(rowsOnTop) = false;
Modified = [Matrix(rowsOnTop,:); Matrix(rowsBelow,:)]
Modified =
1 101 201 301
8 108 208 308
4 104 204 304
2 102 202 302
3 103 203 303
5 105 205 305
6 106 206 306
7 107 207 307
9 109 209 309
10 110 210 310
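The indexing idea is language independent: list the chosen rows first, then every other row in its original order. A minimal Python sketch of that reordering follows, using the same 10x4 example matrix. The function name rows_to_top is invented for this illustration, and row numbers are kept 1-based to match the MATLAB snippet.

```python
def rows_to_top(matrix, rows_on_top):
    """Return a copy of matrix with the given 1-based rows moved to the top."""
    chosen = [r - 1 for r in rows_on_top]  # convert to 0-based indices
    chosen_set = set(chosen)
    # Remaining rows keep their original relative order.
    rest = [i for i in range(len(matrix)) if i not in chosen_set]
    return [matrix[i] for i in chosen + rest]


matrix = [[r, 100 + r, 200 + r, 300 + r] for r in range(1, 11)]
modified = rows_to_top(matrix, [1, 8, 4])
for row in modified:
    print(row)
```

The chosen rows appear in the order they were requested (1, 8, 4), not in sorted order, just as in the MATLAB result above.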
I understood that you want to move certain rows of the matrix to the top and keep the rest in place. For that you can use this:
Example matrix:
Matrix = [ 1:10; 101:110; 201:210; 301:310 ]';
Matrix =
1 101 201 301
2 102 202 302
3 103 203 303
4 104 204 304
5 105 205 305
6 106 206 306
7 107 207 307
8 108 208 308
9 109 209 309
10 110 210 310
Here's the code:
RowsVector = [ 3, 5, 8 ];
Edit: new better solution (presented here first because it's better).
NewMatrix = Matrix(cell2mat(arrayfun(@(x) x:size(Matrix,1):prod(size(Matrix)), [ RowsVector, setdiff(1:size(Matrix, 1), RowsVector) ]', 'UniformOutput', false)));
NewMatrix =
3 103 203 303
5 105 205 305
8 108 208 308
1 101 201 301
2 102 202 302
4 104 204 304
6 106 206 306
7 107 207 307
9 109 209 309
10 110 210 310
Edit: the rest of the answer is related to a [limited] older solution.
% RowsVector must be sorted, otherwise the reordering will fail.
Edit: fixed a bug with unordered RowsVector input.
RowsVector = sort(RowsVector);
for RowIndex = 1:size(RowsVector, 2)
row = RowsVector(RowIndex);
Matrix = vertcat(Matrix(row,:), Matrix);
Matrix(row+1,:) = [];
end
This is the result:
Matrix =
8 108 208 308
5 105 205 305
3 103 203 303
1 101 201 301
2 102 202 302
4 104 204 304
6 106 206 306
7 107 207 307
9 109 209 309
10 110 210 310
I'd solve this by defining a row permutation matrix to produce the desired result. If Matlab has a built-in function for this it escapes me, so I wrote one:
function P = rowpermat(vec)
P = zeros(length(vec));
for i = 1:length(vec)
P(i,vec(i)) = 1;
end
If vec is a permutation of 1:n this function will return a matrix which permutes the rows of an nxn matrix 1->vec(1), 2->vec(2), ... Note the absence of error checking and the like so use this in production code at your own risk.
In this case, if A is the matrix to permute, you might write:
rowpermat([15, 18, 45, 1:14,16:17,19:44,46:50])*A

I am trying to extract the rows with the same x values from two different files in matlab, how can I do it?

To be more clear, what I want is to generate file3 from file1 but with the x values in file 2.
Example:
file 1:
x1=[1 2 3 4 5 6 7 8 9 10]'
y1=[11 22 33 44 55 66 77 88 99 00]'
file 2:
x2=[3 4 5 8 9]'
y2=[333 444 555 888 999]'
file 3:
x3=[3 4 5 8 9]'
y3=[33 44 55 88 99]'
Use ISMEMBER to find which values of x1 are in x2, and where they're located.
x1=[1 2 3 4 5 6 7 8 9 10]'
y1=[11 22 33 44 55 66 77 88 99 00]'
x2=[3 4 5 8 9]'
y2=[333 444 555 888 999]'
x3 = x2;
y3 = y1(ismember(x1,x2))
y3 =
33
44
55
88
99
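For comparison, here is the same selection sketched in Python with a dictionary lookup. It is an illustration only, using the vectors from the question. One difference worth noting: this version returns the y values in the order of x2, whereas the ismember indexing above returns them in the order they occur in x1. The two agree here because both vectors are sorted.

```python
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y1 = [11, 22, 33, 44, 55, 66, 77, 88, 99, 0]  # the question writes the last value as 00
x2 = [3, 4, 5, 8, 9]

# Map every x in file 1 to its y, then look up the x values from file 2.
lookup = dict(zip(x1, y1))
x3 = [x for x in x2 if x in lookup]
y3 = [lookup[x] for x in x3]

print(x3)
print(y3)
```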