Please help me.
I have been looking for a way to graph the following table and have not found one.
The "HR" column should go on the X axis and "VALUE" on the Y axis; the "RNO" column is only there for sorting. The problem is that the graph comes out cut.
For example, the X axis shows 10, 11, ..., 21, 22, 23, but when it reaches 00 the graph does not continue along the X axis; instead, the values below 10 are placed first. I attach the image that I get:
HR  RATE_TIMEWAITED_PER_CLASS  RNO
11                               1
12  135.083333                   2
13  232.916667                   3
14  130.611111                   4
15  155.111111                   5
16  186.472222                   6
17  166.805556                   7
18  110.916667                   8
19  89.3055556                   9
20  198.166667                  10
21  56.0277778                  11
22  32.0277778                  12
23  29.4722222                  13
00  501.111111                  14
01  18.6944444                  15
02  16.0555556                  16
03  14.1666667                  17
04  375.892811                  18
05  16.0833333                  19
06  29.8611111                  20
07  79.6666667                  21
08  131.25                      22
09  332.666667                  23
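The charting tool is not named here, so the following is only a tool-independent sketch in Python of the usual way around this: order the rows by RNO, plot against consecutive positions, and use HR purely as the tick label. The `rows` list holds a few rows copied from the table; the names `ordered`, `positions`, `labels` and `values` are made up for illustration.

```python
# Rows as (HR label, value, RNO); a few rows copied from the table above,
# deliberately out of order.
rows = [
    ("23", 29.4722222, 13),
    ("12", 135.083333, 2),
    ("00", 501.111111, 14),
    ("01", 18.6944444, 15),
    ("13", 232.916667, 3),
]

# Sort by RNO rather than by HR, so that 00 comes after 23.
ordered = sorted(rows, key=lambda row: row[2])

# Plot against consecutive positions and keep HR only as the tick text.
positions = list(range(len(ordered)))
labels = [hr for hr, _, _ in ordered]
values = [value for _, value, _ in ordered]

print(labels)
```

Most plotting libraries let you pass `positions` as the X data and `labels` as the tick text, which keeps the hours in the order given by RNO.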
I have a large matrix of 102730 rows in a text file (a sample text file is attached), with some header lines in it. The first column shows the year, the next the month, followed by the day, and then value1, value2 and value3. Some of the cells are missing/empty. I want to fill these empty cells with NaN so that they don't interfere with the next value.
This is the input matrix:
1970 01 13 21.0 6.1 06 000.0
1970 01 14 22.4 8.1 03 000.0
1970 01 15 21.2 8.1 04 000.0
1970 01 16 22.6 9.1 04 000.0
1970 01 17 22.8 9.1 02 000.0
1970 01 18 22.9 8.9 07 000.0
1970 01 19 23.8 10.8 04 000.0
1970 01 20 21.8 12.1 10 010.5
1970 01 21 19.8 06 012.9
1970 01 22 15.3 8.5 07 000.0
1974 06 28 39.2 25.6 03 000.0
1974 06 29 41.2 30.5 05 000.0
1974 06 30 40.3 31.2 07 000.0
1974 07 01 41.3 31.5 12 000.0
1974 07 02 43.3 31.3 20 000.0
1974 07 03 41.2 16 041.6
1974 07 04 34.3 21.4 14 054.5
1974 07 05 33.1 23.8 05 000.0
1974 07 06 36.2 28.9 06 000.0
1975 04 18 36.6 20.8 12 000.0
1975 04 19 37.4 21.1 05 000.0
1975 04 20 39.9 27.0 07 000.0
1975 04 21 39.5 27.3 09 000.0
1975 04 22
1975 04 23 39.5 27.1 08 000.0
1975 04 24 37.7 26.0 10 000.0
1975 04 25 38.7 27.2 15 000.0
The desired output matrix:
1970 01 13 21.0 6.1 06 000.0
1970 01 14 22.4 8.1 03 000.0
1970 01 15 21.2 8.1 04 000.0
1970 01 16 22.6 9.1 04 000.0
1970 01 17 22.8 9.1 02 000.0
1970 01 18 22.9 8.9 07 000.0
1970 01 19 23.8 10.8 04 000.0
1970 01 20 21.8 12.1 10 010.5
1970 01 21 19.8 Nan 06 012.9
1970 01 22 15.3 8.5 07 000.0
1974 06 28 39.2 25.6 03 000.0
1974 06 29 41.2 30.5 05 000.0
1974 06 30 40.3 31.2 07 000.0
1974 07 01 41.3 31.5 12 000.0
1974 07 02 43.3 31.3 20 000.0
1974 07 03 41.2 Nan 16 041.6
1974 07 04 34.3 21.4 14 054.5
1974 07 05 33.1 23.8 05 000.0
1974 07 06 36.2 28.9 06 000.0
1975 04 18 36.6 20.8 12 000.0
1975 04 19 37.4 21.1 05 000.0
1975 04 20 39.9 27.0 07 000.0
1975 04 21 39.5 27.3 09 000.0
1975 04 22 Nan Nan Nan Nan
1975 04 23 39.5 27.1 08 000.0
1975 04 24 37.7 26.0 10 000.0
1975 04 25 38.7 27.2 15 000.0
As an attempt, first I tried with this:
T = readtable('sample.txt') ;
The code above didn't work: it messed up the layout and gave the wrong number of columns when there are 2 digits before the decimal point. Secondly, I found this link: Creating new matrix from cell with some empty cells disregarding empty cells
The following code snippet from that link may be useful, but I don't know how to read the data directly from the text file in order to apply this code and the subsequent retrieval process:
inds = ~cellfun('isempty', elem); %elem to be replaced as sample
I also found a method to detect empty cells here: How do I detect empty cells in a cell array?
but I couldn't figure out how to read the data from a text file while accounting for these empty cells.
Could anyone please help?
Since R2019a, you can simply use readmatrix:
>> myMat = readmatrix('sample.txt')
From the docs:
For delimited text files, the importing function converts empty fields in the file to either NaN (for a numeric variable) or an empty character vector (for a text variable). All lines in the text file must have the same number of delimiters. The importing function ignores insignificant white space in the file.
For previous releases, you can pass a detectImportOptions object when calling readtable:
% Detect options.
>> opts = detectImportOptions('sample.txt');
% Read table.
>> myTable = readtable('sample.txt',opts);
% Visualise last rows of table.
>> tail(myTable)
ans =
8×7 table
Var1 Var2 Var3 Var4 Var5 Var6 Var7
____ ____ ____ ____ ____ ____ ____
1975 4 18 36.6 20.8 12 0
1975 4 19 37.4 21.1 5 0
1975 4 20 39.9 27 7 0
1975 4 21 39.5 27.3 9 0
1975 4 22 NaN NaN NaN NaN
1975 4 23 39.5 27.1 8 0
1975 4 24 37.7 26 10 0
1975 4 25 38.7 27.2 15 0
For your text file, detectImportOptions is set to fill missing values with NaN:
>> opts.VariableOptions
If the desired output is a matrix, you can then use table2array:
>> myMat = table2array(myTable)
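To illustrate the rule quoted from the docs (empty fields become NaN, and every line needs the same number of delimiters), here is a rough Python sketch. It is not how readmatrix is implemented; the function name `parse_delimited` and the tab delimiter are assumptions, since the actual delimiter in sample.txt is not shown here.

```python
import math


def parse_delimited(text, delimiter="\t"):
    """Parse delimited numeric text, mapping empty fields to NaN.

    Hypothetical helper: the tab delimiter is an assumption.
    """
    rows = []
    for line in text.splitlines():
        if not line:
            continue  # skip completely empty lines
        fields = line.split(delimiter)
        # An empty (or whitespace-only) field becomes NaN.
        rows.append([float(f) if f.strip() else math.nan for f in fields])
    return rows


# Two rows from the question; the first has an empty fifth field.
sample = (
    "1970\t01\t21\t19.8\t\t06\t012.9\n"
    "1970\t01\t22\t15.3\t8.5\t07\t000.0\n"
)
matrix = parse_delimited(sample)
print(matrix[0])
```

This only works because the empty field is still surrounded by delimiters; if the file is padded with spaces instead, the column positions would have to be known.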
I can understand a bit of bash and R, but not enough to write something myself.
I have extracted the qual file from my FastQ file using PRINSEQ and I have something like this:
>R8ABE:00036:00036
20 20 20 25 15 25 30 30 25 25 25 15 20 15 20 20 25 25 15 21 15 21 21 26 34 36 25 25 28 25 25 21 31 25 25 25 11 25 25 25 25 13 23 13 13 15 13 13 23 26 26 21 25 19 25 19 25 19 25 25
11 21 21 21 21 15 21 21 29 21 21 15 21 21 21 13 13 23
>R8ABE:00038:00039
20 20 15 20 25 15 20 23 14 13 14 14 8 13 23 23 8 13 13 13 13 13 7 13 13 13 8 21 34
>R8ABE:00038:00042
23 26 27 30 34 15 25 25 20 25 25 30 31 33 33 39 39 16 25 25 25 25 25 12 25 25 19 25
>R8ABE:00038:00047
25 25 25 25 19 13 14 14 8 13 13 13 8 13 13 8 13 13 20 20 30 30 34 34 16 25 19 25 25 19 21 15 21 15 21 31 21 25 25 25 15 25 30 30 19 27 29 36 37 36 36 32 35 33 33 33 19 25 25 25
25 25 25 25 25 34 28 28 24 15 15 13 9 13 13 8 13 23 23 17 23 23 34 15 20 15 21 21 21 21 15 21 25 25 25 25 25 28 22 25 27 28 28 10 15 15 16 16 15 15 15 25 25 30 30 25 25 19 25 25
I'd like to calculate the average value of each sequence. The names after the ">" are not necessary. It's important for me to have these means in the original order, as a list, similar to this:
21.62
22.16
30.88
.
.
It does not matter if there are decimal numbers.
Thank you!
You can use a python script like so:
from __future__ import division  ## I'm assuming you're using python 2
with open('my_quality_file.fastq') as fh:
    for line in fh:
        if line.startswith('>'):
            continue
        scores = map(float, line.strip().split())
        print sum(scores) / len(scores)
For each line that has quality scores on it, it will convert to a list of floats, then print the arithmetic mean.
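Note that in the sample some records wrap over two lines, so a per-line mean gives two numbers for those records. If you want one mean per ">" record, a possible variant (Python 3, a sketch only) is to pool the scores until the next header. The function name `mean_quality_per_record` and the tiny `sample` are made up for illustration.

```python
def mean_quality_per_record(lines):
    """Yield one mean per '>' record, pooling scores across wrapped lines."""
    scores = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if it had scores.
            if scores:
                yield sum(scores) / len(scores)
            scores = []
        elif line:
            scores.extend(float(s) for s in line.split())
    # Emit the last record, which has no header after it.
    if scores:
        yield sum(scores) / len(scores)


# Made-up records; the first one wraps over two lines.
sample = """>read_1
20 20 30
10 20
>read_2
8 13 21
"""
means = list(mean_quality_per_record(sample.splitlines()))
for m in means:
    print("%.2f" % m)
```

For a real file you would pass the open file handle instead of `sample.splitlines()`; the means come out in the original record order.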
example
My text file is like
string 1
12 13 14 16
11 41 25 26
32 25 26 27
string 2
23 15 26 28
12 15 19 17
35 65 84 12
string 3 and so on...
I want it so that if I ask for string 1, it gives me the corresponding matrix under string 1; the size of the matrix is not known in advance, i.e.
12 13 14 16
11 41 25 26
32 25 26 27
Can anyone tell me how to do this?
thanks
You should do the following:
1 - Convert string to integer/matrix
M1 = str2num(string1) - http://www.mathworks.com/help/matlab/ref/str2num.html
M2 = str2num(string2)
2 - If you want to find the corresponding values, just compare:
for i=1:length(M1)
[~,M3(i)]=ismember(M1(i),M2)
end
M3 will give you the indexes that are a match in M2.
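If the goal is instead to look up the block of numbers that sits under a given header line in the file, one possible approach (a different technique from the str2num/ismember one above, sketched in Python) is to treat any line that is not purely numeric as a header and collect the rows that follow it. The function name `read_named_matrices` is made up for illustration.

```python
def read_named_matrices(text):
    """Map each header line to the list of numeric rows that follow it."""
    matrices = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            row = [float(tok) for tok in line.split()]
        except ValueError:
            # Not purely numeric: assume this line is a header.
            current = line
            matrices[current] = []
        else:
            if current is not None:
                matrices[current].append(row)
    return matrices


# Sample copied from the question.
sample = """string 1
12 13 14 16
11 41 25 26
32 25 26 27
string 2
23 15 26 28
12 15 19 17
35 65 84 12
"""
matrices = read_named_matrices(sample)
print(matrices["string 1"])
```

The number of rows and columns does not need to be known in advance, because each block simply ends at the next header.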
I have a 21128x9 matrix in the following format:
x = ['Participant No.' 'yyyy' 'mm' 'dd' 'HH' 'MM' 'SS' 'question No.' 'response']
e.g.
x =
Columns 1 through 5
18 2011 10 26 15
18 2011 10 26 15
18 2011 10 26 15
18 2011 10 26 15
18 2011 10 26 15
19 2011 10 31 13
19 2011 10 31 13
19 2011 10 31 13
19 2011 10 31 13
19 2011 10 31 13
Columns 6 through 9
42 33 27 4
42 39 17 2
42 45 52 2
42 47 45 3
42 50 12 3
6 5 36 1
6 20 27 4
6 22 34 5
6 33 43 3
6 42 42 1
where columns 2-7 are date vectors.
The data are sorted by date/time.
I'd like to calculate the time taken to answer each question for each participant - i.e. the time elapsed between row 1 and 2, 2 and 3, 3 and 4, 4 and 5, and then 6 and 7, 7 and 8 etc. - to end up with a matrix, sorted by participant number, where I can then work out the mean time taken per question.
I've tried using the etime function, but to no avail.
EDIT: With regards to etime, just to see if it would work in practice, I tried to write:
etime(x(2,5:7),x(1,5:7))
to compare just columns 5-7 of rows 1 and 2, but I keep getting back:
??? Index exceeds matrix dimensions.
Error in ==> etime at 41
t = 86400*(datenummx(t1(:,1:3)) - datenummx(t0(:,1:3))) + ...
You were almost there! etime expects full six-element date vectors ([year month day hour minute second]), so you need to change the 5s to 2s, that's all:
etime(x(2,2:7),x(1,2:7))
Now, to get them all, let's make two matrices of the date vectors, one row out of sync with each other.
First set up x:
x =[ 18 2011 10 26 15 42 33 27 4
18 2011 10 26 15 42 39 17 2
18 2011 10 26 15 42 45 52 2
18 2011 10 26 15 42 47 45 3
18 2011 10 26 15 42 50 12 3
19 2011 10 31 13 6 5 36 1
19 2011 10 31 13 6 20 27 4
19 2011 10 31 13 6 22 34 5
19 2011 10 31 13 6 33 43 3
19 2011 10 31 13 6 42 42 1]
Now extract the times:
Tn = x(1:end-1, 2:7);
Tnplus1 = x(2:end, 2:7);
And now, to get a vector of the differences in seconds between consecutive rows:
etime(Tnplus1, Tn)
Which results in (note that the fifth value spans the change from participant 18 to participant 19):
ans =
6
6
2
3
422595
15
2
11
9
Also, if you don't care about the year, month and day data, just set them to zero, i.e.
Tn(:, 1:3) = 0;
Tnplus1(:, 1:3) = 0;
etime(Tnplus1, Tn)
ans =
6
6
2
3
-9405
15
2
11
9
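In both outputs the fifth value mixes two participants, so it probably should not go into a per-question mean. As a sketch of the same consecutive-row idea with that boundary skipped, here it is in Python using the standard datetime module. The `rows` list is a subset of x with the question and response columns dropped, and the name `elapsed` is made up for illustration.

```python
from datetime import datetime

# (participant, yyyy, mm, dd, HH, MM, SS): a subset of the rows of x.
rows = [
    (18, 2011, 10, 26, 15, 42, 33),
    (18, 2011, 10, 26, 15, 42, 39),
    (18, 2011, 10, 26, 15, 42, 45),
    (19, 2011, 10, 31, 13, 6, 5),
    (19, 2011, 10, 31, 13, 6, 20),
]

# Seconds between consecutive rows, grouped by participant.
elapsed = {}
for prev, curr in zip(rows, rows[1:]):
    if prev[0] != curr[0]:
        continue  # skip the jump from one participant to the next
    delta = datetime(*curr[1:]) - datetime(*prev[1:])
    elapsed.setdefault(curr[0], []).append(delta.total_seconds())

for participant, seconds in elapsed.items():
    print(participant, sum(seconds) / len(seconds))
```

Pairing each row with the one before it is the same trick as Tn and Tnplus1 above; the extra check on the participant column is what removes the cross-participant difference.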
Here are some simple steps:
Calculate the difference between the two rows that you want to compare
Multiply with a vector that contains the number of seconds per unit
Small scale example:
% Hours Mins Secs:
difference = ([23 12 4] - [23 11 59]);
secvec = difference .* [3600 60 1];
secdiff = sum(secvec)
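The same two steps can be written in Python as a quick check of the arithmetic. The helper name `to_seconds` is made up for illustration, and it handles only the hours/minutes/seconds part, as in the small-scale example.

```python
# Seconds per unit, in the order hours, minutes, seconds.
SECONDS_PER_UNIT = (3600, 60, 1)


def to_seconds(hms):
    """Convert an (hours, minutes, seconds) triple to a number of seconds."""
    return sum(value * weight for value, weight in zip(hms, SECONDS_PER_UNIT))


# Same times as in the example above: 23:12:04 minus 23:11:59.
secdiff = to_seconds((23, 12, 4)) - to_seconds((23, 11, 59))
print(secdiff)
```

Negative entries in the element-wise difference (here 4 - 59 in the seconds slot) are fine, because the weighted sum puts everything on the same scale before adding.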
In the matrix shown below, how can I select elements 01, 09, 17 and 25? From Egon's answer to my earlier question, Select Diagonal Elements of a Matrix in MATLAB, I am able to select the central value 25 using c = (size(A)+1)/2; but I am wondering how to select the above-mentioned elements in the NW direction.
A = [01 02 03 04 05 06 07
08 09 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31 32 33 34 35
36 37 38 39 40 41 42
43 44 45 46 47 48 49];
Use diag to get elements on the diagonal.
diagA = diag(A)
You can restrict this to the elements from the top left to the middle with
n = ceil(size(A, 1) / 2)
diagA(1:n)
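Another way to do this is with linear indexing. MATLAB stores matrices column by column, so consecutive diagonal elements are N+1 apart in linear index. If you have an N-by-N matrix (N = size(A, 1)), you can select the elements you want as follows: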
values = A(1:N+1:ceil((N^2)/2));
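To see why a stride of N+1 works, here is a small Python sketch that mimics column-major linear indexing with plain lists. It is only an illustration of the indexing arithmetic, not MATLAB code; Python is zero-based and its slice stop is exclusive, so MATLAB's 1:N+1:ceil(N^2/2) becomes [0:ceil(N^2/2):N+1] here.

```python
import math

N = 7
# Same matrix as in the question: 1..49 filled row by row.
A = [[r * N + c + 1 for c in range(N)] for r in range(N)]

# Flatten column by column, which is the order linear indexing uses.
flat = [A[r][c] for c in range(N) for r in range(N)]

# Step N+1 moves one column right and one row down, i.e. along the diagonal.
stop = math.ceil(N * N / 2)
values = flat[0:stop:N + 1]
print(values)
```

Stopping at the middle of the flattened matrix is what limits the selection to the part of the diagonal from the top-left corner to the centre.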