I have a text file that looks like this:
0 chr23:54039 0 54039
0 chr23:103278 0 103278
0 chr22:174609 0 174609
0 chr22:54039 0 54039
0 chr25:103278 0 103278
0 chr25:174609 0 174609
26 chr26:174609 0 174609
If the first column is '0', I need to replace that 0 with the number after chr. So the output should look like:
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
Can anyone provide a simple sed, awk, or other Linux solution?
If the number in column 1 should always be the same as the chr number, you can do this with awk (it rewrites the first field on every line, not only on lines that start with 0):
awk '{split($2,a,":|chr");$1=a[2]}1' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
With sed:
$ sed -r '/^0/s/0(\s*chr)([^:]*)/\2\1\2/g' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
Without -r:
$ sed '/^0/s/0\(\s*chr\)\([^:]*\)/\2\1\2/g' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
The idea is to act only on lines starting with 0. On those lines, the leading 0, the following whitespace plus chr, and the number up to the colon are captured and written back in the desired order: NUM, then chrNUM.
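For readers who want to try the same capture-and-reorder idea outside sed, here is a minimal Python sketch. The function name fill_first_column is made up for illustration, and the pattern is anchored at the start of the line, which is slightly stricter than the sed command above.

```python
import re

# Hypothetical helper: group 1 is the whitespace plus "chr",
# group 2 is the chromosome number up to (not including) the colon.
_LEADING_ZERO = re.compile(r"^0(\s+chr)([^:]*)")


def fill_first_column(line: str) -> str:
    """Replace a leading 0 with the number that follows 'chr'."""
    # Same back-reference order as the sed replacement: \2\1\2
    return _LEADING_ZERO.sub(r"\2\1\2", line)
```

Lines that do not start with 0 do not match the pattern and are returned unchanged.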
With awk:
$ awk '/^0/ {split($2,a,":"); gsub("chr", "", a[1]); $1=a[1]}1' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
On lines starting with 0, the 2nd field is split on the : delimiter and the chr text is removed from the first piece. What remains is stored as the first field. The trailing 1 is an always-true condition, so every line, modified or not, is printed.
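The field-splitting steps can also be sketched in Python. This is only an illustration under a few assumptions: the name fill_from_second_field is hypothetical, the check is that the first field equals "0" exactly (stricter than awk's /^0/), and changed lines are rejoined with single spaces, as awk does when a field is assigned.

```python
def fill_from_second_field(line: str) -> str:
    """Copy the number after 'chr' in field 2 into field 1 when field 1 is 0."""
    fields = line.split()
    # Leave the line untouched unless it has a literal 0 in the first column.
    if len(fields) < 2 or fields[0] != "0":
        return line
    label = fields[1].split(":")[0]           # e.g. "chr23"
    fields[0] = label.replace("chr", "", 1)   # e.g. "23"
    return " ".join(fields)
```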
sed "s/^0[[:blank:]]\{1,\}chr\([0-9]\{1,\}\):/\1 chr\1:/"
I am building a parser in PowerShell for converting vmstat log dumps to CSV files as input to a graphing framework (Rickshaw). The file contains repeating 'headers' which I would like to remove. A data sample is below:
Tue Sep 1 14:03:26 2015: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
Tue Sep 1 14:03:26 2015: r b swpd free buff cache si so bi bo in cs us sy id wa st
Tue Sep 1 14:03:26 2015: 0 1 224412 358316 248772 63286912 0 0 388 267 1 1 8 0 91 1 0
Tue Sep 1 14:03:36 2015: 0 0 224412 357572 248796 63286916 0 0 0 8 220 261 0 0 100 0 0
Tue Sep 1 14:03:46 2015: 0 0 224412 357696 248808 63286916 0 0 0 14 276 293 0 0 100 0 0
Tue Sep 1 14:03:56 2015: 0 0 224412 357688 248808 63286916 0 0 0 13 231 269 0 0 100 0 0
Tue Sep 1 14:04:06 2015: 0 0 224412 357300 248812 63286920 0 0 0 17 266 283 0 0 100 0 0
Tue Sep 1 14:06:56 2015: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
Tue Sep 1 14:06:56 2015: r b swpd free buff cache si so bi bo in cs us sy id wa st
Tue Sep 1 14:06:56 2015: 1 0 224412 357348 248976 63286928 0 0 0 1 182 231 0 0 100 0 0
Tue Sep 1 14:07:06 2015: 0 0 224412 357348 248980 63286928 0 0 0 9 211 251 0 0 100 0 0
Tue Sep 1 14:07:16 2015: 0 0 224412 357136 248988 63286928 0 0 0 19 287 279 0 0 100 0 0
Tue Sep 1 14:07:26 2015: 0 0 224412 357012 249004 63286928 0 0 0 9 199 244 0 0 100 0 0
Tue Sep 1 14:07:36 2015: 0 0 224412 357080 249012 63286928 0 0 0 7 235 258 0 0 100 0 0
Tue Sep 1 14:10:26 2015: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
Tue Sep 1 14:10:26 2015: r b swpd free buff cache si so bi bo in cs us sy id wa st
Tue Sep 1 14:10:26 2015: 12 0 224400 351832 265992 62560000 6 0 15 25262 8579 617 96 4 0 0 0
Tue Sep 1 14:10:36 2015: 12 0 224400 379200 266064 62444728 0 0 2 16727 8418 761 97 3 0 0 0
I use this bit of code to get that done.
Get-Content "C:\Projects\Play\Garage\Data_Processing\Sampler.log" | select-string -pattern 'procs|swpd' -notmatch | Out-File "C:\Projects\Play\Garage\Data_Processing\Refined.log"
The resulting file has the desired lines removed, but it also has blank lines inserted at the beginning and at the end. Because of this, I am unable to send this file to the next step of parsing. What could I be doing wrong?
Resultant File data:
> [BLANK LINE]
Tue Sep 1 14:03:26 2015: 0 1 224412 358316 248772 63286912 0 0 388 267 1 1 8 0 91 1 0
Tue Sep 1 14:03:36 2015: 0 0 224412 357572 248796 63286916 0 0 0 8 220 261 0 0 100 0 0
Tue Sep 1 14:03:46 2015: 0 0 224412 357696 248808 63286916 0 0 0 14 276 293 0 0 100 0 0
Tue Sep 1 14:03:56 2015: 0 0 224412 357688 248808 63286916 0 0 0 13 231 269 0 0 100 0 0
Tue Sep 1 14:04:06 2015: 0 0 224412 357300 248812 63286920 0 0 0 17 266 283 0 0 100 0 0
Tue Sep 1 14:06:56 2015: 1 0 224412 357348 248976 63286928 0 0 0 1 182 231 0 0 100 0 0
Tue Sep 1 14:07:06 2015: 0 0 224412 357348 248980 63286928 0 0 0 9 211 251 0 0 100 0 0
Tue Sep 1 14:07:16 2015: 0 0 224412 357136 248988 63286928 0 0 0 19 287 279 0 0 100 0 0
Tue Sep 1 14:07:26 2015: 0 0 224412 357012 249004 63286928 0 0 0 9 199 244 0 0 100 0 0
Tue Sep 1 14:07:36 2015: 0 0 224412 357080 249012 63286928 0 0 0 7 235 258 0 0 100 0 0
Tue Sep 1 14:10:26 2015: 12 0 224400 351832 265992 62560000 6 0 15 25262 8579 617 96 4 0 0 0
Tue Sep 1 14:10:36 2015: 12 0 224400 379200 266064 62444728 0 0 2 16727 8418 761 97 3 0 0 0
>[BLANK LINE]
>[BLANK LINE]
>[BLANK LINE]
I'm not sure why Select-String produces the empty lines, but you could replace it with a simple Where-Object filter, which does not emit them.
Here's how I would do it:
Get-Content "C:\Projects\Play\Garage\Data_Processing\Sampler.log" | Where-Object -FilterScript {$_ -notmatch 'procs|swpd'} | Out-File "C:\Projects\Play\Garage\Data_Processing\Refined.log"
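The same filter-by-pattern idea, sketched in Python for comparison. The function name drop_headers is made up for this example; the pattern is the one used in the commands above.

```python
import re

# Same pattern as in the PowerShell command: header lines contain "procs" or "swpd".
HEADER = re.compile(r"procs|swpd")


def drop_headers(lines):
    """Return only the lines that do not look like vmstat header lines."""
    return [line for line in lines if not HEADER.search(line)]
```

Because the result is a plain list of the original strings, writing it back out adds nothing that was not in the input.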
I'm trying to change the following code so that the first matrix will become the second matrix:
function BellTri = matrix(n)
BellTri = zeros(n);
BellTri(1,1) = 1;
for i = 2:n
BellTri(i,1) = BellTri(i-1,i-1);
for j = 2:i
BellTri(i,j) = BellTri(i - 1,j-1) + BellTri(i,j-1);
end
end
BellTri
First matrix (when n = 7)
1 0 0 0 0 0 0
1 2 0 0 0 0 0
2 3 5 0 0 0 0
5 7 10 15 0 0 0
15 20 27 37 52 0 0
52 67 87 114 151 203 0
203 255 322 409 523 674 877
Second matrix
1 1 2 5 15 52 877
1 3 10 37 151 674 0
2 7 27 114 523 0 0
5 20 87 409 0 0 0
15 67 322 0 0 0 0
52 255 0 0 0 0 0
203 0 0 0 0 0 0
An option is to cyclically permute the columns using circshift.
function [BellTri, Second] = matrix(n)
BellTri = zeros(n);
BellTri(1,1) = 1;
for i = 2:n
BellTri(i,1) = BellTri(i-1,i-1);
for j = 2:i
BellTri(i,j) = BellTri(i - 1,j-1) + BellTri(i,j-1);
end
end
Second = BellTri;
for i = 1:n
Second(:, i) = circshift(Second(:,i), 1-i);
end
for i = n-1:-1:2
Second(1, i) = Second(1, i-1);
end
end
Input: [BellTri, Second] = matrix(7)
Output:
BellTri =
1 0 0 0 0 0 0
1 2 0 0 0 0 0
2 3 5 0 0 0 0
5 7 10 15 0 0 0
15 20 27 37 52 0 0
52 67 87 114 151 203 0
203 255 322 409 523 674 877
Second =
1 1 2 5 15 52 877
1 3 10 37 151 674 0
2 7 27 114 523 0 0
5 20 87 409 0 0 0
15 67 322 0 0 0 0
52 255 0 0 0 0 0
203 0 0 0 0 0 0
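To see what the circshift step does on its own, here is a small Python sketch with plain lists. The names bell_triangle and shift_columns_up are made up for illustration. It covers only the column rotation, not the extra loop that rewrites the first row afterwards, so its first row is 1 2 5 15 52 203 877.

```python
def bell_triangle(n):
    """Build the same lower-triangular matrix as the MATLAB loop."""
    tri = [[0] * n for _ in range(n)]
    tri[0][0] = 1
    for i in range(1, n):
        # First entry of a row is the last entry of the previous row.
        tri[i][0] = tri[i - 1][i - 1]
        for j in range(1, i + 1):
            tri[i][j] = tri[i - 1][j - 1] + tri[i][j - 1]
    return tri


def shift_columns_up(matrix):
    """Rotate column j upwards by j places (0-based), like circshift(col, 1-i)."""
    n = len(matrix)
    out = [[0] * n for _ in range(n)]
    for j in range(n):
        column = [matrix[i][j] for i in range(n)]
        rotated = column[j:] + column[:j]
        for i in range(n):
            out[i][j] = rotated[i]
    return out
```

Calling shift_columns_up(bell_triangle(7)) moves the leading zeros of each column to the bottom, which is what turns the lower triangle into the upper-left one.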
One approach:
out = zeros(size(A));
out(logical(fliplr(triu(ones(size(A,1)))))) = A(logical(tril(ones(size(A,1)))));
Note: As Divakar pointed out, the first row of the expected output in the question appears to contain a typo. This method gives the corrected row.
Results:
A = [1 0 0 0 0 0 0;
1 2 0 0 0 0 0;
2 3 5 0 0 0 0;
5 7 10 15 0 0 0;
15 20 27 37 52 0 0;
52 67 87 114 151 203 0;
203 255 322 409 523 674 877];
>> out
out =
1 2 5 15 52 203 877
1 3 10 37 151 674 0
2 7 27 114 523 0 0
5 20 87 409 0 0 0
15 67 322 0 0 0 0
52 255 0 0 0 0 0
203 0 0 0 0 0 0
Say I have this matrix in memory and I want to calculate its 3D FFT:
T =
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
16 17 18 19
20 21 22 23
24 25 26 27
28 29 30 31
32 33 34 35
36 37 38 39
40 41 42 43
44 45 46 47
44 45 46 47
52 53 54 55
56 57 58 59
60 61 62 63
real(fft2(T))
ans =
2000 -32 -32 -32
-128 0 0 0
-112 0 0 0
-128 0 0 0
-144 0 0 0
-128 0 0 0
-112 0 0 0
-128 0 0 0
-144 0 0 0
-128 0 0 0
-112 0 0 0
-128 0 0 0
-144 0 0 0
-128 0 0 0
-112 0 0 0
-128 0 0 0
real(fftn(T))
ans =
2000 -32 -32 -32
-128 0 0 0
-112 0 0 0
-128 0 0 0
-144 0 0 0
-128 0 0 0
-112 0 0 0
-128 0 0 0
-144 0 0 0
-128 0 0 0
-112 0 0 0
-128 0 0 0
-144 0 0 0
-128 0 0 0
-112 0 0 0
-128 0 0 0
Why am I getting the same result? How can 3D FFTs be done in MATLAB/Octave?
A 3D-FFT should be applied to a 3D-array. If you apply the 3D-FFT to a 2D-array you get the same result as a 2D-FFT, because there is no third dimension in the array.
Think about it this way: an N-dimensional FFT is just N one-dimensional FFTs, one along each dimension. If the array has no third dimension, its size along that dimension is 1, and a length-1 FFT returns its input unchanged.
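A tiny Python sketch can make the "transform along a missing dimension does nothing" point concrete. It uses a naive DFT written out by hand (not an FFT library call), and the function name dft is just for illustration.

```python
import cmath


def dft(values):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


# Transform along an existing dimension changes the data ...
spectrum = dft([0, 1, 2, 3])

# ... but along a dimension of size 1 each value is its own transform,
# because the only term in the sum is value * exp(0).
after_singleton_axis = [dft([value])[0] for value in spectrum]
```

Here after_singleton_axis compares equal to spectrum, which is why fftn and fft2 agree on a 2D array.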
i.e. so that it appears like a diamond (it's a square matrix), with each row having one more element than the row before, up to the middle row, which has as many elements as the dimension of the original matrix, and then back down again until the last row has one element?
A rotation is of course not possible as the "grid" a matrix is based on is regular.
But I remember what your initial idea was, so the following will help you:
%example data
A = magic(5);
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
d = length(A)-1;
diamond = zeros(2*d+1);
for jj = d:-2:-d
ii = (d-jj)/2+1;
kk = (d-abs(jj))/2;
D{ii} = { [zeros( 1,kk ) A(ii,:) zeros( 1,kk ) ] };
diamond = diamond + diag(D{ii}{1},jj);
end
will return the diamond:
diamond =
0 0 0 0 17 0 0 0 0
0 0 0 23 0 24 0 0 0
0 0 4 0 5 0 1 0 0
0 10 0 6 0 7 0 8 0
11 0 12 0 13 0 14 0 15
0 18 0 19 0 20 0 16 0
0 0 25 0 21 0 22 0 0
0 0 0 2 0 3 0 0 0
0 0 0 0 9 0 0 0 0
Now you can again search for words or patterns row by row or column by column; just remove the zeros first.
Imagine you extract a single row:
row = diamond(5,:)
You can then extract the non-zero elements with find:
rowNoZeros = row( find(row) )
rowNoZeros =
11 12 13 14 15
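The index arithmetic behind the diamond can be sketched in Python as well. This is an illustration only: the names to_diamond and nonzero are made up, and it places each element directly instead of summing diag matrices. Element (i, j) of an n-by-n matrix lands in row i + j and column (n - 1) + j - i of the diamond (0-based).

```python
def to_diamond(matrix):
    """Spread a square matrix over a (2n-1) x (2n-1) grid, rotated by 45 degrees."""
    n = len(matrix)
    size = 2 * n - 1
    diamond = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            diamond[i + j][(n - 1) + j - i] = matrix[i][j]
    return diamond


def nonzero(row):
    """Drop the padding zeros from one diamond row (like row(find(row)))."""
    return [value for value in row if value != 0]


# Same example data as magic(5) above.
A = [
    [17, 24, 1, 8, 15],
    [23, 5, 7, 14, 16],
    [4, 6, 13, 20, 22],
    [10, 12, 19, 21, 3],
    [11, 18, 25, 2, 9],
]
```

As with find, dropping zeros also drops any genuine zero entries of the original matrix, so this only works cleanly when the data contains no zeros.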
Not a real diamond, but probably useful as well:
(Idea from the comments by @beaker. I will remove this part if he posts it himself.)
B = spdiags(A)
B =
11 10 4 23 17 0 0 0 0
0 18 12 6 5 24 0 0 0
0 0 25 19 13 7 1 0 0
0 0 0 2 21 20 14 8 0
0 0 0 0 9 3 22 16 15
I need to append the date and time to every vmstat line that contains values.
I can create a function like this:
function insert_datetime {
while read line
do
printf "$line"
date '+ %m-%d-%Y %H:%M:%S'
done
}
then call vmstat as below:
vmstat 3 5 | insert_datetime
but this puts the date and time on every line, including the dashes (--) and any rows that contain text. How can I exclude the rows that contain dashes and text?
kthr memory page faults cpu 04-23-2013 10:19:49
----- ----------- ------------------------ ------------ ----------------------- 04-23-2013 10:19:49
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec 04-23-2013 10:19:49
0 0 45688088 4094129 0 0 0 0 0 0 45 12172 2840 1 1 99 0 0.35 2.2 04-23-2013 10:19:49
2 0 45694135 4088082 0 0 0 0 0 0 451 56350 21818 3 1 97 0 0.73 4.5 04-23-2013 10:19:52
1 0 45694137 4088061 0 0 0 0 0 0 303 24568 951 3 1 96 0 0.82 5.1 04-23-2013 10:19:55
1 0 45694138 4087739 0 0 0 0 0 0 445 9170 1504 2 0 98 0 0.64 4.0 04-23-2013 10:19:58
4 0 45703145 4078732 0 0 0 0 0 0 335 47175 1306 4 1 95 0 1.01 6.3 04-23-2013 10:20:01
I need it to look like this:
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------------------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
0 0 45688088 4094129 0 0 0 0 0 0 45 12172 2840 1 1 99 0 0.35 2.2 04-23-2013 10:19:49
2 0 45694135 4088082 0 0 0 0 0 0 451 56350 21818 3 1 97 0 0.73 4.5 04-23-2013 10:19:52
1 0 45694137 4088061 0 0 0 0 0 0 303 24568 951 3 1 96 0 0.82 5.1 04-23-2013 10:19:55
1 0 45694138 4087739 0 0 0 0 0 0 445 9170 1504 2 0 98 0 0.64 4.0 04-23-2013 10:19:58
4 0 45703145 4078732 0 0 0 0 0 0 335 47175 1306 4 1 95 0 1.01 6.3 04-23-2013 10:20:01
Why not just use vmstat -t? It seems to be exactly what you are looking for. Here is some sample output:
[root@web5 vmstat]# vmstat -t 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ ---timestamp---
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 15704 193236 189628 595868 9 3 25 16 15 20 11 1 88 1 0 2013-05-22 13:32:36 JST
0 0 15704 193212 189628 595868 0 0 0 0 22 20 0 0 100 0 0 2013-05-22 13:32:37 JST
0 0 15704 193212 189628 595868 0 0 0 0 19 12 0 0 100 0 0 2013-05-22 13:32:38 JST
0 0 15704 193212 189628 595868 0 0 0 0 10 11 0 0 100 0 0 2013-05-22 13:32:39 JST
0 0 15704 193212 189628 595868 0 0 0 96 34 25 0 1 99 0 0 2013-05-22 13:32:40 JST
0 0 15704 193212 189628 595868 0 0 0 0 10 9 0 0 100 0 0 2013-05-22 13:32:41 JST
0 0 15704 193212 189628 595868 0 0 0 0 14 23 0 0 100 0 0 2013-05-22 13:32:42 JST
Executed on CentOS 6.3 with procps 3.2.8:
[root@web5 uptime]# vmstat -V
procps version 3.2.8
Use awk (strftime is not part of POSIX awk; gawk provides it):
vmstat 3 5 | awk '/^ *[0-9]/{$0=$0 " " strftime("%m-%d-%Y %T")};1'
Try:
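function insert_datetime {
    # IFS= and -r keep each line exactly as vmstat printed it
    while IFS= read -r line
    do
        # print the line verbatim; using it as the printf format would misread any %
        printf '%s' "$line"
        # only lines whose first non-blank character is a digit get a timestamp
        if [[ "$line" =~ ^[[:space:]]*[0-9] ]]; then
            date '+ %m-%d-%Y %H:%M:%S'
        else
            echo
        fi
    done
}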
sed can do this too, in a way that does not depend on which shell you use (note that the e flag is a GNU sed extension):
vmstat 3 5 | sed '/^ *[0-9].*/s/.*/printf "&";date "+ %m-%d-%Y %H:%M:%S"/e'
Every line that starts with a number (after optional spaces) gets the date appended in the required format.
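If neither gawk nor GNU sed is available, the same rule can be sketched in Python. The names stamp_data_lines and clock are made up for illustration; clock is a parameter only so the timestamp source can be swapped out, and it defaults to datetime.now.

```python
import re
from datetime import datetime

# A data line starts with optional spaces followed by a digit.
DATA_LINE = re.compile(r"^\s*[0-9]")


def stamp_data_lines(lines, clock=datetime.now):
    """Yield each line, appending a timestamp only to lines that start with a number."""
    for line in lines:
        line = line.rstrip("\n")
        if DATA_LINE.match(line):
            # clock() is called per line, so each line gets its own arrival time.
            yield f"{line} {clock().strftime('%m-%d-%Y %H:%M:%S')}"
        else:
            yield line
```

To use it in a pipeline, one would loop over stamp_data_lines(sys.stdin) in a small script and print each result with flush=True, then run vmstat 3 5 piped into that script.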