Print between patterns -awk - perl

I have the following input file
-------------------------------------
--- A-TKE spectrum (post-neutron)---
-------------------------------------
2-dim. array: (A = 73 To 162 Step 1) (E = 122 To 198 Step 1)
(The data are written according to the loop structure specified above.
The last loop is the inner-most one. Line breaks are not related to the data structure!
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 2 0 0 1 0 0 0 0
</A_TKE>
</Results>
This file is part of a bigger file, which contains more number block similar to the above one.
What I want to do is print in a seperate file the numbers that are within the file. As a first try, I thought "Let's print what is between two patters". The START pattern will be --- A-TKE spectrum (post-neutron)--- and the END pattern will </A_TKE>
I used
awk '/--- A-TKE spectrum (post-neutron)---/{flag=1;next}/</A_TKE>/{flag=0}flag{print}' input
To my surprise nothing is printed on the screen. Any idea on why is this happening?
I would also like to now if I can substitue next with a number that will correspond to the line that I want to print after the given pattern.

In perl (because whilst you start with awk, this is tagged as perl too):
perl -ne 'print if m/post-neutron/...m/A_TKE/' input
Which will do what your thingy was trying to.
It works by using the range operator - which is a perl construct which tests if the current file is between two markers.
And the fact that perl can run inline in the same way as awk or sed. -ne says "wrap this code in a while loop" that iterates a line at a time. (e.g. as sed/awk/grep do by default).
Otherwise ... it depends a little on what else is in your file. Because from the sample you gave - the only lines starting with a number are the ones you want. So:
perl -ne 'print if m/^\s*\d+\s*\d+/'
(Or combine the two tests:
perl -ne 'print if m/post-neutron/...m/A_TKE/ and m/^\s*\d+\s*\d+/'
Which will output anything between the two text markers that has number-space-number at the start of line.

Your awk script is almost there. I'd build it like this:
awk '
/<\/A_TKE>/ {exit}
flag && NF==50 {print}
/--- A-TKE spectrum .post-neutron.---/ {flag=1}
' input
You could optimize it of course, but just looking at the logic of it, the idea is that you need do no further processing once you hit your end marker .. and you only want to start printing AFTER your start marker.
Note that I've replaced the brackets in your start marker with dots, since brackets are interpreted as PART OF THE REGULAR EXPRESSION. I believe you may be considering these simple strings to search for, but they are regex.

Some working ideas
Awk can print between two matching lines with this simple command:
awk '/PATTERN1/,/PATTERN2/' file.txt
If you want more control over what happens when the lines are found, you can use the flag approach.
awk '/A-TKE/{f=1;for (i=0;i<7;i++){getline}}/A_TKE/{f=0}f' file.txt
Adjust i<7 to specify how many lines you would like to skip.
Also, in this particular case, you can exploit the fact that you have 50 fields of numbers, which is unlikely to occur anywhere else in the file. Thus, you could also get the numbers of interest out with:
awk 'NF>40' file.txt
NF is a special awk variable that stores the number of fields (columns) in the current record (line).
Why your approach does not work
Your approach does not work, because everything that is between // in awk is treated as a regular expression. In your case you have several - and () in the string, which are not escaped, and thus assume their special meaning and therefore the expression you typed never matches, thus the flag is never set to 1 and awk does not print a line.

Try this simple sed command
sed -n '/A-TKE/,/<\/A_TKE>/p' FileName

If you want to extract rows with just numbers
awk 'NF && !/[a-zA-Z\-]+/' input.file
will generate
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 2 0 0 1 0 0 0 0
with patterns and smart counters you can write it as
awk '/\/A_TKE/{exit}
f&&s&&s--{next}
f&&NF{print}
/A-TKE spectrum/{f=1;s=7}' input.file
fine tune patterns as you like.

Related

Read from a file into a vector in matlab using importdata()

My input file looks like this:
# FILE:app/src/f1.c
2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 24 32 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
# FILE:src/f2.c
1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 24 31 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
...............
I want to read the headers i.e., the lines which start with #, and the vectors present below the headers into lists. I tried the using importdata() as below. But this gives me only the first header and first vector. I need to read it till the end of the file.
filename = 'output.txt';
A = importdata(filename);
disp(A.rowheaders);
disp(A.data);
EDIT: A = importdata(filename,'#'); This solved my problem
I believe the fgetl command should do the trick.
I would put it in a for loop with two fgetl commands, one for the metadata line and one for the numerical data line.
EDIT: Added example
fid = fopen('test.txt');
celHeaders = {};
celData = {};
while(~feof(fid))
celHeaders{end+1} = fgetl(fid);
celData{end+1} = fgetl(fid);
end
fclose(fid);
disp(celHeaders)
disp(celData)
This is assuming that your text file doesn't include additional lines above or below the data.

Numerical errors when solving Ax=b in MATLAB

I am attempting to solve the system Ax = b in MATLAB, where A is a 30x30 triangular matrix with (nonzero) values ranging from 1e-14 to 0.7, and b is a 30-element column vector with values ranging from 1e-3 to 1e3. When I enter x = A\b, I get an answer and no warning messages, but the answer is not reasonable (looks like just random numbers at the bottom of the vector). I presume this is due to numerical errors.
Message 5 on this page suggests decomposing/scaling the matrix in order to avoid numerical errors, but I haven't been able to figure out how to calculate the scaling matrices.
So the question is: Is this indeed an example of numerical instability, and if so, how can I rescale my matrix A, or change how MATLAB is performing the calculation, to avoid it?
Here is the matrix and vector that are producing the issue:
A =
Columns 1 through 15
0.69 0.4278 0.19893 0.082223 0.031861 0.011852 0.0042866 0.0015187 0.00052965 0.00018243 6.221e-05 2.1038e-05 7.0653e-06 2.3587e-06 7.8344e-07
0 0.4761 0.44277 0.27452 0.14183 0.065953 0.028624 0.011831 0.0047156 0.0018273 0.00069233 0.00025755 9.4356e-05 3.4126e-05 1.2206e-05
0 0 0.32851 0.40735 0.3157 0.19573 0.10618 0.052668 0.02449 0.010846 0.004623 0.0019108 0.00077007 0.00030383 0.00011773
0 0 0 0.22667 0.35134 0.32675 0.23635 0.14653 0.081766 0.042246 0.02058 0.0095696 0.0042851 0.0018597 0.00078615
0 0 0 0 0.1564 0.29091 0.31564 0.26093 0.182 0.11284 0.064129 0.03408 0.017168 0.0082788 0.0038496
0 0 0 0 0 0.10792 0.23418 0.29039 0.27006 0.2093 0.14274 0.088499 0.05095 0.02764 0.014281
0 0 0 0 0 0 0.074464 0.18467 0.25761 0.2662 0.22694 0.16884 0.1134 0.070311 0.040868
0 0 0 0 0 0 0 0.05138 0.14335 0.22219 0.25256 0.23488 0.18931 0.13694 0.090965
0 0 0 0 0 0 0 0 0.035452 0.1099 0.18738 0.23235 0.2341 0.2032 0.15748
0 0 0 0 0 0 0 0 0 0.024462 0.083415 0.15515 0.20842 0.22614 0.21031
0 0 0 0 0 0 0 0 0 0 0.016879 0.062789 0.12652 0.18303 0.21277
0 0 0 0 0 0 0 0 0 0 0 0.011646 0.046935 0.10185 0.15786
0 0 0 0 0 0 0 0 0 0 0 0 0.008036 0.034876 0.081087
0 0 0 0 0 0 0 0 0 0 0 0 0 0.0055448 0.025783
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0038259
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 16 through 30
2.5906e-07 8.5327e-08 2.8007e-08 9.1646e-09 2.9906e-09 9.7342e-10 3.1613e-10 1.0246e-10 3.3142e-11 1.0702e-11 3.4504e-12 1.1108e-12 3.5709e-13 1.1465e-13 3.6767e-14
4.3246e-06 1.5194e-06 5.2988e-07 1.8359e-07 6.3236e-08 2.1667e-08 7.3883e-09 2.5085e-09 8.4833e-10 2.8585e-10 9.5998e-11 3.214e-11 1.073e-11 3.5726e-12 1.1866e-12
4.492e-05 1.6909e-05 6.2902e-06 2.3156e-06 8.445e-07 3.0543e-07 1.0963e-07 3.9084e-08 1.3847e-08 4.8779e-09 1.7094e-09 5.9615e-10 2.0698e-10 7.1568e-11 2.4651e-11
0.00032494 0.00013173 5.2503e-05 2.0616e-05 7.9887e-06 3.0592e-06 1.1591e-06 4.3497e-07 1.6181e-07 5.9715e-08 2.1877e-08 7.9615e-09 2.8794e-09 1.0354e-09 3.7037e-10
0.0017358 0.00076232 0.00032721 0.00013766 5.69e-05 2.3151e-05 9.2878e-06 3.679e-06 1.4406e-06 5.5824e-07 2.1426e-07 8.1515e-08 3.0763e-08 1.1523e-08 4.2867e-09
0.0070833 0.0033935 0.001578 0.00071495 0.00031662 0.00013741 5.8573e-05 2.4566e-05 1.0154e-05 4.1418e-06 1.6691e-06 6.6527e-07 2.6248e-07 1.0259e-07 3.9755e-08
0.022523 0.01187 0.0060211 0.0029554 0.0014095 0.00065541 0.00029799 0.00013279 5.8116e-05 2.5022e-05 1.0615e-05 4.4423e-06 1.8361e-06 7.5031e-07 3.0339e-07
0.056398 0.033024 0.018428 0.0098671 0.005098 0.0025529 0.0012436 0.00059114 0.00027488 0.00012531 5.6113e-05 2.4719e-05 1.0728e-05 4.5926e-06 1.9414e-06
0.11158 0.073506 0.045573 0.026843 0.01513 0.0082078 0.0043059 0.0021929 0.0010877 0.00052686 0.00024979 0.00011615 5.3064e-05 2.3852e-05 1.0563e-05
0.17385 0.13089 0.091294 0.059747 0.037043 0.021923 0.012459 0.0068335 0.0036315 0.0018763 0.00094518 0.00046536 0.00022441 0.00010618 4.9374e-05
0.21107 0.18539 0.14778 0.10881 0.074955 0.048796 0.030253 0.017976 0.010288 0.0056949 0.0030601 0.0016008 0.00081734 0.00040822 0.00019981
0.19575 0.20632 0.19188 0.16145 0.12513 0.090508 0.061727 0.04001 0.024806 0.014788 0.0085139 0.0047507 0.0025773 0.0013629 0.00070418
0.13406 0.17663 0.19712 0.1935 0.17139 0.13947 0.10569 0.075354 0.050967 0.032916 0.020408 0.012201 0.0070603 0.003967 0.0021702
0.063943 0.11233 0.1567 0.18459 0.19074 0.17739 0.15122 0.1198 0.089133 0.062798 0.042179 0.027157 0.016837 0.010091 0.0058655
0.018977 0.050003 0.093006 0.13695 0.16982 0.18425 0.17952 0.15999 0.13226 0.1025 0.075107 0.052387 0.034978 0.022461 0.013926
0.0026399 0.013912 0.038815 0.076207 0.11812 0.15379 0.17481 0.17806 0.16559 0.14259 0.11493 0.087452 0.063257 0.043745 0.029059
0 0.0018215 0.010164 0.029933 0.061862 0.10068 0.13733 0.16319 0.17345 0.16803 0.15048 0.12595 0.099387 0.074457 0.053266
0 0 0.0012569 0.0074028 0.022949 0.049799 0.084907 0.12108 0.15014 0.16622 0.16747 0.15575 0.13519 0.11049 0.085626
0 0 0 0.00086723 0.0053768 0.017502 0.039787 0.07092 0.10553 0.13631 0.15695 0.16421 0.15837 0.14237 0.12037
0 0 0 0 0.00059839 0.0038955 0.013284 0.031571 0.058722 0.091019 0.12227 0.1462 0.15862 0.15845 0.14736
0 0 0 0 0 0.00041289 0.0028159 0.010039 0.024896 0.048236 0.077756 0.10847 0.1345 0.15115 0.15618
0 0 0 0 0 0 0.00028489 0.0020313 0.0075564 0.019521 0.039334 0.065845 0.095256 0.12234 0.14222
0 0 0 0 0 0 0 0.00019658 0.0014625 0.0056673 0.015226 0.031861 0.05531 0.082873 0.1101
0 0 0 0 0 0 0 0 0.00013564 0.0010512 0.0042363 0.011819 0.025648 0.046115 0.071478
0 0 0 0 0 0 0 0 0 9.359e-05 0.00075433 0.0031569 0.0091339 0.020528 0.038183
0 0 0 0 0 0 0 0 0 0 6.4577e-05 0.00054051 0.0023458 0.0070296 0.016344
0 0 0 0 0 0 0 0 0 0 0 4.4558e-05 0.00038676 0.0017385 0.0053894
0 0 0 0 0 0 0 0 0 0 0 0 3.0745e-05 0.0002764 0.0012852
0 0 0 0 0 0 0 0 0 0 0 0 0 2.1214e-05 0.00019729
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.4638e-05
b =
3712
246.89
43.304
22.55
14.897
10.066
6.8138
4.6131
3.1232
2.1146
1.4316
0.96927
0.65623
0.44429
0.3008
0.20365
0.13788
0.093351
0.063202
0.04279
0.02897
0.019614
0.013279
0.0089906
0.006087
0.0041211
0.0027902
0.001889
0.0012789
0.00086589
A .mat file with the full-precision variables may be found here.
Here are the results I'm getting on my machine (Matlab R2013a on OS X 10.10.5):
>> x=A\b
x =
5087.6
433.99
64.166
27.995
19.494
14.546
10.934
8.2265
6.1834
4.6933
3.2779
3.8272
-3.5375
23.953
-79.278
254.22
-702.1
1713.2
-3658.2
6822.7
-11046
15412
-18349
18393
-15244
10181
-5273.4
1992.3
-489.85
59.155
Although norm(A*x-b) returns a value on the order of 1e-13, the results are not physically reasonable given the problem I am trying to solve (values in x should be monotonically decreasing, and none should be negative). As an example, here is a similar dataset that returns a correct (looking) solution with the same matrix A:
>> c
c =
5142.1
339.52
22.417
1.4802
0.097731
0.0064529
0.00042607
2.8132e-05
1.8575e-06
1.2265e-07
8.0979e-09
5.3469e-10
3.5304e-11
2.331e-12
1.5391e-13
1.0162e-14
6.7099e-16
4.4304e-17
2.9253e-18
1.9315e-19
1.2753e-20
8.4205e-22
5.5598e-23
3.671e-24
2.4239e-25
1.6004e-26
1.0567e-27
6.9771e-29
4.6068e-30
3.0418e-31
>> x = A\c
x =
7029.1
653.25
60.709
5.642
0.52434
0.04873
0.0045287
0.00042087
3.9114e-05
3.635e-06
3.3782e-07
3.1395e-08
2.9177e-09
2.7116e-10
2.52e-11
2.342e-12
2.1765e-13
2.0227e-14
1.8798e-15
1.747e-16
1.6236e-17
1.5089e-18
1.4023e-19
1.3033e-20
1.21e-21
1.1339e-22
9.9766e-24
1.1858e-24
2.3902e-26
2.078e-26

SQL script works in 'posgres' schema, but not in other?

I am trying to dump an old database back into my system, but a weird error occurs.
When I login to 'postgres' database (psql posgtgres) the files would load, and the data would be also inserted,
CREATE TABLE
ALTER TABLE
but when I login to the database I created(psql EXAMPLE), it says
syntax error near "1"
Here is the part of the file that fails when I'm logged in to 'EXAMPLE' database :
...f92, f93, f94) FROM stdin;
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Please note that it is exactly the same file, and it was not edited before I connected to 'EXAMPLE' database.
Thank you in advance
EDIT:
Here is the copy command that gives the error.
COPY mmm_a1_0912_vorskla_02c (shot_num, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22, f23, f24, f25, f26, f27, f28, f29, f30, f31, f32, f33, f34, f35, f36, f37, f38, f39, f40, f41, f42, f43, f44, f45, f46, f47, f48, f49, f50, f51, f52, f53, f54, f55, f56, f57, f58, f59, f60, f61, f62, f63, f64, f65, f66, f67, f68, f69, f70, f71, f72, f73, f74, f75, f76, f77, f78, f79, f80, f81, f82, f83, f84, f85, f86, f87, f88, f89, f90, f91, f92, f93, f94) FROM stdin;
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
EDIT#2
FIXED:
Actually I initially assumed that every database you create will have a 'public' schema by default. My newly created database didn't. That was the issue.

Matlab: keeping non-zero matrix elements adjacent to each other and ignoring lone elements

Here is an example matrix (but the result shouldn't be constrained to only working on this):
a=zeros(7,7);
a(5,3:6)=1;
a(2,2)=1;
a(2,4)=1;
a(7,1:2)=1
a=
0 0 0 0 0 0 0
0 1 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 1 1 1 0
0 0 0 0 0 0 0
1 1 0 0 0 0 0
I want to get rid of all the 1's that are alone (the noise), such that I only have the line of 1's on the fifth row.
rules:
-the 1's are in 'connected lines' if there are adjacent 1's (including diagonally) e.g.:
0 0 0 1 0 0 1 0 1
1 1 1 0 1 0 0 1 0
0 0 0 0 0 1 0 0 0
(The connected lines are what I want to keep. I want to get rid of all the 1's that are not in connected lines, the connected lines can intersect each other)
the 'connected lines need to be at least 3 elements long. So in the 7x7 example, there would only be one line that matches this criteria. If a(7,3) was set to 1, then there would be a connected line at the bottom left also
I am currently looking at this through a column by column approach, and here is the first draft of my code so far:
for nnn=2:6
rowPoss=find(a(:,nnn)==1);
rowPoss2=find(a(:,nnn+1)==1);
for nn=1:length(rowPoss)
if myResult(rowPoss(nn)-1:rowPoss(nn)+1,n-1)==0 %
%then?
end
end
end
My difficulty is, during this column by column process, I'd have to enable a way to recognise the beginning of the connected line, the middle of the connected line, and when a connected line ends. The same rules for this, when applied to noise (the lone 1's), would just ignore the lone 1's.
The output I want is basically:
b=
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 1 1 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
If you have image processing toolbox, try bwareaopen
b = bwareaopen(a, 3);
Sample Run #1:
>> a
a =
0 0 0 0 0 0 0
0 1 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 1 1 1 0
0 0 0 0 0 0 0
1 1 0 0 0 0 0
>> b
b =
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 1 1 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
Sample Run #2:
>> a
a =
0 0 0 0 0 0 0
0 1 0 1 0 0 0
0 0 1 0 0 0 0
0 0 0 0 0 0 0
0 0 1 1 1 1 0
0 0 0 0 0 0 0
1 1 0 0 0 0 0
>> b
b =
0 0 0 0 0 0 0
0 1 0 1 0 0 0
0 0 1 0 0 0 0
0 0 0 0 0 0 0
0 0 1 1 1 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Flipping Around 5 percent of 1's and 0's

For the following letter, I wish to add noise to it by changing 5 percent of the 1's into 0's. So far, I have the following code which turns them all into 0's. Can someone please point me in the right direction? Thank you!
letterA = [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 ...
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 ...
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 ...
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ...
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ...
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0];
for i=1:numel(letterA)
if letterA(i)==1
letterA(i)=0;
end
end
disp(letterA)
try this:
letterA( letterA == 1 & rand(size(letterA)) <= 0.05 ) = 0;
In fact you could also do
letterA( rand(size(letterA)) <= 0.05 ) = 0;
which sets each element with probability of 5% to zero. The already zero elements are not affected. I think what causes confusion here is that you have to recognize that each element is independently handled from each other. It makes no difference if you do the first or the second version.
You can check it:
letterA = (rand(1e5,1) < 0.2); N1 = nnz(letterA);
letterA( rand(size(letterA)) <= 0.05 ) = 0;
(N1 - nnz(letterA))/N1
which gives values around 0.05, i.e. 5%. And it is not true what EitanT says, that it will flip at maximum 5%. It can be more than 5% or less, but on average it is 5%.
EitanTs version flippes exactly 5%, so which version to select depends on the application. For EitanT version the noise is correlated to the signal (because it is exact), which may or may not be what you want.
The basic approach is to find the indices of the 1's and count them, randomly pick a desired amount of indices out of them, and then operate on them:
one_flip_ratio = 0.05;
idx_ones = find(letterA == 1); %// Indices of 1's
flips = round(one_flip_ratio * numel(idx_ones)); %// Number of flips
idx_flips = idx_ones(randperm(numel(idx_ones), flips)); %// Indices of elements
letterA(idx_flips) = 0; %// Flip elements
This will flip 5% of the 1's to 0's.
Thanks for throwing out all these ideas, but eventually I came up with this. It will allow me to easily control both letter and background noise, which is what I intend to do. I'm just a novice, so this may not be the most efficient code, but it gets the job done! (I'm not looking for exactly 5%, the naked eye display value is what I'm more worried about.) PLEASE let me know how this can be improved! Thank you.
background_noise_intensity=0.05;
letter_noise_intensity=0.05;
for i=1:numel(letterA)
if letterA(i)==0
if rand < background_noise_intensity
letterA(i)=1;
end
elseif letterA(i)==1
if rand < letter_noise_intensity
letterA(i)=0;
end
end
end
noisy_letters=letterA;
reshaped_noisy_letters=reshape(noisy_letters,37,19)';
imshow(reshaped_noisy_letters);