Linux shell script, parsing each line [closed] - sed

I am facing a problem with my shell script (I'm using sh):
I have a file with multiple lines, some of which contain mail addresses, for example:
abcd
plm
name_aA.2isurnamec#Text.com -> this is a line that checks the correct condition
random efgh
aaaaaa
naaame_aB.3isurnamec#Text.ro ->same (this is not part of the file)
I have used grep to filter the correct mail addresses like this:
grep -E '^[a-z][a-zA-Z_]*.[0-9][a-zA-Z0-9]+#[A-Z][A-Z0-9]{,12}.(ro|com|eu)$' file.txt
I have to write a shell script that checks the file and prints the following (for the above example it would look like this):
"Incorrect:" abcd
"Incorrect:" plm
"Correct:" name_aA.2isurnamec#Text.com
"Incorrect:" random efgh
"Incorrect:" aaaaaa
"Correct:" naaame_aB.3isurnamec#Text.ro
I want to solve this problem using grep or sed, while, if, pipes, etc.; I don't want to use lists or other things.
I have tried using something like this:
grep condition abc.txt | while read -r line ; do
    echo "Processing $line"
    # your code goes here
done
but it only prints the correct lines. I know that I can also print the lines that don't match the grep condition by using -v on grep, but I want to print the lines in the order in which they appear in the text file.
I'm having trouble parsing each line of the file, or maybe I don't need to parse the lines one by one; I really don't know how to solve it.
If you could help me, I would appreciate it.
Thanks

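#!/bin/bash
pattern='^[a-z][a-zA-Z_]*\.[0-9][a-zA-Z0-9]+#[A-Z][A-Za-z0-9]{,12}\.(ro|com|eu)$'
# Read stdin line by line; IFS= and -r keep whitespace and backslashes intact.
while IFS= read -r line; do
    if [ -n "$line" ]; then    # skip empty lines
        # -q: no output, only the exit status tells whether the line matched
        if printf '%s\n' "$line" | grep -E -q -- "$pattern"; then
            printf '"Correct:" %s\n' "$line"
        else
            printf '"Incorrect:" %s\n' "$line"
        fi
    fi
done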
Invoke it like this, assuming the bash script is called filter and the text file is called text.txt: ./filter < text.txt.
Note that the full stops in the regular expression are escaped and that the domain name can contain lowercase letters (although, I think that your regex is too restrictive). Other characters are not escaped because the string is in single quotes.
while reads the standard input line by line into $line; the first if skips the empty lines; the second one checks $line against $pattern (-q suppresses grep output).
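If a single pass with sed is preferred over a shell loop, the same labelling can be sketched as below. This is only an illustration under assumptions: the pattern is the one from the answer with {,12} spelled as {0,12}, the sample lines are taken from the question, and the variable names are made up for the example.

```shell
pattern='^[a-z][a-zA-Z_]*\.[0-9][a-zA-Z0-9]+#[A-Z][A-Za-z0-9]{0,12}\.(ro|com|eu)$'

# Sample input from the question, plus one empty line (dropped, as in the answer).
input='abcd
plm
name_aA.2isurnamec#Text.com

random efgh
naaame_aB.3isurnamec#Text.ro'

# Each -e is one script line: delete empty lines; on a match, prefix
# "Correct:" and branch to the end of the script; otherwise prefix "Incorrect:".
result=$(printf '%s\n' "$input" | sed -E \
    -e '/^$/d' \
    -e "/$pattern/{" \
    -e 's/^/"Correct:" /' \
    -e 'b' \
    -e '}' \
    -e 's/^/"Incorrect:" /')

printf '%s\n' "$result"
```

Because sed handles every line exactly once, the output keeps the order of the input file, which is what the question asks for.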

Related

Improving sed program - conditions

I use this code according to this question.
$ names=(file1.txt file2.txt file3.txt) # Declare array
$ printf 's/%s/a-&/g\n' "${names[@]%.txt}" # Generate sed replacement script
s/file1/a-&/g
s/file2/a-&/g
s/file3/a-&/g
$ sed -f <(printf 's/%s/a-&/g\n' "${names[@]%.txt}") f.txt
TEXT
\connect{a-file1}
\begin{a-file2}
\connect{a-file3}
TEXT
75
How can I add conditions that solve the following problem?
names=(file1.txt file2.txt file3file2.txt)
I mean that a word in one file name is repeated as part of another file name, so a- gets added more than once.
I tried
sed -f <(printf 's/{%s}/{s-&}/g\n' "${files[@]%.tex}")
but the result is
\input{a-{file1}}
I need to find {%s} and place a- between { and %s.
It's not clear from the question how to resolve conflicting input. In particular, the code will replace any instance of file1 with a-file1, even inside things like 'foofile1'.
On the surface, the goal seems to be to change whole tokens only (e.g., foofile1 should not be affected by the file1 substitution). This can be achieved by adding a word boundary assertion (\b) before and after the filename, which prevents the pattern from matching inside other, longer file names. Note that \b is a GNU sed extension.
printf 's/\\b%s\\b/a-&/g\n' "${names[@]%.txt}"
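Since the names in the sample input always sit inside braces, another option is to match the braces themselves instead of relying on \b. The following is a minimal sketch, assuming that layout. The temporary file names and the sample line \connect{file3file2} are invented for the illustration, and a plain for loop stands in for the array so that the snippet also runs in sh.

```shell
# Hypothetical sample input, modelled on f.txt from the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/f.txt" <<'EOF'
TEXT
\connect{file1}
\begin{file2}
\connect{file3file2}
TEXT
EOF

# Build one substitution per name. The braces anchor the match to a whole
# {name} token, so file2 cannot match inside {file3file2}.
for name in file1.txt file2.txt file3file2.txt; do
    base=${name%.txt}
    printf 's/{%s}/{a-%s}/g\n' "$base" "$base"
done > "$tmpdir/rules.sed"

result=$(sed -f "$tmpdir/rules.sed" "$tmpdir/f.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Each name is passed to printf twice because the replacement spells the name out instead of using &, which would also copy the braces.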
This explanation is too long for a comment, so I am adding an answer here. I am not sure whether my previous answer was clear, but it takes care of this case: it replaces exact file names only and NOT mixes of file names.
Let's say the following are the array value and Input_file:
names=(file1.txt file2.txt file3file2.txt)
echo "${names[*]}"
file1.txt file2.txt file3file2.txt
cat file1
TEXT
\connect{file1}
\begin{file2}
\connect{file3}
TEXT
75
Now when we run the following code:
awk -v arr="${names[*]}" '
BEGIN{
  FS=OFS="{"
  num=split(arr,array," ")
  for(i=1;i<=num;i++){
    sub(/\.txt$/,"",array[i])   # strip the .txt extension
    array1[array[i]"}"]         # key is the name plus the closing brace
  }
}
$2 in array1{
  $2="a-"$2
}
1
' file1
The output is as follows. You can see that file3 is NOT replaced, since it is NOT present in the array.
TEXT
\connect{a-file1}
\begin{a-file2}
\connect{file3}
TEXT
75

How to combine the output of regex and print in perl?

Hi, I have this search result,
"abc"
from
perl -lne 'print for /"name":"(.+?)"/g' file > newfile
and
"def"
from
perl -lne 'print for /"title":"(.+?)"/g' file > newfile
I'm trying to get the output as
abc:"def",
by combining both one liners. I tried with:
perl -lne 'print for /"name":"(.+?)","title":"(.+?)"/g' *.json > newfile11
but it didn't work.
I think I figured this out based on the input from your other question.
I'm assuming you have this as input:
{"card":{"cardName":"10AN10G","portSignalRates":["10AN10G-1-OTU2","10AN10G-1-OTU2E","10AN10G-1-TENGIGE","10AN10G-1-STM64"],"listOfPort":{"10AN10G-1-OTU2":{"portAid":"10AN10G-1-OTU2","signalType":"OTU2","tabNames":["PortDetails"],"requestType":{"PortDetails":"PTP"},"paramDetailsMap":{"PortDetails":[{"type":"dijit.form.TextBox","name":"signalType","title":"Signal Rate","id":"","options":[],"label":"","value":"OTU2","checked":"","enabled":"false","selected":""},{"type":"dijit.form.TextBox","name":"userLabel","title":"Description","id":"","options":[],"label":"","value":"","checked":"","enabled":"true","selected":""},{"type":"dijit.form.Select","name":"Frequency","title":"Transmit Frequency",}}}}}}
or at least a large text file containing lines of that type. You want to parse out the name and title from each line. You can do that with this one-liner.
Matt@MattPC ~/perl/testing/12
$ perl -ne 'if ( /"name":"([^"]+)","title":"([^"]+?)"/ ) { print $1 . ":\"" . $2 . "\",\n" }' input2.txt
which outputs:
signalType:"Signal Rate",
It works by capturing two groups in the regex, one for the name and one for the title. The -ne flags loop over each line of the file and execute the code between the single quotes. $1 and $2 are the captured groups, and they are printed at the end. Because of the if, only the first name/title pair on each line is printed.
Just as a tip, it is much easier to help you if you post your input, expected output, the errors you ran into, and the code you've tried when asking a question.
edit: just wanted to add a disclaimer that it is better to parse JSON with a module, because what if you have an escaped " within a title or name? This regex wouldn't pick it up, but JSON parsers can handle those types of cases for you.
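With the same caveat about escaped quotes, here is a shell sketch that prints every adjacent name/title pair on a line rather than only the first one. It assumes a grep that supports -o (GNU and BSD grep do); the sample line is a shortened, made-up stand-in for the real input.

```shell
# Hypothetical one-line sample in the same shape as the input above.
line='{"p":[{"type":"t","name":"signalType","title":"Signal Rate","id":""},{"type":"t","name":"userLabel","title":"Description","id":""}]}'

# grep -o prints each match on its own line; sed then rewrites it as name:"title",
result=$(printf '%s\n' "$line" |
    grep -oE '"name":"[^"]+","title":"[^"]+"' |
    sed -E 's/"name":"([^"]+)","title":"([^"]+)"/\1:"\2",/')

printf '%s\n' "$result"
```

Splitting the work in two keeps each regex short: grep isolates the pairs, and sed only has to reshape a string that is already known to match.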

How to find specific number patterns in a data file [closed]

I have a data file that looks like this
15105021
15105043
15106013
15106024
15106035
15105024
15105042
15106015
15106021
15106034
and I need to grep the lines that contain number sequences like 1510603 or 1510504.
I tried this awk command
awk /[1510603,1510504]/ soursefile.txt
but it does not work.
Using egrep with a word boundary on the left-hand side, since the OP wants to match any digits on the right-hand side:
egrep '\b(1510603|1510504)' file
15105043
15106035
15105042
15106034
A shorter awk:
awk '/1510603|1510504/' file
Based on the contents of your file the following should suffice
grep -E '^1510603|^1510504' file
If your grep version does not support the -E flag, try egrep instead of grep.
If you insist on awk
awk '/^1510603/ || /^1510504/' file
I think this works:
egrep '1510603|1510504' source
Your question is very poorly stated, but if you want to print all numbers in the file that begin with either 1510603 or 1510504, then you can write this in Perl
perl -ne 'print if /^1510(?:603|504)/' sourcefile.txt
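To see why the attempt in the question selects nothing useful, it helps to compare it with an anchored alternation on the sample data. The sketch below is only an illustration; the temporary file and the variable names are assumptions made for the example.

```shell
tmpfile=$(mktemp)
printf '%s\n' 15105021 15105043 15106013 15106024 15106035 \
              15105024 15105042 15106015 15106021 15106034 > "$tmpfile"

# [1510603,1510504] is a bracket expression: it matches any single character
# out of 0, 1, 3, 4, 5, 6 and ",", so every line of this file matches.
all=$(awk '/[1510603,1510504]/' "$tmpfile" | wc -l)

# Anchored alternation: only lines that start with one of the two prefixes.
result=$(grep -E '^(1510603|1510504)' "$tmpfile")

printf 'bracket expression matched %d lines\n' "$all"
printf '%s\n' "$result"
rm -f "$tmpfile"
```

The bracket expression keeps all ten lines, while the alternation keeps only the four lines that begin with 1510603 or 1510504.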

search some text in csv file [duplicate]

Possible Duplicate:
Search a value in CSV
I have created a Perl script.
Let's suppose I want to search for "abc" in a CSV file which contains äbcd. The script that I have written shows me abcd as output, which I don't actually want. Could anyone help with this?
The issue is searching for an ANSI string in a Unicode file. I think you can best answer your question by reviewing this regex tutorial, which points out an example similar to yours.
http://www.regular-expressions.info/unicode.html
I guess you need to match the exact word in Perl.
Use the regex below:
/\byour_word\b/
tested:
without \b
> echo 'abcd'|perl -lne 'if(/abc/){print}'
abcd
with \b
> echo 'abcd' | perl -lne 'if(/\babc\b/){print}'
>
> echo 'abc' | perl -lne 'if(/\babc\b/){print}'
abc
Looking at your code, you are doing this:
(grep /$curr/, @nodes)
so changing it to
(grep /\b$curr\b/, @nodes)
should work.
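The same whole-word idea can be tried from the shell before touching the Perl script. This is a small sketch, assuming a grep with the -w option (GNU and BSD grep have it); the CSV-like lines are made up for the example.

```shell
# Hypothetical CSV-like lines; only the whole word abc should be reported.
result=$(printf '%s\n' 'abcd,1' 'abc,2' 'x,abc,3' 'xabc,4' | grep -w 'abc')
printf '%s\n' "$result"
```

Commas count as non-word characters, so a field that is exactly abc matches, while abcd and xabc do not.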

I want to print a text file in columns

I have a text file which looks something like this:
jdkjf
kjsdh
jksfs
lksfj
gkfdj
gdfjg
lkjsd
hsfda
gadfl
dfgad
[very many lines, that is]
but would rather like it to look like
jdkjf kjsdh
jksfs lksfj
gkfdj gdfjg
lkjsd hsfda
gadfl dfgad
[and so on]
so I can print the text file on a smaller number of pages.
Of course, this is not a difficult problem, but I'm wondering if there is some excellent tool out there for solving problems like these.
EDIT: I'm not looking for a way to remove every other newline from a text file, but rather a tool which interprets text as "pictures" and then lays these out on the page nicely (by writing the appropriate whitespace symbols).
You can use this Python code.
columns = int(input("Enter number of columns: "))
row = []
with open("test.txt") as file:
    for line in file:
        row.append(line.rstrip("\n"))
        if len(row) == columns:
            print(" ".join(row))  # one output line per full row
            row = []
if row:  # print a final, incomplete row
    print(" ".join(row))
(Since you don't name your operating system, I'll simply assume Linux, Mac OS X or some other Unix...)
Your example looks like it can also be described by the expression "joining 2 lines together".
This can be achieved in a shell (with the help of xargs and awk) -- but only for an input file that is structured like your example (the result always puts 2 words on a line, irrespective of how many words each one contains):
cat file.txt | xargs -n 2 | awk '{ print $1" "$2 }'
This can also be achieved with awk alone (this time it really joins 2 full lines, irrespective of how many words each one contains):
awk '{ printf "%s ", $0; if ((getline) > 0) print; else print "" }' file.txt
Or use sed --
sed 'N;s#\n# #' < file.txt
Also, xargs could do it:
xargs -L 2 < file.txt
I'm sure other people could come up with dozens of other, quite different methods and commandline combinations...
Caveats: You'll have to test explicitly with files that have an odd number of lines. The last input line may not be processed correctly when the number of lines is odd.
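One more variant in the same spirit uses paste, which takes one line from standard input for every - argument. A minimal sketch using the first six sample lines from the question:

```shell
# paste reads two lines from stdin per output row and joins them with a space.
result=$(printf '%s\n' jdkjf kjsdh jksfs lksfj gkfdj gdfjg | paste -d ' ' - -)
printf '%s\n' "$result"
```

Adding more - arguments gives more columns per row. As with the other approaches, it is worth checking how the last row looks when the line count is not a multiple of the column count.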