Selecting records based on the first character - sed

I have a file which contains the following records
+aaaa
+bbbb
cccc-123
-dddd
eeee+789
-fff+456
ggg
Now I want to keep only the records whose first character is a "+" or "-" sign,
so the (new) file should look like this
+aaaa
+bbbb
-dddd
-fff+456
Can this be done via a grep or sed command ?

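Try this:
grep '^[+-]' myfile.txt
or
grep -E '^[+-]' myfile.txt
Depending on your flavour of grep. Do not escape the brackets: '^\[+-\]' would look for a literal [ and ] instead of a character class.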

This might work for you (GNU sed):
sed '/^[+-]/!d' file
or:
sed -n '/^[+-]/p' file
In the first solution: if the first character of the line is not + or - delete the line.
In the second solution: if the first character of the line is + or - print it.
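As a quick check of the two forms described above, here is a minimal sketch that runs both against the sample records from the question. The temporary file and the variable names are only for illustration.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' '+aaaa' '+bbbb' 'cccc-123' '-dddd' 'eeee+789' '-fff+456' 'ggg' > "$input"

# Form 1: delete every line that does NOT start with + or -.
kept_by_delete=$(sed '/^[+-]/!d' "$input")

# Form 2: suppress default output (-n) and print only the matching lines.
kept_by_print=$(sed -n '/^[+-]/p' "$input")

printf '%s\n' "$kept_by_delete"
rm -f "$input"
```

Both variables should hold the same four lines. The - is placed last inside the brackets so that it is taken literally rather than as a range.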

Related

sed: get a line number with regex and insert text at that line

I want to get the first line of a file that is not commented out with a hash, then add a line of text just after (or just before) that line.
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file   # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.
This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
Bail out of any lines beginning with a comment at the beginning of the file, append an extra line following the first line that is not a comment and keep fetching/printing all the remaining lines of the file.
Here's a way to do it with GNU sed without reading the file twice
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows the first address to be 0 if the other address is a regex; that way this works even if the first line matches the condition.
/^\s*[^#[:space:]]/ : sed doesn't support possessive quantifiers, so the character class has to exclude both # and whitespace to make sure the first non-blank character is not a #.
// is a handy shortcut that repeats the last regex.
a XYZ appends your required line (note that your question mentions insert, so if you want that, use i instead of a).
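The two-step idea from the question (find the line number, then append at that line) can also be made to work. In the attempt shown there, the outer single quotes stop the shell from running $(...), and the unquoted ! triggers history expansion in an interactive bash. A minimal sketch, assuming a small made-up sample file and using only POSIX sed syntax:

```shell
#!/bin/sh
# Hypothetical sample file: one comment line, then content.
file=$(mktemp)
printf '%s\n' '#comment' 'foo baz good' '123 456 7889' > "$file"

# Step 1: number of the first line that is not a comment.
# [[:space:]] is the POSIX spelling of GNU's \s.
n=$(sed -n -e '/^[[:space:]]*#/!{' -e '=' -e 'q' -e '}' "$file")

# Step 2: double quotes let the shell expand $n before sed sees the script.
# "a\" followed by a newline is the portable form of the append command.
result=$(sed "${n}a\\
extralinecontent" "$file")

printf '%s\n' "$result"
rm -f "$file"
```

This reads the file twice, which the single-pass answers above avoid, but it keeps the two commands the question started from.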

How to truncate the first digit of a number?

For example, my file has the following data:
$ cat sample.txt
19999119999,string1,dddddd
18888135790,string2,dddddd
15555555500,string3,dddddd
This is sample data. How can we remove ONLY the first digit from each row? My output should be:
$ cat output.txt
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
Is there any way to parse each line character wise using grep or sed?
Or any other way to get the desired output?
You just need to print from the second character on:
$ cut -c2- file
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
Or, using sed, remove the first char:
$ sed 's/^.//' file
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
Try this:
$ sed -r 's/^[0-9](.*)/\1/' sample.txt
Output:
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
^[0-9] - The first digit of each line
(.*) - The content of each line except the first digit
\1 - Refers to the content captured by (.*)
Sorry for my bad English.
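grep can solve this with a lookbehind. For that you need the -P option (PCRE, available in GNU grep):
grep -Po '(?<=^\d)(.+)' file
or in shorthand:
grep -Po '^\d\K.+' file
The (?<=^\d) part is a lookbehind that requires the first digit without including it in the match; in the shorthand, ^\d matches the digit and \K then drops it from the reported match.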
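The answers above differ in one detail: some remove the first character whatever it is, others only a leading digit. A small sketch of the difference, assuming a made-up input whose second line starts with a letter:

```shell
#!/bin/sh
# Hypothetical input: the second line does not start with a digit.
input=$(mktemp)
printf '%s\n' '19999119999,string1,dddddd' 'x8888135790,string2,dddddd' > "$input"

# cut drops the first character unconditionally.
any_char=$(cut -c2- "$input")

# This sed drops the first character only when it is a digit.
digit_only=$(sed 's/^[0-9]//' "$input")

printf '%s\n' "$any_char" "$digit_only"
rm -f "$input"
```

With cut the x is removed as well; with the sed form the second line is left unchanged.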

Remove all lines before a match with sed

I'm using sed to filter a list of files. I have a sorted list of folders and I want to get all lines after a specific one. To do this I'm using the solution described here, which works well with every input I tried, but it doesn't work when the match is on the first line. In that case sed removes all lines of the input.
Here is an example:
$ ls -1 /
bin
boot
...
sys
tmp
usr
var
vmlinuz
$ ls -1 / | sed '1,/tmp/d'
usr
var
vmlinuz
$ ls -1 / | sed '1,/^bin$/d'
# sed will delete all lines from the input stream
How should I change the command to also handle the limit case where the first line is matched by the regexp?
BTW sed '1,1d' works correctly and removes only the first line.
try this (GNU sed only):
sed '0,/^bin$/d'
..output is:
$ sed '0,/^bin$/d' file
boot
...
sys
tmp
usr
var
vmlinuz
This sed command will print all lines after and including the matching line:
sed -n '/^WHATEVER$/,$p'
The -n switch makes sed print only when told (the p command).
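If you don't want to include the matching line you can tell sed to delete from the start of the file to the matching line:
sed '1,/^WHATEVER$/d'
(We use the d command which deletes lines. Note that with 1 as the start address the regex is only tested from line 2 onward, so this does not handle a match on the first line.)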
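To make the first-line case concrete, here is a small sketch on a shortened, made-up listing. It shows the range form deleting everything, next to a portable awk flag that does not depend on GNU sed's 0 address. The variable names are illustrative only.

```shell
#!/bin/sh
# Shortened, hypothetical listing with the match on the first line.
list=$(mktemp)
printf '%s\n' 'bin' 'boot' 'sys' 'tmp' > "$list"

# With 1 as the start address, the end regex is first tested on line 2,
# so it never matches here and the range runs to the end of the input.
with_range=$(sed '1,/^bin$/d' "$list")

# awk flag sketch: a line is printed only if the flag was set by an
# earlier line, because the flag is set after the print test has run.
with_flag=$(awk 'seen; /^bin$/ { seen = 1 }' "$list")

printf 'range: [%s]\n' "$with_range"
printf '%s\n' "$with_flag"
rm -f "$list"
```

The range form leaves nothing, while the awk form keeps every line after the first match.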
you can also try with :
awk '/searchname/{p=1;next}{if(p){print}}'
EDIT(considering the comment from Joe)
awk '/searchname/{p++;if(p==1){next}}p' Your_File
I would insert a tag before a match and delete in scope /start/,/####tag####/.

Delete first and last line or record from file using sed

I want to delete first and last line from the file
file1 code :
H|ACCT|XEC|1|TEMP|20130215035845|
849002|48|1208004|1
849007|28|1208004|1
T|2
After delete the output should be
849002|48|1208004|1
849007|28|1208004|1
I have tried the method below, but I have to run sed twice. I want a one-liner that removes both in one go!
sed '1,1d' file1.txt >> file1.out
sed '$d' file1.out >> file2
Please suggest a one-liner.
You could use ;
sed '1d; $d' file
Use Command Separator
In sed, you can separate commands using a semicolon. For example:
sed '1d; $d' /path/to/file
How about:
sed '$d' < file1.txt | sed "1d"
Try sed -i '1d;$d' /path/to/file
awk 'NR>2{print v}{v=$0}'
Starting with line 3, print the previous line each time. This means the first and last lines will not be printed.
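A minimal sketch that runs the sed one-liner and the awk variant on the sample file from the question, so the two can be compared. The temporary file and the variable names are assumptions for illustration.

```shell
#!/bin/sh
# Sample file from the question: header, two records, trailer.
f=$(mktemp)
printf '%s\n' 'H|ACCT|XEC|1|TEMP|20130215035845|' \
    '849002|48|1208004|1' '849007|28|1208004|1' 'T|2' > "$f"

# One sed call: drop line 1 and the last line ($).
by_sed=$(sed '1d;$d' "$f")

# awk: from line 3 on, print the line saved on the previous iteration.
by_awk=$(awk 'NR > 2 { print prev } { prev = $0 }' "$f")

printf '%s\n' "$by_sed"
rm -f "$f"
```

Both variables should contain only the two record lines. The sed script is in single quotes so that the shell does not try to expand $d.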

Unix - Split to N files using regexp to name destination file

How do I split a file to N files using as a filename the first 2 chars on the line.
Ex input file:
AA23409234TEXT
BA23201202Other Text
AA23509234YADA
BA23202202More Text.
C1000000000000000000
Should generate 3 files:
AA.txt
AA23409234TEXT
AA23509234YADA
BA.txt
BA23201202Other Text
BA23202202More Text.
C1.txt
C1000000000000000000
I'm thinking of using a sed script similar to this
/^(..)/w \1
But what that really does is create a file named '\1' instead of the capture group.
Any ideas?
$ awk '{fname=substr($0, 1, 2) ".txt"; print >>fname}' input.txt
Or
$ while IFS= read -r line; do echo "$line" >>"${line:0:2}.txt"; done <input.txt
The first thing you need to do is determine all of your file names:
filenames=$(sed 's/\(..\).*/\1/' listOfStrings.txt | sort | uniq)
Then, loop through those filenames
for filename in $filenames
do
sed -n "/^$filename/ p" listOfStrings.txt > "$filename.txt"
done
I have not tested this, but I think it should work.
This might work for you:
sed 's/\(..\).*/echo "&" >>\1.txt/' file | sh
or if you have GNU sed:
sed 's/\(..\).*/echo "&" >>\1.txt/e' file
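For a self-contained way to try the splitting, here is a sketch of the awk approach run in a scratch directory on the sample input from the question. The directory and the out variable are assumptions for illustration. awk string positions start at 1, so the two-character prefix is substr($0, 1, 2).

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to inspect.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' 'AA23409234TEXT' 'BA23201202Other Text' 'AA23509234YADA' \
    'BA23202202More Text.' 'C1000000000000000000' > input.txt

# Append each line to <first two characters>.txt.
# Closing the file after each write avoids running out of open
# descriptors when there are many distinct prefixes.
awk '{
    out = substr($0, 1, 2) ".txt"
    print >> out
    close(out)
}' input.txt

ls
```

Unlike the variants that pipe generated echo commands to sh, this never passes the line contents through a shell, so quotes or other special characters in the data are written as-is.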