Airflow bash operator: Problem using sed in hdfs

I have an Airflow task in which I am trying to use sed to replace LF with CRLF:
hdfs dfs -cat /test/file.txt | sed 's/$/\r/g' | hdfs dfs -put -f - /test/file.txt
I get the following error:
error: sed: -e expression #1, char 4: unterminated `s' command
I think the \r is what it conflicts with. How do I solve this problem?

I found the reason: \ is an escape character in Python string literals, so the \r never reaches sed as written.
To solve it I just added an extra \, so the command becomes sed 's/$/\\r/g'. Another option is to prefix the string with r to make it a raw string.
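To check the LF-to-CRLF conversion itself outside Airflow and HDFS, here is a minimal sketch. It hands sed a literal carriage return, so no layer has to interpret \r on the way. The sample input and the variable names cr and crlf_hex are made up for illustration.

```shell
# Build a literal carriage return once; command substitution only strips newlines.
cr=$(printf '\r')

# Append the CR to the end of every line, then dump the bytes as hex to make it visible.
crlf_hex=$(printf 'one\ntwo\n' | sed "s/\$/$cr/" | od -An -tx1 | tr -d ' \n')

echo "$crlf_hex"
```

Each 0a (LF) in the hex dump should now be preceded by 0d (CR).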

Related

sed command is not replacing value after first match

I am trying to add a line to a file using sed after the first match. I looked at many threads online addressing the same issue and tried the versions of the command below:
sudo sed -e '/cfg_file/cfg_file='$NEW_FILE -e '/cfg_file/cfg_file=/q' MAIN_CONF_FILE
sudo sed -i '0,/cfg_file/ a\cfg_file='$NEW_FILE $MAIN_CONF_FILE
sudo sed -i '1s/cfg_file/cfg_file='$NEW_FILE $MAIN_CONF_FILE
sudo sed -i '1/cfg_file/s/cfg_file='$NEW_FILE $MAIN_CONF_FILE
Unfortunately, nothing worked for me. They either show an error, as in attempt 3, or add a line after every match.
SAMPLE FILE
cfg_file=some_line1
cfg_file=some_line2
Now I want to add a line after the first match of cfg_file, like below.
EXPECTED RESULT
cfg_file=some_line1
cfg_file=my_added_line_after_first_match_only.
cfg_file=some_line2
Please help me add a line after the first match only and correct my command.
Since you're on Ubuntu, you are using GNU sed. GNU sed has some weird features and some useful ones; you should ignore the weird ones and use the useful ones.
In context, the useful one is ranges starting at line 0. Weird ones are the way it messes with a, i and c commands.
MAIN_CONF_FILE=/tmp/copy.of.main.config.file
NEWFILE="my_added_line_after_first_match_only."
sed -e '0,/^cfg_file=/ { /^cfg_file/ a\' \
-e "cfg_file=$NEWFILE" \
-e '}' \
"$MAIN_CONF_FILE"
In classic sed, the a command is followed by backslash-newline, and each subsequent line of the script is appended up to and including the first line without a backslash at the end (and the backslash is removed). Each -e argument functions as a line in the script. Distinguish between the shell lines escaped with backslash at the end and the sed script lines with backslash at the end.
Example at work
$ cat /tmp/copy.of.main.config.file
cfg_file=some_line1
cfg_file=some_line2
$ cat script.sh
MAIN_CONF_FILE=/tmp/copy.of.main.config.file
NEWFILE="my_added_line_after_first_match_only."
SED=/opt/gnu/bin/sed
${SED} \
-e '0,/^cfg_file=/ { /^cfg_file/ a\' \
-e "cfg_file=$NEWFILE" \
-e '}' \
"$MAIN_CONF_FILE"
$ bash script.sh
cfg_file=some_line1
cfg_file=my_added_line_after_first_match_only.
cfg_file=some_line2
$
This is based on your attempt 2, but avoids some of the weird stuff.
Basic sanity
As I noted, it is not sensible to experiment with sudo and the -i option to sed. You don't use those until you know that the script will do the job correctly. It is dangerous to do anything as root via sudo. It is doubly dangerous when you don't know whether what you're trying to use will work. Don't risk wrecking your system.
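In the spirit of testing on a copy first, the following sketch works on a scratch file and prints the result instead of editing anything in place. It uses awk rather than sed, purely as a portable stand-in, because the 0,/re/ range is a GNU extension. The scratch file, the sample data and the variable names are hypothetical.

```shell
# Scratch copy standing in for the real config file (hypothetical sample data)
scratch=$(mktemp)
printf 'cfg_file=some_line1\ncfg_file=some_line2\n' > "$scratch"
NEWFILE="my_added_line_after_first_match_only."

# Print every line; after the first line matching ^cfg_file=, print the new line once.
result=$(awk -v new="cfg_file=$NEWFILE" '
    { print }
    !seen && /^cfg_file=/ { print new; seen = 1 }
' "$scratch")

echo "$result"
rm -f "$scratch"
```

Only once the printed result looks right would it make sense to write it back to the real file.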

SED: unterminated `s' command at hyphen

I'm running the following in my provisioner
sed -i 's/DocumentRoot \/var\/www\/DocumentRoot \/var\/www\/app\/web-root\/\g' /etc/apache2/sites-available/000-default.conf
however I'm getting the error: sed: -e expression #1, char 69: unterminated 's' command - which is a hyphen (-) at that position. I've tried escaping it (\-) to no avail.
Any ideas?
Your line:
sed -i 's/DocumentRoot \/var\/www\/DocumentRoot \/var\/www\/app\/web-root\/\g ...
ends in \/\g. sed needs the form s/.../.../g, but you have escaped the last / before the g flag and, worse, the g flag itself, so the s command is never terminated. The / that should separate the pattern from the replacement is escaped too. This mistake alone is enough to make the command fail.
Better still, pick another delimiter when the pattern or replacement contains / (slash); it saves you dozens of backslashes:
sed -i 's#foo/bar/blah#foo1/bar1/blah1#g' file
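Applied to the paths from the question, the alternate delimiter might look like the sketch below. The sample line is a hypothetical stand-in for the contents of 000-default.conf, and the result is printed instead of edited in place.

```shell
# Hypothetical sample line standing in for a line of 000-default.conf
conf_line='DocumentRoot /var/www/'

# With "#" as the delimiter, the slashes in the paths need no escaping.
new_line=$(printf '%s\n' "$conf_line" |
    sed 's#DocumentRoot /var/www/#DocumentRoot /var/www/app/web-root/#g')

echo "$new_line"
```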

Sed remove matching lines script

I'm requesting help with a very simple script...
#!/usr/bin/sed -f
sed '/11,yahoo/d'
sed '/2506,stackover flow/d'
sed '/2536,reddit/d'
Just need it to remove three matches that account for 18408 in my file, data.csv
% sed -f remove.sed < data.csv
sed: 3: remove.sed: unterminated substitute pattern
Doing these same lines individually is no problem at all, so what am I doing wrong with this?
Using freeBSD 10.1 and its implementation of sed, if that matters.
This, being a sed script, should not have "sed" at each line.
Either change it to:
#!/usr/bin/sed -f
/11,yahoo/d
/2506,stackover flow/d
/2536,reddit/d
Or to
#!/bin/sh
sed -e '/11,yahoo/d' \
-e '/2506,stackover flow/d' \
-e '/2536,reddit/d'
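A quick way to try the script-file form is sketched below. The rows in data.csv are invented sample data (the 12,example row is there only to show that unmatched lines survive), and everything happens in a temporary directory.

```shell
workdir=$(mktemp -d)

# Sample rows; only the three named in the question should disappear.
printf '%s\n' '11,yahoo' '12,example' '2506,stackover flow' '2536,reddit' \
    > "$workdir/data.csv"

# A sed script holds bare sed commands, one per line, with no leading "sed".
printf '%s\n' '/11,yahoo/d' '/2506,stackover flow/d' '/2536,reddit/d' \
    > "$workdir/remove.sed"

remaining=$(sed -f "$workdir/remove.sed" < "$workdir/data.csv")
echo "$remaining"
rm -rf "$workdir"
```

Note that the patterns are unanchored, so /11,yahoo/ would also delete a row such as 111,yahoo.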

Sed on CentOS and FreeBSD

I have the following command
sed '/.*href="\(backup[^"]*tbz\)".*/!d;s//\1/;q'
which works on my CentOS install.
But when I try to run it on FreeBSD I get the following error:
sed: 1: "/.*href="\(backup[ ...": extra characters at the end of d command
(23) Failed writing body
What's wrong with this?
Thanks!
Try to run it like this:
sed '/.*href="\(backup[^"]*tbz\)".*/\!d;s//\1/;q'
Note the extra \ character escaping your !d command
Apparently FreeBSD sed doesn't like the semicolon as command separator. Try with multiple -e options instead:
sed -e '/.*href="\(backup[^"]*tbz\)".*/!d' -e 's//\1/' -e q
or perhaps newlines:
sed '/.*href="\(backup[^"]*tbz\)".*/!d
s//\1/
q'
(Yes, that's a long single-quoted string with two newlines in it.)
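The multiple -e form can be tried on a small piece of input, as in this sketch. The HTML lines are invented for illustration; the real input in the question presumably comes from a download.

```shell
# Hypothetical directory-listing HTML
html='<a href="index.html">index</a>
<a href="backup-2015.tbz">older</a>
<a href="backup-2016.tbz">newer</a>'

# Drop lines without a backup link, reduce the first matching line to the
# captured file name (the empty regex // reuses the previous one), then quit.
first_backup=$(printf '%s\n' "$html" |
    sed -e '/.*href="\(backup[^"]*tbz\)".*/!d' -e 's//\1/' -e q)

echo "$first_backup"
```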

sed error - unterminated substitute pattern

I am in a directory with files consisting of many lines like this:
98.684807 :(float)
52.244898 :(float)
46.439909 :(float)
and then a line that terminates:
[chuck]: cleaning up...
I am trying to eliminate :(float) from every file (but leave the number) and also remove that cleaning up... line.
I can get:
sed -ie 's/ :(float)//g' *
to work, but that also keeps copies of the old files. Removing the -e flag results in an unterminated substitute pattern error.
Same deal with:
sed -ie 's/[chuck]: cleaning up...//g' *
Thoughts?
sed -i '' -e 's/:(float)//' -e '/^.chuck/d' *
This way you are telling sed not to save a copy (a zero-length backup extension to -i) and specifying the sed commands separately.
sed -ie expression [files...]
is equivalent to:
sed -ie -e expression [files...]
and both mean apply expression to files, overwriting the files, but saving the old files with an "e" as the backup suffix.
I think you want:
sed -i -e expression [files...]
Now if you're getting an error from that there must be something wrong with your expression.
Your numbers are separated from (float) by the : character, so you can use awk or cut to extract them. It's simpler than a regex:
$ head -n -1 file | awk -F":" '{print $1}'
98.684807
52.244898
46.439909
$ head -n -1 file | cut -d":" -f1
98.684807
52.244898
46.439909
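Solution :
sed -i '' 's/ :(float)//g' *
sed -i '' '/^\[chuck\]: cleaning up\.\.\./d' *
(In the second command the brackets are escaped because an unescaped [chuck] is a bracket expression matching a single character, and d deletes the whole line instead of leaving it empty.)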
Explanation :
I can get:
sed -ie 's/ :(float)//g' *
to work, but that creates files that keeps the old files.
That's because sed's -i option is supposed to work that way:
-i extension
Edit files in-place, saving backups with the specified extension. If a zero-length extension is given, no backup will be saved.
In this case e is being interpreted as the extension you want to save your backups with. So all your original files will be backed up with an e appended to their names.
In order to provide a zero-length extension, you need to use -i ''.
Note: Unlike -i<your extension>, -i'' won't work, because the shell removes the empty quotes and sed sees only -i. You need a space between -i and '' for it to work.
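The backup-suffix behaviour is easy to see in a scratch directory. The sketch below is only an illustration with made-up file names. It also shows one spelling that, as far as I know, behaves the same on both GNU and BSD sed: attach a real suffix to -i and delete the backup afterwards. (The -i '' form above is BSD-specific; GNU sed expects any suffix to be attached directly to -i.)

```shell
workdir=$(mktemp -d)
printf '98.684807 :(float)\n' > "$workdir/a.txt"
printf '98.684807 :(float)\n' > "$workdir/b.txt"

# "-ie" is read as -i with backup suffix "e": a.txt is edited, a.txte keeps the original.
sed -ie 's/ :(float)//g' "$workdir/a.txt"

# An explicit suffix attached to -i, with the backup removed afterwards.
sed -i.bak 's/ :(float)//g' "$workdir/b.txt" && rm -f "$workdir/b.txt.bak"

listing=$(ls "$workdir")
a_now=$(cat "$workdir/a.txt")
b_now=$(cat "$workdir/b.txt")
echo "$listing"
rm -rf "$workdir"
```

The listing shows the stray a.txte backup next to a.txt, while b.txt has no backup left behind.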
Removing the -e flag results in an unterminated substitute pattern error.
When you remove the e immediately following -i, i.e.
sed -i 's/ :(float)//g' *
s/ :(float)//g will now be interpreted as the extension argument to the -i option, and the first file in the list produced by shell expansion of * is interpreted as a sed function (most probably an s/regular expression/replacement/flags function). You can verify this by checking the output of
sedfn=$(echo * | cut -d' ' -f1); [[ ${sedfn:0:1} == "s" ]]; echo $?
If the output of the above chain of commands is 0, our assumption is validated.
Also in this case, if somehow the first filename qualifies as a valid s/regular expression/replacement/flags sed function, the other filenames will be interpreted as regular files for sed to operate on.
sed -i -e 's/ :(float)//g' *
Check to see if you have any odd filenames in the directory.
Here is one way to duplicate your error:
$ touch -- "-e s:x:"
$ ls
-e s:x:
$ sed -i 's/ :(float)//g' *
sed: -e expression #1, char 5: unterminated `s' command
One way to protect against this is to use a double dash to terminate the options to sed when you use a wild card:
$ sed -i 's/ :(float)//g' -- *
You can do the same thing to remove the file:
$ rm "-e s:x:"
rm: invalid option -- 'e'
$ rm -- "-e s:x:"
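The protective effect of the double dash can be checked without touching real data. The sketch below uses a hypothetical file literally named -n in a temporary directory and prints the result instead of editing in place. It also shows the ./ prefix, a common alternative that keeps an expanded name from ever starting with a dash.

```shell
workdir=$(mktemp -d)

# Hypothetical awkward file name that looks like a sed option
printf '46.439909 :(float)\n' > "$workdir/-n"

# "--" ends option parsing, so the expanded name "-n" is opened as a file.
with_dashes=$(cd "$workdir" && sed 's/ :(float)//g' -- *)

# A "./" prefix on the glob protects in the same way.
with_prefix=$(cd "$workdir" && sed 's/ :(float)//g' ./*)

echo "$with_dashes"
echo "$with_prefix"
rm -rf "$workdir"
```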