Using SED on MAC (zsh) to get first jpg after marker string - sed

Please note: I found other GNU implementations of this, but they don't seem to work on a Mac. This question is specifically for macOS running zsh.
I'm trying to pipe some output into sed and use it to find the first jpg after a marker string.
Here is my sample .sh file:
Phrase="where is \"frankenstien\" tonight.jpg with my hamburger tomorrow.jpg"
echo $Phrase | sed 's/.*\frankenstien" \(.*\)jpg/\1/'
The marker string is "frankenstien" (WITH quotes). I would like the output to be:
tonight.jpg
But instead it's
tonight.jpg with my hamburger tomorrow.
So obviously the expression passed to sed is wrong. How should I write it so that it stops after the first jpg AND includes the ".jpg" in it? I found many examples online of similar things, but they did not work on a Mac running zsh. Can the same code work on Macs running bash? If you only get it to work on bash, that might be good enough.
Thanks!

If the first jpg immediately follows the marker string, you can modify your regex as shown below. It should work on any POSIX-compliant sed, since it does not use any GNU-specific constructs:
sed 's/.*\"frankenstien\" \([^ ]*\).*/\1/'
The regex above captures the text after the marker string up to the next space and discards the rest.
P.S. Note that the shell does not affect how sed interprets your regex. sed is a standalone binary that ships with your OS (GNU sed on Linux, BSD sed on macOS). A few features are supported in one and not in the other (GNU vs. *BSD), but the shell itself does not come into the picture here. For example, on macOS with zsh as the default shell, you can have both BSD sed (shipped by default) and GNU sed (installable using Homebrew).

how should I write it so that it stops after the first jpg AND includes the ".jpg" in it?
Match up until a space.
sed 's/.*frankenstien" \([^ ]*\) .*/\1/' <<<"$Phrase"
Handle tab also:
sed 's/.*frankenstien" \([^[:space:]]*\)[[:space:]].*/\1/' <<<"$Phrase"
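As a minimal sketch building on the answers above (assuming the sample phrase from the question written with plain ASCII quotes, that the token right after the marker is the jpg, and a hypothetical variable name first_jpg), the capture group can also end on the literal .jpg suffix, so it works even when nothing follows the file name:

```shell
# Sample input from the question, written with plain ASCII double quotes.
Phrase='where is "frankenstien" tonight.jpg with my hamburger tomorrow.jpg'

# [^[:space:]]* cannot cross whitespace, so the group is limited to the
# single whitespace-free token after the marker. POSIX BRE only, no GNU extensions.
first_jpg=$(printf '%s\n' "$Phrase" |
  sed 's/.*frankenstien" \([^[:space:]]*\.jpg\).*/\1/')

printf '%s\n' "$first_jpg"
```

If the pattern does not match at all, sed prints the line unchanged, so a real script may want to check the result before using it.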

Related

Run sha256sum (from Cygwin) on file with special character and blank (quoting does not work)

I have Cygwin installed in order to use Linux command line tools on Windows. I also added it to my PATH. In general, it works fine, but I observe this weird behavior:
I want to run sha256sum on the file C:\Users\s1504gl\Desktop\Täst .txt. Note the German umlaut ä and the space before the file extension. To avoid problems with paths, I always quote paths in command-line calls, such as:
sha256sum "C:\Users\s1504gl\Desktop\Täst .txt"
However, PowerShell returns
/usr/bin/sha256sum: '"C:\Users\s1504gl\Desktop\T'$'\303\244''st .txt"': No such file or directory
When I rename the file to either Täst.txt or Test .txt, it works. So the combination of the special character ä and the space seems to cause the problem. Replacing the double quotes with single quotes does not change anything in this case.
I am pretty sure it has to do with PowerShell, since the example works without any problems on my Linux machine.
Is there some other way of escaping special characters and/or blanks that I do not know of?
Run from Cygwin terminal
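sha256sum "/cygdrive/C/Users/s1504gl/Desktop/Täst .txt"
In general, Cygwin programs are not guaranteed to accept Windows paths, but they work reliably with POSIX paths. Inside double quotes the space needs no backslash; a backslash there would become part of the file name.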
I found the following workaround:
I create a temporary file from R containing all the necessary commands and then run this temporary file using bash, which is also included in Cygwin. This way, I avoid the problem caused by the different encodings used by Windows and by the Linux tools from Cygwin.

sed: matching unicode blocks with

I am desperately trying to replace certain Unicode characters (graphemes) in a file using sed. However, I keep failing for some of them, namely the ones from these Unicode blocks:
\p{InHigh_Surrogates}: U+D800–U+DB7F
\p{InHigh_Private_Use_Surrogates}: U+DB80–U+DBFF
\p{InLow_Surrogates}: U+DC00–U+DFFF
I tried (in a sed config file loaded via the -f switch):
s/\p{InHigh_Surrogates}/###/ --> no effect at all
s/\\p\{InHigh_Surrogates\}/###_D-NON-UTF8_###/ -> error message 'Invalid content of \{\}'
Anybody got a suggestion? Also, I am not necessarily focused on using the blocks - but I also failed trying to define a character range of the form \xd800-\xdfff.
Thanks,
Thomas
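Try using the -r flag for sed. Note that this pattern matches the literal text \p{InHigh_Surrogates} (as in the sample output below), not the characters in that Unicode block; sed does not support \p{...} property classes.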
$ sed -r 's/\\p\{InHigh_Surrogates\}/###/g' file
###: U+D800–U+DB7F
\p{InHigh_Private_Use_Surrogates}: U+DB80–U+DBFF
\p{InLow_Surrogates}: U+DC00–U+DFFF
From man sed:
-r, --regexp-extended
use extended regular expressions in the script.

sed task not working on macbook pro

I have not found a solution for the following stupid task.
I have a file whose complete path I denote with
file_name
and two strings which are stored in variables var1 and var2.
I know that the string in var2 is inside the file file_name. I want to find and replace all the occurrences of var2 in file_name with the string in var1.
These strings contain path names. This means I have the character / inside.
Furthermore my machine is a macbook pro.
Combining many suggestions found on the internet, I finally tried the following in a terminal:
sed -i "" -e "s:$var2:$var1:g" file_name
Result: file_name does not change. Any suggestion?
Is there a solution with awk?
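Macs use BSD sed, not GNU sed. BSD sed does not have the --in-place long option, and its -i option requires a backup-suffix argument (an empty string, as in -i "", means no backup). A portable alternative that works with any sed is to write the output to a temporary file and then move the new file into place after it is written.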
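The temporary-file approach can be sketched as follows. This is only an illustration under stated assumptions: the file contents and the two path values are made up, neither variable contains the `:` delimiter, and var2 is interpreted as a regular expression (so a `.` in it matches any character).

```shell
# Hypothetical sample data standing in for the asker's file and variables.
file_name=$(mktemp "${TMPDIR:-/tmp}/sed-paths.XXXXXX")
var2='/old/path'
var1='/new/path'
printf 'root=%s\nlog=%s/log\n' "$var2" "$var2" > "$file_name"

# Write the edited text to a temporary file, then replace the original
# only if sed succeeded. ':' is the delimiter so '/' needs no escaping.
tmp=$(mktemp "${TMPDIR:-/tmp}/sed-paths.XXXXXX")
sed "s:$var2:$var1:g" "$file_name" > "$tmp" && mv "$tmp" "$file_name"

cat "$file_name"
```

Because the replacement is a new file, its permissions are those of the temporary file rather than of the original.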

How do I run the sed command with input and output as the same file?

I'm trying to use the sed command in a shell script where I want to remove lines that read STARTremoveThisComment and lines that read removeThisCommentEND.
I'm able to do it when I copy it to a new file using
sed 's/STARTremoveThisComment//' > test
But how do I do this by using the same file as input and output?
sed -i (or the long form, --in-place) automates the process that less advanced implementations require you to do by hand: sending the output to a temporary file, then renaming that back to the original.
The -i is for in-place editing, and you can also provide a backup suffix for keeping a copy of the original:
sed -i.bak 's/STARTremoveThisComment//' fileToChange
sed --in-place=.bak 's/STARTremoveThisComment//' fileToChange
Both of those will keep the original file in fileToChange.bak.
Keep in mind that in-place editing may not be available in all sed implementations, but it is in GNU sed, which should be available on all variants of Linux, as per your tags.
If you're using a more primitive implementation, you can use something like:
cp oldfile oldfile.bak && sed 'whatever' oldfile >newfile && mv newfile oldfile
You can use the flag -i for in-place editing and the -e for specifying normal script expression:
sed -i -e 's/pattern_to_search/text_to_replace/' file.txt
To delete lines that match a certain pattern, you can use the simpler syntax. Notice the d command:
sed -i '/pattern_to_search/d' file.txt
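Applied to the question, a minimal sketch might look like this (the sample file is hypothetical, and the patterns assume each marker sits alone on its line). Attaching the suffix directly to -i, with no space, is a form that both GNU sed and BSD sed accept:

```shell
# Hypothetical sample file; the marker lines are the ones from the question.
f=$(mktemp "${TMPDIR:-/tmp}/sed-demo.XXXXXX")
printf '%s\n' 'STARTremoveThisComment' 'keep me' 'removeThisCommentEND' > "$f"

# Delete both kinds of marker line in place; the original is kept as "$f.bak".
sed -i.bak -e '/^STARTremoveThisComment$/d' -e '/^removeThisCommentEND$/d' "$f"

cat "$f"
```

The backup file can be removed afterwards once the result has been checked.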
You really should not use sed for that. This question seems to come up ridiculously often, and it seems very strange that it does since the general solution is so trivial. It seems bizarre that people want to know how to do it in sed, and in python, and in ruby, etc. If you want to have a filter operate on an input and overwrite it, use the following simple script:
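#!/bin/sh -e
# Usage: inline FILE COMMAND [ARGS...]
in=${1?No input file specified}
# Move the original aside as a hidden backup (unless $bak is already set).
mv "$in" "${bak=.$in.bak}"
shift
# Run the remaining arguments as a filter, reading the backup and writing the original name.
"$@" < "$bak" > "$in"
Put that in your path in an executable file named inline, and then the problem is solved in general. For example: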
inline input-file sed -e s/foo/bar/g
Now, if you want to add logic to keep multiple backups, or if you have some options to change the backup naming scheme, or whatever, you fix it in one place. What's the command line option to get 1-up counters on the backup file when processing a file in-place with perl? What about with ruby? Is the option different for gnu-sed? How does awk handle it? The whole friggin' point of unix is that tools do one thing only. Handling logic for backup files is a second thing, and needs to be factored out. If you are implementing a tool, do not add logic to create backup files. Tell your users to use a 2nd tool for that. Integration is bad. Modularity is good. That is the unix way.
Notice that this script has several problems. The permissions/mode of the input file may be changed, for example. I'm sure there are innumerable other issues. However, by putting the backup logic in a wrapper script, you localize all of these issues and don't have to worry that sed overwrites the files and changes mode, while python keeps the file in place and does not change the inode (I made up those two cases, the point being that not all tools will use the same logic, while the wrapper script will.)
As far as I know, it is not possible to use the same file for input and output with plain redirection. One solution is to write a shell script that saves the output to another file and then renames that file to the input file name, replacing the old input.
sed -e s/try/this/g input.file > output.file && mv output.file input.file
I suggest using sponge (from the moreutils package):
sponge reads standard input and writes it out to the specified file.
Unlike a shell redirect, sponge soaks up all its input before writing
the output file. This allows constructing pipelines that read from and
write to the same file.
cat test | sed 's/STARTremoveThisComment//' | sponge test

Unable to use SED to edit files fast

The file is initially
$cat so/app.yaml
application: SO
...
I run the following command. I get an empty file.
$sed s/SO/so/ so/app.yaml > so/app.yaml
$cat so/app.yaml
$
How can I use sed to edit the file without ending up with an empty file?
$ sed -i -e's/SO/so/' so/app.yaml
The -i means in-place.
The > redirection is set up by the shell before the command executes: the shell opens the output file and truncates it first. Thus, the input file is already empty by the time sed reads it. This is a problem with all shell redirection, not just with sed.
Sheldon Young's answer shows how to use in-place editing.
You are using the wrong tool for the job. sed is a stream editor (that's why it's called sed), so it's for in-flight editing of streams in a pipe. ed OTOH is a file editor, which can do everything sed can do, except it works on files instead of streams. (Actually, it's the other way round: ed is the original utility and sed is a clone that avoids having to create temporary files for streams.)
ed works very much like sed (because sed is just a clone), but with one important difference: you can move around in files, but you can't move around in streams. So, all commands in ed take an address parameter that tells ed, where in the file to apply the command. In your case, you want to apply the command everywhere in the file, so the address parameter is just , because a,b means "from line a to line b" and the default for a is 1 (beginning-of-file) and the default for b is $ (end-of-file), so leaving them both out means "from beginning-of-file to end-of-file". Then comes the s (for substitute) and the rest looks much like sed.
So, your sed command s/SO/so/ turns into the ed command ,s/SO/so/.
And, again because ed is a file editor, and more precisely, an interactive file editor, we also need to write (w) the file and quit (q) the editor.
This is how it looks in its entirety:
ed -- so/app.yaml <<-HERE
,s/SO/so/
w
q
HERE
See also my answer to a similar question.
What happens in your case, is that executing a pipeline is a two-stage process: first construct the pipeline, then run it. > means "open the file, truncate it, and connect it to filedescriptor 1 (stdout)". Only then is the pipe actually run, i.e. sed is executed, but at this time, the file has already been truncated.
Some versions of sed also have a -i parameter for in-place editing of files, which makes sed behave a little more like ed, but using that is not advisable: first of all, it doesn't support all the features of ed, but more importantly, it is a non-standard extension (not specified by POSIX) that is missing from, or behaves differently in, many non-GNU seds. It's been a while since I used a non-GNU system, but last I used one, neither Solaris nor OpenBSD nor HP-UX nor IBM AIX sed supported the -i parameter.
I believe that redirecting output into the same file you are editing is causing your problem.
You need to redirect standard output to a temporary file and, when sed is done, overwrite the original file with the temporary one.
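The truncation described in the answers above, and the temporary-file workaround, can be reproduced with a small sketch. The file here is a hypothetical stand-in for so/app.yaml, created in a temporary location:

```shell
# Hypothetical stand-in for so/app.yaml.
demo=$(mktemp "${TMPDIR:-/tmp}/app-yaml.XXXXXX")
printf 'application: SO\n' > "$demo"

# The shell truncates "$demo" before sed starts, so sed reads an empty file.
sed 's/SO/so/' "$demo" > "$demo"
[ -s "$demo" ] || echo "file is now empty"

# Workaround: write to a temporary file, then replace the original.
printf 'application: SO\n' > "$demo"
sed 's/SO/so/' "$demo" > "$demo.tmp" && mv "$demo.tmp" "$demo"
cat "$demo"
```

Using && means the original is only replaced when sed exits successfully.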