I need to replace 400+ words with different hyperlinks in an .rtf or .docx document.
I've made a script using keystrokes (cmd+f, esc, etc.), but it takes forever and is not stable enough.
Using sed -i I'm able to replace the word itself, but not to turn it into a hyperlink. Is this possible?
set theFile to choose file
set original to "foo"
set substitute to "VG"
set newlink to "https://www.vg.no"
do shell script "sed -i '' \"s|" & original & "|" & substitute & "|g\" " & quoted form of (POSIX path of theFile)
Here is something specific to try:
First, create a rich text document with the word 'foo' as its content (in TextEdit, as Word's output is an abomination). Save the file in the appropriate place and run this:
set theFile to ((path to desktop) as text) & "slink3.rtf"
set qptf to quoted form of POSIX path of theFile
set origStr to "foo"
set subStr to "VG"
set newlink to "https://www.vg.no"
do shell script "sed -i '' -e 's|" & origStr & "|" & subStr & "|' -e 's|VG|{\\\\field{\\\\*\\\\fldinst{HYPERLINK \"https://www.vg.no/\"}}{\\\\fldrslt VG}}|' " & qptf
This expands to:
do shell script "sed -i '' -e 's|foo|VG|' -e 's|VG|{\\\\field{\\\\*\\\\fldinst{HYPERLINK \"https://www.vg.no/\"}}{\\\\fldrslt VG}}|' '/Users/username/Desktop/slink3.rtf'"
The actual shell command which runs is:
% sed -i '' -e 's|foo|VG|' -e 's|VG|{\\field{\\*\\fldinst{HYPERLINK "https://www.vg.no/"}}{\\fldrslt VG}}|' '/Users/username/Desktop/slink3.rtf'
What it does is first replace the string foo with the string VG, and then replace the string VG with VG as hyperlinked text.
NB: TextEdit has a preference to display the raw RTF upon opening a file rather than the formatted text. If you do this with a document containing a single word, the structure is relatively clear. I recommend against even looking at a Word-generated document.
If you do this, you will see that the raw RTF uses a single backslash. Both sed and AppleScript treat the backslash as an escape character, so each layer doubles it, which is why the script has \\\\.
Incidentally, I notice that you finish your sed command with the 'g' flag, which replaces every match on a line rather than only the first. If each word should be replaced only once per line, consider removing it.
Obviously, I don't know your entire workflow but hopefully this matches the section you have posted.
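With 400+ word/link pairs, one sed expression per word gets unwieldy. Below is a minimal Python sketch of the same idea, assuming the standard RTF field group {\field{\*\fldinst{HYPERLINK "url"}}{\fldrslt text}}. The names rtf_hyperlink and link_words and the shape of the links dictionary are my own, purely for illustration, and it has only been reasoned through against a tiny TextEdit-style sample, not a Word-generated file.

```python
import re

def rtf_hyperlink(url, text):
    # RTF field group: the instruction (HYPERLINK "url") plus the visible text.
    return '{\\field{\\*\\fldinst{HYPERLINK "%s"}}{\\fldrslt %s}}' % (url, text)

def link_words(rtf, links):
    # links maps each word to a (display text, URL) pair (hypothetical layout).
    # A single combined pattern means inserted URLs are never rescanned.
    words = sorted(links, key=len, reverse=True)
    pattern = re.compile(
        r'(?<![\\\w])(' + '|'.join(map(re.escape, words)) + r')(?!\w)')

    def repl(match):
        text, url = links[match.group(1)]
        return rtf_hyperlink(url, text)

    # The lookbehind skips RTF control words such as \foo.
    return pattern.sub(repl, rtf)

sample = r'{\rtf1\ansi foo and bar}'
print(link_words(sample, {'foo': ('VG', 'https://www.vg.no/')}))
```

Doing everything in one pass avoids a problem the chained sed approach can run into: a later substitution matching text inside a hyperlink that an earlier one inserted.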
This is the command working for me:
sed "50,99999{/^\s*printf/d}" the_file
So this command deletes every line between 50 and 99999 on which printf is preceded only by whitespace.
Now my questions are:
How can I replace 99999 with some meta symbol that stands for the last line of the file?
I tried sed "50,${/^\s*PUTS/d}" the_file, but it does not work.
How can I replace "printf" with an environment variable? I tried
set pattern printf
sed "50,99999{/^\s*$pattern/d}" the_file
but it does not work.
Assuming a Bourne-like shell such as bash:
Simply define shell variables and splice them into your sed command string:
endLine=99999
pattern='printf'
sed '50,'"$endLine"'{ /^\s*'"$pattern"'/d; }' the_file
Note that the static parts of the sed command string are single-quoted, as that protects them from interpretation by the shell (which means you needn't escape $ and `, for instance).
You can put everything into a single double-quoted string so as to be able to embed variable references directly, but distinguishing between what the shell will interpret up front and what sed will interpret can get confusing quickly.
That said, using a double-quoted string for the case at hand is simple:
sed "50,$endLine { /^\s*$pattern/d; }" the_file
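For comparison, here is a rough Python sketch of the same range-limited delete, where "to the end of the file" needs no sentinel line number. The function name delete_matching is made up for this example, and unlike sed it treats the word as literal text rather than as a regular expression.

```python
import re

def delete_matching(lines, start, word):
    # Roughly what sed 'START,${/^\s*WORD/d}' does: from line `start`
    # (1-based) to the end, drop lines whose first non-blank text is `word`.
    # Assumption: `word` is literal text, hence re.escape.
    pattern = re.compile(r'^\s*' + re.escape(word))
    return [line for number, line in enumerate(lines, 1)
            if number < start or not pattern.match(line)]

lines = ['printf a', '  printf b', 'x printf', 'printf c']
print(delete_matching(lines, 2, 'printf'))
```

Because the pattern is passed as an ordinary function argument, none of the shell quoting questions above arise.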
sed "50,${/^\s*PUTS/d}" the_file
This line won't work: inside double quotes the shell tries to expand ${...} itself, so you need to escape the dollar sign (\$) or use single quotes: '50,${/.../d}' file
sed "50,99999{/^\s*$pattern/d}" file
this line should work.
EDIT
Wait, I just noticed that you set the variable via set ..., which is not correct in Bash. You should use an assignment such as export PAT="PUT" in your script.
See the comments by @Jonathan and @tripleee.
I have a huge list of locations in this form in a text file:
ar,casa de piedra,Casa de Piedra,20,,-49.985133,-68.914673
gr,riziani,Ríziani,18,,39.5286111,20.35
mx,tenextepec,Tenextepec,30,,19.466667,-97.266667
Is there any way with command line to remove everything that isn't between the first and second commas? For example, I want my list to look like this:
casa de piedra
riziani
tenextepec
with Perl
perl -F/,/ -ane 'print $F[1]."\n"' file
Use cut(1):
cut -d, -f2 inputfile
With perl:
perl -pe 's/^.*?,(.*?),.*/$1/' filename
Breakdown of the above code
perl - the command to use the perl programming language.
-pe - flags.
e means "run this as perl code".
p means:
1. Set $_ to the first line of the file (given by filename)
2. Run the -e code
3. Print $_
4. Repeat from step 1 with the next line of the file
what -p actually does behind the scenes is best explained here.
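s/^.*?,(.*?),.*/$1/ is a regular-expression substitution:
s/pattern/replacement/ looks for pattern in $_ and replaces it with replacement
^ anchors the match at the start of the line
.*? basically means "anything, but as little as possible", so it stops at the first comma it can (it's more complicated than that but outside the scope of this answer)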
, is a comma (nothing special)
() capture whatever is in them and save it in $1
.* is another (slightly different) "anything" (this time it's more like "everything")
$1 is what we captured with ()
so the whole thing basically says to search in $_ for:
anything
a comma
anything (save this bit)
another comma
everything
and replace it with the bit it saved. This effectively saves the stuff between the first and second commas, deletes everything, and then puts what it saved into $_.
filename is the name of your text file
To review, the code goes through your file line by line, applies the regular expression to extract your needed bit, and then prints it out.
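The difference between the two kinds of "anything" is easiest to see side by side. The sketch below uses Python's re module rather than perl, purely as an illustration; for this pattern the lazy and greedy quantifiers behave the same way in both.

```python
import re

line = "ar,casa de piedra,Casa de Piedra,20,,-49.985133,-68.914673"

# Lazy .*? stops at the first comma it can, so the group is field two.
lazy = re.sub(r"^.*?,(.*?),.*", r"\1", line)

# Greedy .* grabs as much as it can, so the group slides to the right.
greedy = re.sub(r"^.*,(.*),.*", r"\1", line)

print(lazy)    # casa de piedra
print(greedy)  # -49.985133
```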
If you want the result in a file, use this:
perl -pe 's/^.*?,(.*?),.*/$1/' filename > out.txt
and the result goes into a file named out.txt, created in the terminal's current working directory. The > tells the shell to write the command's output to a file instead of to the screen.
Also, if it isn't crucial to use the command line, you can just import into Excel (it's in CSV format) and work with it graphically.
With awk:
$ awk -F ',' '{ print $2 }' file
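cut, awk and the regex all split on every comma. If the file could ever contain quoted fields with commas inside them (an assumption; the sample shown does not), a CSV parser is safer. A small Python sketch, with second_fields as a made-up helper name:

```python
import csv
import io

def second_fields(text):
    # csv.reader honours quoting, so "a, b" inside quotes stays one field.
    return [row[1] for row in csv.reader(io.StringIO(text)) if len(row) > 1]

sample = (
    "ar,casa de piedra,Casa de Piedra,20,,-49.985133,-68.914673\n"
    "gr,riziani,Ríziani,18,,39.5286111,20.35\n"
    "mx,tenextepec,Tenextepec,30,,19.466667,-97.266667\n"
)
for name in second_fields(sample):
    print(name)
```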
Suppose I have a text file with content like below:
'Jack', is a boy
'Jenny', is a girl
...
...
...
I'd like to use perl on the command line to capture only the names between the pairs of single quotes:
cat text| perl -ne 'print $1."\n" if/\'(\w+?)\'/'
The command above is what I ran, but it didn't work. It seems the ' characters confuse the shell.
I know we have other options like writing a perl script. But given my circumstances, I'd like to find a way to fulfill this in Shell command line.
Please advise.
The shell has the interesting property of concatenating quoted strings. Or rather, '...' or "..." should not be considered strings, but modifiers for available escapes. The '...'-surrounded parts of a command have no escapes available. Outside of '...', a single quote can be passed as \'. Together with the concatenating property, we can embed a single quote like
$ perl -E'say "'\''";'
'
into the -e code. The first ' exits the no-escape zone, \' is our single quote, and ' re-enters the escapeless zone. What perl saw was
perl // argv[0]
-Esay "'"; // argv[1]
This would make your command
cat text| perl -ne 'print $1."\n" if/'\''(\w+?)'\''/'
(quotes don't need escaping in regexes), or
cat text| perl -ne "print \$1.qq(\n) if/'(\w+?)'/"
(using double quotes to surround the command, but using qq// for double quoted strings and escaping the $ sigil to avoid shell variable interpolation).
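If the command is being assembled by a program anyway, the quoting can be generated rather than typed. As an illustration only, Python's shlex.quote applies the same "leave the single quotes, emit a quote, re-enter" trick (it writes the embedded quote as '"'"' instead of '\''), and shlex.split shows what a POSIX shell would hand to perl:

```python
import shlex

# The perl one-liner exactly as perl should receive it.
perl_code = r'''print $1."\n" if /'(\w+?)'/'''

quoted = shlex.quote(perl_code)
command = "perl -ne " + quoted + " text"
print(command)

# Splitting with POSIX shell rules recovers the original one-liner intact.
print(shlex.split(command))
```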
Here are some methods that do not require manually escaping the perl statement:
(Disclaimer: I'm not sure how robust these are – they haven't been tested extensively)
Cat-in-the-bag technique
perl -ne "$(cat)" text
You will be prompted for input. To terminate cat, press Ctrl-D.
One shortcoming of this: The perl statement is not reusable. This is addressed by the variation:
pline=$(cat)
perl -ne "$pline" text
The bash builtin, read
Multiple lines:
read -rd'^[' pline
Single line:
read -r pline
Reads user input into the variable pline.
The meaning of the switches:
-r: stop read from interpreting backslashes (e.g. by default read interprets \w as w)
-d: determines what character ends the read command.
^[ is the character corresponding to Esc, you insert ^[ by pressing Ctrl-V then Esc.
Heredoc and script.
(You said no scripts, but this is quick and dirty, so might as well...)
cat << 'EOF' > scriptonite
print $1 . "\n" if /'(\w+)'/
EOF
then you simply
perl -n scriptonite text
I have an AppleScript to find and replace a number of strings. Some time ago I ran into the problem of a replacement string which contained &, but could get around it by putting \& in the replacement property list. However, an apostrophe seems to be far more annoying.
Using a single apostrophe just gets ignored (the replacement doesn't contain it), using \' gives a syntax error (Expected “"” but found unknown token.), and using \\' gets ignored again. (You can keep going, by the way: an even number of backslashes gets ignored, an odd number gives a syntax error.)
I tried replacing the apostrophe in the actual sed command with double quotes (sed "s…" instead of sed 's…'), which works in the command line, but gives a syntax error in the script (Expected end of line, etc. but found identifier.)
The single quotes mess with the shell, the double quotes with applescript.
I also tried '\'' as was suggested here and '"'"' from here.
Basic script to get the type of errors:
set findList to "Thats.nice"
set replaceList to "That's nice"
set fileName to "Thats.nice.whatever"
set resultFile to do shell script "echo " & fileName & " | sed 's/" & findList & "/" & replaceList & " /'"
Try:
set findList to "Thats.nice"
set replaceList to "That's nice"
set fileName to "Thats.nice.whatever"
set resultFile to do shell script "echo " & quoted form of fileName & " | sed \"s/Thats.nice/That\\'s nice/\""
or to stick to your example:
set findList to "Thats.nice"
set replaceList to "That's nice"
set fileName to "Thats.nice.whatever"
set resultFile to do shell script "echo " & quoted form of fileName & " | sed \"s/" & findList & "/" & replaceList & "/\""
Explanation:
The sed statement is usually enclosed by single quotes like this:
set myText to "Hello"
set xxx to do shell script "echo " & quoted form of myText & " | sed 's/ello/i/'"
However, in this example you could have excluded the single quotes altogether.
set myText to "Hello"
set xxx to do shell script "echo " & quoted form of myText & " | sed s/ello/i/"
The unquoted sed statement will break as soon as a space is included.
set myText to "Hello"
set xxx to do shell script "echo " & quoted form of myText & " | sed s/ello/i there/"
--> error "sed: 1: \"s/ello/i\": unterminated substitute in regular expression" number 1
Since you can't include an apostrophe within a single quoted statement (even if you escape it), you can enclose the sed statement in double quotes like this:
set myText to "Johns script"
set xxx to do shell script "echo " & quoted form of myText & " | sed \"s/ns/n's/\""
EDIT
Lauri Ranta makes a good point: if your find or replace string contains escaped double quotes, my answer won't work. Lauri Ranta's solution is as follows:
set findList to "John's"
set replaceList to "\"Lauri's\""
set fileName to "John's script"
set resultFile to do shell script "echo " & quoted form of fileName & " | sed s/" & quoted form of findList & "/" & quoted form of replaceList & "/"
I'd also use text item delimiters. In the default scope you don't have to write "AppleScript's text item delimiters", and you don't have to set the property back if it isn't used later.
set input to "aasearch"
set text item delimiters to "search"
set ti to text items of input
set text item delimiters to "replace"
ti as text
There's no easy way to escape the search or replace patterns if they can contain something that would be interpreted by sed.
set input to "a[a"
set search to "[a"
set replace to "b"
do shell script "sed s/" & quoted form of search & "/" & quoted form of replace & "/g <<< " & quoted form of input
If you have to use regular expressions, scripting languages like Ruby have methods for creating patterns from strings.
set input to "aac"
set search to "(a+)"
set replace to "\\1b"
do shell script "ruby -KUe 'print STDIN.read.chomp.gsub(Regexp.new(ARGV[0]), ARGV[1])' " & quoted form of search & " " & quoted form of replace & " <<< " & quoted form of input without altering line endings
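Ruby is not the only option for building a pattern from a plain string. As a sketch only (replace_literal is a made-up name, and you would still need to call it from do shell script), Python's re.escape does the same job:

```python
import re

def replace_literal(text, search, replacement):
    # re.escape neutralises regex metacharacters in `search`; passing the
    # replacement through a function keeps backslashes and & literal too.
    return re.sub(re.escape(search), lambda _match: replacement, text)

print(replace_literal("a[a", "[a", "b"))
print(replace_literal("Thats.nice.whatever", "Thats.nice", "That's nice"))
```

The apostrophe needs no special treatment here because the strings never pass through a shell.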
How to force echo command to output a tab character in MS nmake makefile?
Verbatim tabs inserted right into a string after echo command are removed by nmake and don't show up in the output file.
all :
    @echo I WANT TO OUTPUT THE <TAB> CHARACTER HERE! > output.txt
You can use a variable that holds the TAB character. Put these lines in your .bat file:
set tab=[press the TAB key here]
echo a%tab%b>yourfile.txt
After that, yourfile.txt will contain the text a[TAB]b
As a workaround, you can create a file containing the tab character, named input.txt (not using nmake), and then say:
all :
    @copy /b input.txt output.txt
I assume you already have tried to put the tab inside quotes?
all:
    @echo "<TAB>" > output.txt
DOS and Windows have ugly text support in native batch files :).
Here is a nice way to do your task:
install a Python interpreter
write a simple script which appends the character with a specified code to a file
call the script wherever you want :)
Simple script:
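'''
append_char.py - appends the character with the specified code to the end of a file
Usage: append_char.py filename charcode
'''
import os
import sys

filename = sys.argv[1]
assert os.path.exists(filename)
charcode = int(sys.argv[2])
assert 0 <= charcode <= 255
# Append mode always writes at the end of the file, so no seek is needed.
with open(filename, 'ab') as fh:
    # bytes([n]) is a single raw byte on Python 3; chr(n) would be a str
    # and cannot be written to a file opened in binary mode.
    fh.write(bytes([charcode]))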
Using this script from a batch file, you can create any possible file in the universe :)
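For a quick check without involving nmake at all, the same idea can be wrapped in a function. This is only a sketch for Python 3; append_byte and demo are names invented for this example. Character code 9 is the tab.

```python
import os
import tempfile

def append_byte(path, charcode):
    # Append one raw byte (0..255) to the end of the file at `path`.
    if not 0 <= charcode <= 255:
        raise ValueError("charcode must be in 0..255")
    with open(path, "ab") as fh:
        fh.write(bytes([charcode]))

def demo():
    # Build "a<TAB>b" in a temporary file and return its raw contents.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as fh:
            fh.write(b"a")
        append_byte(path, 9)
        append_byte(path, ord("b"))
        with open(path, "rb") as fh:
            return fh.read()
    finally:
        os.remove(path)

print(demo())
```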
output.txt:
<<output.txt
I WANT TO OUTPUT THE <TAB> CHARACTER HERE!
<<KEEP
<TAB> represents a literal tab character here of course.
I had the same need. I used the answer using the quotes around the character and just took it one step further.
{tab} means pressing the keyboard tab key in the text editor.
SET tab="{tab}"
SET tab=%tab:~1,1%
The second line extracts the middle character from the quoted string.
Now you can use the %tab% variable with ECHO and, I suspect, anywhere else it's needed.
ECHO %tab%This text is indented, preceded by a tab.
I suspect this technique could be used with other problem characters as well.