How to generate caption from img alt attribute - emacs

Is there a way to convert an img tag containing an alt attribute (in an HTML file),
<img src="pics/01.png" alt="my very first pic"/>
to an image link plus caption (org file),
#+CAPTION: my very first pic
[[pics/01.png]]
using pandoc?
I'm calling pandoc like this:
$ pandoc -s -r html index.html -o index.org
where index.html contains the img tag from above, but it doesn't add the caption in the output org file:
[[pics/01.png]]

Currently the Org Writer unfortunately throws away the image alt and title strings. Feel free to submit an issue or patch if there's a way to do alt text in Org.
You can also always write a filter to modify the doc AST and add the alt text to an additional paragraph.
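As a rough illustration of the filter idea, here is a minimal sketch of a JSON filter in Python. It assumes pandoc's JSON AST layout, where an Image node carries [attr, alt inlines, [src, title]], and it only looks at top-level paragraphs that contain a single image. The function names are made up for this example, and whether the extra paragraph is good enough as a caption in the Org output is something you would have to check yourself.

```python
import json


def add_alt_paragraphs(blocks):
    """Return a new block list in which every image-only paragraph is
    followed by a paragraph holding that image's alt text."""
    out = []
    for block in blocks:
        out.append(block)
        if block.get("t") in ("Para", "Plain"):
            inlines = block["c"]
            if len(inlines) == 1 and inlines[0].get("t") == "Image":
                alt_inlines = inlines[0]["c"][1]  # [attr, alt, target]
                if alt_inlines:
                    out.append({"t": "Para", "c": alt_inlines})
    return out


def run_filter(stdin, stdout):
    """Read a pandoc JSON document, add the alt paragraphs, write it back."""
    doc = json.load(stdin)
    doc["blocks"] = add_alt_paragraphs(doc["blocks"])
    json.dump(doc, stdout)

# To use this as a pandoc filter, call run_filter(sys.stdin, sys.stdout)
# at the bottom of the script (after "import sys").
```

Saved as, say, alt_filter.py (a hypothetical name), it could then be passed to pandoc with the --filter option.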

OP here. I didn't manage to make pandoc bend to my needs in this case. But a little bash scripting with some awk help does the trick.
The script replaces all img tags with org-mode equivalents plus captions. Pandoc leaves these alone when converting from html to org-mode.
The awk script,
# replace_img.awk
#
# Sample input:
# <img src="/pics/01.png" alt="my very first pic"/>
# Sample output:
# #+CAPTION: my very first pic
# [[/pics/01.png]]
BEGIN {
# Split the input at "
FS = "\""
}
# Replace all img tags with an org-mode equivalent.
/^<img src/{
print "#+CAPTION: " $4
print "[["$2"]]"
}
# Leave the rest of the file intact.
!/^<img src/
and the bash script,
# replace_img.sh
# Read the file names line by line so paths containing spaces survive.
find . -name "*.php" | while IFS= read -r file; do
    awk -f replace_img.awk "$file" > tmp && mv tmp "$file"
done
Place these files at the root of the project, chmod +x replace_img.sh and then run the script: ./replace_img.sh. Change the file extension in the script if needed. I had over 300 php files.
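For comparison, the same replacement can be sketched in Python. This is only an illustration, not part of the original answer: like the awk script it assumes double-quoted attributes with src before alt, but it also handles img tags that do not start at the beginning of a line. The names IMG_RE and img_to_org are made up for the example.

```python
import re

# Assumes the same attribute order as the awk script: src first, then alt.
IMG_RE = re.compile(r'<img\s+src="([^"]*)"\s+alt="([^"]*)"\s*/?>')


def img_to_org(text):
    """Replace every matching img tag with an org-mode caption and link."""
    def to_org(match):
        src, alt = match.group(1), match.group(2)
        return "#+CAPTION: {}\n[[{}]]".format(alt, src)
    return IMG_RE.sub(to_org, text)
```

Text that contains no matching img tag is returned unchanged, which mirrors the "leave the rest of the file intact" rule above.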


how do I prepend a string to stdin in the command line?

This is probably pretty basic, but I'm struggling with the command line. Suppose I want to turn a markdown file myDoc.md to a pdf file. Markedjs provides a command line tool to convert markdown to html, and wkhtmltopdf can convert html to pdf, so I have the command
marked myDoc.md | wkhtmltopdf - myDoc.pdf
That works, it generates the pdf. But the pdf is pretty ugly, I want to prepend a style section to the html before passing it to wkhtmltopdf. Yes I could put the style section in the markdown document, but I don't want to pollute the markup with this. I want to use marked to generate html, then prepend a style section, then feed that to wkhtmltopdf, without any intermediate files to clean up. Something like this pseudo code
myStyle="<style>
*{
font-family: arial;
}
h1{
text-align:center;
}
</style>"
marked myDoc.md | concatenatestrings myStyle - | wkhtmltopdf - myDoc.pdf
but where I'm having trouble is I don't know how to handle the multiline string for myStyle and finding something that does what the hypothetical concatenatestrings command does, taking a string from stdin, prepend myStyle, and output to stdout.
I would use a template file, then use a subshell to output the template and output of marked myDoc.md to stdout, then pipe the results to the rest of your chain.
So, let us create the template file...
template.html
<style>
*{
font-family: arial;
}
h1{
text-align:center;
}
</style>
...and use it
$ (cat template.html && marked myDoc.md) | wkhtmltopdf - myDoc.pdf
I haven't tested this with your command (I don't want to install marked just to test it), but have tested it with the following...
$ (echo ree && echo cola) | cat
ree
cola
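If you would rather keep everything in one small script, the hypothetical concatenatestrings step can be sketched in Python. This is a minimal sketch, not a tested part of the pipeline above; STYLE and prepend are names chosen for the example.

```python
# The multi-line style block from the question, kept in a plain string.
STYLE = """<style>
*{
font-family: arial;
}
h1{
text-align:center;
}
</style>
"""


def prepend(prefix, lines):
    """Yield the prefix first, then every line of the input unchanged."""
    yield prefix
    for line in lines:
        yield line
```

Calling sys.stdout.writelines(prepend(STYLE, sys.stdin)) at the bottom of such a script (with import sys at the top) would let it sit between marked and wkhtmltopdf in the pipe.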

Change one word in lots of HTML pages with another word from a list of words

I have about 2000 HTML pages (all pages have the same content except for the city name). I have the list of city names, and I need each page to have one city name.
How can I change the City name in each page?
city name list: birmingham
montgomery
mobile
huntsville
tuscaloosa
hoover.. etc...
and I need to make each page like this:
title: birmingham,
next page;
title: montgomery,
and so on.
I need the change to happen in the title:Example (City Name)
and in 2 other h2 tags.
Thank you very much for your attention!
Update:
This script is for the existing files. It recursively finds all the index.html files under the current directory and replaces the "string_to_replace" string with the name of each file's parent directory, which is the city name in your case. It also capitalizes that name before the replacement.
Feel free to update the template_string variable in the script so that it matches the string used in your index.html files in place of the city name.
#!/bin/bash
template_string="string_to_replace"
current_dir=$(pwd)
# Quote every expansion so directory names with spaces do not break the loop.
find "$current_dir" -name 'index.html' | while IFS= read -r file; do
    dir=$(basename "$(dirname "$file")")
    city="$(tr '[:lower:]' '[:upper:]' <<< "${dir:0:1}")${dir:1}"
    sed -i -e "s/$template_string/$city/g" "$file"
done
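The two steps the script performs (derive the city from the parent directory, then substitute it) can also be sketched in Python. This is an illustration only, assuming POSIX-style paths; city_from_path and replace_placeholder are hypothetical helper names.

```python
from pathlib import PurePosixPath


def city_from_path(path):
    """Return the parent directory name with its first letter upper-cased."""
    name = PurePosixPath(path).parent.name
    return name[:1].upper() + name[1:]


def replace_placeholder(html, city, placeholder="string_to_replace"):
    """Replace every occurrence of the placeholder with the city name."""
    return html.replace(placeholder, city)
```

Because str.replace treats the placeholder and the city as literal text, names containing characters such as / or & need no escaping, unlike in a sed expression.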
Initial answer:
My initial suggestion is to use a bash script (e.g. script.sh) similar to this:
#!/bin/bash
file="./cities.txt"
template="./template.html"
template_string="string_to_replace"
while IFS= read -r line
do
    cp "$template" "$line.html"
    sed -i -e "s/$template_string/$line/g" "$line.html"
    echo "$line"
done < "$file"
and run it from bash terminal:
$ source script.sh
What you need to have:
cities.txt with the list of city names, e.g.
London
Yerevan
Berlin
template.html with the html template you need to have in each file. Make sure the city name is set as "string_to_replace" in it, e.g. title: string_to_replace
Since you did not mention anything about the file names, the files will be named London.html, Yerevan.html, ...
Let me know in case you don't need to create new files, and need to make replacement in the existing ones. In this case we'll need to update the script a bit after you tell me how you know which string is going to be used in the exact file.
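The template approach can be sketched in Python as well. This is a minimal sketch under the same assumptions as the bash script (one city per line, a "string_to_replace" placeholder in the template); render_pages is a made-up name, and writing the resulting pages to disk is left out.

```python
def render_pages(template, cities, placeholder="string_to_replace"):
    """Map '<city>.html' to the template with the placeholder filled in."""
    pages = {}
    for city in cities:
        city = city.strip()
        if not city:
            continue  # skip blank lines in the city list
        pages[city + ".html"] = template.replace(placeholder, city)
    return pages
```

The returned dictionary can then be written out with one open(name, "w") call per entry.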

AutoHotKey - How to take a screenshot and to paste it to a *.jpg file?

How can I take a screenshot in Windows 8 and save it to a *.jpg file using an AutoHotKey script? I want to use my own key combination and my own folder for the images.
{CustomKey}:: ;image saved in Pictures/Screenshots folder by default
Send #{PrintScreen}
Return
The images are saved as png; they can then be converted to jpg and moved to another directory.
Tested in Win 10.
Might work in Win 8 too.
A similar question was asked here:
https://autohotkey.com/board/topic/63742-how-to-save-a-screen-shot-with-ahk/
The problem there was that the saved file was blank. In my case, the file turned out to be all grey.
The reason was that FileAppend saved it as a text file.
To save a screenshot taken with the PrintScrn button:
; Send {PrintScreen}
FileAppend %ClipboardAll%, FileName.raw, UTF-8
; This can be read back to memory with:
FileRead, Clipboard, *c FileName.raw
; image can also be converted/compressed to save space.
; I already had ffmpeg, so put this:
Run %ComSpec% /c "ffmpeg.exe -f rawvideo -pixel_format rgb32 -s 2256x1504 -i FileName.raw -vf hflip -vf vflip output.png"
Tested on Win 10.
You might try using the Gdip_All.ahk library. I found a maintained version of it at: https://github.com/mmikeww/AHKv2-Gdip/blob/master/Gdip_All.ahk
If you have IrfanView installed you can use this:
run('"C:\Program Files\IrfanView\i_view64.exe" /capture=0 "/convert=path_to_file.png" /jpgq=95')
You can also specify the format or quality and include the process name, the time or other details, for example:
dir := 'D:\screenshots'
name := winGetProcessName('A') ' ' a_YYYY '-' a_MM '-' a_DD ' ' a_hour '-' a_min '-' a_sec
format := 'png'
quality_jpeg := 95
run('"C:\Program Files\IrfanView\i_view64.exe" /capture=0 "/convert=' dir '\' name '.' format '" /jpgq=' quality_jpeg)
You can check the IrfanView command line options in the program help.
You can also search for other programs that can silently take screenshots using command line options. If you find another good one, let me know.
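If the screenshot tool is driven from something other than AutoHotkey, the same IrfanView command line can be assembled elsewhere. The following Python sketch only builds the argument list. It assumes the install path and the /capture, /convert and /jpgq options shown above, and it takes the process name as a parameter because looking it up is outside the scope of this example. build_capture_command is a hypothetical name.

```python
from datetime import datetime

IRFANVIEW = r"C:\Program Files\IrfanView\i_view64.exe"  # assumed install path


def build_capture_command(directory, process_name, when,
                          fmt="png", jpeg_quality=95, exe=IRFANVIEW):
    """Build the IrfanView argument list for a timestamped screenshot."""
    stamp = when.strftime("%Y-%m-%d %H-%M-%S")
    target = "{}\\{} {}.{}".format(directory, process_name, stamp, fmt)
    return [exe, "/capture=0", "/convert=" + target,
            "/jpgq={}".format(jpeg_quality)]
```

On Windows the list could be handed to subprocess.run, with datetime.now() passed as the when argument.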

substitution within a text file, using Applescript and sed

The question is a sequel to plain text URL to HTML code (Automator/AppleScript).
Suppose I have a plain txt file /Users/myname/Desktop/URLlist.txt:
title 1
http://a.b/c
title 2
http://d.e/f
...
I'd like to (1) convert all the URLs (http://...) to HTML code, and (2) add
<br />
to each empty line, so that the aforementioned content will become:
title 1
<a href="http://a.b/c">http://a.b/c</a>
<br />
title 2
<a href="http://d.e/f">http://d.e/f</a>
<br />
...
I came up with the following AppleScript:
set inFile to "/Users/myname/Desktop/URLlist.txt"
set middleFile to "/Users/myname/Desktop/URLlist2.txt"
set outFile to "/Users/myname/Desktop/URLlist3.txt"
do shell script "sed 's/\\(http[^ ]*\\)/<a href=\"\\1\">\\1<\\/a>/g' " & quoted form of inFile & " >" & quoted form of middleFile
do shell script "sed 's/^$/\\ <br \\/>/g' " & quoted form of middleFile & " >" & quoted form of outFile
It works, but it is redundant (and silly?). Could anyone make it more succinct? Can it be done involving only one text file instead of three (i.e. the original content in /Users/myname/Desktop/URLlist.txt is overwritten with the end result)?
Thank you very much in advance.
Try:
set inFile to "/Users/myname/Desktop/URLlist.txt"
set myData to (do shell script "sed '
/\\(http[^ ]*\\)/ a\\
<br />
' " & quoted form of inFile & " | sed 's/\\(http[^ ]*\\)/<a href=\"\\1\">\\1<\\/a>/g' ")
do shell script "echo " & quoted form of myData & " > " & quoted form of inFile
This will let you use the myData variable later in your script. If this is not part of a larger script and you are simply modifying your file, use the -i option as jackjr300 suggests. Also, this script looks for the original pattern and appends the new line to it rather than simply looking for empty lines.
EDIT:
set inFile to "/Users/myname/Desktop/URLlist.txt"
set myData to (do shell script "sed 's/\\(http[^ ]*\\)/<a href=\"\\1\">\\1<\\/a>/g; s/^$/\\ <br \\/>/g' " & quoted form of inFile)
do shell script "echo " & quoted form of myData & " > " & quoted form of inFile
Use the -i '' option to edit files in-place.
set inFile to "/Users/myname/Desktop/URLlist.txt"
do shell script "sed -i '' 's:^$:\\ <br />:; s:\\(http[^ ]*\\):<a href=\"\\1\">\\1</a>:g' " & quoted form of inFile
If you want a copy of the original file, use a specified extension like sed -i ' copy'
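For readers who find the sed quoting inside AppleScript hard to follow, the two substitutions can be sketched in Python. This is an illustration of the same transformation, not a drop-in replacement for the scripts above; to_html_lines is a made-up name, and the URL pattern is the same simple "http up to the next space" rule used in the sed commands.

```python
import re

URL_RE = re.compile(r"(http[^ ]*)")


def to_html_lines(text):
    """Turn URLs into anchors and empty lines into <br /> tags."""
    out = []
    for line in text.splitlines():
        if line == "":
            out.append("<br />")
        else:
            out.append(URL_RE.sub(r'<a href="\1">\1</a>', line))
    return "\n".join(out)
```

Processing line by line keeps the behaviour close to sed, which also never lets a pattern span a line break.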
--
Updated:
A DOCTYPE is a required preamble.
DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a different rendering mode that is incompatible with some specifications. Including the DOCTYPE in a document ensures that the browser makes a best-effort attempt at following the relevant specifications.
The HTML lang attribute can be used to declare the language of a Web page or a portion of a Web page. This is meant to assist search engines and browsers. According to the W3C recommendation you should declare the primary language for each Web page with the lang attribute inside the <html> tag.
The <meta> tag provides metadata about the HTML document. <meta> tags always go inside the <head> element.
The http-equiv attribute provides an HTTP header for the information/value of the content attribute.
content: the value associated with the http-equiv or name attribute.
charset: To display an HTML page correctly, the browser must know what character-set to use.
In this script I put "utf-8" as the encoding; change it to the encoding of your original file.
set inFile to "/Users/myname/Desktop/URLlist.html" -- text file with a ".html" extension
set nL to linefeed
set prepandHTML to "<!DOCTYPE html>\\" & nL & "<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en-US\" lang=\"en-US\">\\" & nL & tab & "<head><meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\" />\\" & nL & "</head>\\" & nL
do shell script "sed -i '' 's:^$:\\ <br />:; s:\\(http[^ ]*\\):<a href=\"\\1\">\\1</a>:g; 1s~^~" & prepandHTML & "~' " & quoted form of inFile
do shell script "echo '</html>' >> " & quoted form of inFile -- append the closing HTML tag
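The preamble described above can also be put together outside of sed. The following Python sketch builds the same header lines as the prepandHTML variable and appends the closing tag. The function names and the default values for lang and charset are assumptions for the example, and like the script above it does not add a <body> element.

```python
def html_preamble(lang="en-US", charset="utf-8"):
    """Return the DOCTYPE, the <html> opening tag and a minimal <head>."""
    return (
        "<!DOCTYPE html>\n"
        '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="{0}" lang="{0}">\n'
        '\t<head><meta http-equiv="content-type" '
        'content="text/html; charset={1}" />\n'
        "</head>\n"
    ).format(lang, charset)


def wrap_document(body, lang="en-US", charset="utf-8"):
    """Put the preamble before the body and the closing tag after it."""
    return html_preamble(lang, charset) + body.rstrip("\n") + "\n</html>\n"
```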
I can't understand sed commands very well (it makes my brain hurt) so here's the AppleScript way to do this task. Hope it helps.
set f to (path to desktop as text) & "URLlist.txt"
set emptyLine to " <br />"
set htmlLine1 to "<a href=\""
set htmlLine2 to "\">"
set htmlLine3 to "</a>"
-- read the file into a list
set fileList to paragraphs of (read file f)
-- modify the file as required into a new list
set newList to {}
repeat with i from 1 to count of fileList
set thisItem to item i of fileList
if thisItem is "" then
set end of newList to emptyLine
else if thisItem starts with "http" then
set end of newList to htmlLine1 & thisItem & htmlLine2 & thisItem & htmlLine3
else
set end of newList to thisItem
end if
end repeat
-- make the new list into a string
set text item delimiters to return
set newFile to newList as text
set text item delimiters to ""
-- write the new string back to the file overwriting its contents
set openFile to open for access file f with write permission
write newFile to openFile starting at 0 as text
close access openFile
EDIT: if you have trouble with the encoding these 2 handlers will handle the read/write properly. So just insert them in the code and adjust those lines to use the handlers. Good luck.
NOTE: when opening the file using TextEdit, use the File menu and open specifically as UTF-8.
on writeTo_UTF8(targetFile, theText, appendText)
try
set targetFile to targetFile as text
set openFile to open for access file targetFile with write permission
if appendText is false then
set eof of openFile to 0
write «data rdatEFBBBF» to openFile starting at eof -- UTF-8 BOM
else
tell application "Finder" to set fileExists to exists file targetFile
if fileExists is false then
set eof of openFile to 0
write «data rdatEFBBBF» to openFile starting at eof -- UTF-8 BOM
end if
end if
write theText as «class utf8» to openFile starting at eof
close access openFile
return true
on error theError
try
close access file targetFile
end try
return theError
end try
end writeTo_UTF8
on readFrom_UTF8(targetFile)
try
set targetFile to targetFile as text
targetFile as alias -- if file doesn't exist then you get an error
set openFile to open for access file targetFile
set theText to read openFile as «class utf8»
close access openFile
return theText
on error
try
close access file targetFile
end try
return false
end try
end readFrom_UTF8
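For comparison, the same read/write behaviour (UTF-8 with a byte-order mark at the start of a newly created file) can be sketched in Python. This is not part of the AppleScript answer; write_utf8 and read_utf8 are made-up names that mirror the two handlers above.

```python
import os


def write_utf8(path, text, append=False):
    """Write text as UTF-8; a new or truncated file starts with a BOM."""
    if append and os.path.exists(path):
        # Plain utf-8 here so no second BOM ends up in the middle of the file.
        with open(path, "a", encoding="utf-8", newline="") as fh:
            fh.write(text)
    else:
        # 'utf-8-sig' emits the EF BB BF byte-order mark before the text.
        with open(path, "w", encoding="utf-8-sig", newline="") as fh:
            fh.write(text)


def read_utf8(path):
    """Read a UTF-8 file, dropping a leading BOM if there is one."""
    with open(path, "r", encoding="utf-8-sig", newline="") as fh:
        return fh.read()
```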

How to extract strings from plist files for translation (localization)?

I need to prepare a list of strings for translation of my iPhone application.
I have extracted strings from *.m files using genstrings and from the XIB files using the ibtool command.
But I also have lots of text to translate in plist files (String field types enclosed in string tags).
Is there a nice bash script / command to extract those strings into a flat txt file?
I could then review and filter it so my translators can work with a nice list rather than an alien-looking XML file.
I made a custom shell script which tries to figure out the values needed. You can then use the localize.py script in a modified way (see below) to automatically create the translation files. (The line breaks were somehow very important.) If there are more entities to be translated, the shell script can be modified accordingly.
#!/bin/bash
rm -f $2
sed -n 'N;/<key>Title<\/key>/{N;/<string>.*<\/string>/{s/.*<string>\(.*\)<\/string>.*/\/* \1 *\/\
"\1" = "\1";\
/p;};}' $1 >> $2
sed -n 'N;/<key>FooterText<\/key>/{N;/<string>.*<\/string>/{s/.*<string>\(.*\)<\/string>.*/\/* \1 *\/\
\"\1" = "\1";\
/p;}
;}' $1 >> $2
sed -n 'N;/<key>Titles<\/key>/{N;/<array>/{:a
N;/<\/array>/!{
/<string>.*<\/string>/{s/.*<string>\(.*\)<\/string>.*/\/* \1 *\/\
\"\1" = "\1";\
/p;}
ba
;};};}' $1 >> $2
The localize.py script needed some modification. Therefore I created a small package containing the localizer for the source code and for the plist files. The new script even handles duplicates (meaning it will drop them).
We recently made a small online application to do that, please take a look on: http://www.icapps.be/plist-translator/
I can't think of any command off the top of my head. However, plists are glorified xml files and there are various parsers available for them.
It shouldn't be too difficult to create a simple Python script to get all the strings from the file.
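A minimal sketch of such a script, using the plistlib module from the Python standard library, could look like this. Note the assumptions: it collects every string value in the plist (not only those under particular keys such as Title), it skips duplicates, it writes entries in a "key" = "value"; layout with a comment line, and it does not escape quotes inside the strings. The function names are made up for the example.

```python
import plistlib


def collect_strings(node, found=None):
    """Walk a parsed plist and collect unique string values in order."""
    if found is None:
        found = []
    if isinstance(node, str):
        if node not in found:
            found.append(node)
    elif isinstance(node, dict):
        for value in node.values():  # keys are not translated
            collect_strings(value, found)
    elif isinstance(node, (list, tuple)):
        for item in node:
            collect_strings(item, found)
    return found


def to_strings_file(plist_bytes):
    """Render the collected strings as .strings-style entries."""
    strings = collect_strings(plistlib.loads(plist_bytes))
    return "".join('/* {0} */\n"{0}" = "{0}";\n\n'.format(s) for s in strings)
```

Reading the file with open(path, "rb").read() and passing the bytes to to_strings_file gives text you can review and filter before handing it to translators.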
Does this help?
http://www.icanlocalize.com/site/tutorials/how-to-translate-plist-files/
We much prefer paying clients who use our translation system with our translators, but you can translate yourself in our GUI at no charge.