Faster Emacs directory walker - emacs

Walking a directory tree in Emacs using the cookbook recipe (http://www.emacswiki.org/emacs/ElispCookbook#toc59), or the solution at Walk up the directory tree is quite slow.
Could one use Unix's find instead, via shell-command or call-process, and perform a funcall on the returned list?
Are there any cons to that idea (perhaps too much memory consumption for large trees?), and what would be the idiomatic way to do that in elisp, i.e. calling find with some given arguments and mapping a funcall on the returned value?
One possible benefit I can see is that the shell process could be launched asynchronously, without Emacs stopping at all when the process is started.

Yes, of course you can call find via call-process and then split the result line by line. Note that walk-path can be made significantly more efficient by the use of file-name-all-completions in place of directory-files and file-directory-p:
(defun my-walk-directory (dir action)
  "walk DIR executing ACTION with (dir file).
DIR needs to end with /."
  (dolist (file (let ((completion-ignored-extensions nil)
                      (completion-regexp-list nil))
                  (file-name-all-completions "" dir)))
    (cond
     ((member file '("./" "../")) nil)
     ((eq ?/ (aref file (1- (length file)))) (my-walk-directory (concat dir file) action))
     (t (funcall action dir file)))))
Of course, it's still not going to be as fast as find, but in my experience, this is about 10 times faster than using directory-files plus file-directory-p.
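For the find route the question asks about, a minimal sketch might look like the following. It assumes a Unix-like find on PATH and file names without embedded newlines; my-find-walk is a made-up name, while process-lines is the built-in that runs a program synchronously and returns its output as a list of lines.

```elisp
(defun my-find-walk (dir action)
  "Call ACTION with each regular file below DIR, as listed by find(1).
Hypothetical helper: assumes `find' is on PATH and that no file
name contains a newline."
  (dolist (file (process-lines "find" (expand-file-name dir) "-type" "f"))
    (funcall action file)))

;; Usage: collect every file name under a directory.
;; (let (files) (my-find-walk "~/src" (lambda (f) (push f files))) files)
```

The whole listing is held in memory as one list before ACTION runs, which is the cost the question anticipates for very large trees.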

Determine how library or feature was loaded

How can I determine where the load point is for an emacs library? For example, I'm trying to track down and remove any runtime requires of subr-x during initialization, so I'd like to know which library loaded it.
The load-history lists loaded files along with the requires they made when they were loaded, but doesn't seem to provide information about any requires that weren't evaluated initially, but may have been later.
As a simple example, if I M-x load-file "/path/to/the/following/test.el"
(defun my-f ()
(require 'misc))
(provide 'my-test)
I see the first entry in load-history is
("/path/to/test.el"
(defun . my-f)
(provide . my-test))
Then, evaluating (my-f), adds an entry for "misc.el", but there is no indication where it was loaded from (neither is the above entry updated).
How can I find that out?
How can I determine where the load point is for an emacs library?
You can't. There are many reasons an Emacs library will be loaded, for example,
autoload
C-x C-e some lisp code
M-: some lisp code
M-x load-library
For example, I'm trying to track down and remove any runtime requires of subr-x during initialization, so I'd like to know which library loaded it.
Use C-h v load-history; the order is meaningful. For example, if your init file loads foo.el, foo.el requires bar.el, and bar.el requires subr-x.el, load-history should look like
(foo.el bar.el subr-x.el)
It's not an elegant solution, but worked for me.
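To read that order without scrolling through C-h v, a sketch like the following pulls out the entries listed ahead of a given library. my-load-history-before is a made-up name, and the sketch assumes file entries end in .el or .elc (optionally followed by .gz). It only narrows down the candidates in the way described above; it does not prove which file issued the require.

```elisp
(defun my-load-history-before (library)
  "Return the file names that precede LIBRARY's entry in `load-history'.
Hypothetical helper; returns nil if LIBRARY has no entry or comes first."
  (let ((re (concat "/" (regexp-quote library) "\\.elc?\\(?:\\.gz\\)?\\'"))
        (seen '()))
    (catch 'found
      (dolist (entry load-history)
        (let ((file (car entry)))
          ;; Some entries have no file name; skip those.
          (when (stringp file)
            (when (string-match-p re file)
              (throw 'found (nreverse seen)))
            (push file seen))))
      nil)))

;; Usage: (my-load-history-before "subr-x")
```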
As a starting point, that seems to work fine for my purposes. I ended up "watching" for an initial call by load or require to a specific library. It's easy to get the name of the file where the require/load took place when an actual load is in progress and load-file-name is defined.
I was less interested in other cases, e.g. interactive evaluation, but the following still works -- at least after very minimal testing; it just dumps a backtrace instead of the filename. From the backtrace, it's not hard to find the calling function, and if desired, the calling function's file could presumably be found with symbol-file.
Running the following locates loads/requires of subr-x, reporting in the message buffer the filenames of packages where it was loaded and dumping backtraces around deferred loading locations.
emacs -q -l /path/to/this.el -f find-initial-load
(require 'cl-lib)
(defvar path-to-init-file "~/.emacs.d/init.elc")
(defun find-load-point (lib &optional continue)
"During the first `require' or `load', print `load-file-name' when defined.
Otherwise, dump a backtrace around the loading call.
If CONTINUE is non-nil, don't stop after first load."
(let* ((lib-sym (intern lib))
(lib-path (or (locate-library lib) lib))
(load-syms (mapcar
(lambda (s)
(cons s (intern (format "%s#watch-%s" s lib-sym))))
'(require load)))
(cleanup (unless continue
(cl-loop for (ls . n) in load-syms
collect `(advice-remove ',ls ',n)))))
(pcase-dolist (`(,load-sym . ,name) load-syms)
(advice-add
load-sym :around
(defalias `,name
`(lambda (f sym &rest args)
(when (or (equal sym ',lib-sym)
(and (stringp sym)
(or (string= sym ,lib)
(file-equal-p sym ',lib-path))))
,@cleanup
(prin1 (or (and load-in-progress
(format "%s => %s" ',lib-sym load-file-name))
(backtrace))))
(apply f sym args)))))))
(defun find-initial-load ()
"Call with 'emacs -q -l /this/file.el -f find-initial-load'."
(find-load-point "subr-x" 'continue)
(load path-to-init-file))
;; test that deferred requires still get reported
(defun my-f () (require 'subr-x))
(add-hook 'emacs-startup-hook #'my-f)

Calling CCL + Quicklisp script as executable with command line arguments and achieving the desired output

After discovering a very simple way to watch YouTube videos from the command line using my new Raspberry Pi 2 (running Raspbian) using only easily obtainable packages, namely:
omxplayer -o local $(youtube-dl -g {videoURL})
I immediately wanted a way to watch entire YouTube playlists that way. So I saw this as a perfect excuse to hack together a solution in Common Lisp :)
My solution (imaginatively dubbed RpiTube) is a script that, when given the URL of a YouTube playlist, searches the page's HTML source and extracts the URLs for the videos contained within it. I can then pass these URLs to a Bash script that ultimately calls the above command for each video individually, one after the other. The Common Lisp script itself is complete and works, however I'm having difficulty invoking it with the URL as a command-line argument. This is mainly because I'm still quite new to Quicklisp, Lisp packages and creating executables from Common Lisp code.
I'm running Clozure Common Lisp (CCL) with Quicklisp (installed as per Rainer Joswig's instructions). I've included the complete code below. It may be a little inefficient, but to my amazement it runs reasonably quickly even on the Raspberry Pi. (Suggested improvements are appreciated.)
;rpitube.lisp
;Given the URL of a YouTube playlist's overview page, return a list of the URLs of videos in said playlist.
(load "/home/pi/quicklisp/setup.lisp")
(ql:quickload :drakma)
(ql:quickload "cl-html-parse")
(ql:quickload "split-sequence")
(defun flatten (x)
"Paul Graham's utility function from On Lisp."
(labels ((rec (x acc)
(cond ((null x) acc)
((atom x) (cons x acc))
(t (rec (car x) (rec (cdr x) acc))))))
(rec x nil)))
(defun parse-page-source (url)
"Generate lisp list of a page's html source."
(cl-html-parse:parse-html (drakma:http-request url)))
(defun occurences (e l)
"Returns the number of occurences of an element in a list. Note: not fully tail recursive."
(cond
((null l) 0)
((equal e (car l)) (1+ (occurences e (cdr l))))
(t (occurences e (cdr l)))))
(defun extract-url-stubs (flatlist unique-atom url-retrieval-fn)
"In a playlist's overview page the title of each video is represented in HTML as a link,
whose href entry is part of the video's actual URL (referred to here as a stub).
Within the link's tag there is also an entry that doesn't occur anywhere else in the
page source. This is the unique-atom (a string) that we will use to locate the link's tag
within the flattened list of the page source, from which we can then extract the video's URL
stub using a simple url-retrieval-fn (see comments below this function). This function is iterative, not
recursive, because the latter approach was too confusing."
(let* ((tail (member unique-atom flatlist :test #'equal))
(n (occurences unique-atom tail))
(urls nil))
(loop for x in tail with i = 0
while (< (length urls) n) do
(if (string= x unique-atom)
(setf urls (cons (funcall url-retrieval-fn tail i) urls)))
(incf i))
(reverse urls)))
;Example HTML tag:
;<a class="pl-video-title-link yt-uix-tile-link yt-uix-sessionlink spf-link " data-sessionlink="verylongirrelevantinfo" href="/watch?v=uniquevideocode&index=numberofvideoinplaylist&list=uniqueplaylistcode" dir="ltr"></a>
;Example tag when parsed and flattened:
;(:A :CLASS "pl-video-title-link yt-uix-tile-link yt-uix-sessionlink spf-link " :DATA-SESSIONLINK "verylongirrelevantinfo" :HREF "/watch?v=uniquevideocode&index=numberofvideoinplaylist&list=uniqueplaylistcode" :DIR "ltr")
;The URL stub is the fourth list element after unique-atom ("pl-video-title..."), so the url-retrieval-fn is:
;(lambda (l i) (elt l (+ i 4))), where i is the index of unique-atom.
(defun get-vid-urls (url)
"Extracts the URL stubs, turns them into full URLs, and returns them in a list."
(mapcar (lambda (s)
(concatenate 'string
"https://www.youtube.com"
(car (split-sequence:split-sequence #\& s))))
(extract-url-stubs (flatten (parse-page-source url))
"pl-video-title-link yt-uix-tile-link yt-uix-sessionlink spf-link "
(lambda (l i) (elt l (+ i 4))))))
(let ((args #+clozure *unprocessed-command-line-arguments*))
(if (and (= (length args) 1)
(stringp (car args)))
(loop for url in (get-vid-urls (car args)) do
(format t "~a " url))
(error "Usage: rpitube <URL of youtube playlist>
where URL is of the form:
'https://www.youtube.com/playlist?list=uniqueplaylistcode'")))
First I tried adding the following line to the script
#!/home/pi/ccl/armcl
and then running
$ chmod +x rpitube.lisp
$ ./rpitube.lisp {playlistURL}
which gives:
Unrecognized non-option arguments: (./rpitube.lisp {playlistURL})
when I would at least have expected ./rpitube.lisp to be absent from this list of unrecognized arguments. I know that in Clozure CL, in order to pass command-line arguments to a REPL session untouched, I have to separate them from the other arguments with a double hyphen, like this:
~/ccl/armcl -l rpitube.lisp -- {playlistURL}
But invoking the script like this clearly lands me in a REPL after the script has run, which I don't want. Additionally, the Quicklisp loading information and progress bars are printed to the terminal, which I also don't want. (Incidentally, as Rainer suggested, I haven't added Quicklisp to my CCL init file, since I generally don't want the additional overhead, i.e. a few seconds' loading time on the Raspberry Pi. I'm not sure if that's relevant.)
I then decided to try creating a standalone executable by running (once the above code is loaded):
(ccl:save-application "rpitube" :prepend-kernel t)
And calling it from a shell like this:
$ ./rpitube {playlistURL}
which gives:
Unrecognized non-option arguments: ({playlistURL})
which seems to be an improvement, but I'm still doing something wrong. Do I need to replace the Quicklisp-related code by creating my own asdf package that requires drakma, cl-html-parse and split-sequence, and loading that with in-package, etc.? I have created my own package before in another project - specifically because I wanted to split up my code into multiple files - and it seems to work, but I still loaded my package via ql:quickload as opposed to in-package, since the latter never seemed to work (perhaps I should ask about that as a separate question). Here, the rpitube.lisp code is so short that it seems unnecessary to create a whole quickproject and package for it, especially since I want it to be a standalone executable anyway.
So: how do I change the script (or its invocation) so that it can accept the URL as a command-line argument, can be run non-interactively (i.e. doesn't open a REPL), and ONLY prints the desired output to the terminal - a space-delimited list of URLs - without any Quicklisp loading information?
OK, I've managed to adapt a solution from the suggestion linked by user @m-n above. RpiTube now seems to work for most playlists that I have tried except some music playlists, which are unreliable since I live in Germany and many music videos are blocked in this country for legal reasons. Huge playlists and very high-quality (or very long) videos might also be unreliable.
The BASH script:
#! /bin/bash
#Calls rpitube.lisp to retrieve the URLs of the videos in the provided
#playlist, and then plays them in order using omxplayer, optionally
#starting from the nth video instead of the first.
CCL_PATH='/home/pi/ccl/armcl'
RPITUBE_PATH='/home/pi/lisp/rpitube.lisp'
N=0
USAGE='
Usage: ./rpitube [-h help] [-n start at nth video] <playlist URL>
where URL is of the form: https://www.youtube.com/playlist?list=uniqueplaylistcode
******** Be sure to surround the URL with single quotes! *********'
play()
{
if omxplayer -o local $(youtube-dl -g "$1") > /dev/null; then
return 0
else
echo "An error occurred while playing $1."
exit 1
fi
}
while getopts ":n:h" opt; do
case $opt in
n ) N=$((OPTARG - 1)) ;;
h ) echo "$USAGE"
exit 1 ;;
\? ) echo "Invalid option."
echo "$USAGE"
exit 1 ;;
esac
done
shift $(($OPTIND - 1))
if [[ "$#" -ne 1 ]]; then
echo "Invalid number of arguments."
echo "$USAGE"
exit 1
elif [[ "$1" != *'https://www.youtube.com/playlist?list='* ]]; then
echo "URL is of the wrong form."
echo "$USAGE"
exit 1
else
echo 'Welcome to RpiTube!'
echo 'Fetching video URLs... (may take a moment, especially for large playlists)'
urls="$(exec "$CCL_PATH" -b -e '(progn (load "'"$RPITUBE_PATH"'") (main "'"$1"'") (ccl::quit))')"
echo 'Starting video... press Q to skip to next video, left/right arrow keys to rewind/fast-forward, Ctrl-C to quit.'
count=0
for u in $urls; do #do NOT quote $urls here
[[ $count -lt $N ]] && count=$((count + 1)) && continue
play "$u"
echo 'Loading next video...'
done
echo 'Reached end of playlist. Hope you enjoyed it! :)'
fi
I made the following changes to the CL script: added the :silent option to the ql:quickload calls; replaced my own occurences function with the built-in count (with :test #'equal); and, most importantly, changed several things in the code at the end of the script that actually calls the URL-fetching functions. First, I wrapped it in a main function that takes one argument, namely the playlist URL, and removed the references to *command-line-argument-list* etc. The important part: instead of invoking the entire rpitube.lisp script with the URL as a command-line argument to CCL, I invoke it without arguments and pass the URL to the main function directly (in the call to exec). See below:
(defun main (url)
(if (stringp url)
(loop for u in (get-vid-urls url) do
(format t "~a " u))
(error "Usage: rpitube <URL of youtube playlist>
where URL is of the form:
'https://www.youtube.com/playlist?list=uniqueplaylistcode'")))
This method could be applied widely and it works fine, but I'd be amazed if there isn't a better way to do it. If I can make any progress with the "toplevel" function + executable idea, I'll edit this answer.
An example working invocation, run on a small playlist of short videos, with playback beginning at the 3rd video:
$ ./rpitube -n 3 'https://www.youtube.com/playlist?list=PLVPJ1jbg0CaE9eZCTWS4KxOWi3NWv_oXL'
Many thanks.
I looked at this some and would like to share what I found. There are also several Lisp libraries which aim to facilitate scripting, executable building, or command-line argument handling.
For your executable building approach, save-application lets you specify a :toplevel-function, a function of zero arguments. In this case you will need to get the command line arguments through ccl:*command-line-argument-list*, and skip the first element (the name of the program). This is probably the minimal change to get your program running (I haven't run this; so it may have typos):
(defun toplevel ()
(let ((args #+clozure *command-line-argument-list*))
(if (and (= (length args) 2)
(stringp (second args)))
(loop for url in (get-vid-urls (second args)) do
(format t "~a " url))
(error "Usage: rpitube <URL of youtube playlist>
where URL is of the form:
'https://www.youtube.com/playlist?list=uniqueplaylistcode'"))))
(save-application "rpitube" :prepend-kernel t :toplevel-function #'toplevel)
Alternatively, some Lisp implementations have a --script command-line parameter which allows something similar to your #!/home/pi/ccl/armcl script to work. CCL doesn't seem to have an equivalent option, but a previous answer -- https://stackoverflow.com/a/3445196/2626993 -- suggests writing a short Bash script which would essentially behave like you hoped CCL would with this attempt.
quickload calls can be silenced with an argument:
(ql:quickload :drakma :silent t)

Allow dired-do-copy and dired-do-rename to create new dir on the fly

Does anyone have an emacs lisp hack that would allow the creation of a new directory on the fly during dired-do-copy or dired-do-rename? I understand that it can be created prior to running one of these two commands. Extra points for some type of "Are you sure..." prompt.
It looks like a case for applying advice. The question is what to
advise. Looking at the dired code, it seems that the correct target is
dired-mark-read-file-name, which is used to read the destination
file name. This will work:
(defadvice dired-mark-read-file-name (after rv:dired-create-dir-when-needed (prompt dir op-symbol arg files &optional default) activate)
  (when (member op-symbol '(copy move))
    (let ((directory-name (if (< 1 (length files))
                              ad-return-value
                            (file-name-directory ad-return-value))))
      (when (and (not (file-directory-p directory-name))
                 (y-or-n-p (format "directory %s doesn't exist, create it?" directory-name)))
        (make-directory directory-name t)))))
Note that the first when, (when (member op-symbol '(copy move))), could perhaps be removed so that this applies to more cases of file creation in dired. But I'm not sure when else dired-mark-read-file-name is called, so I left this test in place to reduce potential unwanted side effects.
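The same idea can also be written with advice-add, the newer advice interface available since Emacs 24.4. This is only a sketch under the same assumptions as the defadvice above: the argument positions are taken from the argument list shown there, and both function names are made up for illustration.

```elisp
(require 'dired-aux)

(defun my-dired-missing-directory (target files)
  "Return the directory that must exist for TARGET, or nil if it exists.
With several FILES, TARGET itself is the directory; otherwise it is
TARGET's parent.  Hypothetical helper mirroring the advice above."
  (let ((dir (if (< 1 (length files))
                 target
               (file-name-directory target))))
    (and dir (not (file-directory-p dir)) dir)))

(defun my-dired-create-dir-when-needed (orig-fn &rest args)
  "Around advice for `dired-mark-read-file-name' (ORIG-FN called with ARGS)."
  (let* ((target (apply orig-fn args))
         (op-symbol (nth 2 args))       ; third argument: op-symbol
         (files (nth 4 args))           ; fifth argument: files
         (dir (and (memq op-symbol '(copy move))
                   (my-dired-missing-directory target files))))
    (when (and dir
               (y-or-n-p (format "Directory %s doesn't exist, create it? " dir)))
      (make-directory dir t))
    ;; Around advice must hand the original return value back.
    target))

(advice-add 'dired-mark-read-file-name :around
            #'my-dired-create-dir-when-needed)
```

Keeping the directory check in its own function means it can be tried out without going through the minibuffer prompt, and (advice-remove 'dired-mark-read-file-name #'my-dired-create-dir-when-needed) undoes the change.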

What is the canonical way to list numbered backup files emacs has created?

I know how to configure emacs to keep numbered backups. I don't know the most canonical way to find those numbered backups.
The emacs function "find-backup-file-name" seems like it is the closest. Its documentation states:
This function computes the file name for a new backup file for filename. It may also propose certain existing backup files for deletion. find-backup-file-name returns a list whose CAR is the name for the new backup file and whose CDR is a list of backup files whose deletion is proposed.
However, this is not what I am looking for. I'm looking for a list of ALL previously created backup files. Here's the code (paraphrased) I have written to accomplish this:
(defvar backup-directory "~/emacs.d/backups/")
(defun get-backup-pattern (file-name)
(concat "*" (replace-regexp-in-string "\/" "\\!" file-name t t) ".~*"))
(butlast
(split-string
(shell-command-to-string
(concat "find "
backup-directory
" -name \""
(get-backup-pattern (buffer-file-name))
"\""))
"\n"))
This method works fine. However, shelling out to "find" seems like a hack to me, especially since this method is platform specific.
Is there a built-in method I should use or at least something more idiomatic?
Personally, I don't save backup files in a central folder so I can't provide working code, but if you want to search the contents of a directory, use directory-files.
So here is the solution I've decided on. I went away from using the *nix find command and am using directory-files as suggested.
(defun get-filter-pattern (file-name)
  (concat (replace-regexp-in-string "\/" "!" file-name t t)
          ".~[0-9]*~*$"))
(defun filter (condp lst)
  (delq nil
        (mapcar (lambda (x) (and (funcall condp x) x)) lst)))
(defun filter-files (backup-directory buffer-file-name)
  (mapcar (lambda (backup-name) (concat backup-directory backup-name))
          (filter (lambda (backup-name)
                    (string-match (get-filter-pattern buffer-file-name) backup-name))
                  (directory-files backup-directory))))
Perhaps this isn't quite as optimized as using find. However, it should be platform independent (i.e. it can be used on Windows).
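A shorter variant is possible because directory-files accepts a FULL flag and a MATCH regexp itself, so the separate filter step is not strictly needed. The sketch below is an assumption-laden illustration, not the poster's code: my-list-numbered-backups is a made-up name, and it assumes backups are named by replacing each "/" in the absolute file name with "!" and appending ".~N~".

```elisp
(defun my-list-numbered-backups (backup-dir file-name)
  "Return full paths of numbered backups of FILE-NAME found in BACKUP-DIR.
Hypothetical helper: assumes each backup is named after FILE-NAME with
every slash replaced by an exclamation mark, followed by .~N~."
  (directory-files
   backup-dir
   t                                    ; FULL: return absolute file names
   (concat "\\`"
           ;; Quote the name so dots in it are matched literally.
           (regexp-quote (replace-regexp-in-string "/" "!" file-name t t))
           "\\.~[0-9]+~\\'")))

;; Usage: (my-list-numbered-backups "~/.emacs.d/backups/" (buffer-file-name))
```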

Preferring certain file extensions with Emacs file name completion

I have lots of directories filled with a bunch of TeX documents. So, there's lots of files with the same base filename and different extensions. Only one of them, though, is editable. I'd like a way to convince Emacs that if I'm in a directory where I've got
document.tex
document.log
document.pdf
document.bbl
document.aux
...
and I'm in the minibuffer and do
~/Documents/.../doc<TAB>
it fills in 'document.tex', because that's the only really properly editable document in that directory. Anybody know of a good way to do that?
I've written some code that should do what you want. The basic idea is to set the variable completion-ignored-extensions to match the extensions you want to skip, but only when there are .tex files present. This code does that.
(defadvice find-file-read-args (around find-file-read-args-limit-choices activate)
  "Set some stuff up for controlling extensions when tab completing."
  (let ((completion-ignored-extensions completion-ignored-extensions)
        (find-file-limit-choices t))
    ad-do-it))
(defadvice minibuffer-complete (around minibuffer-complete-limit-choices nil activate)
  "When in find-file, check for files of extension .tex, and if they're found, ignore .log .pdf .bbl .aux"
  (let ((tex-files-present
         (and (boundp 'find-file-limit-choices) find-file-limit-choices
              (save-excursion
                (let ((b (progn (beginning-of-line) (point)))
                      (e (progn (end-of-line) (point))))
                  (directory-files (file-name-directory (buffer-substring-no-properties b e)) nil "\\.tex$"))))))
    (mapc (lambda (e)
            (if tex-files-present
                (add-to-list 'completion-ignored-extensions e)
              ;; `remove' takes the element first, then the list.
              (setq completion-ignored-extensions
                    (remove e completion-ignored-extensions))))
          '(".log" ".pdf" ".bbl" ".aux")))
  ad-do-it)
Enjoy.
Probably the easiest way to do this in your case is just to customize the variable "completion-ignored-extensions".
However, this will mean that emacs always ignores things like ".log" and ".pdf" which may not be what you want. If you want it to be more selective, you may have to effectively re-implement the function file-name-completion.
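To see what that variable does before changing it globally, it can be bound temporarily around a call to file-name-completion. This is a minimal sketch with a made-up helper name, my-complete-ignoring; the extension list in the usage line is just the one from the question.

```elisp
(defun my-complete-ignoring (prefix dir extensions)
  "Complete PREFIX in DIR while also ignoring names ending in EXTENSIONS.
Hypothetical helper: the binding is temporary, so the global value of
`completion-ignored-extensions' is left untouched."
  (let ((completion-ignored-extensions
         (append extensions completion-ignored-extensions)))
    (file-name-completion prefix dir)))

;; Usage:
;; (my-complete-ignoring "doc" "~/Documents/" '(".log" ".pdf" ".bbl" ".aux"))
```

Ignored extensions are only skipped while some other candidate remains, so a directory containing nothing but document.pdf would still complete to it.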
If you are open to installing a large-ish library and reading some documentation, you could take a look at Icicles and define a sort function that meets your needs. An alternative is ido whose wiki page has an example of sorting by mtime, which should be easy to change to sort by a function of the filename extension.