I need to read a file contents into a two-dimensional list, split by newlines and spaces. For example,
a b
c d
needs to become
(list (list "a" "b") (list "c" "d"))
Currently I only know how to read the contents into a simple list determined by newlines. Whenever I need to use an element from that list, I have to split it by spaces everytime, but preferably this should be done only once beforehand.
While abo-abo's answer above is fine, it creates a temporary string with the full contents of the file, which is inefficient. If the file is very large, it is better to walk a buffer collecting data line-by-line:
(defun file-to-matrix (filename)
(with-temp-buffer
(insert-file-contents filename)
(let ((list '()))
(while (not (eobp))
(let ((beg (point)))
(move-end-of-line nil)
(push (split-string (buffer-substring beg (point)) " ") list)
(forward-char)))
(nreverse list))))
Note the use of with-temp-buffer, which avoids leaving a buffer lying around, and the use of insert-file-contents, which avoids interfering with any other buffer that might be visiting the same file.
Like this:
(with-current-buffer (find-file-noselect "~/foo")
(mapcar (lambda (x) (split-string x " " t))
(split-string
(buffer-substring-no-properties (point-min) (point-max))
"\n")))
With dash, s and f third-party libraries:
(--map (s-split " " it) (s-lines (s-chomp (f-read "FILE.TXT"))))
or:
(->> "FILE.TXT" f-read s-chomp s-lines (--map (s-split " " it)))
which are the same thing.
Related
I want to add a function (para2lines) to Emacs by which I can split the current paragraph into its sentences and print them line by line in a separate buffer. Following is code in Racket/Scheme:
(define (p2l paraString)
(define lst (string-split paraString ". "))
(for ((i lst))
(displayln i)))
Testing:
(p2l "This is a test. For checking only. Only three lines.")
Output:
This is a test.
For checking only.
Only three lines.
In Emacs Lisp, I could manage following code:
(defun pl (ss)
(interactive)
(let ((lst (split-string (ss))))
(while lst
(print (pop lst)))))
But I do not know how to get the text from the paragraph with current position. How can I correct this function?
Edit: basically, I want to read it as separate lines but want to save it as paragraph.
Here's an example that might help you on your way. It will do your conversion to the current paragraph (i.e. where the cursor is positioned), rather than to a new buffer. You could modify this to pass a string to your function if that's what you require.
(defun p2l ()
"Format current paragraph into single lines."
(interactive "*")
(save-excursion
(forward-paragraph)
(let ((foo (point)))
(backward-paragraph)
(replace-regexp "\n" " " nil (1+ (point)) foo)
(backward-paragraph)
(replace-regexp "\\. ?" ".\n" nil (point) foo))))
I would just run Emacs commands or write a macro to convert a paragraph to single-sentence lines, but maybe you are really just wanting to read wrapped paragraphs as lines, thus the need to have an Emacs command.
Here's something that will grab the current paragraph, insert a new buffer *Lines*, and then convert sentences to lines.
(defun para-lines ()
"Split sentences of paragraph to lines in new buffer."
(interactive)
;; Move the paragraph to a new buffer.
(let ((b (generate-new-buffer "*Lines*")))
(with-output-to-temp-buffer b
(let ((beg (save-excursion (forward-paragraph -1) (point)))
(end (save-excursion (forward-paragraph +1) (point))))
(princ (buffer-substring-no-properties beg end))))
;; Switch to new buffer
(with-current-buffer b
;; Since the name starts with "*", shut off Help Mode
(fundamental-mode)
;; Make sure buffer is writable
(setq buffer-read-only nil)
;; From the start of the buffer
(goto-char (point-min))
;; While not at the end of the buffer
(while (< (point) (point-max))
(forward-sentence 1)
;; Delete spaces between sentences before making new new line
(delete-horizontal-space)
;; Don't add a new line, if already at the end of the line
(unless (= (line-end-position) (point))
(newline))))))
To avoid using forward-sentence, and just use a regular expression, use re-search-forward. For instance, to match semi-colons as well as periods.
(defun para-lines ()
"Split sentences of paragraph to lines in new buffer."
(interactive)
;; Move the paragraph to a new buffer.
(let ((b (generate-new-buffer "*Lines*")))
(with-output-to-temp-buffer b
(let ((beg (save-excursion (forward-paragraph -1) (point)))
(end (save-excursion (forward-paragraph +1) (point))))
(princ (buffer-substring-no-properties beg end))))
;; Switch to new buffer
(with-current-buffer b
;; Since the name starts with "*", shut off Help Mode
(fundamental-mode)
;; Make sure buffer is writable
(setq buffer-read-only nil)
;; From the start of the buffer
(goto-char (point-min))
;; While not at the end of the buffer
(while (< (point) (point-max))
(re-search-forward "[.;]\\s-+" nil t)
;; Delete spaces between sentences before making new new line
(delete-horizontal-space)
;; Don't add a new line, if already at the end of the line
(unless (= (line-end-position) (point))
(newline))))))
In an automated Emacs Lisp --batch/--script script I need to process the command line arguments given to the script.
I've gotten as far as getting the arguments into a list of the the form:
("--aaa=bbb" "--ccc=ddd=eee" "--blah")
Now, I need to convert them to a list of the form:
(("aaa" "bbb") ("ccc" "ddd=eee") ("blah"))
In Python I'd write something like;
output = []
for v in input:
output.append(v[2:].split("=", 1))
But have been unable to convert that code to Emacs Lisp. I found Elisp split-string function to split a string by . character but wasn't able to figure out how to make it only split on the first equals.
I was heading down a route of using (substring "abcdefg" x x) with (search) from the cl package but it felt like there should be a better way? I think also want to use (mapc '<function> input) where function does the v[2:].split("=",1) part.
You can use split-string. See the following code example.
(setq cmd-line '("--aaa=bbb" "--ccc=ddd=eee" "--blah"))
(setq cmd-line (mapcar (lambda (argstr)
(when (string-match "^--" argstr)
(split-string (substring argstr 2) "=")))
cmd-line))
The output is (("aaa" "bbb") ("ccc" "ddd" "eee") ("blah")).
That is not exactly what you want because of "eee". Maybe you can use that and just neglect "eee".
If the "eee" is really a problem a small modification helps:
(setq cmd-line '("--aaa=bbb" "--ccc=ddd=eee" "--blah"))
(setq cmd-line (mapcar (lambda (arg)
(when (string-match "^--" arg)
(setq arg (split-string (substring arg 2) "="))
(if (cdr arg)
(setcdr (cdr arg) nil))
arg))
cmd-line))
The output is:
(("aaa" "bbb") ("ccc" "ddd") ("blah"))
Variant for the new requirement in the question:
(setq cmd-line '("--aaa=bbb" "--ccc=ddd=eee" "--blah"))
(setq cmd-line (mapcar (lambda (arg)
(when (string-match "^--\\([^=]*\\)\\(?:=\\(.*\\)\\)?" arg)
(let ((opt (match-string 1 arg))
(val (match-string 2 arg)))
(if val
(list opt val)
(list opt)))))
cmd-line))
The output is:
(("aaa" "bbb") ("ccc" "ddd=eee") ("blah"))
I'm trying to open a file and read through the sexps. If the form has setq in its first position then traverse the rest of the form adding the in the setq form to an alist.
;;; File passwords.el.gpg
(setq twitter-password "Secret"
github-password "Sauce")
My goal is to able to construct an alist from the pairs in the setq forms in teh file. How I even start?
First, I second the recommendation that you store the passwords in an actual alist and, if necessary, set whatever variables you need to based on that.
That aside, here's another solution that tries to break things out a bit. The -partition function is from the dash.el library, which I highly recommend.
You don't really need to "walk" the code, just read it in and check if its car is setq. The remainder of the form should then be alternating symbols and strings, so you simply partition them by 2 and you have your alist. (Note that the "pairs" will be proper lists as opposed to the dotted pairs in Sean's solution).
(defun setq-form-p (form)
(eq (car form) 'setq))
(defun read-file (filename)
(with-temp-buffer
(insert-file-literally filename)
(read (buffer-substring-no-properties 1 (point-max)))))
(defun credential-pairs (form)
(-partition 2 (cdr form)))
(defun read-credentials-alist (filename)
(let ((form (read-file filename)))
(credential-pairs form)))
;; usage:
(read-credentials-alist "passwords.el")
Alternatively, here's how it would work if you already had the passwords in an alist, like so
(defvar *passwords*
'((twitter-password "Secret")
(github-password "Sauce")))
And then wanted to set the variable twitter-password to "Sauce" and so on. You would just map over it:
(mapcar #'(lambda (pair)
(let ((name (car pair))
(value (cadr pair)))
(set name value)))
*passwords*)
You can use streams to read in the files (read-from-string) and then do the usual elisp hacking. The below isn't robust, but you get the idea. On a file, pwd.el that has your file, it returns the alist ((github-password . "Sauce") (twitter-password . "Secret"))
(defun readit (file)
"Read file. If it has the form (sexp [VAR VALUE]+), return
an alist of the form ((VAR . VALUE) ...)"
(let* (alist
(sexp-len
(with-temp-buffer
(insert-file-contents file)
(read-from-string (buffer-substring 1 (buffer-size)))))
(sexp (car sexp-len)))
(when (equal (car sexp) 'setq)
(setq sexp (cdr sexp))
(while sexp
(let* ((l (car sexp))
(r (cadr sexp)))
(setq alist (cons (cons l r) alist)
sexp (cddr sexp)))))
alist))
(readit "pwd.el")
In Python, you might do something like
fout = open('out','w')
fin = open('in')
for line in fin:
fout.write(process(line)+"\n")
fin.close()
fout.close()
(I think it would be similar in many other languages as well).
In Emacs Lisp, would you do something like
(find-file 'out')
(setq fout (current-buffer)
(find-file 'in')
(setq fin (current-buffer)
(while moreLines
(setq begin (point))
(move-end-of-line 1)
(setq line (buffer-substring-no-properties begin (point))
;; maybe
(print (process line) fout)
;; or
(save-excursion
(set-buffer fout)
(insert (process line)))
(setq moreLines (= 0 (forward-line 1))))
(kill-buffer fin)
(kill-buffer fout)
which I got inspiration (and code) from Emacs Lisp: Process a File line-by-line. Or should I try something entirely different? And how to remove the "" from the print statement?
If you actually want batch processing of stdin and sending the result to stdout, you can use the --script command line option to Emacs, which will enable you to write code that reads from stdin and writes to stdout and stderr.
Here is an example program which is like cat, except that it reverses each line:
#!/usr/local/bin/emacs --script
;;-*- mode: emacs-lisp;-*-
(defun process (string)
"just reverse the string"
(concat (nreverse (string-to-list string))))
(condition-case nil
(let (line)
;; commented out b/c not relevant for `cat`, but potentially useful
;; (princ "argv is ")
;; (princ argv)
;; (princ "\n")
;; (princ "command-line-args is" )
;; (princ command-line-args)
;; (princ "\n")
(while (setq line (read-from-minibuffer ""))
(princ (process line))
(princ "\n")))
(error nil))
Now, if you had a file named stuff.txt which contained
abcd
1234
xyz
And you invoked the shell script written above like so (assuming it is named rcat):
rcat < stuff.txt
you will see the following printed to stdout:
dcba
4321
zyx
So, contrary to popular belief, you can actually do batch file processing on stdin and not actually have to read the entire file in at once.
Here's what I came up with. Looks a lot more idiomatic to me:
(with-temp-buffer
(let ((dest-buffer (current-buffer)))
(with-temp-buffer
(insert-file-contents "/path/to/source/file")
(while (search-forward-regexp ".*\n\\|.+" nil t)
(let ((line (match-string 0)))
(with-current-buffer dest-buffer
(insert (process line)))))))
(write-file "/path/to/dest/file" nil))
Emacs Lisp is not suitable for processing file-streams. The whole file must be read at once:
(defun my-line-fun (line)
(concat "prefix: " line))
(let* ((in-file "in")
(out-file "out")
(lines (with-temp-buffer
(insert-file-contents in-file)
(split-string (buffer-string) "\n\r?"))))
(with-temp-file out-file
(mapconcat 'my-line-fun lines "\n")))
An mplayer tool (midentify) outputs "shell-ready" lines intended to be evaluated by a bash/sh/whatever interpreter.
How can I assign these var-names to their corresponding values as elisp var-names in emacs?
The data is in a string (via shell-command-to-string)
Here is the data
ID_AUDIO_ID=0
ID_FILENAME=/home/axiom/abc.wav
ID_DEMUXER=audio
ID_AUDIO_FORMAT=1
ID_AUDIO_BITRATE=512000
ID_AUDIO_RATE=0
ID_AUDIO_NCH=1
ID_LENGTH=3207.00
ID_SEEKABLE=1
ID_CHAPTERS=0
ID_AUDIO_BITRATE=512000
ID_AUDIO_RATE=32000
ID_AUDIO_NCH=1
ID_AUDIO_CODEC=pcm
ID_EXIT=EOF
Here's a routine that takes a string containing midentify output, and returns an association list of the key-value pairs (which is safer than setting Emacs variables willy-nilly). It also has the advantage that it parses numeric values into actual numbers:
(require 'cl) ; for "loop"
(defun midentify-output-to-alist (str)
(setq str (replace-regexp-in-string "\n+" "\n" str))
(setq str (replace-regexp-in-string "\n+\\'" "" str))
(loop for index = 0 then (match-end 0)
while (string-match "^\\(?:\\([A-Z_]+\\)=\\(?:\\([0-9]+\\(?:\\.[0-9]+\\)?\\)\\|\\(.*\\)\\)\\|\\(.*\\)\\)\n?" str index)
if (match-string 4 str)
do (error "Invalid line: %s" (match-string 4 str))
collect (cons (match-string 1 str)
(if (match-string 2 str)
(string-to-number (match-string 2 str))
(match-string 3 str)))))
You'd use this function like so:
(setq alist (midentify-output-to-alist my-output))
(if (assoc "ID_LENGTH" alist)
(setq id-length (cdr (assoc "ID_LENGTH" alist)))
(error "Didn't find an ID_LENGTH!"))
EDIT: Modified function to handle blank lines and trailing newlines correctly.
The regexp is indeed a beast; Emacs regexps are not known for their easiness on the eyes. To break it down a bit:
The outermost pattern is ^(?:valid-line)|(.*). It tries to match a valid line, or else matches the entire line (the .*) in match-group 4. If (match-group 4 str) is not nil, that indicates that an invalid line was encountered, and an error is raised.
valid-line is (word)=(?:(number)|(.*)). If this matches, then the name part of the name-value pair is in match-string 1, and if the rest of the line matches a number, then the number is in match-string 2, otherwise the entire rest of the line is in match-string 3.
There's probably a better way but this should do it:
(require 'cl)
(let ((s "ID_AUDIO_ID=0
ID_FILENAME=/home/axiom/abc.wav
ID_DEMUXER=audio
ID_AUDIO_FORMAT=1
ID_AUDIO_BITRATE=512000
ID_AUDIO_RATE=0
ID_AUDIO_NCH=1
ID_LENGTH=3207.00
ID_SEEKABLE=1
ID_CHAPTERS=0
ID_AUDIO_BITRATE=512000
ID_AUDIO_RATE=32000
ID_AUDIO_NCH=1
ID_AUDIO_CODEC=pcm
ID_EXIT=EOF"))
(loop for p in (split-string s "\n")
do
(let* ((elements (split-string p "="))
(key (elt elements 0))
(value (elt elements 1)))
(set (intern key) value))))
Here's a function you can run on the output buffer:
(defun set-variables-from-shell-assignments ()
(goto-char (point-min))
(while (< (point) (point-max))
(and (looking-at "\\([A-Z_]+\\)=\\(.*\\)$")
(set (intern (match-string 1)) (match-string 2)))
(forward-line 1)))
I don't think regexp is what really need. You need to split your string by \n and =, so you just say exactly the same to interpreter.
I think you can also use intern to get symbol from string(and set variables). I use it for the first time, so comment here if i am wrong. Anyways, if list is what you want, just remove top-level mapcar.
(defun set=(str)
(mapcar (lambda(arg)
(set
(intern (car arg))
(cadr arg)))
(mapcar (lambda(arg)
(split-string arg "=" t))
(split-string
str
"\n" t))))
(set=
"ID_AUDIO_ID=0
ID_FILENAME=/home/axiom/abc.wav
ID_DEMUXER=audio
ID_AUDIO_FORMAT=1
ID_AUDIO_BITRATE=512000
ID_AUDIO_RATE=0
ID_AUDIO_NCH=1
ID_LENGTH=3207.00
ID_SEEKABLE=1
ID_CHAPTERS=0
ID_AUDIO_BITRATE=512000
ID_AUDIO_RATE=32000
ID_AUDIO_NCH=1
ID_AUDIO_CODEC=pcm
ID_EXIT=EOF")