Generally, I can use the excellent rx macro to create readable regular expressions and be sure that I've escaped the correct metacharacters.
(rx (any "A-Z")) ;; "[A-Z]"
However, I can't work out how to create shy groups, e.g. \(?:AB\). rx sometimes produces them in its output:
(rx (or "ab" "bc")) ;; "\\(?:ab\\|bc\\)"
but I want to explicitly add them. I can do:
(rx (regexp "\\(?:AB\\)"))
but this defeats the point of rx.
In a perfect world, I'd like to be able to write:
(rx (shy-group "A"))
I'd settle for something like this (none of these work):
;; sadly, `regexp` only accepts literal strings
(rx (regexp (format "\\(?:%s\\)" (rx WHATEVER))))
;; also unfortunate, `eval` quotes the string it's given
(rx (eval (format "\\(?:%s\\)" (rx WHATEVER))))
How can I create regular expressions with shy groups using rx?
I think the structure of an rx form eliminates any need to explicitly create shy groups -- everything that a shy group could be needed for is accounted for by other syntax.
e.g. your own example:
(rx (or "ab" "bc")) ;; "\\(?:ab\\|bc\\)"
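As a small illustration of that point (a sketch, not an exhaustive list of cases): when a multi-character string sits under a repetition operator, rx adds the shy group on its own, and seq makes several forms behave as one unit. The sample strings are made up for the demonstration.

```elisp
;; rx brackets a multi-character operand itself when a postfix
;; operator has to apply to the whole operand.
(rx (+ "ab"))               ;; "\\(?:ab\\)+"

;; `seq' treats several forms as one unit, so the enclosing `+'
;; repeats the whole "a followed by a digit" sequence.
(string-match-p (rx bos (+ (seq "a" digit)) eos) "a1a2")   ;; 0
(string-match-p (rx bos (+ (seq "a" digit)) eos) "a1b2")   ;; nil
```

In other words, grouping in rx is expressed through the nesting of the forms rather than through an explicit shy-group keyword.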
For other cases, it is also possible to extend the keywords used by rx.
Example (taken from EmacsWiki link above):
(defmacro rx-extra (&rest body-forms)
(let ((add-ins (list
`(file . ,(rx (+ (or alnum digit "." "/" "-" "_"))))
`(ws0 . ,(rx (0+ (any " " "\t"))))
`(ws+ . ,(rx (+ (any " " "\t"))))
`(int . ,(rx (+ digit))))))
`(let ((rx-constituents (append ',add-ins rx-constituents nil)))
,@body-forms)))
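On Emacs 27 and later, rx-define offers a shorter route to the same kind of extension. The following is a minimal sketch under that version assumption; the names my-ws0 and my-int are hypothetical (modelled on the add-ins above), and the sample strings are invented.

```elisp
;; Assumes Emacs 27 or later, where `rx-define' is available.
;; `my-ws0' and `my-int' are illustrative names, not part of rx.
(rx-define my-ws0 (0+ (any " " "\t")))
(rx-define my-int (+ digit))

;; The new names can then be used like built-in rx forms.
(string-match-p (rx bos my-ws0 my-int my-ws0 eos) "  42 ")   ;; 0
(string-match-p (rx bos my-ws0 my-int my-ws0 eos) "  4x ")   ;; nil
```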
I have the following list of strings:
("(aviyon" "213" "flyingman" "no))")
I want to split the strings in this list using parentheses as the delimiter, but also keep the parentheses as separate elements of the new list, without changing the order.
My desired output, as a new list (or the same list modified):
("(" "aviyon" "213" "flyingman" "no" ")" ")")
I am coming from imperative languages, and this would be a 15-minute job in Java or C++. But here I'm stuck on what to do. I know I have to:
1- Get an element from the list in a loop
I think this is done with (nth 1 '(listname) )
2- Split it without removing the delimiter, and put the pieces into a new list
I found functions such as SPLIT-SEQUENCE, but I can't use them without removing the delimiter or breaking the original order.
Any help would be appreciated.
You can use the cl-ppcre library to do the job.
For example:
CL-USER> (ql:quickload :cl-ppcre)
CL-USER> (cl-ppcre:split "([\\(\\)])" "(aviyon" :with-registers-p t)
("" "(" "aviyon")
CL-USER> (cl-ppcre:split "([\\(\\)])" "no))" :with-registers-p t)
("no" ")" "" ")")
CL-USER>
However, this leaves empty strings in the list. Use the remove-if function to get rid of them:
CL-USER> (defun empty-string-p (s) (string= s ""))
EMPTY-STRING-P
CL-USER> (remove-if 'empty-string-p
(list "no" ")" "" ")"))
("no" ")" ")")
Finally, you can construct a function which does both, and run it in an imperative loop (yes, Common Lisp is not a purely functional language, contrary to what many think):
CL-USER> (defun remove-empty-strings (l)
(remove-if 'empty-string-p l))
REMOVE-EMPTY-STRINGS
CL-USER> (defun split (s)
(cl-ppcre:split "([\\(\\)])"
s
:with-registers-p t))
SPLIT
CL-USER> (defparameter *the-list* '("(aviyon" "213" "flyingman" "no))"))
*THE-LIST*
CL-USER> (loop for item in *the-list*
for splitted = (split item)
for cleaned = (remove-empty-strings splitted)
append cleaned)
("(" "aviyon" "213" "flyingman" "no" ")" ")")
Let's have another answer, without external libraries.
As you already did, we can split the problem into smaller parts:
define a function, all-tokens, which builds a list of tokens from a string
apply this function on all strings in your input list, and concatenate the result:
(mapcan #'all-tokens strings)
The first part, taking a state and building a list from it, looks like an unfold operation (anamorphism).
Fold (catamorphism), called reduce in Lisp, builds a value from a list of values and a function (and optionally an initial value).
The dual operation, unfold, takes a value (the state) and a function, and generates a list of values.
In the case of unfold, the step function accepts a state and returns the new state along with the resulting list.
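To make the idea concrete, here is a minimal generic unfold in Common Lisp. It is only a sketch: the name unfold and its four-argument shape (a stop predicate, an element function, a next-state function, and a seed) are assumptions chosen for illustration, not a standard function.

```lisp
;; UNFOLD is a hypothetical helper, not part of the standard.
;; DONE-P tells when to stop, ELEMENT maps a state to an output
;; value, and NEXT computes the following state.
(defun unfold (done-p element next seed)
  (loop for state = seed then (funcall next state)
        until (funcall done-p state)
        collect (funcall element state)))

;; Generate the squares of 1..5 from the seed state 1.
(unfold (lambda (n) (> n 5))
        (lambda (n) (* n n))
        #'1+
        1)
;; => (1 4 9 16 25)
```

Because loop's collect clause appends in order, this version needs no final reversal.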
Here, let's define a state as 3 values: a string, a starting position in the string, and a stack of tokens parsed so far.
Our step function next-token returns the next state.
;; definition follows below
(declaim (ftype function next-token))
The main function which gets all tokens from a string just computes a fixpoint:
(defun all-tokens (string)
(do (;; initial start value is 0
(start 0)
;; initial token stack is nil
(tokens))
;; loop until start is nil, then return the reverse of tokens
((not start) (nreverse tokens))
;; advance state
(multiple-value-setq (string start tokens)
(next-token string start tokens))))
We need an auxiliary function:
(defun parenthesisp (c)
(find c "()"))
The step function is defined as follows:
(defun next-token (string start token-stack)
(let ((search (position-if #'parenthesisp string :start start)))
(typecase search
(number
;; token from start to parenthesis
(when (> search start)
(push (subseq string start search) token-stack))
;; parenthesis
(push (subseq string search (1+ search)) token-stack)
;; next state
(values string (1+ search) token-stack))
(null
;; token from start to end of string
(when (< start (length string))
(push (subseq string start) token-stack))
;; next-state
(values string nil token-stack)))))
You can try with a single string:
(next-token "(aviyon" 0 nil)
"(aviyon"
1
("(")
If you take the resulting state values and reuse them, you have:
(next-token "(aviyon" 1 '("("))
"(aviyon"
NIL
("aviyon" "(")
And here, the second return value is NIL, which ends the generation process.
Finally, you can do:
(mapcan #'all-tokens '("(aviyon" "213" "flyingman" "no))"))
Which gives:
("(" "aviyon" "213" "flyingman" "no" ")" ")")
The above code is not fully generic in the sense that all-tokens knows too much about next-token: you could rewrite it to take any kind of state.
You could also handle sequences of strings using the same mechanism, by keeping more information in your state variable.
Also, in a real lexer you would not want to reverse the whole list of tokens, you would use a queue to feed a parser.
solution
Since you didn't understand Alexander's solution, and since I had written my own solution anyway:
;; load two essential libraries for any Common Lisper
(ql:quickload :cl-ppcre)
(ql:quickload :alexandria)
;; see below for how to install Quicklisp for the `ql:quickload` command
;; it is roughly Python's `import` and, if the library is not installed,
;; `pip install`, in one command for Common Lisp
(defun remove-empty-string (string-list)
(remove-if #'(lambda (x) (string= x "")) string-list))
(defun split-parantheses-and-preserve-them (strings-list)
(remove-empty-string
(alexandria:flatten
(mapcar #'(lambda (el) (cl-ppcre:split "(\\(|\\))"
el
:with-registers-p t))
strings-list))))
;; so now your example
(defparameter *list* '("(aviyon" "213" "flyingman" "no))"))
(split-parantheses-and-preserve-them *list*)
;; returns:
;; ("(" "aviyon" "213" "flyingman" "no" ")" ")")
how this works
(cl-ppcre:split "(\\(|\\))" a-string)
splits the string at ( or ). Because ( and ) are used in regex patterns for capturing the match (as the outer parentheses do here), you have to escape them: \\( or \\).
So with cl-ppcre:split you can split any string in Common Lisp by a regex pattern. It is a super cool and super efficient package written by Edi Weitz. He wrote several super sophisticated packages for Common Lisp; they are also called ediware or edicls in the community.
By the way, cl-ppcre's documentation reports that, depending on the Lisp implementation, it can even outperform the gold standard for regex, the Perl regex engine.
The :with-registers-p t option then preserves the matched delimiter, which has to be captured by parentheses in the pattern, like this: (<pattern>).
mapcar this over the list to apply it to each string element in your list of strings.
However, what you get after that is a list of lists.
(Each inner list contains the split result for one string element of the list.)
Flatten the list with alexandria:flatten.
For many functions that are not in the Lisp standard but that you consider basic, like flattening a list, always look in alexandria first. It usually has the function you want; it is a huge library. That is why you need it anyway as a Common Lisper ;) .
But still, there will be empty strings around to be removed.
That is why I wrote remove-empty-string, which uses remove-if. Together with remove-if-not, it is the standard filtering function for lists.
It takes a predicate function, here (lambda (x) (string= x "")), which gives T if the string is empty and NIL if not.
In our function, it removes all elements of the flattened list that are empty strings.
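For this particular filter, the same result can also be had from remove with a :test argument. A short sketch using only standard Common Lisp functions, with the sample list taken from the split output shown earlier:

```lisp
;; REMOVE with :TEST drops every element that is STRING= to "".
(remove "" '("no" ")" "" ")") :test #'string=)
;; => ("no" ")" ")")

;; REMOVE-IF-NOT keeps only the elements that satisfy the predicate.
(remove-if-not (lambda (x) (string= x ")")) '("no" ")" "" ")"))
;; => (")" ")")
```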
In other languages it would be named filter, but yeah, sometimes function names in Common Lisp are not very well chosen. Sometimes I think we should create alias names, move over to them, and keep the old names for backward compatibility. Clojure has nicer names for functions ... Maybe CL people should adopt Clojure's function names ...
quicklisp
@Alexander Artemenko wrote exactly my solution; he came first. I will add:
If you are new to Common Lisp, maybe you don't know how to use Quicklisp.
In a terminal (Linux or macOS), do:
wget https://beta.quicklisp.org/quicklisp.lisp
On Windows, download the file manually from that address.
I put it into the ~/quicklisp folder.
Then in clisp or sbcl do:
(load "~/quicklisp/quicklisp.lisp") ;; just path to where downloaded
;; quicklisp.lisp file is!
;; then install quicklisp:
(quicklisp-quickstart:install)
;; then search for cl-ppcre
(ql:system-apropos "cl-ppcre")
;; then install cl-ppcre
(ql:quickload "cl-ppcre")
;; and to autoload it every time you start sbcl or clisp
;; on Linux or macOS (sorry, I don't know Windows that well;
;; I am of the opinion that every programmer should use Unix
;; as their OS),
;; you have to let Quicklisp be loaded when they start,
;; by an entry in the init file,
;; usually located in ~/.sbclrc or ~/.clisprc.lisp or such,
;; respectively.
;; quicklisp does an entry automatically if you do:
(ql:add-to-init-file)
;; after installation do:
(quit)
;; If you then restart sbcl or clisp and try:
(ql:quickload :cl-ppcre)
;; it should work, - if not, you have to manually load
;; quicklisp first
(load "~/quicklisp/setup.lisp") ;; or wherever quicklisp's
;; setup.lisp file has been stored in your system!
;; and then you can do
(ql:quickload :cl-ppcre)
;; to install alexandria package then, do
(ql:quickload :alexandria) ;; or "alexandria"
;; ql:quickload installs the package from quicklisp repository,
;; if it cannot find package on your system.
;; learn more about quicklisp, since this is the package
;; manager of common lisp - like pip for python
Is there a way to push-back to a list in elisp?
The closest thing I found was
(add-to-list 'myList myValue t) ;; t tells it to put to back of the list
The problem, however, is that add-to-list enforces uniqueness. The other alternative is (push myVal myList), but that can only push to the front of a list. I tried using (cons myList myVal), but IIRC that returns something other than a list.
The only thing that has worked is
(setq myList (append myList (list myVal)))
but that syntax is hideous and feels like a lot of extra work to do something simple.
Is there a faster way to push to the back of a list? It's clearly possible, as seen in add-to-list, but is there a way to do it without enforcing uniqueness?
In other words, a good old push-back function, like with C++ and the std::list class.
Lisp lists vs "lists" in other languages
Lisp lists are chains of cons cells ("linked lists"), not specialized sequential containers like in C++, and not a weird blend of lists and vectors like in Perl and Python.
This beautiful approach allows the same methodology to be applied to code and data, creating a programmable programming language.
The reasons Lisp does not have a "push-back" function are that
it does not need it :-)
it is not very well defined
No need
Adding to the end is linear in the length of the list, so, for
accumulation, the standard pattern is to combine
push while iterating and
nreverse when done.
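The pattern just described can be sketched as follows; the input list and the squaring step are made up for the demonstration.

```elisp
;; Collect the squares of 1..5 in their original order.
(let ((acc nil))
  (dolist (x '(1 2 3 4 5))
    ;; `push' adds to the front of ACC in constant time.
    (push (* x x) acc))
  ;; `nreverse' restores the original order in one linear pass.
  (nreverse acc))
;; => (1 4 9 16 25)
```

The whole accumulation is linear in the number of elements, whereas appending to the end on every step would be quadratic.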
Not well defined
There is a reason why add-to-list takes a symbol as the argument (which makes it useless for programming).
What happens when you are adding to an empty list?
You need to modify the place where the list is stored.
What happens when the list shares structure with another object?
If you use
(setq my-list (nconc my-list (list new-value)))
all the other objects are modified too.
If you write, like you suggested,
(setq my-list (append my-list (list new-value)))
you will be allocating (length my-list) cells on each addition.
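A small sketch of the difference between the two forms, using throwaway variable names chosen for illustration:

```elisp
;; `nconc' modifies the last cons cell, so every reference to the
;; same list sees the new element.
(let* ((shared (list 1 2 3))   ; fresh list, safe to modify
       (alias shared))         ; second reference to the same cells
  (setq shared (nconc shared (list 4)))
  alias)
;; => (1 2 3 4)

;; `append' copies the first list, so the other reference is left
;; untouched, at the cost of allocating three new cells.
(let* ((original (list 1 2 3))
       (alias original))
  (setq original (append original (list 4)))
  (list original alias))
;; => ((1 2 3 4) (1 2 3))
```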
Try this:
(defun prueba ()
(interactive)
(let ((mylist '(1 2 3)))
(message "mylist -> %s" mylist)
(add-to-list 'mylist 1 t)
(message "mylist -> %s" mylist)
(setq mylist '(1 2 3))
(add-to-list 'mylist 1 t '(lambda (a1 a2) nil))
(message "mylist -> %s" mylist)
))
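Adding a compare function that always returns nil as the fourth argument to add-to-list allows you to add duplicates. Note that add-to-list operates on the symbol's value, so this example relies on mylist being dynamically bound (that is, lexical-binding is off).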
I was trying to port over the "pretty entities" behaviour from org-mode to latex-mode using the Emacs builtin prettify-symbols-mode. This mode uses font-lock-mode to display character sequences in a buffer as a single (unicode) character. By default for instance emacs-lisp code
(lambda () t)
becomes
(λ () t)
It does however seem to require the character sequences to be separated by some characters, e.g. white-spaces. For instance in my setup, the replacement
\alpha \beta -> α β
will work, but it will fail when the strings are not separated, e.g.
\alpha\beta -> \alphaβ
This is an issue specifically because I wanted to use this prettification to make quantum mechanical equations more readable, where I need, e.g., a replacement like
|\psi\rangle -> |ψ⟩
Is it possible to avoid this delimiter-issue using prettify-symbols-mode? And if it is not, is it possible by using font-lock-mode on a lower level?
Here's the code that should do what you want:
(defvar pretty-alist
(cl-pairlis '("alpha" "beta" "gamma" "delta" "epsilon" "zeta" "eta"
"theta" "iota" "kappa" "lambda" "mu" "nu" "xi"
"omicron" "pi" "rho" "sigma_final" "sigma" "tau"
"upsilon" "phi" "chi" "psi" "omega")
(mapcar
(lambda (x) (make-char 'greek-iso8859-7 x))
(number-sequence 97 121))))
(add-to-list 'pretty-alist '("rangle" . ?\⟩))
(defun pretty-things ()
(mapc
(lambda (x)
(let ((word (car x))
(char (cdr x)))
(font-lock-add-keywords
nil
`((,(concat "\\(^\\|[^a-zA-Z0-9]\\)\\(" word "\\)[a-zA-Z]")
(0 (progn
(decompose-region (match-beginning 2) (match-end 2))
nil)))))
(font-lock-add-keywords
nil
`((,(concat "\\(^\\|[^a-zA-Z0-9]\\)\\(" word "\\)[^a-zA-Z]")
(0 (progn
(compose-region (1- (match-beginning 2)) (match-end 2)
,char)
nil)))))))
pretty-alist))
As you can see above, pretty-alist starts out with Greek characters. Then I add \rangle just to demonstrate how to add new things.
To enable it automatically, add it to the hook:
(add-hook 'LaTeX-mode-hook 'pretty-things)
I used the code from here as a starting point; you can look there for a reference.
The code of prettify-symbols-mode derives from code developed for languages like Haskell and a few others, which don't use something like TeX's \. So you may indeed be in trouble. I suggest you M-x report-emacs-bug requesting prettify-symbols-mode be improved to support TeX-style syntax.
In the meantime, you'll have to "do it by hand" along the lines of what abo-abo suggests.
One note, tho: back in the days of Emacs-21, I ported X-Symbol to work on Emacs, specifically because I wanted to see such pretty things in LaTeX. Yet, I discovered that it was mostly useless to me. And I think it's even more the case now. Here's why:
You can just use an actual ψ character in your LaTeX code instead of \psi nowadays. So you don't need display tricks for it to look "right".
Rather than repeating |\psi\rangle I much prefer defining macros and then use \Qr{\Vangle} (where \Vangle turns into \psi ("V" stands for "metaVariable"), and \Qr wraps it in a braket) so I can easily tweak those macros and know that the document will stay consistent. At that point, pretty-display of \psi and \rangle is of no importance.
The Lisp forum thread Define macro alias? has an example of creating a function alias using a form such as
(setf (symbol-function 'zero?) #'zerop)
This works fine, making zero? a valid predicate. Is it possible to parametrize this form without resorting to macros? I'd like to be able to call the following and have it create function?:
(define-predicate-alias 'functionp)
My take was approximately:
(defun defalias (old new)
(setf (symbol-function (make-symbol new))
(symbol-function old)))
(defun define-predicate-alias (predicate-function-name)
(let ((alias (format nil "~A?" (string-right-trim "-pP" predicate-function-name))))
(defalias predicate-function-name alias)))
(define-predicate-alias 'zerop)
(zero? '())
This fails when trying to call zero? saying
The function COMMON-LISP-USER::ZERO? is undefined.
make-symbol creates an uninterned symbol. That's why zero? is undefined.
Replace your (make-symbol new) with e.g. (intern new *package*). (Or you may want to think more carefully in which package to intern your new symbol.)
Your code makes a symbol, via MAKE-SYMBOL, but you don't put it into a package.
Use the function INTERN to add a symbol to a package.
To expand on Lars' answer, choose the right package. In this case the default might be to use the same package from the aliased function:
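A minimal sketch of that idea follows. The name make-predicate-alias, the optional package parameter, and the blankp helper are assumptions made for illustration. Note that symbols such as ZEROP live in the COMMON-LISP package, where interning new symbols is blocked by package locks in some implementations (SBCL, for example), so for those a different package is passed explicitly.

```lisp
;; MAKE-PREDICATE-ALIAS is a hypothetical helper. By default the
;; alias is interned in the package of the aliased function's name.
(defun make-predicate-alias (old &optional (package (symbol-package old)))
  (let ((new (intern (concatenate 'string
                                  (string-right-trim "-P" (symbol-name old))
                                  "?")
                     package)))
    (setf (symbol-function new) (symbol-function old))
    new))

;; BLANKP is a made-up predicate living in the current package.
(defun blankp (s) (string= s ""))

(make-predicate-alias 'blankp)            ; defines BLANK?
(funcall 'blank? "")                      ; true

;; ZEROP belongs to COMMON-LISP, so name the target package.
(make-predicate-alias 'zerop *package*)   ; defines ZERO?
(funcall 'zero? 0)                        ; true
```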
About style:
Anything that begins with DEF should actually be a macro. If you have a function, don't use a name beginning with "DEF". If you look at the Common Lisp language, all of those are macros. With those defining forms, one would typically expect that they have a side effect during compilation of files: the compiler gets informed about them. A function can't do that.
If I put something like this in a file
(define-predicate-alias zerop)
(zero? '())
and then compile the file, I would expect to not see any warnings about an undefined ZERO?. Thus a macro needs to expand (define-predicate-alias zerop) into something which makes the new ZERO? known in the compile-time environment.
I would also make the new name the first argument.
Thus use something like MAKE-PREDICATE-ALIAS instead of DEFINE-PREDICATE-ALIAS, for the function.
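One way such a macro might look is sketched below. It takes the new name first, as suggested, and uses an ftype proclamation to announce the new function at compile time. This is an assumption-laden sketch rather than a definitive implementation, and whether a particular compiler stays silent is implementation-dependent.

```lisp
(defmacro define-predicate-alias (new old)
  "Define NEW as another name for the function named OLD."
  `(progn
     ;; Tell the compiler at compile time that NEW names a function,
     ;; so later calls in the same file are not reported as undefined.
     (declaim (ftype function ,new))
     ;; Install the function object at load time.
     (setf (symbol-function ',new) (symbol-function ',old))
     ',new))

(define-predicate-alias zero? zerop)
(zero? 0)   ; true
```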
There are already some answers that explain how you can do this, but I'd point out:
Naming conventions, P, and -P
Common Lisp has a naming convention that is mostly adhered to (there are exceptions, even in the standard library): if a type name consists of multiple words (contains a -), then its predicate is named with a -P suffix, whereas if it doesn't, the suffix is just P. So we'd have keyboardp and lcd-monitor-p. It's good, then, that you're using (string-right-trim "-pP" predicate-function-name), but since the …P and …-P names in the standard, and those generated by, e.g., defstruct, will be using P, not p, you might just use (string-right-trim "-P" predicate-function-name). Of course, even this has possible issues with some names (e.g., pop), but I guess that just comes with the territory.
Symbol names, format, and *print-case*
More importantly, using format to create symbol names for subsequent interning is dangerous, because format doesn't always print a symbol's name with the characters in the same case that they actually appear in its name. E.g.,
(let ((*print-case* :downcase))
(list (intern (symbol-name 'foo))
(intern (format nil "~A" 'foo))))
;=> (FOO |foo|) ; first symbol has name "FOO", second has name "foo"
You may be better off using string concatenation and extracting symbol names directly. This means you could write code like the following (this is a slightly different use case, since the other answers already explain how you can do what you're trying to do):
(defmacro defpredicate (symbol)
(flet ((predicate-name (symbol)
(let* ((name (symbol-name symbol))
(suffix (if (find #\- name) "-P" "P")))
(intern (concatenate 'string name suffix)))))
`(defun ,(predicate-name symbol) (x)
(typep x ',symbol)))) ; however you're checking the type
(macroexpand-1 '(defpredicate zero))
;=> (DEFUN ZEROP (X) (TYPEP X 'ZERO))
(macroexpand-1 '(defpredicate lcd-monitor))
;=> (DEFUN LCD-MONITOR-P (X) (TYPEP X 'LCD-MONITOR))
I don't even know the proper terminology for this lisp syntax, so I don't know if the words I'm using to ask the question, make sense. But the question makes sense, I'm sure.
So let me just show you. cc-mode (cc-fonts.el) has things called "matchers", which are bits of code that run to decide how to fontify a region of code. That sounds simple enough, but the matcher code is in a form I don't completely understand, with backticks and comma-atsign and just comma and so on, and furthermore it is embedded in a c-lang-defconst, which itself is a macro. I don't know what to call all that, but I want to run edebug on that code.
Look:
(c-lang-defconst c-basic-matchers-after
"Font lock matchers for various things that should be fontified after
generic casts and declarations are fontified. Used on level 2 and
higher."
t `(;; Fontify the identifiers inside enum lists. (The enum type
;; name is handled by `c-simple-decl-matchers' or
;; `c-complex-decl-matchers' below.
,@(when (c-lang-const c-brace-id-list-kwds)
`((,(c-make-font-lock-search-function
(concat
"\\<\\("
(c-make-keywords-re nil (c-lang-const c-brace-id-list-kwds))
"\\)\\>"
;; Disallow various common punctuation chars that can't come
;; before the '{' of the enum list, to avoid searching too far.
"[^\]\[{}();,/#=]*"
"{")
'((c-font-lock-declarators limit t nil)
(save-match-data
(goto-char (match-end 0))
(c-put-char-property (1- (point)) 'c-type
'c-decl-id-start)
(c-forward-syntactic-ws))
(goto-char (match-end 0)))))))
I am reading up on lisp syntax to figure out what those things are and what to call them, but aside from that, how can I run edebug on the code that follows the comment that reads ;; Fontify the identifiers inside enum lists. ?
I know how to run edebug on a defun - just invoke edebug-defun within the function's definition, and off I go. Is there a corresponding thing I need to do to edebug the cc-mode matcher code forms?
What does def-edebug-spec do, and would I use it here? If so, how?
According to (elisp)Top > Debugging > Edebug > Edebug and Macros, you have to tell Edebug how to debug a macro, either by defining it with debug statements or by using def-edebug-spec. This tells it which parameters should be evaluated and which shouldn't. So it can be done. In fact, it looks as if c-lang-defconst has already been fitted for edebug. Here is the definition in case you are interested:
(def-edebug-spec c-lang-defconst
(&define name [&optional stringp] [&rest sexp def-form]))
However, if you just want to see what the body evaluates to, then the way to do that is to use something like macro-expand-last-sexp below to see the result. Position your cursor after the sexp you want expanded (as you would for C-x C-e) and run M-x macro-expand-last-sexp RET. This will show you what it gets expanded to. You may run into troubles if you try to expand something like ,(....) so you may have to copy that sexp somewhere else and delete the , or ,@.
(defun macro-expand-last-sexp (p)
"Macro expand the previous sexp. With a prefix argument
insert the result into the current buffer and pretty print it."
(interactive "P")
(let*
((sexp (preceding-sexp))
(expanded (macroexpand sexp)))
(cond ((eq sexp expanded)
(message "No changes were found when macro expanding"))
(p
(insert (format "%S" expanded))
(save-excursion
(backward-sexp)
(indent-pp-sexp 1)
(indent-pp-sexp)))
(t
(message "%S" expanded)))))
I guess it depends on exactly what you are trying to do.
Use macroexpand or macroexpand-all to turn it into macro-free code and debug as usual?
Backticks &co may be best illustrated by an example:
(let ((a 1)
(b (list 2 3)))
`(a ,a ,b ,@b))
-> (a 1 (2 3) 2 3)
A backtick (or backquote, `) is similar to a quote (') in that it prevents evaluation, except that its effect can be selectively undone with a comma (,); and ,@ is like ,, except that its argument, which must be a list, is spliced into the resulting list.
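To see what the template stands for, it can be compared with the same list built by hand. This is just an illustrative sketch using the variables from the example above:

```elisp
(let ((a 1)
      (b (list 2 3)))
  ;; 'a stays a symbol, ,a and ,b insert values, and ,@b splices
  ;; the elements of B, which is what `append' does by hand here.
  (equal `(a ,a ,b ,@b)
         (append (list 'a a b) b)))
;; => t
```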