I need to replace each string enclosed in parentheses with the same string wrapped in a prefix and curly brackets, e.g. \fill{(test_string)}. Is this possible?
Example:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam
nonummy nibh euismod tincidunt ut laoreet dolore.
(first_string)
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy
nibh euismod tincidunt ut laoreet dolore.
(second_string)
Transformed into:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam
nonummy nibh euismod tincidunt ut laoreet dolore.
\fill{(first_string)}
Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore.
\fill{(second_string)}
As suggested, you could use a regex search and replace. A simple example of this can be found here. In your case, this should work:
\(([^\)]+)\)
The regular expression \(([^\)]+)\) does the following (explanation taken from this site; you'll need to paste the regex into the site to see it):
\( matches the character ( literally (case sensitive)
1st Capturing Group ([^\)]+)
  Match a single character not present in the list below [^\)]
  + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
  \) (inside the list) matches the character ) literally (case sensitive)
\) matches the closing character ) literally (case sensitive)
In Visual Studio Code, if you enable regex search by clicking the .* icon in the search bar, you can put this regular expression in. Then, in the replace section, you can put \fill{($1)} where the $1 is the 1st Capturing Group mentioned previously (the first_string, second_string, etc. part found by the regular expression).
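For anyone who wants to run the same replacement outside the editor, here is a minimal Python sketch of the search and replace described above. The function name wrap_in_fill and the sample text are made up for illustration; only the pattern and the \fill{(...)} target come from the answer.

```python
import re

# Same idea as the pattern above: "(", one or more non-")" characters, ")".
PAREN_PATTERN = re.compile(r"\(([^)]+)\)")


def wrap_in_fill(text: str) -> str:
    """Wrap every parenthesised string in a LaTeX-style fill command."""
    # A function replacement is used so Python does not try to interpret
    # the backslash in "\fill" as an escape inside a replacement template.
    return PAREN_PATTERN.sub(lambda m: "\\fill{(" + m.group(1) + ")}", text)


sample = "Lorem ipsum dolor.\n(first_string)\nSed diam nonummy.\n(second_string)"
print(wrap_in_fill(sample))
```

Group 1 plays the same role as $1 in the Visual Studio Code replace field.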
There are a lot of regex posts here on Stack Overflow that you may want to read. One notable one is Greedy versus Lazy.
description
I want to link some titles in Markdown. It works well except when I link a German word like "Systemüberwachung" or "ähnliches". I think this is not working because of the "ü" and "ä".
I already tried linking like this: #system-berwachung, #systemberwachung, #systemuberwachung and many others.
But how can I link a word containing the characters ä, ö and ü?
I use VSCode 1.63.2 and Markdown Preview
test code snippet
- [Systemüberwachung](#systemüberwachung)
- [something](#something)
- [ähnliches](#ähnliches)
## Systemüberwachung
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet
## something
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet
## ähnliches
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet
It looks like Visual Studio Code takes the title string, lower-cases it, and URL-encodes it. That makes sense, as these are essentially URL fragments.
Heading            | Link fragment
Systemüberwachung  | system%C3%BCberwachung
ähnliches          | %C3%A4hnliches
You can reproduce this in JavaScript, e.g. like so:
encodeURIComponent("Systemüberwachung".toLowerCase())
//=> "system%C3%BCberwachung"
(There's probably more to this logic, e.g. to replace punctuation, but for the purposes of this question I think this is the relevant part.)
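The same lower-case-then-percent-encode step can be sketched in Python for comparison. This is only an approximation of what the preview appears to do: the helper name heading_fragment is hypothetical, and urllib.parse.quote does not escape exactly the same set of punctuation as encodeURIComponent, so treat it as an illustration for plain words like the ones in the question.

```python
from urllib.parse import quote


def heading_fragment(heading: str) -> str:
    """Lower-case a heading and percent-encode it as UTF-8."""
    # safe="" makes quote() encode "/" as well; letters, digits and
    # "_", ".", "-", "~" are left unescaped.
    return quote(heading.lower(), safe="")


print(heading_fragment("Systemüberwachung"))  # system%C3%BCberwachung
print(heading_fragment("ähnliches"))          # %C3%A4hnliches
```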
Unfortunately, these links don't seem to actually work, even though they are linking to the correct elements. I'm not sure why that is.
Background
Since the Markdown preview in Visual Studio Code is rendered in a webview, I discovered this by opening its dev tools with the "Developer: Open Webview Developer Tools" command via the command palette.
Then I used the "select an element" feature (also available via Ctrl+Shift+C when the dev tools are open and focused):
Clicking on the rendered title brings up the underlying HTML in the devtools panel:
Which then reveals the generated ID:
<h1 data-line="0" class="code-line" id="system%C3%BCberwachung">Systemüberwachung</h1>
We can verify that we are using the correct ID in the devtools console, e.g.
document.getElementById("system%C3%BCberwachung")
# German umlaut characters
## Table of Content
- [Systemüberwachung](#systemuberwachung)
- [something](#something)
- [ähnliches](#ahnliches)
<div id="systemuberwachung">
<h2>Systemüberwachung</h2>
</div>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, ...
## something
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, ...
<div id="ahnliches">
<h2>ähnliches</h2>
</div>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, ...
I'm looking to split a paragraph of text into individual sentences using Dart. The problem I am having is that sentences can end in a number of punctuation marks (e.g. '.', '!', '?') and in some cases (such as the Japanese language), sentences can end in unique symbols (e.g. '。').
Additionally, Dart's split method removes the split value from the string. For example, "Hello World!" becomes "Hello World" when using the code text.split('! ');
I've looked around at Dart packages available but I'm unable to find anything that does what I'm looking for.
Ideally, I'm looking for something similar to BreakIterator in Java which allows the programmer to define which locale they wish to use when detecting punctuation and also maintains the punctuation mark when splitting the string into sentences. I'm happy to use a solution in Dart that doesn't automatically detect sentence endings based on Locale but if this isn't available I would like to have the ability to define all sentence endings to look for when splitting a string.
Any help is appreciated. Thank you in advance.
It can be done using a regex, something like this:
void main() {
  String str1 = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. In vulputate odio eros, sit amet ultrices ipsum auctor sed. Mauris in faucibus elit. Nulla quam orci? ultrices a leo a, feugiat pharetra ex. Nunc et ipsum lorem. Integer quis congue nisi! In et sem eget leo ullamcorper consectetur dignissim vitae massa。Nam quis erat ac tellus laoreet posuere. Vivamus eget sapien eget neque euismod mollis.";
  // Regular expression: a run of word characters, whitespace, commas or
  // apostrophes, then optional sentence-ending punctuation and trailing spaces.
  RegExp re = RegExp(r"(\w|\s|,|')+[。.?!]*\s*");
  // Get all the matches:
  Iterable<RegExpMatch> matches = re.allMatches(str1);
  // Iterate over all matches; group(0) is the whole matched sentence.
  for (RegExpMatch m in matches) {
    String match = m.group(0)!;
    print("match: $match");
  }
}
output:
// match: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
// match: In vulputate odio eros, sit amet ultrices ipsum auctor sed.
// match: Mauris in faucibus elit.
// match: Nulla quam orci?
// match: ultrices a leo a, feugiat pharetra ex.
// match: Nunc et ipsum lorem.
// match: Integer quis congue nisi!
// match: In et sem eget leo ullamcorper consectetur dignissim vitae massa。
// match: Nam quis erat ac tellus laoreet posuere.
// match: Vivamus eget sapien eget neque euismod mollis.
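For comparison, here is a rough Python sketch of the same "match sentences instead of splitting" idea, with the sentence-ending characters defined in one place. It is not the Dart code above: the pattern matches anything that is not a terminator rather than listing allowed characters, and the names SENTENCE_ENDINGS and split_sentences are invented for this example.

```python
import re

# Characters treated as sentence endings; extend this string as needed.
SENTENCE_ENDINGS = "。.?!"

# One or more non-terminator characters followed by any run of terminators.
_SENTENCE_RE = re.compile(
    "[^{0}]+[{0}]*".format(re.escape(SENTENCE_ENDINGS))
)


def split_sentences(text: str) -> list:
    """Return the sentences in text, keeping their ending punctuation."""
    return [m.group(0).strip() for m in _SENTENCE_RE.finditer(text)]


print(split_sentences("Hello World! How are you? 元気です。Fine."))
```

Like the regex above, this does no locale detection and will split on abbreviations such as "e.g.", so it only covers the "define all sentence endings yourself" fallback from the question.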
Let's assume we have a user given array:
q = ['dolor', 'sed']
And an item in my db is:
{//data,
'paragraphs' : [{ 'header' : 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam venenatis lectus risus, a interdum lectus rhoncus sed. Vestibulum sit amet massa eu metus iaculis laoreet et non est. ',
// more data
},
{ .. }]
}
I want to find if the 'paragraphs.header' has the word dolor or sed.
I tried $in and search with no success. What should I use?
An ineffective way would be to use regular expressions:
db.foo.find({"paragraphs.header": {$in: [/dolor/, /sed/]}})
Because none of these patterns is anchored to the beginning of the string, the query cannot use indexes and has to perform a full scan to find matching documents.
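If the word list comes from user input, the regex query can be built programmatically. The sketch below constructs the same filter document in Python from the array q in the question; build_header_query is a hypothetical helper, and the assumption that the result can be handed to a driver (for example as collection.find(query) with pymongo) is mine, not something stated in the question.

```python
import re


def build_header_query(words):
    """Build a filter matching documents whose paragraphs.header
    contains any of the given words as an unanchored substring."""
    # re.escape keeps user-supplied words from being read as regex syntax.
    patterns = [re.compile(re.escape(word)) for word in words]
    return {"paragraphs.header": {"$in": patterns}}


q = ["dolor", "sed"]
query = build_header_query(q)
print([p.pattern for p in query["paragraphs.header"]["$in"]])
```

The patterns are still unanchored, so the full-scan caveat above applies unchanged.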
If you want an effective way, you should look at Text Search.
start mongod with --setParameter textSearchEnabled=true
create text index db.foo.ensureIndex({"paragraphs.header": "text"})
search db.foo.runCommand("text", {search: "dolor sed"})
You can specify language when you create text index or run text command but unfortunately Latin is not supported ;)
Is there a good way to configure vim to send format=flowed emails that include hanging indents?
My complete vimrc (for testing purposes) is:
set nocompatible
set fo+=awn
set tw=72
set ai
I'm typing something like:
1. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam
posuere dui lorem, et condimentum nulla. Sed pharetra justo nec ante
fringilla non mattis nisi blandit. Donec molestie ligula dolor.
Nulla facilisi. Aliquam vel nulla elit, mollis facilisis metus. Sed
id eros a ante blandit convallis id sit amet elit. Duis malesuada
lobortis leo a placerat. Sed ut ipsum nisl. Sed pretium mauris vitae
velit sollicitudin iaculis.
vim adds a trailing space to each line except the last, per set fo+=w. It also adds spaces for the hanging indent. It looks great!
My mail client sets the format=flowed header. The result when this email is viewed in either Mail.app or mutt is not pretty:
1. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam posuere dui lorem, et condimentum nulla. Sed pharetra justo nec ante fringilla non mattis nisi blandit. Donec molestie ligula dolor. Nulla facilisi. Aliquam vel nulla elit, mollis facilisis metus. Sed id eros a ante blandit convallis id sit amet elit. Duis malesuada lobortis leo a placerat. Sed ut ipsum nisl. Sed pretium mauris vitae velit sollicitudin iaculis.
The paragraph wraps correctly, in the sense that resizing the reader client reflows it (which is not what you'll see here on stackoverflow, but you get the idea). The problem is, there are 5 spaces between "Etiam" and "posuere" and all the other lines that have been joined back together.
Is there a fix for this in vim? Or is this a limitation of the format=flowed spec? How do other people handle this?
The paragraph wraps correctly, in the sense that resizing the reader client reflows it (which is not what you'll see here on stackoverflow, but you get the idea). The problem is, there are 5 spaces between "Etiam" and "posuere" and all the other lines that have been joined back together.
This is a limitation of the "format=flowed" MIME parameter as specified in RFC 3676. There is nothing in the specification that would allow a client to recognize the leading spaces as ornaments intended only for plaintext versions of the mail.
Section 4.1 of the RFC states:
If the first character of a line is a space, the line has been space-stuffed (see Section 4.4). Logically, this leading space is deleted before examining the line further (that is, before checking for flowed).
The referenced "space-stuffing" from Section 4.4:
Space-stuffing adds a single space to the start of any line which needs protection when the message is generated. On reception, if the first character of a line is a space, it is logically deleted. This occurs after the test for a quoted line (which logically counts and deletes any quote marks), and before the test for a flowed line.
So an RFC 3676-compliant mail client would remove a single leading space from each line that begins with one and then (optionally) remove any line breaks that follow a trailing space character. This process would not touch the remaining leading whitespace.
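To make the effect concrete, here is a simplified Python sketch of the two steps just described: delete one stuffed leading space, then join lines that end in a space. It is not a full RFC 3676 decoder (quote depth and the "-- " signature separator are ignored), and the function name unflow and the three-space hanging indent in the sample are assumptions for illustration.

```python
def unflow(lines):
    """Join format=flowed lines into paragraphs (simplified sketch)."""
    paragraphs = []
    current = ""
    for line in lines:
        # Space-stuffing: exactly one leading space is logically deleted.
        if line.startswith(" "):
            line = line[1:]
        current += line
        # A line ending in a space is "flowed" and continues on the next
        # line; any other line ends the paragraph.
        if not line.endswith(" "):
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs


message = [
    "1. Lorem ipsum dolor sit amet. Etiam ",  # trailing space from fo+=w
    "   posuere dui lorem.",                  # hanging indent from vim
]
print(unflow(message))
```

With a three-space indent, one space is consumed as stuffing and two remain, which together with the trailing space leaves a three-space gap in the joined paragraph. A wider indent leaves a wider gap in the same way.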
When editing documents I always stick to a maximum line width of 80 or 150 characters, depending on what I am writing (code, text, etc.). If I change even a little, the whole paragraph shifts, and multiple lines are rewrapped to fit the given line width. How do I diff this to see the actual change and not the rewrapping artifacts?
Example, textwidth=30:
The actual changes are rather tiny:
line 9 insert: "Now I change a little"
line 15 insert: "Fill in here something and write totally new stuff with much more lines. "
line 18 change: s/Duis/TYPO/
The fact that I use (g)vimdiff here is of no matter, if other software can accomplish the desired diff.
Of course, software is designed to wrap automatically when text reaches the window border, so I also tried using line breaks only at the end of a paragraph. The reason this is not good is that diffs are line-based, so for a small change in a paragraph I get the whole line, meaning the whole paragraph, as the diff update :(.
GNU wdiff does a word-by-word diff, not treating spaces and new lines any differently. One can even find vim syntax files for it (e.g. here).
$ cat file1
Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Aenean vel molestie
nulla. Pellentesque placerat lacus vel
eros malesuada tristique. Nulla vitae
volutpat justo. Donec est mauris,
$ cat file2
Lorem amet, consectetur adipiscing some
inserted text! elit. Aenean vel molestie
nulla. Pellentesque placerat lacus vel
eros malesuada replacement. Nulla vitae
volutpat justo. Donec est mauris,
$ wdiff file1 file2
Lorem [-ipsum dolor sit-] amet, consectetur
adipiscing {+some inserted text!+} elit. Aenean vel molestie
nulla. Pellentesque placerat lacus vel
eros malesuada [-tristique.-] {+replacement.+} Nulla vitae
volutpat justo. Donec est mauris,
([- ... -] is deleted text, {+ ... +} is inserted text).
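If wdiff is not available, the word-by-word idea can be approximated with Python's standard difflib. This is a rough sketch rather than a reimplementation of wdiff: it borrows the [-...-] and {+...+} markers, drops the original line breaks, and the function name word_diff is made up for this example.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Diff two texts word by word, ignoring how the lines are wrapped."""
    # split() treats spaces and newlines alike, so rewrapping is invisible.
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


print(word_diff("the quick\nbrown fox", "the quick brown\ncat"))
# the quick brown [-fox-] {+cat+}
```

Text that has only been rewrapped produces no markers at all.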
(There are other diff programs that do a similar thing: e.g. adiff, and maybe some of the ones listed in https://stackoverflow.com/questions/12625/best-diff-tool)
I like Beyond Compare for this kind of side-by-side file comparison. It also lets you do folder comparisons and bit-level comparisons, and you can right-click to select the left-hand file to compare, then another to select the right-hand one; or select two files and right-click Compare to bring them both up straight away.
I use DiffMerge which is free and available on many platforms.