consistent code formatting inline and in chunks with bookdown - knitr

For bookdown, I'm working on recreating inline code formatting that is similar to what is produced in chunks (see this old SO issue). My current approach is:
local({
  hook_old <- knitr::knit_hooks$get("inline")
  knitr::knit_hooks$set(inline = function(x, options) {
    # Highlight character results only when rendering to HTML
    if (is.character(x) && knitr::is_html_output()) {
      highr::hi_html(x)
    } else {
      hook_old(x, options)
    }
  })
})
but that produces an empty HTML file for the chapter that attempts to use
`r 'lm(y ~ x, data = dat)'`
There is a reproducible repo at topepo/bookdown-inline; see 01-intro.Rmd.
More generally, I suspect that the required approach is different in bookdown.

Purpose of minSupported and maxSupported parameters in getVersion API

I find the getVersion API a bit hard to grasp. After some manual experiments with workflow changes, I found that code like this is perfectly fine:
val version = Workflow.getVersion("change#1", 1, 1);
val anotherVersion = Workflow.getVersion("change#2", 2, 2);
Does it mean that the integer version is assigned to a changeId and not to a workflow instance? Does a single workflow instance/execution keep a set of integer-based versions?
What is the purpose of the minSupported and maxSupported parameters? Why not simply use an API like the one below?
val version = Workflow.getVersion("change#1")
if (version) {
    // code after "change#1" changes
} else {
    // code before "change#1" changes
}
You are correct: the version is assigned to a changeId, not to a workflow instance. This allows each piece of the workflow code to be versioned independently. It also allows fixing a bug while a workflow is already running, as long as it hasn't yet reached that part of the code.
The main reason for minSupported and maxSupported is validation. The first time the code is executed, the getVersion call records maxVersion in the workflow history. On replay, the recorded version is used, which guarantees a correct replay even if maxVersion has changed since. When a branch is removed, minVersion is incremented. Imagine that such code is deployed by mistake while there is still a workflow that needs the removed branch. getVersion will detect that minVersion is larger than the version recorded in the history and will fail the decision task, essentially blocking the workflow execution instead of breaking it. The same happens if the recorded version is higher than the maxVersion argument.
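The record-then-validate behavior described above can be illustrated with a toy model. This is only a sketch in JavaScript and not the SDK's implementation: `ToyHistory` and its `versions` map are hypothetical names, and the case of a workflow that passed this point before any getVersion call existed (which yields the default version on replay) is left out.

```javascript
// Toy model of getVersion validation; NOT the real SDK implementation.
// ToyHistory and its fields are hypothetical names used for illustration.
class ToyHistory {
  constructor() {
    this.versions = new Map(); // changeId -> version recorded in history
  }

  // First execution: record maxSupported for this changeId.
  // Replay: return the recorded version, but fail if it falls outside
  // the range the currently deployed code claims to support.
  getVersion(changeId, minSupported, maxSupported) {
    if (!this.versions.has(changeId)) {
      this.versions.set(changeId, maxSupported);
      return maxSupported;
    }
    const recorded = this.versions.get(changeId);
    if (recorded < minSupported || recorded > maxSupported) {
      throw new Error(
        `Version ${recorded} of "${changeId}" is outside the supported ` +
        `range [${minSupported}, ${maxSupported}]`
      );
    }
    return recorded;
  }
}

const history = new ToyHistory();
const first = history.getVersion("change", 1, 2);    // first run records 2
const replayed = history.getVersion("change", 1, 3); // newer code, still 2

// Code that dropped the branch for version 2 is rejected instead of
// silently taking the wrong path.
let rejected = false;
try {
  history.getVersion("change", 3, 3);
} catch (e) {
  rejected = true;
}
```

In this model the failure is explicit and happens before any branch is taken, which is the property the min/max arguments are meant to provide.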
Update: Answer to the comment
In other words, I'm trying to come up with a situation where using
many different changeIds and not exceeding maxVersion=1 is not enough
They are enough if you never remove branches. But if you do, validating the minimal version is very convenient. For example, look at the following code:
// minSupported must be DEFAULT_VERSION for the first branch to be reachable
val version = Workflow.getVersion("change", DEFAULT_VERSION, 2);
if (version == DEFAULT_VERSION) {
    // before change
} else if (version == 1) {
    // first change
} else {
    // second change
}
Let's remove the default version:
val version = Workflow.getVersion("change", 1, 2);
if (version == 1) {
    // first change
} else {
    // second change
}
Now look at the version without min and max:
var version1 = Workflow.getVersion("change1");
var version2 = Workflow.getVersion("change2");
if (version1 == DEFAULT_VERSION) {
    // before change
} else if (version2 == DEFAULT_VERSION) {
    // first change
} else {
    // second change
}
Let's remove the default branch:
var version2 = Workflow.getVersion("change2");
if (version2 == DEFAULT_VERSION) {
    // first change
} else {
    // second change
}
Note that a workflow that used the last sample code is going to break in an unpredictable way if it is routed by mistake to a worker that doesn't know about version2 and only knows about the original default version. The first example, with min and max versions, is going to detect the issue gracefully.

file merge logic: scala

For Scala experts this might be a silly question, but as a beginner I'm having a hard time finding a solution. Any pointers would help.
I have a set of three files in an HDFS location, named:
fileFirst.dat
fileSecond.dat
fileThird.dat
They are not necessarily created in any particular order. fileFirst.dat could be created last, so an ls may show a different ordering of the files each time.
My task is to combine all the files into a single file in this order:
fileFirst contents, then fileSecond contents, and finally fileThird contents, with a newline as the separator and no spaces.
I tried some ideas but couldn't come up with anything that works. Every time, the files end up combined in the wrong order.
Below is my function to merge whatever is coming in:
def writeFile(): Unit = {
  val in: InputStream = fs.open(files(i).getPath)
  try {
    IOUtils.copyBytes(in, out, conf, false)
    if (addString != null) out.write(addString.getBytes("UTF-8"))
  } finally in.close()
}
Files is defined like this:
val files: Array[FileStatus] = fs.listStatus(srcPath)
This is part of a bigger function to which I pass all the arguments used in this method. After everything is done, I call out.close() to close the output stream.
Any ideas are welcome, even if they go against the file write logic I'm trying to use; just understand that I'm not that good at Scala, for now :)
If you can enumerate your Paths directly, you don't really need to use listStatus. You could try something like this (untested):
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.IOUtils

// The array order is the merge order, independent of how HDFS lists the files
val relativePaths = Array("fileFirst.dat", "fileSecond.dat", "fileThird.dat")
val paths = relativePaths.map(new Path(srcDirectory, _))
// Open the streams before each try so that they are in scope in finally
val output = fs.create(destinationFile)
try {
  for (path <- paths) {
    val input = fs.open(path)
    try {
      IOUtils.copyBytes(input, output, conf, false)
      // Write the separator here if needed, as with addString in the question
    } finally {
      input.close()
    }
  }
} finally {
  // Feel free to add catch blocks for error handling
  output.close()
}

How to select symbols onWorkspaceSymbol

I am developing an extension for Visual Studio Code using the Language Server Protocol, and I am adding support for "Go to Symbol in Workspace". My problem is that I don't know how to select the matches...
Currently I use this function that I wrote:
function IsInside(word1, word2)
{
    var ret = "";
    var i1 = 0;
    var lenMatch = 0, maxLenMatch = 0, minLenMatch = word1.length;
    for (var i2 = 0; i2 < word2.length; i2++)
    {
        if (word1[i1] == word2[i2])
        {
            lenMatch++;
            if (lenMatch > maxLenMatch) maxLenMatch = lenMatch;
            ret += word1[i1];
            i1++;
            if (i1 == word1.length)
            {
                if (lenMatch < minLenMatch) minLenMatch = lenMatch;
                // Trying to filter like VSCode does.
                return maxLenMatch >= word1.length / 2 && minLenMatch >= 2 ? ret : undefined;
            }
        } else
        {
            ret += "Z";
            if (lenMatch > 0 && lenMatch < minLenMatch)
                minLenMatch = lenMatch;
            lenMatch = 0;
        }
    }
    return undefined;
}
It returns the sortText if word1 is inside word2, and undefined otherwise. My problem is cases like this:
My algorithm sees that 'aller' is inside CallServer, but the interface does not highlight it as expected.
Is there a library or something that I must use for this? The VSCode code base is big and complex, and I don't know where to start looking for this information...
VSCode's API docs for provideWorkspaceSymbols() provide the following guidance (which I don't think your example violates):
The query-parameter should be interpreted in a relaxed way as the editor will apply its own highlighting and scoring on the results. A good rule of thumb is to match case-insensitive and to simply check that the characters of query appear in their order in a candidate symbol. Don't use prefix, substring, or similar strict matching.
These docs were added in response to this discussion, where somebody had very much the same issue as you.
Having a brief look at VSCode sources, internally it seems to use filters.matchFuzzy2() for the highlighting (see here and here). I don't think it's exposed in the API, so you would probably have to copy it if you wanted the behavior to match exactly.
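The rule of thumb quoted from the docs (match case-insensitively and only check that the query's characters appear in order) can be sketched as below. This is a minimal illustration, not VSCode's own scoring: `matchesInOrder` is a hypothetical helper name, and the sample symbol list is made up.

```javascript
// Relaxed filter: true when every character of `query` appears in
// `candidate` in the same order, ignoring case. Gaps are allowed.
// matchesInOrder is a hypothetical helper, not part of the VSCode API.
function matchesInOrder(query, candidate) {
  const q = query.toLowerCase();
  const c = candidate.toLowerCase();
  let qi = 0; // index of the next query character still to be found
  for (let ci = 0; ci < c.length && qi < q.length; ci++) {
    if (c[ci] === q[qi]) qi++;
  }
  return qi === q.length;
}

// Made-up symbol names, filtered with the query from the question.
const symbols = ["CallServer", "callback", "Server"];
const hits = symbols.filter((name) => matchesInOrder("aller", name));
```

With a filter like this the server only decides which symbols to return; the editor then applies its own highlighting and ranking to them.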

How to edit pasted content using the Open XML SDK

I have a custom template in which I'd like to control (as best I can) the types of content that can exist in a document. To that end, I disable controls, and I also intercept pastes to remove some of those content types, e.g. charts. I am aware that this content can also be drag-and-dropped, so I also check for it later, but I'd prefer to stop or warn the user as soon as possible.
I have tried a few strategies:
RTF manipulation
Open XML manipulation
RTF manipulation is so far working fairly well, but I'd really prefer to use Open XML as I expect it to be more useful in the future. I just can't get it working.
Open XML Manipulation
The wonderfully-undocumented (as far as I can tell) "Embed Source" appears to contain a compound document object, which I can use to modify the copied content using the Open XML SDK. But I have been unable to put the modified content back into an object that lets it be pasted correctly.
The modification part seems to work fine. I can see, if I save the modified content to a temporary .docx file, that the changes are being made correctly. It's the return to the clipboard that seems to be giving me trouble.
I have tried assigning just the Embed Source object back to the clipboard (so that the other types such as RTF get wiped out), and in this case nothing at all gets pasted. I've also tried re-assigning the Embed Source object back to the clipboard's data object, so that the remaining data types are still there (but with mismatched content, probably), which results in an empty embedded document getting pasted.
Here's a sample of what I'm doing with Open XML:
using System.IO;
using OpenMcdf;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
...
Forms.IDataObject dataObj = Forms.Clipboard.GetDataObject();
object embedSrcObj = dataObj.GetData("Embed Source");
if (embedSrcObj is Stream)
{
    // read it with OpenMCDF
    Stream stream = embedSrcObj as Stream;
    CompoundFile cf = new CompoundFile(stream);
    CFStream cfs = cf.RootStorage.GetStream("package");
    byte[] bytes = cfs.GetData();
    string savedDoc = Path.GetTempFileName() + ".docx";
    File.WriteAllBytes(savedDoc, bytes);
    // And then use the OpenXML SDK to read/edit the document:
    using (WordprocessingDocument openDoc = WordprocessingDocument.Open(savedDoc, true))
    {
        OpenXmlElement body = openDoc.MainDocumentPart.RootElement.ChildElements[0];
        foreach (OpenXmlElement ele in body.ChildElements)
        {
            if (ele is Paragraph)
            {
                Paragraph para = (Paragraph)ele;
                if (para.ParagraphProperties != null && para.ParagraphProperties.ParagraphStyleId != null)
                {
                    string styleName = para.ParagraphProperties.ParagraphStyleId.Val;
                    Run run = para.LastChild as Run; // I know I'm assuming things here but it's sufficient for a test case
                    run.RunProperties = new RunProperties();
                    run.RunProperties.AppendChild(new DocumentFormat.OpenXml.Wordprocessing.Text("test"));
                }
            }
            // etc.
        }
        openDoc.MainDocumentPart.Document.Save(); // I think this is redundant in later versions than what I'm using
    }
    // repackage the document
    bytes = File.ReadAllBytes(savedDoc);
    cf.RootStorage.Delete("Package");
    cfs = cf.RootStorage.AddStream("Package");
    cfs.Append(bytes);
    MemoryStream ms = new MemoryStream();
    cf.Save(ms);
    ms.Position = 0;
    dataObj.SetData("Embed Source", ms);
    // or,
    // Clipboard.SetData("Embed Source", ms);
}
Question
What am I doing wrong? Is this just a bad/unworkable approach?

Breaking Matlab code long lines in Latex listings environment

I need to include Matlab code in my LaTeX file. I'm using the listings package with matlab-prettifier, and the settings are:
\usepackage{listings}
\usepackage[numbered, framed]{matlab-prettifier}
...
\lstset{
  style = Matlab-editor,
  basicstyle = \mlttfamily\scriptsize,
  escapechar = ",
  mlshowsectionrules = true,
}
\lstinputlisting[breaklines = true, breakatwhitespace=false]{test.m}
The long lines do not break, however. Any idea why this happens and how to solve it?
Edit: I discovered that if I change the document class from mwbk (the current one) to article, the lines break as desired. There must be a conflict with the mwbk class, which is defined in
http://web.mit.edu/ghudson/dev/nokrb/third/tetex/texmf/tex/latex/mwcls/mwbk.cls, but I cannot figure out what is the problem.