RMarkdown syntax within apa_table() - papaja

I am not sure if I am overlooking something; maybe there is already an easy solution for this (sorry if that is the case), but so far I haven't found one:
When I pass a manually created data.frame to apa_table() with row names, column names, or values containing RMarkdown syntax, for example "$p$" or "$p > .001$", and knit it to a docx file, the syntax is not rendered and is printed as is. If I use label_variable(df, p = "$p$"), it works as expected, but that only covers column names, not the other locations within a table. The same applies to note = "$p$" beneath an apa_table().
I am curious whether this is possible or whether a solution already exists. I'd be thankful for some help on this one!
Best regards and thank you in advance
Mischa

By default, apa_table() escapes characters that are special in LaTeX (e.g., $). You can turn this feature off by specifying escape = FALSE. Moreover, if you want to enable full markdown support for your table body, I recommend specifying format = "pipe", which tells apa_table() to return the table in pandoc's pipe format, which in turn supports markdown.
Consider this table with some markdown commands:
table_content <- data.frame(
"$\\mathit{df}$" = "$\\mathit{df} = 1$"
, b = c("**a**", "*b*")
, check.names = FALSE
)
A full call to apa_table() might then look like the following:
apa_table(
table_content
, escape = FALSE
, format = "pipe"
)
A current limitation to this approach seems to be table notes: pandoc's pipe tables do not seem to support table notes, so using markdown syntax for the table's body while also adding a table note does not seem to work at the same time.

How can I use tsvector on a string with numbers?

I would like to use a postgres tsquery on a column that has strings that all contain numbers, like this:
FRUIT-239476234
If I try to make a tsquery out of this:
select to_tsquery('FRUIT-239476234');
What I get is:
'fruit' & '-239476234'
I want to be able to search by just the numeric portion of this value like so:
239476234
It seems that it is unable to match this because it is interpreting my hyphen as a "negative sign" and doesn't think 239476234 matches -239476234. How can I tell postgres to treat all of my characters as text and not try to be smart about numbers and hyphens?
An answer from the future: once version 13 of PostgreSQL is released, you will be able to use the dict_int module to do this.
create extension dict_int ;
ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 100, ABSVAL=true);
ALTER TEXT SEARCH CONFIGURATION english ALTER MAPPING FOR int WITH intdict;
select to_tsquery('FRUIT-239476234');
to_tsquery
-----------------------
'fruit' & '239476234'
But you would probably be better off creating your own TEXT SEARCH DICTIONARY, copying the 'english' CONFIGURATION, and modifying the copy rather than modifying the default ones in place. Otherwise you run the risk that an upgrade will silently lose your changes.
If you don't want to wait for v13, you could back-patch this change and compile into your own version of the extension for a prior server.
This is done by the text search parser, which is not configurable (short of writing your own parser in C, which is supported).
The simplest solution is to pre-process all search strings by replacing - with a space.
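The pre-processing step could be sketched in application code as follows. This is a minimal illustration in Python, not part of the original answer: the helper name normalize_search_term is hypothetical, and it assumes the cleaned string is then passed to plainto_tsquery (which accepts plain text without operators) rather than to_tsquery, and that the indexed text gets the same treatment before it reaches to_tsvector.

```python
def normalize_search_term(term: str) -> str:
    """Replace hyphens with spaces so the parser no longer sees '-' as a sign.

    Hypothetical helper: also collapses repeated whitespace left behind by
    the replacement.
    """
    return " ".join(term.replace("-", " ").split())


# The cleaned value would be bound as a query parameter, e.g. for
#   SELECT ... WHERE doc_tsv @@ plainto_tsquery(%s)
# (table and column names here are assumptions).
print(normalize_search_term("FRUIT-239476234"))  # FRUIT 239476234
```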

PostgreSQL Trimming Leading and Trailing Characters: = and "

I'm working to build an import tool that utilizes a quoted CSV file. However, several of the fields in the CSV file are reported as such:
"=""38000"""
Where 38000 is the data I need. The data integration software I use (Talend 6.11) already strips the leading and trailing double quotes for me (so, "38000" becomes 38000), but I can't find a way to get rid of those others.
So, essentially, I need "=""38000""" to become "38000" where the leading "=" is removed and the trailing "" is removed.
Is there a TRIM function that can accomplish this for me? Perhaps there is a method in Talend that can do this?
As the other answer stated, you could do that operation in SQL. Or you could do it in Java, Groovy, etc., within Talend. However, if there is an existing Talend component that does the job, my preference is to use it. That leads to faster development, potentially less testing, and easier maintenance. Having said that, it is important to review all the available components so you know what your options are.
You can use the Talend component tReplace to inspect each of the input columns you want to strip of quotes and equal signs. A single tReplace component can do search-and-replace operations on multiple input columns. If all of the replacements are related to each other, I would keep them within a single tReplace. Once unrelated replacements come into play, I might place those in a new tReplace so that logical operations are organized and grouped together.
tReplace
For a given Input Column
search for "=", replace with ""
search for "\"", replace with ""
Something like that:
SELECT format( '"%s"', trim( both '"=' from '"=""38000"""' ) );
-[ RECORD 1 ]---
format | "38000"
1st: the trim() function removes all leading and trailing " and = characters. The result is simply 38000.
2nd: format() adds the double quotes back to get the desired end result.
Alternatively, you can use regular expressions and other Postgres string functions.
See more:
https://www.postgresql.org/docs/current/static/functions-string.html
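If the cleanup ends up in application code rather than in SQL or Talend, the same two steps can be sketched in Python. This is only an illustration: clean_field is a hypothetical helper name, and str.strip with a set of characters is used here as the counterpart of trim(both ... from ...).

```python
def clean_field(value: str) -> str:
    """Strip leading/trailing '"' and '=' characters, then re-add one pair of quotes."""
    inner = value.strip('"=')      # step 1: same idea as trim(both '"=' from ...)
    return '"{}"'.format(inner)    # step 2: same idea as format('"%s"', ...)


print(clean_field('"=""38000"""'))  # "38000"
```

Like trim(), this only touches the ends of the string, so quotes or equal signs inside the value are left alone.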

Way to preserve formatting for lists when copy / pasting from table cell?

My Word interop application needs to get content out of a table cell in a Word document. The problem is that the formatting for some items seems broken. For example, the last item of a list does not have the list style applied, headings come out as normal text, etc.
The same happens if you create a table, create a list in the table and try to copy / paste the list to somewhere else.
Has anyone else had this problem and maybe found a solution? Is there any way to trick word into giving the correct formatting?
Thanks in advance
Example code
Range range = cell.Range;
range.MoveEnd(WdUnits.wdCharacter, -1);
...
range.FormattedText.Copy();
The range includes the end-of-cell marker, which should not be exported. I just noticed that when I don't alter the range, lists are formatted correctly, but the whole cell is exported as a table. That is bad because I want to import the content into another document (where this would nest tables infinitely).
Word2010 v14.06.6112.5000

Apostrophe issue in RTF

I have a function within a custom CRM web application (old VB.Net circa 2003) that takes a set of fields from a database and merges them with placeholders in a set of RTF-based template documents. These generate merged letters and documentation. The code essentially loops through each line of the RTF template file and replaces any instances of the placeholder values with text from a database record. The issue I'm having is that users have pasted a certain type of apostrophe into the web app (and therefore into the database) that is not rendering correctly in the resulting RTF file. It is rendering like this - ’.
I need a way to spot this invalid apostrophe in the code and replace it with a valid one. Unfortunately when I paste the invalid apostrophe into the Visual Studio editor it gets converted into the correct one. So I need another way to express this invalid apostrophe's value. Unfortunately I do not know a great deal about unicode and other encodings so I'm calling out for help with this.
Any ideas?
If you really just want to figure out what the character is, you might want to try pasting it into a text editor like UltraEdit. It has a hex mode that you can switch to in order to see the actual underlying bytes.
To do the replacement once you've figured out the character, you'd do something like this in VB (2001 is only a placeholder; for a right single quotation mark, U+2019, the call would be ChrW(&H2019)):
text.Replace(ChrW(2001), "'")
Note that you might not be able to figure it out easily using the text editor, because the character might also get mangled when pasted from the clipboard. Instead, you might want to print the character codes as debug output from your code. You can use the AscW function to do that.
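The idea of dumping character codes can be sketched outside VB as well. The Python snippet below is only an illustration (non_ascii_code_points is a hypothetical helper): it lists every non-ASCII character in a string with its position and code point. The last lines show that the garbled ’ sequence from the question is what the UTF-8 bytes of U+2019 (right single quotation mark) look like when read as Windows-1252, which suggests that U+2019 is the character involved.

```python
def non_ascii_code_points(text: str) -> list:
    """Return (index, 'U+XXXX') for each character outside the ASCII range."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ord(ch) > 127]


sample = "it\u2019s"  # contains a right single quotation mark
print(non_ascii_code_points(sample))  # [(2, 'U+2019')]

# Reading the UTF-8 bytes of U+2019 as Windows-1252 reproduces the garbling.
garbled = "\u2019".encode("utf-8").decode("cp1252")
print(garbled == "\u00e2\u20ac\u2122")  # True
```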
I can't help but think that it may actually simply be a case of specifying the correct encoding to use when you write out the stream though. Assuming you're using a StreamWriter you can specify it on the constructor. I'm guessing you actually want ASCII given your requirement.
oWriter = New System.IO.StreamWriter(path, False, System.Text.Encoding.ASCII)
It looks like you probably need to escape characters outside the 8-bit range (> 255).
According to the Wikipedia article on RTF, you can do that using \uNNNN.
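A minimal sketch of that escaping, written in Python for illustration rather than in the asker's VB.Net: rtf_escape is a hypothetical helper, and it assumes the document uses the default \uc1 setting, so each \uN escape is followed by exactly one fallback character ("?" here). RTF expects N as a signed 16-bit decimal number, so characters outside the Basic Multilingual Plane are written as two UTF-16 code units.

```python
def rtf_escape(text: str) -> str:
    """Escape text for insertion into an RTF body (sketch, assumes \\uc1)."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF control characters
        elif ord(ch) < 128:
            out.append(ch)                 # plain ASCII passes through
        else:
            data = ch.encode("utf-16-le")  # one or two 16-bit code units
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536          # RTF wants a signed 16-bit value
                out.append(f"\\u{unit}?")
    return "".join(out)


print(rtf_escape("it\u2019s"))  # it\u8217?s
```

Applying something like this to each database value before it is substituted into the template would keep the RTF file itself pure ASCII, which fits the ASCII StreamWriter suggested above.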

Primitive MailMerge using just delimited field names

Obviously the correct way for our app to generate a Word document based on a template is to encourage the admin users (who will look after the templates) to embed the correct merge fields or bookmarks in the template document.
However in the past we have found our typical admin user who ordinarily doesn't use MailMerge or any of the other "advanced" features in Word is significantly put off by having to use merge fields. We have tried doing it for them, producing documentation, lots of screenshots etc. But it's all "fiddly" for them.
They have dozens of templates (all just different kinds of really simple letters), and will want to modify them reasonably frequently.
What they would really like is to be able to just mark fields with a simple delimiter like a curly brace, which effectively marks a homemade merge field to our app (though Word is oblivious to its significance) as in:
Dear {CustomerSurname}
Then we can just pick up the field(s) with several lines of code as in:
w = New Word.Application
d = w.Documents.Open(...)
Dim mergedContent As String = d.Content.Text
mergedContent = mergedContent.Replace("{CustomerSurname}", Customer.Surname)
mergedContent = mergedContent.Replace("{CustomerPostcode}", Customer.Postcode)
d.Content.Text = mergedContent
This feels crude, but beautifully simple (for the end user).
Has anyone else gone down this route? Anything wrong with it? We would advise them not to use the "{" and "}" character elsewhere in the normal text of the document, but that's not really a significant limitation.
Speed? Wrapping the merge field across two lines? Other problems?
This is a part of my code that I used to find and replace. I tried your code first but that didn't work. This is based upon the VBA code that Word generates when you record a macro.
range = document.Range()
With range.Find
.Text = "{" & name & "}"
.Replacement.Text = NewValueHere
.Forward = True
.Wrap = Word.WdFindWrap.wdFindContinue
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
range.Find.Execute(Replace:=Word.WdReplace.wdReplaceAll)
I used this code because the documents needed to be compatible with the Office binary file formats. If you can use the Office 2007 format, you don't need to have Word installed. Just unzip the document, find the word\document.xml file, do a String.Replace("[OldValue]", "New value"), save the file, and zip everything back into one package (docx file).
The code I displayed here is pretty slow because I'm automating Word. Opening Word, editing 2 documents with 6 fields, and closing the app takes 4 seconds on my PC.
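The unzip, replace, and re-zip approach could look roughly like the following Python sketch (the answer itself is about .NET; replace_in_docx and its parameters are hypothetical names). It copies every part of the package unchanged except word/document.xml, where it does a plain string replacement. One caveat to be aware of: Word can split a placeholder such as {CustomerSurname} across several XML runs, in which case a plain string search will not find it.

```python
import zipfile
from xml.sax.saxutils import escape


def replace_in_docx(src_path: str, dst_path: str, replacements: dict) -> None:
    """Copy a .docx package, replacing placeholder text in word/document.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                xml = data.decode("utf-8")
                for old, new in replacements.items():
                    # Escape &, < and > so the result stays well-formed XML.
                    xml = xml.replace(old, escape(new))
                data = xml.encode("utf-8")
            dst.writestr(item, data)
```

A call such as replace_in_docx("template.docx", "letter.docx", {"{CustomerSurname}": "Smith"}) would then write the merged copy without starting Word (the file names are placeholders).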
What if the user does want to use curly braces? I think you should provide a way to escape them, for example /{ and /}, or {{ and }}.
You need to make sure that your replace logic is case-insensitive; for example, both {CustomerSurname} and {Customersurname} should be allowed to represent the same field. Maybe even optionally allow spaces between words, like {Customer surname}.
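These suggestions could be combined in one matching routine. The following Python sketch is only an illustration under stated assumptions: merge_fields is a hypothetical helper, doubled braces ({{ and }}) are chosen as the escape syntax, field names are compared case-insensitively with whitespace ignored, and unknown fields are left in the text untouched.

```python
import re

# Matches an escaped brace pair, or a {field} with no nested braces.
_TOKEN = re.compile(r"\{\{|\}\}|\{([^{}]+)\}")


def _normalize(name: str) -> str:
    """Lower-case a field name and drop whitespace so spellings compare equal."""
    return re.sub(r"\s+", "", name).lower()


def merge_fields(template: str, values: dict) -> str:
    """Replace {Field} markers using case- and space-insensitive lookup."""
    lookup = {_normalize(key): value for key, value in values.items()}

    def substitute(match):
        token = match.group(0)
        if token == "{{":
            return "{"
        if token == "}}":
            return "}"
        # Unknown fields are kept as written so they are easy to spot.
        return str(lookup.get(_normalize(match.group(1)), token))

    return _TOKEN.sub(substitute, template)


print(merge_fields("Dear {Customer surname}, use {{braces}}",
                   {"CustomerSurname": "Smith"}))
# Dear Smith, use {braces}
```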