Primitive MailMerge using just delimited field names - ms-word

Obviously the correct way for our app to generate a Word document based on a template is to encourage the admin users (who will look after the templates) to embed the correct merge fields or bookmarks in the template document.
However in the past we have found our typical admin user who ordinarily doesn't use MailMerge or any of the other "advanced" features in Word is significantly put off by having to use merge fields. We have tried doing it for them, producing documentation, lots of screenshots etc. But it's all "fiddly" for them.
They have dozens of templates (all just different kinds of really simple letters), and will want to modify them reasonably frequently.
What they would really like is to be able to just mark fields with a simple delimiter like a curly brace, which effectively marks a homemade merge field to our app (though Word is oblivious to its significance) as in:
Dear {CustomerSurname}
Then we can just pick up the field(s) with several lines of code as in:
w = New Word.Application
d = w.Documents.Open(...)
Dim mergedContent As String = d.Content.Text
mergedContent = mergedContent.Replace("{CustomerSurname}", Customer.Surname)
mergedContent = mergedContent.Replace("{CustomerPostcode}", Customer.Postcode)
d.Content.Text = mergedContent
This feels crude, but beautifully simple (for the end user).
Has anyone else gone down this route? Anything wrong with it? We would advise them not to use the "{" and "}" character elsewhere in the normal text of the document, but that's not really a significant limitation.
Speed? Wrapping the merge field across two lines? Other problems?

This is a part of my code that I used to find and replace. I tried your code first but that didn't work. This is based upon the VBA code that Word generates when you record a macro.
range = document.Range()
With range.Find
.Text = "{" & name & "}"
.Replacement.Text = NewValueHere
.Forward = True
.Wrap = Word.WdFindWrap.wdFindContinue
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
range.Find.Execute(Replace:=Word.WdReplace.wdReplaceAll)
I used this code because the documents needed to be compatible with the Office binary file formats. If you can use the Office 2007 format you don't need to have Word installed. Just unzip the document, find the word\document.xml file, do a String.Replace("[OldValue]", "New value"), save the file and zip it back to one package (docx file).
The code I displayed here is pretty slow because I'm automating Word. Opening Word, editing 2 documents with 6 fields and closing the app takes 4 seconds on my pc.
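The unzip, replace and re-zip route described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: `fill_docx` and its parameters are hypothetical names, and it assumes each placeholder sits in a single run of text in word/document.xml. Word sometimes splits text across several runs, in which case a plain string replace will not find the placeholder.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_docx(template_bytes: bytes, values: dict) -> bytes:
    """Return a copy of a .docx package with {Name} placeholders replaced.

    Hypothetical helper: only word/document.xml is touched, every other
    part of the package is copied through unchanged.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                xml = data.decode("utf-8")
                for name, value in values.items():
                    # Escape &, < and > so the value cannot break the XML.
                    xml = xml.replace("{" + name + "}", escape(str(value)))
                data = xml.encode("utf-8")
            dst.writestr(item, data)
    return out.getvalue()
```

Reading the template with `open(path, "rb").read()` and writing the returned bytes to a new .docx file would complete the round trip without starting Word.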

What if the user does want to use curly braces? I think you should provide a way to escape them, for example /{/} or {{}} etc.
You need to make sure that your replace logic is case insensitive; for example, both {CustomerSurname} and {Customersurname} should be allowed to represent the same field. Maybe even optionally allow spaces between words, like {Customer surname}.
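Both suggestions can be combined in one pass over the text. The following Python sketch is illustrative only: it picks the {{ and }} escape convention from the options above, and the names `TOKEN`, `normalise` and `merge` are hypothetical.

```python
import re

# {{ and }} stand for literal braces; {Field Name} is a placeholder.
TOKEN = re.compile(r"\{\{|\}\}|\{([^{}]+)\}")


def normalise(name: str) -> str:
    """Fold case and drop whitespace so 'Customer surname' matches 'CustomerSurname'."""
    return "".join(name.split()).lower()


def merge(text: str, values: dict) -> str:
    lookup = {normalise(k): str(v) for k, v in values.items()}

    def substitute(match):
        token = match.group(0)
        if token == "{{":
            return "{"
        if token == "}}":
            return "}"
        # Unknown fields are left untouched so typos stay visible.
        return lookup.get(normalise(match.group(1)), token)

    return TOKEN.sub(substitute, text)
```

Leaving unknown fields in place is a design choice: a misspelt field name then shows up in the output instead of silently disappearing.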

Related

Is it possible to ignore paragraph marks when using getTextRanges() in word add in?

I am currently developing a Word add-in using the Office JS library. I need to get all sentences in the Word document as individual ranges. For this I used getTextRanges() on the body of the document with "." as the delimiter. However, it also splits on paragraph marks, which is not ideal for my use case. All I want is for the document to be divided into ranges where the only delimiter is ".", regardless of whether the ranges then extend across paragraphs.
Is there a way to ignore paragraph marks with getTextRanges(), or is there another method entirely that I seem to have overlooked?
Thanks.
I have been unable to resolve it.

Mailmerge single image into a Word Document based on a cell value

I'd like to include an image into a mail merged word document based on the presence of a single value in a column which contains several values.
e.g. if the cell contains the value BOB insert image, if it contains any other value then do nothing.
Most of the {INCLUDEPICTURE} functionality seems built around including a different image based on a filename matching a cell value.
{ INCLUDEPICTURE "{ MERGEFIELD Selection_identifier }.png" \* MERGEFORMAT \d }
This works provided I translate Selection_identifier in the spreadsheet itself, but there has to be a better way. There seems to be little information about this particular use case online.
If you are only using a single image and it does not vary between merges, you should probably just use
{ IF "{ MERGEFIELD Selection_identifier }" = "BOB" "<the_image>" }
where <the_image> is a copy of the actual image, sized how you want, pasted between those quotation marks. In that case, there would be no need for an INCLUDEPICTURE field or a reference to an external image file.
As usual, all the {} have to be the special field code brace pairs that you can insert in Windows desktop Word using Ctrl-F9 or similar.

RMarkdown syntax within apa_table()

I am not sure if I am overlooking something; maybe there is already an easy solution for this (sorry if that is the case), but so far I haven't found one:
When I pass a manually created data.frame to apa_table() with row names, column names, or values containing RMarkdown syntax, for example "$p$" or "$p > .001$", and try to knit it into a docx file, it does not work: the syntax is printed as is. If I use label_variable(df, p="$p$") it does work as expected, of course, but that only covers column names, not the other locations within a table. The same applies to note = "$p$" beneath an apa_table().
I am curious if it is possible or if a solution already exists, I'd be thankful for some help on this one!
Best regards and thank you in advance
Mischa
By default, apa_table() escapes characters that are special in LaTeX (e.g., $). You can turn this feature off by specifying escape = FALSE. Moreover, if you want to enable full markdown support for your table body, I recommend specifying format = "pipe", which tells apa_table() to return the table in pandoc's pipe format, which in turn supports markdown.
Consider this table with some markdown commands:
table_content <- data.frame(
"$\\mathit{df}$" = "$\\mathit{df} = 1$"
, b = c("**a**", "*b*")
, check.names = FALSE
)
A full call to apa_table() might then look like the following:
apa_table(
table_content
, escape = FALSE
, format = "pipe"
)
A current limitation to this approach seems to be table notes: pandoc's pipe tables do not seem to support table notes, so using markdown syntax for the table's body while also adding a table note does not seem to work at the same time.

Removing spaces from a string using Powershell

I have an issue where extracting data from a database sometimes (quite often) inserts spaces into strings of text where they should not be.
What I'm trying to do is create a small script that will look at these strings and remove the spaces.
The problem is that the spaces can be in any position in the string, and the string is a variable that changes.
Example:
"StaffID": "0000 25" <- The space in the number should not be there.
Is there a way to have the script look at this particular line, and if it finds spaces, to remove them.
Or: "DateOfBirth": "23-10-199 0" <- It would also need to look at these spaces and remove them.
The problem is that the same data also has lines such as:
"Address": " 91 Broad street" <- The spaces should be here obviously.
I've tried using TRIM, but that only removes spaces from start/end.
Worth mentioning that the data extracted is in json format and is then imported using API into the new system.
You should think about the logic of what you want to do, and whether or not it's programmatically possible to determine if you can teach your script where it is or is not appropriate to put spaces. As it is, this is one of the biggest problems facing AI research right now, so unfortunately you're probably going to have to do this by hand.
If it were me, I'd specify the kind of data format that I expect from each column, and try my best to attempt to parse those strings. For example, if you know that StaffID doesn't contain spaces, you can have a rule that just deletes them:
$staffid = $staffid -replace '\s+', ''
There are some more complicated things that you can do with forced formatting (.replace) that have already been covered in this answer, but again, that requires some expectation of exactly what data is going to come out of what column.
You might want to look more closely at where those spaces are coming from, rather than process the output like this. Is the retrieval script doing it? Maybe you can optimize the database that you're drawing from?
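The per-column rule idea above can be sketched for the JSON export as follows. This is a minimal Python illustration, not a finished tool: the rule table `NO_SPACE_FIELDS` and the helper `clean_record` are hypothetical, and the field names are taken from the examples in the question.

```python
import json
import re

# Hypothetical rule table: only fields listed here are known to be space-free.
NO_SPACE_FIELDS = {"StaffID", "DateOfBirth"}


def clean_record(record: dict) -> dict:
    """Strip all whitespace from the listed fields; leave every other field alone."""
    return {
        key: re.sub(r"\s+", "", value)
        if key in NO_SPACE_FIELDS and isinstance(value, str)
        else value
        for key, value in record.items()
    }


raw = '{"StaffID": "0000 25", "DateOfBirth": "23-10-199 0", "Address": " 91 Broad street"}'
cleaned = clean_record(json.loads(raw))
print(json.dumps(cleaned))
```

Fields such as Address are untouched because they are not in the rule table, which is the point of declaring the expected format per field rather than guessing.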

Removing some paragraph marks in a word document

I copy text from PDF files into Word 2010 documents using Abbyy conversion software. The result contains many incorrect line breaks. Is there any way to remove such marks if they are not preceded by ".", "?" or "!"?
I write macros in Excel but have no experience of Word coding.
You could do a search and replace, depending on whether you can find some sort of rules which you can apply. Maybe a little screenshot?
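The rule stated in the question (keep a break only after ".", "?" or "!") can be expressed as a single regular expression. The sketch below is in Python and works on plain text rather than inside Word; it assumes the converted text has been saved with "\n" line endings, and `join_broken_lines` is a hypothetical helper name.

```python
import re

# A line break is kept only when the character before it ends a sentence.
# Breaks that follow another break (blank lines) are also left alone.
STRAY_BREAK = re.compile(r"(?<![.?!\n])\n")


def join_broken_lines(text: str) -> str:
    """Replace every stray line break with a single space."""
    return STRAY_BREAK.sub(" ", text)
```

The same rule would wrongly keep a break after an abbreviation such as "e.g." at the end of a line, so the output still needs a quick read-through.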