Is it necessary to create <span> elements to register event listeners - event-handling

I have a working web app that reads local .txt files and displays the content in a div element. I create a span element out of each word because I need to be able to select any word in the document and create an EEI (Essential Elements of Information) from the text. I then register a click handler on the containing div and let the event bubble up. The three functions below show reading the file, and parsing it, and populating the text div with spans:
function readInputFile(evt) {
reset();
var theFile = evt.target.files[0];
if(theFile) {
$("#theDoc").empty(); //Clean up any old docs loaded
var myReader = new FileReader();
var ta = document.getElementById("theDoc");
myReader.onload = function(e) {
parseTheDoc(e.target.result);
initialMarkup();
};
myReader.readAsText(theFile);
} else {
alert("Can not read input file: readInputFile()");
}
}
function parseTheDoc(docContents) {
var lines = docContents.split("\n");
var sentWords =[];
for(var i = 0; i < lines.length; i++) {
sentWords = lines[i].split(" ");
words = words.concat(sentWords);
words.push("<br>");
}
//examineWords(words);
createSpans(words);
}
function createSpans() {
for (var i = 0; i < words.length; i++) {
var currentWord = words[i];
if(currentWord !== "<br>") {
var $mySpan = $("<span />");
$mySpan.text(currentWord + " ");
$mySpan.attr("id", "word_" + i);
$("#theDoc").append($mySpan);
buildDocVector(currentWord, i, $mySpan);
}
else {
var $myBreak = $("<br>");
$myBreak.attr("id", "word_" + i);
$("#theDoc").append($myBreak);
buildDocVector("br", i, $myBreak);
}
}
//console.log("CreateSpans: Debug");
}
So basically a simple fileReader, split on \n, then tokenize on white space. I then create a span for each word, and a br element for each \n. It's not beautiful, but it satisfies the requirement, and works. My question is, is there a more efficient way of doing this? It just seems expensive to create all these spans, but my requirement is to annotate the doc and map any selected word to a data model/ontology. I can't think of a way to allow the user to select any word, or combination of words (control click) and then perform operations on them. This works, but with large docs (100 pages) I start having performance/memory issues. I understand this is more a design question and may not be appropriate, but I'd really like to know if there are more performant solutions.

Related

How to loop a search over a long string?

In order to get around the 255 character search limitation in the desktop Word api I'm breaking long strings into searchable chunks of 254 characters and pushing them into an object "oSearchTerms". Then I'm attempting to iterate over oSearchTerms, search for the text, highlight it, then search for the next chunk and do the same until all items in oSearchTerms have been highlighted. The problem is it's not looping. It goes through the first iteration successfully but stops.
I've tried copious context.sync() calls, return true, return context.sync(), etc, which you'll see commented out below, to no avail.
I should also point out that it's not showing any errors. The loop just isn't looping.
Do I have to convert this over to an async function? I'd like to stick with ES5 and not use fat arrow functions.
What am I missing?
var fullSearchTerm = "As discussed earlier, one of the primary objectives of these DYH rules is to ensure that operators have at least one source of XYZ-approved data and documents that they can use to comply with operational requirements The objective would be defeated if the required data and documents were not, in fact, approved and Only by retaining authority to approve these materials can we ensure that they comply with applicable requirements and can be relied upon by operators to comply with operational rules which We believe there are differences between EXSS ICA and other ICA that necessitate approval of EVIS ICA."
function findTextMatch() {
Word.run(function(context) {
OfficeExtension.config.extendedErrorLogging = true;
var oSearchTerms = [];
var maxChars = 254;
var lenFullSearchTerm = fullSearchTerm.length;
var nSearchCycles = Math.ceil(Number((lenFullSearchTerm / maxChars)));
console.log("lenFullSearchTerm: " + lenFullSearchTerm + " nSearchCycles: " + nSearchCycles);
// create oSearchTerms object containing search terms
// leaves short strings alone but breaks long strings into
// searchable 254 character chunks
for (var i = 0; i < nSearchCycles; i++) {
var posStart = i * maxChars;
var mySrch = fullSearchTerm.substr(posStart, maxChars);
console.log( i +" mySrch: "+ mySrch);
var oSrch = {"searchterm":mySrch};
oSearchTerms.push(oSrch);
}
console.log("oSearchTerms.length: " + oSearchTerms.length +" oSearchTerm: "+ JSON.stringify(oSearchTerms));
// Begin search loop
// iterate over oSearchTerms, find and highlight each searchterm
for (var i = 0; i < oSearchTerms.length; i++) {
console.log("oSearchTerms["+i+"].searchterm: " + JSON.stringify(oSearchTerms[i].searchterm));
var searchResults = context.document.body.search(oSearchTerms[i].searchterm, { matchCase: true });
console.log("do context.sync() ");
context.load(searchResults);
return context.sync()
.then(function(){
console.log("done context.sync() ");
console.log("searchResults: "+ JSON.stringify(searchResults));
if(typeof searchResults.items !== undefined){
console.log("i: "+i+ " searchResults: "+searchResults.items.length);
// highlight each result
for (var j = 0; j < searchResults.items.length; j++) {
console.log("highlight searchResults.items["+j +"]");
searchResults.items[j].font.highlightColor = "red";
}
}
else{
console.log("typeof searchResults.items == undefined");
}
// return true;
// return context.sync();
});
//.then(context.sync);
//return true;
} // end search loop
})
.catch( function (error) {
console.log('findTextMatch Error: ' + JSON.stringify(error));
if (error instanceof OfficeExtension.Error) {
console.log('findTextMatch Debug info: ' + JSON.stringify(error));
}
});
}
I recommend that you not have a context.sync inside a loop. That can be a performance hit and it makes the code hard to reason about. Please see my answer to: Document not in sync after replace text and this sample: Word Add-in Stylechecker for a design pattern that avoids this. The pattern can be used with ES5 syntax if you want.
If you implement this pattern, you may find that the problem has gone away, or at least you will be able to see clearer where the cause might be.

Word web addin load whole document from server header/footer

We are trying to load a word document from server using JavaScript. We send the document using a base64 encoding. With our current approach, only the body is loading using the function:
context.document.body.insertFileFromBase64(fileContent, "replace");
Unfortunately, the header and the footer are not loading. Is there another approach to load the whole document including body and footer?
the insertFile operation does not overwrite existing header/footers in the document.
According to my research, I saw this article for using insertFileFromBase64.The article says," if you use insertFileFromBase64 to insert the file it does have this blank page with header and footer." Did you have the same issue for this?
However, another article says it's a design issue. Userform will encode data and will create an appointment on Microsoft Outlook Calendar
The article provides approach:
function getFile(){
Office.context.document.getFileAsync(Office.FileType.Compressed, { sliceSize: 4194304 /*64 KB*/ },
function (result) {
if (result.status == "succeeded") {
// If the getFileAsync call succeeded, then
// result.value will return a valid File Object.
var myFile = result.value;
var sliceCount = myFile.sliceCount;
var slicesReceived = 0, gotAllSlices = true, docdataSlices = [];
console.log("File size:" + myFile.size + " #Slices: " + sliceCount);
// Get the file slices.
getSliceAsync(myFile, 0, sliceCount, gotAllSlices, docdataSlices, slicesReceived);
}
else {
app.showNotification("Error:", result.error.message);
}
});
}
function getSliceAsync(file, nextSlice, sliceCount, gotAllSlices, docdataSlices, slicesReceived) {
file.getSliceAsync(nextSlice, function (sliceResult) {
if (sliceResult.status == "succeeded") {
if (!gotAllSlices) { // Failed to get all slices, no need to continue.
return;
}
// Got one slice, store it in a temporary array.
// (Or you can do something else, such as
// send it to a third-party server.)
docdataSlices[sliceResult.value.index] = sliceResult.value.data;
if (++slicesReceived == sliceCount) {
// All slices have been received.
file.closeAsync();
onGotAllSlices(docdataSlices);
}
else {
getSliceAsync(file, ++nextSlice, sliceCount, gotAllSlices, docdataSlices, slicesReceived);
}
}
else {
gotAllSlices = false;
file.closeAsync();
console.log("getSliceAsync Error:", sliceResult.error.message);
}
});
}
function onGotAllSlices(docdataSlices) {
var docdata = [];
for (var i = 0; i < docdataSlices.length; i++) {
docdata = docdata.concat(docdataSlices[i]);
}
var fileContent = new String();
for (var j = 0; j < docdata.length; j++) {
fileContent += String.fromCharCode(docdata[j]);
}
var mybase64 = window.btoa(fileContent);
console.log("here is the base 64", mybase64);
// Now all the file content is stored in 'fileContent' variable,
// you can do something with it, such as print, fax...
}

Using Word.SearchOptions in an Office Add-In

I am developing an Office Add-In that processes the text of each paragraph of a Word document against data in JSON format and writes the result to a within the index.html file that is rendered in the Task Pane. This works fine. I am now trying to format the strings within the Word document that correspond to the hits for the keys in the JSON data.
I have a JS block in the head of the index.html file in which I call "Office.initialize" and define a variable based on the JSON data, and have utility functions related to the above functionality. After that comes a function in which I get the Word context and process the Word file paragraphs against the JSON data, and then try to search the Word paragraph itself in order to format the hits. In this last task I am trying to reproduce a snippet from Michael Mainer 1. But no formatting happens when I activate this function. Unfortunately I don't have access to a console since I am on a Mac, which makes it harder to debug.
I would be very appreciative of someone showing me where I'm going wrong.
`function tester() {
Word.run(function (context) {
// Create a proxy object for the document's paragraphs
var paragraphs = context.document.body.paragraphs;
// Load the paragraphs' text, which I run regexes on
context.load(paragraphs, 'text');
return context.sync().then(function () {
for (var i = 0; i < paragraphs.items.length; i++) {
var text = paragraphs.items[i].text;
// jquery to iterate over the "notes" objects from the JSON
$.each(notes, function(key, value) {
var regex = new RegExp("\\b" + key + "\\b", "g");
var res = regex.test(text);
// if the regex hits...
if (res == true) {
// This part works fine, using the JSON data to append to the <DIV> with ID = "notes"
document.getElementById('notes').innerHTML += "<button onclick=hide('" + value.seqNo + "')><b>" + key + "</b></button><p class='" + value.seqNo + "' id='" + i + "'>" + value.notes[0].note + "</p>";
// I now go on to searching for these hits within the current paragraph in the Word file
var thisPara = paragraphs.items[i];
// Set up the search options.
var options = Word.SearchOptions.newObject(context);
options.matchCase = false
// Queue the commmand to search the current paragraph for occurrences of the string "key" (coming from the JSON data)
var searchResults = paragraphs.items[i].search(key, options);
// Load 'text' and 'font' for searchResults.
context.load(searchResults, 'text, font');
// Synchronize the document state by executing the queued-up commands, and return a promise to indicate task completion.
return context.sync().then(function () {
// Queue a command to change the font for each found item.
for (var j = 0; j < searchResults.items.length; j++) {
searchResults.items[j].font.color = '#FF0000'
searchResults.items[j].font.highlightColor = '#FFFF00';
searchResults.items[j].font.bold = true;
}
// Synchronize the document state by executing the queued-up commands,
// and return a promise to indicate task completion.
return context.sync();
});
}
});
}
});
})
.catch(function (error) {
console.log('Error: ' + JSON.stringify(error));
if (error instanceof OfficeExtension.Error) {
console.log('Debug info: ' + JSON.stringify(error.debugInfo));
}
});
}
`
It looks like you just need access to the font property, have you tried just:
context.load(searchResults, 'font');
That was working for me?

Remove Content controls after adding text using open xml

By the help of some very kind community members here I managed to programatically create a function to replace text inside content controls in a Word document using open xml. After the document is generated it removes the formatting of the text after I replace the text.
Any ideas on how I can still keep the formatting in word and remove the content control tags ?
This is my code:
using (var wordDoc = WordprocessingDocument.Open(mem, true))
{
var mainPart = wordDoc.MainDocumentPart;
ReplaceTags(mainPart, "FirstName", _firstName);
ReplaceTags(mainPart, "LastName", _lastName);
ReplaceTags(mainPart, "WorkPhoe", _workPhone);
ReplaceTags(mainPart, "JobTitle", _jobTitle);
mainPart.Document.Save();
SaveFile(mem);
}
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
//grab all the tag fields
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
//remove all paragraphs from the content block
field.SdtContentBlock.RemoveAllChildren<Paragraph>();
//create a new paragraph containing a run and a text element
Paragraph newParagraph = new Paragraph();
Run newRun = new Run();
Text newText = new Text(tagValue);
newRun.Append(newText);
newParagraph.Append(newRun);
//add the new paragraph to the content block
field.SdtContentBlock.Append(newParagraph);
}
}
Keeping the style is a tricky problem as there could be more than one style applied to the text you are trying to replace. What should you do in that scenario?
Assuming a simple case of one style (but potentially over many Paragraphs, Runs and Texts) you could keep the first Text element you come across per SdtBlock and place your required value in that element then delete any further Text elements from the SdtBlock. The formatting from the first Text element will then be maintained. Obviously you can apply this theory to any of the Text blocks; you don't have to necessarily use the first. The following code should show what I mean:
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
IEnumerable<Text> texts = field.SdtContentBlock.Descendants<Text>();
for (int i = 0; i < texts.Count(); i++)
{
Text text = texts.ElementAt(i);
if (i == 0)
{
text.Text = tagValue;
}
else
{
text.Remove();
}
}
}
}

Export MS Word Document pages to Images

I want to export MS word(docx/doc) document pages to Image(jpeg/png).
I am doing same for presentation(pptx/ppt) using office interop export api for each slide, but didn't found corresponding API for word.
Need suggestion for API/alternate approach for achieving this.
Based on this similar question: "Saving a word document as an image" you could do something like this:
const string basePath = #"C:\Users\SomeUser\SomePath\";
var docPath = Path.Combine(basePath, "documentA.docx");
var app = new Application()
{
Visible = true
};
var doc = app.Documents.Open(docPath);
foreach (Window window in doc.Windows)
{
foreach (Pane pane in window.Panes)
{
for (var i = 1; i <= pane.Pages.Count; i++)
{
var page = pane.Pages[i];
var bits = page.EnhMetaFileBits;
var target = Path.Combine(basePath, string.Format("page-no-{0}", i));
using (var ms = new MemoryStream(bits))
{
var image = Image.FromStream(ms);
var pngTarget = Path.ChangeExtension(target, "png");
image.Save(pngTarget, ImageFormat.Png);
}
}
}
}
app.Quit();
Basically, I'm using the Page.EhmMetaFileBits property which, according to the documentation:
Returns a Object that represents a picture representation of how a
page of text appears.
... and based on that, I create an image and save it to the disk.