VS Code Extension: Appending text to the end of a document - visual-studio-code

How do you append text past then last line of a document in the editor using an extension?
I have an extension that either creates a new, untitled document, or it appends text to the bottom/end only of an existing document. It is the latter case that I am having trouble with. The extension does not depend on caret/cursor/selection position. I've tried both edit.insert() and edit.replace(), with various position/range values of getting past the last character, but my text addition is always placed above the last line:
Before operation (line 20 is the last line of the document):
What I get. Note the existing blank line is below the inserted text:
What I want. Note the existing blank line is above the inserted text.:
The code:
var lastLine = editor.document.lineAt(editor.document.lineCount - 1);
const replaceContent = 'Inserted Text';
editor.edit((editBuilder) => {
editBuilder.replace(lastLine.range.end, replaceContent);
});
I've found lots of SO articles for inserting/replacing text, just nothing specific to adding to the very end of an editor buffer.

// Get the text editor object
const editor = vscode.window.activeTextEditor;
// Get the current document object
const document = editor.document;
// Get the last line of the document
const lastLine = document.lineAt(document.lineCount - 1);
// Get the last line text range
const range = new vscode.Range(lastLine.range.start, lastLine.range.end);
// Append the text to the document
editor.edit((editBuilder) => {
editBuilder.insert(range.end, "\nAppended text");
});

Related

Word cannot open DOCX file with a table

I am trying to run mail merge against a DOCX file using Open XML API - just replacing a <w:t> element with a table (see below). Even the simplest table created using the following code results in Word erroring out when opening the file.
If I get rid of the row (so that I only have <w:tbl> / <w:tblGrid> / <w:GridCol>). there is no error, but then I cannot have any data of course.
Can anybody see what I am doing wrong?
Table table = new Table(new TableGrid(new GridColumn() { Width = "2000"}),
new TableRow(new TableCell(new Paragraph(new Run(new Text("test")))))
);
TextNode.Parent.ReplaceChild<Text>(table, TextNode);
You cannot replace <w:t> with <w:tbl>. The table is a block-level element so you can place it in the same places where you have the paragraph (<w:p>).
In other words, you can place it as a child element of one of the following: body, comment, customXml, docPartBody, endnote, footnote, ftr, hdr, sdtContent, tc, and txbxContent.
So, try something like this:
// TextNode (Text) -> Parent (Run) -> Parent (Paragraph)
var paragraph = TextNode.Parent.Parent as Paragraph;
paragraph.Parent.ReplaceChild(table, paragraph);
EDIT:
If the parent element is <w:tc>, you should add an empty paragraph to its end:
// TextNode (Text) -> Parent (Run) -> Parent (Paragraph)
var paragraph = TextNode.Parent.Parent as Paragraph;
var parent = paragraph.Parent;
parent.ReplaceChild(table, paragraph);
if (parent is TableCell)
parent.InsertAfter(new Paragraph(), table);

vscode API: get Position of last character of line

Following up on this still unanswered question regarding VS Code Extensions with the VS Code API. I didn't answer it because it specifically asked for a solution using the with method of the Position object. I couldn't make that work, nor was I able to loop through the object to get the last character. Trying to manipulate the selection with vscode.commands.executeCommand didn't work either, because vscode.window.activeTextEditor doesn't appear to reflect the actual selection in the window as soon as the Execution Development Host starts running. The only solution I could find was the hoop-jumping exercise below, which gets the first character of one line and the first character of the next line, sets a Range, gets the text of that Range, then reduces the length of that text string by 1 to get the last character of the previous line.
function getCursorPosition() {
const position = editor.selection.active;
curPos = selection.start;
return curPos;
}
curPos = getCursorPosition();
var curLineStart = new vscode.Position(curPos.line, 0);
var nextLineStart = new vscode.Position(curPos.line + 1, 0);
var rangeWithFirstCharOfNextLine = new vscode.Range( curLineStart, nextLineStart);
var contentWithFirstCharOfNextLine = editor.document.getText(rangeWithFirstCharOfNextLine);
var firstLineLength = contentWithFirstCharOfNextLine.length - 1;
var curLinePos = new vscode.Position(curPos.line, firstLineLength);
var curLineEndPos = curLinePos.character;
console.log('curLineEndPos :>> ' + curLineEndPos);
I'm obviously missing something - it can't be impossible to get the last character of a line using the VSCode API without mickey-mousing the thing like this. So the question is simply, what is the right way to do this?
Once you have the cursor Position the TextDocument.lineAt() function returns a TextLine. From which you can get its range and that range.end.character will give you the character number of the last character. - not including the linebreak which if you want to include that see TextLine.rangeIncludingLineBreak().
const editor = vscode.window.activeTextEditor;
const document = editor.document;
const cursorPos = editor.selection.active;
document.lineAt(cursorPos).range.end.character;
Note (from [TextDocument.lineAt documentation][1] ):
Returns a text line denoted by the position. Note that the returned
object is not live and changes to the document are not reflected.

I need to extract text from a pdf file using itext7 or itextsharp and put html tag for bold around all the words using bold font

I am using iText7 and I want to extract all the texts from a pdf and put html tag for bold ( <b>...</b> ) around all the words that uses bold fonts and save it in text file. Any pointers? I am able to independently extract text and also extract all the bold words but not able to co-relate the two.
Here is the code snippet I am using for extracting the text:
PdfDocument MyDocument = new PdfDocument(new PdfReader("C:\\MyTest.pdf"));
string MyText = PdfTextExtractor.GetTextFromPage(MyDocument.GetPage(1), new
SimpleTextExtractionStrategy());
Here is the code I am using for extracting all the words using the bold font:
MyRectangle = new Rectangle(0, 0, 50, 100);
CustomFontFilter fontFilter = new CustomFontFilter(MyRectangle);
FilteredEventListener listener = new FilteredEventListener();
LocationTextExtractionStrategy extractionStrategy =
listener.AttachEventListener(new LocationTextExtractionStrategy(), fontFilter);
PdfCanvasProcessor parser = new PdfCanvasProcessor(listener);
parser.ProcessPageContent(MyDocument.GetPage(1));
String MyBoldTextList = extractionStrategy.GetResultantText();
//------
class CustomFontFilter : TextRegionEventFilter
{
public CustomFontFilter(iText.Kernel.Geom.Rectangle filterRect) : base(filterRect){ }
override public bool Accept(IEventData data, EventType type)
{
if (type == EventType.RENDER_TEXT){
TextRenderInfo renderInfo = (TextRenderInfo)data;
PdfFont font = renderInfo.GetFont();
if (font!=null)
return font.GetFontProgram().GetFontNames().GetFontName().Contains("Bold");
}
return false;
}
}
The problem is that the pdf in question here is a multi-column document. SimpleTextExtractionStrategy brings the text in perfect order but if I use the LocationStrategy, it messes up texts by jumping from one column to next column in each line. I am not able to find any way to get the list of bold words using SimpleTextExtractionStrategy. In LocationStrategy, the list that I get is not in the right order so I am unable to co-relate it.
So to summarize:
You want to extract all the text from a pdf and put the html tag for bold (<b>...</b>) around all the text that uses bold fonts.
Your PDFs allow normal text extraction (without those <b> tags) using the SimpleTextExtractionStrategy. The LocationTextExtractionStrategy on the other hand cannot be used as it messes up the order of the multi-column text.
Bold text in your PDFs can properly be recognized by your CustomFontFilter, i.e. by the
font.GetFontProgram().GetFontNames().GetFontName().Contains("Bold")
condition.
Thus, one way to implement your task would be to extend the SimpleTextExtractionStrategy to check every chunk received using the CustomFontFilter condition and insert <b> tags where required.
For example like this:
public class BoldTaggingSimpleTextExtractionStrategy : SimpleTextExtractionStrategy
{
FieldInfo textField = typeof(TextRenderInfo).GetField("text", BindingFlags.NonPublic | BindingFlags.Instance);
bool currentlyBold = false;
public override void EventOccurred(IEventData data, EventType type)
{
if (type.Equals(EventType.RENDER_TEXT))
{
TextRenderInfo renderInfo = (TextRenderInfo)data;
string fontName = renderInfo.GetFont()?.GetFontProgram()?.GetFontNames()?.GetFontName();
if (fontName != null && fontName.Contains("Bold"))
{
if (!currentlyBold)
{
textField.SetValue(renderInfo, "<b>" + renderInfo.GetText());
currentlyBold = true;
}
}
else if (currentlyBold)
{
AppendTextChunk("</b>");
currentlyBold = false;
}
}
base.EventOccurred(data, type);
}
}
As you see I used reflection here. I did so because (A) TextRenderInfo does not allow public setting of the text and (B) AppendTextChunk must not be used before the first chunk is processed by base.EventOccurred - there the size of a StringBuilder containing the collected text chunks is used to check whether the chunk currently processed is the first one or not; if something is in that builder before at least one chunk has been processed, one gets a NullReferenceException. There are other work-arounds for that but reflection here means but one more line of code.

Inserting a cross-reference with Aspose Words

Does anybody know if it is possible to insert a cross reference (I want to reference a bookmark, but I could make anything else work as well), using Aspose Words, in C#?
If you want to insert a footnote or endnote, you can use the following code.
DocumentBuilder builder = new DocumentBuilder(doc);
builder.Write("Some text is added.");
Footnote endNote = new Footnote(doc, FootnoteType.Endnote);
builder.CurrentParagraph.AppendChild(endNote);
endNote.Paragraphs.Add(new Paragraph(doc));
endNote.FirstParagraph.Runs.Add(new Run(doc, "Endnote text."));
doc.Save(MyDir + #"FootNote.docx");
If you want to insert a bookmark, you can use the following code.
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.StartBookmark("MyBookmark");
builder.Writeln("Text inside a bookmark.");
builder.EndBookmark("MyBookmark")
If you want to update a bookmark, you can use the following code.
Document doc = new Document(MyDir + "Bookmark.doc");
// Use the indexer of the Bookmarks collection to obtain the desired bookmark.
Bookmark bookmark = doc.Range.Bookmarks["MyBookmark"];
// Get the name and text of the bookmark.
string name = bookmark.Name;
string text = bookmark.Text;
// Set the name and text of the bookmark.
bookmark.Name = "RenamedBookmark";
bookmark.Text = "This is a new bookmarked text.";
I work as developer evangelist at Aspose.
You can use the following code to get the page number of a bookmark.
Document doc = new Document("Bookmark.docx");
Aspose.Words.Layout.LayoutCollector layoutCollector = new Aspose.Words.Layout.LayoutCollector(doc);
// Use the indexer of the Bookmarks collection to obtain the desired bookmark.
Bookmark bookmark = doc.Range.Bookmarks["MyBookmark"];
// Get the name and text of the bookmark.
string name = bookmark.Name;
string text = bookmark.Text;
int pageNumber = layoutCollector.GetStartPageIndex(bookmark.BookmarkStart);
Console.Write("Bookmark name is {0}, it is placed on page number {1} and following is the text inside it: {2}", name, pageNumber, text);

OpenXML editing docx

I have a dynamically generated docx file.
Need write the text strictly to end of page.
With Microsoft.Interop i insert Paragraphs before text:
int kk = objDoc.ComputeStatistics(WdStatistic.wdStatisticPages, ref wMissing);
while (objDoc.ComputeStatistics(WdStatistic.wdStatisticPages, ref wMissing) != kk + 1)
{
objWord.Selection.TypeParagraph();
}
objWord.Selection.TypeBackspace();
But i can't use same code with Open XML, because pages.count calculated only by word.
Using interop impossible, because it so slowwwww.
There are 2 options of doing this in Open XML.
create Content Place holder from Microsoft Office Developer Tab at the end of your document and now you can access this Content Place Holder programatically and can place any text in it.
you can append text driectly to your word document where it will be inserted at the end of your text. In this approach you got to write all the stuff to your document first and once you are done than you can append your document the following way
//
public void WriteTextToWordDocument()
{
using(WordprocessingDocument doc = WordprocessingDocument.Open(documentPath, true))
{
MainDocumentPart mainPart = doc.MainDocumentPart;
Body body = mainPart.Document.Body;
Paragraph paragraph = new Paragraph();
Run run = new Run();
Text myText = new Text("Append this text at the end of the word document");
run.Append(myText);
paragraph.Append(run);
body.Append(paragraph);
// dont forget to save and close your document as in the following two lines
mainPart.Document.Save();
doc.Close();
}
}
I haven't tested the above code but hope it will give you an idea of dealing with word document in OpenXML.
Regards,