Copy from MS Word and paste into RichTextBox problem

I have a problem pasting MS Word content into a RichTextBox.
When I copy the content of a Word document and paste it into a RichTextBox in a Windows application written in C#,
the links are shown like this:
This is test.. Go to Google. <http://www.google.com>
Mail : Project <mailto:cbn#test.com>
The issue can also be reproduced by loading a saved RTF document from Word.
How can I correct this?
Thanks in advance.

The issue here is that you're not actually copying RTF into the clipboard from Word. Well, kind of, but not the same RTF that would display just the formatted text and have a hyperlink behind it. You'd have to handle the paste event and do your own parsing and reformatting to achieve that.
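As a rough illustration of the "do your own parsing and reformatting" step, the sketch below separates the link text from the `<...>` target that the control shows after it. It is written in Python only to keep it short; the same regular expression should carry over to .NET's Regex class inside a paste handler. The function name `split_links` and the list of URL schemes are assumptions, not part of any RichTextBox API, and turning the cleaned text back into a clickable link is a separate step that is not shown.

```python
import re

# Matches a link target that follows the link text, e.g.
# " <http://www.google.com>" or " <mailto:someone@example.com>".
# The scheme list is an assumption; extend it as needed.
LINK_TARGET = re.compile(r"\s*<((?:https?|ftp|mailto):[^<>\s]+)>")


def split_links(text):
    """Return (clean_text, targets).

    clean_text is the input without the <...> targets,
    targets is the list of URLs that were removed, in order.
    """
    targets = LINK_TARGET.findall(text)
    clean_text = LINK_TARGET.sub("", text)
    return clean_text, targets


if __name__ == "__main__":
    sample = "This is test.. Go to Google. <http://www.google.com>"
    print(split_links(sample))
```

Keeping the removed targets alongside the cleaned text means the paste handler can still decide what to do with each URL instead of silently discarding it.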

Some richtextbox editors have a "Paste from Word" feature. You can paste it in a simple textbox and start re-formatting based on the rich text editor you're using.

This behavior has nothing to do with Word. In fact, even though the RTF that MS Word places on the clipboard is slightly different from that of OpenOffice, the results are identical. See screenshot below where the top two links are from MS Word and the rest from OOo 3.2.
It seems to be a peculiarity in how the WinForms RichTextBox draws hyperlinks.
I don't see a quick workaround to change this behavior though.

Related

HTML pages with MathML opened in MS Word

I would like to open an HTML page with MS Word (Microsoft Office Standard 2019). The page contains MathML codes, e.g. this page.
If I save it and open with MS Word, the formulas appear as normal text, mathematical formatting is completely lost.
In this post I see a trick to import a MathML formula into MS Word: copy the code into WordPad, then copy it from WordPad and paste it into MS Word. It is just an easter egg, not a real solution. Indeed, when I try it with a full page, it does not work.
This is very annoying, since MS Word is clearly able to correctly interpret MathML codes but it doesn't when needed. I guess that it is just a commercial decision, to promote the OOXML codes instead of MathML.
Is there a method to convince MS Word to work properly with MathML code? I would like to right-click on a .html file, select "open with Word", and see the correct MathML formulas.

Convert HTML to a Word-document which can be edited in Word Online

Our users write in a rich-text field, pretty much like this one, and we would like for them to be able to export this as a Word Document in their OneDrive, preserving the formatting, and being able to open the file in Word Online.
I have no trouble creating new files using the https://graph.microsoft.com/v1.0/ API. The problem is the conversion to docx. The Google API provides this conversion automatically, but I did not find an equivalent for Microsoft. I tried using html-docx-js and it almost works perfectly.
The file is created:
However when opening the file the following dialog pops up:
Opening in read-only mode works, the file shows with the correct formatting.
Downloading the file and opening in desktop word works perfectly (i.e. editing as well).
The HTML content I use is a simple div with a few p tags, so the "Objects that Word Online doesn't support" probably come from html-docx-js.
Here's an example Word file that is created. This file can be opened as normal in desktop Word but only in read-only mode in Word Online:
https://1drv.ms/w/s!AqpUGtnMiyurgwE543OscH7PdLnY
Any ideas?

How do I automate converting PDF to HTML?

I work for a publisher and am trying to extract content from our fully laid out PDFs. I've tried pdftohtml, pdftotext, pdfminer, and other Python-based approaches to getting the content, as well as saving to Word, HTML, XML, etc. from the original Acrobat files.
I don't need just the text, I also need the text formatting. That's because, for example, I need all the blue text in the document.
When I save to HTML, Word, etc. from Acrobat, the resulting files contain screenshots of the pages, not the laid out text. When I extract text using different Python modules I get the text but lose the text formatting.
The only solution I've found is to manually copy and paste from the PDF into a Word document, then save it as HTML. I'm hoping to automate this.
Why does copying from Acrobat into Word achieve what I can't do by other means? Has anybody come across this problem before?
Maybe you can consider another method. This software (https://pdfapi.codeplex.com/) can convert PDF files to HTML directly via MVS. If you are able to use MVS, I think it could be useful for converting the text in PDF files to HTML while keeping the formatting. Of course, it's just a suggestion; you can give it a try.
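Once an HTML file with real text exists (for example from the manual Word route described in the question), pulling out the blue text can be scripted. The following is a minimal sketch using only Python's standard library. It assumes the color is expressed either as an inline `color:` style or as a `<font color=...>` attribute, and that "blue" means one of the exact values listed in `BLUE_VALUES`; an actual export may use CSS classes or other shades, so treat the names and values here as placeholders.

```python
import re
from html.parser import HTMLParser

# Assumption: these are the color values that count as "blue".
BLUE_VALUES = {"blue", "#0000ff", "#00f"}
# Matches a "color:" declaration but not "background-color:".
COLOR_RE = re.compile(r"(?:^|;)\s*color\s*:\s*([^;]+)", re.IGNORECASE)
# Elements that never have a closing tag and so must not be stacked.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


def declared_color(attrs):
    """Return the lowercased color set directly on a tag, or None."""
    attrs = dict(attrs)
    match = COLOR_RE.search(attrs.get("style") or "")
    if match:
        return match.group(1).strip().lower()
    color = attrs.get("color")
    return color.strip().lower() if color else None


class BlueTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, effective color) for each open element
        self.chunks = []  # collected blue text fragments

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        inherited = self.stack[-1][1] if self.stack else None
        self.stack.append((tag, declared_color(attrs) or inherited))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate unclosed children.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.stack and self.stack[-1][1] in BLUE_VALUES and data.strip():
            self.chunks.append(data.strip())


def extract_blue_text(html):
    parser = BlueTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

The stack is there so that nested elements (bold or italic inside a blue span) inherit the color of their parent, which a flat regular expression over the markup would miss.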

How to enable copy/paste formatted text from Lotus Notes to TinyMCE?

This question was previously posted to the TinyMCE HowTo Forum with no responses. Here's hoping that someone out there has encountered (and solved) this issue.
The question: Is there some way to enable correct copy/paste of formatted text from a Lotus Notes email directly into TinyMCE?
The scenario: A rolling comments system on a web site, into which users occasionally need to paste rich text from an email viewed in Lotus Notes.
The details:
I have tried copying some formatted text from emails viewed in Lotus Notes (7.0.4, Windows XP) and pasting it into the "Full featured example" implementation of TinyMCE at http://tinymce.moxiecode.com/examples/full.php and found that it generally fails to maintain the formatting. In fact, of the browsers I tested, IE6 fared the best, and the more modern W3C standards compliant browsers were the worst.
Some text formatting I tested was:
larger text
underline
italics
bold
numbered list
bullet list
indented text
permanent pen
font family: arial
font family: times new roman
Results:
-Firefox (3.6.8), Vista or XP: all formatting lost
-Chrome (5.0.375.125), Vista or XP: all formatting lost, including line breaks
-IE6 (XP): some formatting is maintained (fails to copy numbers and bullets for lists, but indents lists properly)
-IETester (IE6) Vista: some formatting is maintained (fails to format lists at all, and the underline tag is not closed)
-IE7 (XP): some formatting is maintained (fails to format lists at all, and the underline tag is not closed)
-IE8 (Vista): some formatting is maintained (fails to format lists at all, and the underline tag is not closed)
If I first paste the clipboard from Lotus Notes into MS Word 2003 (11.5604.5606) it shows perfectly in Word, and if I then copy/paste it from there into TinyMCE it generally works better enough to be usable, although still loses some formatting, even when using the "Paste from Word" button in TinyMCE. Not surprisingly, if I open my Lotus Notes mail in a web mail client, the HTML mail copies and pastes perfectly into TinyMCE.
Since it shows perfectly in my Domino web client, and pastes perfectly into MS Word, it is obviously possible to copy/paste Lotus Notes formatting.
If anyone has had success with this please mention your Notes and browser versions, and any modifications you had to make to the TinyMCE config.
If you check what's pasted from Word, you'll find that it's pretty much what you'd get if you had done a File->Save As->Web Page in Word: a whole bunch of Word-specific HTML attributes and CSS. Essentially, it's Word's ability to be coerced into exporting HTML that does the trick; Word's rich text alone won't do the job. The Notes clipboard (which is different from the system clipboard) can export RTF to the system clipboard, which then pastes (with limitations) to Word (which can interpret RTF), but a JavaScript widget in the browser doesn't understand RTF.
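To make "Word-specific HTML attributes and CSS" concrete: the markup Word puts on the clipboard typically carries `mso-*` style declarations, `class="Mso..."` attributes, `<o:p>` tags and conditional comments. Below is a small Python sketch of the kind of cleanup a "Paste from Word" feature performs. It is regex-based and deliberately naive (it assumes double-quoted attributes and is not how TinyMCE itself does it); the function name `clean_word_html` is made up for this example.

```python
import re


def clean_word_html(html):
    """Strip common Word-only markup from an HTML fragment (naive sketch)."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Office namespace tags such as <o:p> and </o:p>; inner text is kept.
    html = re.sub(r"</?[ovw]:[^>]*>", "", html)
    # class="MsoNormal" and similar attributes.
    html = re.sub(r'\s+class="?Mso\w*"?', "", html)

    def strip_mso(match):
        # Keep only the declarations that do not start with "mso-".
        decls = [d.strip() for d in match.group(1).split(";")]
        kept = [d for d in decls if d and not d.lower().startswith("mso-")]
        return ' style="{}"'.format(";".join(kept)) if kept else ""

    # Rewrite each style attribute, dropping it if nothing is left.
    return re.sub(r'\s+style="([^"]*)"', strip_mso, html)


if __name__ == "__main__":
    pasted = '<p class="MsoNormal" style="mso-margin-top-alt:auto;color:red">Hi<o:p></o:p></p>'
    print(clean_word_html(pasted))
```

For anything beyond a quick experiment, a real HTML parser or the editor's own paste filter is a safer choice than regular expressions.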
You can use the Win32 API to do your formatted copy (e.g. make a special copy button in LotusScript and call it). I have actually done this, and it works fine.
However, will TinyMCE handle the paste operation well? That I cannot tell you.
I have logged this as a bug against TinyMCE.
OK, then either you will need to deactivate the paste plugin and write a plugin of your own, or you will have to configure or change the paste plugin to suit your needs.
> If I first paste the clipboard from Lotus Notes into MS Word 2003 (11.5604.5606) it shows perfectly in Word, and if I then copy/paste it from there into TinyMCE it generally works better enough to be usable,
The thing is that your OS detects (at least sometimes) which kind of context (plain text, HTML, ...) the copy was made from. That is probably why copying it into Word first helps a bit.

Convert from Microsoft Word to MediaWiki Markup Style

How do I export a Word document to MediaWiki markup?
I have been trying to do it by following the steps given in http://en.wikipedia.org/wiki/Help:WordToWiki
but in vain; I can't get it to work.
Any help, please.
The best way is to use OpenOffice:
Open the Word document in OpenOffice Writer.
Go to File / Export.
Under File format choose MediaWiki (.txt).
Click Save (or Export).
Open the new file in a text editor and copy the contents to the clipboard.
Paste the text to a Wikipedia article.
That is copy and pasted from the document you linked to.
For Open Office 4.15 you have to add the extension Sun Wiki Publisher 1.1 with the extension manager.
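If there are many documents, the export steps above can probably be scripted rather than clicked through. The sketch below drives the office suite in headless mode from Python. It assumes that the `soffice` executable is on the PATH, that the MediaWiki export filter is installed, and that the filter is addressable as `txt:MediaWiki` on the command line; check those against your own installation. The helper names are made up for this example.

```python
import subprocess
from pathlib import Path


def build_export_command(doc_path, out_dir, soffice="soffice"):
    """Build the headless conversion command (filter name is an assumption)."""
    return [
        soffice,
        "--headless",
        "--convert-to", "txt:MediaWiki",
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def export_to_mediawiki(doc_path, out_dir, soffice="soffice"):
    """Run the conversion and return the path where the .txt is expected."""
    subprocess.run(build_export_command(doc_path, out_dir, soffice), check=True)
    return Path(out_dir) / (Path(doc_path).stem + ".txt")
```

Calling `export_to_mediawiki("report.docx", "out")` would then leave the wiki text in `out/report.txt`, ready to paste into the article. Images are not handled by this route.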
If you don't want to install OpenOffice, another option is the Word2MediaWikiPlus extension.
In Microsoft Word 2016 I use the plugin "Microsoft Office Word Add-in For MediaWiki" (already suggested by Jake). https://www.microsoft.com/en-us/download/details.aspx?id=12298
To make it work in Microsoft Word 2016 (version 16.0), I followed these instructions but replaced "15.0" in the instructions with "16.0":
https://answers.microsoft.com/en-us/msoffice/forum/msoffice_word-mso_other/using-microsoft-office-word-add-in-for-mediawiki/449726c2-6d08-45e1-919a-4b5082ab4b5b
Microsoft has released an add-in for Microsoft Word that lets you export a doc file to MediaWiki formatting (as a .txt file). It's fairly decent.
http://www.microsoft.com/en-us/download/details.aspx?id=12298
If you're going to be doing this a lot, consider installing the FCK Editor. This has a Paste From Word button.
The easiest way may be to install LibreOffice (http://libreoffice.org), open the Word document in its Writer application, and from there export it as a MediaWiki .txt file. Then copy and paste that text into the MediaWiki page in edit mode.
However, there is no way to add images automatically; that does not work with LibreOffice or with the Word plugin.
If you have only a few documents to convert to MediaWiki, that is OK.
But if there are more, it takes a great deal of time and effort.
For automatic image upload, the only working solution was the discontinued project Word2MediaWikiPlus.
If something has changed in the last few years, let me know.
If not, there are some solutions that work without image upload
(if I find them, I will add them here):
- a web server project that generated very good wiki markup output; I can't remember the name.
- a command-line tool that does the conversion, taking an input file and an output file