My default file format is RTF because I mostly use Word 2013 to write diary entries. I thought RTF would be a format that should be readable by most computers for years to come. Can the same be said about the native Word DOCX format? I was thinking of switching to DOCX as my native format because it gets rid of “[Compatibility Mode]” warnings in the title bar and taskbar and the files are smaller. Maybe there are other advantages too.
I imagine TXT files would be the most universal file format for a diary, but I use basic formatting and tables in my entries. I worry that 20 years from now I'll be importing my diary into whatever word processor exists then and won't be able to read it.
Your worry is quite well-founded, and actually has a name: digital rot. It applies to file formats, but just as much to storage media and other things. My Master's thesis is now inaccessible to me because I only have it on floppy disk, and also because the programming language I used is no longer available.
On the other hand, if I really needed access to it, I could probably get it: buying a floppy drive on eBay, finding an emulator for the ancient computer it was created for, finding the programming language, and reading the data are likely all still doable today.
As for your specific question: RTF may not be your best choice, because it, too, is a proprietary format with many different, and sometimes incompatible, flavors. On top of that, it is outdated: Microsoft stopped developing the specification after version 1.9.1 (2008).
My personal favorite formats for this type of storage are TXT if formatting does not matter, HTML if formatting does matter, and PDF if HTML's formatting is not enough.
TXT is likely going to be with us for a very long time - probably for at least your and my lifetime. There just is too much software source code written in it for it to go away.
HTML is also going to be with us for a very long time, although incompatibilities will develop as older features are dropped.
PDF started out as a proprietary Adobe format and has been an ISO standard (ISO 32000) since 2008. It could still disappear, although the installed base means that it will probably remain readable for decades. The main advantage of PDF is that it makes it easy to print to paper - or, with the appropriate printer driver, print to whatever file format will take its place.
DOCX is likely going to be Microsoft's last Word file format, and will be around at least as long as Microsoft offers Word.
Bottom line: your concern is justified, but in the end, you do have a number of options.
If I copy the above text from Chrome and paste it into Obsidian, I get
People who code: we want your input. [Take the Survey](https://stackoverflow.com/dev-survey/start?utm_source=so-owned&utm_medium=announcement-banner&utm_campaign=dev-survey-2021)
[](https://stackoverflow.com/questions/ask# "dismiss")
But when I paste into VS Code or any other editor, I get
People who code: we want your input. Take the Survey
Video Reference: https://dsc.cloud/J/67968285.mov
How does Obsidian do it? How can I achieve the same in VS Code?
— — — — — — — — — — — — — — — — — — — — — — — — —
The Short Answer:
"The VSCode editor does not support the feature you described, and VSCode will almost certainly never support any functionality that comes close to it. This is because of what VSCode is and what VSCode does, which I will explain in detail below."
The Long Answer:
VSCode is a text-based editor whose purpose is to be an environment that programmers write their code in. People using VSCode are therefore going to perform a long list of tasks, which includes, but is not limited to:
interpret, compile, execute, debug, serve, share, save, write, read, document, and run code. Here is the problem with adding the ability to paste arbitrary data types into VSCode: having any type of data other than standard text in a file that you plan to use for one of the following (this is the very short list)...
compile,
execute,
debug,
interpret,
parse,
serve,
...will cause a syntax error to be thrown.
When you can copy and paste text formatting from an external source (like a webpage), there is an extremely high probability that some unwanted formatting data will get pasted into a program you've written. The formatting data will end up not rendering for whatever reason (there are a million and 87 reasons why that could happen), and you end up with syntax errors that you can't see, so you have no idea where the error is, despite the error message saying line 734, column 24. In a situation like this, the error message makes no sense, and you have to start deleting things to find the issue, all because invisible formatting data is intertwined with the standard text. I hope I drew a clear enough picture for you.
There are other tools, called word processors, that implement this feature. I constantly pair Google Docs with VSCode.
Crazy Enough, not all is lost
VSCode allows extensions to provide custom editors. The Extension API used to create an editor lets developers build the UI using standard HTML, CSS, and JavaScript. That is enough for someone to write a word processor for VSCode, which, surprisingly, no one has done yet. When someone does create one, which I am almost certain will happen eventually, it could support the feature you're asking for.
For the record, the feature you're describing is typically a word processor thing. VSCode allows you to install extensions such as Paste, which copies and pastes other data types; however, when the data is pasted into the editor, it is not rendered the way HTML would be, but written out in its text form. In other words, you might think you're copying the page at first, but you will be disappointed once you paste into VSCode. I want to point out that Paste uses the GTK-3 Clipboards API, which means that if Paste were implemented in a word processor like Word or Google Docs, the word processor would render the data that the extension pasted into it. In other words, it isn't the extension that fails to handle the data (which, as stated, the Paste extension can copy and paste); it's VSCode that won't render that data as HTML and only accepts it as standard text.
#W3Dojo
What you've copied can be considered "rich text", but VSCode treats everything in your clipboard as plain text.
So it's the same as pasting via "Ctrl+Shift+V" in other programs like Word or Google Docs. It will remove any formatting, links, color, font, bold/italic, etc.
Obsidian was built with formatting in mind - it closely follows the markdown specification, so it's natural that it will try to convert any rich text to markdown.
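To illustrate the idea: when you copy from a browser, the clipboard usually carries both an HTML version and a plain-text version of the selection, and the receiving program chooses which one to use. The following is only a minimal Python sketch of the "convert rich text to Markdown" step for links. It assumes you already have the HTML fragment as a string (reading the clipboard is platform-specific and not shown), the names `LinkToMarkdown` and `html_to_markdown` are made up for this example, and it is not how Obsidian actually implements it.

```python
from html.parser import HTMLParser


class LinkToMarkdown(HTMLParser):
    """Turn <a href="..."> elements into [text](url); drop all other tags."""

    def __init__(self):
        super().__init__()
        self.parts = []      # finished pieces of output
        self.href = None     # href of the <a> currently open, if any
        self.link_text = []  # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.link_text = []

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.parts.append(f"[{''.join(self.link_text)}]({self.href})")
            self.href = None

    def handle_data(self, data):
        # Text inside a link is held back until the link closes.
        if self.href is not None:
            self.link_text.append(data)
        else:
            self.parts.append(data)


def html_to_markdown(html):
    parser = LinkToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical fragment, similar in shape to the banner in the question.
rich = ('People who code: we want your input. '
        '<a href="https://example.com/survey">Take the Survey</a>')
print(html_to_markdown(rich))
```

An editor that only reads the plain-text version of the clipboard never sees the `<a>` element at all, which is why the link disappears in VSCode.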
I'm not aware of a built-in VSCode setting that will allow you to paste rich text, but I've found 2 extensions that might do the trick:
Paste Special
Markdown Paste
I know this is a rather strange question, but we have multiple authors working on one document, and some contributors use OpenOffice to edit a document that originated in MS Word and is edited in MS Word by the majority. The document is quite complex, with differently structured paragraphs and fonts, bullets, numbering, embedded pictures, footnote references (comments below the line), sections pasted with source formatting instead of as plain text, and so on, so it is generally "fragile" and perhaps slightly exceeds what the OpenOffice authors anticipated for MS compatibility. The upshot is various formatting issues: words glued together (occasionally a space is missing), page headers and footers modified or missing entirely, and so on. We are unable to control the behaviour of contributors and editors to the extent I would like, so I am trying to find out whether there is a way to force users to use MS Word exclusively for a particular docx and to prevent them from using anything else. (I am not on the MS payroll; I personally moved a couple of people around me with "standard" document-writing needs to OpenOffice, but the incompatibility in this case creates needless editing work for us.)
Thanks for any hint.
whether there is a way to force users to use MS Word exclusively for a particular docx and to prevent them from using anything else
To me, it sounds like a terrible idea to try to enforce this with a macro or similar (and it probably wouldn't work even if you tried). Instead, come up with a better workflow and communicate with anyone who may be involved so they know what to do.
First question, is the document under configuration control? For example, if a bad change is made, do you have a way of going back to a previous version? There are many different configuration management tools available, both free and commercial.
Next, I would strongly recommend making final changes with only one Office suite. Pick either LibreOffice (or Apache OpenOffice - is that what you mean by OpenOffice? The OpenOffice.org suite was forked several years ago) or MS Word to be the official editing tool, but not both.
If you pick MS Word, then people can still make preliminary changes to the document using LibreOffice. However, someone with MS Word will then need to use a Diff tool to see the changes and then use MS Word to incorporate those changes into the document. Or ideally, Track Changes would be turned on to make it easier to see what changes were made and who made them. Comments can also be added to explain why changes were made.
What is even better is to get people to send marked-up PDF files that contain their proposed changes. PDF files are not easily edited, which is good because it avoids the kinds of problems that led you to write this question, and the markup they add will not appear differently on another computer. However, this requires a certain amount of education so that everyone agrees to do it this way, and in my experience, that's not easy with a diverse group.
If you ever see that someone has made changes to the main document using LibreOffice, you or someone else needs to go back to the latest version not edited by LibreOffice and then use MS Word to incorporate all of the new changes.
At this point, if both suites have been used to edit the document, then I would probably start off with a new blank document and copy all of the text unformatted into it. This would require redoing all tables and other formatting. Otherwise, it's likely to be nearly impossible to get a clean document, and the underlying formatting may have no end to the number of problems that keep popping up.
I have a very strange situation and would greatly appreciate your assistance. Thank you in advance for your time and effort.
I have a machine that produces a text file recording information about the machine's working status, such as the coordinates of the drill head and the rotation speed used at that position. When we examine the text file, it appears to be unreadable because most of the contents are garbled. Please see the attached figure. http://ppt.cc/sA1I
If I open it with UltraEdit I see: http://ppt.cc/TrnV
As you can see, some parts of the file are readable; however, there are many unrecognizable characters, which should be the numeric values we want.
There are two reasons I believe this problem should be solvable with Matlab. First, I am sure this machine has a lot of built-in Matlab code for analysis purposes. Second, we have an .exe file, compiled with Matlab, that can restore the garbled text file into an arranged and readable format (the values of the coordinates are restored).
We desperately want to see the contents of this file ourselves. Please kindly provide a solution, an idea, or any direction that would help me solve this issue.
Sincerely,
Old question without answer: For the record, a suggestion.
Sounds like a case of Mojibake, a problem with text encoding. Here's how I solved it.
Background: I had text files created on a Mac, others on Windows, and others still on Linux, each in a different text encoding. So I got a text editor that would let me view the encoding and change it. In my case, I used TextMate on macOS: I opened each file and picked the correct encoding upon opening, which was sometimes a Windows encoding, sometimes a Mac encoding, and sometimes a Latin encoding. I had to use trial and error to figure it out, based on a preview this particular piece of software gave me. Once I had the file open in the correct encoding, I would save it in UTF-8 format, which is not platform-specific and allows me to move my text files across various computers.
There may be more scalable methods, but I only had a hundred or so files to deal with, so I opted for the manual method, in order to personally visualize the rendering on screen, and because my files came in different encoding to begin with.
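For anyone who wants to script the same trial-and-error, here is a minimal Python sketch of the idea: decode the raw bytes with a few candidate encodings, look at the previews, then re-save with the one that reads correctly. The candidate list and the function names `preview` and `convert_to_utf8` are assumptions made for this example, not part of any tool mentioned above, and choosing the right encoding still has to be done by eye.

```python
from pathlib import Path

# Candidate encodings to try, in order. This list is an assumption;
# adjust it to the platforms your files came from.
CANDIDATES = ["utf-8", "cp1252", "mac_roman", "latin-1"]


def preview(raw, encodings=CANDIDATES, length=60):
    """Return {encoding: first decoded characters}, skipping encodings that fail."""
    previews = {}
    for enc in encodings:
        try:
            previews[enc] = raw.decode(enc)[:length]
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
    return previews


def convert_to_utf8(src, dst, encoding):
    """Re-save src as UTF-8, reading it with the encoding chosen by eye."""
    text = Path(src).read_bytes().decode(encoding)
    # Write bytes directly so line endings are left exactly as they were.
    Path(dst).write_bytes(text.encode("utf-8"))


# Bytes as a Windows editor might have saved the word "café".
raw = "caf\u00e9".encode("cp1252")
for enc, text in preview(raw).items():
    print(f"{enc:10} {text}")
```

Note that a successful decode does not prove the encoding is right: `latin-1` accepts any byte sequence, so it always produces a preview, correct or not.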
I've been using an application called "WinMerge" lately for document comparisons, but one of the requirements for my team's script files (for auditing purposes) is that when we release a revision of a script, we highlight the changes in red (RTF format, I believe; it's through Lotus Notes). To that end, is there any software that can automatically highlight changes for me, or is the best I'm going to get a list of differences that I'm expected to highlight manually?
Assuming an HTML+CSS solution meets your needs, this article from Linux Journal shows a shell script that reads diff output and writes an HTML document with colored text highlighting the differences.
On Windows, it would probably work as-is under bash as provided in the MSYS environment from the MinGW folk or in bash from Cygwin. The script itself isn't too large; I would imagine it could be ported to Perl with only moderate effort.
Since converting HTML to RTF turns out not to have a trivial solution that I've found, you might have better luck porting the script to directly output RTF.
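As a rough illustration of what "directly output RTF" could look like, here is a minimal Python sketch (Python rather than the shell or Perl mentioned above, and not a port of the Linux Journal script). It uses the standard `difflib` module to compare two versions line by line and writes the new version as RTF with added or changed lines in red. The function names are made up for this example, removed lines are simply omitted, and whether Lotus Notes accepts this RTF as-is is untested.

```python
import difflib


def rtf_escape(text):
    """Escape RTF control characters and write non-ASCII as \\uN? escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes signed 16-bit UTF-16 code units; '?' is the fallback char.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little", signed=True)
                out.append(f"\\u{unit}?")
    return "".join(out)


def diff_to_rtf(old_lines, new_lines):
    """Return an RTF document of new_lines with added or changed lines in red."""
    body = []
    for line in difflib.ndiff(old_lines, new_lines):
        tag, text = line[:2], line[2:]
        if tag == "+ ":
            body.append("\\cf1 " + rtf_escape(text) + "\\par")  # color 1 = red
        elif tag == "  ":
            body.append("\\cf0 " + rtf_escape(text) + "\\par")  # color 0 = default
        # "- " (removed) and "? " (hint) lines are skipped in this sketch.
    header = ("{\\rtf1\\ansi\\deff0"
              "{\\fonttbl{\\f0 Courier New;}}"
              "{\\colortbl;\\red255\\green0\\blue0;}"
              "\\f0 ")
    return header + "\n".join(body) + "}"


# Hypothetical script revisions, one list item per line.
old = ["echo start", "run job A"]
new = ["echo start", "run job B"]
print(diff_to_rtf(old, new))
```

Saving the returned string to a file with an `.rtf` extension should give something a word processor can open. The highlighting is per line, so a single changed word turns the whole line red.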
If an HTML report is acceptable, Beyond Compare can generate a comparison report that highlights differences. You can use the built-in stylesheets or a custom internal one to style the differences in red (the default is a light red color already).
It doesn't seem to be able to generate RTF, but perhaps there is a simple conversion from HTML/CSS to RTF.
I know this question has probably been asked already, but I would really like to know of a program that will show, line by line, the differences between two Word documents. Thus I need a word-processing document format that supports this (.doc, .docx and .ods obviously don't).
Are HTML and XML the only formats that come close to supporting this feature or is there another format?
MS Word 2007 itself can compare docs for you! Check this link. Is this what you are looking for?
LaTeX, DocBook (which is actually XML).
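If the documents have to stay in .docx, a plain-text diff is still possible, because a .docx file is a ZIP archive whose main text lives in `word/document.xml`. The following is a minimal Python sketch, using only the standard library, that pulls out the text of each paragraph and diffs the two documents one paragraph per line. The function names are made up for this example, and formatting, headers, footnotes, and tracked changes are ignored.

```python
import difflib
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by the elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Return the plain text of each paragraph (w:p) in a .docx file."""
    with zipfile.ZipFile(path) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    return ["".join(t.text or "" for t in p.iter(W + "t"))
            for p in root.iter(W + "p")]


def docx_diff(old_path, new_path):
    """Return a unified diff of the two documents, one paragraph per line."""
    return list(difflib.unified_diff(
        docx_paragraphs(old_path), docx_paragraphs(new_path),
        fromfile=old_path, tofile=new_path, lineterm=""))
```

Calling `print("\n".join(docx_diff("old.docx", "new.docx")))` (the file names are placeholders) would print removed paragraphs prefixed with `-` and added ones with `+`. Paragraphs inside table cells are included, while text in nested structures such as text boxes may show up more than once.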
I use MS Word 2003 (yes, it still works fine on Windows 10). For the most part I like it, and it does have a compare feature for comparing .doc or .docx files, but that feature is rather pathetic.
Fortunately, there are at least two good free alternatives. Kingsoft WPS Writer v.10.2.0.5978 and LibreOffice Writer 5.4.3 (both free, and both circa Nov. 2017) both do a pretty good job comparing MS Word documents. (You can also get LibreOffice very conveniently via Ninite.)
WPS Office is "free with advertising," or you can pay to make the ads go away. LibreOffice is free and open-source.
I think I prefer LibreOffice by a narrow margin, but the compare feature is a bit hard to find. First open your "new" document in LibreOffice Writer. Then navigate to Edit -> Track Changes -> Compare Document... and open the "old" document. LibreOffice Writer shows you the "new" document and identifies what has changed in it by showing the insertions with underlines and the deletions as struck-through (deleted) text. It is quite nice, actually.
Of course, both WPS Writer and LibreOffice Writer can also edit your files. So, if you remember to save in .doc or .docx format, they can completely replace MS Office, for many users. However, compatibility isn't perfect in the more obscure "corners" of the programs, so I find myself still using the old MS Word 2003 for editing, most of the time.