How to compare two word documents? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
A business analyst on my team keeps sending us updated requirements documents, and I end up hunting for the recent changes by comparing against the old version. Is there a good way of comparing Word documents?
Note: We have the Track Changes option on, but now the documents look like a bloodbath, which complicates things even more :(

Use this option in Word 2003:
Tools | Compare and Merge Documents
Or this in Word 2007:
Review | Compare
It prompts you for a file with which to compare the file you're editing.

I use TortoiseMerge with the xdocdiff plugin to compare Word, Excel, PowerPoint and PDF versioned files

If you have Beyond Compare, you can diff two Word documents with the help of some rules that you have to download from the developer's site and plug in. It will then give you a text-only view (without formatting, and with some Word format gibberish that you can ignore). The differences will be highlighted and easy to find.
I made a note on how to do it here. It talks about Excel but there is a rule for Word in the same place.
If you don't have Beyond Compare... buy it! Highly recommended. I'd struggle without it.

Codejacked covers three different methods for comparing Word documents.

You're using the wrong tools. Through the course of my last major project, we managed to convince the entire team to move to a Wiki scheme. Not only did it make tracking changes faster and easier, but it helped organize the information better. Rather than having to keep track of arbitrary indexes in a large text document, hyperlinks were available between documents.
This meant that the documents could naturally flow from high-level to specifics. Implementation of such specs was incredibly easy in comparison to Word docs. Also, the fact that the docs were in a central location ensured that no one was still working from an out of date copy they saved to their hard drive.
I know there can be some internal resistance to moving in new directions. But if you can convince your colleagues that they should be forward thinking and always challenging themselves, they'll give it a shot and become true believers in no time flat. :-)

Near the "track changes" stuff there is also an option to compare documents, I believe.

Attorneys use programs such as CompareRite and DeltaView because we compare documents daily. We call it "blacklining" a document because the differences show up in bold underline for additions and black strike-through for deletions.

Open either of the documents and use Review > Compare on the ribbon.

I don't know how to compare the files individually, since they are binary, but how about making a program that talks to MS Word, copying the contents of the files to a pure-text file? Then you could compare the plain-text files.

If the formatting is basic, one option is to use a tool that dumps the doc to a plain text file, and then use diff as you would on any other.
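The dump-to-text-then-diff idea can be sketched in Python using only the standard library. This is a minimal sketch that assumes the files are in the zip-based .docx format (not the older binary .doc) and that paragraph text is all you care about; the function names and the file labels in the diff header are made up for illustration.

```python
import difflib
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML (the format inside a .docx file).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source) -> list[str]:
    """Return the plain text of each paragraph in a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Join every text run (w:t) in the paragraph; formatting is dropped.
        paragraphs.append("".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t")))
    return paragraphs


def diff_docx(old_source, new_source) -> list[str]:
    """Return a unified diff of two .docx files, one line per paragraph."""
    return list(
        difflib.unified_diff(
            docx_paragraphs(old_source),
            docx_paragraphs(new_source),
            fromfile="old.docx",  # labels only, not real paths
            tofile="new.docx",
            lineterm="",
        )
    )
```

Calling `print("\n".join(diff_docx("requirements_v1.docx", "requirements_v2.docx")))` (hypothetical file names) would list changed paragraphs with `-` and `+` prefixes. Tables, images and formatting changes are invisible to this approach.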

Versionate might do the trick.

The document comparison features in Word 2003 are extremely poor, and often result in the user removing parts of documents they did not want to.
The only rational choice is to use other software. There is a multitude of text comparison software on the market, but to do this within Word, the simplest answer is to upgrade to Word 2007 or a later version.
From Word 2007 onward, the ribbon commands "Review" and "Compare" are easy to find and reasonably obvious to operate. They give a nice, clear layout of the merged changes alongside the before and after documents.
The small cost of the upgrade is well worth it, considering the time you will waste with the 2003 compare feature and the potential damage it could cause to your documents.
Any suggestion by others that you can use the compare features in 2003 is mischievous and has not been well thought through, given the long-term consequences of parts of your documents being silently deleted.

Related

Is there a way to prevent MS docx document editing in OpenOffice?

I know this is a strange question, but we have multiple authors of one document, and some contributors use OpenOffice to edit a document that originated in MS Word and is edited by the majority in MS Word. The document is quite complex, with differently structured paragraphs and fonts, bullets, numbering, embedded pictures, references to notes below the line, sections copied and pasted with source formatting instead of as plain text, etc., so it is generally "fragile" and maybe a little beyond what the OpenOffice authors expected for MS compatibility. The bottom line is various formatting issues: words glued together (occasionally a space is missing), page footers/headers modified or missing completely, etc. We are unable to control the behaviour of contributors and editors to the extent I would like, so I am trying to find out whether there is a way to force users to use MS Word exclusively for a particular docx and to prevent them from using anything else. (I am not on the MS payroll; I personally moved a couple of people around me with "standard" document-writing needs to OpenOffice, but the incompatibility in this case creates useless editing work for us.)
Thanks for any hint.

whether is there a way how to force users to use exclusively MS word for particular docx and to prevent using anything else
To me, it sounds like a terrible idea to try to enforce this with a macro or similar (and it probably wouldn't work even if you tried). Instead, come up with a better workflow and communicate with anyone who may be involved so they know what to do.
First question, is the document under configuration control? For example, if a bad change is made, do you have a way of going back to a previous version? There are many different configuration management tools available, both free and commercial.
Next, I would strongly recommend making final changes with only one Office suite. Pick either LibreOffice (or Apache OpenOffice - is that what you mean by OpenOffice? The OpenOffice.org suite was forked several years ago) or MS Word to be the official editing tool, but not both.
If you pick MS Word, then people can still make preliminary changes to the document using LibreOffice. However, someone with MS Word will then need to use a Diff tool to see the changes and then use MS Word to incorporate those changes into the document. Or ideally, Track Changes would be turned on to make it easier to see what changes were made and who made them. Comments can also be added to explain why changes were made.
What is even better is to get people to send marked-up PDF files that contain their proposed changes. PDF files cannot be edited, which is good because it avoids the kinds of problems that led you to write this question, and also the formatting changes they made will not appear differently on another computer. However, this requires a certain amount of education so that everyone agrees to do it this way, and in my experience, that's not easy with a diverse group.
If you ever see that someone has made changes to the main document using LibreOffice, you or someone else needs to go back to the latest version not edited by LibreOffice and then use MS Word to incorporate all of the new changes.
At this point, if both suites have been used to edit the document, then I would probably start off with a new blank document and copy all of the text unformatted into it. This would require redoing all tables and other formatting. Otherwise, it's likely to be nearly impossible to get a clean document, and the underlying formatting may have no end to the number of problems that keep popping up.

How does being blind affect your coding style? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
The question of how blind people program has been answered over and over already, but I couldn't find anything on how being blind and using a screen reader or braille display affects your coding style.
Can you tell code created by blind people apart from other code?
Does being blind cause you to think differently about a problem and look for other solutions?

I'm a blind developer. I will try to answer your question based on what I do and what I have seen in code from other blind developers.
However, remember that my answer is by no means a reference. There are probably as many different usages, habits and preferences as there are among sighted developers.
When working in a company and/or on an open source project, we have to format our code according to the rules of that company and/or project anyway. There is no question about it; it's required.
In this case, most of the blind programmers I know and I first write unformatted code, compile, test, etc., and only format it when it's time to commit.
Auto-formatting tools such as those in IDEs are extremely precious; without them it would often be a real pain. If not using an IDE, command-line tools are also common, e.g. astyle for Java and C/C++.
If a given format isn't required by a company and/or project, many of us:
- don't indent code, as indented code is usually more painful to navigate and edit, especially if we want to take care not to break the indentation. Unlike for sighted people, indentation generally doesn't help us to quickly identify blocks. Even with a braille display, if we have one, we can only see one line at a time.
- use other tricks to identify where blocks end, if necessary in case of doubt or when nesting is deep. Most often, this takes the form of a comment following the closing bracket, e.g. } // end for. When the need arises to do this, it can be a good indicator that we should organize the code better or split it into different functions.
- use a lot of small tricks to be able to jump quickly to a part of the code of interest. This can be simple comments like //constructor, which can immediately be found with Ctrl+F, but it can also be more subtle. For example, one of my personal tricks is to put a space between the name and the opening parenthesis when defining or declaring a function, but not when calling it. So I can quickly go to the definition (by searching for "name (") or to the places where it is called (by searching for "name(").
- hate ASCII art because it's totally useless, e.g. a long line of /**********
- often use shortcuts to avoid long code that gives no real information, e.g. import java.util.* instead of importing 50 classes one by one.
- often prefer simple text editors to complex IDEs, or only use IDEs for specific functions such as auto-formatting when it's absolutely needed. Two reasons for this: many IDEs are inaccessible, only partially accessible, or mostly accessible but with features that are not necessarily easy or comfortable to use; or responsiveness with speech and braille displays is quite poor, i.e. when pressing the up/down arrow to read the next/previous line of code, there is too long a delay before it starts speaking (it quickly becomes very annoying if you multiply 100 ms a thousand times).
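The "name (" versus "name(" convention described in the list above is easy to script as well as to search for by hand. The following is a minimal Python sketch of that search; the function name and the sample C code are made up for illustration, and it assumes the convention is applied consistently.

```python
import re


def find_definitions_and_calls(source: str, name: str):
    """Return (definition_lines, call_lines) as 1-based line numbers.

    Relies on the convention that "name (" marks a definition or
    declaration and "name(" marks a call.
    """
    definition = re.compile(rf"\b{re.escape(name)} \(")
    call = re.compile(rf"\b{re.escape(name)}\(")
    definitions, calls = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        if definition.search(line):
            definitions.append(number)
        if call.search(line):
            calls.append(number)
    return definitions, calls


# Unindented sample with end-of-block comments, in the style described above.
sample = """\
int area (int w, int h) {
return w * h;
} // end area
int main (void) {
return area(2, 3);
} // end main
"""

print(find_definitions_and_calls(sample, "area"))  # ([1], [5])
```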

Well, I answered this question partially here. Basically, you rarely can tell that a piece of code is written by a blind person, unless he/she breaks rules in quite a rude fashion (for example, uses tabs and camelCase instead of spaces and snake_case in Python, like me).
But even those things might be seen only in individual pet projects or quick and dirty scripts. Most blind people acknowledge they live in a sighted world, and if you want your pull request to be merged or your code to be reviewed by a superior at work, you must obey the code style of the project, whether you like it or not, whether you're blind or not. In this situation, the people at Go made a wise decision to include a formatting utility that every Go developer must run before committing his/her code. "Nobody likes the Gofmt style", says Rob Pike, and he's wrong: I like its style very much: camelCase and tabs, what a delicious thing! But even if you don't like it, you must run the tool because it is the language rule to do so.
And to the last part of your question: yes, being blind sometimes makes me choose a particular solution, namely a language. As I hate snake_case, I can't think about serious development in Rust, for example, because (again) it's a language rule to write code like this. I do write Python code, but that's... oh well... kind of a different thing, because Python is so quick and flexible in solving everyday problems that I decided to cope with its (annoying) multiple underscores and the absence of block-ending markers. BTW, another possible sign of a blind coder is comments like this: } // end if (in something like JavaScript or C), or #end if as a whole line in Python. I don't deny sighted people can use those, but if you see every if, for and while ending commented like this, there is a great chance that the code was written by a blind person. I personally don't do this, but I know people who like it very much.

I know this question is quite old, but the answer might be relevant still:
I am a blind developer and I always intend to follow the coding style of a company or a standard given by the developers of a language.
- I always indent my code immediately as I write it, and the screen reader reports the indentation level. Honestly, I no longer have the habit of reading unformatted code, but I know blind people who do;
- I do the regular docblocking;
- I fold/unfold parts of the code when I need to navigate through large chunks of it;
- I follow the regular snake_case / camelCase conventions (depending on the language);
- I sometimes write longer lines of code and then use the IDE to fix the formatting, because longer code is not always more complex for me to read;
- I try to force myself to restrict the length of a line to no more than 80 characters, but it's a bit of a pain to ensure that due to a lack of good tooling;
- I sometimes add useful comments to help me debug code (I mean calculations / formulas in comments that are not necessarily important to others, but it depends).
Personally, I have found that the biggest challenge is writing code in docblocks (annotations), as in Doctrine or API Platform for example, because the screen reader reads the indentation up to the first non-space / non-tab character in the line, which is the asterisk (*) in the case of docblocks.

Import docx file with comments into emacs org (vision)

I collaborate with other researchers and frequently have the following work flow:
1. I write a draft in Emacs org, then export it to docx.
2. Other authors make edits using track changes and add comments.
3. I revise the draft in Emacs org.
For step 3, I import back the docx file manually, which typically involves:
- Accepting all track changes.
- C&P'ing text back into the org file, making sure that I do not delete markups (pandoc can help here).
- Putting the comments in a list and making todos and further edits; often I write down a note about what I did to address the comment.
I've been looking for ways to make this process better. I found other discussions of this issue, and it boils down to: if you can, have your collaborators edit the manuscript as a text file (not realistic for me, at least not at this point); or do some manual import similar to the one I described above.
So this post is about your thoughts / ideas regarding a great solution to importing back edited docx files that might become reality in the future, and how it could be done.
I think there are two parts here:
How to automatically import back text without destroying markups such as footnotes, references etc.?
How to automatically extract all the notes and integrate them into the Emacs org file?
For the second question, my vision would be to have some sort of comment blocks above the paragraph of the comment, or a list of headlines, each of them representing a comment and a link to the paragraph. A properties drawer would be a great additional feature, it could have one entry for open/closed and one entry for response / notes.
P.S.: I think this is a real barrier to using text-based manuscript writing, and it would be a huge step forward if there was a good way. Even more, with all the capabilities of Emacs org, I bet the end result would be much better than revising a paper within Word, which is just painful.
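The second part (comments as headlines with a properties drawer) could be prototyped outside Emacs. A .docx file is a zip archive, and Word stores comments in `word/comments.xml` inside it. The following is a minimal Python sketch under those assumptions; the function names, the `STATUS` and `RESPONSE` property names and the headline layout are invented for illustration, and it does not attempt to link a comment to the paragraph it is attached to.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML (the format inside a .docx file).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def comments_to_org(comments_xml: bytes) -> str:
    """Turn the content of word/comments.xml into org TODO headlines."""
    root = ET.fromstring(comments_xml)
    lines = []
    for comment in root.iter(f"{{{W_NS}}}comment"):
        author = comment.get(f"{{{W_NS}}}author", "unknown")
        comment_id = comment.get(f"{{{W_NS}}}id", "")
        # A comment may span several paragraphs; join them with spaces.
        text = " ".join(
            "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
            for para in comment.iter(f"{{{W_NS}}}p")
        ).strip()
        lines += [
            f"* TODO Comment {comment_id} by {author}",
            ":PROPERTIES:",
            ":STATUS: open",   # hypothetical property: open/closed
            ":RESPONSE:",      # hypothetical property: filled in by hand
            ":END:",
            text,
        ]
    return "\n".join(lines)


def docx_comments_to_org(docx_path) -> str:
    """Read the comments part of a .docx file and convert it.

    Raises KeyError if the document contains no comments part.
    """
    with zipfile.ZipFile(docx_path) as archive:
        return comments_to_org(archive.read("word/comments.xml"))
```

The output can be pasted into the org file as a list of headlines, one per comment.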

Here's how you might be able to do it:
- Assume all changes are properly marked.
- Assume you know the "base version" of your org file.
- Assume every marked change comes with a "before" and an "after".
Then, analyze the .docx (same for .odt) looking for marked changes. Ignore everything else. Take the "before" version of each change, turn it into plain text, and try to find the matching element in the org file, then replace that text with the "after" version.
For comments, you could probably try a similar approach.
Caveat: I have no idea how easy/hard it is to find the marked changes, extract the "before/after" info and turn it into plain text.
Oh, and this will probably only work acceptably for small localized changes, e.g. the kind of thing you might get from a reviewer. For things coming from another author who may end up making larger changes and reorganizations it'll probably break down miserably.
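The matching step of this approach can be sketched as follows, in Python. It assumes the "before"/"after" pairs have already been extracted from the .docx and converted to plain text (that extraction is not shown), and it only applies a change when the "before" text occurs exactly once in the org file; the function name is hypothetical.

```python
def apply_changes(org_text: str, changes: list[tuple[str, str]]):
    """Apply (before, after) pairs to the text of an org file.

    A change is applied only when `before` is non-empty and occurs
    exactly once in the current text. Anything else (no match, several
    matches, pure insertions) is returned for manual review.
    """
    unresolved = []
    for before, after in changes:
        if before and org_text.count(before) == 1:
            org_text = org_text.replace(before, after)
        else:
            unresolved.append((before, after))
    return org_text, unresolved
```

Refusing ambiguous matches is a deliberate choice here: it matches the caveat that larger rewrites will break down, and it leaves those for a human instead of guessing.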

How to highlight the differences between two versions of a text in .NET web app?

I have been supporting a web application at work for our Call Center unit for about 2 years now. The app is written in ASP.NET 3.5 with a SQL Server 2005 database. I've been asked to expand the call detail section to allow agents to edit the current call note with the ability to revert to its previous version. Now, that's all cool, but the manager also wants to be able to click on any particular note and see all edits with changes highlighted in yellow (and if something was deleted, he wants to SEE the deleted text crossed out). Actually, what I need is very similar to how Stack Overflow handles edits on questions.
I've been thinking about how to go about this, and after doing research and Googling, of course, I am still unsure which route to take. I am fairly new to .NET development. Any ideas on the best technique for highlighting the changes in the UI? I am afraid I am going to have to store a copy of the entire note each time they make a change, because the manager wants to be able to easily review notes and revert to ANY version (not just the most recent one) before sending the monthly call report off to our VIP customers.
Since this department OFTEN changes its mind on things, I want to make sure the new functionality is scalable and easy to maintain. Any ideas would be greatly appreciated. I am really just looking for someone to point me in the right direction; maybe there are some tools out there that can be useful, recommended keywords for a Google search, etc.

This will be difficult to do.
You'll need a "text editor" control that can not only edit the text, but which can also tell you what changes were made.
You then need to store not only the final text string, but also the list of changes.
You'll then need to be able to display the text plus changes, using strike-outs, and different colors for inserts vs. changes.
You'll need to do this not only for the changes of a single user, but you'll need to store each user's changes in the database, and will need to be able to display all the changes, all at once.
Your manager should be really sure he needs this.

Some tools for doing the diff for you can be found at Any decent text diff/merge engine for .NET?.
This would entail storing every version, like you say. This should allow you to implement it similarly to SO. I seem to recall reading or hearing Jeff mention it, likely in one of the SO podcasts, but I wasn't able to find it.

Easiest would be to store the text for each revision, then when the user wants to see the diff use a diff tool to generate the highlighted text.
Here is some Javascript diff code:
http://ejohn.org/projects/javascript-diff-algorithm/
If all the computers have Word installed, you may be able to use a Word control to accomplish this. TortoiseSVN has scripts in its program directory which can take two Word documents and produce a document with the changes highlighted. To see this, create c:\aaa.doc and c:\bbb.doc, then install TortoiseSVN and run:
wscript.exe "C:\program files\tortoisesvn\Diff-Scripts\diff-doc.js" c:\aaa.doc c:\bbb.doc //E:javascript
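The store-every-revision approach, with the diff generated only when someone asks to see it, can be sketched like this. The sketch is in Python purely for illustration (an ASP.NET app would use one of the .NET diff libraries instead); it compares word by word and wraps deletions in `<del>` and insertions in a yellow `<ins>`. The function name and the inline style are assumptions, not part of any existing tool.

```python
import difflib
import html


def highlight_changes(old: str, new: str) -> str:
    """Return HTML showing how `old` became `new`, word by word.

    Deleted words are wrapped in <del> (rendered struck through by
    browsers) and inserted words in a yellow <ins>. Runs of whitespace
    are collapsed to single spaces.
    """
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = html.escape(" ".join(old_words[i1:i2]))
        added = html.escape(" ".join(new_words[j1:j2]))
        if op == "equal":
            parts.append(added)
        if op in ("delete", "replace"):
            parts.append(f"<del>{removed}</del>")
        if op in ("insert", "replace"):
            parts.append(f'<ins style="background: yellow">{added}</ins>')
    return " ".join(parts)
```

With one stored row per revision, showing the full edit history is a matter of calling this on each consecutive pair of revisions, and reverting to any version is just copying that row's text forward as a new revision.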

I think you should see http://en.wikipedia.org/wiki/Revision_control

What's the best way of diffing Crystal Reports? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
If you have two versions of the same report (.rpt) and you want to establish what the exact differences are, what is the best way to go about this? I've seen some commercial tools to do this, but I'm not too interested in forking out cash for something that should be relatively straight forward. Can I hook into the Crystal API and simply list all of the properties of every field or something? Please someone tell me that there's an Open Source project somewhere that does this... #:-)
@Kogus, wouldn't diffing the outputs as text hide any formatting differences?
@ladoucep, I don't seem to be able to export the report without data.

Can I hook into the Crystal API and simply list all of the properties of every field or something? Please someone tell me that there's an Open Source project somewhere that does this... #:-)
There is in fact, such an API. I wrote a VB6 application to do just what you asked and more. I think I even migrated it to VB.Net. As it was for my own use, I didn't spend much time making it 'polished'. I've been intending to release it, but I haven't had the time...
Another approach that I've used in the past is to create an Access application to help manage large report-development projects. One of its many features is the ability to extract the tables that are used by the report, and the SQL statements used by its Commands and SQL Expressions. Its intent is to give one a global perspective of which reports use which tables. I probably still have it somewhere...
** edit 1 **
BusinessObjects Enterprise XI (R?) has a feature named 'Meta Manager'. It will periodically examine the contents of the Repository and save the results to a database. It uses the Report-Application Service (RAS) to generate the meta data. It's an additional, 5-figure license, of course.
** edit 2 **
Consider using PowerShell to do the work: PsCrystal.

One helpful technique is to output both versions of the report to plain text, then diff those outputs.
You could write something using the crystal report component to describe every property of the report, like you described. Then you could output that to text, and diff those. I'm not aware of any open source tool that does it for you, but it would not be terribly hard to write it.
@question in the post:
Diffing the outputs would only show formatting changes if the relative positions had changed. For example, if I had this:
before:
First Name, Last Name, Address
after:
Last Name, First Name, Address
Then that would show up as a difference.
But if I had just bumped the address column over a few pixels, or changed it from plain text to bold, then you are right, that would not show up.
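Dumping every property to text and diffing the dumps, as suggested above, would catch those cases too. The following Python sketch shows only the dump-and-diff half: it assumes the report's properties have already been read into nested dictionaries (reading them from a real .rpt file would need the Crystal API and is not shown), and the field and property names are invented.

```python
import difflib


def flatten(properties: dict, prefix: str = "") -> list[str]:
    """Flatten nested properties into sorted 'path = value' lines."""
    lines = []
    for key in sorted(properties):
        value = properties[key]
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            lines.extend(flatten(value, path))
        else:
            lines.append(f"{path} = {value!r}")
    return lines


def diff_reports(before: dict, after: dict) -> list[str]:
    """Return only the property lines that differ between two dumps."""
    delta = difflib.ndiff(flatten(before), flatten(after))
    return [line for line in delta if line[:2] in ("- ", "+ ")]


# Hypothetical property dumps: the Address field was made bold.
before = {"Address": {"bold": False, "left": 4320}}
after = {"Address": {"bold": True, "left": 4320}}
print(diff_reports(before, after))
# ['- Address.bold = False', '+ Address.bold = True']
```

Sorting the keys keeps the dump stable, so a diff shows real property changes and not differences in enumeration order.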

One technique I have used to great effect in the past is to print out reports from both versions based on the same data. I then take the first page from each version, lay one on top of the other (it is important not to mix them up) and hold them up to a window. It is generally quite easy to see any differences, and these differences can be manually annotated with a suitable writing instrument such as a pencil. Repeat for each page in the report.
Admittedly, for large reports this can be quite time consuming and error prone, but these limitations can be overcome with patience and care.
