How do I see the differences between 2 MySQL dumps? - diff

I have 2 MySQL dump files. I want to find the table data differences between the two.

Run mysqldump with --skip-opt to produce the two dump files, i.e.:
mysqldump --skip-opt -u $MY_USER -p$MY_PASS mydb1 > /tmp/dump1.sql
mysqldump --skip-opt -u $MY_USER -p$MY_PASS mydb2 > /tmp/dump2.sql
Then compare them using these diff options:
diff -y --suppress-common-lines /tmp/dump1.sql /tmp/dump2.sql

Use a diff tool - here are some graphical ones (both are free):
KDiff3
WinMerge

Maybe you can give a tool called mysqldiff a go; I haven't tried it myself yet, but it's been on my list for a while. (Note: this tool is not available anymore, as the website is no longer functional.)
http://www.mysqldiff.org/

This was very useful for me, so adding my two cents:
git diff --word-diff=color dump1.sql dump2.sql | less -R
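If you run this inside an existing git working tree, you may need to add --no-index so git compares the two files directly rather than against the index (behaviour may vary slightly between git versions):
git diff --no-index --word-diff=color dump1.sql dump2.sql | less -R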

I just had to add line breaks at each ),( so that every record becomes a separate line. Then the result can be fed to a tool like diff. This command does the job:
FORMAT_="s/),(/),\n(/g"
diff <(sed "$FORMAT_" old-dump.sql) <(sed "$FORMAT_" new-dump.sql)

In order to compare 2 MySQL dumps, they need to be produced in a certain manner, so that the ordering is well defined and irrelevant data is omitted.
This was at one point not entirely possible with mysqldump; I am not sure whether this has changed in the meantime.
One good tool for the job is pydumpy https://code.google.com/p/pydumpy/ (mirror: https://github.com/miebach/pydumpy)
If you want to compare to an old dump, like in the question, you could first create a temporary database from the dump and then start there.
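For example, you could load the old dump into a throwaway database and re-dump it with --skip-opt, so both dumps are produced the same way. A rough sketch; the database name tmp_compare is just a placeholder, and $MY_USER/$MY_PASS are as in the first answer:
mysql -u $MY_USER -p$MY_PASS -e 'CREATE DATABASE tmp_compare'
mysql -u $MY_USER -p$MY_PASS tmp_compare < /tmp/old-dump.sql   # load the old dump
mysqldump --skip-opt -u $MY_USER -p$MY_PASS tmp_compare > /tmp/dump_old.sql
mysqldump --skip-opt -u $MY_USER -p$MY_PASS mydb1 > /tmp/dump_new.sql
diff -y --suppress-common-lines /tmp/dump_old.sql /tmp/dump_new.sql
mysql -u $MY_USER -p$MY_PASS -e 'DROP DATABASE tmp_compare'    # clean up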

Here's what I use. It works.
#!/bin/bash
# Do a mysqldump of a db once a day or so and diff it against the previous day's dump. I want to catch when data has changed.
# Use --extended-insert=false so that each row of data is on its own line; that way the diff catches individual record changes.
mv /tmp/dbdump0.sql /tmp/dbdump1.sql 2>/dev/null
mysqldump -h dbhostname.com -P 3306 -u username -ppassword --complete-insert --extended-insert=false dbname > /tmp/dbdump0.sql
# Ignore everything except data lines
grep "^INSERT" /tmp/dbdump0.sql > /tmp/dbdump0inserts
grep "^INSERT" /tmp/dbdump1.sql > /tmp/dbdump1inserts
diff /tmp/dbdump1inserts /tmp/dbdump0inserts > /tmp/dbdumpdiffs
r=$?
if [[ 0 != "$r" ]] ; then
    : # notifier code removed (no-op placeholder so the if body is not empty)
fi

Related

p4 CLI: How to find new files not yet "added" to perforce control

I have looked at different ways of doing this using diff. The first option I tried is:
p4 diff -sa
Opened files that are different from the revision in the depot, or missing.
Initially I figured that this was a file with write permission bit set that did not exist in the depot. However, I have since learned p4 doesn't use mode bits to track opened/unopened states as I first thought.
Next I figured this option would work:
p4 diff -sl
Every unopened file, along with the status of 'same', 'diff' or 'missing' as compared to its revision in the depot.
This would be okay, except that "unopened" does not include "untracked" files. Although, when I ran it, it produced something quite different that contradicts the documentation: it output pretty much everything that was tracked, but also everything that wasn't tracked, flagging the latter as 'same'. Maybe this means that the file hasn't been added and doesn't exist in the depot, so the client is the same as the depot...? In my SVN-biased opinion, a rather pointless option.
Then there is the 'opened' option, but it does exactly what it says: it lists all the files in the depot that have been opened on the client, not the files modified on the client but not yet added.
So is there an option I am missing somewhere, that will provide some valuable answer, like SVN and CVS are able to do with one simple command?
$ svn status
A added
M modified
D deleted
? untracked
L locked
C conflict
Or:
$ cvs -q up -Pd
Okay, looking around and playing with the 'add' command, it seems that a preview add (-n) will output a success message if the file is not currently under Perforce control:
$ p4 add -n -f somefile
//source/somefile#1 - opened for add
I applied this to the following command and pretty much get what I need:
$ find . -type f | while read f ; do p4 add -f -n "$f" | grep -e '- opened for add' >/dev/null && echo "A $f"; done
A ./somefile
Or if you're not bothered about local paths:
$ find . -type f | xargs -l1 p4 add -f -n | grep -e '- opened for add'
//source/somefile#1 - opened for add
Well, there exists "p4 status", which is very similar in both purpose and behavior to "svn status".
For more ideas, see: http://answers.perforce.com/articles/KB_Article/Working-Disconnected-From-The-Perforce-Server
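For instance, a quick way to preview which local files would be added, without opening anything (p4 status and p4 reconcile exist in newer Perforce clients; treat the exact output format as an assumption to verify):
p4 status                 # show what a reconcile would do for the whole workspace
p4 reconcile -n -a ...    # preview only the files that would be opened for add under the current directory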

How to use Rsync to copy only specific subdirectories (same names in several directories)

I have such directories structure on server 1:
data
  company1
    unique_folder1
    other_folder
    ...
  company2
    unique_folder1
    ...
  ...
And I want to duplicate this folder structure on server 2, but copy only the directories/subdirectories of unique_folder1. I.e. the result must be:
data
  company1
    unique_folder1
  company2
    unique_folder1
  ...
I know that rsync is very good for this.
I've tried 'include/exclude' options without success.
E.g. I've tried:
rsync -avzn --list-only --include '*/unique_folder1/**' --exclude '*' -e ssh user@server.com:/path/to/old/data/ /path/to/new/data/
But as a result I don't see any files/directories:
receiving file list ... done
sent 43 bytes received 21 bytes 42.67 bytes/sec
total size is 0 speedup is 0.00 (DRY RUN)
What's wrong? Ideas?
Additional information:
I have sudo access to both servers. One idea I have is to use find and cpio together to copy the content I need into a new directory and then use rsync from there (roughly sketched below). But this is very slow, as there are a lot of files.
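For completeness, the find/cpio idea mentioned above would look roughly like this on server 1 (shown only as a sketch; as noted, it is slow for large trees):
cd /path/to/old/data
find . -path '*/unique_folder1*' | cpio -pdm /path/to/new/data   # copy only the unique_folder1 trees, then rsync that directory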
I've found the reason. It wasn't clear to me that rsync works this way.
So correct command (for company1 directory only) must be:
rsync -avzn --list-only --include 'company1/' --include 'company1/unique_folder1/***' --exclude '*' -e ssh user@server.com:/path/to/old/data/ /path/to/new/data
I.e. we need to include each parent company directory. And of course we cannot write all these company directories manually on the command line, so we save the list into a file and use that.
Final things we need to do:
1. Generate an include file on server 1, so that its content is as follows (I used ls and awk; a sketch of an equivalent loop is shown after step 2):
+ company1/
+ company1/unique_folder1/***
...
+ companyN/
+ companyN/unique_folder1/***
2. Copy include.txt to server 2 and use a command like this:
rsync -avzn \
--list-only \
--include-from '/path/to/new/include.txt' \
--exclude '*' \
-e ssh user@server.com:/path/to/old/data/ \
/path/to/new/data
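As mentioned in step 1, the include file can be generated on server 1 with a short loop instead of ls and awk. A possible sketch, assuming every company directory sits directly under /path/to/old/data and the rules are written to include.txt:
cd /path/to/old/data
for d in */ ; do
    echo "+ ${d}"                      # e.g. "+ company1/"
    echo "+ ${d}unique_folder1/***"    # e.g. "+ company1/unique_folder1/***"
done > include.txt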
If the first matching pattern excludes a directory, then all its descendants will never be traversed. When you want to include a deep directory e.g. company*/unique_folder1/** but exclude everything else *, you need to tell rsync to include all its ancestors too:
rsync -r -v --dry-run \
--include='/' \
--include='/company*/' \
--include='/company*/unique_folder1/' \
--include='/company*/unique_folder1/**' \
--exclude='*'
You can use bash’s brace expansion to save some typing. After brace expansion, the following command is exactly the same as the previous one:
rsync -r -v --dry-run --include=/{,'company*/'{,unique_folder1/{,'**'}}} --exclude='*'
An alternative to Andron's answer, which is simpler to both understand and implement in many cases, is to use the --files-from=FILE option. For the current problem:
rsync -arv --files-from='list.txt' old_path/data new_path/data
Where list.txt is simply
company1/unique_folder1/
company2/unique_folder1/
...
Note the -r flag must be included explicitly since --files-from turns off this behaviour of the -a flag. It also seems to me that the path construction is different from other rsync commands, in that company1/unique_folder1/ matches but /data/company1/unique_folder1/ does not.
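The list file itself can be generated on the source side with plain shell; a minimal sketch, assuming the same layout as in the question and that list.txt is written to the current directory:
( cd old_path/data && ls -d */unique_folder1/ ) > list.txt   # one "companyN/unique_folder1/" entry per line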
For example, if you only want to sync target/classes/ and target/lib/ to a remote system, do
rsync -vaH --delete --delete-excluded --include='classes/***' --include='lib/***' \
--exclude='*' target/ user@host:/deploy/path/
The important things to watch:
Don't forget the "/" at the end of the paths, or you will get a copy into a subdirectory.
The order of the --include and --exclude options matters.
Contrary to the other answers, starting an include/exclude parameter with "/" is not needed; the patterns are automatically applied relative to the source directory (target/ in the example).
To test what exactly will happen, use the --dry-run flag, as the other answers say.
--delete-excluded will delete all content in the target directory except the subdirectories we specifically included, so it should be used wisely! For this reason a plain --delete is not enough: by default it does not delete the excluded files on the remote side (everything else, yes), so --delete-excluded has to be given in addition to the ordinary --delete.

comparing two directories with separate diff output per file

I'd need to see what has changed between two directories which contain different versions of a software's source code. While I have found a way to get a single .diff file, how can I obtain a separate file for each changed file in the two directories? I'd need this, as the "main" diff is about 6 MB and I wanted something handier.
I ran into this problem too, so I ended up with a few lines of shell script. It takes three arguments: the source and destination directories (as used for diff) and a target folder (which should exist) for the output.
It's a bit hacky, but maybe it would be useful for someone. So use with care, especially if your paths have special characters.
#!/bin/sh
DIFFARGS="-wb"
LANG=C
TARGET=$3
# Escape the slashes in the two paths so they can be used inside the sed pattern below
SRC=`echo $1 | sed -e 's/\//\\\\\\//g'`
DST=`echo $2 | sed -e 's/\//\\\\\\//g'`
if [ ! -d "$TARGET" ]; then
    echo "'$TARGET' is not a directory." >&2
    exit 1
fi
# List the differing files and strip the "Files ... and ... differ" wrapper to get each relative path
diff -rqN $DIFFARGS "$1" "$2" | sed "s/Files $SRC\/\(.*\?\) and $DST\/\(.*\?\) differ/\1/" | \
while read file
do
    if [ ! -d "$TARGET/`dirname \"$file\"`" ]; then
        mkdir -p "$TARGET/`dirname \"$file\"`"
    fi
    diff $DIFFARGS -N "$1/$file" "$2/$file" > "$TARGET"/"$file.diff"
done
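Assuming the script is saved as per-file-diff.sh (the name is just an example), usage would look something like this:
mkdir -p diffs
sh per-file-diff.sh old_src new_src diffs   # writes one <file>.diff per changed file under diffs/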
If you want to compare source code, it is better to commit it to a version control system such as SVN.
After you have done so, do a diff of your uploaded code and pipe it to file.diff:
svn diff --old svn:url1 --new svn:url2 > file.diff
A bash for loop will work for you. The following will diff two directories with C source code and produce a separate diff for each file.
for FILE in $(find <FIRST_DIR> -name '*.[ch]'); do DIFF=<DIFF_DIR>/$(echo $FILE | grep -o '[-_a-zA-Z0-9.]*$').diff; diff -u "$FILE" <SECOND_DIR>/"${FILE#<FIRST_DIR>/}" > "$DIFF"; done
Use the correct patch level (the -p option to patch) for the lines starting with +++ when applying the resulting diffs.

Is there an easy way to revert an entire P4 changelist?

Let's say I checked in a changelist (in Perforce) with lots of files and I'd like to revert the entire changelist. Is there an easy way to "revert" the entire changelist in one fell swoop?
Currently I do something like this for each file in the changelist:
p4 sync //path/to/file#n (where "n" is the previous version of the file)
cp file file#n
p4 sync //path/to/file
p4 edit //path/to/file
cp file#n file
rm file#n
As you can imagine, this is quite cumbersome for a large changelist.
The posted answers are correct, but note also that there is now an actual menu option in P4V to do this for you. It's in the latest 2008.2 Beta, and so should be officially released in the next week or three.
This link gives details.
It should be a lot simpler to use than the earlier answers, but I've not had the opportunity to try it myself yet.
Update: This has now been fully released. See Perforce downloads.
This looks interesting. I haven't tried it personally.
The official answer from Perforce is at http://kb.perforce.com/UserTasks/ManagingFile..Changelists/RevertingSub..Changelists but the procedure is not all that much easier than the one you suggest. The script suggested by @ya23 looks better.
For some reason, the awk step does not work for me. I'm running from a Windows environment with emulated Unix command line tools. However, the following does work:
p4 describe -s [changelist_number] | grep // | sed "s/\.\.\. //" | sed "s/#.*//" | p4 -ztag -x - where | grep "... path " | sed "s/\.\.\. path //"
Here are possible locations to get Unix command line tools in a Windows environment:
http://sourceforge.net/projects/getgnuwin32/?source=typ_redirect
http://unxutils.sourceforge.net/
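If you prefer the command line, the manual per-file procedure from the question can also be wrapped in a loop. This is only a rough, untested sketch: the individual commands (p4 describe, sync, edit, resolve -ay) are standard Perforce, but treat the parsing details as assumptions to adapt:
#!/bin/bash
set -e
CHANGE=$1
# For every depot file touched by the submitted changelist...
p4 describe -s "$CHANGE" | grep '^\.\.\. //' | sed -e 's/^\.\.\. //' -e 's/#.*//' |
while read -r depot_file; do
    p4 sync "$depot_file@$((CHANGE - 1))"   # fetch the content as it was just before the changelist
    p4 edit "$depot_file"                   # open it for edit at that older revision
    p4 sync "$depot_file"                   # sync back to head; this schedules a resolve
    p4 resolve -ay "$depot_file"            # "accept yours": keep the old content in the workspace
done
# Note: files that were added (not edited) in the changelist need "p4 delete" instead.
# Review with "p4 diff" and submit the resulting back-out changelist.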
I have the same problem when I want to delete an entire changelist, so I use the following script (note that it also deletes the changelist's shelf and the changelist itself; if you only want to revert, copy the relevant lines).
Also, make sure the sed applies to your version of p4.
#!/bin/bash
set -e
if [[ $# -ne 1 ]]; then
    echo "usage: $(basename $0) changelist"
    exit 1
fi
CHANGELIST=$1
# make sure the changelist exists
p4 describe -s $CHANGELIST > /dev/null # set -e will exit automatically if this fails
p4 shelve -d -c $CHANGELIST 2> /dev/null || true # the changelist may have nothing shelved
files_to_revert=$(p4 opened 2> /dev/null | grep "change $CHANGELIST" | sed "s/#.*//g")
if [[ -n "$files_to_revert" ]]; then
    p4 revert $files_to_revert
fi
p4 change -d $CHANGELIST
The problem starts when you want to revert an entire changelist (as a bulk) that you've just submitted, and you need to start reverting files to #n-1 one by one, fast (because it's production)...
I wanted to support @ya23's answer - the link to a Python script - as it's really easy to use (and really easy to miss his comment).
You give it the revision you want to roll back, and it prepares everything automatically (each file's #n-1, the merging, and everything)... you just submit.

How do I get a list of commit comments from CVS since last tagged version?

I have made a bunch of changes to a number of files in a project. Every commit (usually at the file level) was accompanied by a comment of what was changed.
Is there a way to get a list from CVS of these comments on changes since the last tagged version?
Bonus if I can do this via the eclipse CVS plugin.
UPDATE: I'd love to accept an answer here, but unfortunately none of the answers are what I am looking for. Frankly, I don't think it is actually possible, which is a pity really, as this could be a great way to create a change list between versions (assuming all commits are made at a sensible granularity and contain meaningful comments).
I think
cvs -q log -SN -rtag1:::tag2
or
cvs -q log -SN -d "fromdate<todate"
will do what you want. This lists all the versions and comments for all changes made between the two tags or dates, only for files that have changed. In the tag case, the three colons exclude the comments for the first tag. See cvs -H log for more information.
The options for the cvs log command are available here. Specifically, to get all the commits since a specific tag (let's call it VERSION_1_0):
cvs log -rVERSION_1_0:
If your goal is to have a command that works without having to know the name of the last tag, I believe you will need to write a script that grabs the log for the current branch, parses through it to find the tag, and then issues the log command against that tag. But I migrated everything off of CVS quite a while ago, so my memory might be a bit rusty.
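As a starting point, such a script might parse the "symbolic names:" section of cvs log -h on one representative file to find the most recent tag. A rough, untested sketch (it assumes tags are listed newest first and that ChangeLog is a file present in the module):
# Grab the first tag listed for ChangeLog, then log everything committed since it
LAST_TAG=$(cvs log -h ChangeLog | sed -n '/^symbolic names:/,/^keyword substitution:/p' \
    | sed -n '2p' | awk -F: '{gsub(/[ \t]/, "", $1); print $1}')
cvs -q log -NS -r"$LAST_TAG"::HEAD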
If you want to get a quick result on a single file, the cvs log command is good. If you want something more comprehensive, the best tool I've found for this is a perl script called cvs2cl.pl. This can generate a change list in several different formats. It has many different options, but I've used the tag-to-tag options like this:
cvs2cl.pl --delta dev_release_1_2_3:dev_release_1_6_8
or
cvs2cl.pl --delta dev_release_1_2_3:HEAD
I have also done comparisons using dates with the same tool.
I know you have already "solved" your problem, but I had the same problem and here is how I quickly got all of the comments out of cvs from a given revision until the latest:
$ mkdir ~/repo
$ cd ~/repo
$ mkdir cvs
$ cd cvs
$ scp -pr geek@avoid.cvs.org:/cvs/CVSROOT .
$ mkdir -p my/favorite
$ cd my/favorite
$ scp -pr geek@avoid.cvs.org:/cvs/my/favorite/project .
$ cd ~/repo
$ mkdir -p ~/repo/svn/my/favorite/project
$ cvs2svn -s ~/repo/svn/my/favorite/project/src ~/repo/cvs/my/favorite/project/src
$ mkdir ~/work
$ cd ~/work
$ svn checkout file:///home/geek/repo/svn/my/favorite/project/src/trunk ./src
$ cd src
$ # get the comments made from revision 5 until today
$ svn log -r 5:HEAD
$ # get the comments made from 2010-07-03 until today
$ svn log -r {2010-07-03}:HEAD
The basic idea is to just use svn or git instead of cvs :-)
And that can be done by converting the cvs repo to svn or git using cvs2svn or cvs2git, which we should be doing anyway. It got me my answer within about three minutes because I had a small repository.
Hope that helps.
Something like this
cvs -q log -NS -rVERSION_3_0::HEAD
Where you probably want to pipe the output into egrep to filter out the stuff you don't want to see. I've used this:
cvs -q log -NS -rVERSION_3_0::HEAD | egrep -v "RCS file: |revision |date:|Working file:|head:|branch:|locks:|access list:|keyword substitution:|total revisions: |============|-------------"