Comparing two tar.gz - diff

I have two tar.gz files downloaded from the same URL, one using Safari (53.6 MB) and the other through Chrome (56.9 MB). Extracting them produces two exactly alike 56 MB folders: same sizes, same contents. I compared the extracted folders with diff and found no difference. I also compared both tar.gz files with the following command, but again found no difference:
diff <(tar -tvf fileone.tar.gz | sort) <(tar -tvf filetwo.tar.gz | sort)
Now, is there any other way I could see what the difference between the two archives is?
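One thing the tar listings won't show is the gzip layer itself: gzip stores its own metadata (timestamp, compression-level flag) in the file header, so two archives with identical contents can still differ byte-for-byte. A minimal sketch of comparing both layers, using stand-in files since the original downloads aren't available here:

```shell
# Build two gzips of the same tar at different compression levels,
# mimicking two downloads whose contents match but whose bytes don't
printf 'same content\n' > payload.txt
tar -cf payload.tar payload.txt
gzip -9 -c payload.tar > fileone.tar.gz
gzip -1 -c payload.tar > filetwo.tar.gz

# Decompress both and compare the raw streams
gzip -dc fileone.tar.gz > one.stream
gzip -dc filetwo.tar.gz > two.stream

cmp -s fileone.tar.gz filetwo.tar.gz || echo "compressed bytes differ"
cmp -s one.stream two.stream && echo "contents identical"
```

If `cmp` on the decompressed streams reports no difference, the discrepancy is confined to gzip framing (header metadata, compression level, trailing padding a browser may have left) rather than the archived data itself.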

Related

Downloaded Eclipse Mars 1 checksums don't match

I downloaded Eclipse Mars 1 from a mirror chosen for me on the Eclipse website. But the md5sum on the downloaded file is 929f821dc83eaac83fc6320b291dcb7f while on the Eclipse website it's been given as 72a722a59a43e8ed6c47ae279fb3d355. This struck me as odd because it's the first time I'm experiencing something like this -- I've made like ~20 such downloads, different files, different servers, since I started learning how to code -- and all of them matched the checksums on their respective servers.
So I decided to get a fresh download from a different mirror. I chose https://spring.io/tools/eclipse. And the thing is, I can't find any checksums around there, but the md5sum of their file is a562f87ddf353dd8519edfc072d4c67d. I'm confused. I'm under the impression that regardless of the mirror, the file hash should match the hash given on the Eclipse website: 72a722a59a43e8ed6c47ae279fb3d355.
Either the md5sum was not updated correctly on the website, or a hiccup in your Internet connection corrupted the file during download, resulting in the mismatch you're seeing.
Another, rarer case is that you made a mistake in choosing the checksum type: for example, I once copied the sha1 hash instead of the md5.
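One quick sanity check for that last case: an MD5 digest is 32 hex characters while SHA-1 is 40, so the length alone tells you which one you copied from the download page. A small sketch with a stand-in file (the real Eclipse download isn't reproduced here):

```shell
# Hypothetical stand-in for the downloaded archive
printf 'stand-in download\n' > eclipse-demo.tar.gz

# MD5 prints 32 hex chars, SHA-1 prints 40 -- print the lengths to see
# at a glance which kind of checksum you are holding
md5sum  eclipse-demo.tar.gz | awk '{print length($1), $1}'
sha1sum eclipse-demo.tar.gz | awk '{print length($1), $1}'
```

If the hash you copied from the website is 40 characters long, you were comparing an md5sum against a sha1sum, and a mismatch is expected.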

Has anybody ever successfully downloaded the Cities project in netlogo

As shown in the following link, the developer says that the whole project's code is included in the zip archive, but after I downloaded and decompressed it, I can't find the file Cities.nlogo, which is the main procedure for the entire simulation project. Has anyone else encountered the same problem, and how did you solve it?
http://ccl.northwestern.edu/cities/citiesmodel.shtml#download
I tried this just now:
% curl -OsS 'http://ccl.northwestern.edu/cities/cities.zip'
% unzip -l cities.zip | grep Cities.nlogo
error [cities.zip]: missing 2 bytes in zipfile
(attempting to process anyway)
36365 09-26-07 15:21 cities/Cities.nlogo
As you can see, there is definitely a Cities.nlogo file in there, but it also appears that the zip file is slightly corrupted ("missing 2 bytes in zipfile"). You might try extracting the archive with a different program; it extracted successfully for me using Archive Utility on Mac OS X.

How to debug slow Ember CLI/Broccoli builds

My Ember CLI project is currently taking 8-9 seconds to build, and I'd like to understand why. The project is not that large (~180 files under app/ including hbs and scss).
Here's my brocfile: https://gist.github.com/samselikoff/874c90758bb2ce0bb210
However, even if I comment my entire Brocfile out and export just the app variable, the build still takes 5-6 seconds.
I'm not quite sure how to debug this. Here are my slowest-tree logs:
Build successful - 8874ms.
Slowest Trees | Total
-------------------------------+----------------
TreeMerger (appAndDependencies)| 1286ms
TreeMerger (vendor) | 1275ms
CompassCompiler | 1204ms
StaticCompiler | 1185ms
TreeMerger (stylesAndVendor) | 1151ms
TreeMerger (allTrees) | 706ms
StaticCompiler | 625ms
UPDATE: If you are using ember-cli version 0.1.0 or newer, this hack is probably not necessary: ember-cli now symlinks files instead of copying them. You may still get a performance improvement on Windows or on slow disks.
Broccoli (used by ember-cli) stores its temporary state in the file system, thus it's very file I/O dependent. Try to reduce the number of files in your public/, vendor/ and bower_components/ directories. All files inside these folders will be copied at least once per rebuild cycle. The size and number of files in the folders affects performance greatly.
Essentially, every time you change a file, Broccoli copies files between the many directories inside <ember app>/tmp/. In the case of your bower_components/ dir, it appears to be copying every single file more than once. It needs to do this because you might use app.import('some.js') in your Brocfile.js, or @import "some.scss" in your SASS/LESS files. There is no way to know which files you actually need, so it copies all of them.
If you remove the files that you do not need from bower_components/ and vendor/, you will notice better build times.
A real world example
If you install the highcharts.com#3.0.5 bower dependency, you also get a special gift of 2829 files (198MB) in your bower_components/ dir. Imagine the unnecessary file system reads and copies that are happening there.
Here is a snippet of my cleaned dir structure:
$ find bower_components -type f | grep highcharts
bower_components/highcharts.com/js/highcharts-more.src.js
bower_components/highcharts.com/js/highcharts.src.js
Notice that only the .js files remain; I removed everything else, 2827 files in total. Highcharts is an extreme example, but most of your dependencies ship five times as many files as you actually need.
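The pruning itself can be scripted with find rather than done by hand. A hedged sketch using a mocked-up bower_components/ layout (the file names below are illustrative, not the real Highcharts contents):

```shell
# Mock up a dependency that ships far more than its .js builds
mkdir -p bower_components/highcharts.com/js bower_components/highcharts.com/examples
touch bower_components/highcharts.com/js/highcharts.src.js
touch bower_components/highcharts.com/js/highcharts-more.src.js
touch bower_components/highcharts.com/examples/demo.htm
touch bower_components/highcharts.com/readme.md

# Delete everything that is not a .js file
find bower_components/highcharts.com -type f ! -name '*.js' -delete

# Only the .js builds remain
find bower_components/highcharts.com -type f | sort
```

Run against the real 2829-file Highcharts checkout, the same one-liner would do the cleanup described above in one pass.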
Positive future ahead
The ember-cli team is hard at work improving the performance of the underlying Broccoli ecosystem. Work has already begun, and some real-world apps (with large trees) are seeing rebuild times drop from 4 seconds to 600 ms. Using symlinks instead of copying is showing drastic improvements.
For those of us with large-scale apps, lots of bower deps, and many crying team members, who need a solution now:
A temporary solution
One way of keeping your bower_components/ clean is to check the dependencies into version control. This allows you to use git clean to prune your directory with ease:
bower install --save d3
git add --force bower_components/d3/d3.js # force, because bower_components/ is gitignored
git commit -m "Added d3.js"
// Brocfile.js
app.import('bower_components/d3/d3.js');
Every time you run bower install, you will likely get all the extra cruft back in your dir. git clean easily removes files that are not under version control:
git clean -f -d -x bower_components/
ember serve
After doing this, it took a single rebuild (time to build after changing a file) from 20 seconds down to 3.5 seconds (we have a pretty large app).
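The whole workflow above can be verified locally in a scratch repository; a minimal sketch (repository name, paths, and file contents are made up):

```shell
# Fresh repo with a gitignored bower_components/ directory
git init -q cleandemo
mkdir cleandemo/bower_components
echo 'bower_components/' > cleandemo/.gitignore
echo '// d3 build' > cleandemo/bower_components/d3.js

# Force-add the one file we import, despite the gitignore
git -C cleandemo add .gitignore
git -C cleandemo add -f bower_components/d3.js
git -C cleandemo -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Track imported deps"

# Simulate the cruft a fresh `bower install` would bring back
echo 'cruft' > cleandemo/bower_components/extra.txt

# git clean removes untracked (even ignored, thanks to -x) files,
# but leaves tracked ones alone
git -C cleandemo clean -f -d -x bower_components/
ls cleandemo/bower_components/
```

After the clean, only the force-added d3.js survives; the simulated cruft is gone, which is exactly the pruning effect described above.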
If you do go down this path, don't forget the bower deps needed by Ember:
bower_components/ember/ember.js
bower_components/ember/ember.prod.js
bower_components/ember-cli-shims/app-shims.js
bower_components/ember-cli-test-loader/test-loader.js
bower_components/ember-data/ember-data.js
bower_components/ember-data/ember-data.prod.js
bower_components/ember-load-initializers/ember-load-initializers.js
bower_components/ember-resolver/dist/modules/ember-resolver.js
bower_components/jquery/dist/jquery.js
bower_components/loader/loader.js
bower_components/handlebars/handlebars.js
bower_components/handlebars/handlebars.runtime.js
Here's the git command for you:
bower install
git add -f \
  bower_components/ember/ember.js \
  bower_components/ember/ember.prod.js \
  bower_components/ember-cli-shims/app-shims.js \
  bower_components/ember-cli-test-loader/test-loader.js \
  bower_components/ember-data/ember-data.js \
  bower_components/ember-data/ember-data.prod.js \
  bower_components/ember-load-initializers/ember-load-initializers.js \
  bower_components/ember-resolver/dist/modules/ember-resolver.js \
  bower_components/jquery/dist/jquery.js \
  bower_components/loader/loader.js \
  bower_components/handlebars/handlebars.js \
  bower_components/handlebars/handlebars.runtime.js
git commit -m "Added ember-cli dependencies"
git clean -f -d -x bower_components/
In addition to @tstirrat's answer:
On Windows, only an elevated process can create symlinks by default (the SeCreateSymbolicLinkPrivilege setting), so try running cmd (or PowerShell) as administrator and see if that helps.
You can also grant the privilege to a specific user using the Local Security Policy configuration.
Answer taken from here

wget download and rename files that originally have no file extension

I have a wget download I'm trying to perform.
It downloads several thousand files, unless I start to restrict the file type (junk files, etc.). In theory, restricting the file type is fine.
However, there are lots of files that wget downloads without a file extension which, when opened manually (with Adobe Reader, for example), turn out to be PDFs. These are exactly the files I want.
Restricting wget to the PDF file type does not download these files.
So far my syntax is wget -r --no-parent -A.pdf www.websitehere.com.
Using wget -r --no-parent www.websitehere.com brings me every file type, so in theory I have everything. But this means I have thousands of junk files to remove, and then several hundred useful files of unknown type to rename.
Any ideas on how to make wget save the files with the appropriate file extension?
Alternatively, is there a way to restrict wget to only files without a file extension, combined with a separate batch method to determine each file's type and rename it appropriately?
Manually testing every file to determine the appropriate application would take a lot of time.
I'd appreciate any help!
wget has an --adjust-extension option, which will add the correct extensions to HTML and CSS files. It may not work for other files (like PDFs), though. See the wget documentation for the full details.
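For the extensionless PDFs specifically, a post-download pass with the `file` utility can detect each file's real type and rename accordingly. A hedged sketch on stand-in files (the directory and file names are hypothetical, not from the actual site):

```shell
# Stand-ins for downloaded files: one with a PDF signature, one plain text
mkdir -p site
printf '%%PDF-1.4\nfake body\n' > site/report
printf 'just notes\n' > site/notes

for f in site/*; do
  case "${f##*/}" in *.*) continue ;; esac   # skip files that already have an extension
  if file -b "$f" | grep -qi pdf; then
    mv "$f" "$f.pdf"                          # rename detected PDFs
  fi
done
ls site/
```

The same loop generalizes to other types by matching different `file` output (e.g. "JPEG", "Zip archive") and appending the matching extension.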

how to reduce GWT war file size

I am using ExtGWT. My application has 5 modules, and all five modules are compiled and placed in the war folder, but the resources folder is common to every module. My intention is to keep the resources folder shared between modules, so that the size of the generated war can be decreased. Please suggest how to do this.
Thanks,
David
Perhaps not exactly what you are asking for, but I guess you don't want to upload everything every time, since the amount of data is quite large.
I do it this way:
- DON'T create a war file.
- Simply use rsync to incrementally deploy the contents of the war directory of your GWT project, like this:
rsync -avc --compress --progress --delete --rsh='ssh' --cvs-exclude ./war root@serverip:/usr/share/tomcat7/webapps/ROOT/
So only newer files get uploaded to the server, and old files that are no longer used get deleted from the server.
Hope this helped you.