I am trying to download all files starting with traceroute from https://data-store.ripe.net/datasets/atlas-daily-dumps/ via wget.
I am running the following command:
wget -A traceroute* -m -np https://data-store.ripe.net/datasets/atlas-daily-dumps/ --no-check-certificate
It creates the directories, fetches the index.html files, and then stops within about 5 minutes without downloading any traceroute files.
When I try another type of file via
wget -A connection* -m -np https://data-store.ripe.net/datasets/atlas-daily-dumps/ --no-check-certificate
it downloads the connection files without a problem. What could be the issue?
You probably have a local file that matches the glob traceroute*; you need to put single quotes around it so the shell doesn't expand it before wget sees it:
wget -A 'traceroute*' -m -np https://data-store.ripe.net/datasets/atlas-daily-dumps/ --no-check-certificate
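A quick way to see the shell expansion at work (a hedged illustration; the .bz2 file name below is made up):
touch traceroute-20240101.bz2    # any local file that happens to match the glob
echo wget -A traceroute* -m -np https://data-store.ripe.net/datasets/atlas-daily-dumps/
The echo output shows that the shell has rewritten traceroute* into the local file name before wget ever runs, so the accept list no longer contains the pattern; with single quotes the pattern reaches wget intact.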
Specifying traceroute*.bz2 seems to have fixed the problem.
Related
I used
wget --mirror --convert-links http://example.com/ 2>&1 | tee -a wget.log
to download a website. It turns out that only some of the links were converted. How can I have all of the links converted, even after the download? I do not want to download all of the contents again.
Firstly, please be aware that --convert-links does its job after everything has been downloaded, so if you inspect a downloaded file before wget has finished you may still see unconverted links.
I do not want to download all of the contents again.
Then you should use --no-clobber. However, according to the man page, --mirror is equivalent to -r -N -l inf --no-remove-listing, and --no-clobber and -N are mutually exclusive. Therefore you should not use --mirror itself but its component options, excluding -N. Taking this into account, your command should look like this:
wget -r --no-clobber -l inf --no-remove-listing --convert-links http://example.com/
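As a quick sanity check of that conflict, asking wget for both options at once should make it refuse to start (example.com is just a placeholder):
wget -N --no-clobber http://example.com/    # should exit immediately with an error that timestamping and no-clobber can't be combined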
I'm trying to semi-mirror a site. What I want is to download all of the MP3s and make sure I'm not re-downloading the ones I already have (hence the "mirror" part). I've typed in the following:
wget -m -nd -e robots=off --random-wait -A "*.mp3" -P FOLDER http://www.example.com/
And it downloads all the MP3s on the current page, but it never follows the links to the "Next Page" or the like. I've replaced -m with -N -c -r without success. What other options can I use?
Try:
wget --execute robots=off --recursive --accept mp3,MP3 --random-wait --no-parent --continue --no-clobber http://site.com/
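If the original command's flat layout and target folder matter, a hedged sketch combining those options with the question's -nd and -P (FOLDER and www.example.com remain placeholders):
wget -e robots=off -r -l inf -np -nc --random-wait -nd -A mp3,MP3 -P FOLDER http://www.example.com/    # -nc skips MP3s already on disk, -r -l inf keeps following the listing pages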
Here is my wget command, where I am trying to rename the file I am downloading. I am using the -O option here, but somehow it is not working.
access="http://mvn:8081/nexus/content/com/mvn/"
wget -r -np -nd -l1 -O "access.war" "$access" -A "com.infa.products.ldm.ingestion.access.web-"$n"-.-1-ldm-access-web.war"
Here I am renaming it to access.war. I can only use wget for this job due to some restrictions.
Thanks for the help.
The option -A is "comma separated", but you are using dots to separate the extensions!
Instead of
-A "com.infa.products.ldm.ingestion.access.web-"$n"-.-1-ldm-access-web.war"
Try
-A "com,infa,products,ldm,ingestion,access,web-"$n"-,-1-ldm-access-web,war"
If this is not the solution to your problem, I suggest you simplify your wget call down to something like this:
wget -r -np -nd -l1 -O "access.war" "$access"
Just to verify that all else is working.
Or even better (to get fewer files)
wget -r -np -nd -l1 -O "access.war" "$access" -A "war"
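One more thing worth keeping in mind: with -r, the -O option does not rename the matched file; it concatenates everything wget retrieves (index pages included) into access.war, which may itself be why the result looks wrong. If renaming outside of wget is allowed, a hedged sketch is to drop -O and rename afterwards (the accept pattern is copied from the question unchanged, and the glob assumes exactly one matching .war was saved):
wget -r -np -nd -l1 "$access" -A "com.infa.products.ldm.ingestion.access.web-"$n"-.-1-ldm-access-web.war"
mv ./*-ldm-access-web.war access.war    # assumes a single matching file was downloaded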
I am trying to download this site with this command:
wget -r -l1 -H -t1 -nd -N -np -A.mp3 -erobots=off tenshi.spb.ru/anime-ost/
But I only get the index and the first folder; it never goes into the subfolders. Can anyone help?
I use this command to download sites including their subfolders:
wget --mirror -p --convert-links -P . [site address]
A little explanation:
--mirror is a shortcut for -N -r -l inf --no-remove-listing (the expanded command is shown after this list).
--convert-links makes links in downloaded HTML or CSS point to local files
-p allows you to get all images, etc. needed to display HTML pages
-P specifies the next argument is the directory the files will be saved to
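Putting the shortcut together with the other flags, the same command written out in full would be:
wget -N -r -l inf --no-remove-listing -p --convert-links -P . [site address]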
I found the command at:
http://www.thegeekstuff.com/2009/09/the-ultimate-wget-download-guide-with-15-awesome-examples/
You use -l 1, also known as --level=1, which limits recursion to one level. Set it to a higher level to download more pages. By the way, I like long options like --level because it's easier to see what you are doing without going back to the man pages.
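For example, keeping the rest of the original command but allowing two levels (note that -H lets wget span hosts, so deeper recursion can wander off-site):
wget -r -l 2 -H -t1 -nd -N -np -A.mp3 -erobots=off tenshi.spb.ru/anime-ost/    # same flags as the question, only the depth raised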
What's the best way of updating data files from a website that has moved to a new domain, with changes in its folder structure?
The old URL for example is http://folder.old-domain.com while the new URL is http://new-domain.com/directory1/directory2. My data is stored locally in ~/Data_Backup/folder.old-domain.com folder.
Data was originally downloaded using:
$ wget -S -t 0 -c --mirror -w 2 -k http://folder.old-domain.com
I was thinking of using mv to rename the old folder to follow the new URL pattern, but is there a better way of doing this?
Will this work? I'm not particular about the directory structure; what's important is to update the contents of the target folder (and its sub-folders).
$ wget -S -t 0 -c -m -w 2 -k -N -np -P ~/Data_Backup/folder.old-domain.com http://new-domain.com/directory/directory
Thanks in advance.
Got it!
I need to add the following options:
-nH --cut-dirs=2
and now it works.
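For reference: -nH stops wget from creating a new-domain.com host directory, and --cut-dirs=2 strips the two leading path components (directory1/directory2), so the updated files land directly in the old backup folder. A sketch of the full command with those options folded in (paths and URLs are the question's placeholders):
$ wget -S -t 0 -c -m -w 2 -k -N -np -nH --cut-dirs=2 -P ~/Data_Backup/folder.old-domain.com http://new-domain.com/directory1/directory2/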