wget corrupted .zip file error - wget

I am using wget to try and download two .zip files (SWVF_1_44.zip and SWVF_44_88.zip) from this site: http://www2.sos.state.oh.us/pls/voter/f?p=111:1:0::NO:RP:P1_TYPE:STATE
When I run:
wget -r -l1 -H -t1 -nd -N -np -A.zip -erobots=off "http://www2.sos.state.oh.us/pls/voter/f?p=111:1:0::NO:RP:P1_TYPE:STATE/SWVF_1_44.zip"
I get a downloaded zip file that has a screwed up name (f#p=111%3A1%3A0%3A%3ANO%3ARP%3AP1_TYPE%3ASTATE%2FSWVF_1_44) and it cannot be opened.
Any thoughts on where my code is wrong?

There's nothing "wrong" with your code. Wget simply assumes you want to save the file under the name that appears in the URL. Use the -O option to specify an output file:
wget blahblahblah -O useablefilename.zip
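As a rough sketch of that advice, you could take the text after the last "/" of the URL as the local name and hand it to -O. The helper names url_basename and fetch_as are made up for illustration and are not part of wget. Quoting the URL keeps the shell from treating "?" as a wildcard. Whether that URL really serves the archive is a separate question this sketch does not answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): print everything after the last "/".
url_basename() {
    printf '%s\n' "${1##*/}"
}

# Hypothetical wrapper: download the URL in $1 and save it under the name in $2.
fetch_as() {
    wget -O "$2" "$1"
}

url='http://www2.sos.state.oh.us/pls/voter/f?p=111:1:0::NO:RP:P1_TYPE:STATE/SWVF_1_44.zip'
url_basename "$url"    # prints SWVF_1_44.zip
# fetch_as "$url" "$(url_basename "$url")"    # uncomment to actually download
```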

Related

Retrieve file path downloaded via wget

I am downloading files with the wget command:
wget abc.com -nH -r -l1 --no-parent
This stores the files in different subfolders. I want the path of each downloaded file. How do I get it?
Example:
wget is downloading file to:
c:/test/com/test/package/filename1.text
c:/test/com1/test/package1/filename2.text
So, how do I retrieve the complete file path, i.e. com/test/package/filename1.text?
Thanks
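One possible approach, sketched on the assumption that a POSIX shell is available around wget (for example Cygwin or MSYS on Windows): create a marker file before the download, then list every file newer than the marker, relative to the download directory. The function name list_downloaded and the marker-file idea are illustrative and are not wget features.

```shell
#!/bin/sh
# Hypothetical helper: print paths, relative to the download directory, of
# all files modified after the marker file.
# $1 = directory wget wrote into, $2 = marker file created before the download
list_downloaded() {
    dir=$1
    marker=$2
    find "$dir" -type f -newer "$marker" | while IFS= read -r path; do
        # Strip the leading "<dir>/" so only the relative path remains.
        printf '%s\n' "${path#"$dir"/}"
    done
}

# Possible usage (assumed layout, marker kept outside the download directory):
#   touch /tmp/marker
#   wget abc.com -nH -r -l1 --no-parent -P /tmp/dl
#   list_downloaded /tmp/dl /tmp/marker
```

File names containing newlines would break the loop; for ordinary names it prints one relative path per line.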

wget download only the sub directory

I need to download only the subdirectory named pyVim with all its content, but I
am getting the parent directories as well, even though I tried the following options:
wget -r --no-parent http://server/pub/scripts/pyVim
Result: I get the server directory with its subdirectories.
I also tried:
wget -r -X pub,scripts --no-parent http://server/pub/scripts/pyVim
I tried a few more options, but none of them worked.
I just need to download the pyVim directory with its content to the current directory.
You said pyVim is a directory, but the URL you passed to wget indicates that pyVim is a file in the directory scripts.
To tell wget explicitly that pyVim is a directory, add a trailing /. So your final command is:
wget -r --no-parent http://server/pub/scripts/pyVim/
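A small sketch of the same idea, in case it helps: the hypothetical helper ensure_dir_url appends the trailing slash when it is missing, and the hypothetical wrapper fetch_subdir adds -nH and --cut-dirs=2 so that the local tree starts at pyVim/ instead of server/pub/scripts/pyVim/. The value 2 is an assumption that matches this URL, where pub/ and scripts/ come before pyVim.

```shell
#!/bin/sh
# Hypothetical helper: append "/" unless the URL already ends with one.
ensure_dir_url() {
    case $1 in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# Hypothetical wrapper: fetch a directory recursively without climbing to its parents.
# -nH drops the "server/" directory; --cut-dirs=2 drops "pub/scripts/".
fetch_subdir() {
    wget -r --no-parent -nH --cut-dirs=2 "$(ensure_dir_url "$1")"
}

ensure_dir_url http://server/pub/scripts/pyVim    # prints http://server/pub/scripts/pyVim/
# fetch_subdir http://server/pub/scripts/pyVim    # uncomment to run the download
```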

Using wget to download multiple urls.

I have a file, urls.txt, that contains multiple URLs.
I am using wget to download the web content like this:
wget -i urls.txt
The web content is saved in a separate file for each link. I want to save everything in a single .txt file.
This will store the request details (wget's log messages) in messages.txt and all of the downloaded content in html.txt:
$ wget -a messages.txt -i urls.txt -O html.txt
From wget --help:
Logging and input file:
-a, --append-output=FILE append messages to FILE
Download:
-O, --output-document=FILE write documents to FILE
Tested on GNU Wget 1.19.1 built on darwin16.6.0.
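If you also want to see where each document starts inside the combined file, a loop is one option. The sketch below writes a header line before each document and uses "-O -", which makes wget write the document to standard output. The names fetch and combine_urls and the FETCH variable are hypothetical. FETCH is only a hook for trying the loop offline with another command.

```shell
#!/bin/sh
# Fetch one URL to standard output. FETCH is a hypothetical override hook;
# by default the command is "wget -q -O -".
fetch() {
    ${FETCH:-wget -q -O -} "$1"
}

# $1 = file with one URL per line, $2 = combined output file
combine_urls() {
    : > "$2"                                   # start with an empty output file
    while IFS= read -r url; do
        [ -n "$url" ] || continue              # skip blank lines
        printf '==> %s <==\n' "$url" >> "$2"   # header marking the source URL
        fetch "$url" >> "$2"
    done < "$1"
}

# Possible usage:
#   combine_urls urls.txt html.txt
```

Compared with a single "wget -i urls.txt -O html.txt" call, this starts one wget process per URL, which is slower but keeps the sources distinguishable.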

Renaming a downloaded file using wget

Here is my wget command, with which I am trying to rename the file I am downloading. I am using the -O option, but it is not working.
access="http://mvn:8081/nexus/content/com/mvn/"
wget -r -np -nd -l1 -O "access.war" "$access" -A "com.infa.products.ldm.ingestion.access.web-"$n"-.-1-ldm-access-web.war"
Here I am renaming it to access.war. I can only use wget for this job due to some restrictions.
Thanks for the help.
The option -A is "comma separated", but you are using dots to separate the extensions!
Instead of
-A "com.infa.products.ldm.ingestion.access.web-"$n"-.-1-ldm-access-web.war"
Try
-A "com,infa,products,ldm,ingestion,access,web-"$n"-,-1-ldm-access-web,war"
If this does not solve your problem, I suggest you simplify your wget call to something like this:
wget -r -np -nd -l1 -O "access.war" "$access"
Just to verify that everything else is working.
Or even better (to get fewer files):
wget -r -np -nd -l1 -O "access.war" "$access" -A "war"
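One thing to keep in mind: according to the wget manual, combining -O with -r places all downloaded content into the single file named by -O, so the result is only usable if exactly one file is retrieved. An alternative sketch, assuming a shell with find and mv is acceptable next to wget (the question says only wget may do the download itself), is to download under the original name and rename afterwards. The function name rename_single_match and the downloads directory are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rename the single file in a directory that matches a glob.
# $1 = directory wget downloaded into, $2 = glob pattern, $3 = new file name
rename_single_match() {
    dir=$1
    pattern=$2
    target=$3
    # Arithmetic expansion strips any padding that wc may add.
    count=$(( $(find "$dir" -type f -name "$pattern" | wc -l) ))
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one match, found $count" >&2
        return 1
    fi
    mv "$(find "$dir" -type f -name "$pattern")" "$target"
}

# Possible usage (pattern and directory are assumptions):
#   wget -r -np -nd -l1 -P downloads -A '*-ldm-access-web.war' "$access"
#   rename_single_match downloads '*-ldm-access-web.war' access.war
```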

updating data from different URL using wget

What's the best way to update data files from a website that has moved to a new domain and changed its folder structure?
The old URL, for example, is http://folder.old-domain.com while the new URL is http://new-domain.com/directory1/directory2. My data is stored locally in the ~/Data_Backup/folder.old-domain.com folder.
Data was originally downloaded using:
$ wget -S -t 0 -c --mirror -w 2 -k http://folder.old-domain.com
I was thinking of using mv to rename the old folder to follow the new URL pattern, but is there a better way of doing this?
Will this work? I'm not particular about the directory structure. What's important is to update the contents of the target folder (and its subfolders).
$ wget -S -t 0 -c -m -w 2 -k -N -np -P ~/Data_Backup/folder.old-domain.com http://new-domain.com/directory/directory
Thanks in advance.
Got it!
I needed to add the following options:
-nH --cut-dirs=2
and now it works.
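For anyone wondering what those two options do to the local paths, here is a simplified model (not wget's own code): -nH drops the host-name directory, and --cut-dirs=2 drops the first two remote directory components, so files from the new location land directly under the -P folder. The function name local_path and the example file name are hypothetical, and the URL is assumed to contain a path after the host.

```shell
#!/bin/sh
# Simplified model of the local name wget would use with -nH and --cut-dirs=N.
# $1 = URL (assumed to contain a path after the host), $2 = N
local_path() {
    path=${1#*://}           # drop the scheme, e.g. "http://"
    path=${path#*/}          # drop the host, as -nH does
    file=${path##*/}         # final component: the file name
    dirs=${path%"$file"}     # directory part, with trailing "/" (may be empty)
    n=$2
    # Remove up to N leading directory components, never the file name.
    while [ "$n" -gt 0 ] && [ -n "$dirs" ]; do
        dirs=${dirs#*/}
        n=$((n - 1))
    done
    printf '%s%s\n' "$dirs" "$file"
}

local_path http://new-domain.com/directory1/directory2/sub/file.txt 2    # prints sub/file.txt
```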