wget not returning the same result as clicking the link in the Chrome browser

I am trying to use wget to download an audio file from a link that has no file extension. Clicking the link in the browser automatically starts a .wav download, but wget on the same link returns a file without an extension. Passing -O file.wav does not help, because the downloaded content itself is not audio.
I have tried
wget -O test.wav "[DOWNLOAD LINK]"
The above downloads a file into my directory, but it is not audio.
My problem can be replicated by going to https://captcha.com/demos/features/captcha-demo.aspx and clicking the href of the element with class BDC_SoundLink.
Questions:
Is there a way to get wget to return the same result as clicking the link?
Is there a way to convert the non-audio file into an audio file after wget has downloaded it?
Any help would be much appreciated!

When you use wget, you are actually downloading a text file: the response comes back with a text MIME type, not an audio one.
When you browse the site in a web browser, the browser first gets the right captcha code from the server, and only then can you download the sound file for that captcha code. You can see the captcha code in the browser's dev tools.
The sound file is tied to the captcha itself: each time you reload the captcha image, the C# backend of the ASP.NET page issues a new captcha code.
That is why you can't download the captcha sound by pointing wget at the link on its own.
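If the sound really is bound to the captcha instance the server handed out, one thing worth trying is to fetch the page and the sound link in the same session. The following is a minimal Python sketch of that idea, not a confirmed fix: it assumes the page exposes the sound URL as the href of an element with class BDC_SoundLink (as described in the question) and that sharing cookies between the two requests is enough. The function names are hypothetical and the code has not been tested against captcha.com.

```python
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.parse import urljoin


class SoundLinkFinder(HTMLParser):
    """Remembers the href of the first element whose class list has BDC_SoundLink."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.href is None and "BDC_SoundLink" in classes:
            self.href = attr.get("href")


def find_sound_link(html, base_url):
    """Return the absolute sound URL found in html, or None if there is none."""
    finder = SoundLinkFinder()
    finder.feed(html)
    return urljoin(base_url, finder.href) if finder.href else None


def download_captcha_sound(page_url, out_path):
    """Fetch the page, then the sound link, reusing one cookie jar for both."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    with opener.open(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    sound_url = find_sound_link(html, page_url)
    if sound_url is None:
        raise ValueError("no element with class BDC_SoundLink and an href found")
    with opener.open(sound_url) as resp:
        content_type = resp.headers.get_content_type()
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    # The caller can check whether this is an audio type or still text.
    return content_type
```

Calling download_captcha_sound("https://captcha.com/demos/features/captcha-demo.aspx", "test.wav") returns the MIME type of what was saved, which makes it easy to tell whether the server sent audio or a text response again.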

Related

Download message from Google group

I need to download an archived google group.
Following link is one of the messages of that group for example.
https://groups.google.com/forum/#!topic/sci.aeronautics/ViFtpXfVm7M
The problem is that what I see in the browser does not appear in the downloaded web page.
With my very limited knowledge, it seems to me that the content is created dynamically by JavaScript. Or are the downloaded files in the so-called 'mbox' format, which is encrypted?
What I've tried so far
First tries
Simple download
wget https://groups.google.com/d/topic/sci.aeronautics/ViFtpXfVm7M
With mirror
wget --mirror https://groups.google.com/d/topic/sci.aeronautics/ViFtpXfVm7M
Assuming it's encrypted
With cookies.
wget --load-cookies=cookies.txt https://groups.google.com/d/topic/sci.aeronautics/ViFtpXfVm7M
Set up Thunderbird with my Gmail account and tried opening the file. It did not open correctly.
Assuming the content was javascript generated
Downloaded using phantomJS
https://askubuntu.com/questions/411540/how-to-get-wget-to-download-exact-same-web-page-html-as-browser
Downloaded using phantomJS with a different script
https://gist.github.com/giocomai/247d54e097b5083e2451
Used scripts available from Github
https://github.com/henryk/gggd
https://github.com/icy/google-group-crawler
But none of them has worked so far.
Can anyone please shed some light on how to download this page with its message as a readable HTML or txt file?
Cheers
AyyoSalli
You could use https://groups.google.com/forum/feed/sci.aeronautics/msgs/atom.xml?num=100 to get some of the posts, but in this case it only returns roughly half of them.
It also lumps the messages from all topics together.
View it in Firefox or Classic Opera to see it directly in a more human-readable form.
But since you say you already got a file in standard mbox format, what exactly is wrong with it? Did you try importing it into a locally installed email or news client (like Thunderbird)?
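For turning such a feed into readable text without a browser, a rough Python sketch follows. It assumes the feed is ordinary Atom 1.0 (entries with title, updated, link and summary elements in the standard Atom namespace); the actual element layout of the Google Groups feed has not been checked here, and parse_atom_entries is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard Atom 1.0 namespace; assumed to be what the feed uses.
ATOM = "{http://www.w3.org/2005/Atom}"


def parse_atom_entries(xml_data):
    """Return one dict per <entry>, with empty strings for missing fields."""
    root = ET.fromstring(xml_data)
    entries = []
    for entry in root.iter(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        entries.append({
            "title": entry.findtext(ATOM + "title", default=""),
            "updated": entry.findtext(ATOM + "updated", default=""),
            "link": link.get("href", "") if link is not None else "",
            "summary": entry.findtext(ATOM + "summary", default=""),
        })
    return entries


def entries_to_text(entries):
    """Format the parsed entries as plain text, one block per message."""
    blocks = []
    for e in entries:
        blocks.append(
            f"{e['title']}\n{e['updated']}\n{e['link']}\n\n{e['summary']}\n"
        )
    return ("\n" + "-" * 40 + "\n").join(blocks)
```

The feed body (as bytes or str, for example from urllib.request.urlopen(...).read()) can be passed straight to parse_atom_entries, and the result of entries_to_text written to a .txt file.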

Access downloaded pdf file path in HTML5 file system and display it in webview

In my Chrome app, I am using the HTML5 file system to save PDF files to the sandbox. Downloading works fine, but how do I access the path of the downloaded file? I want to use that path as the webview source.
The best way, if it works, would be to use a filesystem URL. To get one, use FileEntry.toURL.
These don't work on external files (i.e. files that come from chrome.fileSystem.chooseEntry and are outside the app's sandbox) but should work for files in the app's sandbox.
Note, I am referring to filesystem:// URLs, not file:// URLs, which won't work, as Marc Rochkind has pointed out in his answer.
Disclaimer: I haven't tested this, but I believe it should work.
You need to get the contents of the PDF into a data URL. See my answer to this question:
Download external pdf files to chrome packaged app's file system

Wget to download html

I have been trying to download the HTML from http://osu.ppy.sh/u/2330158 to get the Historical data,
but wget doesn't download that part. It doesn't download General, Top Ranks, etc. either.
Is there a way to make wget download it?
That part of the page is loaded dynamically, so wget won't see it, since wget doesn't run JavaScript. However, if you open the web developer tools in your browser of choice and then load the main page, you can find the URL you're really after among the requests the page makes. For this page, it's: http://osu.ppy.sh/pages/include/profile-history.php?u=2330158&m=0
Luckily, it's another simple, parameterised URL so you can feed that to wget:
wget "http://osu.ppy.sh/pages/include/profile-history.php?u=2330158&m=0"
That'll get you an HTML document containing just the historic data you're looking for.
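Because the URL is just two query parameters, the same request is easy to script for other profiles. A small Python sketch, assuming only what the URL above shows: u is the user id from the profile URL and m is passed through unchanged (its meaning is not stated in the answer). Both function names are hypothetical.

```python
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the URL found in the browser's developer tools.
HISTORY_ENDPOINT = "http://osu.ppy.sh/pages/include/profile-history.php"


def history_url(user_id, m=0):
    """Build the parameterised URL for one profile."""
    return HISTORY_ENDPOINT + "?" + urlencode({"u": user_id, "m": m})


def fetch_history_html(user_id, m=0, timeout=30):
    """Download the HTML fragment, much as the wget command above would."""
    with urllib.request.urlopen(history_url(user_id, m), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

history_url(2330158) reproduces the URL given above, so fetch_history_html(2330158) should return the same document that wget saves.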

Wordpress Download Link for Audio File

I am using the mb.miniAudioPlayer plugin to handle the audio player on my WordPress site.
It provides a nice download button that does not require a "right click and save as". The only problem is that it does not save the file as expected.
I get an mp3 file with no filename at all. Even when I rename the file to filename.mp3, it is not playable.
The plugin dev says something is wrong with my server. I am totally lost. Any help would be appreciated.
example page is here link
Response from dev here link
I didn't find any mp3 file link to download, but if it's a server-side problem, it could be a MIME type issue. Your server must be configured to handle mp3 files. Adding the following code to the .htaccess file in the server root may solve the problem.
<Files *.mp3>
ForceType application/octet-stream
</Files>
Also remember that the mp3 files on the server must have the .mp3 extension. Also take a look at this.

How can I simply add a downloadable PDF file to my page?

I want to add PDF and Word versions of my resume to my portfolio page and make them downloadable. Does anyone have a simple script?
Add a link to the file and let the browser handle the download.
You may be over-complicating the problem. You can use an href pointing to the location of the .pdf or .doc file. When a user clicks it in their browser, they will generally be asked whether they would like to save or open the file, depending on their OS and configuration.
If this is still confusing, leave a comment and I'll explain anything you don't get.
Create the PDF. Upload it. Add a link.
Save yourself 30 minutes tossing around with PDFGEN code.
You will want to send the Content-Disposition HTTP header to force the download; otherwise some browsers may recognize the common file extensions and try to open the file contents automatically. It will feel more professional if the link actually downloads the file instead of launching an app, which I think is important for a resume.
As far as I know, Content-Disposition must be generated on the server side, as part of the response that delivers the file.
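To make the header concrete, here is a minimal sketch using Python's built-in http.server. It is only an illustration of a server sending Content-Disposition: attachment with the file, not a recommendation for production hosting; the helper names, the file name resume.pdf and the port are assumptions made up for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def attachment_headers(filename, content_type, size):
    """Headers that ask the browser to save the response instead of opening it."""
    # Escape backslashes and quotes so the quoted filename stays well-formed.
    safe = filename.replace("\\", "\\\\").replace('"', '\\"')
    return [
        ("Content-Type", content_type),
        ("Content-Disposition", f'attachment; filename="{safe}"'),
        ("Content-Length", str(size)),
    ]


def make_download_handler(file_path, download_name, content_type="application/pdf"):
    """Build a handler class that serves one file as a forced download."""

    class DownloadHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            with open(file_path, "rb") as fh:
                body = fh.read()
            self.send_response(200)
            for name, value in attachment_headers(download_name, content_type, len(body)):
                self.send_header(name, value)
            self.end_headers()
            self.wfile.write(body)

    return DownloadHandler


# Example (hypothetical file name and port), left commented out so that
# importing this module does not start a server:
# HTTPServer(("127.0.0.1", 8000), make_download_handler("resume.pdf", "resume.pdf")).serve_forever()
```

On a typical shared host the same header would more likely be set in the web server configuration or in the server-side script that delivers the file; the sketch only shows what ends up in the response.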
Option:
Upload your resume to Google Docs.
Add a link to the file on your portfolio page just as I do in the menu of my blog:
Use Google Docs Viewer, passing it the URL of the PDF, as you can see in this link.