I have these two urls:
https://cdn.pixabay.com/photo/2017/06/24/09/13/dog-2437110_960_720.jpg
and
http://www.deutschland-machts-effizient.de/SiteGlobals/KAENEF/StyleBundles/Bilder/sublogo.png;jsessionid=DF603F2801D8F686FD4BCFAD770C3FC9?__blob=normal&v=3
Trying to access the pictures with wget works for the first one, but not for the second. Of course the first URL looks more like a picture (it ends in .jpg), but every browser I tested displayed both as pictures I could download.
For the second URL, instead of a picture I download a 2000-line HTML file that contains several img tags. I could try each of the URLs in those tags, but I want to automate this for the general case, so that doesn't really help me.
What is the inherent difference between the two pictures in the way they are stored on their respective servers?
How can I download the second picture using wget?
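For reference, the command I'm running looks roughly like this (the output filename is arbitrary; I quote the URL so the shell doesn't interpret the ; and & characters):

wget -O sublogo.png "http://www.deutschland-machts-effizient.de/SiteGlobals/KAENEF/StyleBundles/Bilder/sublogo.png;jsessionid=DF603F2801D8F686FD4BCFAD770C3FC9?__blob=normal&v=3"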
First, thanks for any and all help regarding this topic.
Sites like Facebook and Twitter strip EXIF information from images as they are uploaded. My goal is to allow users to upload images to our platform (working with Nextcloud and others) with full EXIF information; however, we need to display images that do not contain EXIF information or any other metadata. Without stripping the metadata and creating a second, EXIF-free image for each upload, is it possible to simply hide the EXIF info so that, if a user downloads the image, the EXIF is not embedded?
We were told that the only way to do this is to keep a second, EXIF-free copy (whether it's created before, during or after upload is irrelevant). I'm hoping there's a way we can simply display such a copy without doubling our physical storage requirements.
Thanks again for your help.
EXIF is metadata, and so are IPTC, XMP, AFCP, ICC, FPXR, MPF, JPS and the comment segment - and that's just for the JFIF/JPEG file format alone. Other picture file formats support even more/other kinds of metadata.
You wrote it yourself: a download - so it's a file in any case. Pictures are files, just like executables, movies, texts, music and archives are files, too. Metadata is part of the file's content, so whoever gets hold of the raw bytes can grab everything in it; there is no "please don't look" protection. Whether you create the clean copy on the fly by stripping metadata every time a download is requested, or do it once to preserve performance at the cost of storage space, remains your decision.
If there were something as simple as a "don't show" flag, the metadata would still be in the file and could easily be extracted by software written to ignore that instruction. Seriously, there's no shortcut here - do it properly and don't try to save yourself work at the wrong end.
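As a minimal sketch of the "strip on the fly" variant (assuming ImageMagick is available on the server; original.jpg and public.jpg are placeholder names), the metadata-free copy could be produced with something like:

convert original.jpg -strip public.jpg

Run once at upload time, the same command gives you the "do it once" variant, at the cost of storing the second file.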
I've been playing around with making a draftjs plugin that lets the user paste mixed text and image content from websites and have the images auto-uploaded to the server. I've quickly come to the realization that it's not easy, simply because so many different sites use different kinds of counter-measures against copying/pasting images. Standard image tags in page content are no problem - it's easy to grab the src and handle the file upload from the URL. However, many sites use all kinds of trickery that makes this a pain. For example, some will only serve small thumbnails and require a GET request on the image with a hash key in order to retrieve a larger version. Others somehow seem to corrupt the image so that it's unreadable by the time it's been retrieved. Others still play with weird embed tags that mess with draftjs' image blocks.
But then I open up a Google Docs file and find that when I copy any images into it from a website, there's never any trouble whatsoever. All the problematic websites that I find myself having to write site-specific retrieval methods for seem to be handled by Google Docs with ease.
Am I using completely the wrong approach by trying to retrieve images from a URL? Does Google use a far superior approach (yes, I presume) - and in that case, does anyone have any idea what that approach might be?
I'm using wget to archive a discussion from a forum. The discussion is over several pages, navigated to with next and previous buttons.
I generated a list of the page URLs and used that as the --input-file; however, the --convert-links option is only converting the images, not the next and previous links.
Is there any way to make it do that?
I could use -r, but it would need a recursion depth of 64 to get the whole discussion, so it would also pull in a whole load of extra unwanted stuff.
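For reference, the command I'm using is roughly this (urls.txt is my generated list of page URLs; exact flags may differ):

wget --input-file=urls.txt --page-requisites --convert-links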
I figured out a workaround. It was easy enough to turn the input file into an HTML page of links and upload it. Then, with -r and -l 1, wget correctly converted the links.
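A sketch of what that looks like (urls.txt, index.html and example.com are placeholders):

while read -r url; do printf '<a href="%s">%s</a>\n' "$url" "$url"; done < urls.txt > index.html
# upload index.html to the server, then:
wget -r -l 1 --page-requisites --convert-links http://example.com/index.html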
We're currently working with Filepicker.io to allow users to upload both images and videos. It appears that if we specify image conversions in the JavaScript API options, video uploads don't process and instead get stuck at 99.30%. If I remove the 'conversions' option, video uploads process without issue. Is it not possible to specify image conversion options and accept both types of uploads? If that's the case, it should really be spelled out in the docs.
I attached a JSFiddle with the code in question. http://jsfiddle.net/BYkD4/
It might be an issue on our end; taking a look now. For large files (over 1 MB) we split the file into chunks, upload them in parallel, and then reassemble them on the server side. We use the browser's progress up to the 90% mark, after which we have to "best guess" what the server-side progress looks like, for now at least. That's why it hangs at 99.30% - it may actually complete if you give it enough time.
In any case, I'm looking into it.
Edit: looks like this was an issue on our end. A fix has been deployed and everything should be working fine now. Sorry about the trouble.
I am trying to list all videos from a URL. For this I am sending a request to YouTube at the URL "http://www.youtube.com/" and want to list all available videos. But I didn't get anything back from that request. Any ideas, or any documentation hints?
There are utilities for downloading YouTube videos (for example, Linux has youtube-dl), but it's not uncommon for sites with large numbers of downloadable files to block attempts to simply download everything - and even though you said you want to list rather than download all the videos, that is unfortunately what it would look like to a website administrator.
Besides, files on YouTube are not accessed by simple URLs like http://www.youtube.com/filename
Something more is required. I don't think you can treat the (what is it?) 11-character alphabet soup as a filename; it's a parameter passed to the software that streams back the video.
EDIT: youtube-dl is a command-line program on Linux and probably BSD. You need to know the URL of the YouTube video so you can type, for example:
youtube-dl http://www.youtube.com/watch?v=Z1JZ9O15280
If you had a list of these URLs you could put them in a file and make a bulk download script - but that takes us back to your original question.
In Firefox I would right-click on a link to a Youtube video and choose 'copy link location'. Then paste the URLs one at a time into a text file. But this question is drifting away from mere programming...
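For example, with such a list saved as urls.txt (one video URL per line; the filename is just an illustration), a minimal bulk-download loop could look like:

while read -r url; do
    youtube-dl "$url"
done < urls.txt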