Requests from facebookexternalhit have truncated query string - facebook

My server is getting requests with the user agent "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)". Some of these requests have truncated query strings.
I've been able to reproduce the problem by posting the following URL to a private Facebook group. Facebook seems to try to create a thumbnail, but is unable to.
https://example.com/blob/?blobId=547c67b0-1400-4a95-b49f-c8c5054f03ae&expiry=&signature=e65L4ieYRPK0k9uVvSYgkvCYiWAAOZHTIJ1jzh2u%2BGziioeBjSHuR6hYfGZ0xR4PHdNrJmai6frpRirJCbRrUOlCmMS1VqHMWBQs8oAChB0S57VqOdz7GDlYgckOZ%2B1oGk6N65A%2Fr1%2BrTwpw%2F0kYuhEwtKgjPOiMisJXbCs%2Bphd82WHHrIwmIgd%2BT6M8J0iomUSBSkHif3IpQ5fHBj%2F%2ByyjNQmRNMlmodbjsLmL9uY2xk9cnpZManv5JgKc0D30DzSW5rkpxLUF%2B%2F8FnPayk%2Bbur1pPEZPJ7bBxPocjbDXW%2BHm43P%2FDtqZNR5Bbg%2FrHemf6Y6tuVQlTLfp3YzPXSIw%3D%3D
The request fails because Facebook truncates the query string like this (the trailing "=" after expiry is dropped and the signature is cut short):
https://example.com/blob/?blobId=547c67b0-1400-4a95-b49f-c8c5054f03ae&expiry&signature=e65L4ieYRPK0k9uVvSYgkvCYiWAAOZHTIJ1jzh2u%2BGziioeBjSHuR6hYfGZ0xR4PHdNrJmai6frpRirJCbRrUOlCmMS1VqHMWBQs8oAChB0S57VqOdz7GDlYgckOZ%2B1oGk6N65A%2Fr1%2BrTwpw%2F0kYuhEwtKgjPOiMisJXbCs%2Bphd82WHHrIwmIgd%2BT6M8J0iomUSBSkHif3IpQ5fHBj
Why does Facebook do this? Is it about URL length? What can I do to fix it?
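One way to see exactly which parameters changed between the URL that was posted and the URL the crawler requested is to parse both query strings and compare them. This is only a minimal sketch using the Python standard library; the function name diff_query and the shortened sample URLs in the usage note are illustrative and not part of the original post.

```python
from urllib.parse import urlsplit, parse_qsl


def diff_query(sent_url, received_url):
    """Return {param: (sent_value, received_value)} for every parameter that differs."""
    # keep_blank_values=True so that "expiry=" and a bare "expiry" both parse as ""
    sent = dict(parse_qsl(urlsplit(sent_url).query, keep_blank_values=True))
    received = dict(parse_qsl(urlsplit(received_url).query, keep_blank_values=True))
    report = {}
    for key in sent.keys() | received.keys():
        before, after = sent.get(key), received.get(key)
        if before != after:
            report[key] = (before, after)
    return report
```

For example, diff_query("https://example.com/blob/?blobId=1&expiry=&signature=abc%2Bdef%3D%3D", "https://example.com/blob/?blobId=1&expiry&signature=abc%2Bd") reports only the signature parameter, which suggests the missing "=" after expiry is harmless and the shortened signature is the real problem.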

Related

Facebook privacy policy URL: Bad Response Code: URL returned a bad HTTP response code

I want to publish my first Facebook application and a Privacy Policy URL is required.
I have the page privacypolicy.html published on my website, but I get the following message when I configure it in "App Details":
You must submit a valid Privacy Policy URL in order to be compliant with Facebook Platform. Request failed with error:
Bad Response Code: URL returned a bad HTTP response code.
The http code returned when I request the page is 200
Any ideas?
The URL is cached by Facebook.
Adding # at the end of my URL did the job.
This is an old question I know, but I figured I'd post my solution and hope it helps anyone. For me I got this error because I had rewrite rules that didn't catch the URL that Facebook actually goes to in order to get the privacy policy. Facebook adds a query string to the URL that you give it for the privacy policy and since my privacy policy page doesn't do anything with the query string, I didn't check for it in my rewrite rule.
You can check out how Facebook scrapes the page you give it by going to Facebook's Sharing Debugger and putting your URL in the input bar. You can also see the last time that Facebook tried to scrape that URL and tell Facebook to try again once you've fixed any issues. This will get around the caching that was mentioned in user2390340's post.
Facebook externalhit appears to make its request to the IPv6 address published in DNS if one is available, and won't fall back to the IPv4 address published in DNS.
If your website doesn't have IPv6 enabled, the request will return a 404 or 500 and you'll get the error "Bad Response Code: URL returned a bad HTTP response code" for your Privacy Policy URL.
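To check whether a hostname publishes an IPv6 address at all, something like the following could be used. It is a rough sketch built on socket.getaddrinfo from the Python standard library; the function name published_families and the injectable resolver argument are my own additions for illustration, and the claim that the crawler prefers IPv6 is the observation above, not something this code verifies.

```python
import socket


def published_families(host, port=443, resolver=socket.getaddrinfo):
    """Return the set of address families ("IPv4", "IPv6") that host resolves to."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    families = set()
    # Each result is a 5-tuple: (family, type, proto, canonname, sockaddr)
    for family, *_ in resolver(host, port, proto=socket.IPPROTO_TCP):
        if family in names:
            families.add(names[family])
    return families
```

If "IPv6" shows up in the result for your domain, it is worth confirming that the web server actually answers on that address with the same virtual host configuration.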
Edit:
I also noticed that Facebook caches the URL. I was checking it and getting a "bad response code" error even though there was no hit from their UA in the access logs.
Adding ?stuff onto the end of the URL in the Privacy Policy field bypassed the cache, and the access log hits showed up with 200 OK, allowing the URL to be saved.
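The "?stuff" trick can be done without breaking a URL that already has a query string. A small sketch, assuming the Python standard library; the helper name add_cache_buster and the default parameter name "v" are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def add_cache_buster(url, key="v", value="1"):
    """Append an extra query parameter so the URL looks new to a cache."""
    parts = urlsplit(url)
    # Preserve any existing parameters, including blank ones
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Changing the value each time (a counter or a date, say) gives a URL the cache has not seen before.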
Not sure if this is related to user2259887's comment about Facebook using IPv6 DNS, but after reading that comment I was able to work around the validation issue by specifying a URL with the IP address instead of the host domain name.
This workaround will not work well if the site's IP address is dynamic or changes often.

Facebook server-side OAuth 2.0 on localhost:8080 can't get access token missing redirect_uri

There are many other questions related to this, but they didn't help me fix my problem.
I'm using the Facebook server-side login for a website, which I want to test locally. The path that initiates the login action is [http://localhost:8080/fblogin] (this redirects to the Facebook login dialogue, and goes from there).
I can successfully get the code, but when I try to exchange that for an access token, I get the following error:
{"error":{"message":"Missing redirect_uri parameter.","type":"OAuthException","code":191}}
I am providing the redirect_uri, URL-encoded, and it is the same as the one I use to get the first code. Here is the URL I'm using to request the access token (with the all-caps query string parameters replaced with their actual values, of course):
https://graph.facebook.com/oauth/access_token?client_id=CLIENT_ID&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2Ffblogin&client_secret=CLIENT_SECRET&code=CODE_FROM_FB
I suspect this might have to do with how my app is set up on Facebook. Here are the values I have set:
Display Name: (an actual display name here)
App Domains: localhost
Contact email: (an actual email here)
Site URL: [http://localhost:8080/fblogin]
What do I need to tweak in the settings to get this to work? Or does this look correct?
By the way, if it makes any difference, I am using the Play! framework, version 2.0.1
After digging around a little more, I found that it was necessary for me to use POST when sending the request from my server to get the access token.
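For reference, here is roughly what sending the same parameters as a POST body looks like. It is a sketch in Python with the standard library rather than Play!, it only builds the request without sending it, and the function name build_token_request is hypothetical. Whether POST is actually required may depend on the setup.

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://graph.facebook.com/oauth/access_token"


def build_token_request(client_id, client_secret, redirect_uri, code):
    """Build (but do not send) a POST request carrying the token-exchange parameters."""
    body = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match the value used to obtain the code
        "client_secret": client_secret,
        "code": code,
    }).encode("ascii")
    # Supplying data plus method="POST" puts the parameters in the request body
    return Request(TOKEN_URL, data=body, method="POST")
```

The request could then be sent with urllib.request.urlopen(request); urlencode takes care of percent-encoding the redirect_uri.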
Interesting that using POST worked for you as this didn't for me.
In any case, did you add the query parameters using setQueryParameter()? (see How to make multiple http requests in play 2?)

Facebook server-side authentication flow: is this the right "code?"

I'm using FB.login on the JS client and want to verify the user's identity on the server. So, the client gets a signedRequest from facebook and sends it to the server. The server splits on the period, and decodes the second part of the signedRequest into a json object.
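The split-and-decode step described above can be sketched as follows. This assumes the signed request has the form base64url(signature) + "." + base64url(JSON payload), with the signature being an HMAC-SHA256 of the encoded payload keyed by the app secret; that is my understanding of Facebook's format, not something stated in this post. The function names are illustrative.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(data):
    """Decode URL-safe base64 that may have had its '=' padding stripped."""
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded)


def parse_signed_request(signed_request, app_secret):
    """Verify the signature, then return the decoded JSON payload as a dict."""
    encoded_sig, payload = signed_request.split(".", 1)
    expected = hmac.new(app_secret.encode(), payload.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url_decode(encoded_sig), expected):
        raise ValueError("signed_request signature does not match")
    return json.loads(b64url_decode(payload))
```

Checking the signature before trusting user_id matters here, because the signed request arrives from the client and could otherwise be forged.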
What should I be using for "code" when I send my server-side request to
https://graph.facebook.com/oauth/access_token?
client_id=YOUR_APP_ID
&redirect_uri=YOUR_REDIRECT_URI
&client_secret=YOUR_APP_SECRET
&code=CODE_GENERATED_BY_FACEBOOK
My decoded json looks something like:
{"algorithm":"HMAC-SHA256","code":"2.AQCPA_yfx4JHpufjP.3600.1335646800.1-5702286|l11asGeDQTMo3MrMx3SC0PksALj6g","issued_at":1335642445,"user_id":"5232286"}
Is that the code I need? Does it need to be B64 encoded? If this isn't the code, what code should I use?
What I've tried:
The request I'm trying to use is:
https://graph.facebook.com/oauth/access_token?client_id=295410083869479&redirect_uri=https://squaredme.appspot.com/facebookredirect&client_secret=44f1TOPSECRETbb8e&code=2.AQCPA_yfx4JHpufjP.3600.1335646800.1-5702286|l11asGeDQTMo3MrMx3SC0PksALj6g
but this returns the error:
{"error":{"message":"Error validating verification code.","type":"OAuthException","code":100}}
I can't tell if this is because I'm using a bad code, or something else. Notably, this is running on my local dev server, and squaredme.appspot.com definitely does NOT resolve to my IP. I don't know if Facebook checks that; I'm assuming I'd get a better error message if it did. Thanks for any direction!
You are trying to somehow combine the two flows together and that's why things don't work well.
When Facebook POSTs to the iframe with your app URL and a signed request, there are two cases. In the easy one, the user is already authenticated and the signed request has all the necessary data (including an access token). You then just load the canvas page and can use the JS SDK to get an access token there as well. In this case there's no need to call FB.login (it opens a popup that will close again automatically); you can use the FB.getLoginStatus method instead, which won't annoy the user.
If the user is not authenticated, the signed request will be missing the things you need to use the Graph API.
You then redirect the user to the auth dialog, and since you are loaded in an iframe you'll need to return an HTML response which redirects the parent window using JavaScript, like:
top.location.href = "AUTH_DIALOG_URL";
When the user is done (having accepted or rejected the app), they will be redirected to the "redirect_uri" you added as a parameter to the auth dialog.
If the user accepted your app then you'll be getting the "code" parameter in the query string.
You then take the code, exchange it for an access token as you posted in your question, and then redirect the user back to "apps.facebook.com/YOUR_APP".
When the page then loads the user is already authenticated and you'll be getting a full signed request.
I hope this clarifies things for you. Recheck the Server-Side flow documentation; it pretty much covers it all.
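The two server-side pieces of that flow, breaking out of the iframe and picking up the code on return, might look something like this. It is only a sketch in Python; the function names are hypothetical, and auth_dialog_url is assumed to be a URL your own server generated (it is embedded in a script tag without further sanitising).

```python
import json
from urllib.parse import urlsplit, parse_qs


def auth_redirect_page(auth_dialog_url):
    """HTML body that sends the parent window (not just the iframe) to the auth dialog."""
    # json.dumps produces a correctly quoted JavaScript string literal
    return "<script>top.location.href = " + json.dumps(auth_dialog_url) + ";</script>"


def extract_code(callback_url):
    """Return the "code" query parameter from the redirect_uri callback, or None."""
    values = parse_qs(urlsplit(callback_url).query).get("code")
    return values[0] if values else None
```

extract_code returns None when the user rejected the app, in which case there is nothing to exchange for an access token.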
I also had some trouble with that, then I found the solution here on Stack Overflow.
There are two kinds of "code" provided by Facebook. One comes inside the signedRequest in the cookie generated by the client-side flow. Facebook's JS SDK handles this code and gets an access token without telling us anything.
The other type of code comes attached as a query parameter to your redirect URI (http://www.yoururl.com/index.php?code=AAAgyiaus...) when you navigate to the OAuth URL (server-side flow). With this code, you go to a Token URL and get your access token.
When you are using the server-side flow, you need to specify a redirect URI both in the OAuth URL AND in the Token URL, and they have to be exactly the same, so a missing slash or an extra query string can cause a lot of problems.
The two codes are different from each other. When you use both flows together, it appears to be impossible to get an access token using the code that was inside the cookie's signedRequest.
BUT it is not. The trick is: the code from the signedRequest is associated with NO URI, so since redirect_uri is a mandatory field, all you have to do is pass it blank when you request the Token URL.
So the final solution is: grab the signedRequest from the cookie, parse it on your server to obtain the code, then request the Token URL:
https://graph.facebook.com/oauth/access_token?
client_id=YOUR_APP_ID
&redirect_uri=&client_secret=YOUR_APP_SECRET
&code=CODE_INSIDE_THE_SIGNED_REQUEST
It looks like a hack, so I don't know how long it's gonna work, but it's working right now.
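Building that Token URL with a deliberately empty redirect_uri could be sketched like this in Python. The function name is made up for illustration, and this only constructs the URL; whether Facebook still accepts a blank redirect_uri is exactly the open question mentioned above.

```python
from urllib.parse import urlencode


def token_url_for_signed_request_code(app_id, app_secret, code):
    """Token URL for a code taken from the cookie's signedRequest (blank redirect_uri)."""
    query = urlencode([
        ("client_id", app_id),
        ("redirect_uri", ""),  # intentionally blank: this code is tied to no URI
        ("client_secret", app_secret),
        ("code", code),
    ])
    return "https://graph.facebook.com/oauth/access_token?" + query
```

urlencode also percent-encodes characters such as "|", which appear in the sample code in the question, so the code value is not mangled in transit.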

"(#100) object URL is not properly formatted" while trying to setup facebook subscription

I am trying to set up Facebook subscriptions for checkins. I have already asked for the user_checkins permission in my app, so I issue the following URL:
https://graph.facebook.com//subscriptions?access_token=&object=user&callback_url=&verity_token=&fields=checkin
To which I get
"(#100) object URL is not properly formatted"
The access token is valid, the URL is properly encoded and points to a page that follows the FB guidelines about hub.xxx, and everything seems normal. I am doing a GET, though. Could this be the problem? Should it be a POST, as the docs say? Or is there another issue?
Thanks
Yes, this request should be a POST (or a GET with the method=post argument passed). And this is exactly the error message returned if you fail to do so...
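Both variants can be sketched in Python as below. The requests are only built, not sent. The helper names and the example URL and parameter values are placeholders; note also that the question's URL spells one parameter "verity_token", whereas as far as I know the documented name is verify_token, which is what I'd pass.

```python
from urllib.parse import urlencode
from urllib.request import Request


def as_post(url, params):
    """Send the parameters in the body of a real POST request."""
    return Request(url, data=urlencode(params).encode("ascii"), method="POST")


def as_get_with_override(url, params):
    """Keep a GET request but add method=post, as the answer above suggests."""
    query = urlencode({**params, "method": "post"})
    return Request(url + "?" + query)
```

For example, as_post(subscriptions_url, {"object": "user", "fields": "checkin", "callback_url": callback, "verify_token": token, "access_token": access_token}), where every variable is a value you supply yourself.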

How to disallow access to an url called without parameters with robots.txt

I would like to prevent web robots from accessing a URL like this:
http://www.example.com/export
while still allowing this kind of URL:
http://www.example.com/export?foo=value1
A spider bot is calling /export without a query string, causing a lot of errors in my log.
Is there a way to set up this filter in robots.txt?
I am assuming you have problems with bots hitting the first URL in your example.
As said in the comment, this is probably not possible, because http://www.example.com/export is the resource's base URL. Even if it were possible as per the standard, I wouldn't trust bots to understand this properly.
I would also not send a 401 Unauthorized, 403 Forbidden, or similar status if the URL is called without a query string, for the same reason: a bot could think that the resource is out of bounds entirely.
What I would do in your situation is, if somebody arrives at
http://www.example.com/export
send a 301 Moved Permanently redirect to the same URL with a query string containing some default values, like
http://www.example.com/export?foo=0
This should keep the search engine index clean. (It won't fix the logging problem you state in your comment, though.)
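The redirect logic itself is tiny. Here is a framework-independent sketch in Python; the function name respond, the DEFAULT_QUERY constant, and the (status, headers) return shape are all assumptions for illustration, so adapt them to whatever server or framework you actually use.

```python
from urllib.parse import urlsplit

DEFAULT_QUERY = "foo=0"  # default values to redirect to; pick ones that suit your export


def respond(request_target):
    """Return (status, headers): 301 for a bare /export, 200 otherwise."""
    parts = urlsplit(request_target)
    if parts.path == "/export" and not parts.query:
        # Permanent redirect to the same path with default parameters
        return 301, {"Location": "/export?" + DEFAULT_QUERY}
    return 200, {}
```

With this, respond("/export") yields a 301 pointing at /export?foo=0, while respond("/export?foo=value1") is served normally.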