Access Facebook Graph API in Python, Access Token usage - facebook

I'm new to Python, and have been using different tutorials/guides to build a program for getting data from a Facebook page and writing it into an AzureSQL database. On the Facebook Developer page I've set up an app and created the AppID and AppSecret.
Here's the relevant part of the code I think the issue is with:
#create authenticated post URL
def create_post_url(graph_url, APP_ID, APP_SECRET):
    post_args = "/posts/?key=value&access_token=" + APP_ID + "|" + APP_SECRET
    post_url = graph_url + post_args
    return post_url

#render graph url call to JSON
def render_to_json(graph_url):
    web_response = urllib2.urlopen(graph_url)
    readable_page = web_response.read()
    json_data = json.loads(readable_page)
    return json_data

def main():
    #define facebook app secret and app id
    APP_SECRET = "xyz"
    APP_ID = "abc"
    #define page username
    pagelist = ["porsche"]
    graph_url = "https://graph.facebook.com/"
In the Facebook Graph API Explorer, running a GET request will work great and return data.
[Image: Facebook Graph API Explorer results]
However, if I run my code in the CLI, the resulting error is as follows:
[vagrant@localhost SomeAnalysis]$ python src/collectors/facebook/facebookScaper.py
Traceback (most recent call last):
File "src/collectors/facebook/facebookScaper.py", line 92, in <module>
main()
File "src/collectors/facebook/facebookScaper.py", line 56, in main
json_fbpage = render_to_json(current_page)
File "src/collectors/facebook/facebookScaper.py", line 25, in render_to_json
web_response = urllib2.urlopen(graph_url)
File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request
I've been checking SO and have found similar posts but none explain how to use the Access Token. I've tried changing the part in the URL formation with APP_ID and APP_SECRET to just using the long ACCESS_TOKEN from the Graph API Explorer but that yields the same 400 error.
Any help is appreciated.
/K
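A minimal sketch of passing the token as a properly encoded query parameter and of surfacing the error body that comes back with the 400. It is written for Python 3 (urllib.request rather than urllib2), and it assumes the page name belongs between the base URL and /posts; the helper names are made up for illustration. Whether an app token of the form APP_ID|APP_SECRET is allowed to read a given page's posts depends on the app's permissions, so treat this as a way to see the API's own error message rather than as a guaranteed fix.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def build_posts_url(graph_url, page, access_token, extra_params=None):
    """Build <graph_url><page>/posts?...&access_token=... with URL-encoded values."""
    params = dict(extra_params or {})
    params["access_token"] = access_token
    return "{}{}/posts?{}".format(graph_url, page, urlencode(params))


def render_to_json(url):
    """Fetch a URL and parse the JSON body; on an HTTP error, include the response body."""
    try:
        with urlopen(url) as web_response:
            return json.loads(web_response.read().decode("utf-8"))
    except HTTPError as err:
        # The error response usually carries a JSON body describing what was wrong.
        body = err.read().decode("utf-8", errors="replace")
        raise RuntimeError("HTTP %d from API: %s" % (err.code, body))


# Placeholder credentials from the question; "abc|xyz" stands for APP_ID|APP_SECRET.
url = build_posts_url("https://graph.facebook.com/", "porsche", "abc|xyz")
print(url)
```

Calling render_to_json(url) with real credentials would then either return the parsed data or raise an error whose message contains the API's explanation of the 400.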

Related

KeyError with hypixel api when profile json does exist

I'm using the Hypixel API. After getting a response, I try to read a certain key (last_save) from the JSON that requests.get(url) returns, but when I try to access it, I keep getting an error saying that the key doesn't exist. This is my code:
import requests
import json
api_key = 'API KEY HERE'
uuid = 'UUID HERE'
profileID = 'PROFILE ID HERE'
#Set the player name to the desired username
player = 'PLAYER NAME HERE'
#Make the request to the API
url = f'https://api.hypixel.net/skyblock/profile?key={api_key}&profile={profileID}'
response = requests.get(url)
#Parse the JSON response
data = response.json()
last_save = data["last_save"]
#Print the player's information
print(data['profile'])
This is the error it throws:
Traceback (most recent call last):
  File "c:\Users\...\main.py", line 16, in <module>
    last_save = data["last_save"]
KeyError: 'last_save'
How do I fix this?
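A KeyError on the top-level dictionary often means the key sits deeper in the response. One way to check is to search the parsed JSON for every place the key occurs. The sketch below is generic; the sample dictionary is invented for illustration and is not claimed to match the real Hypixel response layout.

```python
def find_key_paths(obj, target, path=()):
    """Yield every path (tuple of keys/indexes) at which `target` occurs as a dict key."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                yield path + (key,)
            # Keep descending: the key may also appear further down.
            yield from find_key_paths(value, target, path + (key,))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_key_paths(item, target, path + (index,))


# Hypothetical nested response, only to exercise the helper.
sample = {"success": True, "profile": {"members": {"UUID HERE": {"last_save": 1}}}}
paths = list(find_key_paths(sample, "last_save"))
print(paths)
```

Running find_key_paths(data, "last_save") on the real response shows which chain of keys to index, and printing data.keys() first is a quick way to confirm what the top level actually contains.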

Not able to trigger DAG from Airflow API but it's working from Curl command

I am trying to trigger a DAG through the Airflow API from a Python script.
The DAG triggers from a curl command, but it's not working from the script.
import requests
url ='http://localhost:8080/api/experimental/dags/document_validation/dag_runs'
myobj = {''}
x =requests.post(url, data=myobj, headers={"Content-Type": "application/json"})
print(x.text)
Error I am getting
File "run_dag_api.py", line 6, in <module>
  `x = requests.post(url, data = myobj)`
File "/…/lib/python3.7/site-packages/requests/api.py", line 119, in post
  `return request('post', url, data=data, json=json, **kwargs)`
File "/…/lib/python3.7/site-packages/requests/api.py", line 61, in request
  `return session.request(method=method, url=url, **kwargs)`
File "/…/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
  `resp = self.send(prep, **send_kwargs)`
File "/…/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
  `r = adapter.send(request, **kwargs)`
File "/…/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
  `raise ConnectionError(err, request=request)`
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
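One thing worth checking in the script above is the request body: {''} is a Python set containing an empty string, not an empty JSON object. The sketch below uses only the standard library to show the difference and to build (without sending) a POST request whose body really is JSON. It may or may not be what causes the dropped connection, which can also come from the server or a proxy in between.

```python
import json
from urllib.request import Request

url = "http://localhost:8080/api/experimental/dags/document_validation/dag_runs"

# {''} is a set literal (one empty string), and sets are not JSON-serializable.
try:
    json.dumps({''})
    set_is_serializable = True
except TypeError:
    set_is_serializable = False

# An empty dict serializes to the JSON object "{}".
payload = {}
body = json.dumps(payload).encode("utf-8")

# Build the request object only; nothing is sent here.
req = Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
print(set_is_serializable, req.get_method(), req.data)
```

With the requests package, passing json=payload instead of data=... serializes the dictionary and sets the Content-Type header in one step.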

Redirection using Scrapy Spider Middleware (Unhandled error in Deferred)

I've made a spider using Scrapy that first solves a CAPTCHA in a redirected address before accessing the main website I intend to scrape. It says that I have an HTTP error causing an infinite loop but I can't find which part of the script is causing this.
In the middleware:
from scrapy.downloadermiddlewares.redirect import RedirectMiddleware

class ProtectRedirectMiddleware(RedirectMiddleware):
    def __init__(self, settings):
        super().__init__(settings)
        self.source = urllib.request.urlopen('http://sampleurlname.com/')
        soup = BeautifulSoup(source, 'lxml')

    def _redirect(self, redirected, request, spider, reason):
        # act normally if this isn't a CAPTCHA redirect
        if not self.is_protected(redirected.url):
            return super()._redirect(redirected, request, spider, reason)
        # if this is a CAPTCHA redirect
        logger.debug(f'The protect URL is triggered for {request.url}')
        request.cookies = self.bypass_protection(redirected.url)
        request.dont_filter = True
        return request

    def is_protected(self, url):
        return 'sampleurlname.com/protect' in url

    def bypass_protection(self, url=None):
        # only navigate if any explicit url is provided
        if url:
            url = url or self.source.geturl(url)
            img = soup.find_all('img')[0]
            imgurl = img['src']
            urllib.request.urlretrieve(imgurl, "captcha.png")
            return self.solve_captcha(imgurl)
        # wait for the redirect and try again
        self.wait_for_redirect()
        return self.bypass_protection()

    def wait_for_redirect(self, url = None, wait = 0.1, timeout=10):
        url = self.url
        for i in range(int(timeout//wait)):
            time.sleep(wait)
            if self.response.url() != url:
                return self.response.url()
        logger.error(f'Maybe {self.response.url()} isn\'t a redirect URL')
        raise Exception('Timed out')

    def solve_captcha(self, img, width=150, height=50):
        # open image
        self.img = 'captcha.png'
        img = Image.open("captcha.png")
        # image manipulation - simplified
        # input the captcha text - simplified
        # click the submit button - simplified
        # save the URL
        url = self.response.url()
        # try again if wrong
        if self.is_protected(self.wait_for_redirect(url)):
            return self.bypass_protection()
        # return the cookies as a dict
        cookies = {}
        for cookie_string in self.response.css.cookies():
            if 'domain=sampleurlname.com' in cookie_string:
                key, value = cookie_string.split(';')[0].split('=')
                cookies[key] = value
        return cookies
Then, this is the error I get when I run the scrapy crawl of my spider:
Unhandled error in Deferred:
2018-08-06 16:34:33 [twisted] CRITICAL: Unhandled error in Deferred:
2018-08-06 16:34:33 [twisted] CRITICAL:
Traceback (most recent call last):
File "/username/anaconda/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/username/anaconda/lib/python3.6/site-packages/scrapy/crawler.py", line 80, in crawl
self.engine = self._create_engine()
File "/username/anaconda/lib/python3.6/site-packages/scrapy/crawler.py", line 105, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/username/anaconda/lib/python3.6/site-packages/scrapy/core/engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "/username/anaconda/lib/python3.6/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/username/anaconda/lib/python3.6/site-packages/scrapy/middleware.py", line 58, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/username/anaconda/lib/python3.6/site-packages/scrapy/middleware.py", line 36, in from_settings
mw = mwcls.from_crawler(crawler)
File "/username/anaconda/lib/python3.6/site-packages/scrapy/downloadermiddlewares/redirect.py", line 26, in from_crawler
return cls(crawler.settings)
File "/username/...../scraper/myscraper/myscraper/middlewares.py", line 27, in __init__
self.source = urllib.request.urlopen('http://sampleurlname.com/')
File "/username/anaconda/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/username/anaconda/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/username/anaconda/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/username/anaconda/lib/python3.6/urllib/request.py", line 564, in error
result = self._call_chain(*args)
File "/username/anaconda/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/username/anaconda/lib/python3.6/urllib/request.py", line 756, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/username/anaconda/lib/python3.6/urllib/request.py", line 532, in open
It basically repeats the bottom part of these over and over: open, http_response, error, _call_chain, and http_error_302, until these show at the end:
File "/username/anaconda/lib/python3.6/urllib/request.py", line 746, in http_error_302
self.inf_msg + msg, headers, fp)
urllib.error.HTTPError: HTTP Error 307: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Temporary Redirect
In settings.py:
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.redirect.RedirectMiddleware': None,
    'myscrape.middlewares.ProtectRedirectMiddleware': 600,
}
Your issue has nothing to do with Scrapy itself. You are making a blocking request in your middleware's initialization.
This request seems to be stuck in a redirect loop. This usually happens when websites do not behave properly and require cookies to let you through:
1. First you connect and get a 30x redirect response with some Set-Cookie headers.
2. You follow the redirect, now sending Cookie headers, and the page lets you through.
Python's urllib doesn't handle cookies by default, so try this:
import logging
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from scrapy.selector import Selector

def __init__(self, settings):
    super().__init__(settings)
    url = 'http://sampleurlname.com/'
    try:
        req = urllib.request.Request(url)
        cj = CookieJar()
        # the cookie processor stores cookies from responses and sends them back on redirects
        opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
        response = opener.open(req)
        source = response.read().decode('utf8', errors='ignore')
        response.close()
    except urllib.error.HTTPError as e:
        logging.error(f"couldn't initiate middleware: {e}")
        return
    # you should use scrapy selectors instead of beautiful soup here
    #soup = BeautifulSoup(source, 'lxml')
    selector = Selector(text=source)
Alternatively, you could use the requests package, which handles cookies across redirects by itself.
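To see the effect in isolation, here is a small self-contained sketch that starts a toy local server which keeps redirecting until the client sends a cookie back. The server, the handler class name, and the cookie name are all invented for the demo; it only illustrates the behavior described above and says nothing about how the real site works.

```python
import threading
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieGateHandler(BaseHTTPRequestHandler):
    """Toy server: redirects to the same URL until the client returns the cookie."""

    def do_GET(self):
        if "passed=1" in self.headers.get("Cookie", ""):
            body = b"welcome"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(307)
            self.send_header("Set-Cookie", "passed=1")
            self.send_header("Location", self.path)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), CookieGateHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port
no_proxy = urllib.request.ProxyHandler({})  # talk to the local server directly

# Opener without cookie support: the redirect repeats until urllib gives up.
try:
    urllib.request.build_opener(no_proxy).open(url)
    plain_result = "ok"
except urllib.error.HTTPError as err:
    plain_result = "HTTP Error %d" % err.code

# Opener with a cookie jar: the cookie is sent on the redirect and the page loads.
cookie_opener = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(CookieJar())
)
with cookie_opener.open(url) as response:
    cookie_result = response.read().decode("utf-8")

server.shutdown()
server.server_close()
print(plain_result, "|", cookie_result)
```

The first opener ends with the same kind of redirect-loop HTTPError as in the traceback, while the second one gets through because the cookie set by the first response is replayed on the redirected request.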

Getting http error 401: Unauthorized when processing oauth_token and oauth_verifier

Can someone please help with this? Thanks in advance.
I send a request to Twitter with the nonce, callback_url, consumer_key, etc. Twitter then prompts the user to enter a login and password, and the user successfully enters the credentials. After that, Twitter sends oauth_token and oauth_verifier to the callback URL specified in the earlier call. At that point I always get an 'HTTP Error 401: Unauthorized' error. I have spent days on it. Thank you.
"GET /account/twitter/oauthacc?oauth_token=ererererer&oauth_verifier=ereret43rwererererer"
HTTP Error 401: Unauthorized
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 714, in __call__
handler.get(*groups)
File "/base/data/home/apps/s~test/2-7.3629731/oauth.py", line 74, in get
self.mOauthRequestToken = vOauthApi.getRequestToken(self.ACCESS_TOKEN_URL)
File "/base/data/home/apps/s~-test/2-7.369731/libs/oauthtwitter.py", line 206, in getRequestToken
resp = self._FetchUrl(url, no_cache=True)
File "/base/data/home/apps/s~test/2-7.3639731/libs/oauthtwitter.py", line 118, in _FetchUrl
url_data = opener.open(url).read()
File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 387, in open
response = meth(req, response)
File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 498, in http_response
'http', request, response, code, msg, hdrs)
File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 425, in error
return self._call_chain(*args)
File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 360, in _call_chain
result = func(*args)
File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 506, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 401: Unauthorized
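In OAuth 1.0a, the oauth_verifier that arrives on the callback has to be included when the request token is exchanged for an access token, and that exchange is signed with the request token's secret. A 401 at this step can mean one of those pieces did not make it into the signed request, although other causes (clock skew, wrong consumer secret) are possible too. The sketch below only shows pulling both values out of the callback URL with the standard library; the function name is made up, and how the values are then handed to the oauthtwitter library is not shown because that API is not visible in the question.

```python
from urllib.parse import parse_qs, urlsplit


def parse_oauth_callback(callback_path):
    """Return (oauth_token, oauth_verifier) from a callback path with a query string."""
    query = parse_qs(urlsplit(callback_path).query)
    return query["oauth_token"][0], query["oauth_verifier"][0]


# The callback request from the log above.
token, verifier = parse_oauth_callback(
    "/account/twitter/oauthacc?oauth_token=ererererer&oauth_verifier=ereret43rwererererer"
)
print(token, verifier)
```

Both values, together with the request token secret saved before the redirect to Twitter, are what the access token request needs.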

Error with OAuth in the Google Data Protocol Client Libraries

When I was testing the code in "OAuth in the Google Data Protocol Client Libraries (http://code.google.com/apis/gdata/docs/auth/oauth.html)", I always got the following error. Can anyone give me a hint?
Error Code:
File "D:\PROJ\GAE\proj2\proj2.py", line 262, in get
return self.redirect(auth_url)
File "C:\DEV\google_appengine\v1.4.2\google\appengine\ext\webapp\__init__.py", line 380, in redirect
absolute_url = urlparse.urljoin(self.request.uri, uri)
File "C:\DEV\Python\v2.5.4\lib\urlparse.py", line 253, in urljoin
urlparse(url, bscheme, allow_fragments)
File "C:\DEV\Python\v2.5.4\lib\urlparse.py", line 154, in urlparse
tuple = urlsplit(url, scheme, allow_fragments)
File "C:\DEV\Python\v2.5.4\lib\urlparse.py", line 193, in urlsplit
i = url.find(':')
AttributeError: 'Uri' object has no attribute 'find'
Here is the code I want to fetch Google contacts info:
class Test(webapp.RequestHandler):
    def get(self):
        client = gdata.contacts.client.ContactsClient(source = 'www.mydomainname.com')
        callback_url = 'http://%s/test2' % self.request.host
        request_token = client.GetOAuthToken(['http://www.google.com/m8/feeds/'],
                                             callback_url,
                                             GOOGLE_KEY,
                                             GOOGLE_SECRET)
        gdata.gauth.AeSave(request_token, 'request_token')
        auth_url = request_token.generate_authorization_url(google_apps_domain = None)
        return self.redirect(auth_url) #Error?!
Thanks in advance!
The generated auth_url is not a string; as the traceback shows, it is a 'Uri' object.
Just do:
return self.redirect(str(auth_url))
and it will work.
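The same failure mode can be reproduced with the standard library alone. The sketch below uses Python 3's urllib.parse.urljoin and a made-up Uri class standing in for the gdata object; in Python 3 the failure surfaces as a TypeError rather than the AttributeError seen under Python 2.5, so both are caught.

```python
from urllib.parse import urljoin


class Uri:
    """Hypothetical stand-in for a URL object that is not a str."""

    def __init__(self, host, path):
        self.host = host
        self.path = path

    def __str__(self):
        return "https://%s%s" % (self.host, self.path)


auth_url = Uri("www.example.com", "/authorize")

# Passing the object directly fails because urljoin expects a string.
try:
    urljoin("http://localhost/test", auth_url)
    failed = False
except (TypeError, AttributeError):
    failed = True

# Converting to str first gives urljoin what it needs.
absolute = urljoin("http://localhost/test", str(auth_url))
print(failed, absolute)
```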