I have always used MATLAB to get FRED data, but now I'm no longer able to.
A simple code like:
c = fred('http://research.stlouisfed.org/fred2/');
d = fetch(c,'DEXUSEU');
gets the error:
Index exceeds matrix dimensions.
Error in fred/fetch (line 93)
d.Data = [datenum(str2num(tmp(:,1:4)),str2num(tmp(:,6:7)),str2num(tmp(:,9:10))) str2num(tmp(:,11:end))]; %#ok
Debugging the fetch function, the URL it creates is fine, but at line 48, when it calls urlread, the result is:
301 Moved Permanently
The document has moved here.
Any suggestions?
Thank you
It appears that FRED doesn't like non-HTTPS requests. I get the same error you report in MATLAB R2015a, but if you change the URL to HTTPS, it works fine.
c = fred('https://research.stlouisfed.org/fred2/');
d = fetch(c,'DEXUSEU');
If you take the URL that MATLAB is requesting from FRED and paste it into Chrome, you get a valid response (I'm guessing Chrome follows the redirect that the 301 response provides, while MATLAB just gives up). FRED still allows non-HTTPS requests to its API service, but the base MATLAB fetch function doesn't use the actual FRED API.
UPDATE: I just received the following email from FRED:
FRED API requires HTTPS.
Beginning on August 18, 2015, the FRED API will require HTTPS requests. This change will help provide secure communication with the FRED API. An automatic redirect will forward HTTP requests to HTTPS. We recommend that you update the URLs in your code. The API currently supports HTTPS to allow you to test your applications with this secure protocol.
Please contact us at STLS.RSRCHWebmaster#stls.frb.org or 314-444-FRED (3733) if you have questions or concerns. Thanks for using FRED and the FRED API.
Sincerely,
The FRED Team
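Going forward, if you want to pull the series outside the Datafeed Toolbox, the FRED web API itself can be called over HTTPS. A rough Python sketch (not MATLAB) of that route might look like the following; the endpoint path follows FRED's API documentation, and the api_key value is a placeholder you would need to request from FRED:

import requests

# Rough sketch of calling the FRED API directly over HTTPS.
# The api_key below is a placeholder; you need your own key from FRED.
resp = requests.get(
    'https://api.stlouisfed.org/fred/series/observations',
    params={
        'series_id': 'DEXUSEU',
        'api_key': '<your-api-key>',
        'file_type': 'json',
    },
)
print(resp.status_code)
print(resp.json()['observations'][:3])   # first few observations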
First off thanks for reading!
Second off, YES, I have tried to find the answer! :) Perhaps I haven't found it because I'm not using the right words to describe my problem, but I've been trying to figure it out for about four hours now and I'm getting a little loopy trying to piece it together on my own.
I am very new to programming. Python is my first language. I am on my third Python course. I have an assignment to use the socket library (not urllib library - I know how to do that) to make a socket and use GET to receive information. The problem is that the program needs to take raw input for the URL in question.
I have everything else the way I want it, but I need to know the syntax that I'm supposed to be using INSIDE my "GET" request in order for the HTTP message to include the requested document path.
I have tried (obviously not all together lol):
mysock.send('GET (url) HTTP/1.0\n\n')
mysock.send( ('GET (url) HTTP:/1.0\n\n'))
mysock.send(('GET (url) HTTP:/1.0\n\n'))
mysock.send("GET (url) HTTP/1.0\n\n")
mysock.send( ("'GET' (url) HTTP:/1.0\n\n"))
mysock.send(("'GET' (url) 'HTTP:/1.0\n\n'"))
and:
basically every other configuration of the (, ((, ( (, ', and '' combinations listed above.
I have also tried:
-Creating a string using the 'url' variable first, and then including it inside mysock.send(string)
-Again with the "string-first" theory, but this time I used %r to refer to my user input (so 'GET %r HTTP/1.0\n\n' % url basically)
I've read questions here and on other programming websites, the whole chapter in the book and the lectures/notes online; I've read articles on the socket library and .send(), and of course articles on GET requests... but I'm clearly missing something. It seems most people don't use the socket library when they can use urllib, and I don't blame them!!
Thank you again...
Someone from the university posted back to me that the url variable can be concatenated with the GET syntax and assigned to a string variable, which can then be passed to .send(concatenatedvariable). I had mentioned trying that, but I had missed that GET requires a space after the word 'GET', so of course my concatenation didn't include a space and that blew it. In case anyone else wants to know :)
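For what it's worth, a minimal sketch of that approach (Python 2, to match the send() calls above, which pass plain strings; the URL prompt and port are just the usual defaults, not your assignment's specifics):

import socket

url = raw_input('Enter URL: ')                 # e.g. http://example.com/page.txt
host = url.split('/')[2]                       # crude way to pull the host out of the URL

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((host, 80))

# The space after 'GET' (and before ' HTTP/1.0') is what was missing before.
request = 'GET ' + url + ' HTTP/1.0\r\n\r\n'
mysock.send(request)

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print data,

mysock.close()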
FYI: A fully qualified URL is only allowed in HTTP/1.1 requests. It is not the norm, though, as HTTP/1.1 requires setting the Host header. The relevant piece of reading would've been RFC 7230, sec. 3.1.1 and possibly RFC 3986. The syntax of the parameters is largely borrowed from the CGI format. It is in no way enforced, however. In a nutshell, everything put together would look like this on the wire:
GET /path?param1=value1&param2=value2 HTTP/1.1
Host: example.com
As a final note: The line delimiter in HTTP is CRLF (\r\n). For robustness, a simple linefeed is acceptable as well but not recommended.
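To illustrate, a quick Python sketch of sending that exact request over a socket (example.com and /path are placeholders; note the \r\n line endings and the Host header):

import socket

host = 'example.com'                  # placeholder host
request = 'GET /path?param1=value1&param2=value2 HTTP/1.1\r\n'
request += 'Host: ' + host + '\r\n'
request += 'Connection: close\r\n'    # ask the server to close when it is done
request += '\r\n'                     # blank line ends the header block

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((host, 80))
sock.send(request)
print sock.recv(4096)
sock.close()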
Is it possible to use the kdb+ HTTP client to access pages protected by a login? I am using https://github.com/KxSystems/cookbook/blob/master/yahoo.q as an example of basic GET/POST. Does anyone have an example of how to extract a cookie and use it in subsequent requests?
It is probably a bit crude, but the following will extract the headers from an HTTP response, then the cookies, and parse and return them as a dictionary:
x:"HTTP/1.0 200 OK\r\nContent-type: text/html\r\nSet-Cookie: theme=light\r\nSet-Cookie: sessionToken=abc123; Expires=Wed, 09 Jun 2021 10:18:14 GMT\r\n\r\n";
left:{(first y ss x)#y};
vs1:{{(y#x;(count[z]+y)_x)}[y;;x](first y ss x)};
headers:{{(`$x[0];x[1])} flip vs1[": "] each 1_"\r\n" vs left["\r\n\r\n"]x};
cookies:{(!). {(`$x[0];x[1])} flip vs1["="] each {x[1]#where x[0]=`$"Set-Cookie"} x};
cookies headers[x]
Whilst you might be able to get various bits and bobs from an HTTP response, the fact that you won't be able to manipulate HTTP methods means that q can't be your tool to do this - well, not without some vigorous effort.
I would use something like Beautiful Soup in conjunction with q. Soup has some great tools for handling this kind of thing (e.g. cookies etc). There are various other similar projects too.
Make a system call to the Beautiful Soup script so it makes the relevant GET/POST/PUT calls and downloads the required data:
system"/path/to/code.py"
Where the code dumps the result somewhere or puts it into kdb directly. Then do whatever you like with it.
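For example, the Python side might look roughly like this (a sketch using requests rather than Beautiful Soup, just to show the login/cookie handling; the login URL, form field names and output path are all placeholders):

import requests

session = requests.Session()                       # cookies persist across requests

# Log in; the session keeps whatever Set-Cookie headers come back
session.post('https://example.com/login',
             data={'username': 'me', 'password': 'secret'})

# Subsequent requests reuse the stored cookies automatically
resp = session.get('https://example.com/protected/data.csv')

# Dump something kdb-friendly for q to load afterwards (e.g. with 0:)
with open('/tmp/data.csv', 'w') as f:
    f.write(resp.text)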
The following code should post a form to an endpoint (which returns 302) and, after following the redirect, parse the URL of the resulting page and return some information from there.
val start = System.currentTimeMillis()
val requestHolder = WS.url(conf("login.url"))
  .withRequestTimeout(loginRequestTimeOut)
  .withFollowRedirects(true) // This appears to have no effect...
requestHolder.post(getMap(username, password))
  .map(resp => {
    Logger.debug(resp.status.toString)
    val loginResponse = getResponse(resp)
    val end = System.currentTimeMillis()
    Logger.debug("Login for the user: " + username + ", request took: " + (end - start) + " milliseconds.")
    loginResponse
  })
The problem is that .withFollowRedirects(true) appears to have no effect on the query. The status of the response is 302 and the request does not follow the redirect.
I've gone through the process manually using httpie and following the redirects does lead to the correct page.
Any help or insight would be much appreciated.
POST redirection isn't as well supported as GET redirection. The W3 specification says:
If the 301 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.
Some browsers don't do that and just ignore the rule. Also have a look at the 307 status:
307 Temporary Redirect (since HTTP/1.1)
In this case, the request should be repeated with another URI; however, future requests should still use the original URI. In contrast to how 302 was historically implemented, the request method is not allowed to be changed when reissuing the original request. For instance, a POST request should be repeated using another POST request.
There is also a discussion about this on Programmer Stack Exchange.
I've had a lot of trouble with withFollowRedirects and POST.
At some point, while fighting to make things work, I had .withFollowRedirects(false) in my code, then removed it during cleanup and things broke. My current guess is that if this option is not explicitly set to false, the default behavior is to follow redirects (302 in my case) with some faulty mechanism. Perhaps the default mechanism issues another POST with the same arguments. But in my case, interacting with Google Apps Script (GAS), one needs to use GET to retrieve the JSON output of a POST.
Whatever the mechanism was doing, I was getting 400 with no further diagnostics.
After wasting hours, I realized that .withFollowRedirects(false) was in fact truly needed: it disabled Play's handling of redirects, so I was able to see the 302 response and handle the subsequent GET manually with success.
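For what it's worth, the manual flow is easy to see outside Play as well; here it is sketched with Python's requests (the URL and form fields are placeholders, and the Play equivalent is withFollowRedirects(false) followed by a second call):

import requests

# Sketch only: disable automatic redirect handling, inspect the 302 yourself,
# then issue the follow-up GET manually. URL and form fields are placeholders.
resp = requests.post('https://example.com/login',
                     data={'user': 'me', 'password': 'secret'},
                     allow_redirects=False)

if resp.status_code in (301, 302, 303):
    follow_up = requests.get(resp.headers['Location'],
                             cookies=resp.cookies)   # carry any session cookies along
    print(follow_up.status_code, follow_up.url)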
I'm trying to copy a file located in a user's personal folder (OneDrive Pro) using the REST API. The resulting link seems to become too long (???) and the server returns 400 Bad Request: The length of the URL for this request exceeds the configured maxUrlLength value.
The URL looks like this (the HTTP verb is POST):
https://<company>-my.sharepoint.com/personal/<user>_<company>_onmicrosoft_com/_api/Web/GetFileByServerRelativeUrl('/personal/<user>_<company>_onmicrosoft_com/Documents/<Folder>/<Folder with guid-like name>/<filename>.pdf')/copyto(strnewurl='/personal/<user>_<company>_onmicrosoft_com/Documents/<Same folder>/<another guid-like name>/<same filename>.pdf',boverwrite=true)
Any help or advice on how to overcome this is highly appreciated.
Just in case anyone else faces this problem and you have the unique ID in your SP:
If the first part, which is
GetFileByServerRelativeUrl('/personal/<user>_<company>_onmicrosoft_com/Documents/<Folder>/<Folder with guid-like name>/<filename>.pdf')
is replaced with GetFileById(uniqueId), the URL stays within the limit and the copy will succeed.
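For illustration, the GetFileById form of the same request is much shorter; something along these lines (a sketch in Python with requests, where authentication/request digest handling is omitted and the site URL, GUID and target path are placeholders):

import requests

# Sketch only: auth headers/request digest are omitted, and the site URL,
# GUID and target path below are placeholders.
site = "https://<company>-my.sharepoint.com/personal/<user>_<company>_onmicrosoft_com"
unique_id = "<file UniqueId GUID>"

url = (site + "/_api/Web/GetFileById('" + unique_id + "')"
       "/copyto(strnewurl='/personal/<user>_<company>_onmicrosoft_com/Documents/"
       "<Same folder>/<another guid-like name>/<same filename>.pdf',boverwrite=true)")

resp = requests.post(url, headers={'Accept': 'application/json;odata=verbose'})
print(resp.status_code)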
Just a practical question. I need to retrieve the HTTP status code of a site as well as its IP address.
Given the fact I normally need to parse between 10k and 150k domains, I was wondering which is the most efficient method.
I've seen that urllib2.urlopen(site) attempts to download the entire file stream connected to the URL. At the same time, urllib2 doesn't offer a method to convert a hostname into an IP.
Given I'm interested only in the HEAD bit to collect information like the HTTP status code and the IP address of that specific server, what is the best way to operate?
Should I try to use only the socket module? Thanks
I think there is no one particular magic tool that will retrieve the HTTP status code of a site and the IP address.
For getting the HTTP status code, you should make a HEAD request using urllib2, httplib, or requests. Here's an example, taken from How do you send a HEAD HTTP request in Python 2?:
>>> import urllib2
>>> class HeadRequest(urllib2.Request):
...     def get_method(self):
...         return "HEAD"
...
>>> response = urllib2.urlopen(HeadRequest("http://google.com/index.html"))
An example, using requests:
>>> import requests
>>> requests.head('http://google.com').status_code
301
Also, you might want to take a look at grequests in order to speed things up when getting status codes from multiple pages.
GRequests allows you to use Requests with Gevent to make asynchronous HTTP Requests easily.
For getting an IP address, you should use socket:
socket.gethostbyname_ex('google.com')
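Putting the two together, a rough sketch of checking a single domain (Python 2 again, to match the urllib2 example above; error handling kept minimal):

import httplib
import socket

def check(domain):
    ip = socket.gethostbyname(domain)                 # resolve the hostname to an IP
    conn = httplib.HTTPConnection(domain, 80, timeout=10)
    conn.request('HEAD', '/')                         # headers only, no body download
    status = conn.getresponse().status
    conn.close()
    return status, ip

print check('google.com')                             # e.g. (301, '<some Google IP>')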
Also see these threads:
How do you send a HEAD HTTP request in Python 2?
How to resolve DNS in Python?
How do I get a website's IP address using Python 3.x?
Hope that helps.