Just a practical question. I need to retrieve the HTTP status code of a site as well as its IP address.
Given that I normally need to parse between 10k and 150k domains, I was wondering which method is the most efficient.
I've seen that urllib2.urlopen(site) attempts to download the entire file stream behind the URL. At the same time, urllib2 doesn't offer a method to convert a hostname into an IP.
Given that I'm interested only in the HEAD part, to collect information like the HTTP status code and the IP address of that specific server, what is the best way to proceed?
Should I try to use only the socket? Thanks
I think there is no one particular magic tool that will retrieve the HTTP status code of a site and the IP address.
For getting HTTP status code you should make a HEAD request using urllib2 or httplib or requests. Here's an example, taken from How do you send a HEAD HTTP request in Python 2?:
>>> import urllib2
>>> class HeadRequest(urllib2.Request):
...     def get_method(self):
...         return "HEAD"
...
>>> response = urllib2.urlopen(HeadRequest("http://google.com/index.html"))
An example, using requests:
>>> import requests
>>> requests.head('http://google.com').status_code
301
Also, you might want to take a look at grequests to speed up fetching status codes from multiple pages.
GRequests allows you to use Requests with Gevent to make asynchronous
HTTP Requests easily.
For getting an IP address, you should use socket:
>>> import socket
>>> socket.gethostbyname_ex('google.com')
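For what it's worth, here is a minimal sketch that combines the two steps above and fans the work out over a thread pool, which tends to matter at 10k to 150k domains. It uses only the Python 3 standard library (http.client rather than urllib2/httplib). The helper names resolve_ip, head_status, check and check_many, the worker count and the timeout are my own assumptions, so treat it as a starting point rather than a tuned solution.

```python
import http.client
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve_ip(hostname):
    """Return one IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def head_status(hostname, port=80, timeout=5.0):
    """Send a HEAD request for "/" and return the status code, or None on failure.

    Redirects are not followed, so a 301/302 is reported as-is.
    """
    conn = http.client.HTTPConnection(hostname, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def check(hostname):
    """Return (hostname, ip, status) for a single domain."""
    return hostname, resolve_ip(hostname), head_status(hostname)


def check_many(hostnames, workers=50):
    """Check many domains concurrently; workers=50 is an arbitrary guess."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check, hostnames))
```

Calling check_many(["google.com", "example.com"]) would return a list of (hostname, ip, status) tuples in the same order as the input; failed lookups or requests show up as None.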
Also see these threads:
How do you send a HEAD HTTP request in Python 2?
How to resolve DNS in Python?
How do I get a website's IP address using Python 3.x?
Hope that helps.
I have 2 independent python 2 applications running in the same linux (ubuntu) computer.
I want to send messages from one to another (bidirectional) and receives these messages inside a callback function.
Is it possible? Do you have any example as reference?
Thanks
There are different options available for communicating between python apps.
A simple one would be to use an API based on HTTP. Each application exposes a specific port, and communication takes place by exchanging HTTP requests.
There are several frameworks that allow you to build this in a few steps. For example, using Bottle:
In app1:
from bottle import route, run, request

@route('/action_1', method='POST')
def action_1_handler():
    data = request.json
    print(str(data))
    # Do something with data
    return {'success': True, 'data': {'some_data': 1}}

run(host='localhost', port=8080)
In app2:
import requests
r = requests.post("http://localhost:8080/action_1", json={'v1': 123, 'v2': 'foo'})
print r.status_code
# 200
data = r.json()
# {u'data': {u'some_data': 1}, u'success': True}
Note that if the action executed in app1 after receiving the HTTP request takes a long time, this could result in a timeout error. In that case, consider running the action in another thread or using an alternative communication protocol (e.g. sockets or the ZeroMQ messaging library).
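If you go the socket route mentioned in the note above, a rough sketch of callback-style, bidirectional messaging could look like this. It uses multiprocessing.connection from the standard library, which sends pickled Python objects over a socket. The function names receive_loop and start_receiver are hypothetical, and the code is written for Python 3, although the same module also exists in Python 2.

```python
import threading
from multiprocessing.connection import Client, Listener


def receive_loop(conn, callback):
    """Call callback(message) for every message until the peer disconnects."""
    try:
        while True:
            callback(conn.recv())
    except (EOFError, OSError):
        # The other side closed the connection.
        pass


def start_receiver(conn, callback):
    """Run receive_loop in a background thread so the caller can keep sending."""
    thread = threading.Thread(target=receive_loop, args=(conn, callback))
    thread.daemon = True
    thread.start()
    return thread
```

In app1 you would create listener = Listener(("localhost", 6000)), take conn = listener.accept(), and call start_receiver(conn, on_message). In app2 you would create conn = Client(("localhost", 6000)) and call start_receiver the same way. Either side can then call conn.send(obj) at any time. Port 6000 and on_message are placeholders, and because the messages are pickled, this should only be used between processes you trust.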
Some related reads:
Basic Python client socket example
Communication between two python scripts
https://www.digitalocean.com/community/tutorials/how-to-work-with-the-zeromq-messaging-library
Help me track the status of a specific port: "LISTENING", "CLOSE_WAIT", "ESTABLISHED".
I have an equivalent solution using the netstat command:
local command = 'netstat -anp tcp | find ":1926 " '
local h = io.popen(command,"rb")
local result = h:read("*a")
h:close()
print(result)
if result:find("ESTABLISHED") then
    print("Ok")
end
But I need to do the same with the Lua socket library.
Is it possible?
Like @Peter said, netstat uses the proc file system to gather network information, particularly port bindings. LuaSocket has its own methods for retrieving connection information. For example:
Listening
You can use master:listen(backlog), which specifies that the socket is willing to receive connections, transforming the object into a server object. Server objects support the accept, getsockname, setoption, settimeout, and close methods. The parameter backlog specifies the number of client connections that can be queued waiting for service. If the queue is full and another client attempts connection, the connection is refused. In case of success, the method returns 1. In case of error, the method returns nil followed by an error message.
The following methods will return a string with the local IP address and a number with the port. In case of error, the method returns nil.
master:getsockname()
client:getsockname()
server:getsockname()
There also exists this method:
client:getpeername()
This returns a string with the IP address of the peer, followed by the port number that the peer is using for the connection. In case of error, the method returns nil.
For "CLOSE_WAIT", "ESTABLISHED", or other connection information you want to retrieve, please read the Official Documentation. It has everything you need with concise explanations of methods.
You can't query the status of a socket owned by another process using the sockets API, which is what LuaSocket uses under the covers.
In order to access information about another process, you need to query the OS instead. Assuming you are on Linux, this usually means looking at the proc filesystem.
I'm not hugely familiar with Lua, but a quick Google gives me this project: https://github.com/Wiladams/lj2procfs. I think this is probably what you need, assuming they have written a decoder for the relevant /proc/net files you need.
As for which file? If it's just the status, I think you want the tcp file as covered in http://www.onlamp.com/pub/a/linux/2000/11/16/LinuxAdmin.html
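To make the /proc idea concrete, here is a small sketch in Python (not Lua) of how the tcp file can be decoded. It assumes Linux and the usual /proc/net/tcp layout, where the second column is the local address as hex IP:port and the fourth column is the state as a hex code. The function names are made up for illustration, and only IPv4 sockets are covered (IPv6 ones live in /proc/net/tcp6).

```python
# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def port_states(proc_net_tcp_text, port):
    """Return the state name of every socket whose local port equals port."""
    states = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] looks like "0100007F:0786" (hex address, hex port).
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port:
            states.append(TCP_STATES.get(fields[3], fields[3]))
    return states


def read_port_states(port, path="/proc/net/tcp"):
    """Read the proc file and report the states seen for the given local port."""
    with open(path) as handle:
        return port_states(handle.read(), port)
```

With this, "ESTABLISHED" in read_port_states(1926) would play the same role as the result:find("ESTABLISHED") check in the netstat version. The same parsing could be ported to Lua with io.open and string patterns.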
Let's say I have a bunch of clients who all have their own numeric IDs. Each of them connects to my server through SockJS, with something like:
var sock = new SockJS("localhost:8080/sock/100");
In this case, 100 is that client's numeric ID, but it could be any number with any number of digits. How can I set up a SockJS router in my server-side code that allows for the client to set up a SockJS connection through a URL that varies based on what the user's ID is? Here's a simplified version of what I have on the server-side right now:
public void start() {
    HttpServer server = vertx.createHttpServer();
    SockJSHandler sockHandler = SockJSHandler.create(vertx);
    router.route("/sock/*").handler(sockHandler);
    server.requestHandler(router::accept).listen(8080);
}
This works fine if the client connects through localhost:8080/sock, but it doesn't seem to work if I add "/100" to the end of the URL. Instead of getting the default "Welcome to SockJS!" message, I just get "Not Found." I tried setting a path regex and I got an error saying that sub-routers can't use pattern URLs. So is there some way to allow for the client to connect through a variable URL, whether it's /sock/100, /sock/15, or /sock/1123123?
Ideally, I'd be able to capture the numeric ID that the client uses (like with routing REST API calls, when you could add "/:ID" to the routing path and then capture the value that the client uses), but I can't find anything that works for SockJS connections.
Since it seems that SockJS connections are considered to be the same as sub-routers, and sub-routers can't have pattern URLs, is there some work-around for this? Or is it not possible?
Edit
Just to add to what I said above, I've tried a couple different things which haven't seemed to work yet.
I tried setting up an initial, generic main router, which then re-directs to the SockJS handler. Here's the idea I had:
router.routeWithRegex("/sock/\\d+").handler(context -> {
    context.reroute("/final");
});
router.route("/final").handler(SockJSHandler.create(vertx));
With this, if I access localhost:8080/sock/100 directly through the browser, it takes me to the "Welcome to SockJS!" page, and the Chrome network tab shows that a websocket connection has been created when I test it through my client.
However, I still get an error because the websocket shows a 200 status code rather than 101, and I'm not 100% sure as to why that is happening, but I would guess that it has to do with the response that the initial handler produces. If I try to set the initial handler's status code to 101, I still get an error, because then the initial handler fails.
If there's some way to work around these status codes (it seems like the websocket is expecting 101 but the initial handler is expecting 200, and I think I can only pick one), then that could potentially solve this. Any ideas?
I am looking into the Swift Vapor framework.
I am trying to create a controller class that maps data obtained over an SSL link to a third-party system (an Asterisk PBX server) into a response body that is sent down to the client over some time.
So I need to send received text lines (obtained separately on the SSL connection) as they get in, without waiting for a 'complete response' to be constructed.
Seeing this example:
return Response(status: .ok) { chunker in
    for name in ["joe\n", "pam\n", "cheryl\n"] {
        sleep(1)
        try chunker.send(name)
    }
    try chunker.close()
}
I thought it might be the way to go.
But what I see when connecting to the Vapor server is that the REST call waits for the loop to complete before the three lines are received as the result.
How can I get try chunker.send(name) to send its characters back to the client without first waiting for the loop to complete?
In the real code the controller method can potentially keep an HTTP connection to the client open for a long time, sending Asterisk activity data to the client as soon as it is obtained. So each .send(name) should pass its data to the client immediately, without waiting for the final .close() call.
Adding a try chunker.flush() did not produce any better result.
HTTP requests aren't really designed to work like that. Different browsers and clients will function differently depending on their implementations.
For instance, if you connect with telnet to the chunker example you pasted, you will see the data is sent every second. But Safari on the other hand will wait for the entire response before displaying.
If you want to send chunked data like this reliably, you should use a protocol like WebSockets that is designed for it.
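If you want to check what the server actually does without a browser's buffering in the way, a small client that timestamps each line as it arrives can help. This is only a sketch in Python using the standard http.client module; the function name timed_lines and its parameters are made up for illustration, and it assumes the endpoint sends newline-terminated text over plain HTTP.

```python
import http.client
import time


def timed_lines(host, port, path="/", timeout=10.0):
    """GET path and return (seconds_since_start, line) for each line as it arrives."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    start = time.monotonic()
    arrivals = []
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        while True:
            # readline() returns as soon as a full line has been received;
            # http.client decodes chunked transfer encoding transparently.
            line = response.readline()
            if not line:
                break
            elapsed = round(time.monotonic() - start, 2)
            arrivals.append((elapsed, line.decode("utf-8", "replace").rstrip("\n")))
    finally:
        conn.close()
    return arrivals
```

If the timestamps come out roughly one second apart, the server is streaming and the delay is on the client side; if they are all bunched together at the end, the response is being buffered somewhere before it reaches the client.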
If I open up a port on our server to listen for incoming messages (i.e. strings of text), how can I send those messages? Are there any code examples out there that are useful?
I have downloaded a few sample projects but none seem to do what I want.
What code is used to specify the IP address and port number?
Here you will find exactly what you need.
It is a wrapper class called AsyncSocket; to connect, you just send it a string with the IP and an int with the port.
It will do all the work on its own.
To send info you just turn your string into NSData if I remember correctly.
It has a very easy to follow sample so you should be up and running in about 20 min.
Look up any tutorial text on networking.
For instance, there are a bunch of echo server implementations at rosettacode.
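Whatever language or wrapper you end up with, the basic shape is the same: open a TCP connection to an (IP, port) pair, encode the string to bytes, and write it. As a language-neutral illustration, here is a minimal sketch in Python; the function name send_text and the choice of UTF-8 are my own assumptions, and your server may expect a different encoding or a line terminator.

```python
import socket


def send_text(host, port, message, timeout=5.0):
    """Open a TCP connection to host:port and send message as UTF-8 bytes."""
    # create_connection takes the address as a (host, port) tuple.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))
```

For example, send_text("192.168.1.10", 5000, "hello\n") would send one line to a server listening on that address; both the address and the port here are placeholders.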