wget "awaiting response..." forever. Works fine for other sites or from another server - server

I have spent hours and hours on this. I have contacted my server provider, who assures me they cannot be blocking me. I have contacted multiple people at NOAA and have been assured I am not blocked. But about two weeks ago a script that had been working for 5 years quit working. The script works from home but not from 1and1.

I see differences in what wget does but cannot see anything wrong. The main difference is that one server translates the encoding to UTF-8 while the one that works does not. I do not think that is the problem, since I am not getting a file-not-found error and I see the same behavior with sites that work fine. I am pasting the exact same command into both servers, so there is no difference in what I run. I get no error message, just waiting for a response.
This diagram is a summary of my tests. It does not point to an obvious source of the problem, so I am asking the experts here. Local and 1and1 are the servers; ndbc and noaa are the data sources. As you can see, the downloads only fail from 1and1, and 1and1 only fails with one data source. By "fail" I mean connected but no response.
Here is what I see on my home server.
Everything is correct and the file is saved.
If I paste the exact same command into my 1and1 server, I get this:
No matter how long I wait, no response.
The interesting thing is that if I ask for an invalid file, I correctly get an ERROR 404, so I know they are responding to me. NDBC says they can see my traffic and that they are not blocking me.
And finally, to show that wget works on other domains, here is a fetch of the NOAA home page:
Does anyone know what is going wrong, whether it is something I am doing, something ndbc.noaa.gov might be doing, or something wrong at 1and1, so I can get a handle on where the problem is?
I have implemented a kludge of a workaround: downloading the data to my myth server and using sftp to upload it to the 1and1 server. But this is not a long-term fix.
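A minimal sketch of that kind of relay, for the record (shown in Python purely for concreteness; a shell script run from cron does the same job, and the data URL, paths, and host name below are placeholders):

#!/usr/bin/env python
# Fetch on the machine that CAN reach ndbc.noaa.gov, then push the file to the
# 1and1 web space. All names below are placeholders, not a real setup.
import subprocess

DATA_URL = "http://www.ndbc.noaa.gov/data/example.txt"   # placeholder data file
LOCAL_FILE = "/tmp/ndbc_data.txt"
REMOTE = "user@example-1and1-host:webspace/data/"        # placeholder account and path

# Download on the home/myth server, which still gets a response from ndbc.
subprocess.check_call(["wget", "-O", LOCAL_FILE, DATA_URL])

# Copy the result up to 1and1 (scp here; sftp in batch mode works the same way).
subprocess.check_call(["scp", LOCAL_FILE, REMOTE])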
By the way, I see the same thing if I try using lynx to access www.ndbc.noaa.gov: I can download their home page from my myth server but not from the 1and1 server. I can use lynx to get the home page of www.noaa.gov from either server.

Related

Twython 401 on stream, but REST works fine

NO KIDDING. My code was working yesterday
So I'm writing a script to get streaming data using Twython, but today it's giving me 401 errors, even with the example code.
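The streaming part is essentially Twython's stock TwythonStreamer pattern; a minimal sketch along those lines (not necessarily the exact code that was posted, and it assumes the same four OAuth credential variables as the REST snippet below):

from twython import TwythonStreamer

class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        if 'text' in data:
            print data['text'].encode('utf-8')

    def on_error(self, status_code, data):
        # This is where the 401 shows up.
        print status_code

stream = MyStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
stream.statuses.filter(track='twitter')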
I tried new keys, and a new app, but it still gives me 401.
However, the normal REST API interface in Twython (see the next code block) works just fine:
from twython import Twython

twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
print twitter.get_home_timeline()
Some Google-fu led me to this SO thread, which suggested running ntpd -q, but that doesn't solve my issue!
This affects only some of the servers behind stream.twitter.com. Perhaps their clocks are off.
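If you want to rule the local clock in or out, a rough check is to compare your time against the Date header they send back (a small sketch; any of their HTTPS endpoints that returns a Date header will do):

import email.utils
import time
import urllib2

# A badly skewed clock breaks OAuth signing and produces 401s.
resp = urllib2.urlopen('https://twitter.com/')
server_ts = email.utils.mktime_tz(email.utils.parsedate_tz(resp.info()['Date']))
print 'clock skew in seconds:', abs(server_ts - time.time())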
Some kind of geo-aware load balancing appears to be in use, and the domain name resolves to a number of different IP addresses depending on where you are doing the resolving.
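You can see which addresses your resolver is handing back with a couple of lines (nothing Twitter-specific here):

import socket

# Prints every A record returned for the streaming endpoint from this machine.
hostname, aliases, addresses = socket.gethostbyname_ex('stream.twitter.com')
print addresses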
Connecting to 199.59.148.229 or 199.59.148.138 gives the 401 error for me (these are what I get resolving from AWS Oregon), but 199.16.156.20 works (this is what I get resolving from my home network).
Therefore, my temporary workaround for this was to add this to /etc/hosts:
199.16.156.20 stream.twitter.com
I expect the original problem to go away in a few hours, but this works for me, for now.

ColdFusion and REST

According to this blog entry, REST is available in ColdFusion.
However, I have multiple sites, so when I navigate to
localhost/rest/Example/hello
I get "Hello World", but if I go to
http://mysite.com/rest/Example/hello
I get HTTP Error 500.0 - Internal Server Error
Requested URL http://mysite.com:80/jakarta/isapi_redirect.dll
If I go to the IP address, I get "Hello World".
Aaron posted a comment referring to bug 3348765, but I'm not sure that helps me get this first Proof-Of-Concept working.
Q: How do I get REST to work in ColdFusion if I have multiple sites defined in IIS?
Have a look at this tutorial: http://blogs.coldfusion.com/post.cfm/rest-support-in-coldfusion-part-i
Apparently, mysite.com is a very unlucky domain name, since lots of people try to use it for learning or testing purposes. It's just an ordinary website, and most likely it doesn't host a ColdFusion 10 REST web service.
Your problem with localhost seems to be a web-server issue. If you open http://localhost, do you see the same website as at http://127.0.0.1? These could easily be two different virtual hosts (websites, in IIS terms).
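One quick way to compare the two, purely as an illustration (Python here, but any HTTP client or just a browser will do):

import urllib2

# If the sizes or first bytes differ, localhost and 127.0.0.1 are two different IIS sites.
for url in ('http://localhost/', 'http://127.0.0.1/'):
    body = urllib2.urlopen(url).read()
    print url, len(body), repr(body[:60])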
The solution would be to check how your ColdFusion is hooked into your IIS. An even better solution would be to set up a dedicated virtual host for playing with code; for example, I've got local virtual hosts like localhost.coldfusion, localhost.railo, etc.
Hope this helps.

Recreate a site from a tcpdump?

It's a long story, but I am trying to save an internal website from the pointy-haired bosses who see no value in it anymore and will be flicking the switch at some point in the future. I feel the information it contains is important and future generations will want to use it. No, it's not some adult site, but since it's some big corp, I can't say any more.
The problem is, the site is a mess of ASP and Flash that only works under IE7, is buggy under IE8, and is 32-bit only, even. All the URLs are session-style gibberish. The Flash objects themselves pull extra information with GET requests to ASP endpoints. It's really poorly designed for scraping. :)
So my idea is to run a tcpdump as I navigate the entire site, then somehow dump the result of every GET into an SQL database. Then, with a little messing with the hosts file, redirect every request to some CGI script that looks for a matching GET request in the database and returns the data. That way the entire site ends up in an SQL database as URL/data key pairs. A flat file may also work.
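A minimal sketch of the replay half of that idea (Python here for concreteness, though Perl with CGI and DBI would work the same way; the database and table names are made up):

import sqlite3
import BaseHTTPServer

# Expects a table pages(url TEXT PRIMARY KEY, body BLOB) filled in during capture.
db = sqlite3.connect('site_capture.db')

class ReplayHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path includes the query string, so session-style GETs match too.
        row = db.execute('SELECT body FROM pages WHERE url = ?', (self.path,)).fetchone()
        if row is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.end_headers()
        self.wfile.write(str(row[0]))

# Port 80 so the hosts-file trick works unchanged (needs privileges).
BaseHTTPServer.HTTPServer(('', 80), ReplayHandler).serve_forever()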
In theory, I think this is the only way to go about it. The only problem I see is if they do some client-side ActiveX/Flash stuff that generates session URLs that are different each time.
Anyway, I know Perl, and the idea seems simple with the right modules, so I think I can do most of the work in that, but I am open to any other ideas before I get started. Maybe this exists already?
Thanks for any input.
To capture the site I wouldn't use tcpdump, but either the crawler itself or a web proxy that can be tweaked to save everything, e.g. Fiddler, Squid, or mod_proxy.
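The capture half can be just as small; a rough sketch (Python again to match the replay sketch above; Perl with LWP would look much the same, and the host and paths are placeholders):

import sqlite3
import urllib2

db = sqlite3.connect('site_capture.db')
db.execute('CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body BLOB)')

def capture(path):
    # Fetch one URL from the doomed site and stash it under its path plus query string.
    body = urllib2.urlopen('http://intranet.example.com' + path).read()
    db.execute('INSERT OR REPLACE INTO pages VALUES (?, ?)', (path, sqlite3.Binary(body)))
    db.commit()

for path in ['/default.asp?session=abc123', '/flashdata.asp?id=1']:   # placeholder paths
    capture(path)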

Perl application move causing my head to explode...please help

I'm attempting to move a web app we have (written in Perl) from an IIS6 server to an IIS7.5 server.
Everything seems to be parsing correctly, I'm just having some issues getting the app to actually work.
The app is basically a couple of forms. You fill the first one out, click submit, and it presents you with another form based on which checkboxes you selected (using includes and such).
I can get past the first form once... but after that it stops working and pops up the generated error message. Looking into the code, the message basically states that no checkboxes were selected.
I know the app writes data into .dat files (at what point, I'm not sure yet), but I don't see those being created. I've looked at file/directory permissions, and seemingly I have MORE permissions on the new server than I did on the last one. The user/group for the files/dirs are different, though...
Would that have anything to do with it? Why would it pass me on to the next form, displaying the correct "modules" I checked, the first time but not any time after that? (It seems to reset itself after a while.)
I know this is complicated so if you have any questions for me, please ask and I'll answer to the best of my ability :).
Btw, total idiot when it comes to Perl.
EDIT AGAIN
I've removed the source as to not reveal any security vulnerabilities... Thanks for pointing that out.
I'm not sure what else to do to show exactly what's going on with this though :(.
I'd recommend verifying, step by step, that what you think is happening is really happening. Start by watching the HTTP request from your browser to the web server - are the arguments your second perl script expects actually being passed to the server? If not, you'll need to fix the first script.
(start edit)
There's lots of tools to watch the network traffic.
Wireshark will read the traffic as it passes over the network (you can run it on the sending or receiving system, or any system on the collision domain).
You can use a proxy server, like WebScarab (free), Burp, Paros, etc. You'll have to configure your browser to send traffic to the proxy server, which will then forward the requests to the real server. These particular proxies are intended to aid testing, in that you'll be able to mess with the requests as they go by (and much more).
As Sinan indicates, you can use browser addons like Fx LiveHttpHeaders, or Tamper Data, or Internet Explorer's developer kit (IIRC)
(end edit)
Next, you should print out all CGI arguments that the second perl script receives. That way, you'll know what the script really thinks it gets.
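For illustration, a tiny CGI script that does nothing but echo its parameters back (sketched in Python; in Perl, CGI.pm's Dump() method gives you much the same thing):

import cgi

# Dump every form field the web server actually handed to this script.
print "Content-Type: text/plain"
print

form = cgi.FieldStorage()
for name in form.keys():
    print name, "=", form.getlist(name)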
Then, you can enable verbose logging in IIS, so that it logs the full HTTP request.
This will get you closer to the source of the problem - you'll know whether it's (a) the first script not creating correct HTML, resulting in an incomplete HTTP request from the browser, (b) the IIS server not receiving the CGI arguments for some odd reason, or (c) the arguments not making it from the IIS server into the Perl script (or, possibly, the Perl script not correctly accessing them).
Good luck!
What you need to do is clear.
There is a lot of weird excess baggage in the script. There seemed to be no subroutines, just one long series of commands with global variables.
It is time to start refactoring.
Get one thing running at a time.
I saw HTML::Template in there, but you still had raw HTML mixed in with the code. Separate code from presentation.

Why am I getting this WSDL SOAP error with authorize.net?

I have my script email me when there is a problem creating a recurring transaction with authorize.net. I received the following at 5:23AM Pacific time:
SOAP-ERROR: Parsing
WSDL: Couldn't load from 'https://api.authorize.net/soap/v1/service.asmx?wsdl' :
failed to load external entity "https://api.authorize.net/soap/v1/service.asmx?wsdl"
And of course, when I did exactly the same thing that the user did, it worked fine for me.
Does this mean authorize.net's API is down? Their knowledge base simply sucks and provides no information whatsoever about this problem. I've contacted the company, but I'm not holding my breath for a response. Google reveals nothing. Looking through their code, nothing stands out. Maybe an authentication error?
Has anyone seen an error like this before? What causes this?
Maybe as a backup you can cache the WSDL file locally and, in case of network issues, use the local copy. I doubt it changes often, so refreshing it weekly should be satisfactory; the file will probably not be stale by then.
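The pattern is just download-with-fallback; a rough sketch (Python for illustration, with a placeholder cache path; the same idea applies to whatever SOAP client the script uses):

import urllib2

WSDL_URL = 'https://api.authorize.net/soap/v1/service.asmx?wsdl'
CACHE_PATH = '/var/cache/myapp/authorize_net.wsdl'     # placeholder location

def fresh_wsdl():
    # Refresh the cached WSDL when the live copy is reachable; otherwise keep the old one.
    try:
        data = urllib2.urlopen(WSDL_URL, timeout=10).read()
        with open(CACHE_PATH, 'wb') as f:
            f.write(data)
    except (urllib2.URLError, IOError):
        pass                      # network hiccup: fall back to the cached copy
    return CACHE_PATH             # point the SOAP client at the local file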