GWT-RPC and the infamous sporadic "StatusCodeException: 0" exception revisited

My problem is the infamous "StatusCodeException: 0" error, which happens with GWT 2.6.1 when the page is accessed via the subdomain https://sub.site.com/.
It happens quite sporadically for one customer using IE11, and I can't reproduce it from several distinct computers using IE11, IE10, IE9 or IE8 (let alone Chrome or Firefox).
Accessing exactly the same webapp from https://site.com/ seems to work fine for that customer.
This naturally led me to the conclusion that I have a problem with the Same-Origin Policy.
What is strange, though, is that my webapp is designed so that no cross-domain or cross-subdomain requests are made; the same goes for cross-protocol and cross-port requests. In other words, the Same-Origin Policy is not violated in this situation. As confirmation, I can offer the following evidence:
While at the customer's site I saw how this reproduces: the customer starts using the application and everything works fine - all requests return responses normally. Then, after several minutes of work, exactly the same requests on the same page (without reloads) start to fail with StatusCodeException: 0.
Basically, both https://sub.site.com and https://site.com point to the same IP, and there is a single Tomcat webapp serving exactly the same resources for both domains.
Further evidence is the codebase of the single GWT module itself: there I use only one instance of one service, DashboardService:
public class DashboardModule implements EntryPoint, IDashboardModule {

    private final DashboardServiceAsync dashboardService = createDashboardService();

    @Override
    public void onModuleLoad() {
        // loading of module elements
        // dashboardService is passed as a parameter so only one instance is used
    }

    /**
     * PLEASE SEE QUESTION #1 BELOW CODE SNIPPET
     */
    private static final String DASHBOARD_REQUEST_URL = "request";

    private static DashboardServiceAsync createDashboardService() {
        final DashboardServiceAsync service = GWT.create(DashboardService.class);
        ((ServiceDefTarget) service).setServiceEntryPoint(DASHBOARD_REQUEST_URL);
        return service;
    }
}
=================================== EDIT ====================================
After looking at the console at the customer's location, the error was always the following:
SCRIPT7002: XmlHttpRequest: network error 0x2ee4, ...
so it seems that this has nothing to do with the Same-Origin Policy, because per this article error 0x2ee4 is described as ERROR_INTERNET_INTERNAL_ERROR: "An internal error has occurred."
It's a pity, but I've found only two mentions of this error, neither of which was resolved:
Error under IE10 and Error under IE11.
My assumption is that the customer is very probably accessing the site through some proxy which slightly alters the requests in a way IE can't handle.
Question 1: does anybody know how I can simulate or reproduce the mentioned error locally?
Question 2: does anybody know how this problem can be gracefully worked around?
Question 3: is it OK to simply retry the request, or might the request have already reached the server and modified its state, so that retrying it would produce a duplicate modification?
I will try to set up a forwarding proxy to simulate the possible customer setup and at least reproduce the mentioned error...
I greatly appreciate any help!

OK, so after wrestling with this problem for a workweek I finally managed to solve it.
Actually, I was able to reproduce a very similar problem locally once I installed an Apache2 server in front of Tomcat and accessed it from another VirtualBox Win7 host with IE11. This gave me a sporadic StatusCodeException: 0 with network error 0x2ef3 instead, but the behaviour was very similar: GWT-RPC requests started to fail after a minute or so. This was reproducible in IE10 and IE11 but worked fine in IE8 and IE9 :) (is IE getting worse with each new version?)
Locally I was able to fix the problem by simply disabling keep-alive for IE browsers, adding the following line to the /etc/apache2/sites-enabled/default-ssl.conf Apache2 SSL configuration file:
# following line was added
BrowserMatch "Trident" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0
</VirtualHost>
</IfModule>
This basically tells Apache2 not to use keep-alive, to use special SSL shutdown handling, and to generally downgrade to the HTTP/1.0 standard whenever the request's user-agent string contains the word Trident (which matches IE10, IE11 and possibly earlier IEs).
This added the Connection: close HTTP header to each response and seemed to work fine locally.
At the customer's site, however, this still wasn't working and produced the same network error 0x2ee4.
It may be worth noting that the customer was using McAfee Web Gateway as a forwarding proxy sitting in the middle of the browser <-> server communication.
Long story short, I found out that the problem was the following: when the page loads, multiple GET requests are sent to the server to fetch the page, resources, etc. Then, after about 10 seconds of use (my webapp is a single-page application, so a user may spend more than 10 minutes on the same page), only GWT-RPC requests are made to the server, and those are POST requests. And after about a minute of using the page (I suspect 1 minute = the keep-alive timeout of the proxy server), these POST requests start to randomly fail with the 0x2ee4 network error.
After I implemented GWT-RPC retry functionality, I found out that after 30 seconds of retries simply ALL GWT-RPC requests fail with the above error. Refreshing the page solved the problem again for a minute or so, and then the same story repeated.
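For reference, the retry wrapper boiled down to roughly the following - a minimal sketch rather than the exact code from the linked Q&A; the class name RetryingCallback and the MAX_RETRIES limit are just illustrative. (Regarding Question 3 above: whether a blind retry is safe depends on whether the call is idempotent on the server side.)

import com.google.gwt.user.client.rpc.AsyncCallback;
import com.google.gwt.user.client.rpc.StatusCodeException;

public abstract class RetryingCallback<T> implements AsyncCallback<T> {

    private static final int MAX_RETRIES = 5;
    private int attempts = 0;

    /** Re-issues the original RPC call, passing this same callback again. */
    protected abstract void retry();

    /** Called when retries are exhausted or the failure is not retryable. */
    protected abstract void handleFailure(Throwable caught);

    @Override
    public void onFailure(Throwable caught) {
        boolean statusCodeZero = caught instanceof StatusCodeException
                && ((StatusCodeException) caught).getStatusCode() == 0;
        if (statusCodeZero && ++attempts <= MAX_RETRIES) {
            retry(); // transparently repeat the same request
        } else {
            handleFailure(caught);
        }
    }
}

Usage looks like this (getData and DashboardData are placeholders for your own service method and type):

dashboardService.getData(new RetryingCallback<DashboardData>() {
    @Override protected void retry() { dashboardService.getData(this); }
    @Override protected void handleFailure(Throwable caught) { /* report the error */ }
    @Override public void onSuccess(DashboardData result) { /* render the dashboard */ }
});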
So, I figured out that IE10 and IE11 handle the combination of SSL, keep-alive and POST requests incorrectly. It seems they simply can't renew a keep-alive SSL connection with a POST request and only do so with a GET request.
Please note that Chrome, Firefox and other browsers handle this situation quite well. When inspecting how Firefox behaves in this situation with Firebug, it can be clearly seen that the POST request is made, shows as aborted for about 0.5 s, and then shows as successful (I suspect that Firefox detects this specific situation, makes a GET request to the server itself to renew the SSL keep-alive connection, and then retries the POST request).
So, to fix the problem in IE I simply implemented functionality which "pings" the server with a GET request every 5 seconds (be ready to experiment with this interval, since it is most probably tied to the customer's proxy keep-alive timeout); see the sketch below.
This made it work (please note that the above Apache2 configuration hack is not needed in this case).
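A minimal sketch of that ping for a GWT client; the "ping" path, the KeepAlivePinger class name and the 5-second interval are placeholders to adapt to your own setup:

import com.google.gwt.core.client.GWT;
import com.google.gwt.http.client.Request;
import com.google.gwt.http.client.RequestBuilder;
import com.google.gwt.http.client.RequestCallback;
import com.google.gwt.http.client.RequestException;
import com.google.gwt.http.client.Response;
import com.google.gwt.user.client.Timer;

public class KeepAlivePinger {

    private static final int PING_INTERVAL_MS = 5000; // experiment with this value

    private final Timer timer = new Timer() {
        @Override
        public void run() {
            ping();
        }
    };

    public void start() {
        timer.scheduleRepeating(PING_INTERVAL_MS);
    }

    private void ping() {
        // Plain GET request; the response body is ignored - the point is only
        // to keep the SSL keep-alive connection "warm" for IE.
        RequestBuilder builder = new RequestBuilder(RequestBuilder.GET, GWT.getModuleBaseURL() + "ping");
        try {
            builder.sendRequest(null, new RequestCallback() {
                @Override
                public void onResponseReceived(Request request, Response response) {
                    // nothing to do
                }
                @Override
                public void onError(Request request, Throwable exception) {
                    // ignore - the next ping will try again
                }
            });
        } catch (RequestException e) {
            // ignore - the next ping will try again
        }
    }
}

Starting it from onModuleLoad() is enough; any server-side endpoint that answers a GET with 200 will do.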
I really hope this will help people with a similar issue and save them some time.
Resources used:
IE Network Error 0x2ef3 question 1
IE Network Error 0x2ef3 question 2
IE Network Error 0x2ef3 question 3
Awesome q&a on how to implement transparent GWT-RPC retry functionality
P.S. Will I report this IE10 and IE11 issue to Microsoft? Honestly, I'm not eager to spend another 30+ minutes of my time reporting a bug in a commercial browser after I've already spent more than a week tracking the problem down.
I keep recommending Chrome, Firefox or another modern browser to customers as a viable alternative, and I still think that IE11 is not well suited for modern AJAX-heavy websites.

Related

Body params disappear on some POST requests routes

I have some REST and GraphQL services written in Ruby and TypeScript (with the NestJS framework).
I've noticed that, starting a couple of days ago, some POST requests fail due to a params validation error.
After further investigation, it seems that on some requests the body params sporadically get dropped somewhere on the network. It happens for a small share of requests (less than 1%).
The issue started without any changes being deployed on the server or client side. My clients are iOS and Android apps, and it happens on both platforms.
I've tried to find the exact point on the network where the body gets dropped, with no success. I've also tried to find solutions to similar issues on the net.
Does anyone have any idea what it could be? I haven't found any relevant information about similar issues.
Thanks!

Why does my Github webhook keep timing out?

We couldn’t deliver this payload: Service Timeout
I was successfully sending webhooks to my server 5 minutes ago, and now I just keep getting timeouts. I tried deleting the webhook and re-adding it, and changing the URL it points to, but nothing helped.
Am I flooding it with too many pushes, or is GitHub's webhook service just down?
It turns out that GitHub has a 10-second timeout set on their webhooks; that is what I ran into. See the documentation here.
Unless there is some kind of error on the GitHub side (which doesn't seem to be the case at the moment, given their "System Status" history), you might check the program receiving the payload of that webhook.
See a similar problem in Supybot-plugins 225:
I contacted GitHub support and one of the employees has been troubleshooting this for me. Here is part of what he had to say about the issue:
I just tried making a request manually from one of our machines, and that went through with no error (see curl -v output below).
However, I did notice that it took extremely long for the request to be processed -- over 15 seconds (for 2 bytes of data).
Decoupling the listening for and reception of the payload from its processing is generally the right approach, as I recommended in "Perl Script slow over Tomcat 6.0 and generates service time out".
The first part should be as fast as possible.
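As a rough illustration of that split, here is a minimal sketch using only the JDK's built-in HTTP server; the /webhook path, the port and handlePayload() are placeholders, not anything GitHub-specific:

import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class WebhookReceiver {

    private static final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public static void main(String[] args) throws Exception {
        // Background worker: does the slow processing outside the request cycle.
        Thread worker = new Thread(() -> {
            while (true) {
                try {
                    handlePayload(queue.take()); // the slow part
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        worker.start();

        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/webhook", exchange -> {
            // Fast part: read the body, enqueue it, answer well within the 10-second limit.
            try (InputStream in = exchange.getRequestBody()) {
                queue.add(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
            exchange.sendResponseHeaders(202, -1); // 202 Accepted, empty body
            exchange.close();
        });
        server.start();
    }

    private static void handlePayload(String payload) {
        // ... whatever takes longer than GitHub's 10-second limit ...
    }
}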

Twython 401 on stream, but REST works fine

NO KIDDING - my code was working yesterday.
So I'm writing a script to get streaming data using Twython, but today it's giving me 401 errors, even with the example code.
I tried new keys and a new app, but it still gives me 401.
However, the normal REST API interface in Twython (see the next code block) works just fine:
from twython import Twython

twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
print twitter.get_home_timeline()
Some Googlefu led me to this SO thread which suggested using ntpd -q, but this doesn't solve my issue!
This affects only some of the servers behind stream.twitter.com. Perhaps their clocks are off.
Some kind of geo-aware load balancing appears to be in use, and the domain name resolves to a number of different IP addresses depending on where you do the resolving.
Connecting to 199.59.148.229 or 199.59.148.138 gives me the 401 error (these are what I get when resolving from AWS Oregon), but 199.16.156.20 works (this is what I get when resolving from my home network).
Therefore, my temporary workaround for this was to add this to /etc/hosts:
199.16.156.20 stream.twitter.com
I expect the original problem to go away in a few hours, but this works for me, for now.
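If you want to see which addresses your own resolver returns for the streaming host (and therefore which of the IPs above you are actually hitting), a quick check in plain Java, with the host name taken from the question:

import java.net.InetAddress;

public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        // Print every address the local resolver returns for stream.twitter.com.
        for (InetAddress addr : InetAddress.getAllByName("stream.twitter.com")) {
            System.out.println(addr.getHostAddress());
        }
    }
}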

Why do some servers respond with a 503 error unless my User-Agent starts with "Mozilla"?

I'm writing a client that grabs a page from a web server. On one particular server, it would work fine from my web browser, but my code was consistently getting the response:
HTTP/1.1 503 Service Unavailable
Content-Length: 62
Connection: close
Cache-Control: no-cache,no-store
Pragma: no-cache
<html><body><b>Http/1.1 Service Unavailable</b></body> </html>
I eventually narrowed this down to the User-Agent header I was sending: if it contains Mozilla, everything is fine (I tried many variations of this). If not, I get 503. As soon as I realized it was User-Agent, I remembered having this same issue in the past (different project, different servers), but I've never figured out why.
In this particular case, the web server I'm connecting to is running IIS 7.5, but I'm not sure if there are any proxies/firewalls/etc in front of it (I suspect there probably is something because of this behaviour).
There's an interesting history to User-Agents which you can read about on this question: Why do all browsers' user agents start with "Mozilla/"?
It's clearly no issue for me to have Mozilla in my User-Agent, but my question is simply: what is the configuration or server that causes this to happen, and why would anyone want this behaviour?
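For concreteness, the behaviour is easy to reproduce with any HTTP client by toggling the header; a minimal Java sketch (the URL and the exact user-agent string are placeholders):

import java.net.HttpURLConnection;
import java.net.URL;

public class UserAgentCheck {
    public static void main(String[] args) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL("https://example.com/").openConnection();
        // Without "Mozilla" somewhere in this value, the server described above answers 503;
        // with it, the same request succeeds.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (compatible; MyClient/1.0)");
        System.out.println(conn.getResponseCode());
    }
}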
Here is an interesting history of this phenomenon: User Agent String History
The main reason this exists is that the internet, the web and browsers were not designed but evolved, with a high degree of backwards compatibility and then a lot of vendor-exclusive extensions. In particular, frames (which are widely considered a bad idea these days) were not well supported by Mosaic, but were by Netscape (which had Mozilla as its user agent).
Server administrators then had a choice: did they use the new hip cool frames and only support Netscape, or did they use old boring pages that everyone could use? Their choice was a hack: if someone says they are Mozilla, send them frames; if not, send them the no-frames version.
This ruined everything. IE had to call itself Mozilla-compatible, everyone impersonated everyone else; it's all well detailed in the link at the top. But this problem more or less went away in the modern era, as everyone impersonated everyone and everyone supported more and more of a common subset of features.
And then mobile and smartphone browsers became widespread. Suddenly, there weren't just a few main browsers with basically the same features plus a few outliers you could easily ignore. Now there were dozens of small browsers, with less power, less ability and a disjoint, odd set of capabilities! And so, many servers took the easy road and simply did not send the proper data, or any data at all, to any browser they did not recognize.
Now rather than a poorly rendered or inoperable website, you had...no website on certain platforms, and a perfect one on others. This worked, but wasn't tolerable for many businesses; they wanted to work right on ALL platforms, because that's how the web was supposed to work.
Mobile versions, mobile first, responsive design, media queries, all these were designed to fill in those gaps. But for the most part, a lot of websites still just ignore less than modern browsers. And media queries were quickly subverted: no one wants to declare their browser is handheld, oh no. We're a real display browser, even if our screen is only 3 inches, yes sir!
In summary, some servers are configured to drop any browser which is not Mozilla compatible because they think it's better to serve no page than a poorly rendered one.
I've also seen some arguments that this improves security because then the server doesn't have to deal with rogue programs that aren't browsers (much like your own) connecting to them. As the user agent is easy to change, this holds no water for me; it's simply security through obscurity.
Many firewalls are configured to drop all requests which do not have a "proper" user agent, as many DDoS attacks do not bother to send one - it is an easy, reliable filter.

Blackberry ksoap2 request issues

First time posting a question. I'm trying to call some SOAP web services from inside a BlackBerry app using the ksoap2 library. I've successfully managed to get a response from one service, which uses an HTTP URL, but now that I'm trying to get a response from a (different) HTTPS URL, I've run up against a brick wall.
The response dump I'm getting has the following fault message:
"An error occurred while routing the message for element value : (country option I specified in my request). Keep-Alive and Close may not be set using this property. Parameter name: value."
The weird thing is that using Oxygen XML's SOAP tools with the XML request dump works just fine. Any ideas where to start looking? This has taken up a full day already.
Update:
Responding to your comment below - it turns out the double quoting (wrapping the SOAPAction value in quotes) is part of the SOAP spec. Some servers are more relaxed in their implementation and will work without the quotes.
ksoap2 doesn't force the quotes onto your actions - you may want to patch your ksoap2 library to ensure the quotes are always there.
ymmv
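If you end up forcing the quotes from your own code rather than patching the library, a sketch along these lines should work - assuming the ksoap2 HttpTransport API; NAMESPACE, METHOD, SOAP_ACTION and the endpoint URL are all placeholders:

import org.ksoap2.SoapEnvelope;
import org.ksoap2.serialization.SoapObject;
import org.ksoap2.serialization.SoapSerializationEnvelope;
import org.ksoap2.transport.HttpTransport;

public class QuotedSoapCall {

    private static final String NAMESPACE = "http://example.com/ns";  // placeholder
    private static final String METHOD = "GetCountryInfo";            // placeholder
    private static final String SOAP_ACTION = NAMESPACE + METHOD;     // placeholder

    public static Object callWithQuotedAction() throws Exception {
        SoapObject request = new SoapObject(NAMESPACE, METHOD);
        SoapSerializationEnvelope envelope = new SoapSerializationEnvelope(SoapEnvelope.VER11);
        envelope.setOutputSoapObject(request);

        HttpTransport transport = new HttpTransport("https://example.com/service"); // placeholder URL
        // SOAP 1.1 expects the SOAPAction header value to be a quoted string,
        // so wrap the action in literal double quotes before the call.
        transport.call("\"" + SOAP_ACTION + "\"", envelope);
        return envelope.getResponse();
    }
}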
Original:
I don't think this is a SOAP-related problem, nor a BlackBerry one.
I think the problem lies on the server side, since that error string is not a common error (just google it to see no hits on the whole internet other than this question).
Looks like this is a job for the network guy on the server side to tell you what he's seeing on his end.
The only other thing I can think of is to make the call using HTTP instead of HTTPS; you can then use a network sniffer to see what the difference between the messages is. Alternatively, install an SSL proxy with something like "Charles" and sniff the packets that way.