Can too many WSASend in short time be a problem? - sockets

I'm making a simple mmorpg server with IOCP.
I implemented a simple movement function and tested it with dummy clients (also using IOCP).
Everything works fine when only a few clients are connected. Once around 500 to 1000 clients are connected, some dummy clients occasionally read weird data. I checked that the server sends the data I expect, but the dummy clients read what looks like random data.
My guess is that it could be related to the operating system's receive buffer overflowing, but I'm only guessing right now and have no idea how to check it.
Any suggestions would be much appreciated!

Issuing too many WSASends doesn't usually manifest as corrupted data; that's more likely to be a bug in your code. Perhaps you are failing to correctly manage the lifetime of the buffer that is being used to send data? It needs to stay stable until you get the completion for the WSASend call. If you reuse it sooner than that, you will corrupt the data being sent.
The reason this may show up when you have lots of WSASends outstanding to lots of clients is that the send operations may be taking longer to complete and so make it more likely that your bug will be hit...
It doesn't matter how many WSASends you issue as long as your clients are able to receive the data as fast as you can send it. As soon as you are sending faster than they can receive then there will be problems. I address these problems in this answer.
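The buffer lifetime rule described above can be illustrated with a small simulation. This is only a sketch in Python, not IOCP code: SendTracker, begin_send and on_completion are hypothetical names, and on_completion stands in for whatever your server does when the completion for a WSASend is dequeued. The point it shows is that each in-flight send owns its own buffer, and the buffer only returns to the free list once that send has completed.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class SendOperation:
    """One in-flight send; it owns its buffer until completion."""
    op_id: int
    buffer: bytearray


class SendTracker:
    """Hypothetical helper: hands out one private buffer per pending send."""

    def __init__(self):
        self._ids = count(1)
        self._in_flight = {}   # op_id -> SendOperation still awaiting completion
        self._free = []        # buffers that are safe to reuse

    def begin_send(self, payload: bytes) -> SendOperation:
        # Never write into a buffer that an uncompleted send still references.
        buf = self._free.pop() if self._free else bytearray()
        buf[:] = payload
        op = SendOperation(next(self._ids), buf)
        self._in_flight[op.op_id] = op
        return op

    def on_completion(self, op_id: int) -> None:
        # Only now may the buffer be reused for another send.
        op = self._in_flight.pop(op_id)
        self._free.append(op.buffer)
```

If the server instead wrote every outgoing packet into one shared buffer, a second send issued before the first completed would overwrite data that had not yet been sent, which matches the "random data" symptom under load.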

Related

GWT: Client procedure and rpc request are always called several times with multiple thread id

For some client-side procedures, I implemented remote logging to record each call to the procedure. The log is printed several times with different thread IDs, even though the procedure is only called once. Some RPC requests are sent to the server a few times, which causes some database session problems. Is this normal? Is there any way to avoid it?
Thanks
This is not normal, and suggests there is a bug on your client causing it to send the same call more than once. Try adding logging on the client where you invoke the RPC call, and possibly add breakpoints to confirm why it is being called twice.
My best guess with no other information would be that you have more than one event handler wired up to the same button, or something like that.
--
More specifically, your servlet container starts multiple threads to handle incoming requests - if two requests come in close succession, they might be handled by different threads.
As you noted, this can cause problems with a database, where two simultaneous calls could be made to change the same data, especially if you have some checks to ensure that a servlet call cannot accidentally overwrite some newer data. This is almost certainly a bug in your client code, and debugging it should start there.

Laravel Mail Queue Infinite Loop on Exception

Hello fellow programmers, I wish everyone a good morning.
The Situation
Laravel is great. Laravel Mail queues and the beanstalkd integration are great. It took me almost no time to get everything working. The sun is shining and it's not raining. It's awesome.
Except when an exception is thrown while sending an email. Then this mail is processed again and again and again, and the exception is also thrown again and again and again.
Infinite loop.
I don't think I would even have noticed this if I hadn't seeded the database with invalid data. Validation would usually have made sure that emails like 361FlorindaMatthäi#gmail.com don't end up with the following exception:
[Swift_RfcComplianceException]
Address in mailbox given [361FlorindaMatthäi#gmail.com] does not
comply with RFC 2822, 3.6.2.
But what validation wouldn't have taken care of is, for example, my Mandrill account reaching its limits or my server losing its internet connection. Any exception sends it into an infinite loop.
In the world where the sun is shining and everything is great, the job would be marked as buried or suspended and the next email would be processed. An infinite loop with an invalid email address is not great.
Basically, your application doesn't send out any emails anymore. This guy has roughly the same issue.
How can I fix this? Has anyone else encountered this error?
Any help is much appreciated.
You just need to tell Laravel how many times to try a specific job before deciding it has failed:
php artisan queue:daemon --tries=3
This way, it will stop processing that specific job after 3 tries.
The hard part of any queue-based system is dealing with the errors. I've run tens of millions of jobs through beanstalkd and many more through other systems like SQS.
With this Swift_RfcComplianceException exception it's clear that the job will never be able to succeed, and so trying it again would be futile.
Some other problems might be able to be recovered, but in either event, you have to wrap the code in a try/catch block and do what you can.
Since there is no way to 'fix' this particular issue, I would record what happened (the name of the exception and any message, and the data) to a log to check on, and then delete or bury the job. If you store the job-id in the log when you bury it, you can go back and delete or kick that particular job again later, once you have been able to change what happens to the job (rather than having it fail again).
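The try/catch approach described above could look roughly like the following. This is a language-neutral sketch written in Python rather than Laravel code: process_job, PermanentError, TransientError and the job dictionary layout are all hypothetical, and the returned strings stand for whatever delete, release and bury calls your queue client offers.

```python
import logging

log = logging.getLogger("mail_queue")


class PermanentError(Exception):
    """Retrying can never succeed (for example, an invalid address)."""


class TransientError(Exception):
    """Retrying later might succeed (for example, a network outage)."""


def process_job(job, send, max_tries=3):
    """Run one reserved job and return 'done', 'release' or 'bury'."""
    try:
        send(job["data"])
        return "done"
    except PermanentError as exc:
        # Record the exception name, message, data and job id, then bury.
        log.error("burying job %s: %s: %s (data=%r)",
                  job["id"], type(exc).__name__, exc, job["data"])
        return "bury"
    except TransientError as exc:
        job["tries"] = job.get("tries", 0) + 1
        if job["tries"] >= max_tries:
            log.error("giving up on job %s after %d tries: %s",
                      job["id"], job["tries"], exc)
            return "bury"
        # Put the job back on the queue so it is tried again later.
        return "release"
```

The split into two exception types is the design choice here: an invalid address is buried on the first attempt, while an outage is retried a bounded number of times, so neither case can loop forever.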

Asynchronous Socket Communication & Heap fragmentation

I wrote a multithreaded socket server application which accepts over 1000 concurrent connections. Recently the application crashed; after analyzing the dump files, we found that it had crashed due to heap corruption. I found the same issue discussed in the following links.
.NET Does NOT Have Reliable Asynchronouos Socket Communication?
http://support.microsoft.com/kb/947862
The discussion also suggests 3 solutions.
The network application should have an upper bound on the number of outstanding asynchronous IO that it posts.
Use Microsoft CCR
Use TPL
Due to time constraints, I thought I'd stick with #1, but I don't have a clear picture of how to implement it. Can someone give me a good starting point, please?
Also, has anyone used async with TPL to solve this issue?
You mean a better starting point than the blog posting that I linked to in the answer that you refer to?
The issue is this:
Memory and other per-operation resources that are used during an async write are often "in use" until the remote peer's TCP stack acks the data and the local stack can complete your async write operation to tell you that you can reuse your buffer.
The local peer has no control over this as it's all governed by the speed at which the remote peer reads data from its socket and the congestion on the link between the two peers.
Because of the above you need to have a hard limit on the number of async writes that you have outstanding at any one time. You can track this by incrementing a counter just before you issue an async write and decrementing it in the completion handler.
What you do once you hit that limit is up to you. In the original article I favour a queue that data to be written is placed into. This queue can then be used as a source of data as write completions occur. Once the queue is empty you can send normally again. Of course this simply moves the problem - you still have a memory resource that's controlled by the remote peer (the queued data) but you don't also have other OS resources used too (non-paged pool, I/O page lock limit, etc).
You could simply stop your peer sending when you reach your limit - and now the API that you build over the async API needs to have a 'can't send at the moment, try again later' return from a send which previously used to always "work".
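The counter, the queue and the 'try again later' return described above can be combined in one small class. The following is a minimal single-threaded sketch in Python rather than .NET: BoundedWriter, issue_write and on_write_complete are hypothetical names, issue_write stands for whatever starts one async write on your socket, and a real server would need a lock around the counter and queue because completions arrive on other threads.

```python
from collections import deque


class BoundedWriter:
    """Caps outstanding async writes; surplus data waits in a queue."""

    def __init__(self, issue_write, max_outstanding=4, max_queued=64):
        self._issue_write = issue_write      # assumed: starts one async write
        self._max_outstanding = max_outstanding
        self._max_queued = max_queued
        self._outstanding = 0
        self._queue = deque()

    def send(self, data: bytes) -> bool:
        """Return False when the caller must try again later."""
        # Once anything is queued, new data must queue behind it to keep order.
        if self._outstanding < self._max_outstanding and not self._queue:
            self._outstanding += 1           # count just before issuing
            self._issue_write(data)
            return True
        if len(self._queue) < self._max_queued:
            self._queue.append(data)
            return True
        return False

    def on_write_complete(self) -> None:
        """Call from the write completion handler."""
        self._outstanding -= 1
        if self._queue:
            # Use the queue as the source of the next write.
            self._outstanding += 1
            self._issue_write(self._queue.popleft())
```

Bounding the queue as well as the outstanding writes means a peer that stops reading cannot make the server's memory use grow without limit; the limits shown are placeholders, not recommendations.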
If you're doing this I would also seriously look at avoiding the pinned memory issue by allocating a large block of buffers in one contiguous block and using them from the pool.
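The idea of carving all I/O buffers out of one contiguous block can be sketched as follows. Python has no pinned-memory issue of its own, so this only illustrates the layout: BufferPool, acquire and release are hypothetical names, and in .NET the equivalent would be slices of a single large byte array.

```python
class BufferPool:
    """Fixed-size buffers carved from a single contiguous allocation."""

    def __init__(self, buffer_size=4096, count=1024):
        # One large block allocated up front, instead of many small ones.
        self._block = bytearray(buffer_size * count)
        view = memoryview(self._block)
        self._free = [view[i * buffer_size:(i + 1) * buffer_size]
                      for i in range(count)]

    def acquire(self):
        """Return a writable buffer, or None if the pool is exhausted."""
        return self._free.pop() if self._free else None

    def release(self, buf) -> None:
        """Return a buffer once its I/O operation has completed."""
        self._free.append(buf)
```

Because every buffer lives inside the same block, buffers held by pending I/O stay clustered in one region rather than being scattered across the heap, and an exhausted pool acts as a natural upper bound on outstanding operations.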
First, that's a very old KB article. How are you sure you have that particular problem?
Then, as Hans Passant answers in the SO question, if you write bad async code, it will bite you. If you don't take care of your resources (and memory buffers are resources), a concurrent program will face memory errors.
It's very hard to write good concurrent code using raw threads. TPL does make it easier, but it won't fix the bugs you already have. In fact, unless you identify your current problems you are likely to transfer them to the version that uses TPL.
Without knowing the specific problem that caused your application to crash, I can only make some suggestions:
Use BufferManager to reuse memory buffers instead of allocating new ones.
Use a queue to store requests and process them asynchronously instead of starting a new thread for each request.
There are other techniques you can use as well, depending on the type of application you are building. For example, you could use TPL Dataflow to break processing into independent steps.
As for CCR, there is not much point in using it outside Robotics Studio. TPL contains most of the relevant functionality you need to write concurrent apps.

How to tell the difference between an offline and online mobile phone via sip?

For a toy project I want to find out if a mobile phone is connected to gsm or not. So I thought "Okay, let's use my local sip provider and see".
But in both cases, the thing goes like this:
I send an INVITE
0 s: I get a 100 Trying
5 s: I get a 183 Session Progress
I get an audio stream, in the one case with the ringing, in the other case with a "The person you are calling is…"
If I wait long enough (~ 40 s), I get a more appropriate status code like 180 Ringing.
Audio analysis is not an option, really.
Any hints on where to go now?
(I used Twinkle for testing and a local German SIP provider.)
This issue is endemic in the way telephone networks work, and is not specific to SIP or IP. It's why, when you place a call to another country and the number is busy, you might sometimes hear your local country's busy tone, or you might hear a different busy tone that comes from the other country. In the latter case you cannot detect what the problem is except by audio analysis. In SS7 and ISDN we speak of Q.931 cause codes instead of SIP error codes, but the principle is the same.
There's an argument to be made for configuring telephone systems to emit status codes instead of audio error messages. For callers using normal phones, the originating switch (the one closest to the caller) can then map that code to the appropriate spoken error message or audio tone. That way, when the call is being placed by software rather than by a person, the software can have access to the actual error code right away.
On the other hand you can also argue for having the remote switch (the one nearest the destination or the one that encounters the problem) speak its own error message. That switch knows best what the actual problem is. For example, a mobile operator can emit a spoken error message saying that the mobile phone you are trying to call is currently out of range. There is no Q.931 code (or SIP error code for that matter) with that meaning. It could return 27=Destination out of order?? Or 35=Destination unattainable?? Both of those codes are so esoteric, who knows what error message the local switch would translate them to (in practice: probably just a reorder tone, which is really user-unfriendly to a human caller). And when you try to map Q.931 cause codes to SIP error codes back and forth, even more information is lost because the codes really don't match up well at all. It's likely to be a much better user experience for the caller if the remote switch just plays back an informative, appropriate, recording which describes the problem.
Since there is this dilemma (arguments on both sides), we can conclude that this is unlikely to be resolved by completely standardizing on one way or the other anytime soon.
Anyway, sometimes this is configurable: your SIP provider may be able to configure your trunk for coded errors instead of recorded messages. If they offer this (some do), it's worth trying this option. But results will vary: this option only affects the provider's local behaviour. In general, if you want immediate call clearing with a cause code and are instead getting a recorded error message from the other end, you will not be able to do anything about it, because the switch that decides which way to respond is the remote one.
When using the audio message method, a proper Q.931 cause code or SIP error code usually comes eventually (after the recording is finished), but as you point out, it's probably too late by then.

Get changes immediately when something changed in server

I would like to know the best method to get data on the iPhone as soon as a user enters or modifies data on the server. I could send a request to the server at a short time interval to check for any modifications made on the server (i.e. polling), but I know that is very awkward. Please suggest a better approach.
EDIT
I am not talking about push notifications. I need the data itself: for example, during a cricket match, each time the score updates on the server I need to get that data (via XML, JSON, or any other medium) on my iPhone.
You're talking about push notifications: http://developer.apple.com/library/mac/#documentation/NetworkingInternet/Conceptual/RemoteNotificationsPG/ApplePushService/ApplePushService.html
These let you send specific messages from your server, to devices that opt in to receiving push notifications from your app.
What you are looking for is known as "Push Technology" (there are several variations of the same idea). In your case, what I think is best suited is "long polling". In short:
you poll specifying a very long timeout;
the server will not reply until it has some new data, so your request will be kept open as long as timeouts allow;
as soon as the server has got new data, it will reply, and you get the changes immediately;
when the timeout expires, you send a new request.
Long polling reduces the overhead you are worried about with "short" polling. With short polls the idea is to send frequent requests with a very short interval between them, so you are constantly sending requests to check for new data. With long polling you send a new request only after you have received new data, or when a timeout fires (which can be several minutes).
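The long-polling steps above can be simulated in-process to show the control flow. This is a sketch in Python with no HTTP involved: ScoreFeed, publish, long_poll and poll_loop are hypothetical names, with long_poll standing in for the server's request handler and poll_loop for the client.

```python
import threading


class ScoreFeed:
    """Server side: latest value plus a version counter."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0
        self._value = None

    def publish(self, value) -> None:
        with self._cond:
            self._version += 1
            self._value = value
            self._cond.notify_all()      # wake every waiting long poll

    def long_poll(self, last_seen: int, timeout: float):
        """Block until data newer than last_seen exists, or until timeout."""
        with self._cond:
            self._cond.wait_for(lambda: self._version > last_seen,
                                timeout=timeout)
            if self._version > last_seen:
                return self._version, self._value
            return last_seen, None       # timed out with nothing new


def poll_loop(feed, handle, rounds, timeout=30.0):
    """Client side: re-issue the request after every reply or timeout."""
    last = 0
    for _ in range(rounds):
        new, value = feed.long_poll(last, timeout)
        if new != last:
            handle(value)
            last = new
```

The client tells the server which version it last saw, so an update published between two requests is delivered by the next poll instead of being lost.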
In this S.O. post, you will find a way to implement it.