I'm using syslog-ng to send data to MongoDB.
After a while the process hung; tcpdump shows no outgoing data.
Debugging syslog-ng, I found that "Destination queue full, dropping message;..." appears several times, then things return to normal. The last time, though, it never came back.
Running kill -1 $PID (SIGHUP) clears it, but the cause is unknown, and I'm trying to figure it out.
Does anyone have an idea?
There are a couple of things that can cause this, but it's hard to tell without more information. I'd suggest asking on the syslog-ng mailing list too, as it's very likely I'd ask for a bit of debugging info there, which isn't all that suitable for Stack Overflow.
Nevertheless, there's one known deadlock in afmongodb that I know of, which isn't fixed in 3.3.4; a fix is available here. From your description I'm not sure this would help (the "destination queue full" part is interesting, by the way), but it's my best bet.
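If it turns out to be plain queue exhaustion rather than the deadlock, enlarging the destination's output queue and enabling flow control may help. A minimal sketch (the destination and source names are placeholders, and the mongodb() connection options depend on your 3.3.x setup):

```
destination d_mongo {
    mongodb(
        # ... your connection options here ...
        log_fifo_size(100000)    # enlarge the per-destination output queue
    );
};

log {
    source(s_local);             # placeholder source name
    destination(d_mongo);
    flags(flow-control);         # back-pressure the source instead of dropping
};
```

With flow-control, syslog-ng slows the source down when the queue fills instead of emitting "Destination queue full, dropping message".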
Hope it helps!
I'm making a simple MMORPG server with IOCP.
I implemented a simple movement function and tested it with dummy clients (also IOCP).
Everything works fine when only a few clients are connected. After around 500~1000 clients connect, some dummy clients occasionally read weird data. I checked that the server sends the data I expect, but when the dummy clients read it, they get random data.
My guess is that the operating system's receive buffer is overflowing, but I'm only guessing at this point... I have no idea how to check.
Any suggestions would be much appreciated!
The problem with too many outstanding WSASends doesn't usually manifest as corrupted data; that's more likely to be a bug in your code. Perhaps you're failing to correctly manage the lifetime of the buffer being used to send data? It needs to stay valid until you get the completion for the WSASend call; if you reuse it sooner than that, you corrupt the data being sent.
The reason this shows up only when you have lots of WSASends outstanding to lots of clients is that the send operations take longer to complete, making it more likely that your bug will be hit...
It doesn't matter how many WSASends you issue as long as your clients are able to receive the data as fast as you can send it. As soon as you are sending faster than they can receive then there will be problems. I address these problems in this answer.
I'm trying to use TXT records to share information between multiple devices, using Bonjour/Avahi. The server side works fine, as Wireshark confirms: information is added to the TXT record and sent out via mDNS.
The problem is on the client side, where the daemon/service does not always pick up the change. It gets stuck with information that is already outdated and does not update it when I try to resolve the service again.
On the client side I use DNSServiceResolve with a callback function in which I call TXTRecordContainsKey and TXTRecordGetValuePtr to make sure the data is available before use. This all works fine except that, as already mentioned, the information is not always updated.
Am I missing something, or are there additional API calls besides DNSServiceResolve that I can use to force the daemon to update its record?
Thank you in advance.
Solved: always make sure you deactivate your firewall when dealing with such strange problems...
This completely solved my issue.
I have a process, running on Solaris 10, that is terminating due to a SIGSEGV. For various uninteresting reasons it is not possible for me to get a backtrace by the usual means (gdb, backtrace call, corefile are all out). But I think dtrace might be useable.
If so, I'd like to write a dtrace script that will print the thread stacks of the process when the process is killed. I'm not very familiar with dtrace, but this seems like it must be pretty easy for someone who knows it. I'd like to be able to run this in such a way as to monitor a particular process. Any thoughts?
In case anyone else stumbles across this, I'm making some progress experimenting on OS X with the following script I've cobbled together:
#!/usr/sbin/dtrace -s

/* Fires whenever the traced process takes a hardware fault
   (including the fault that underlies SIGSEGV).
   Run with the target pid as the first script argument. */
proc:::fault
/pid == $1/        /* $1: target pid, passed on the command line */
{
    ustack();      /* print the user-level stack of the faulting thread */
}
I'll update this with a complete solution when I have one.
A couple of Solaris engineers wrote a script for using Dtrace to capture crash data and published an article on using it, which can now be found at Oracle Technology Network: Enabling User-Controlled Collection of Application Crash Data With DTrace.
One of the authors also published a number of updates to his blog, which can still be read at https://blogs.oracle.com/gregns/, but since he passed away in 2007, there haven't been any further updates.
Recently I've heard a bit about the implementation (or rather, use of) /dev/null in Mongrel2, as well as other projects. However, I've never seen it explained what this actually means.
What does this mean, and why is it good for scalability (as I've seen it be claimed)?
Please read this :-) The Mongrel2 "support" was a joke (see the change, which was later removed).
Being serious though, /dev/null is useful when you want to discard output from processes. You can redirect output to it, for example, and the kernel will just discard that output.
/dev/null is a virtual device on UNIX systems that discards all data written to it; reads from it return EOF immediately.
I'm having an issue with a JBoss server. When I run it, it stops responding after a while (there's no fixed time, so I can't predict when it will stop after starting), and after that it writes nothing to the log file. My problem is similar to the one described on the JBoss community forum, linked below, but that thread has no answer. Please help.
http://community.jboss.org/message/526193
--Ravi
It sounds like your JBoss server is running out of threads to allocate and is waiting for a new one to become available. Try triggering a thread dump (Ctrl-\ in the server console, or kill -3 on the server's PID) and see if you find any threads suspiciously locked and waiting in some of your code. Quite possibly you have a deadlock or memory leak somewhere in your code that is causing old threads to lock up and never be released.
Alternatively, try what the guy you linked to did, i.e. increase the number of threads available.
edit: For some more basic advice, this post might be of use to you.