Sphinx: How to stop a submitted query? - sphinx

In Sphinx, there is sometimes a situation where a query takes a long time to complete. Is there any way to cancel a submitted query or stop it somehow with SphinxQL? (I don't want to kill or stop the searchd daemon; I just want to kill the thread that is running the query.)

There isn't an explicit kill command.
If you have slow queries, consider using the max_query_time option to prevent them in the first place.
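For example, max_query_time can be set per query with an OPTION clause in SphinxQL (the index name, match text, and 3000 ms budget below are just illustrations):

```sql
-- Give up on this search after 3 seconds of server-side work
SELECT id FROM myindex WHERE MATCH('hello world')
OPTION max_query_time=3000;
```

The value is in milliseconds; queries that exceed it return whatever partial result was gathered rather than running indefinitely.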

Related

i/o timeout with mgo and mongodb

I am running a map-reduce job from mgo. It runs on a collection with a little more than 3.5M records. For some reasons I cannot port this to the aggregation framework right now; maybe later. So map-reduce is what I am working with. This job, when I run it from the original JS files I created to test the code and output, runs fine. I tried to put the map and reduce code inside two strings and then called mgo.MapReduce to do the map-reduce for me, writing the output into a different collection. And it gives me
read tcp 127.0.0.1:27017: i/o timeout
Though, as the job has been fired in the background, it is still running. Now, according to this thread here --- http://grokbase.com/t/gg/mgo-users/1396d9wyk3/i-o-timeout-in-statistics-generation-upsert
it is easy to solve by calling session.SetSocketTimeout, but I do not want to do this, as the total number of documents this map-reduce will run on will vary, and thus, I believe, so will the time. So I believe I will never be able to solve the problem that way.
What are the other ways that I might have?
Please help me
Moving my comment to an answer.
I believe the only way to fix this is simply setting the socket timeout to something ridiculously high, for example:
session.SetSocketTimeout(1 * time.Hour)

MongoDB verify if fsync is working

I perform fsync() operation in MongoDB, but it just returns immediately. It doesn't appear to have done anything. How do I verify if the data has indeed been flushed to disk at all?
NOTE: I set syncdelay to 0 which means that the writes won't be automatically fsync'ed every 60 seconds.
My actual command, using the Perl driver, is:
$connection->fsync({async=>1});
Thanks.
If you don't want the fsync to return immediately, then you can remove the async option and it will become a blocking operation.
But if you don't want it to be blocking, you can use db.currentOp from the shell to query the current state of the fsync.
If you want to get that information from Perl, you can use the technique I outlined in this answer. Unfortunately there's no convenient way to get it directly via run_command.
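From the mongo shell, an in-progress fsync shows up in the output of db.currentOp(); a filter along these lines (the exact fields vary by server version, so this matches loosely) narrows it down:

```javascript
// mongo shell: dump any in-progress operation that mentions fsync
db.currentOp().inprog.forEach(function (op) {
    if (/fsync/i.test(JSON.stringify(op))) printjson(op);
});
```

If nothing matches, the async fsync has already completed.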

mongodb background task

If possible, I'd like to run a find & remove query on non-indexed columns in the "background", without disturbing other tasks or exhausting memory to the detriment of others.
For indexing, there is a background flag. Can the same be appended for find/remove tasks?
Thanks for a tip
This is not something you can use "background:true" for. Possibly the best way to handle this is to write a script that does this in the background. This script should run your operation in small batches with some delay in between. In pseudo code you would do:
find 10 docs you need to update
update those 10 docs
sleep
goto first step.
You will have to experiment with which value for sleep works. You do need to realize that all documents that you are updating need to be pulled into memory, so it will have at least some impact.
No, there is not a background:true flag for this operation. The remove will yield when page faults occur and allow other operations to execute. If you need to throttle this, then you can either remove in smaller batches or use a find/remove pattern which will lower the impact to other operations.
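The batch-with-delay pattern described above can be sketched in Go (the fetch and remove callbacks are hypothetical stand-ins for your driver's query and remove calls; here they operate on an in-memory slice for illustration):

```go
package main

import (
	"fmt"
	"time"
)

// drainInBatches repeatedly asks for a small batch of matching IDs,
// removes them, then sleeps so other operations can make progress.
// fetch returns at most batchSize IDs still matching the criteria;
// remove deletes them. It returns the total number removed.
func drainInBatches(fetch func(n int) []int, remove func(ids []int), batchSize int, pause time.Duration) int {
	total := 0
	for {
		ids := fetch(batchSize)
		if len(ids) == 0 {
			return total // nothing left to do
		}
		remove(ids)
		total += len(ids)
		time.Sleep(pause) // yield to other work between batches
	}
}

func main() {
	// Stand-in "collection": a slice of document IDs to delete.
	pending := []int{1, 2, 3, 4, 5, 6, 7}
	fetch := func(n int) []int {
		if n > len(pending) {
			n = len(pending)
		}
		return pending[:n]
	}
	remove := func(ids []int) { pending = pending[len(ids):] }
	fmt.Println(drainInBatches(fetch, remove, 3, time.Millisecond)) // prints 7
}
```

Experimenting with batchSize and pause is the tuning knob the answer above describes: larger batches finish sooner, smaller batches with longer pauses impact other traffic less.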

Why does MongoDB *client* use more memory than the server in this case?

I'm evaluating MongoDB. I have a small 20GB subset of documents. Each is essentially a request log for a social game along with some captured state of the game the user was playing at that moment.
I thought I'd try finding game cheaters. So I wrote a function that runs server side. It calls find() on an indexed collection and sorts according to the existing index. Using a cursor it goes through all documents in indexed order. The index is {user_id,time}. So I'm going through each user's history, checking if certain values (money/health/etc) increase faster than is possible in the game. The script returns the first violation found. It does not collect violations.
The ONLY thing that this script does on the client is define the function and calls mymongodb.eval(myscript) on a mongod instance on another box.
The box that mongod is running on does fine. The one that the script is launched from starts losing memory and swap. Hours later: 8GB of RAM and 6GB of swap are being used on the client machine that did nothing more than launch a script on another box and wait for a return value.
Is the mongo client really that flaky? Have I done something wrong or made an incorrect assumption about mongo/mongod?
If you just want to open up a client connection to a remote database you should use the mongo command, not mongod. mongod starts up a server on your local machine. I'm not sure what specifying a URL will do.
Try
mongo remotehost:27017
From the documentation:
Use map/reduce instead of db.eval() for long running jobs. db.eval blocks other operations!
eval is a function that blocks the entire server if you don't use a special flag. Again, from the docs:
If you don't use the "nolock" flag, db.eval() blocks the entire mongod process while running [...]
You are kind of abusing MongoDB here. Your current routine is strange, because it returns the first violation found, but it will have to re-check everything when it is run the next time (unless your user ids are ordered and you store the last evaluated user id).
Map/Reduce generally is the better option for a long-running task, but aggregating your data does not seem trivial. However, a map/reduce based solution would also solve the re-evaluation problem.
I'd probably return something like this from map/reduce:
user id -> suspicious actions, e.g.
------
2525454 -> [{logId: 235345435, t: ISODate("...")}]
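A map/reduce skeleton producing that shape might look like this in the mongo shell (the collection name, field names, and the isSuspicious check are assumptions about your schema, not working detection logic; you would pass real helpers via the scope option):

```javascript
var mapFn = function () {
    // emit one entry per suspicious action, keyed by user
    if (isSuspicious(this)) { // hypothetical check, injected via scope
        emit(this.user_id, { actions: [{ logId: this._id, t: this.time }] });
    }
};
var reduceFn = function (key, values) {
    // merge the per-action arrays for one user
    var merged = { actions: [] };
    values.forEach(function (v) {
        merged.actions = merged.actions.concat(v.actions);
    });
    return merged;
};
db.requests.mapReduce(mapFn, reduceFn, { out: "suspicious_actions" });
```

Because the output lands in its own collection, a later run only needs to examine documents newer than the last evaluated timestamp, which addresses the re-evaluation problem mentioned above.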

Is it possible to pause an SQL query?

I've got a really long running SQL query (data import, etc). It's crap - it uses cursors and is running slowly. It does the job, though, so I'm not too worried about performance.
Anyways, can I pause it for a while (instead of canceling the query)?
It chews up a bit of CPU, so I was hoping to pause it, do some other stuff... then resume it.
I'm assuming the answer is 'NO' because of how rows and data gets locked, etc.
I'm using Sql Server 2008, btw.
The best approximation I know for what you're looking for is
BEGIN
    WAITFOR DELAY 'TIME';
    EXECUTE XXXX;
END;
GO
Not only can you not pause it, doing so would be bad. SQL queries hold locks (for transactional integrity), and if you paused the query, it would have to hold any locks while it was paused. This could really slow down other queries running on the server.
Rather than pause it, I would write the query so that it can be terminated, and pick up from where it left off when it is restarted. This requires work on your part as a query author, but it's the only feasible approach if you want to interrupt and resume the query. It's a good idea for other reasons as well: long running queries are often interrupted anyway.
Click the debug button instead of execute. SQL Server 2008 introduced the ability to debug queries on the fly. Put a breakpoint at convenient locations.
In similar situations, where I was trying to work through an entire, possibly huge, list of data and could tell which rows I had already visited, I would run the processing in chunks.
update or whatever
where (still not done)
limit 1000
And then I would just keep running the query until there are no rows being modified. This breaks the locks up into reasonable time chunks and can allow you to do things like move tens of millions of rows between tables while a system is in production.
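On SQL Server 2008 (where LIMIT is spelled TOP), the chunked pattern looks roughly like this; the table name and predicate are placeholders for your own:

```sql
-- Delete in small chunks, yielding between batches
WHILE 1 = 1
BEGIN
    DELETE TOP (1000) FROM dbo.MyTable
    WHERE StillNotDone = 1;       -- your "still not done" predicate

    IF @@ROWCOUNT = 0 BREAK;      -- nothing left to remove

    WAITFOR DELAY '00:00:01';     -- let other work run
END;
```

The delay value plays the same role as the sleep in the description above: longer delays mean less impact on concurrent queries at the cost of total runtime.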
Jacob
Instead of pausing the script, perhaps you could use the Resource Governor. That way you could allow the script to run in the background without severely impacting the performance of other tasks.
MSDN-Resource Governor
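A minimal Resource Governor setup capping such a background script might look like this (the pool and group names are made up, and the 20% cap is illustrative):

```sql
-- Cap CPU for a dedicated low-priority workload
CREATE RESOURCE POOL LowPriorityPool WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP LowPriorityGroup USING LowPriorityPool;
ALTER RESOURCE GOVERNOR RECONFIGURE;
```

You would still need a classifier function to route the script's session into LowPriorityGroup; the MSDN page above covers that step.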