How to increase MongoDB performance - mongodb

I am developing an application with MongoDB and CakePHP. I have 145,000 records in my database. When I query for records using the following command, it takes 12 seconds, which is very bad for my application.
$params= array('aggregate'=>array(
array('$project'=>array('as'=>1,'pid'=>1,'st'=>1)),
array('$unwind'=>'$as'),
array('$match' => array('pd'=>array('$gt'=>$f,'$lt'=>$t),'pid'=>$project_id)),
array('$group'=>array('_id'=>'$as')),
array('$sort'=>array('_id'=>1)),
array('$limit'=>10)
)
);
$results = $this->Detail->find('all',array('conditions'=>$params));
Can anyone help me reduce the query time?
I have indexes on as and pid. My system has 1.5 GB of RAM.
I got the following data as a result:
[1] => Array
(
[Detail] => Array
(
[_id] => Array
(
[0] => "superfone" Llc,moscow,ru
)
)
)
[2] => Array
(
[Detail] => Array
(
[_id] => Array
(
[0] => "superphone" Llc,moscow,ru
)
)
)

Before performing the find, make sure to set the recursive level to the level you need, then report back on your performance.
$this->Detail->recursive = -1; // no joins
$results = $this->Detail->find('all',array('conditions'=>$params));
or
$this->Detail->recursive = 0; // data + domain
$results = $this->Detail->find('all',array('conditions'=>$params));
I'm not sure which recursive level you need, so test both. The default is 1, which can be costly.
Reference:
http://book.cakephp.org/2.0/en/models/model-attributes.html#recursive
The reference also recommends making -1 your default and raising the recursive level only where you need it. This can be done by adding the following to AppModel:
public $recursive = -1;

Related

Cannot retrieve data from find query

In my Event model, I have the following function to retrieve all events with status = 1, limited to 12 and ordered by Event.created DESC:
public function latestEvents() {
$this->Behaviors->load('Containable');
$result = $this->find('all' ,array('recursive' => -1, 'conditions'=> array('Event.status' => 1), 'limit' => 12, 'order' => array('Event.created DESC')));
debug($result); die();
return $result;
}
This function is not returning any data. When I change my limit to 6 and debug, it returns six records, but when I change my limit to more than 6 it returns an empty result.
I even checked in my database by running this query:
SELECT * FROM `events` WHERE `status` = 1 ORDER BY `created` DESC LIMIT 12
and this returns the desired data that I want. I even tried :
$result = $this->query('SELECT * FROM `events` WHERE `status` = 1 ORDER BY `created` DESC LIMIT 12');
but the same thing is happening with the limit (6 returns the data but more than 6 does not).
I found out that the data I was trying to debug contained special characters. I had to add 'encoding' => 'utf8' to my database.php, and then it worked like a charm. This post helped me.

Additional conditions in JOIN

I have tables with articles and users; both have a many-to-many mapping to a third table, reads.
What I am trying to do here is get all unread articles for a particular user (user_id not present in table reads).
My query gets all articles, but those that were read are marked, which is fine as I can filter them out (the user_id field contains the id of the user in question).
I have an SQL query like this:
SELECT articles.id, reads.user_id
FROM articles
LEFT JOIN
reads
ON articles.id = reads.article_id AND reads.user_id = 9
ORDER BY articles.last_update DESC LIMIT 5;
Which yields following:
articles.id | reads.user_id
-------------------+-----------------
57125839 | 9
57065456 |
56945065 |
56945066 |
56763090 |
(5 rows)
This is fine. This is what I want.
I'd like to get the same result in Catalyst using my article model, but I cannot find any option to add conditions to a JOIN clause.
Do you know of any way to add AND X = Y to a DBIx::Class JOIN?
I know this can be done with a custom result source and a virtual view, but I have some other queries that could benefit from it and I'd like to avoid creating a virtual view for each of them.
Thanks,
Canto
I don't even know what Catalyst is but I can hack the SQL query:
select articles.id, reads.user_id
from
articles
left join
(
select *
from reads
where user_id = 9
) reads on articles.id = reads.article_id
order by articles.last_update desc
limit 5;
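As a rough illustration of the join discussed above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question; the sample rows and the helper name unread_articles are made up for the example. It keeps the user filter inside the ON clause and then keeps only the rows where no matching read exists.

```python
import sqlite3

def unread_articles(conn, user_id, limit=5):
    """Return ids of articles that user_id has not read, newest first."""
    sql = """
        SELECT articles.id
        FROM articles
        LEFT JOIN reads
          ON articles.id = reads.article_id AND reads.user_id = ?
        WHERE reads.user_id IS NULL      -- no matching read row for this user
        ORDER BY articles.last_update DESC
        LIMIT ?
    """
    return [row[0] for row in conn.execute(sql, (user_id, limit))]

# Hypothetical sample data, only for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, last_update INTEGER);
    CREATE TABLE reads (article_id INTEGER, user_id INTEGER);
    INSERT INTO articles VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO reads VALUES (3, 9), (2, 7);
""")
print(unread_articles(conn, 9))
```

Putting the user_id test in the ON clause rather than in WHERE is what preserves articles read only by other users.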
I found a solution.
It's not straightforward, but it's better than a virtual view.
http://search.cpan.org/dist/DBIx-Class/lib/DBIx/Class/Relationship/Base.pm#condition
The link above describes how to use conditions in a JOIN clause.
However, my case needs a variable in those conditions, which is not available by default in the model.
So, bending the model concept a bit and introducing a variable into it, we have the following.
In the model file:
our $USER_ID;
__PACKAGE__->has_many(
pindols => "My::MyDB::Result::Read",
sub {
my $args = shift;
die "no user_id specified!" unless $USER_ID;
return ({
"$args->{self_alias}.id" => { -ident => "$args->{foreign_alias}.article_id" },
"$args->{foreign_alias}.user_id" => { -ident => $USER_ID },
});
}
);
In the controller:
$My::MyDB::Result::Article::USER_ID = $c->user->id;
$articles = $channel->search(
{ "pindols.user_id" => undef } ,
{
page => int($page),
rows => 20,
order_by => 'last_update DESC',
prefetch => "pindols"
}
);
This fetches all unread articles and yields the following SQL:
SELECT me.id, me.url, me.title, me.content, me.last_update, me.author, me.thumbnail, pindols.article_id, pindols.user_id FROM (SELECT me.id, me.url, me.title, me.content, me.last_update, me.author, me.thumbnail FROM articles me LEFT JOIN reads pindols ON ( me.id = pindols.article_id AND pindols.user_id = 9 ) WHERE ( pindols.user_id IS NULL ) GROUP BY me.id, me.url, me.title, me.content, me.last_update, me.author, me.thumbnail ORDER BY last_update DESC LIMIT ?) me LEFT JOIN reads pindols ON ( me.id = pindols.article_id AND pindols.user_id = 9 ) WHERE ( pindols.user_id IS NULL ) ORDER BY last_update DESC: '20'
Of course you can skip the paging but I had it in my code so I included it here.
Special thanks goes to deg from #dbix-class on irc.perl.org and https://blog.afoolishmanifesto.com/posts/dbix-class-parameterized-relationships/.
Thanks,
Canto

Getting last row in mongodb

I am using the find() function in MongoDB and got records in the following format:
Array
(
[_id] => MongoId Object
(
[$id] => 52a561ea78e9288b568b4567
)
[friendID] => 1
[name] => Shobhit Srivastav
[senderID] => 2
[receiverID] => 1386570218
[receiverType] => TW
[receiverUserID] => 3
[status] => 0
)
Array
(
[_id] => MongoId Object
(
[$id] => 52a5623178e928d8568b4567
)
[friendID] => 2
[name] => Sachin Tendulkar
[senderID] => 2
[receiverID] => 1386570289
[receiverType] => TW
[receiverUserID] => 3
[status] => 0
)
But I want only the last row that was inserted into the collection. How can I find it?
Thanks in advance!!
If you want to get the last record inserted in the table, then sort by ObjectId in descending order:
sorting on an _id field that stores ObjectId values is roughly
equivalent to sorting by creation time.
and get first record:
db.collection.find().sort( { _id : -1 } ).limit(1);
With the PHP driver it will look like:
$doc = $collection->find()->sort(array("_id" => -1))->limit(1)->getNext();
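To see why sorting by _id approximates insertion order, note that the first four bytes of an ObjectId are a Unix timestamp. The sketch below is plain Python with no MongoDB driver; the helper name objectid_timestamp is made up for the example. It decodes that timestamp from the two ids shown in the question and picks the newer one.

```python
def objectid_timestamp(oid_hex):
    """Decode the creation time (Unix seconds) from a 24-char ObjectId hex string."""
    return int(oid_hex[:8], 16)  # first 4 bytes = 8 hex digits

# The two _id values from the question.
ids = ["52a561ea78e9288b568b4567", "52a5623178e928d8568b4567"]

for oid in ids:
    print(oid, objectid_timestamp(oid))

# Comparing the raw bytes puts the timestamp first, so the largest id
# is the most recently created one (to one-second resolution).
latest = max(ids, key=bytes.fromhex)
print("latest:", latest)
```

Because the timestamp has one-second resolution and ids are generated by the client, the ordering is only roughly equivalent to insertion order.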

How can I find a bounding box that encapsulates a specific point?

[uppercaseName] => ATLANTA, GA
[description] => Atlanta, GA
[name] => Atlanta, GA
[_id] => MongoId Object (
)
[addedOn] => MongoDate Object (
[sec] => 1318879015
[usec] => 517000
)
[excludePoints] => Array (
)
[boundingBox] => Array (
[0] => Array (
[lon] => -84.516
[lat] => 33.6747
)
[1] => Array (
[lon] => -84.516
[lat] => 33.8232
)
[2] => Array (
[lon] => -84.2599
[lat] => 33.8232
)
[3] => Array (
[lon] => -84.2599
[lat] => 33.6747
)
)
That's my document (in MongoDB). I have several such documents and I want to run a query to find all documents that have a bounding box that encapsulates that specific Long and Lat. How would I do this?
Unfortunately, this is not currently possible with MongoDB. MongoDB can index points and find all of the documents inside an area, but it cannot index areas and query all of the documents containing an area that encloses a given point.
There is a feature request for this: https://jira.mongodb.org/browse/SERVER-2874
Presently there is no scheduled date for this feature. Please vote for it!
For more information on Geospatial Indexing and what it is capable of, please see the MongoDB documentation:
http://www.mongodb.org/display/DOCS/Geospatial+Indexing
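One possible workaround, sketched here in plain Python, is to reduce each axis-aligned bounding box to its minimum and maximum coordinates and compare the point against those. The field names minLon, maxLon, minLat and maxLat in enclosing_query are hypothetical: they assume you store the extremes alongside each document so that ordinary range operators can be used. This is an illustration under those assumptions, not a built-in MongoDB feature.

```python
def bounds(box):
    """Reduce a list of {'lon', 'lat'} corners to (min_lon, min_lat, max_lon, max_lat)."""
    lons = [corner["lon"] for corner in box]
    lats = [corner["lat"] for corner in box]
    return min(lons), min(lats), max(lons), max(lats)

def contains(box, lon, lat):
    """Client-side test: does the axis-aligned box enclose the point?"""
    min_lon, min_lat, max_lon, max_lat = bounds(box)
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

def enclosing_query(lon, lat):
    """Range query document, assuming hypothetical minLon/maxLon/minLat/maxLat fields."""
    return {
        "minLon": {"$lte": lon}, "maxLon": {"$gte": lon},
        "minLat": {"$lte": lat}, "maxLat": {"$gte": lat},
    }

# boundingBox taken from the document in the question.
atlanta = [
    {"lon": -84.516, "lat": 33.6747},
    {"lon": -84.516, "lat": 33.8232},
    {"lon": -84.2599, "lat": 33.8232},
    {"lon": -84.2599, "lat": 33.6747},
]
print(contains(atlanta, -84.39, 33.75))
print(enclosing_query(-84.39, 33.75))
```

This only works for boxes aligned to the longitude and latitude axes, which is the shape shown in the question.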

Having an SQL SELECT query, how do I get number of items?

I'm writing a web app in Perl using Dancer framework. The database is in sqlite and I use DBI for database interaction.
I'm fine with select statements, but I wonder is there a way to count selected rows.
E.g. I have
get '/' => sub {
my $content = database->prepare(sprintf("SELECT * FROM content LIMIT %d",
$CONTNUM));
$content->execute;
print(Dumper($content->fetchall_arrayref));
};
How do I count all items in the result without issuing another query?
What I want to achieve this way is showing 30 items per page and knowing how many pages there would be. Of course I can run SELECT COUNT (*) foo bar, but it looks wrong and redundant to me. I'm looking for a more or less general, DRY and not too heavy on database way to do so.
Any SQL or Perl hack or a hint what should I read about would be appreciated.
// I know using string concatenation for queries is bad
You have to do it the hard way: one query to get the count and another to get your desired slice of the row set:
my $count = $database->prepare('SELECT COUNT(*) FROM content');
$count->execute();
my $n = $count->fetchall_arrayref()->[0][0];
my $content = $database->prepare('SELECT * FROM content LIMIT ?');
$content->execute($CONTNUM);
#...
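The same two-query approach can be sketched with Python's built-in sqlite3 module, including the page count the question asks about. The table name content comes from the question; the id and body columns, the sample rows and the helper name fetch_page are assumptions for the example.

```python
import math
import sqlite3

def fetch_page(conn, page, per_page=30):
    """Return (rows, total, pages) for a 1-based page number."""
    # Query 1: total number of rows, used only to compute the page count.
    total = conn.execute("SELECT COUNT(*) FROM content").fetchone()[0]
    # Query 2: the requested slice, with bound parameters instead of sprintf.
    rows = conn.execute(
        "SELECT * FROM content ORDER BY id LIMIT ? OFFSET ?",
        (per_page, (page - 1) * per_page),
    ).fetchall()
    pages = math.ceil(total / per_page)
    return rows, total, pages

# Hypothetical sample data: 65 rows, so 3 pages of up to 30 items.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO content (body) VALUES (?)",
    [(f"item {i}",) for i in range(65)],
)
rows, total, pages = fetch_page(conn, page=3)
print(len(rows), total, pages)
```

The COUNT(*) query returns a single small row, so the extra round trip is usually cheap compared with fetching the page itself.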
I'm not too familiar with Perl, but I assume you can just store the result of $content->fetchall_arrayref and retrieve the count from that array before you print it.
[edit]
Something like
my $ref = $content->fetchall_arrayref;
my $count = scalar(@$ref);
Don't use sqlite myself but the following might work:
select * from table join (select count(*) from table);
Whether the above works or not the first thing I'd look for is scrollable cursors if you are going to page through results - I doubt sqlite has those. However, in DBI you can use fetchall_arrayref with a max_rows to fetch a "page" at a time. Just look up the example in the DBI docs under fetchall_arrayref - it is something like this:
my $rowcache = [];
while( my $row = ( shift(@$rowcache) || shift(@{$rowcache=$sth->fetchall_arrayref(undef,100)||[]}) )
) {
# do something here
}
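For comparison, the same batch-at-a-time idea can be sketched in Python with the standard DB-API fetchmany call. The helper name iter_rows and the sample table are made up for the example; the batch size of 100 mirrors the Perl snippet.

```python
import sqlite3

def iter_rows(cursor, batch_size=100):
    """Yield rows one at a time while fetching them from the cursor in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:          # an empty list means the result set is exhausted
            return
        yield from batch

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (a INTEGER)")
conn.executemany("INSERT INTO content VALUES (?)", [(i,) for i in range(250)])

cursor = conn.execute("SELECT a FROM content ORDER BY a")
seen = [row[0] for row in iter_rows(cursor)]
print(len(seen))
```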
UPDATE: Added what you'd get with selectall_hashref assuming the table is called content with one integer column called "a":
$ perl -le 'use DBI; my $h = DBI->connect("dbi:SQLite:dbname=fred.db"); my $r = $h->selectall_hashref(q/select * from content join (select count(*) as count from content)/, "a");use Data::Dumper;print Dumper($r);'
$VAR1 = {
'1' => {
'count' => '3',
'a' => '1'
},
'3' => {
'count' => '3',
'a' => '3'
},
'2' => {
'count' => '3',
'a' => '2'
}
};
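If you want to know how many results there will be, as well as getting the results themselves, all in one query, then get the count as a new column. Note that a bare COUNT(*) next to * turns the whole query into an aggregate, which collapses the result to a single row in SQLite and is rejected by many other databases, so compute the count in a scalar subquery instead:
SELECT (SELECT COUNT(*) FROM Table WHERE ...) AS num_rows, * FROM Table WHERE ...
Now the row count will be the first column of every row of your resultset, so simply pop that off before presenting the data.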
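A minimal sketch of the count-in-every-row idea, using Python's built-in sqlite3 module. It computes the count with a scalar subquery so that each returned row carries the total even when LIMIT is applied; the table content with one integer column a follows the earlier example, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (a INTEGER)")
conn.executemany("INSERT INTO content VALUES (?)", [(1,), (2,), (3,)])

# The scalar subquery is evaluated independently of the outer LIMIT,
# so num_rows is the full table count on every row.
rows = conn.execute(
    "SELECT (SELECT COUNT(*) FROM content) AS num_rows, * "
    "FROM content ORDER BY a LIMIT 2"
).fetchall()

total = rows[0][0] if rows else 0     # read the count from the first row
data = [row[1:] for row in rows]      # strip the count column before display
print(total, data)
```

If the query has a WHERE clause, the same condition has to be repeated inside the subquery for the count to match.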