I'm working on a project where I need autocomplete over a data set of 5 million objects (the schema differs between objects).
My first thought was to use SQL, but since the schema keeps changing it will not be fast.
So I thought about MongoDB.
Two questions:
1 - do you have working sample code that I can use?
2 - is Mongo the best solution here? Will it be fast? Is there another NoSQL database that I could use instead?
If time is critical and you want the fastest database, then Redis may be what you are looking for. Here is a link to the autocomplete blog post using Redis.
MongoDB is a great database with many strong features, so it may be a good choice as well.
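For illustration, one common way to do prefix completion with a Redis sorted set (sketched here in Python with redis-py; the key name and sample data are made up):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Store every completable phrase with score 0 so the sorted set keeps them
# in lexicographic order and can be range-scanned by prefix.
phrases = ["samsung galaxy", "samsung tv", "sandisk ultra", "sony bravia"]
r.zadd("autocomplete", {p: 0 for p in phrases})

def complete(prefix, limit=10):
    # ZRANGEBYLEX between "[prefix" and "[prefix\xff" returns every member
    # that starts with the prefix.
    return r.zrangebylex("autocomplete", "[" + prefix, "[" + prefix + "\xff",
                         start=0, num=limit)

print(complete("sam"))  # ['samsung galaxy', 'samsung tv']
```

With 5 million entries this stays fast because the lookup is a range scan over a sorted structure rather than a scan of the whole data set.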
My Spring Boot API is supposed to read data from a collection in one database and, before returning the response, insert a document into a collection in another database.
I am looking for a quick and efficient way to do this. I searched and found that I can make two entries in my application.properties and create two different MongoTemplate connections using those. But I am looking for a cleaner and more compact way to do this (if any).
Refer to
https://github.com/Mohit-Hurkat/spring-boot-multi-mongo
It uses two templates, but in a clean and simple way.
https://github.com/Mohit-Hurkat/multi-tenant-spring-mongodb
You can use the change stream concept in MongoDB.
Whenever there is a change in one database, you can pick it up and automatically apply it to the other database.
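The question is about Spring Boot, but change streams are a server-side MongoDB feature (they require a replica set). A rough sketch of the idea, shown in Python/pymongo for brevity (database, collection, and connection details are made up; the Java driver exposes the same watch() concept):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # must be a replica set for change streams
source = client["db_one"]["orders"]
target = client["db_two"]["orders_copy"]

# Watch for new documents in db_one.orders and copy each one into db_two.
with source.watch([{"$match": {"operationType": "insert"}}]) as stream:
    for change in stream:
        target.insert_one(change["fullDocument"])
```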
I would like to know if there is a way to output to MongoDB in Luigi. I see in the documentation that they support files (local FS, HDFS), S3, and PostgreSQL, but not MongoDB. If not, could someone explain to me why? Maybe it is a bad idea to have it? I would like to store the data in a database because then I can explore it by querying it. However, I am already using MongoDB and I would not like to install another database. I do not need a relational database, since I am only using the database to store and query (NoSQL) data without relationships, so the best option is MongoDB.
Basically I need a task to read the data and save it in the database. Then the next task takes that output and processes the data.
Any recommendation, suggestion or clarification is more than welcome. Thanks!
You can try using mortar-luigi.
Check out this link for MongoDB tasks and this example.
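If mortar-luigi does not fit, a plain Luigi task can write to MongoDB with pymongo and use a small marker file as its output so Luigi knows the load finished; the next task then requires it and reads the documents back. A minimal sketch (all paths, database, and collection names are made up):

```python
import json
import luigi
from pymongo import MongoClient

MONGO_URI = "mongodb://localhost:27017"  # assumption: local MongoDB

class SaveToMongo(luigi.Task):
    """Read JSON lines from a file and store them in MongoDB."""
    input_path = luigi.Parameter()

    def output(self):
        # MongoDB itself is not used as the target here; a marker file
        # records that the load completed.
        return luigi.LocalTarget(self.input_path + ".loaded")

    def run(self):
        collection = MongoClient(MONGO_URI)["mydb"]["records"]
        with open(self.input_path) as f:
            docs = [json.loads(line) for line in f if line.strip()]
        if docs:
            collection.insert_many(docs)
        with self.output().open("w") as marker:
            marker.write("done")

class ProcessRecords(luigi.Task):
    """Next task: query the stored documents and process them."""
    input_path = luigi.Parameter()

    def requires(self):
        return SaveToMongo(input_path=self.input_path)

    def output(self):
        return luigi.LocalTarget(self.input_path + ".processed")

    def run(self):
        collection = MongoClient(MONGO_URI)["mydb"]["records"]
        with self.output().open("w") as out:
            for doc in collection.find({}):
                out.write(str(doc.get("_id")) + "\n")

if __name__ == "__main__":
    luigi.run()
```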
Good day
I'm using QtJsonDb from http://qt-project.org/wiki/Building_QtJsonDb_from_Git as a JsonDb backend NoSQL database.
It used to work very well, but now I have over 10,000 records and it's becoming very slow.
I'm saving somewhat complex objects to the db.
1 - how fast should the db be when retrieving the details?
2 - is there a 3rd party application or framework where I can load the JSON files, test the queries on them, and see how the performance compares?
Thanks!
Look at MongoDB; it can store data as JSON and it lets you add custom indexes for quick retrieval.
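For example, with pymongo you can index the fields your queries filter and sort on (the field and collection names below are made up):

```python
from pymongo import MongoClient, ASCENDING

collection = MongoClient("mongodb://localhost:27017")["mydb"]["items"]

# A compound index on the filtered/sorted fields keeps lookups fast
# even well past 10,000 documents.
collection.create_index([("category", ASCENDING), ("name", ASCENDING)])

# This query is now answered from the index instead of a collection scan.
for doc in collection.find({"category": "books"}).sort("name", ASCENDING).limit(20):
    print(doc["name"])
```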
We have an app that uses a Postgres database with about 50 tables. Each table contains about 3 million records (on average). The tables get updated with new data every now and then. Now we want to implement a search feature in our app. The search needs to be performed on one table at a time (no joins needed).
I've read about Postgres full-text search support and that looks promising. But it seems that Solr is super fast in comparison. Can I use my existing Postgres database with Solr? If tables get updated, would I need to re-index everything again?
It is definitely worth giving Solr a try. We moved many MySQL queries involving JOINs on multiple tables with sorting on different fields to Solr. We are very happy with Solr's search speed, sort speed, faceting capabilities and highly configurable text analysis/tokenization options.
If tables get updated would I need to re-index everything again?
No, you can run delta imports to only re-index your new and updated documents. See https://wiki.apache.org/solr/DataImportHandler.
Get started with https://lucene.apache.org/solr/4_1_0/tutorial.html and all the links in there.
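To make the delta-import part concrete, here is a rough Python sketch of triggering the DataImportHandler and querying the core (the core name and URL are made up, pysolr is just one convenient client, and this assumes a data-config.xml with a deltaQuery pointing at your Postgres tables):

```python
import requests
import pysolr

SOLR_CORE_URL = "http://localhost:8983/solr/mytable"  # made-up core name

# Delta import: only rows changed since the last import (per the deltaQuery
# in data-config.xml) are re-indexed, not the whole table.
requests.get(SOLR_CORE_URL + "/dataimport",
             params={"command": "delta-import", "clean": "false", "commit": "true"})

# Query the index.
solr = pysolr.Solr(SOLR_CORE_URL, timeout=10)
for doc in solr.search("title:widget", rows=10):
    print(doc.get("title"))
```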
Since nobody has leapt in, I'll answer.
I'm afraid it all depends. It depends on (at least)
how big the text is in each "document"
how flexible you want your searching to be
how much integration you need between database and text-search
how fast is fast enough
how much experience you have with both
When I've had a database that needs some text searching, I've just used PG's built-in options. If I didn't have superuser access to the db, or was already running a big Java setup, then Solr might well have appealed.
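For reference, a minimal sketch of those built-in options from Python (table and column names are made up; on millions of rows you would also add a GIN index on the tsvector expression):

```python
import psycopg2

conn = psycopg2.connect("dbname=myapp user=myuser")
cur = conn.cursor()

# Postgres built-in full-text search, ranked by relevance.
cur.execute("""
    SELECT id, title,
           ts_rank(to_tsvector('english', body),
                   plainto_tsquery('english', %s)) AS rank
    FROM documents
    WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %s)
    ORDER BY rank DESC
    LIMIT 20
""", ("search terms", "search terms"))

for row in cur.fetchall():
    print(row)

# Supporting index (run once):
#   CREATE INDEX documents_body_fts
#   ON documents USING GIN (to_tsvector('english', body));
```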
I'm committed to using SQLite without Core Data.
I need to speed up a function which performs some database transactions after querying the database. I've created a dictionary for the rows with all the values I'll need.
I need to do this to avoid the database locking.
At the moment I'm calling my add record to database function, which opens and closes the database each time.
Obviously this is where the process is slow.
I was thinking that it's common for apps to be embedded with a database setup script, so it must be possible to run a batch of queries.
So I'm thinking if I can build up a string with all my queries I could just execute that.
But I'm not 100% sure this is the best approach, or how to execute batch queries.
Can anyone advise me how to proceed?
For starters, check out these links:
how-do-i-improve-the-performance-of-sqlite
ios-coredata-batch-insert (Yes, I know you said no Core Data, but it is worth a read)
fast-bulk-inserts-into-sqlite (Looks similar in content to the first link)
I was about to do the same - using plain SQLite instead of Core Data - but changed my mind later. In that process I found this link useful: Improve INSERT-per-second performance of SQLite? Beyond the obvious (transactions, prepared statements, ...) it uses some SQLite-specific performance tweaks.
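On iOS the actual code will use the SQLite C API (sqlite3_exec(db, "BEGIN TRANSACTION", ...), a statement prepared once with sqlite3_prepare_v2 and reused, then "COMMIT"), but the technique those links describe is language-independent: open the database once, wrap the whole batch in a single transaction, and reuse a prepared statement instead of concatenating one big SQL string. A sketch of the pattern in Python's sqlite3 module (table and data are made up):

```python
import sqlite3

rows = [("alice", 30), ("bob", 25), ("carol", 41)]  # values collected up front

conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")

# One transaction around the whole batch instead of one per insert;
# this is where almost all of the speedup comes from.
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO people (name, age) VALUES (?, ?)", rows)

conn.close()
```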