How to decrease the size of the db file created by LiteDB after deleting about 200 records - litedb

How do I decrease the size of the db file created by LiteDB after I have deleted about 200 records?
I have gone through the LiteDB docs and issues, but found no satisfactory answer.

You can call the Rebuild() method on your database:
using var db = new LiteDatabase(@"MyData.db");
db.Rebuild(); // rewrites the data file and reclaims the space left by deleted documents

Related

MongoDB doesn't work as expected (Realm.findAll)

I am a newbie to MongoDB Realm. I followed this guide to get started: https://www.mongodb.com/docs/realm/sdk/java/quick-start-sync/.
This is my implementation to fetch all employees from MongoDB.
val employeeRealmConfig = SyncConfiguration.Builder(
    realmApp.currentUser()!!,
    AppConfigs.MONGODB_REALM_USER_PARTITION_ID
).build()
backGroundRealm = Realm.getInstance(employeeRealmConfig)
val queryEmployeesTask = backGroundRealm.where<Employee>().findAll()
I print out the size of queryEmployeesTask, but each time I run my application a different result is printed, and the size is always less than 25000. I used MongoDB Compass to check the database, and there are 25000 records for partition AppConfigs.MONGODB_REALM_USER_PARTITION_ID.
I want to get the full 25000 records. Could you help me resolve this problem?
Thanks in advance.
After checking the documentation carefully, I realized that the Employee object in the client has a different schema from the MongoDB Atlas schema. After correcting this, val queryEmployeesTask = backGroundRealm.where<Employee>().findAll() returns the correct result.
I hope this helps someone who has the same problem.
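As a rough illustration only (not from the original answer): the fix amounts to making the client model mirror the server-side Employee schema field for field. The class below is a hypothetical Java model; every field name and type other than _id is a placeholder and must be replaced with whatever the Atlas schema actually defines.
import io.realm.RealmObject;
import io.realm.annotations.PrimaryKey;
import io.realm.annotations.RealmField;
import org.bson.types.ObjectId;

public class Employee extends RealmObject {
    @PrimaryKey
    @RealmField("_id")
    private ObjectId id;        // maps to the "_id" field in the Atlas schema

    private String name;        // placeholder field - must match the Atlas schema
    private String department;  // placeholder field - must match the Atlas schema

    // Realm requires private fields with getters and setters.
    public ObjectId getId() { return id; }
    public void setId(ObjectId id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getDepartment() { return department; }
    public void setDepartment(String department) { this.department = department; }
}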

Performance degrades with Mongo when using bulkWrite with upsert

I am using the Mongo Java driver 3.11.1 and Mongo version 4.2.0 for my development. I am still learning Mongo. My application receives data and has to either insert a new document or replace an existing one, i.e. do an upsert.
Each document is 780-1000 bytes as of now, and each collection can have more than 3 million records.
Approach 1: I tried using findOneAndReplace for each document, and it took more than 15 minutes to save the data.
Approach 2: I changed it to bulkWrite using the code below, which resulted in ~6-7 minutes for saving 20000 records.
List<Data> dataList; // populated with the incoming data
List<WriteModel<Document>> updates = new ArrayList<>();
ReplaceOptions updateOptions = new ReplaceOptions().upsert(true);
dataList.forEach(data -> {
    Document updatedDocument = new Document(data.getFields());
    updates.add(new ReplaceOneModel<>(eq("DataId", data.getId()), updatedDocument, updateOptions));
});
final BulkWriteResult bulkWriteResult = mongoCollection.bulkWrite(updates);
Approach 3: I tried using collection.insertMany, which takes 2 seconds to store the data.
As per the driver code, insertMany also internally uses MixedBulkWriteOperation for inserting the data, similar to bulkWrite.
My questions are:
a) I have to do an upsert operation; please let me know whether I am making any mistakes.
- I created an index on the DataId field, but it made less than 2 milliseconds of difference in performance.
- I tried using a writeConcern of W1, but the performance is still the same.
b) Why is insertMany's performance faster than bulkWrite? I could understand a difference of a few seconds, but I am unable to figure out the reason for 2-3 seconds for insertMany versus 5-7 minutes for bulkWrite.
c) Are there any approaches that can be used to solve this situation?
This problem was solved to a great extent by adding an index on the DataId field. Previously I had created the index on DataId, but I forgot to create it again after recreating the collection; a sketch of the index creation is shown below.
The link How to improve MongoDB insert performance helped in resolving the problem.
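For reference, a minimal sketch of creating that index with the 3.x Java driver; the connection string, database and collection names are placeholders, and only the DataId field name comes from the question.
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class CreateDataIdIndex {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> collection =
                    client.getDatabase("mydb").getCollection("data");
            // Index the field used in the ReplaceOneModel filter so each upsert can
            // locate the existing document without scanning the whole collection.
            collection.createIndex(Indexes.ascending("DataId"));
        }
    }
}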

Are indexes deleted after deleting a file in GridFS?

I am using GridFS in MongoDB via the Java driver 2.13.
I inserted a file:
File file = new File("/home/dev/abc.pdf");
GridFSInputFile inputFile = gfs.createFile(file);
inputFile.save();
I removed it using its _id, which is p_id in this case:
DBObject query = new BasicDBObject("_id", p_id);
gfs.remove(query);
I came to know that GridFS maintains a compound index on the chunks collection, over the _id of the file's metadata document and the chunk number.
Are these indexes deleted after deleting the file in GridFS?
Index changes happen synchronously with data changes: if you deleted a file, the index was updated at the same time.
Are these indexes deleted after deleting the file in GridFS?
You are likely just confused about the terminology, so to clarify: "deleting an index" means un-indexing a collection (removing the index data for all documents and not maintaining it in the future). What happens here is an "index update", where certain entries of the index are updated when you change, insert, or remove the corresponding data documents.
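To see this for yourself, you can list the indexes on the GridFS collections before and after the remove: the index definitions stay in place, only their entries change. Below is a small sketch using the legacy 2.x driver API from the question; the host, database name, and default fs bucket are assumptions.
import com.mongodb.DB;
import com.mongodb.MongoClient;

public class ShowGridFsIndexes {
    public static void main(String[] args) {
        MongoClient mongo = new MongoClient("localhost"); // placeholder host
        DB db = mongo.getDB("test");                      // placeholder database
        // The default GridFS bucket stores metadata in fs.files and data in fs.chunks;
        // fs.chunks carries the compound { files_id: 1, n: 1 } index.
        System.out.println(db.getCollection("fs.files").getIndexInfo());
        System.out.println(db.getCollection("fs.chunks").getIndexInfo());
        mongo.close();
    }
}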

Taking hours to iterate 300 million mongo db records

I'm iterating over all the Mongo documents on a Mongo slave using the Mongo Java API.
Mongo server: 2.4.10
Number of records on the slave: 300 million.
I have one Mongo master and one Mongo slave (no sharding).
The Mongo slave is replicated at a very high frequency: 2000 insertions and deletions every 10 seconds.
The iteration is taking more than 10 hours.
My goal is to fetch each record in the collection, create a CSV, and load it into Redshift.
DB db = null;
DBCursor cursor = null;
Map<String, String> resultMap = new HashMap<>();
MongoClient mongo = new MongoClient(mongoHost);
mongo.slaveOk();
db = mongo.getDB(dbName);
DBCollection dbCollection = db.getCollection(dbCollectionName);
cursor = dbCollection.find();
while (cursor.hasNext()) {
    DBObject resultObject = cursor.next();
    String uid = (String) ((Map) resultObject.get("user")).get("uid");
    String category = (String) resultObject.get("category");
    resultMap.put(uid, category);
    if (resultMap.size() >= csvUpdateBatchSize) {
        // store to a csv - append to an existing csv, then clear resultMap for the next batch
    }
}
Is there a way to bring the iteration time down to below 1 hour?
Infrastructure changes can be made too, like adding shards.
Please suggest.
Have you considered performing a parallel mongoexport on your collection?
If you have a way to partition your data with a query (something like a modulo over an id or an indexed field), you can run several exports in parallel and pipe each one into your program as standard input.
Your program will then handle each document as a JSON row, which you can load into a designated object representing the document structure with GSON or a similar library, and then run your logic on that object.
Using mongoexport and adding parallelism can improve your performance greatly.
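As a rough sketch of the suggested pipeline (the class name, the example command, and the CSV layout are all illustrative): each mongoexport partition writes one JSON document per line, which a small consumer can turn into CSV rows. Only the user.uid and category fields come from the question's code.
import com.google.gson.Gson;
import com.google.gson.JsonObject;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical consumer for one mongoexport partition, e.g.:
//   mongoexport --host <slave> -d <db> -c <coll> -q '<partition query>' | java JsonToCsv part-0.csv
public class JsonToCsv {
    public static void main(String[] args) throws Exception {
        Gson gson = new Gson();
        try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(System.in, StandardCharsets.UTF_8));
             BufferedWriter out = Files.newBufferedWriter(
                     Paths.get(args[0]), StandardCharsets.UTF_8,
                     StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Each mongoexport line is one JSON document.
                JsonObject doc = gson.fromJson(line, JsonObject.class);
                String uid = doc.getAsJsonObject("user").get("uid").getAsString();
                String category = doc.get("category").getAsString();
                out.write(uid + "," + category);
                out.newLine();
            }
        }
    }
}
Running one such consumer per partition query in parallel replaces the single 10-hour cursor scan with several independent export streams.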

Updating index in Lucene.NET

I am doing a POC on search using Lucene.NET.
I fire a stored procedure which fetches around 50000 records from the database.
These records I put into the Lucene index.
Now, when the records in the database change, how do I update the Lucene index?
Deleting the entire previous index and creating a new one will take a lot of time.
I want to append the new records from the database to the existing index.
How can I achieve this?
Any ideas?
Thanks,
Aneesh
Just use Lucene's AddDocument method, something like this:
// open the existing index for appending (create = false)
IndexWriter iw = new IndexWriter(folder, GetAnalyzer(), false);
try
{
    Document luceneDoc = new Document();
    // add fields to the lucene document
    iw.AddDocument(luceneDoc); // appends the document to the existing index
}
finally
{
    iw.Close(); // commits the changes and releases the write lock
}