Difference between db.runCommand({getlasterror:1,fsync:true}) and db.runCommand({getlasterror:1}) in MongoDB?

I understand that getLastError guarantees that the write has been written to a file. This means that even if the computer loses power, the previous write is still safe.
But what is the use of fsync:true?

Essentially, getLastError checks for an error in the last database operation on the current connection. If you run the command with the fsync option, it will also flush the data to the data files (by default MongoDB does this every 60 seconds).
You can find more details here and here.
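As a quick illustration, here are the two forms side by side in the mongo shell (a sketch; getLastError is the legacy write-concern command, since superseded by write concern options on the write itself):

// Report the outcome of the last write on this connection
db.runCommand({ getlasterror: 1 })

// Same check, but also force an immediate flush of in-memory data
// to the data files instead of waiting for the periodic (~60s) sync
db.runCommand({ getlasterror: 1, fsync: true })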

Related

Is db.stats() a blocking call for MongoDB?

While researching how to check the size of a MongoDB database, I found this comment:
Be warned that dbstats blocks your database while it runs, so it's not suitable in production. https://jira.mongodb.org/browse/SERVER-5714
Looking at the linked bug report (which is still open), it quotes the Mongo docs as saying:
Command takes some time to run, typically a few seconds unless the .ns file is very large (via use of --nssize). While running other operations may be blocked.
However, when I check the current Mongo docs, I don't find that text. Instead, they say:
The time required to run the command depends on the total size of the database. Because the command must touch all data files, the command may take several seconds to run.
For MongoDB instances using the WiredTiger storage engine, after an unclean shutdown, statistics on size and count may be off by up to 1000 documents as reported by collStats, dbStats, count. To restore the correct statistics for the collection, run validate on the collection.
Does this mean the WiredTiger storage engine changed this to a non-blocking call by keeping ongoing stats?
A bit late to the game, but I found this question while looking for the answer, and the answer is: yes. Until 3.6.12 / 4.0.5 it acquired a "shared" lock ("R"), which blocked all write requests during execution. Since then it takes an "intent shared" lock ("r"), which doesn't block write requests. Read requests were not impacted.
Source: https://jira.mongodb.org/browse/SERVER-36437
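If you want to observe this yourself, a rough mongo shell sketch (the currentOp filter on the command document is an assumption; the exact output fields vary by server version):

// In one shell, run the statistics command
db.runCommand({ dbStats: 1 })

// From a second shell, look up the in-progress operation and check
// its lock mode: "R" (shared, blocks writes) vs "r" (intent shared)
db.currentOp({ "command.dbStats": { $exists: true } })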

Write-ahead-logging (WAL) used in Postgres

I want to do some work with write-ahead logging (WAL) in Postgres. Could anyone point me to the WAL implementation in the Postgres codebase? I just want to understand the current implementation and start modifying it. Any version of Postgres is fine as long as it has WAL.
Thanks in advance.
The main part of the code is here:
src/backend/access/transam/xlog.c
And:
src/backend/access/transam/README
But of course the need to do WAL permeates the entire code base.
You have picked perhaps the most difficult possible starting point to get your feet wet. (I should know--that is also how I did it).
WAL is write-ahead logging. Basically, before the database actually performs an operation, it writes in a log what it's about to do. Then it goes and does it. This ensures data consistency. Let's say the computer was powered off suddenly. There are several points at which that could happen:
1) Before a write - in this case the database would be fine with or without write-ahead logging.
2) During a write - without write-ahead logging, if the machine is powered off during a write, the database has no way of knowing what remained to be written, or what was being written. With Postgres, this is further broken down into two possibilities:
The power-off occurred while it was writing to the log - in this case, the log is rolled back. The database is unaffected because the data was never written to the database proper.
The power-off occurred after writing to the log, while writing to disk - in this case, Postgres can simply read from the log what was supposed to be written, and complete the write.
3) After a write - again, this does not affect Postgres either with or without WAL.
In addition, WAL increases PostgreSQL's efficiency, because it can delay random-access writes to disk and just do sequential writes to the log for a long time. This reduces the amount of head seeking the disks are doing. If you store your WAL files on a different disk, you get even more speed advantages.
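If you want to poke at WAL from SQL before diving into xlog.c, a small sketch (the function names assume PostgreSQL 10 or later; older releases used the pg_current_xlog_location naming):

-- Current WAL write position and the WAL segment file it falls in
SELECT pg_current_wal_lsn(), pg_walfile_name(pg_current_wal_lsn());

-- Settings worth reading alongside the code
SHOW wal_level;
SHOW checkpoint_timeout;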

MongoDB verify if fsync is working

I perform an fsync operation in MongoDB, but it just returns immediately. It doesn't appear to have done anything. How do I verify whether the data has actually been flushed to disk?
NOTE: I set syncdelay to 0, which means that writes won't be automatically fsync'ed every 60 seconds.
My actual command, using the Perl driver, is:
$connection->fsync({async=>1});
Thanks.
If you don't want the fsync to return immediately, then you can remove the async option and it will become a blocking operation.
But if you don't want it to be blocking, you can use db.currentOp from the shell to query the current state of the fsync.
If you want to get that information from Perl, you can use the technique I outlined in this answer. Unfortunately there's no convenient way to get it directly via run_command.
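A shell-side sketch of both approaches (the async option mirrors the Perl call above and may not exist on current servers; the currentOp inspection is an assumption, since the fields reported for a background flush vary by version):

// Blocking flush: the command only returns after data has been written to disk
db.adminCommand({ fsync: 1 })

// Asynchronous flush, like the Perl call above, then poll its progress
db.adminCommand({ fsync: 1, async: true })
db.currentOp(true)   // look for the fsync / background-flush entry in inprog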

Whether MongoDB has logs for each insertion and removal

I am wondering whether MongoDB has logs for each insertion and removal, i.e. monitoring or backup capabilities?
You can actually create an extremely verbose log of writes and reads.
When you run mongod you can set the --diaglog parameter (http://docs.mongodb.org/manual/reference/mongod/#cmdoption-mongod--diaglog), which, when set to 1, logs every write operation, including insertions and deletions.
Look at the oplog; it could be what you are looking for. See docs.mongodb.org and www.briancarpio.com.
By default, only queries that take longer than slowms are logged. However, you can log every query with profiling level 2.
See here for more details
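A shell sketch of the two approaches mentioned above (the oplog query assumes the server runs as a replica set):

// Profile every operation (level 2), then inspect recent writes
db.setProfilingLevel(2)
db.system.profile.find({ op: { $in: ["insert", "remove"] } }).sort({ ts: -1 }).limit(5)

// The oplog records every replicated write (replica sets only)
use local
db.oplog.rs.find().sort({ $natural: -1 }).limit(5)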

PostgreSQL. Slow queries in log file are fast in psql

I have an application written with Play Framework 1.2.4, Hibernate (default C3P0 connection pooling) and a PostgreSQL database (9.1).
Recently I turned on slow query logging (>= 100 ms) in postgresql.conf and found some issues.
But when I tried to analyze and optimize one particular query, I found that it is blazing fast in psql (0.5 - 1 ms) compared to the 200-250 ms in the log. The same thing happened with the other queries.
The application and the database server are running on the same machine and communicate over the localhost interface.
JDBC driver - postgresql-9.0-801.jdbc4
I wonder what could be wrong, because the query duration in the log covers only database processing time and excludes external factors such as network round trips.
Possibility 1: If the slow queries occur occasionally or in bursts, it could be checkpoint activity. Enable checkpoint logging (log_checkpoints = on), make sure the log level (log_min_messages) is 'info' or lower, and see what turns up. Checkpoints that're taking a long time or happening too often suggest you probably need some checkpoint/WAL and bgwriter tuning. This isn't likely to be the cause if the same statements are always slow and others always perform well.
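For example, a minimal postgresql.conf change to surface the checkpoint activity described above (values as named in the answer):

# postgresql.conf
log_checkpoints = on       # log each checkpoint with its timings
log_min_messages = info    # make sure the checkpoint lines are not filtered out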
Possibility 2: Your query plans are different because you're running them directly in psql while Hibernate, via PgJDBC, will at least sometimes be doing a PREPARE and EXECUTE (at the protocol level so you won't see actual statements). For this, compare query performance with PREPARE test_query(...) AS SELECT ... then EXPLAIN ANALYZE EXECUTE test_query(...). The parameters in the PREPARE are type names for the positional parameters ($1,$2,etc); the parameters in the EXECUTE are values.
If the prepared plan is different to the one-off plan, you can set PgJDBC's prepare threshold via connection parameters to tell it never to use server-side prepared statements.
This difference between the plans of prepared and unprepared statements should go away in PostgreSQL 9.2. It's been a long-standing wart, but Tom Lane dealt with it for the upcoming release.
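A concrete version of that comparison (the table and column names here are made up for illustration):

-- One-off plan, as psql would run it
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- Prepared plan, closer to what PgJDBC sends on Hibernate's behalf
PREPARE test_query(integer) AS
    SELECT * FROM orders WHERE customer_id = $1;
EXPLAIN ANALYZE EXECUTE test_query(42);
DEALLOCATE test_query;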
It's very hard to say for sure without knowing all the details of your system, but I can think of a couple of possibilities:
The query results are cached. If you run the same query twice in a short space of time, it will almost always complete much more quickly on the second pass. PostgreSQL maintains a cache of recently retrieved data for just this purpose. If you are pulling the queries from the tail of your log and executing them immediately this could be what's happening.
Other processes are interfering. The execution time for a query varies depending on what else is going on in the system. If the queries are taking 100ms during peak hour on your website when a lot of users are connected but only 1ms when you try them again late at night this could be what's happening.
The point is you are correct that the query duration isn't affected by which library or application is calling it, so the difference must be coming from something else. Keep looking, good luck!
There are several possible reasons. First, if the database was very busy when the slow queries executed, they may run more slowly. So you may need to observe the OS load at that moment for further analysis.
Second, the plan used at the time may be different from the plan in your current session. So you may need to install auto_explain to see the actual plan of the slow query.
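A minimal auto_explain setup, as a sketch (the threshold is illustrative):

# postgresql.conf
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '100ms'   # log plans for statements slower than this
auto_explain.log_analyze = on             # include actual row counts and timings

Note that log_analyze adds measurable overhead to every statement, so enable it with care on a production system.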