Does a small table within a big DB have the same performance as the same small table within a small DB? [closed] - rdbms

Does a small table within a big DB have the same performance as the same small table within a small DB?
(all about RDBMSs)
I want to understand whether I need to split my DB or not.

At first sight, and without more precision on the actual RDBMS and its underlying OS, the question is close to asking whether the performance of accessing a small file on a disk depends on the disk size. And the answer is that the size of the disk has no direct impact on performance.
It could have indirect impacts, if splitting the DB ends in splitting it across a number of systems, because that could increase scalability by also spreading the connections across the various DBs. But those are only indirect impacts, because caching parts of a large DB could achieve the same scalability without actually splitting the DB.
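To make this concrete, here is a minimal sketch (using SQLite purely as a stand-in, since the question names no specific RDBMS) that times lookups on an identical small table in an otherwise empty database and in one padded with a large unrelated table:

    import sqlite3
    import time

    def build_db(path, filler_rows):
        # A small 1,000-row lookup table, plus an optional large unrelated table.
        con = sqlite3.connect(path)
        con.execute("CREATE TABLE lookup (id INTEGER PRIMARY KEY, val TEXT)")
        con.executemany("INSERT INTO lookup VALUES (?, ?)",
                        [(i, "v%d" % i) for i in range(1000)])
        if filler_rows:
            con.execute("CREATE TABLE filler (id INTEGER PRIMARY KEY, pad TEXT)")
            con.executemany("INSERT INTO filler VALUES (?, ?)",
                            [(i, "x" * 200) for i in range(filler_rows)])
        con.commit()
        return con

    def time_lookups(con, n=10000):
        start = time.perf_counter()
        for i in range(n):
            con.execute("SELECT val FROM lookup WHERE id = ?", (i % 1000,)).fetchone()
        return time.perf_counter() - start

    small = build_db("small.db", filler_rows=0)
    big = build_db("big.db", filler_rows=500000)   # ~100 MB of unrelated data
    print("small DB:", time_lookups(small))
    print("big DB:  ", time_lookups(big))

On a typical setup the two timings come out comparable, because an indexed lookup touches only the pages of the small table and its index, not the rest of the file.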

Related

New Intel processors' KPTI bug: which slowdown to expect for floating point computation? [closed]

Some media have reported that a new hardware bug in Intel processors, allowing user mode processes to access kernel mode memory, has been discovered:
It is understood the bug is present in modern Intel processors produced in the past decade. It allows normal user programs – from database applications to JavaScript in web browsers – to discern to some extent the layout or contents of protected kernel memory areas.
The effects [of fixes] are still being benchmarked, however we're looking at a ballpark figure of five to 30 per cent slow down, depending on the task and the processor model.
After the bug is fixed, what slowdown should I expect for multicore floating point computations?
To my understanding, only the performance of switches between kernel and user mode is affected. For example, handling a lot of I/O is a workload where such switches are frequent, but CPU-intensive processes should not be affected as much.
To quote from one article that analyzes performance of the Linux KPTI patch:
Most workloads that we have run show single-digit regressions. 5% is a good round number for what is typical. The worst we have seen is a roughly 30% regression on a loopback networking test that did a ton of syscalls and context switches.
...
So the PostgreSQL SELECT command is roughly 20% slower with the KPTI workaround, and I/O in general seems to be impacted negatively according to Phoronix benchmarks, especially with fast storage – but not gaming performance, Linux kernel compilation, H.264 encoding, etc…
Source: https://www.cnx-software.com/2018/01/03/intel-hardware-security-bug-fix-to-hit-performance-on-windows-linux/
So, if your FP computations rely mostly on in-memory data and not on I/O, they should be mostly unaffected.
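A quick way to convince yourself of this split is a micro-benchmark that separates pure user-mode floating point work from a loop that crosses into the kernel on every iteration. A rough Python sketch (the pti=on / pti=off kernel boot options mentioned in the comment are Linux-specific):

    import os
    import time

    N = 200000

    # Pure floating point work: stays in user mode the whole time, so the
    # extra cost KPTI adds to kernel/user transitions should not show up here.
    start = time.perf_counter()
    acc = 0.0
    for i in range(N):
        acc += (i * 1.0000001) ** 0.5
    fp_time = time.perf_counter() - start

    # Syscall-heavy work: every os.stat() crosses into the kernel, which is
    # exactly the path KPTI makes more expensive.
    start = time.perf_counter()
    for _ in range(N):
        os.stat(".")
    sys_time = time.perf_counter() - start

    print("FP loop:      %.3fs" % fp_time)
    print("syscall loop: %.3fs" % sys_time)

Run before and after enabling the patch, the regression should be concentrated almost entirely in the syscall loop.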

MongoDB SSD vs more RAM [closed]

Is it better to order a VPS with an SSD drive or with more RAM? SSDs are still quite expensive, and MongoDB requires a lot of disk space. My current database size is ~50 GB, so would performance be better on an SSD with 1 GB RAM or an HDD with 8 GB RAM?
I'd start by estimating the size of the working set of memory used by MongoDB - some details on how to approach that estimate can be found here:
http://docs.mongodb.org/manual/faq/diagnostics/
It's a function of the set of data that your application routinely accesses plus the amount of the indexes that need to be kept in memory. If you don't have enough memory for that, my experience has been that performance drops off a cliff fairly quickly. I think SSDs are great for database performance all around, but again, my experience has been that you're better off making certain your working set remains in memory.
In my opinion, 1 GB for a production system is rather low - it needs to handle the OS, mongod and other daemons, the working set, plus anything else you run on the server. And keep in mind that if your database is growing, your working set is likely growing too, if for no other reason than indexes getting larger.
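As a rough starting point, here is a hedged PyMongo sketch that sums per-collection data and index sizes (the connection string and the database name "mydb" are placeholders for your own setup):

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # placeholder: your mongod
    db = client["mydb"]                                # placeholder database name

    total_data = 0
    total_index = 0
    for name in db.list_collection_names():
        stats = db.command("collstats", name)
        total_data += stats.get("size", 0)
        total_index += stats.get("totalIndexSize", 0)

    print("data:    %.2f GiB" % (total_data / 2.0 ** 30))
    print("indexes: %.2f GiB" % (total_index / 2.0 ** 30))

The working set is the routinely accessed slice of the data plus, ideally, all of the indexes; if the indexes alone come anywhere near 1 GB, the 8 GB RAM option will almost certainly win.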

Time to learn Cassandra/MongoDB [closed]

I am going to start learning NoSQL databases in practice (I've already done my research and understood the concepts and modeling approaches).
I am working with time series data, and both Cassandra and MongoDB are recommended for this use case.
I would like to know which one takes less time to learn. (Unfortunately, I don't have much time to spend on learning.)
PS: I noticed that there are more tutorials and documentation for MongoDB (am I correct?)
Thank you!
Having used them both extensively, I can say that the learning curve isn't as bad as you might think. But as different people learn and absorb information at different rates, it is difficult to say which you will find easier or how quickly you will pick them up. I will mention that MongoDB has a reputation for being very developer-friendly, since you can write code and store data without having to define a schema.
Cassandra has a somewhat steeper learning curve (IMO). However, that has been lessened by the CQL table-based column families in recent versions, which help bridge the understanding gap between Cassandra and a relational database. Since tutorials for MongoDB have been mentioned, I will post a link to DataStax's Academy, which offers a free online course you can take to introduce yourself to Cassandra. Specifically, the course DS220 deals with modeling your data.
With both, a key concept to understand is that you must break yourself of the relational-database habit of building your tables/collections to store your data most efficiently. In non-relational data modeling, you model your tables/collections to reflect how you want to query your data, and this might mean storing redundant data in some places.
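Since the question mentions time series data, here is a small hypothetical PyMongo sketch of that query-first mindset: readings are bucketed by sensor and day so the common query is a single document fetch, with the sensor's name deliberately duplicated into every bucket (all names here are made up for illustration):

    from datetime import datetime, timezone
    from pymongo import MongoClient

    db = MongoClient()["tsdemo"]  # hypothetical database

    def record_reading(sensor_id, sensor_name, value, ts):
        day = ts.strftime("%Y-%m-%d")
        db.readings_by_day.update_one(
            {"_id": "%s:%s" % (sensor_id, day)},
            {"$push": {"samples": {"t": ts, "v": value}},
             # sensor_name is duplicated into every daily bucket on purpose:
             # the read path never needs a join or a second query to show it.
             "$setOnInsert": {"sensor": sensor_id, "name": sensor_name}},
            upsert=True)

    now = datetime.now(timezone.utc)
    record_reading("s1", "boiler-temp", 71.3, now)

    # The query the model was designed for: one _id lookup, no joins.
    doc = db.readings_by_day.find_one({"_id": "s1:" + now.strftime("%Y-%m-%d")})
    print(doc["name"], len(doc["samples"]))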
qMongoFront is a Qt-based MongoDB GUI application for Linux. It's free and open source. If you want to learn MongoDB, qMongoFront is a good choice.
http://sourceforge.net/projects/qmongofront/

MongoDB: Should I use Solid State Devices? [closed]

I am pretty new to MongoDB. Going through MongoDB's online documentation (here), I came across the following statement:
Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads
Does this mean it is recommended that MongoDB run on SSDs in production for best performance (from a hardware perspective)? We are working on improving MongoDB performance for our product in production.
Before working on improving performance, it is wise to find your bottleneck. You could buy the fastest drive possible and it might do nothing, because the problem could be in your RAM or your queries. So before deciding which type of disk to buy, better check where exactly your problem is.
I would start by analysing your queries with explain and checking your indexes. If your problem is not in reads but in writes, check whether your documents were moved too many times (moves are expensive, so you have to investigate why they happened). You can check for them with db.system.profile.find({'op': 'update', 'moved': true}). Then see whether padding can help you mitigate the moves. If not, it may be a good idea to restructure your database.
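As a minimal PyMongo sketch of that workflow (the database and collection names are placeholders, and the 'moved' flag only exists under the old MMAPv1 storage engine this answer assumes):

    from pymongo import MongoClient

    db = MongoClient()["mydb"]  # placeholder database name

    # Profile all operations so document moves get recorded.
    db.command("profile", 2)

    # Updates that forced a document move; each move rewrites the whole
    # document elsewhere on disk (an MMAPv1-era cost, as described above).
    for op in db.system.profile.find({"op": "update", "moved": True}).limit(10):
        print(op["ns"], op.get("millis"), "ms")

    # And for reads, check a suspect query's plan before blaming the disk.
    plan = db.mycollection.find({"status": "open"}).explain()  # placeholder query
    print(plan["queryPlanner"]["winningPlan"])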
But if you want to look into hardware, take a look at this and the MongoDB video about hardware.

How does MongoDB handle memory management? [closed]

Quote from http://blog.engineering.kiip.me/post/20988881092/a-year-with-mongodb:
Poor Memory Management - MongoDB manages memory by memory mapping your entire data set, leaving page cache management and faulting up to the kernel. A more intelligent scheme would be able to do things like fault in your indexes before use as well as handle faulting in of cold/hot data more effectively. The result is that memory usage can’t be effectively reasoned about, and performance is non-optimal.
I don't understand his point. Can someone elaborate on this?
As @cHao said, it is a bit of a rant, and the author doesn't really understand just how massively complex and intricate the OS's own memory management is.
This is why MongoDB does not have its own memory management: doing so could cause a headache of problems and other nonsense. At the end of the day, the OS has a really good memory management process (even Windows does), so why not use that instead of creating one that would require years, maybe even decades, to get to the same level?
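To make the memory-mapping point concrete, here is a tiny Python sketch of the mechanism MongoDB's MMAPv1 engine relied on: the file is mapped into virtual memory and the kernel, not the application, decides what is actually resident (file name and sizes are arbitrary):

    import mmap

    # A 64 MiB file standing in for a MongoDB (MMAPv1) data file.
    with open("datafile.bin", "wb") as f:
        f.truncate(64 * 1024 * 1024)

    with open("datafile.bin", "r+b") as f:
        mm = mmap.mmap(f.fileno(), 0)  # map the whole file into virtual memory
        # Nothing is read from disk yet; the kernel faults pages in on access.
        first = mm[0]                   # page fault pulls in the first page
        middle = mm[32 * 1024 * 1024]   # another fault, another page
        # Which pages stay cached and which get evicted under memory pressure
        # is entirely the kernel's decision - the trade-off being debated here.
        mm.close()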
A more intelligent scheme would be able to do things like fault in your indexes before use
Not sure if MongoDB can read your mind...
I mean, what if you have actually designed your system right and don't need ALL your indexes (or even those you do need, in their entirety) in RAM at the same time?
Pre-emptively paging (not faulting) data in like that sounds counter-intuitive for a good setup.
If you require data to be in RAM at certain times, you can use touch() (http://docs.mongodb.org/manual/reference/command/touch/) or you can run a script of your common queries that you need in RAM.
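For illustration, a hedged PyMongo sketch of both options (names are placeholders; note that the touch command only applied to the MMAPv1 engine and has been removed from modern MongoDB):

    from pymongo import MongoClient

    db = MongoClient()["mydb"]  # placeholder database name

    # Pre-warm a collection's data and indexes into RAM with the touch
    # command referenced above (MMAPv1 only; removed in modern MongoDB).
    db.command("touch", "mycollection", data=True, index=True)

    # Portable alternative: replay your hot queries so the cache holds
    # exactly the documents and index ranges your application uses.
    for _ in db.mycollection.find({"status": "open"}):  # placeholder hot query
        pass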
The result is that memory usage can’t be effectively reasoned about, and performance is non-optimal.
Hmm, that person obviously never actually bothered to use the OS's own built-in tools to measure page faulting and memory access of the mongod process in their testing.
That being said, there is actually a tool in later versions of MongoDB to help gauge memory usage: http://docs.mongodb.org/manual/reference/command/serverStatus/#server-status-workingset