What is the correct way to enable query cache? - google-cloud-sql

Based on the documentation, the SUPER privilege is not supported, which means that the following statement:
SET GLOBAL query_cache_size = 1000000;
results in an error message
Access denied; you need (at least one of) the SUPER privilege(s) for this operation
and does not allow us to set the query cache size.
What's the correct way to accomplish the task?

Unfortunately, Cloud SQL does not support query caching and query_cache_size cannot be set.
If you are experiencing performance issues, you can try changing your instance tier to give your instance access to more resources. Also, it is preferable to use InnoDB over MyISAM tables, because when a Cloud SQL instance is started, it gives most of the available memory to the InnoDB buffer pool.

As mhalt hints at, there is a good reason not to use the query cache:
You should be using InnoDB rather than MyISAM, as MyISAM is not robust enough for the cloud environment.
InnoDB has built-in caching as part of its buffer pool. This caches individual pages of data, rather than entire result sets.
The buffer pool generally provides superior caching to the query cache: 1) it does not get flushed after writes, 2) multiple different queries can be served from the same cached pages, and 3) it supports partial caching if the active set is larger than the available RAM.
The only workload where the query cache is superior is if you have a very low write rate and almost all your queries are exactly the same.
For this reason Cloud SQL is optimized by maximizing RAM allocated to the buffer pool instead of having a query cache.
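If you want to confirm that the buffer pool is doing the caching on your instance, you can look at the InnoDB buffer pool hit ratio. Below is a minimal sketch in Python using mysql-connector; the connection settings are placeholders for your own instance:

import mysql.connector  # pip install mysql-connector-python

# Placeholder connection settings: replace with your Cloud SQL instance details.
conn = mysql.connector.connect(host="127.0.0.1", user="root", password="secret")
cur = conn.cursor()

# Logical read requests vs. reads that actually had to hit disk.
cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests'")
requests = int(cur.fetchone()[1])
cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads'")
disk_reads = int(cur.fetchone()[1])

# A ratio close to 1.0 means almost all reads are served from the buffer pool.
print("buffer pool hit ratio: %.4f" % (1 - disk_reads / max(requests, 1)))

cur.close()
conn.close()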

Cloud SQL now supports query_cache flags:
https://cloud.google.com/sql/docs/mysql/flags
Note, however, that setting these flags may exclude the instance from SLA coverage.

Related

Postgres why is swap-usage growing? How to reduce it? - AWS RDS

I have a Postgres DB on AWS RDS and its swap usage is constantly rising.
Why is it rising? I tried rebooting, but it does not go down. AWS writes that high swap usage is "indicative of performance issues".
I am writing data to this DB. CPU and memory look healthy:
To be precise, I have a db.t2.micro instance with, at the moment, ~30 of 100 GB of data in 5 tables on General Purpose SSD, with the default postgresql.conf.
The swap-graph looks as follows:
Swap Usage warning:
It seems that your queries are using more memory than you have available, so you should look at your queries' execution plans and find the largest loads, i.e. the queries that exceed the memory available to PostgreSQL. Typical suspects are excessive joining (a bad database structure that might better be denormalized, if applicable), lots of nested queries, or queries with IN clauses. I would guess Amazon tuned postgresql.conf as far as possible, and those default values are quite good for this tiny machine.
But once again, as long as your swap usage does not exceed your available memory and you are on an SSD, there is not that much harm in it.
Check
select * from pg_stat_activity;
to see which queries are taking long and how many processes are sleeping, and adjust your RDS DB parameter group according to your needs (a sketch of such a check follows below).
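Here is a minimal Python sketch using psycopg2; the DSN is a placeholder for your RDS endpoint. It lists active queries by runtime and counts sleeping connections:

import psycopg2  # pip install psycopg2-binary

# Placeholder DSN: point it at your RDS endpoint.
conn = psycopg2.connect("host=mydb.example.rds.amazonaws.com dbname=mydb user=me password=secret")
cur = conn.cursor()

# Longest-running active queries first.
cur.execute("""
    SELECT pid, state, now() - query_start AS runtime, query
    FROM pg_stat_activity
    WHERE state <> 'idle'
    ORDER BY runtime DESC
""")
for pid, state, runtime, query in cur.fetchall():
    print(pid, state, runtime, (query or "")[:80])

# How many connections are just sleeping.
cur.execute("SELECT count(*) FROM pg_stat_activity WHERE state = 'idle'")
print("idle connections:", cur.fetchone()[0])

cur.close()
conn.close()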
Obviously you ran out of memory. db.t2.micro has only 1 GB of RAM. You should look at htop output to see which processes take most of the memory and try to optimize memory usage. There is also a nice utility called pgtop (http://ptop.projects.pgfoundry.org/) which shows current queries, the number of rows read, etc. You can use it to view your Postgres state in real time. By the way, if you cannot install pgtop, you can get the same information from Postgres' internal tools; check out the documentation of the Postgres stats collector: https://www.postgresql.org/docs/9.6/static/monitoring-stats.html
It is difficult to say exactly what the problem is, but db.t2.micro is a very limited instance. You should consider moving to a bigger instance, especially if you are using Postgres in production.

How to performance test MongoDB Storage Engine for website Session data?

I'm looking to utilize MongoDB for session data storage, so we don't need sticky sessions in our load balanced environment.
As of 3.0, we can use different storage engines within MongoDB.
While MMapV1 and WiredTiger come out of the box, it's also possible to run other storage engines (RocksDB?).
What I would like to do is test out my website using MongoDB with the different storage engines backed behind it.
I currently have a JMeter script that will hit multiple pages on the site for many different users.
Between tests I can switch out the Mongo connection, to different Mongod instances on different storage engines.
All I can really take out of this is the average latency for the page loads in JMeter.
Are there better results I can gather, possibly using different tools or techniques?
Or, for session data, which is heavily read/write, is there one storage engine that would be preferred over another?
I'm not sure if this question is too open-ended or not, but I thought I'd ask here to maybe get more direction about how to test this out.
An important advantage of WiredTiger over the default MMAP storage engine is that while MMAP locks the whole collection for a write, WiredTiger locks only the affected document(s). That means multiple users can change multiple documents at the same time. This is especially interesting in your case of session data, because you will likely have many website visitors at the same time, each one regularly updating their own session document. But when you want to test if this feature really provides a benefit in your use-case, you will have to build a more sophisticated test setup which simulates many simultaneous updates and requests from multiple users.
Another interesting feature of WiredTiger is that it compresses both data and indexes, which greatly reduces filesize. But this feature does of course cost performance. So when you only want to compare performance, you should switch off compression to have a fair comparison. The relevant config keys are:
storage.wiredTiger.collectionConfig.blockCompressor = none
storage.wiredTiger.indexConfig.prefixCompression = false
Keep in mind that changes to these keys will only take effect on newly created collections and indexes.
Another factor which could skew your results is cache size. The MMAP engine always uses all the RAM it can get to cache data. But WiredTiger is far more conservative and only uses half of the available RAM, unless you set a different value in
storage.wiredTiger.engineConfig.cacheSizeGB
So when you want a fair comparison, you should set this to the RAM size of the machine it runs on, minus the RAM required by other processes running on the same machine. But this will of course only make a difference when your test uses more test data than fits into memory, so that the cache handling of both engines starts to matter.
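If you want numbers beyond JMeter's page-level averages, you can also time the session reads and writes directly against each mongod instance. Below is a rough single-threaded sketch with pymongo; the database/collection names, document shape, and the two ports are assumptions for illustration:

import time
from pymongo import MongoClient  # pip install pymongo

def time_session_workload(uri, n=10000):
    # Insert, read back and update n fake session documents, then report throughput.
    coll = MongoClient(uri).perf_test.sessions
    coll.drop()
    start = time.perf_counter()
    for i in range(n):
        sid = "session-%d" % i
        coll.insert_one({"_id": sid, "user": i, "cart": []})
        coll.find_one({"_id": sid})
        coll.update_one({"_id": sid}, {"$set": {"last_seen": time.time()}})
    elapsed = time.perf_counter() - start
    print("%s: %.0f ops/sec" % (uri, 3 * n / elapsed))

# One mongod per storage engine, on different ports (assumed setup).
time_session_workload("mongodb://localhost:27017")  # e.g. MMAPv1
time_session_workload("mongodb://localhost:27018")  # e.g. WiredTiger

Note that a single-threaded loop like this will not exercise the document-level locking advantage described above; for that you would need many concurrent writers, for example multiple threads or the JMeter users you already have.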

Using memcache infront of a mongodb server

I am trying to understand how Mongo's internal cache works and whether it eliminates the need for memcache. Our database size is around 200 GB; the indexes fit in memory, but beyond the indexes there is not much free memory left on the server.
One of my colleagues says Mongo's internal cache will be as fast as memcache, so there is no need to introduce another level of complexity by using memcache.
The scenario in my head is when we read the data from db, it's saved in memcache and next time it's directly read from the cache instead of going back to db server. If the data is changed and needs to be saved/updated, it's done on both memcache server and database server.
I have been reading about this but couldn't convince myself yet. So I'd really appreciate if someone could shed some light on this.
The first thing is that cache storage is different from a database, so MongoDB and SQL are different in purpose and usage when compared to Memcache.
Memcache is really good at lowering working set sizes for queries. For example: imagine a huge aggregated query with subselects and CASE statements and what not in SQL (think of the most complex query you can), doing this query in realtime all the time could cause the computer(s) to "thrash" (not to mention the problems client side).
However, as everyone knows, you need only summarise this query into another collection/table for it to be instantly faster. The real speed of memcache comes from the fact that it is an in-memory key-value store. This is where MongoDB can lose out on speed, because it is not stored in memory; it is memory mapped, but not memory stored.
MongoDB does no self-caching of results, but provided the query is "hot" and its data is in the LRU (this is where your working set comes in), you shouldn't notice much of a difference in response times. A good way to ensure a query is "hot" is to run it. Some people keep a script of their biggest queries that they run to warm up the cache.
As I said, memcache is a cache layer. This is why the line
If the data is changed and needs to be saved/updated, it's done on both memcache server and database server.
makes me die a little inside. Many people do blur the line between the DB and the cache layer.
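For reference, the read-through/write-through pattern described in the question looks roughly like the sketch below. The pymemcache/pymongo setup, key scheme and TTL are assumptions for illustration, not a recommendation:

import json
from pymemcache.client.base import Client as MemcacheClient  # pip install pymemcache
from pymongo import MongoClient  # pip install pymongo

cache = MemcacheClient(("localhost", 11211))
users = MongoClient("mongodb://localhost:27017").app.users

def get_user(user_id):
    # Try the cache first; on a miss, fall back to MongoDB and populate the cache.
    key = "user:%s" % user_id
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    doc = users.find_one({"_id": user_id}, {"_id": 0})
    if doc is not None:
        cache.set(key, json.dumps(doc), expire=600)
    return doc

def update_user(user_id, fields):
    # Write to the database, then invalidate the cache entry rather than trying
    # to keep both copies in sync.
    users.update_one({"_id": user_id}, {"$set": fields})
    cache.delete("user:%s" % user_id)

Whether this extra layer is worth the complexity is exactly the point made above: if the working set already fits in the memory MongoDB has, the cache largely duplicates it.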

why memcached instead of hashmap

I am trying to understand what would be the need to go with a solution like memcached. It may seem like a silly question - but what does it bring to the table if all I need is to cache objects? Won't a simple hashmap do?
Quoting from the memcache web site, memcache is…
Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.
At heart it is a simple Key/Value store
A key word here is distributed. In general, quoting from the memcache site again,
Memcached servers are generally unaware of each other. There is no crosstalk, no synchronization, no broadcasting. The lack of interconnections means adding more servers will usually add more capacity as you expect. There might be exceptions to this rule, but they are exceptions and carefully regarded.
I would highly recommend reading the detailed description of memcache.
Where are you going to put that hashmap? That's what memcached is doing for you. Any structure you build in PHP only lives until the request ends. If you put the data in a persistent cache instead, you can fetch it back out for other requests rather than rebuilding it each time.
I know this question is rather old, but in addition to being able to share a cache across multiple servers, there is another aspect not mentioned in the other answers: value expiration.
If you store values in a HashMap bound to the application context, it will keep growing in size unless you expire items somehow. Memcached expires objects lazily for maximum performance.
When an item is added to memcached, it can have an expiration time, for instance 600 seconds. After the item has expired it simply remains in memory, but if something asks for it, memcached purges it and returns null.
Similarly, when memcached's memory is full, it looks for the first expired item of adequate size and evicts it to make room for the new item. Lastly, it can happen that the cache is full and there is no expired item to evict, in which case it replaces the least recently used items.
Using a full-fledged cache system usually also allows you to replicate the cache across many servers, or simply scale out to handle many parallel requests, while keeping response times acceptably fast.
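To make the contrast concrete, here is a minimal Python sketch using pymemcache; the key name and TTL are just illustrative. A plain dict lives and dies with the process, while a memcached entry is shared across processes and servers and expires on its own:

from pymemcache.client.base import Client  # pip install pymemcache

# A plain in-process hashmap: gone when the process (or PHP request) ends,
# invisible to other processes, and it never expires anything on its own.
local_cache = {}
local_cache["greeting"] = "hello"

# A memcached entry: reachable from any server that can talk to memcached,
# and expired automatically after 600 seconds.
client = Client(("localhost", 11211))
client.set("greeting", "hello", expire=600)
print(client.get("greeting"))  # b'hello' until it expires or is evicted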
There is an (old) article that compares different caching systems used by php:
https://www.percona.com/blog/2006/08/09/cache-performance-comparison/
According to it, file caching is faster than memcached.
So to answer the question, I believe you would get better performance using a file-based cache system.
Here are the results from the tests of the article:
Cache Type Cache Gets/sec
Array Cache 365000
APC Cache 98000
File Cache 27000
Memcached Cache (TCP/IP) 12200
MySQL Query Cache (TCP/IP) 9900
MySQL Query Cache (Unix Socket) 13500
Selecting from table (TCP/IP) 5100
Selecting from table (Unix Socket) 7400

Is Memcache recommended when using MongoDB?

I would like to know if Memcache is recommended when using a NoSQL database like mongoDB.
The concept of using memcache stems from the idea that you have "extra RAM" sitting around somewhere. Both MongoDB and MySQL (and most DBs) will take every meg of RAM that they can get.
In the case of the very common MySQL / Memcache, it is very well documented that using Memcache is more about reducing query load on the server than it is about speeding up queries. A good memcache implementation basically just tries to keep the most common data in memory so that the database server can churn away on bigger stuff.
In fact, it's been my experience that use of memcache generally becomes a reliance on memcache to maintain system performance.
So back to the original question, where do you have extra RAM?
If you have extra RAM on web servers, you may be able to use Memcache. Of course, you could also run Mongo locally on the web server. Just slave the data you need from the master.
If you have extra RAM on other computers, then there's not really a point in using memcache. Just add more nodes to your MongoDB replica set or shard. This is where MongoDB actually shines. Because of sharding/replication, you can add more RAM to Mongo horizontally to increase performance. With SQL it's very difficult to "just add more servers" because joins don't scale very well. But with Mongo, it's quite possible to simply "add more nodes" to a problem.
MongoDB stores everything in memory anyway and works in a similar vein, being a key-value based system; however, I believe MongoDB is more flexible, as it allows storing BSON objects within one another.
(Just for clarification, MongoDB uses BSON, a binary form of JSON, for storing all its data, which includes objects within objects.)
At first, no. If you run into performance problems later, add a caching layer (memcache). But you won't gain anything if you're going to use Redis, for example, as Redis already stores everything in memory.
The answer would depend on your use cases.
In general, accessing RAM is orders of magnitude faster than accessing disk.
Even the fastest SSD drives are about 100 times slower to access than RAM.
Now, I don't know whether Mongo has a caching system in place (most likely it does), or what its eviction policy is, but as a programmer I would prefer a cache where I can store, retrieve and delete items at will. Therefore I would prefer using a caching solution even with Mongo.
In summary, it really depends what you are using these solutions for. There is no one answer to cover all possible uses.