Memcache to use disk storage?


Can Memcache be configured to use disk storage instead of RAM?
I am running a Memcache server on a High I/O Amazon EC2 instance. The instance has 2 TB of SSD storage available. Can I configure Memcache to use the SSD storage to store the cache contents?
Thanks

Twitter just released their Memcache on SSD, named fatcache

As far as I know this does not work. But there are alternatives as pointed out here: memcached-like key/value cache that uses both RAM and disk

Instead of memcache you could use Redis, which writes to disk and will use your SSD.

Related

Mongodb on the cloud

I'm preparing my production environment on the Hetzner cloud, but I have some doubts (I'm more of a developer than a devops person).
I will get 3 servers for the replica set, each with 8 cores, 32 GB of RAM and a 240 GB SSD. I'm a bit worried about the size of the SSD the server comes with, and Hetzner makes it possible to create volumes and attach them to the servers. Since MongoDB uses a single folder for the db data, I was wondering how I can use the 240 GB that comes with the server in combination with external volumes. At the beginning I can use the 240 GB, but then I will have to move the data folder to a volume when it reaches capacity. I'm fine with this, but it looks to me that once I move to volumes, this 240 GB will not be used anymore (yes, I can use it to store the MongoDB journal, as they suggest keeping it on a separate partition).
So, my noob question is: how can I use both the disk that comes with the server and the external volumes?
Thank you

Low Ram Usage in Postgres CloudSQL instance?

I have a production environment set up with a Postgres Cloud SQL instance. My database is around 30 GB, and I have 8 GB of RAM on the master and 16 GB on the slave. One weird thing is that the memory usage on both master and slave is stuck at 43%. I am not sure what the reason is. Can anyone help with this?

I cannot tell what number exactly the graph represents, but I assume it is allocated memory.
Then that would be fine, because the "free" RAM is actually used by the kernel to cache files, and PostgreSQL uses that memory indirectly via the kernel cache.
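To make the allocated-versus-cache distinction concrete, here is a minimal sketch of how the numbers break down on a self-managed Linux host. Cloud SQL does not expose /proc/meminfo, so this is only an illustration: the sample values are made up, and parse_meminfo and memory_breakdown are hypothetical helper names.

```python
# Hypothetical /proc/meminfo excerpt; the values are invented for illustration.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:          500000 kB
Buffers:          100000 kB
Cached:          4000000 kB
"""


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    return values


def memory_breakdown(info):
    """Split total RAM into allocated, kernel cache, and free percentages."""
    total = info["MemTotal"]
    cache = info["Buffers"] + info["Cached"]
    allocated = total - info["MemFree"] - cache
    return {
        "allocated_pct": 100 * allocated / total,
        "cache_pct": 100 * cache / total,
        "free_pct": 100 * info["MemFree"] / total,
    }


breakdown = memory_breakdown(parse_meminfo(SAMPLE_MEMINFO))
for name, pct in breakdown.items():
    print(f"{name}: {pct:.1f}%")
```

With these made-up numbers, less than half of the RAM is allocated by processes while most of the rest holds cached file data, which is the situation described above.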

Migrate to kubernetes

We're planning to migrate our software to run in Kubernetes with autoscaling. This is our current infrastructure:
PHP and Apache are running on a Google Compute Engine n1-standard-4 (4 vCPUs, 15 GB memory)
MySQL is running in Google Cloud SQL
Data files (csv, pdf) and the code are stored on a single SSD Persistent Disk
I found many posts that recommend storing the data files in Google Cloud Storage and using the API to fetch files and upload them to the bucket. We have very limited time, so I decided to use NFS to share the data files across the pods. The problem is that NFS is slow: it's around 100mb/s when I copy a file with pv, while the result from iperf is 1.96 Gbits/sec. Do you know how to achieve the same result without implementing Cloud Storage, or how to increase the NFS speed?
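To compare the two figures above, it helps to put them in the same unit. This is a minimal sketch, assuming the pv figure means megabytes per second and that iperf reports decimal gigabits (10^9 bits); the function name is illustrative only.

```python
def gbit_per_s_to_mbyte_per_s(gbit_per_s):
    """Convert decimal gigabits/second to decimal megabytes/second."""
    bits_per_s = gbit_per_s * 1_000_000_000
    return bits_per_s / 8 / 1_000_000


link_mb_s = gbit_per_s_to_mbyte_per_s(1.96)  # iperf result from the question
nfs_mb_s = 100.0                             # pv result, assumed to be MB/s
utilisation = nfs_mb_s / link_mb_s

print(f"link ceiling: {link_mb_s:.0f} MB/s, NFS copy uses {utilisation:.0%} of it")
```

Under those assumptions the network tops out at about 245 MB/s, so the NFS copy is using roughly 40% of what iperf measured.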

Data files (csv, pdf) and the code are stored on a single SSD Persistent Disk
There's nothing stopping you from volume mounting an SSD into the Pod so you can continue to use an SSD. I can only speak to AWS terminology, but some EC2 instances come with "local" SSD hardware, and thus you would only need to use a nodeSelector to ensure your Pods were scheduled onto machines that had said local storage available.
Where you're going to run into problems is if you are currently just using one php+apache and thus just one SSD, but now you want to scale the application up and it requires that all php+apache have access to the same SSD. That's a classic distributed application architecture problem, and something kubernetes itself can't fix for you.
If you're willing to expend the effort, you can also try any one of the other distributed filesystems (Ceph, GlusterFS, etc) and see if they perform better for your situation. Then again, "We have very limited time" I guess pretty much means that's off the table.

EBS or Instance Storage for MongoDB in EC2?

Cassandra recommends using instance local storage for EC2 deployments instead of EBS
I am deploying MongoDB in EC2... should I also be using instance local storage instead of EBS PIOPS?

Here is a slide deck about running MongoDB on EC2 with hybrid storage (instance store plus PIOPS EBS):
http://www.slideshare.net/mongodb/world-high-performance-mongo-db-on-ec2-20140620
Related topic:
Instance store is super fast - https://gist.github.com/ktheory/3c3616fca42a3716346b
Conclusions:
Instance-store is over 5x faster than EBS-SSD for uncached reads.
Instance-store and EBS-SSD are equivalent for cached reads.
Instance-store is over 10x faster than EBS-SSD for writes.
Special notes:
Ephemeral storage or instance-store DOES persist across reboots of an instance! It does not persist across a stop/start, nor a termination, nor some instance hardware failures.

The MongoDB manual has a section with EC2 storage considerations including the general recommendation to use EBS-optimized EC2 instances with provisioned IOPS (PIOPS) EBS volumes.
There are several good reasons to use EBS over local storage:
Local storage (or "Instance Store" in EC2 terms) is ephemeral and introduces potential data loss scenarios on instance stop/start/terminate as well as hardware failure (see AWS docs on Instance Store Lifetime).
While an Instance Store is dedicated to a particular instance, the disk subsystem is shared among instances on the host server hardware. As with regular EBS volumes, contention for a shared resource can lead to unpredictable I/O behaviour. Provisioned IOPS EBS volumes will provide more predictable I/O performance for an active database workload -- no spikes of higher than expected performance, but also no troughs of decreased performance.
The sizes of Instance Stores are determined by the instance type. EBS volumes can be provisioned independently to meet your storage and performance requirements.
If you want to change your instance types, EBS volumes can be re-attached to a new instance in the same availability zone.
EBS volumes can be combined using RAID for additional capacity or redundancy.
EBS volumes support asynchronous snapshots, which are a common backup strategy.
EBS volumes can support encryption for data at rest for most instance types.

EBS is recommended because it is backed by more than one physical drive, with a 2 ms transaction commit between mirror drives. EBS itself is fast enough and can reach 500+ MB/sec for reads and writes.
The Linux kernel version is what affects IOPS dramatically; see what Pinterest engineers found:
Final choice: kernel 3.18.7 + XFS + 64K RAID block size.
• Best overall performance for async random read.
• Very competitive performance everywhere else.
• Networking-related kernel bugs (Xen-specific) in 3.13 that aren’t fixed until 3.16.
https://www.percona.com/live/mysql-conference-2015/sites/default/files/slides/all_your_iops_are_belong_to_usPLMCE2015.pdf

Do you need to run RAID 10 on Mongo when using Provisioned IOPS on Amazon EBS?

I'm trying to set up a production Mongo system on Amazon to use as a datastore for a realtime metrics system.
I initially used the MongoDB AMIs[1] in the Marketplace, but I'm confused because there is only one data EBS volume. I've read that Mongo recommends RAID 10 on EBS storage (8 EBS volumes on each server). Additionally, I've read that the bare minimum for production is a primary/secondary with an arbiter. Is RAID 10 still the recommended setup, or is one provisioned IOPS EBS volume sufficient?
Please advise. We are a small shop, so what is the bare minimum we can get away with and still be reasonably safe?
[1] MongoDB 2.4 with 1000 IOPS - data: 200 GB @ 1000 IOPS, journal: 25 GB @ 250 IOPS, log: 10 GB @ 100 IOPS

So, I just got off of a call with an Amazon System Engineer, and he had some interesting insights related to this question.
First off, if you are going to use RAID, he said to simply do striping, as the EBS blocks were mirrored behind the scenes anyway, so raid 10 seemed like overkill to him.
Standard EBS volumes tend to handle spiky traffic well (a volume may be able to handle 1K-2K IOPS for a few seconds), but eventually throughput tails off to an average of 100 IOPS. One suggestion was to use many small EBS volumes and stripe them to get better IOPS throughput.
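As a rough back-of-the-envelope for the striping suggestion, here is a minimal sketch that assumes ideal RAID 0 behaviour, where sustained IOPS simply add up across member volumes. Real stripes can fall short of this because of uneven access patterns and the instance's own EBS bandwidth limit. The 100 IOPS figure is the average quoted above; striped_iops is an illustrative name.

```python
def striped_iops(per_volume_iops, volume_count):
    """Upper-bound estimate for a RAID 0 stripe: IOPS add up across members."""
    if per_volume_iops < 0 or volume_count < 1:
        raise ValueError("need non-negative IOPS and at least one volume")
    return per_volume_iops * volume_count


SUSTAINED_IOPS = 100  # sustained average for one standard EBS volume, per the answer

for count in (1, 4, 8):
    print(f"{count} volume(s): up to {striped_iops(SUSTAINED_IOPS, count)} IOPS")
```

Treat the result as a ceiling rather than a prediction; measuring the actual stripe with a benchmark is the only way to know what it delivers.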
Some of his customers use just the ephemeral storage on the EC2 images, but then have multiple (3-5) nodes in the availability set. The ephemeral storage is the storage on the physical machine. Apparently, if you use the EC2 instance with the SSD storage, you can get up to 20K iops.
Some customers will use a huge EC2 instance with SSD for the primary, then a smaller EC2 instance with EBS for the secondary. The primary machine is performant; the failover is available but has degraded performance.
Make sure you check 'EBS Optimized' when you spin up an instance. That means you have a dedicated channel to the EBS storage (of any kind) instead of sharing the NIC.
Important! Provisioned IOPS EBS is expensive, and the bill does not shut off when you shut down the EC2 instances the volumes are attached to (this sucks while you are testing). His advice was to take a snapshot of the EBS volumes, then delete them. When you need them again, just create new provisioned IOPS EBS volumes, restore the snapshot, then reconfigure your EC2 instances to attach the new storage. (It's more work than it should be, but it's worth it not to get sucker punched with the IOPS bill.)

I've got the same question. Both Amazon and MongoDB market provisioned IOPS heavily, stressing its advantages over a standard EBS volume. We run prod on m2.4xlarge AWS instances with a setup of 1 primary and 2 secondaries per service. In the most heavily utilized service cluster, apart from a few slow queries, the monitoring charts do not reveal any drop in performance at all. Page faults are rare, and even then only between 0.0001 and 0.0004 faults once or twice a day. Background flushes take milliseconds, and locks and queues are so far at manageable levels. I/O wait on the primary node ranges between 0 and 2% at any time, mostly less than 1%, and %idle stays steadily above the 90% mark. Do I still need to consider provisioned IOPS, given that we still have budget to address any potential performance drag? Any guidance will be appreciated.
