I have a C# backend running on AWS Lambda. It connects to a PostgreSQL DB in the same region.
I observed extremely slow cold-startup execution time after adding the DB connection code. After allocating more memory to my Lambda function, the execution time has significantly reduced.
The execution time as measured by Lambda console:
128 MB (~15.5 sec)
256 MB (~9 sec)
384 MB (~6 sec)
512 MB (~4.5 sec)
640 MB (~3.5 sec)
768 MB (~3 sec)
In contrast, after commenting out the DB connection code:
128 MB (~5 sec)
256 MB (~2.5 sec)
So opening a DB connection has contributed a lot to the execution time.
According to AWS documentation:
In the AWS Lambda resource model, you choose the amount of memory you
want for your function, and are allocated proportional CPU power and
other resources.
Since the peak memory usage has consistently stayed at ~45 MB, this phenomenon seems to suggest that database connection establishment is a computationally intensive activity. If so, why?
Code in question (I'm using Npgsql):
var conn = new NpgsqlConnection(Db.connStr);
conn.Open();
using(conn)
{ // print something }
Update 1
I set up a MariaDB instance with the same configuration and did some testing.
Using MySql.Data for synchronous operation:
128 MB (~12.5 sec)
256 MB (~6.5 sec)
Using MySqlConnector for asynchronous operation:
128 MB (~11 sec)
256 MB (~5 sec)
Interestingly, the execution time on my laptop has increased from ~4 sec (for Npgsql) to 12~15 sec (for MySql.Data and MySqlConnector).
At this point, I'll just allocate more memory to the Lambda function to solve this issue. But if anyone knows why the connection establishment took so long, I'd appreciate an answer. I might post some profiling results later.
Related
When I run a query in postgres it will load in memory all the data it needs for query execution but when the query finishes executing the memory consumed by postgres is not released.
I run postgres with the Timescaledb extension installed, which already stores 93 million records, the request to get all these 93 million records takes 13GB of RAM and 21mins. After the request, RAM is reduced to 9GB and holds it.
How can I release all this memory?
Here's my postgresql.conf memory section.
shared_buffers = 8GB
huge_pages = try
huge_page_size = 0
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 5118kB
hash_mem_multiplier = 1.0
maintenance_work_mem = 2047MB
autovacuum_work_mem = -1
logical_decoding_work_mem = 64MB
max_stack_depth = 2MB
shared_memory_type = mmap
effective_cache_size = 4GB
UPD 15.12.2022
#jjanes Yes, you were right. As a colleague and I found out, memory usage is not exactly what we thought about from the very beginning.
As we also understood from the free -m -h, when a request uses more memory than configured, the remaining data begins to be loaded into cache/buffer.
When we clean cache/buffer after big query, memory usage for this service drops to 8,53GB. This occupied memory +- coincides with the value of shared_buffers = 8GB, it practically does not decrease. For large requests, postgres still continues to use to request large chunks of memory from cache/buffer.
It turns out that we need to somehow clean the cache of postgres itself, since it does not release it itself?
I am using gcp cloud sql (mysql 8.0.18) and I am trying to execute a query for only 5000 rows,
SELECT * FROM client_1079.c_crmleads ORDER BY LeadID DESC LIMIT 5000;
but I think the execution is taking long time to fetch data
here is the time details
Affected rows: 0 Found rows: 5,000 Warnings: 0 Duration for 1 query: 0.797 sec. (+ 117.609 sec. network)
Instance configuration is vCPU: 8 , RAM: 20 GB, SSD: 410GB
screenshot of gcp cloud sql instance
also I am facing some issues on high table_open_cache and high ram utilization.
how do I reduce open_table_cache also how to increase instance performance?
Looks like the size of the data retrieved is quite large and the time spent on sending the data from the SQL instance to your App is the reason of the latency observed.
You may review your use case and maybe retrieve less information, or try to parallellize queries, or improve the SQL instance I/O performance (it is related to DB Disk Size).
Explanation: We are able to run 200 TPS on 16 core cpu successfully with around 40% cpu utilization, 50-60 concurrent locks. However when we increase the load upto 300 TPS DB response gets slow post 15-20mins run. System shows dead tuple of 2-4%.
Observation : CPU and other resources remain stable and if we perform vacuum during system latency, performance gets increased. However after 15-20 mins system again starts getting slow.
CPU : 16 core , RAM 128 GB, DB size 650 GB
Our Postgres DB (hosted on Google Cloud SQL with 1 CPU, 3.7 GB of RAM, see below) consists mostly of one big ~90GB table with about ~60 million rows. The usage pattern consists almost exclusively of appends and a few indexed reads near the end of the table. From time to time a few users get deleted, deleting a small percentage of rows scattered across the table.
This all works fine, but every few months an autovacuum gets triggered on that table, which significantly impacts our service's performance for ~8 hours:
Storage usage increases by ~1GB for the duration of the autovacuum (several hours), then slowly returns to the previous value (might eventually drop below it, due to the autovacuum freeing pages)
Database CPU utilization jumps from <10% to ~20%
Disk Read/Write Ops increases from near zero to ~50/second
Database Memory increases slightly, but stays below 2GB
Transaction/sec and ingress/egress bytes are also fairly unaffected, as would be expected
This has the effect of increasing our service's 95th latency percentile from ~100ms to ~0.5-1s during the autovacuum, which in turn triggers our monitoring. The service serves around ten requests per second, with each request consisting of a few simple DB reads/writes that normally have a latency of 2-3ms each.
Here are some monitoring screenshots illustrating the issue:
The DB configuration is fairly vanilla:
The log entry documenting this autovacuum process reads as follows:
system usage: CPU 470.10s/358.74u sec elapsed 38004.58 sec
avg read rate: 2.491 MB/s, avg write rate: 2.247 MB/s
buffer usage: 8480213 hits, 12117505 misses, 10930449 dirtied
tuples: 5959839 removed, 57732135 remain, 4574 are dead but not yet removable
pages: 0 removed, 6482261 remain, 0 skipped due to pins, 0 skipped frozen
automatic vacuum of table "XXX": index scans: 1
Any suggestions what we could tune to reduce the impact of future autovacuums on our service? Or are we doing something wrong?
If you can increase autovacuum_vacuum_cost_delay, your autovacuum would run slower and be less invasive.
However, it is usually the best solution to make it faster by setting autovacuum_vacuum_cost_limit to 2000 or so. Then it finishes faster.
You could also try to schedule VACUUMs of the table yourself at times when it hurts least.
But frankly, if a single innocuous autovacuum is enough to disturb your operation, you need more I/O bandwidth.
I am building a large postgres 9.1 database on ubuntu 12.04, with one table that holds about 80 million rows or so. Whenever I run a SELECT statement:
SELECT * FROM db WHERE ID=1;
It takes almost 2.5 minutes to execute the query which returns only a few thousand rows. After running a few diagnostics on the disk I/O, I think that is not the problem, but just in case below is the output from a diagnostic. (I have 2GB of RAM) I am not exactly sure what a good output is here, but it seems ballpark given stats found for other servers on the internet.
time sh -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
500000+0 records in
500000+0 records out
4096000000 bytes (4.1 GB) copied, 106.969 s, 38.3 MB/s
real 1m49.091s
user 0m0.248s
sys 0m9.369s
I have modified postgresql.conf considerably, boosting the effective_cache to 75% of ram, shared_buffers to 25%, checkpoint_segments to 15, work_mem to 256MB, autovacuum, SHMMAX on the kernel, etc. I have had some performance increases, but not more than 5% better. Networking shouldnt be an issue since it still takes a long time even running on localhost. I am planning to add even more data, and the query time seems to be growing quickly with the number of rows.
It seems like I should be able to run these SELECT statements in a few seconds, not a few minutes. Any suggestions on where this bottleneck could be?
sorry if this is inexcusably obvious, but do you have an index on the ID column?
also, though I'm not blaming the disk, you merely tested sequential bandwidth, which tells you very little about latency. though I have to say that 38 MB/s is underwhelming even for that measure...