Is there a way to set CPU affinity for PostgreSQL processes through its configuration files? - postgresql

I am using PostgreSQL 9.5 with a Java 8 application on Windows (system: 2nd-generation i5). I noticed that while my application is running, several PostgreSQL processes/sub-processes are created and removed dynamically.
These PostgreSQL processes use almost all of the CPU (>95%), which causes problems for the other applications installed on my system.
I recently learned about CPU affinity. For the time being, I am running a PowerShell script (outside of my Java application) that periodically finds all running PostgreSQL processes and sets the desired CPU affinity on them.
I am looking for a way that does not require an external script and/or needs only a one-time configuration.
Is there a configuration setting supported by PostgreSQL 9.5 through which we can set the maximum number of CPU cores to be used by PostgreSQL processes?
I looked for a solution but could not find one.

There is no way to set this in the PostgreSQL configuration.
But you can start your PostgreSQL server from cmd.exe with:
start /affinity 3 C:\path\to\postgres.exe -D C:\path\to\data\directory
That would allow PostgreSQL to run only on the two “first” cores.
The argument to /affinity is a hexadecimal bit mask: the first core has the value 1, the second 2, the third 4, the fourth 8, and so on, and you pass the sum of the values for the cores on which you want PostgreSQL to run. For example, if you only want it to run on the third and fourth core, the mask is 4 + 8 = 12, which is C in hexadecimal, so you would use /affinity C.
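For the record, a tiny Python sketch of that arithmetic (the core lists are only examples):

# Build the /affinity bit mask from zero-based core indices.
def affinity_mask(cores):
    return sum(1 << core for core in cores)

print(format(affinity_mask([0, 1]), "X"))  # "3" -> first two cores
print(format(affinity_mask([2, 3]), "X"))  # "C" -> third and fourth core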
This should work, since the Microsoft documentation says:
Process affinity is inherited by any child process or newly instantiated local process.
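If you would rather leave the service startup alone and instead adjust the affinity of PostgreSQL processes after the fact, as the asker's PowerShell script does, the same idea can be sketched in Python with the third-party psutil package (the process-name prefix, CPU list and polling interval below are assumptions, not anything PostgreSQL prescribes):

# Periodically pin every PostgreSQL process to a fixed set of logical CPUs.
import time
import psutil  # third-party: pip install psutil

ALLOWED_CPUS = [0, 1]  # restrict PostgreSQL to the first two logical CPUs

def pin_postgres(allowed=ALLOWED_CPUS):
    for proc in psutil.process_iter(["name"]):
        try:
            name = (proc.info["name"] or "").lower()
            if name.startswith("postgres") and sorted(proc.cpu_affinity()) != sorted(allowed):
                proc.cpu_affinity(allowed)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass  # the process exited or we lack permission; skip it

if __name__ == "__main__":
    while True:        # poll, because backend processes come and go
        pin_postgres()
        time.sleep(10)

Because of the inheritance quoted above, pinning the postmaster once should normally be enough; the polling loop only mirrors the asker's belt-and-braces approach.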

Related

AWS RDS with Postgres: Is the OOM killer configured?

We are running a load test against an application that hits a Postgres database.
During the test, we suddenly get an increase in error rate.
After analysing the platform and application behaviour, we notice that:
CPU of Postgres RDS is 100%
Freeable memory drops on this same server
And in the postgres logs, we see:
2018-08-21 08:19:48 UTC::#:[XXXXX]:LOG: server process (PID XXXX) was terminated by signal 9: Killed
After investigating and reading the documentation, it appears that one possibility is that the Linux OOM killer ran and killed the process.
But since we're on RDS, we cannot access the system logs (/var/log/messages) to confirm this.
So can somebody:
confirm that the OOM killer really runs on AWS RDS for Postgres?
give us a way to check this?
give us a way to compute the maximum memory used by Postgres based on the number of connections?
I didn't find the answer here:
http://postgresql.freeideas.cz/server-process-was-terminated-by-signal-9-killed/
https://www.postgresql.org/message-id/CAOR%3Dd%3D25iOzXpZFY%3DSjL%3DWD0noBL2Fio9LwpvO2%3DSTnjTW%3DMqQ%40mail.gmail.com
https://www.postgresql.org/message-id/04e301d1fee9%24537ab200%24fa701600%24%40JetBrains.com
AWS maintains a page with best practices for their RDS service: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html
In terms of memory allocation, this is their recommendation:
An Amazon RDS performance best practice is to allocate enough RAM so
that your working set resides almost completely in memory. To tell if
your working set is almost all in memory, check the ReadIOPS metric
(using Amazon CloudWatch) while the DB instance is under load. The
value of ReadIOPS should be small and stable. If scaling up the DB
instance class—to a class with more RAM—results in a dramatic drop in
ReadIOPS, your working set was not almost completely in memory.
Continue to scale up until ReadIOPS no longer drops dramatically after
a scaling operation, or ReadIOPS is reduced to a very small amount.
For information on monitoring a DB instance's metrics, see Viewing DB Instance Metrics.
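The third question, how to compute the maximum memory Postgres can use from the number of connections, has no exact answer, but a commonly used rough upper bound is shared_buffers plus, per connection, work_mem (times the number of sort/hash operations a query may run concurrently) plus temp_buffers, plus maintenance_work_mem per autovacuum worker. A back-of-the-envelope sketch in Python; every value below is a hypothetical example, not your instance's configuration:

# Rough, pessimistic upper bound on PostgreSQL memory use.
MB = 1024 * 1024

shared_buffers       = 2048 * MB  # shared across all backends
work_mem             = 16 * MB    # per sort/hash operation, per backend
temp_buffers         = 8 * MB     # per backend (temporary tables)
maintenance_work_mem = 256 * MB   # per autovacuum worker / maintenance task
max_connections      = 200
autovacuum_workers   = 3
work_mem_ops         = 2          # assumed concurrent sorts/hashes per query

estimate = (shared_buffers
            + max_connections * (work_mem * work_mem_ops + temp_buffers)
            + autovacuum_workers * maintenance_work_mem)

print("rough upper bound: %d MB" % (estimate // MB))

Read the real parameter values with SHOW shared_buffers; and friends, and treat the result as an order-of-magnitude check against the instance's RAM rather than a precise limit.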
Also, this is their recommendation for troubleshooting possible OS issues:
Amazon RDS provides metrics in real time for the operating system (OS)
that your DB instance runs on. You can view the metrics for your DB
instance using the console, or consume the Enhanced Monitoring JSON
output from Amazon CloudWatch Logs in a monitoring system of your
choice. For more information about Enhanced Monitoring, see Enhanced
Monitoring
There are a lot of good recommendations there, including query tuning.
Note that, as a last resort, you could switch to Aurora, which is compatible with PostgreSQL:
Aurora features a distributed, fault-tolerant, self-healing storage
system that auto-scales up to 64TB per database instance. Aurora
delivers high performance and availability with up to 15 low-latency
read replicas, point-in-time recovery, continuous backup to Amazon S3,
and replication across three Availability Zones.
EDIT: speaking specifically about your issue with PostgreSQL, check this Stack Exchange thread -- they had a long-lived connection with auto-commit set to false:
We had a long connection with auto commit set to false:
connection.setAutoCommit(false)
During that time we were doing a lot
of small queries and a few queries with a cursor:
statement.setFetchSize(SOME_FETCH_SIZE)
In JDBC you create a connection object, and from that connection you create statements. When you execute the statements you get a result set.
Now, every one of these objects needs to be closed, but if you close the statement, the result set is closed, and if you close the connection, all the statements and their result sets are closed.
We were used to short-lived queries with connections of their own, so we never closed statements, assuming the connection would clean everything up once it was closed.
The problem was this long transaction (~24 hours) which never closed the connection, so the statements were never closed. Apparently, the statement object holds resources both on the server that runs the code and on the PostgreSQL database.
My best guess as to what resources are left behind in the DB is the things related to the cursor. The statements that used the cursor were never closed, so the result sets they returned were never closed either. This meant the database did not free the relevant cursor resources, and since the cursor was over a huge table, it took a lot of RAM.
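The code in that thread is Java/JDBC, but the lesson (close statements and result sets promptly, especially cursor-backed ones, instead of waiting for the connection to go away) translates directly to other drivers. A rough Python/psycopg2 equivalent, with a hypothetical DSN and table name, just to illustrate the pattern:

import psycopg2

conn = psycopg2.connect("dbname=mydb user=myuser")  # hypothetical DSN
try:
    # A named cursor is a server-side cursor, the psycopg2 analogue of a JDBC
    # statement with setFetchSize(): rows are streamed in batches and the
    # server keeps cursor state until the cursor is closed.
    with conn.cursor(name="big_scan") as cur:  # closed automatically on exit
        cur.itersize = 10000
        cur.execute("SELECT * FROM huge_table")  # hypothetical table
        for row in cur:
            pass  # handle each row here
    conn.commit()  # ending the transaction also releases server-side resources
finally:
    conn.close()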
Hope it helps!
TL;DR: If you need PostgreSQL on AWS and you need rock-solid stability, run PostgreSQL on EC2 (for now) and do some kernel tuning for overcommitting.
I'll try to be concise, but you're not the only one who has seen this and it is a known (internal to Amazon) issue with RDS and Aurora PostgreSQL.
OOM Killer on RDS/Aurora
The OOM killer does run on RDS and Aurora instances because they are backed by Linux VMs, and the OOM killer is an integral part of the kernel.
Root Cause
The root cause is that the default Linux kernel configuration assumes that you have swap (a swap file or partition), but EC2 instances (and the VMs that back RDS and Aurora) do not have swap by default: there is a single partition and no swap file is defined. When Linux thinks it has swap available, it uses a strategy called "overcommitting", which means that it allows processes to request, and be granted, more memory than the amount of RAM the system actually has. Two tunable parameters govern this behavior:
vm.overcommit_memory - governs whether the kernel allows overcommitting (0 = yes = default)
vm.overcommit_ratio - what percentage of system memory plus swap the kernel may overcommit. If you have 8 GB of RAM and 8 GB of swap, and vm.overcommit_ratio = 75, the kernel will grant up to 12 GB of memory to processes.
We set up an EC2 instance (where we could tune these parameters) and the following settings completely stopped PostgreSQL backends from getting killed:
vm.overcommit_memory = 2
vm.overcommit_ratio = 75
vm.overcommit_memory = 2 tells Linux not to overcommit (to work within the constraints of system memory), and vm.overcommit_ratio = 75 tells Linux not to grant requests for more than 75% of memory (only allow user processes to get up to 75% of memory).
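None of this can be changed on RDS or Aurora, but on a self-managed EC2/Linux host you can at least check what is currently in effect. A minimal sketch that only reads the relevant /proc entries (setting and persisting the values is done with sysctl and /etc/sysctl.conf, not from Python):

# Show the current overcommit settings and the memory totals they apply to.
def read(path):
    with open(path) as f:
        return f.read().strip()

print("vm.overcommit_memory =", read("/proc/sys/vm/overcommit_memory"))
print("vm.overcommit_ratio  =", read("/proc/sys/vm/overcommit_ratio"))

with open("/proc/meminfo") as f:
    for line in f:
        if line.startswith(("MemTotal", "SwapTotal", "CommitLimit", "Committed_AS")):
            print(line.rstrip())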
We have an open case with AWS and they have committed to coming up with a long-term fix (using kernel tuning params or cgroups, etc) but we don't have an ETA yet. If you are having this problem, I encourage you to open a case with AWS and reference case #5881116231 so they are aware that you are impacted by this issue, too.
In short, if you need stability in the near term, use PostgreSQL on EC2. If you must use RDS or Aurora PostgreSQL, you will need to oversize your instance (at additional cost to you) and hope for the best as oversizing doesn't guarantee you won't still have the problem.

CentOS - my CentOS 5 server (hosting Asterisk) always has a process with large CPU usage

I have a CentOS 5 server which hosts Asterisk 13.
The server worked fine until last week, but now the top command always shows a process with a large amount of CPU usage. When I kill the process, a few seconds later another process with large CPU usage starts. Many times the process command is ".syslog", but there are other commands like "qjennjifes", "vnvebynufu" and other unknown names like that.
1) Check that you have the recommended firewall and fail2ban settings.
2) Check that you are not under a DoS/DDoS attack with "sip show channels".
3) Check that your system has not been hacked and that there is no compromised software on your host.

bind9 (named) does not start in multi-threaded mode

From the bind9 man page, I understand that the named process starts one worker thread per CPU if it is able to determine the number of CPUs; if it is unable to determine it, a single worker thread is started.
My question is: how does it calculate the number of CPUs? I presume that by CPU it means cores. The Linux machine I work on is customized, has kernel 2.6.34, and does not have the lscpu or nproc utilities. named starts a single thread even if I give the -n 4 option. Is there any other way to force named to start multiple threads?
Thanks in advance.
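I cannot say for certain how your particular named build counts CPUs (as far as I know, BIND normally asks the C library via sysconf(_SC_NPROCESSORS_ONLN) on Linux), but even without lscpu or nproc you can check what the kernel itself reports; a small Python sketch:

import os

# What the C library / kernel reports as usable CPUs.
print("os.cpu_count():", os.cpu_count())

# Count processor entries directly in /proc/cpuinfo.
with open("/proc/cpuinfo") as f:
    cores = sum(1 for line in f if line.startswith("processor"))
print("/proc/cpuinfo processors:", cores)

# The CPUs this process is actually allowed to run on (affinity mask).
print("sched_getaffinity:", len(os.sched_getaffinity(0)))

If these all report more than one CPU but named still starts a single worker thread even with -n 4, the build itself may have been compiled without thread support, which is worth verifying with named -V.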

Limiting the number of processors available to Windows Server 2008 R2

I'm working on setting up a test environment, but I need to scale down the hardware we're using for our SQL box, which is running Windows Server 2008 R2 SP 1 and SQL 2008. I'm noticing that MSConfig.exe has options for limiting the number of CPUs available, but I'm not able to find any documentation on how that works on MSDN.
I'm also not seeing any way to change this from the command line using BCDEdit.
Anyone know of documentation on this? I'm trying to decide whether limiting the Processor usage at boot would be the best test, or limiting the processors in SQL itself. I'm leaning towards boot time because I'm trying to accurately mimic a lower-power physical box, and if I limit the power at the database level the extra power may show up in other areas.
In an era of multi-core, hyperthreaded CPUs, "processors" is now an ambiguous term. Does "processor" refer to threads (from hyperthreading), cores, or sockets (physical CPUs)?
Windows recognises logical processors (LPs) as the basic compute unit: one LP for each hyperthread within a core, multiplied by the number of cores per socket, multiplied by the number of sockets.
The easiest way in Windows to reduce the LP count is to use the /NUMPROC option. The example below sets the maximum number of processors to 8.
Back up the current store: bcdedit /export c:\Backup\bcd.bak
List the current entries: bcdedit /v
Copy the existing configuration: bcdedit /copy {current} /d "Windows 2008 R2 with NumProc"
{current} is a "well known" identifier. The copy command also returns the ID of the new entry, which can be used directly.
List the entries again: bcdedit /v
Add the parameter to the new entry to set the maximum number of processors: bcdedit /set {new_ID} NUMPROC 8
Change the default entry: bcdedit /default {new_ID}
The danger of this strategy is that, for example, on a 4-core system with hyperthreading enabled, only one processor (socket) is used, since the first 8 LPs (0..7) will be the hyperthreaded cores of the first processor. So you are not really emulating an 8-way system, but a 1-socket, 4-core system with hyperthreading enabled.
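To make that numbering concrete, here is a small Python sketch that enumerates LPs for a hypothetical 2-socket, 4-core, hyperthreaded box, following the socket-by-socket numbering described above:

# Hypothetical topology: 2 sockets x 4 cores x 2 hyperthreads = 16 LPs.
sockets, cores_per_socket, threads_per_core = 2, 4, 2

lp = 0
for s in range(sockets):
    lps = []
    for _ in range(cores_per_socket * threads_per_core):
        lps.append(lp)
        lp += 1
    print("socket %d: LPs %s" % (s, lps))

# socket 0: LPs [0, 1, 2, 3, 4, 5, 6, 7]   <- NUMPROC 8 keeps only these
# socket 1: LPs [8, 9, 10, 11, 12, 13, 14, 15]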
If this doesn't fit your model, other options are:
Disable HyperThreading in the BIOS - this reduces the LP count per core so helping distribute the load over cores and sockets.
Does the system BIOS support reducing the core count per processor? If so, this will help distribute loads over sockets.
Building your system within a virtual environment and limiting physical resources from that perspective.
If you are dealing with more than 64 logical processors under Windows, that introduces Processor Groups, which add another layer of options.

Distributed write job crashes remote machine with MongoDB server

Looking for any advice I can get.
I have 16 virtual CPUs all writing to a single remote MongoDB server. The machine that's being written to is a 64-bit machine with 32GB RAM, running Windows Server 2008 R2. After a certain amount of time, all the CPUs stop cold (no gradual performance reduction), and any attempt to get a Remote Desktop Connection hangs.
I'm writing from Python via pymongo, and the insert statement is "[collection].insert([document], safe=True)"
I decided to more actively monitor my server as the distributed write job progressed, remoting in from time to time and checking the Task Manager. What I see is a steady memory creep, from 0.0GB all the way up to 29.9GB, in a fairly linear fashion. My leading theory is therefore that my writes are filling up the memory and eventually overwhelming the machine.
Am I missing something really basic? I'm new to MongoDB, but I remember that when writing to a MySQL database, inserts are typically followed by commits, where it's the commit statement that actually makes sure the record is written. Here I'm not doing any commits...?
Thanks,
Dave
Try it with journaling turned off and see if the problem remains.
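Note that journaling itself is a server-side setting (in that era of MongoDB it was disabled by starting mongod with --nojournal), not something the Python client controls. On the client side, the legacy safe=True corresponds roughly to an acknowledged write (w=1) in today's PyMongo API, and there is no separate commit step for a plain insert. A hedged sketch with hypothetical host, database and collection names, also showing batched inserts, which cut down round trips for bulk loads like this one:

from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://mongo-host:27017")  # hypothetical host
db = client["mydb"]

# Acknowledged writes (w=1): the modern equivalent of the legacy safe=True.
coll = db.get_collection("events", write_concern=WriteConcern(w=1))

coll.insert_one({"value": 42})

# Many small documents in one round trip instead of one insert per document.
coll.insert_many([{"value": i} for i in range(1000)], ordered=False)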