Apache (apache2) HTTP server stops accepting connections after a while and requires restart - sockets

The Apache server uses up all of its server processes (up to ServerLimit) and then does not accept any more connections.
Slot  PID    Stopping  Connections       Threads     Async connections
                       total  accepting  busy  idle  writing  keep-alive  closing
0     23257  yes       1      no         0     0     0        0           0
1     27271  no        0      yes        1     24    0        0           0
2     24876  yes       2      no         0     0     0        0           0
3     23117  yes       2      no         0     0     0        0           0
4     22671  yes       1      no         0     0     0        0           0
5     23994  yes       1      no         0     0     0        0           0
6     25159  yes       1      no         0     0     0        0           0
7     24604  yes       1      no         0     0     0        0           0
Sum   8      7         9                 1     24    0        0           0
The one pid that was accepting was killed and restarted to get the status report above. Over time this PID would also end up like the rest. How do I find out why Apache stops accepting connections after a while? The timeout is set at 90 seconds.
Additional information:
Server Version: Apache/2.4.33 (Unix) OpenSSL/1.0.2o
Server Built: Apr 18 2018 10:56:21
Server loaded APR Version: 1.6.3
Compiled with APR Version: 1.6.3
Server loaded APU Version: 1.6.1
Compiled with APU Version: 1.6.1
Module Magic Number: 20120211:76
Hostname/port: localhost:8006
Timeouts: connection: 90 keep-alive: 5
MPM Name: event
MPM Information: Max Daemons: 8 Threaded: yes Forked: yes
Server Architecture: 64-bit

Powershell ScriptBlock closure: am I missing something?

I have been struggling with this for several hours now and after reading many threads about ScriptBlocks, Closures, scopes etc I still don't see what's wrong in my code.
Let me explain: I have a main script that dynamically generates an HTML page using PSWriteHTML module and ScriptBlocks.
As I have a lot of PSWriteHTML pages to write, I use an ArrayList of ScriptBlocks to generate the code with a different set of values each time (one set per server); these ScriptBlocks are executed in a foreach loop.
This is done using the Save-utilizationReport function (I have only kept the relevant code):
function Save-utilizationReport ($currentDate, $navLinksScriptBlock, $htmlScriptBlockArray, $emeaTotalNumberOfCalls, $namTotalNumberOfCalls, $apacTotalNumberOfCalls, $path, $logFilePath) {
[...]
# Using Script Blocks, add the pages generated during the analysis to the HTML Report
foreach($htmlScriptBlock in $htmlScriptBlockArray){
Invoke-Command -ScriptBlock ($htmlScriptBlock)
}
[...]
}
The ScriptBlocks are created from sets of values gathered from a list of servers' logs and added to the ArrayList of ScriptBlocks in the Create-utilizationReportPage function (again, I've only kept the relevant code):
function Create-utilizationReportPage ($matchedLines, $ipAddress, $hostname, $pageId, $utilisationReportTemplate, $htmlScriptBlockArray) {
# Retrieve the content of the Utilisation Report Template as a RAW string (Here-String)
$htmlPageCodeBlock = Get-Content $utilisationReportTemplate -Raw
# Create the Script Block that contains the HTML page
$htmlPageScriptBlock = {
# Get the "Total number of calls" information
$timeArray = $matchedLines.time
$participantNumberArray = $matchedLines.participantNumber
[...]
# Update the page ID in the template
$htmlPageCodeBlock = $htmlPageCodeBlock -replace '%PAGE_ID%', "$pageId"
# Update the page header information in the template
$htmlPageCodeBlock = $htmlPageCodeBlock -replace '%PAGE_HEADER%', "$ipAddress [$hostname]"
Invoke-Command -ScriptBlock ([scriptblock]::Create($htmlPageCodeBlock))
}.GetNewClosure()
# Add the Page's script block to the Script Blocks array
$htmlScriptBlockArray.Add($htmlPageScriptBlock)
}
These are called in the script below:
$currentDate = Get-CorrectDate $latestFolder
# The global Log file
$logFilePath = "$scriptPath/Logs/logs_$currentDate.txt"
$serversList = "$scriptPath/Config/$configFileName"
# If the Servers list exists, retrieve the Servers list
if (Test-Path $serversList) {
# Get the data from the file
[xml]$servers = Get-Content $serversList
# Select only the Servers information
$nodes = $servers.SelectNodes("//server")
try {
[...]
# Iterate through the Servers list
foreach ($node in $nodes) {
# Get the Server IP Address
$ipAddress = $node.ip
# Get the Server Hostname
$hostname = $node.hostname
# Get the "Debug Utilization" lines from the logbundle's syslog files
$matchedLines = [System.Collections.ArrayList]@(Find-UtilizationLinesInLogs $currentDate "$scriptPath\Data\$currentDate\logs_$ipAddress" "host:server: \[USAGE\] : \{`"1`" : ")[-1]
# Add the Server's participants throughout the day
switch ($hostname) {
{$_.Contains("emea")} {
# Update EMEA total number of participants
Update-ZoneTotalParticipants ([ref]$emeaServer) ([ref]$emeaTotalNumberOfCalls) $matchedLines
}
{$_.Contains("nam")} {
# Update NAM total number of participants
Update-ZoneTotalParticipants ([ref]$namServer) ([ref]$namTotalNumberOfCalls) $matchedLines
}
{$_.Contains("apac")} {
# Update APAC total number of participants
Update-ZoneTotalParticipants ([ref]$apacServer) ([ref]$apacTotalNumberOfCalls) $matchedLines
}
}
# Get the CPU Utilization values lines from the logbundle's sysdebug files
$cpu = Get-CpuUsage "$scriptPath\Data\$currentDate\logs_$ipAddress\sysdebug"
# Get the Memory Utilization values lines from the logs' sysdebug files
$memory = Get-MemoryUsage "$scriptPath\Data\$currentDate\logs_$ipAddress\sysdebug"
# Export the "Debug Utilization" to a CSV file
Save-csvUtilizationReport $matchedLines "$scriptPath\Output\$currentDate" "$ipAddress" "$hostname" $logFilePath
# Draw graphs from the "Debug Utilization" information and then export it to an HTML File
Create-utilizationReportPage $matchedLines "$ipAddress" "$hostname" $pageId $utilisationReportTemplate $htmlScriptBlockArray
# Add the new navigation link to the Navigation Links array
Add-htmlNavLink $navLinksArray "$ipAddress" "$hostname" $pageId $logFilePath
# Increment the Page ID counter
$pageId += 1
}
# Create an Here-String from the Navigation Links array
$OFS = ""
$navLinksCode = @"
$($navLinksArray)
"@
$OFS = " "
# Create a script block from the Navigation Links
$navLinksScriptBlock = [scriptblock]::Create($navLinksCode)
# Save the daily HTML utilization report
Save-utilizationReport $currentDate $navLinksScriptBlock $htmlScriptBlockArray $emeaTotalNumberOfCalls $namTotalNumberOfCalls $apacTotalNumberOfCalls "$scriptPath\Output\$currentDate" $logFilePath
}
catch
{
Write-Logs $logFilePath "Error: $($_.Exception.Message)"
exit 1
}
}
Everything works as expected, except for the first set of values on the first page, which is somehow the sum of all the sets of values...
For example when I have 3 pages, I can see that the collected values are correct when the Find-UtilizationLinesInLogs function is executed:
matchedLines.participantNumber:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 5 5 5 4 4 4 4 4 12 14 14 15 16 16 16 7 7 7 7 8 14 17 18 19 18 19 19 20 16 16 16 15 7 7 7 7 5 4 4 4 4 4 4 4 4 1 1 0 0 0 3 6 5 5 5 5 9 14 16 18
matchedLines.participantNumber:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 8 9 10 10 10 10 9 9 9 9 9 9 8 8 8 8 8 8 8 8 8 8 9 4 12 15 11 14 14 13 12 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
matchedLines.participantNumber:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 2 0 0 0 3 9 10 10 9 8 10 9 9 10 10 10 10 10 11 11 11 11 8 7 8 8 7 7 7 7 7 7 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 4 16 17 16 16
But when the ScriptBlocks are executed using the Invoke-Command in the foreach loop, the first batch of values is systematically the sum of the 3 sets of values while the following ones are correct:
matchedLines.participantNumber inside Create-utilizationReportPage:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 6 6 7 7 5 5 5 8 29 33 34 34 34 36 34 25 26 26 26 27 32 36 37 38 37 35 34 36 32 31 32 26 26 29 25 22 20 18 17 17 17 6 6 6 6 3 3 2 2 2 5 8 7 7 7 10 25 31 32 34
matchedLines.participantNumber inside Create-utilizationReportPage:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 8 9 10 10 10 10 9 9 9 9 9 9 8 8 8 8 8 8 8 8 8 8 9 4 12 15 11 14 14 13 12 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
matchedLines.participantNumber inside Create-utilizationReportPage:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 2 0 0 0 3 9 10 10 9 8 10 9 9 10 10 10 10 10 11 11 11 11 8 7 8 8 7 7 7 7 7 7 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 4 16 17 16 16
I have tried many things without success so if someone has any hint of what can go wrong, it would be great!
Thanks for your help!
So! I finally found out what was wrong.
I suspected that my problem was related to ArrayList copies, or at least variable copies, so I tried removing the part of the script where I extract the $matchedLines values and copy them into global arrays using references:
# Add the Server's participants throughout the day
switch ($hostname) {
{$_.Contains("emea")} {
# Update EMEA total number of participants
Update-ZoneTotalParticipants ([ref]$emeaServer) ([ref]$emeaTotalNumberOfCalls) $matchedLines
}
{$_.Contains("nam")} {
# Update NAM total number of participants
Update-ZoneTotalParticipants ([ref]$namServer) ([ref]$namTotalNumberOfCalls) $matchedLines
}
{$_.Contains("apac")} {
# Update APAC total number of participants
Update-ZoneTotalParticipants ([ref]$apacServer) ([ref]$apacTotalNumberOfCalls) $matchedLines
}
}
And bingo, this time the values written to the first PSWriteHTML page are the correct ones!
So I focused on the Update-ZoneTotalParticipants function which is doing the values copy:
function Update-ZoneTotalParticipants ([ref][int]$ServerNumber, [ref][System.Collections.ArrayList]$totalNumberOfCalls, $values) {
# If this is the first server to be analysed in the zone
if ($ServerNumber.value -eq 1) {
# Copy the Utilization lines of the server
$totalNumberOfCalls.value = $values
# Increment the zone's server counter
$ServerNumber.value += 1
}
# If this is at least the 2nd server to be analysed in the zone
elseif ($ServerNumber.value -gt 1) {
# Parse the server matched lines, get the participantNumber value and add it to the total
0..($totalNumberOfCalls.value.Count - 1) | ForEach-Object {
if ($_ -le ($values.Count - 1)) {
$totalNumberOfCalls.value[$_].participantNumber = [int]($totalNumberOfCalls.value[$_].participantNumber) + [int]($values[$_].participantNumber)
}
# If there are fewer objects in the current server's matched lines, add 0 instead
else {
$totalNumberOfCalls.value[$_].participantNumber = [int]($totalNumberOfCalls.value[$_].participantNumber) + 0
}
}
}
}
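The only part of the code where $matchedLines is involved is $totalNumberOfCalls.value = $values, so that is certainly where the array manipulation goes wrong.
That assignment copies only a reference: the zone total and the first server's $matchedLines end up being the same ArrayList, so every later addition into the total also changes the first server's values. So I dug into ArrayList copies and clones and found out that I was not doing a deep copy of the object.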
I used Petru Zaharia's solution in this thread to update the function:
# Copy the Utilization lines of the server
$totalNumberOfCalls.value = $values
# replaced with:
# Copy the Utilization lines of the server: serialize and deserialize the data using PSSerializer (deep copy)
$_TempCliXMLString = [System.Management.Automation.PSSerializer]::Serialize($values, [int32]::MaxValue)
$totalNumberOfCalls.value = [System.Management.Automation.PSSerializer]::Deserialize($_TempCliXMLString)
And now everything works as expected.
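The aliasing effect described above is not specific to PowerShell. As a rough illustration only (this is Python, not the author's code; the `add_server` helper and the sample numbers are made up for the example), the sketch below shows how a plain assignment makes the running total and the first server's data the same object, while a deep copy keeps them independent:

```python
import copy

def add_server(total, values):
    """Add each participantNumber from `values` into `total`, in place."""
    for i, row in enumerate(total):
        if i < len(values):
            row["participantNumber"] += values[i]["participantNumber"]

# Hypothetical sample data: two servers, three samples each.
server_a = [{"participantNumber": n} for n in (5, 4, 12)]
server_b = [{"participantNumber": n} for n in (1, 8, 9)]

# Plain assignment: the zone total is the very same list as server_a.
aliased_total = server_a
add_server(aliased_total, server_b)
aliased_first_page = [row["participantNumber"] for row in server_a]

# Deep copy: the zone total owns independent copies of every row.
server_c = [{"participantNumber": n} for n in (5, 4, 12)]
copied_total = copy.deepcopy(server_c)
add_server(copied_total, server_b)
copied_first_page = [row["participantNumber"] for row in server_c]

print(aliased_first_page)  # the first server's data now holds the sums
print(copied_first_page)   # the first server's data is unchanged
```

A shallow copy of the list would not be enough here either, because the rows themselves would still be shared; that is why a deep copy (or a serialize/deserialize round trip) is needed.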
Thanks guys for your support!

Check value of PF_NO_SETAFFINITY

Is it possible to tell whether a process/thread has the PF_NO_SETAFFINITY flag set? I'm running taskset on a series of process ids and some are throwing errors of the following form:
taskset: failed to set pid 30's affinity: Invalid argument
I believe this is because some processes have PF_NO_SETAFFINITY set (see Answer).
Thank you!
Yes - look at the 'flags' field (the 9th field) of /proc/PID/stat.
From <linux/sched.h>:
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */
Look here for details on using /proc:
http://man7.org/linux/man-pages/man5/proc.5.html
https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solutionid=sk65143
Example:
ps -eaf
www-data 30084 19962 0 07:09 ? 00:00:00 /usr/sbin/apache2 -k start
...
cat /proc/30084/stat
30084 (apache2) S 19962 19962 19962 0 -1 4194624 554 0 3 0 0 0 0 0 20 0 1 0 298837672 509616128 5510 18446744073709551615 1 1 0 0 0 0 0 16781312 201346799 0 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
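The flags value is 4194624 (0x00400140). The PF_NO_SETAFFINITY bit (0x04000000) is not set in it, so this flag does not block changing this process's affinity.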
Q: Do you mind specifying how you'd write a simple script that outputs
true/false based on whether you're allowed to set affinity?
A: I don't feel comfortable providing this without the opportunity to test, but you can try something like this...
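# Field 9 of /proc/PID/stat is the flags value (this simple cut breaks if the command name contains spaces)
flags=$(cut -f 9 -d ' ' /proc/30084/stat)
# Non-zero output means PF_NO_SETAFFINITY (0x04000000) is set
echo $(( flags & 0x04000000 ))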
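For a true/false answer that also copes with command names containing spaces, a minimal sketch in Python could look like the following. It is untested against real kernel threads; the function names are made up for this example, and the mask value is the one quoted from <linux/sched.h> above. It only reports whether the flag bit is set, not whether taskset would fail for some other reason.

```python
PF_NO_SETAFFINITY = 0x04000000  # value quoted from <linux/sched.h> above

def flags_from_stat(stat_line):
    """Return the flags field (field 9) of a /proc/PID/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so only the text after the last ')' is split on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 9 sits at index 6.
    return int(after_comm[6])

def can_set_affinity(stat_line):
    """True if the PF_NO_SETAFFINITY bit is clear in the flags field."""
    return (flags_from_stat(stat_line) & PF_NO_SETAFFINITY) == 0

def pid_can_set_affinity(pid):
    """Read /proc/<pid>/stat (Linux only) and apply the check."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return can_set_affinity(stat_file.read())

# First fields of the apache2 example shown earlier.
sample = "30084 (apache2) S 19962 19962 19962 0 -1 4194624 554 0 3 0 0 0 0 0 20 0 1 0 298837672"
print("true" if can_set_affinity(sample) else "false")
```

On a live system, `pid_can_set_affinity(30)` would apply the same check to the PID from the taskset error message.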

netstat/ss shows duplicated outgoing time_wait sockets

I encountered this behaviour many times in many servers which processed lots of network connections.
# ss -nt state time-wait sport ne :80 and sport ne :10050 | sort -k3
0 0 127.0.0.1:13530 127.0.0.1:8888
0 0 127.0.0.1:21978 127.0.0.1:8080
0 0 127.0.0.1:32490 127.0.0.1:8080
0 0 127.0.0.1:42922 127.0.0.1:8080
0 0 127.0.0.1:50728 127.0.0.1:8080
0 0 127.0.0.1:51542 127.0.0.1:8888
0 0 127.0.0.1:6274 127.0.0.1:8888
0 0 127.0.0.1:65264 127.0.0.1:8888
0 0 172.16.40.100:10000 172.16.40.5:3010
0 0 172.16.40.100:10002 172.16.40.34:3010
0 0 172.16.40.100:10002 172.16.40.97:3020
0 0 172.16.40.100:10004 172.16.40.116:3010
0 0 172.16.40.100:10004 172.16.40.21:3010
0 0 172.16.40.100:10008 172.16.40.30:3010
0 0 172.16.40.100:10010 172.16.40.216:3020
0 0 172.16.40.100:10012 172.16.40.30:3010
0 0 172.16.40.100:10014 172.16.40.131:3010
0 0 172.16.40.100:10014 172.16.40.22:3010
0 0 172.16.40.100:10014 172.16.40.33:3010
This is part of the ss output. As you can see, several lines show duplicated outgoing time_wait sockets, such as:
0 0 172.16.40.100:10002 172.16.40.34:3010
0 0 172.16.40.100:10002 172.16.40.97:3020
or
0 0 172.16.40.100:10014 172.16.40.131:3010
0 0 172.16.40.100:10014 172.16.40.22:3010
0 0 172.16.40.100:10014 172.16.40.33:3010
I googled this question but could not get a reasonable explanation of this topic.
Thanks a lot!
The lines in this display are connections, not sockets.
There are exactly zero 'duplicated sockets' here. What repeats is the local address and port, but in each of those lines the remote IP address, the remote port, or both are different. A TCP connection is identified by the full combination of local address, local port, remote address and remote port, so every line is a distinct connection.
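This can be checked mechanically. The sketch below (Python, with a made-up `endpoints` helper, using only the five lines quoted in the question) counts how often each local endpoint appears and how often each full local/remote pair appears:

```python
from collections import Counter

# Lines copied from the ss output in the question.
ss_lines = """\
0 0 172.16.40.100:10002 172.16.40.34:3010
0 0 172.16.40.100:10002 172.16.40.97:3020
0 0 172.16.40.100:10014 172.16.40.131:3010
0 0 172.16.40.100:10014 172.16.40.22:3010
0 0 172.16.40.100:10014 172.16.40.33:3010
""".splitlines()

def endpoints(line):
    """Return ((local_ip, local_port), (peer_ip, peer_port)) for one ss line."""
    _recv_q, _send_q, local, peer = line.split()
    local_ip, local_port = local.rsplit(":", 1)
    peer_ip, peer_port = peer.rsplit(":", 1)
    return (local_ip, int(local_port)), (peer_ip, int(peer_port))

connections = [endpoints(line) for line in ss_lines]
local_counts = Counter(local for local, _peer in connections)
tuple_counts = Counter(connections)

# Local endpoints that appear more than once, and full pairs that do.
repeated_locals = {k: v for k, v in local_counts.items() if v > 1}
repeated_tuples = {k: v for k, v in tuple_counts.items() if v > 1}
print(repeated_locals)
print(repeated_tuples)
```

The local endpoints repeat, but no full local/remote pair does, which is what the answer above says in words.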

MongoDB: Increasing read speed on a single machine

I use MongoDB to store price events for stocks. Depending on what you want to screen, the volume of events can quickly grow to 1-2 GB.
I run MongoDB on a single machine and it is taking longer and longer to load the data. I have not found a clear answer on the web as to whether "sharding on a single server" improves read speed.
Is it the right path to increase the read speed?
insert  query  update  delete  getmore  command  flushes  mapped  vsize  res    faults  locked db
0       1      395     0       1        395      0        63.9g   128g   2.53g  7484    prices:70.5%
0       0      14726   0       5        14728    0        63.9g   128g   2.48g  31555   prices:7.4%
0       0      0       0       0        1        0        63.9g   128g   2.48g  436     prices:0.0%
0       0      0       0       1        1        0        63.9g   128g   2.48g  0       prices:0.0%
0       0      0       0       0        1        1        63.9g   128g   2.49g  3877    .:83.9%
0       0      0       0       0        1        0        63.9g   128g   2.49g  0       prices:0.0%

Memcached using more than max memory

I have a memcached installation that I want to use in my production environment, but after running a couple of tests it seems that memcached doesn't free up memory even after it has used up all of its allocated memory. I also logged in and ran a flush_all command, but the objects are still in the cache.
Here are outputs from some tests
memcache-top
memcache-top v0.6 (default port: 11211, color: on, refresh: 3 seconds)
INSTANCE          USAGE         HIT %  CONN  TIME   EVICT/s  READ/s  WRITE/s
127.0.0.1:11211   427.1%        0.0%   18    1.4ms  0.0      244     261.0K
AVERAGE:          427.1%        0.0%   18    1.4ms  0.0      244     261.0K
TOTAL:            4.3MB/ 1.0MB         18    1.4ms  0.0      244     261.0K
memcached-tool 127.0.0.1:11211 display
No  Item_Size  Max_age  Pages  Count  Full?  Evicted  Evict_Time  OOM
1   560B       4s       1      1872   yes    0        0           15488
2   704B       32s      1      559    no     0        0           0
3   880B       4s       1      1191   yes    0        0           1335
4   1.1K       9s       1      116    no     0        0           0
5   1.4K       21s      1      14     no     0        0           0
6   1.7K       4s       1      17     no     0        0           0
7   2.1K       84s      1      24     no     0        0           0
8   2.7K       130s     1      60     no     0        0           0
9   3.3K       25s      1      290    no     0        0           0
10  4.2K       9s       1      194    no     0        0           0
11  5.2K       9s       1      116    no     0        0           0
15  12.7K      816s     1      1      no     0        0           0
16  15.9K      769s     1      5      no     0        0           0
18  24.8K      786s     1      1      no     0        0           0
21  48.5K      816s     1      1      no     0        0           0
memcached-tool 127.0.0.1:11211 stats
127.0.0.1:11211 Field Value
accepting_conns 1
auth_cmds 0
auth_errors 0
bytes 4478060
bytes_read 23964596
bytes_written 546642860
cas_badval 0
cas_hits 0
cas_misses 0
cmd_flush 0
cmd_get 240894
cmd_set 4504
conn_yields 0
connection_structures 21
curr_connections 18
curr_items 4461
decr_hits 0
decr_misses 0
delete_hits 0
delete_misses 0
evictions 0
get_hits 43756
get_misses 197138
incr_hits 0
incr_misses 0
limit_maxbytes 1048576
listen_disabled_num 0
pid 8731
pointer_size 64
reclaimed 0
rusage_system 5.047232
rusage_user 4.311344
threads 4
time 1306247929
total_connections 3092
total_items 4504
uptime 1240
version 1.4.5
-m tells memcached how much RAM to use for item storage (in megabytes). Note carefully that this isn't a global memory limit, so memcached will use a few % more memory than you tell it to. Set this to safe values. Setting it to less than 48 megabytes does not work properly in 1.4.x and earlier. It will still use the memory.
Source: https://github.com/memcached/memcached/wiki/ConfiguringServer#commandline-arguments
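To tie the quote back to the numbers in the question, the short Python sketch below recomputes the usage figure from two values in the `stats` output (`bytes` and `limit_maxbytes`) and checks the configured limit against the 48-megabyte threshold mentioned in the wiki. The variable names are chosen for this example only.

```python
MIB = 1024 * 1024

# Values copied from the `stats` output in the question.
stats = {"bytes": 4478060, "limit_maxbytes": 1048576}

# Item storage in use relative to the configured -m limit.
usage_percent = 100 * stats["bytes"] / stats["limit_maxbytes"]
limit_mib = stats["limit_maxbytes"] / MIB
below_48 = limit_mib < 48

print(f"limit: {limit_mib:.1f} MiB, usage: {usage_percent:.1f}%")
print("limit is below 48 MiB:", below_48)
```

The result matches the 427.1% shown by memcache-top, and the 1 MiB limit on this 1.4.5 instance falls in the range the wiki describes as not working properly.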