lmtpd: failed to mmap file /var/lib/imap/deliver.db.NEW (in reply to end of DATA command) - email

Good day!
After installing and running kolab letters delivered instantly. But after a few days letters to local destinations have become delivered with a delay. Over time, they are delivered, but the delay may be several hours. An example of the path of the letter:
root#myhost:~# cat /var/log/mail.log | grep 7AA7935B1FC
Jan 12 11:31:03 myhost postfix/smtpd[19494]: 7AA7935B1FC:
client=localhost[127.0.0.1]
Jan 12 11:31:05 myhost postfix/cleanup[19492]: 7AA7935B1FC:
message-id=<20160112093103.7AA7935B1FC#mail.myhost.com>
Jan 12 11:31:05 myhost postfix/qmgr[7021]: 7AA7935B1FC:
from=<noreply#myhost.com>, size=1279, nrcpt=3 (queue active)
Jan 12 11:31:05 myhost lmtpunix[19631]: Delivered:
<20160112093103.7AA7935B1FC#mail.myhost.com> to mailbox:
myhost.com!user.user1
Jan 12 11:31:06 myhost postfix/lmtp[19617]: 7AA7935B1FC: to=<user1#myhost.com>, relay=mail.myhost.com[/var/lib/imap/socket/lmtp], delay=2.6, delays=2/0.01/0/0.59, dsn=4.3.0, status=deferred (host
mail.myhost.com[/var/lib/imap/socket/lmtp] said: 421 4.3.0 lmtpd:
failed to mmap /var/lib/imap/deliver.db.NEW file (in reply to end of
DATA command))
Jan 12 11:31:06 myhost postfix/lmtp[19617]: 7AA7935B1FC: to=<user2#myhost.com>, relay=mail.myhost.com[/var/lib/imap/socket/lmtp], delay=2.7, delays=2/0.01/0/0.68, dsn=4.4.2, status=deferred (lost connection with mail.myhost.com[/var/lib/imap/socket/lmtp] while sending end of data
-- message may be sent more than once
Jan 12 11:31:07 myhost postfix/lmtp[19617]: 7AA7935B1FC: to=<user3#myhost.com>, relay=mail.myhost.com[/var/lib/imap/socket/lmtp], delay=2.7, delays=2/0.01/0/0.68, dsn=4.4.2, status=deferred (lost connection with mail.myhost.com[/var/lib/imap/socket/lmtp] while sending end of data
-- message may be sent more than once)
Currently mailq features a variety of messages in queue. An example of one of these:
7BBDF35B123 6162 Tue Jan 12 13:19:24 user#rambler.ru (delivery temporarily suspended: lost connection with mail.myhost.com[/var/lib/imap/socket/lmtp] while sending end of data -- message may be sent more than once) user4#myhost.com
-- 11667 Kbytes in 327 Requests.
I think that the main reason is described here:
lmtp: failed to mmap /var/lib/imap/deliver.db.NEW file
But, unfortunately, not been able to find a solution.

The problem was solved according to this recommendation: http://lists.kolab.org/pipermail/users-de/2015-May/001998.html
Stop Services cyrus-imap and postfix
Delete files deliver.db.NEW and deliver.db in the directory /var/lib/imap/
Start the services and the file deliver.db is automatically created
Restart the queue: postsuper -r ALL
Some of the letters delivered from the queue again.
Proposed cause: after installing and start services on the new server users download messages en masse in the format *.eml, downloaded from the last post. Perhaps these actions somehow overflowed index files.
P.S.: Unfortunately, the solution was temporary: the situation described above is repeated periodically :(

Related

Loki LogQL corellate maillogs

please assist me on parsing mail logs using Loki & Grafana :)
My logging server collects maillog files from Linux server, and I want to use Loki to check status (sent, deferred, etc) of messages from specific user.
The problem is that mail logs are divided into different log lines and I need to correlate different log lines using message id (40F36420E05 in text below):
Jun 9 22:38:36 mail postfix/smtp[376635]: 40F36420E05: to=<otheruser#domain2>, relay=domain3[11.11.11.11]:25, delay=13, delays=0.58/0/4.6/7.8, dsn=2.6.0, status=sent (250 2.6.0 <20220609193823.D980A420E06#mail> [InternalId=13731010457062, Hostname=XXX] 15472 bytes in 0.524, 28.786 KB/sec Queued mail for delivery)
Jun 9 22:37:35 mail postfix/qmgr[193514]: 40F36420E05: from=<user#domain>, size=4496, nrcpt=1 (queue active)
Jun 9 22:37:35 mail opendkim[251972]: 40F36420E05: DKIM-Signature field added (s=mail, d=domain)
Jun 9 22:37:35 mail postfix/cleanup[376634]: 40F36420E05: message-id=<20220609193735.40F36420E05#mail>
Jun 9 22:37:35 mail postfix/submission/smtpd[376557]: 40F36420E05: client=compute-1.amazonaws.com[44.11.11.11], sasl_method=PLAIN, sasl_username=user
I'm using this query to find required mail messages and regexp function to extract messageid label:
{host="mail.com"} |~"from=<user#domain>" | regexp "(?P<messageid>\\S+): from="
Jun 9 22:59:58 mail postfix/qmgr[377114]: 40F36420E05: from=<user#domain>, size=11916, nrcpt=1 (queue active)
Jun 9 22:59:58 mail postfix/qmgr[377114]: C3E5D420E05: from=<user#domain>, size=9622, nrcpt=1 (queue active)
Jun 9 22:59:57 mail postfix/qmgr[377114]: 27057420E07: from=<user#domain>, size=6695, nrcpt=1 (queue active)
Now I want to fetch all log lines containing with all messageid labels extracted from previous query. Like {host="mail.com"} |~"from=<user#domain>" | regexp "(?P<messageid>\\S+): from="} | messageid={list_of_parsed_messageids}
How can I achieve that? Thanks!

Postfix possible SMTP attack and blacklist

I have plesk 12.5.30 on my server which is often blacklisted on Symantec Mail Security reputation.
The ip is new (I have purchased the server on 13.02.2017).
Also my ip is blacklisted on BACKSCATTERER.
Seeing the log of postfix I have a lot of entries like
Mar 22 14:51:43 server postfix/smtpd[14204]: connect from 75-143-80-240.dhcp.aubn.al.charter.com[75.143.80.240]
Mar 22 14:51:45 server postfix/smtpd[14204]: lost connection after EHLO from 75-143-80-240.dhcp.aubn.al.charter.com[75.143.80.240]
Mar 22 14:51:45 server postfix/smtpd[14204]: disconnect from 75-143-80-240.dhcp.aubn.al.charter.com[75.143.80.240]
Mar 22 14:51:50 server postfix/smtpd[14204]: connect from 128.128.72.76.cable.dhcp.goeaston.net[76.72.128.128]
Mar 22 14:51:51 server postfix/smtpd[14204]: lost connection after EHLO from 128.128.72.76.cable.dhcp.goeaston.net[76.72.128.128]
Mar 22 14:51:51 server postfix/smtpd[14204]: disconnect from 128.128.72.76.cable.dhcp.goeaston.net[76.72.128.128]
Mar 22 14:52:19 server postfix/smtpd[14204]: connect from mail.dedeckeraccountants.be[91.183.46.186]
Mar 22 14:52:19 server postfix/smtpd[14204]: disconnect from mail.dedeckeraccountants.be[91.183.46.186]
I have
Changed the smtp port to a non standard one (9456)
Installed firewall and fail2ban on plesk and setted as in image
Setted mail settings of plesk as in image
Installed a spamassasin
I have noticed also that some days ago i have lines in log like these
Mar 19 06:47:00 server postfix/smtp[13517]: CCC1C510023D: to=<229e7dc3183452c7d3290d1ba28f073e#www.lablue.de>, relay=none, delay=235637, delays=235636/0.05/0.09/0, dsn=4.4.1, status=deferred (connect to www.lablue.de[217.22.195.26]:25: Connection refused)
Mar 19 06:47:00 server postfix/smtp[13503]: 7EDD55100138: to=<Weber226#brockel.kirche-rotenburg.de>, relay=kirche-rotenburg-verden.de[136.243.213.122]:25, delay=239980, delays=239979/0.01/0.35/0.1, dsn=4.0.0, status=deferred (host kirche-rotenburg-verden.de[136.243.213.122] said: 451 Temporary local problem - please try later (in reply to RCPT TO command))
Mar 19 06:47:00 server postfix/smtp[13504]: 97B055100233: to=<office#angerlehner.at>, relay=none, delay=222922, delays=222922/0.01/0.64/0, dsn=4.4.3, status=deferred (Host or domain name not found. Name service error for name=angerlehner.at type=MX: Host not found, try again)
Mar 19 06:47:00 server postfix/smtp[13509]: 1E15F510019B: host mx1.leventboru.com.tr[89.19.1.69] said: 450 4.7.1 Recipient address rejected: Requested action not taken: mailbox unavailable or not local (in reply to RCPT TO command)
And i noticed a very long mail queue in plesk settings (i have deleted all mail in queue)
Any advice to block this attack??
Thanks in advance
Edit: I want to share my plesk-postfix settings
[plesk-postfix]
enabled = true
filter = postfix-sasl
action = iptables-multiport[name="plesk-postfix", port="http,https,smtp,submission,pop3,pop3s,imap,imaps,sieve", protocol=tcp]
logpath = /var/log/maillog
maxretry = 2
There is somenthing can i improve here?
You might consider to use a Fail2Ban - filter with the following regex - expressions:
failregex = ^%(__prefix_line)slost connection after (AUTH|UNKNOWN|EHLO) from [^\[]*\[<HOST>\]\s*$
If you need further Fail2Ban regex - expressions, pls. consider to ADD the corresponding log - file entries, because some general standart ones may not suit your needs or/and your qmail/postfix/imap-courier/dovecot version, installed on your server. ;-)
Edit:
In order to be more precise, I now add the full suggestion, incl. the regex, that #MattiaDiGiuseppe already used in his comments - it's just a bit better formatted this way.
[Definition]
_daemon = postfix(-\w+)?/(?:submission/|smtps/)?smtp[ds]
failregex = ^%(__prefix_line)swarning: (.*?)does not resolve to address <HOST>: Name or service not known$
^%(__prefix_line)swarning: [-._\w]+\[<HOST>\]: SASL ((?i)LOGIN|PLAIN|(?:CRAM|DIGEST)-MD5) authentication failed(:[ A-Za-z0-9+/:]*={0,2})?\s*$
^%(__prefix_line)sNOQUEUE: reject: RCPT from \S+\[<HOST>\]: 554 5\.7\.1 .* Relay access denied.*$
^%(__prefix_line)sSSL_accept error from \S+\s*\[<HOST>\]: lost connection$
^%(__prefix_line)sSSL_accept error from \S+\s*\[<HOST>\]: -1$
^%(__prefix_line)slost connection after (AUTH|UNKNOWN|EHLO) from [^\[]*\[<HOST>\]\s*$
ignoreregex = authentication failed: Connection lost to authentication server$
Pls. consider to have a look at all standart filters ( for Fail2Ban 0.10 AND older versions), by visiting:
=> https://github.com/fail2ban/fail2ban/tree/0.10/config/filter.d
If you desire to view the standarts for older versions, just click on the "Branch: 0.10" dropdpwn - button, pls.

WAL contains references to invalid pages

centos 6.7
postgresql 9.5.3
I've DB servers that are on master-standby replication.
Suddenly, standby server's postgresql process was stopped with this logs.
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]WARNING: page 1671400 of relation base/16400/559613 is uninitialized
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]CONTEXT: xlog redo Heap2/VISIBLE: cutoff xid 1902107520
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]PANIC: WAL contains references to invalid pages
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]CONTEXT: xlog redo Heap2/VISIBLE: cutoff xid 1902107520
2016-07-14 18:14:21.026 JST [][5783e038.3cd9][0][15577]LOG: startup process (PID 15579) was terminated by signal 6: Aborted
2016-07-14 18:14:21.026 JST [][5783e038.3cd9][0][15577]LOG: terminating any other active server processes
And, master server's postgresql logs were nothing special.
But, master server's /var/log/messages was listed as below.
Jul 14 05:38:44 host kernel: sbridge: HANDLING MCE MEMORY ERROR
Jul 14 05:38:44 host kernel: CPU 8: Machine Check Exception: 0 Bank 9: 8c000040000800c0
Jul 14 05:38:44 host kernel: TSC 0 ADDR 1f7dad7000 MISC 90004000400008c PROCESSOR 0:306e4 TIME 1468442324 SOCKET 1 APIC 20
Jul 14 05:38:44 host kernel: EDAC MC1: CE row 1, channel 0, label "CPU_SrcID#1_Channel#0_DIMM#1": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=8 Err=0008:00c0 (ch=0), addr = 0x1f7dad7000 => socket=1, Channel=0(mask=1), rank=4
Jul 14 05:38:44 host kernel:
Jul 14 18:30:40 host kernel: sbridge: HANDLING MCE MEMORY ERROR
Jul 14 18:30:40 host kernel: CPU 8: Machine Check Exception: 0 Bank 9: 8c000040000800c0
Jul 14 18:30:40 host kernel: TSC 0 ADDR 1f7dad7000 MISC 90004000400008c PROCESSOR 0:306e4 TIME 1468488640 SOCKET 1 APIC 20
Jul 14 18:30:41 host kernel: EDAC MC1: CE row 1, channel 0, label "CPU_SrcID#1_Channel#0_DIMM#1": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=8 Err=0008:00c0 (ch=0), addr = 0x1f7dad7000 => socket=1, Channel=0(mask=1), rank=4
Jul 14 18:30:41 host kernel:
The memory error's started at 1 week ago. So, I doubt the memory error causes postgresql's error.
My question is here.
1) Can memory error of kernel cause postgresql's "WAL contains references to invalid pages" error?
2) Why there is not any logs at master server's postgresql?
thx.
Faulty memory can cause all kinds of data corruption, so that seems like a good enough explanation to me.
Perhaps there are no log entries at the master PostgreSQL server because all that was corrupted was the WAL stream.
You can run
oid2name
to find out which database has OID 16400 and then
oid2name -d <database with OID 16400> -f 559613
to find out which table belongs to file 559613.
Is that table larger than 12 GB? If not, that would mean that page 1671400 is indeed an invalid value.
You didn't tell which PostgreSQL version you are using, but maybe there are replication bugs fixed in later versions that could cause replication problems even without a hardware bug present; read the release notes.
I would perform a new pg_basebackup and reinitialize the slave system.
But what I'd really be worried about is possible data corruption on the master server. Block checksums are cool (turned on if pg_controldata <data directory> | grep checksum gives you 1), but possibly won't detect the effects of memory corruption.
Try something like
pg_dumpall -f /dev/null
on the master and see if there are errors.
Keep your old backups in case you need to repair something!

possible attack on postfix server

I am afraid that my vps server is being attacked as the postfix logs have hundreds of lines with these messages:
May 24 10:50:32 ukvps postfix/smtpd[29971]: warning: hostname xep9.flink.uz does not resolve to address 91.234.218.9: Name or service not known
May 24 10:50:32 ukvps postfix/smtpd[29971]: connect from unknown[91.234.218.9]
May 24 10:50:33 ukvps postfix/smtpd[29971]: lost connection after UNKNOWN from unknown[91.234.218.9]
May 24 10:50:33 ukvps postfix/smtpd[29971]: disconnect from unknown[91.234.218.9]
May 24 10:53:53 ukvps postfix/anvil[29724]: statistics: max connection rate 77/60s for (smtp:91.234.218.9) at > May 24 10:48:31
May 24 10:53:53 ukvps postfix/anvil[29724]: statistics: max connection count 1 for (smtp:91.234.218.9) at > May 24 10:47:31
May 24 10:53:53 ukvps postfix/anvil[29724]: statistics: max cache size 1 at May 24 10:47:31
May 26 10:51:56 ukvps postfix/smtpd[13694]: warning: hostname myco-bio.com does not resolve to address 112.72.13.230
May 26 10:51:56 ukvps postfix/smtpd[13694]: connect from unknown[112.72.13.230]
May 26 10:51:57 ukvps postfix/smtpd[13694]: lost connection after UNKNOWN from unknown[112.72.13.230]
May 26 10:51:57 ukvps postfix/smtpd[13694]: disconnect from unknown[112.72.13.230]
May 26 10:52:19 ukvps postfix/smtpd[13694]: warning: hostname myco-bio.com does not resolve to address 112.72.13.230
May 26 10:52:19 ukvps postfix/smtpd[13694]: connect from unknown[112.72.13.230]
May 26 10:52:20 ukvps postfix/anvil[12258]: statistics: max connection rate 8/60s for (smtp:112.72.13.230) at May 26 10:42:43
May 26 10:52:20 ukvps postfix/anvil[12258]: statistics: max connection count 1 for (smtp:112.72.13.230) at May 26 10:42:21
May 26 10:52:20 ukvps postfix/anvil[12258]: statistics: max cache size 1 at May 26 10:46:06
ii postfix 2.9.6-2 amd64 High-performance mail transport agent
Also some customers are replying to us with spam complaints from users that dont exist on our domain.
Any help is appreciated, thanks.
That looks like some portscan or other scanning attemps. They connect, issue some invalid commands and then disconnect. They don't try to send any emails, as in that case you would see in the postfix log info about either accepting those emails or rejecting them.
Regarding the second issue, you are probably victim of backscatter spam. Some spammer is using your domains names to send out spam. They use your email address like anything#domain.com to send spam from botnet or anywhere. When that mail is undeliverable, your users would receive that bounce message. This is worse when they have catch-all address defined (*#domain.com gets delivered into some mailbox). There is nothing you can do about it as it is completely out of control of your server or your domain. You can little help by having strict SPF records, but it doesn't help much.

Log Memcached Activity

I'm looking to log all activity going on in my memcached server. All reads and writes.
This is going to be used as a distributed daemon for lots of remote php apps in the cloud and need a way to SSH in and check out the activity going on, on the daemon.
I've googled extensively and can't find a single way to do this.
The Redis equivalent would be logging into the interactive console and typing MONITOR.
Thanks in advance.
You can do this using memcached's telnet stats command as such :
enable capturing information : stats detail on
wait a while
disable capturing : stats detail off
dump the information : stats detail dump
telnet yourhost 11211 and run the sequence above. Note that this will impact greatly the performance.
Also you could check out phpmemcacheadmin - it's a really nice tool for monitoring memcached pools.
You can start your memcached using the -vv parameter:
-vv very verbose (also print client commands/reponses)
The output is similar to this:
Dec 12 10:33:12 hostname memcached[18350]: <33 new auto-negotiating client connection
Dec 12 10:33:12 hostname memcached[18350]: 33: Client using the ascii protocol
Dec 12 10:33:12 hostname memcached[18350]: <33 set your.key.1 3 300 525
Dec 12 10:33:12 hostname memcached[18350]: >33 STORED
Dec 12 10:33:12 hostname memcached[18350]: <33 get your.key.2
Dec 12 10:33:12 hostname memcached[18350]: >33 sending key your.key.2
Dec 12 10:33:12 hostname memcached[18350]: >33 END
The output is sent to stdout, so you might want to redirect it to some file.
Use supervisord to run memcached. It'll log the memcached output (based on the 1,2 or 3 -v's) to syslog - so log rotate should work (it won't if you just pipe output to a file) and you can use syslog to send all the logs to a central logging machine (using something like graylog2 to take all the log info).