Configure procmail to match an external email address list when filtering emails - email

My fetchmail script retrieves emails from a mailbox and writes them to a file called mario in my /var/mail/ folder. I am trying to set up a procmail script to process mario. By processing I mean this: the procmail script should filter against an external text file (fromlist) containing a list of known email addresses. When a message in mario matches an address in fromlist, the message should be pulled out of mario and stored in my local nbox/ folder.
Online, I found a piece of code, including a recipe, that I have entered into my procmail control file (.procmailrc) but it doesn't seem to be working. This is the code:
FROMFL=$MAIL/fromlist
FROMLS=formail -xFrom: | sed -e 's/*(.*)//;s/>.*//;s/.*[:]*//'`
:0
* ? fgrep -xi $FROMLS $FROMFL
$MAIL/inbox
I think I have addressed the sed (see my question Sed command and unknown patterns found online), but I still haven't been able to address the formail and fgrep parts. So when I run the procmail script, the logs I obtain are:
$ mailstat var/log/procmail.log
/bin/sh: 0: Can't open fgrep
/bin/sh: 1: grep: not found
/bin/sh: 1: sed: not found
/home/user/var/mail/reginbox/
procmail: [6880] Sat Jun 16 16:57:32 2018
procmail: Acquiring kernel-lock
procmail: Assigning "FROMFL=/home/user/var/mail/fromlist"
procmail: Assigning "FROMLS="
procmail: Assigning "LASTFOLDER=/home/user/var/mail/reginbox/msg.XXX"
procmail: Assigning "SHELL=/bin/sh"
procmail: Executing "fgrep,-xi,/home/user/var/mail/fromlist"
procmail: Executing "formail -xFrom: | sed -e `'s/.*<//; s/>.*//'`"
procmail: No match on "fgrep -xi /home/user/var/mail/fromlist"
procmail: Non-zero exitcode (127) from "fgrep"
procmail: Notified comsat: "user@0:/home/user/var/mail/reginbox/msg.XXX"
procmail: Opening "/home/user/var/mail/reginbox/msg.XXX"
It looks as if formail cannot extract the lines containing "From:", which means that the email addresses on those lines are not carved out by the sed command and are not compared against the text file with the list of emails (fromlist). That would explain why the log shows a "No match" message.
How can I find out where these things break down?

The syntax for running an external command is
VARIABLE=`command to run`
You are missing the opening backtick, so you are effectively running
FROMLS="formail"
and the rest of the line (-xFrom: | sed etc.) is a syntax error.
Anyway, the recipe to extract the sender is a bit inexact, because it doesn't cope correctly with the many variations in email address formats. A more robust but slightly harder to understand solution is
FROMLS=`formail -rtzxTo:`
which makes formail generate a reply (-rt) and then extract the To: address from the generated reply (-zxTo:), which of course now points back to the original sender. By design, formail only puts the actual email address of the sender of the input message in the To: header when it generates the reply, so that's what you will be extracting.
With that out of the way, your script should technically work to the point where it can extract the matching messages and copy them to the destination folder you want. Here's a quick demo:
tripleee$ cd /tmp
tripleee$ echo moo@example.com >fromlist
tripleee$ cat one.rc
# temporary hack
SHELL=/bin/sh
MAILDIR=/tmp
MAIL=.
VERBOSE=yes
FROMFL=$MAIL/fromlist
FROMLS=`formail -rtzxTo:`
:0
* ? fgrep -xi "$FROMLS" "$FROMFL"
$MAIL/inbox
tripleee$ procmail -m one.rc <<\:
From: ick@example.com
To: poo@example.org
Subject: no match
hello
:
procmail: [16406] Wed Jun 27 13:41:35 2018
procmail: Assigning "FROMFL=./fromlist"
procmail: Executing "formail,-rtzxTo:"
procmail: Assigning "FROMLS=ick@example.com"
procmail: Executing "fgrep,-xi,ick@example.com,./fromlist"
procmail: Non-zero exitcode (1) from "fgrep"
procmail: No match on "fgrep -xi ick@example.com ./fromlist"
Subject: no match
Folder: **Bounced** 61
tripleee$ procmail -m one.rc <<\:
From: moo@example.com
To: poo@example.org
Subject: match
hello
:
procmail: [16410] Wed Jun 27 13:41:37 2018
procmail: Assigning "FROMFL=./fromlist"
procmail: Executing "formail,-rtzxTo:"
procmail: Assigning "FROMLS=moo@example.com"
procmail: Executing "fgrep,-xi,moo@example.com,./fromlist"
procmail: Match on "fgrep -xi moo@example.com ./fromlist"
procmail: Assigning "LASTFOLDER=./inbox"
procmail: Opening "./inbox"
procmail: Acquiring kernel-lock
Subject: match
Folder: ./inbox 68
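The condition in the recipe can also be tried on its own, outside procmail. The following is only a sketch in plain sh: the addresses and the temporary fromlist are made up for illustration, and grep -F stands in for fgrep (same behaviour, fgrep being the older spelling). It shows that -x demands a whole-line match and -i ignores case, and why the variables should be quoted: with an empty, unquoted sender the file name itself would become the pattern.

```shell
#!/bin/sh
# Sketch: test whether a sender address is listed, one address per line.
# The list contents and addresses are invented for this example.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
fromlist="$workdir/fromlist"
printf '%s\n' 'moo@example.com' 'boss@example.org' > "$fromlist"

# -F: fixed string (no regex), -x: the whole line must match,
# -i: ignore case, -q: print nothing, only the exit status matters.
is_listed() {
    grep -Fxiq -- "$1" "$fromlist"
}

for sender in 'MOO@example.com' 'ick@example.com' 'moo@example.co'; do
    if is_listed "$sender"; then
        echo "$sender: listed"
    else
        echo "$sender: not listed"
    fi
done
```

Only the first address should be reported as listed; the last one is a prefix of a listed address and is rejected because of -x.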
There is no way really for procmail to remove anything from the input folder. If you want to do that, a common solution is to have Procmail write the non-matching messages to another output folder, then copy that back over the input file. The net effect is that the messages from the original input folder are now partitioned into two files, one with matching, and one with non-matching messages.

Writing a filter for a regex that works in fail2ban-regex on the command line

I have entries like these in apache2 error.log
[Thu Jan 12 09:18:51.078445 2023] [core:error] [pid 47992] [client 152.89.196.211:53158] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/bin/sh)
[Wed Jan 11 06:01:09.820582 2023] [core:error] [pid 30833] [client 185.225.74.55:39856] AH10244: invalid URI path (/cgi-bin/.%%%%32%%65/.%%%%32%%65/.%%%%32%%65/.%%%%32%%65/.%%%%32%%65/bin/sh)
[Wed Jan 11 17:16:49.643509 2023] [core:error] [pid 41882] [client 152.89.196.211:52746] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/bin/sh)
I got this to work on the command line:
fail2ban-regex test.log '.*\[client <HOST>:\d+\] AH10244.*$'
Every time I try to stick the regex into a .conf file like so:
[Definition]
failregex = .*\[client <HOST>:\d+\] AH10244.*$
ignoreregex =
fail2ban complains:
Running tests
=============
Use failregex line : filter.conf
ERROR: No failure-id group in 'filter.conf'
I've looked in the man pages and online but I can't find an explanation of what this message is trying to say, or how to fix it.
The Questions
How do I wrap a .conf file around this regex?
What does that error mean?
Could I (how would I) use the pre-defined stuff in apache-common.conf to make this regex more robust?
This fixed it:
fail2ban-regex test.log ./filter.conf
I had my test files (test.log and filter.conf) in my home dir. When I issued this command from the home dir:
fail2ban-regex test.log filter.conf
I assumed that I was referencing ./test.log and ./filter.conf but I think that fail2ban was looking in the filter.d/ folder to try to find filter.conf.
I found that if filter.conf was in the /etc/fail2ban/filter.d/ folder, then fail2ban-regex test.log filter.conf succeeded.
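For anyone who wants to reproduce this, here is a small sketch that writes the filter and one sample log line into a scratch directory and then calls fail2ban-regex with explicit ./ paths. The scratch directory is an assumption of the example, and the command is only run if fail2ban-regex is actually installed.

```shell
#!/bin/sh
# Sketch: build a throwaway filter and sample log, then test them
# using explicit relative paths so that no name lookup is involved.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Quoted heredoc delimiter: backslashes and $ are written literally.
cat > filter.conf <<'EOF'
[Definition]
failregex = .*\[client <HOST>:\d+\] AH10244.*$
ignoreregex =
EOF

cat > test.log <<'EOF'
[Thu Jan 12 09:18:51.078445 2023] [core:error] [pid 47992] [client 152.89.196.211:53158] AH10244: invalid URI path (/cgi-bin/.%2e/.%2e/.%2e/.%2e/bin/sh)
EOF

if command -v fail2ban-regex >/dev/null 2>&1; then
    fail2ban-regex ./test.log ./filter.conf
else
    echo "fail2ban-regex is not installed; files are in $workdir"
fi
```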

Deadlock with Perl, IO::Async::Loop and pipe to sendmail

We are seeing stuck sendmail processes when we attempt to send email from a Perl FCGI process. These processes take far too long, from hours to a day, even though each one should just relay to a server configured in sendmail as the smart host. Most of the mail from the FCGI processes takes less than 5 seconds. The slow sendmail processes are easy to find on our servers with $ ps -ef | grep sendmail
Almost all of the email works normally from these web nodes. I'd guess thousands of mails go through with no problem. Sending test email from the command line goes smoothly. The sendmail command gets stuck rarely and we don't have a way to reproduce it.
It seems that most of this stuck email gets through sooner or later. These seem to be sending mail hours later, sometimes over a day later.
All of the sendmail that we've seen stuck has been a command that was run by a Perl process, which is a child process of a FCGI process.
Looking at the logs of the smart host we see that most of this mail does get through sooner or later but we have found some that don't seem to have ever been sent.
This is running in FCGI for Catalyst; the work is then added to an IO::Async::Loop which does some processing. Inside the IO::Async::Loop, Email::Sender::Transport::Sendmail is used, which does an open($fh, '|-', @args), pipes the mail header+body, and does a close($fh).
I've seen this http://perldoc.perl.org/perlipc.html#Avoiding-Pipe-Deadlocks but don't know how to apply it in this situation. The child sendmail has only STDIN open.
When we have one of these stuck sendmails the sendmail is waiting on STDIN:
[<ffffffff8119ce8b>] pipe_wait+0x5b/0x80
[<ffffffff8119d8ad>] pipe_read+0x34d/0x4d0
[<ffffffff8119204a>] do_sync_read+0xfa/0x140
[<ffffffff81192945>] vfs_read+0xb5/0x1a0
[<ffffffff81192c91>] sys_read+0x51/0xb0
[<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
and the async perl process is waiting on the child to die:
#0 0x00007f8849e6065e in waitpid () from /lib64/libc.so.6
#1 0x000000000046dc2d in Perl_wait4pid ()
#2 0x000000000046de2d in Perl_my_pclose ()
#3 0x00000000004cec4e in Perl_io_close ()
#4 0x00000000004ceda8 in Perl_do_close ()
#5 0x00000000004c2629 in Perl_pp_close ()
#6 0x00000000004804de in Perl_runops_standard ()
#7 0x000000000042e7ad in perl_run ()
#8 0x000000000041bbc5 in main ()
An example of one that didn't get through:
Job #1653576 (that's just our internal job number) has a sendmail process that started on Aug 19 13:04.
Process on webnode2:
fcgi-user 13621 13466 0 13:04 ? 00:00:00 /usr/sbin/sendmail -i -f admin@ourServer.org -- proffunnyhat@mit.edu
I don't see the record I expect to see on our smart host for this in /var/log/maillog that would indicate that it was relayed to nexus and then to MIT.
I do see successful email for proffunnyhat@mit.edu on Aug 21 (from web2 /var/log/maillog):
Aug 21 00:00:02 node-008 sendmail[13621]: u7JH4tbr013621: to=proffunnyhat@mit.edu, ctladdr=admin@ourServer.org (10520/10520), delay=1+10:55:07, xdelay=00:00:01, mailer=relay, pri=32292, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (u7L401Z1026237 Message accepted for delivery)
Aug 21 00:00:02 node-008 sendmail[26247]: u7L401Z1026237: to=<proffunnyhat@mit.edu>, delay=00:00:01, xdelay=00:00:00, mailer=relay, pri=122657, relay=mail.ourServer.org. [128.84.4.11], dsn=2.0.0, stat=Sent (u7L402jx001185 Message accepted for delivery)
and then on mail.ourServer.org:
bdc34@mail.ourServer.org: log$ sudo grep u7L402jx001185 maillog*
maillog-20160821:Aug 21 00:00:02 web2 sendmail[1185]: u7L402jx001185: from=<admin@ourServer.org>, size=2874, class=0, nrcpts=1, msgid=<201608191704.u7JH4tbr013621@mail.ourServer.org>, proto=ESMTP, daemon=MTA, relay=mail.ourServer.org [128.84.4.13]
maillog-20160821:Aug 21 00:00:03 mail.ourServer.org[1200]: u7L402jx001185: to=<proffunnyhat@mit.edu>, ctladdr=<e-admin@ourServer.org> (10519/10519), delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=122874, relay=dmz-mailsec-scanner-8.mit.edu. [18.7.68.37], dsn=2.0.0, stat=Sent (OK 5E/2D-20045-34729B75)
An example of one that was stuck but seems to have been sent:
mail.ourServer.org:/var/log/sendmail:
Aug 19 02:19:51 mail.ourServer.org sendmail[20792]: u7J6JlP6020790: to=<jxjx@connect.ust.hk>, ctladdr=<admin@ourServer.org> (10519/10519), delay=00:00:04, xdelay=00:00:04, mailer=esmtp, pri=122504, relay=connect-ust-hk\
.mai...ction.outlook.com. [213.199.154.87], dsn=2.0.0, stat=Sent (<201608190619.u7J6Jlda000738@web2.ourServer.org> [InternalId=15526306777069,...1MB1197.apcprd01.prod.exchangelabs.com] 9137 bytes in 0.189, 47.082 KB/sec\
Queued mail for delivery)
Things we have tried
I've modified Email::Sender::Transport::Sendmail to send a '\x00' to the pipe, that didn't work.
I've replaced IO::Async::Loop::Poll with IO::Async::Loop::Select. That didn't change anything.
I've tried sending signals to the sendmail and its parent. That killed them but the mail was aborted.
Added our fcgi user to sendmail's trusted users file. Didn't change anything.
I wrote a wrapper script that reads from STDIN and writes to sendmail. If nothing comes in on STDIN for 5 seconds it exits. This feels really hacky to me but it does seem to work. Since mail is a critical part of our system I'd rather have a real solution.
ikegami's comment led us to the answer: do a double fork. Looking at the signal handlers and file handles set up under FCGI made it clear that excessively clever things were happening. So I cut all ties with the parent process using a double fork, as when starting a daemon. That worked.
# FCGI does some clever signal handling and sets up file handles to do
# its work. This causes problems with forked processes that then
# fork and wait for other processes. So use exec to get a process
# that is not a copy of the parent. Double fork to avoid zombies.
# Check -e of submit script because $! has a cryptic message if it doesn't exist
my $script = "$SAFEBINDIR/submit.pl";
unless( -e -r -x $script ){
    $submission->submit_log->error($submission->submission_id . ": $script doesn't exist");
    $c->flash( message => "There was a problem");
    $c->res->redirect( $c->uri_for('/user') );
    $c->detach;
}
# Do the double fork + exec to have an independent process
my $pid;
unless( $pid = fork() ) { #this is the child
    unless( fork() ){ #this is the grandchild
        exec( $script, $submission->submission_id ) #should never return
            or $submission->submit_log->error($submission->submission_id
                . ": Error when trying to exec() $script '$!'");
        exit(0);
    }
    exit(0); #child exits at once so the grandchild is orphaned
}
waitpid($pid,0); #wait for child, grandchild will get ppid 1
}
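For comparison, the same detaching idea can be sketched in plain sh. This is not the Perl code above, only a shell analogue under the assumption that the job needs nothing from the caller's stdin, stdout or stderr; detach is a hypothetical helper name. The subshell plays the role of the short-lived child, and the backgrounded command is the grandchild that outlives it.

```shell
#!/bin/sh
# Sketch: run a command detached from the calling shell.
# detach() is a made-up helper name for this example.
detach() {
    # The subshell is the intermediate child. It starts the command in
    # the background (the grandchild) and exits at once, so the caller
    # only ever waits for the subshell, never for the command itself.
    # The redirections drop any inherited pipes on stdin/stdout/stderr.
    ( "$@" </dev/null >/dev/null 2>&1 & )
}

marker=$(mktemp)
# The job sleeps for a moment and then records that it finished.
detach sh -c 'sleep 1; echo finished > "$1"' sh "$marker"
echo "caller continues immediately"
```

The caller prints its message before the job has finished; the marker file is filled in about a second later by the detached process.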

Weird behavior of db2ls when run from directory containing spaces

I am facing a strange issue with getting the db2 version using db2ls.
Below are 2 instances of db2ls execution
[root@dummy 6]# cd /tmp
[root@dummy tmp]# db2ls
Install Path          Level      Fix Pack   Special Install Number   Install Date                   Installer UID
---------------------------------------------------------------------------------------------------------------------
/opt/ibm/db2/V10.5    10.5.0.3   3                                   Tue Mar 10 01:38:53 2015 PDT   0
[root@dummy tmp]# mkdir test\ dir
[root@dummy tmp]# cd test\ dir/
[root@dummy test dir]# db2ls
/usr/local/bin/db2ls: line 43: cd: /tmp/test: No such file or directory
Install Path          Level      Fix Pack   Special Install Number   Install Date                   Installer UID
---------------------------------------------------------------------------------------------------------------------
/opt/ibm/db2/V10.5    10.5.0.3   3                                   Tue Mar 10 01:38:53 2015 PDT   0
It looks like db2ls is having issues when executed from a directory having spaces. Is this a known issue? I could not find any documentation for this. I am trying to circumvent this problem by using db2ls 2>/dev/null.
If there is a more efficient way please let me know.
http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.qb.server.doc/doc/t0023683.html
for DB2 installation paths it says
DB2 installation paths have the following rules:
Can include lowercase letters (a-z), uppercase letters (A-Z), and the underscore character ( _ )
Cannot exceed 128 characters
Cannot contain spaces
Cannot contain non-English characters
Statements like this are always a warning sign not to use spaces in paths at all. BTW, shell scripts often don't play well with spaces in paths, which for me is a good reason to avoid spaces in general.
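The error in the question ("cd: /tmp/test: No such file or directory") is what an unquoted variable in a cd typically produces. I have not seen line 43 of db2ls, so treat that as an assumption; the sketch below only demonstrates the general effect in sh, using a scratch directory invented for the example.

```shell
#!/bin/sh
# Sketch: why unquoted paths containing spaces break shell scripts.
base=$(mktemp -d)
dir="$base/test dir"
mkdir "$dir"

# Unquoted: the shell splits the value at the space, so cd receives
# ".../test" and "dir" as two separate words and fails.
if ( cd $dir ) 2>/dev/null; then unquoted=ok; else unquoted=failed; fi

# Quoted: the whole path stays a single argument.
if ( cd "$dir" ) 2>/dev/null; then quoted=ok; else quoted=failed; fi

echo "unquoted cd: $unquoted"
echo "quoted cd:   $quoted"
rm -rf "$base"
```

If the message really comes from a cd back to the starting directory, running the tool from a directory without spaces, for example ( cd /tmp && db2ls ), may be a cleaner workaround than discarding stderr, but that is untested here.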

Adding data to a text file with SED not changing the file size

I have some text files where I need to add 1 character to the beginning of every line of the file.
In windows, I found that a quick way to do this was by installing Cygwin and using the following command, which prepends the letter N to every line of the file:
$ sed 's/^/N/' inputFile.txt > outputFile.txt
What I found strange was that after I added a new character to the front of each line, the file size was almost completely unchanged. I tested this further, to see if I could recreate the problem with the following steps:
Created a text file called "Test.txt", which had 10,000 lines with the word "TEST" on each line.
Created a text file called "TestWithNPrefix.txt" which had 10,000 lines with the word "NTEST" on each line.
Executed the following command to create another file which had 10,000 lines of "NTEST"
$ sed 's/^/N/' Test.txt > "SEDTest.txt"
Results
"Test" and "SEDTest" were almost the exact same size, while "TestWithNPrefix" was 10KB larger.
Test = 59,998 Bytes; SEDTest = 59,999 Bytes; TestWithNPrefix = 69,998 Bytes
When I ran the "fc" command in Command Prompt, it returned that there were no differences between "SEDTest" and "TestWithNPrefix". "FC" between "SEDTest" and "Test" returned "Resync Failed. Files are too different".
Can someone please help me understand what is causing these file size discrepancies?
EDIT: I created the files "Test.txt" and "TestWithNPrefix.txt" in UltraEdit. I just typed out the word "TEST"/"NTEST", then copied and pasted it 10,000 times.
Not an answer, but a comment with formatting:
You seem to be running into some odd situation with DOS versus Unix line endings. I have to ask: How are you creating the files? I would expect 10,000 lines of "TEST\r\n" to be exactly 60,000 bytes in size, not 59,999.
On Linux (I don't have access to a cygwin environment at the moment):
$ yes $'TEST\r' | head -n 10000 > Test
$ ll Test
-rw-r--r-- 1 jackman jackman 60000 Jan 8 13:06 Test
$ sed 's/^/N/' Test > SEDTest
$ ll *Test
-rw-r--r-- 1 jackman jackman 70000 Jan 8 13:06 SEDTest
-rw-r--r-- 1 jackman jackman 60000 Jan 8 13:06 Test
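To make the arithmetic explicit, here is a sketch that builds both an LF and a CRLF version of the file and prints the byte counts before and after prefixing. The file names are made up, and this was written for a Unix-like sh; whether Cygwin's text-mode handling changes the numbers is exactly the open question, so no claim is made about that.

```shell
#!/bin/sh
# Sketch: byte counts for LF vs CRLF files, before and after adding "N".
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# 10,000 lines of "TEST": 5 bytes per line with LF, 6 with CRLF.
yes 'TEST' | head -n 10000 > lf.txt
yes 'TEST' | head -n 10000 | awk '{ printf "%s\r\n", $0 }' > crlf.txt

# Prefix every line with N: one extra byte per line, 10,000 in total.
sed 's/^/N/' lf.txt > lf_n.txt
sed 's/^/N/' crlf.txt > crlf_n.txt

# Print the size in bytes without the padding some wc versions add.
size() { wc -c < "$1" | tr -d ' '; }

for f in lf.txt lf_n.txt crlf.txt crlf_n.txt; do
    printf '%-12s %s bytes\n' "$f" "$(size "$f")"
done

# Count the lines that end in a carriage return.
cr=$(printf '\r')
printf 'CR-terminated lines in crlf.txt: %s\n' "$(grep -c "$cr\$" crlf.txt)"
```

On Linux the sizes come out as 50,000 and 60,000 bytes for the LF pair and 60,000 and 70,000 bytes for the CRLF pair, so a prefixed file that is the same size as its source suggests the two files do not share the same line endings.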

JMeter Command Line Output

I'm running a JMeter test plan from command line and it's currently outputting something along the lines of:
Created the tree successfully using C:\*****\TestPlan.jmx
Starting the test @ Thu Oct 11 10:20:43 EDT 2012 (1349965243947)
Waiting for possible shutdown message on port 4445
Tidying up ... @ Thu Oct 11 10:20:46 EDT 2012 (1349965246384)
... end of run
Is there any way to turn off this output and have the plan execute 'silently'?
Found a way to do this, by following this article http://www.robvanderwoude.com/battech_redirection.php
and appending > NUL to the command
jmeter -n -t C:\***\TestPlan.jmx -Jhostname=%1 > NUL
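Note that > NUL only discards standard output; anything written to standard error still shows up unless it is redirected too (in cmd.exe, > NUL 2>&1). The same distinction can be sketched in sh, where /dev/null plays the role of NUL. The noisy function below is a stand-in invented for the example, not JMeter itself.

```shell
#!/bin/sh
# Sketch: redirecting stdout alone vs stdout and stderr together.
# noisy() is a made-up stand-in for a chatty command.
noisy() {
    echo "Starting the test"            # normal output on stdout
    echo "something went wrong" >&2     # diagnostics on stderr
}

# stdout is discarded, but stderr still comes through
# (it is captured here only so that it can be shown).
stdout_only_silenced=$(noisy 2>&1 >/dev/null)

# Both streams are discarded, so nothing is left.
fully_silenced=$(noisy >/dev/null 2>&1)

echo "with > /dev/null:      [$stdout_only_silenced]"
echo "with > /dev/null 2>&1: [$fully_silenced]"
```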