Any way to filter email dynamically by taking 'from' and matching it with database ? (Using procmail or Virtualmin or Webmin) - postgresql

I basically want to check the incoming 'From' in the email received and then either
Keep it and make it deliver to the intended mailbox if the email matches a Specified MySQL/PostgreSQL
Database User (eg. select email from users where exists ('from email address') )
If the 'From' address is blank or it is not found in the database, the email should be discarded
Any way I can achieve this before the e-mail is delivered to the intended mailbox?
I am using Procmail + Virtualmin + Webmin + PostgreSQL
PS: I want to apply this filter not to the wole server but to some specified mailboxes/users (i'm assuming 1 user = 1 mailbox here)

Procmail can easily run an external command in a condition and react to its exit status. How exactly to make your particular SQL client set its exit code will depend on which one you are using; perhaps its man page will reveal an option to make it exit with an error when a query produces an empty result set, for example? Or else write a shell wrapper to look for empty output.
A complication is that Procmail (or rather, the companion utility formail) can easily extract a string from e.g. the From: header; but you want to reduce this to just the email terminus. This is a common enough task that it's easy to find a canned solution - generate a reply and then extract the To: address (sic!) from that.
FROM=`formail -rtzxTo:`
:0
* FROM ?? ^(one#example\.com|two#site\.example\.net|third#example\.org)$
{
:0
* ? yoursql --no-headers --fail-if-empty-result \
--batch --query databasename \
--eval "select yada yada where address = '$FROM'"
{ }
:0E
/dev/null
}
The first condition examines the variable and succeeds if it contains one of the addresses (my original answer simply had the regex . which matches if the string contains at least one character, any character; I'm not convinced this is actually necessary or useful; there should be no way for From: to be empty). If it is true, Procmail enters the braces; if not, they will be skipped.
The first recipe inside the braces runs an external command and examines its exit code. I'm imagining your SQL client is called yoursql and that it has options to turn off human-friendly formatting (table headers etc) and for running a query directly from the command line on a specific database. We use double quotes so that the shell will interpolate the variable FROM before running this command (maybe there is a safer way to pass string variables which might contain SQL injection attempts with something like --variable from="$FROM" and then use that variable in the query? See below.)
If there is no option to directly set the exit code, but you can make sure standard output is completely empty in the case of no result, piping the command to grep -q . will produce the correct exit code. In a more complex case, maybe write a simple Awk script to identify an empty result set and set its exit status accordingly.
Scraping together information from
https://www.postgresql.org/docs/current/app-psql.html,
How do you use script variables in psql?,
Making an empty output from psql,
and from your question, I end up with the following attempt to implement this in psql; but as I don't have a Postgres instance to test with, or any information about your database schema, this is still approximate at best.
* ? psql --no-align --tuples-only --quiet \
--dbname=databasename --username=something --no-password \
--variable=from="$FROM" \
--command="select email from users where email = :'from'" \
| grep -q .
(We still can't use single quotes around the SQL query, to completely protect it from the shell, because Postgres insists on single quotes around :'from', and the shell offers no facility for embedding literal single quotes inside single quotes.)
The surrounding Procmail code should be reasonably self-explanatory, but here goes anyway. In the first recipe inside the braces, if the condition is true, the empty braces in its action line are a no-op; the E flag on the next recipe is a condition which is true only if any of the conditions on the previous recipe failed. This is a common idiom to avoid having to use a lot of negations; perhaps look up "de Morgan's law". The net result is that we discard the message by delivering it to /dev/null if either condition in the first recipe failed; and otherwise, we simply pass it through, and Procmail will eventually deliver it to its default destination.
The recipe was refactored in response to updates to your question; perhaps now it would make more sense to just negate the exit code from psql with a ! in front:
FROM=`formail -rtzxTo:`
:0
* FROM ?? ^(one#example\.com|two#site\.example\.net|third#example\.org)$
* ! ? psql --no-align --tuples-only --quiet \
--dbname=databasename --username=something --no-password \
--variable=from="$FROM" \
--command="select email from users where email = :'from'" \
| grep -q .
/dev/null
Tangentially, perhaps notice how Procmail's syntax exploits the fact that a leading ? or a doubled ?? are not valid in regular expressions. So the parser can unambiguously tell that these conditions are not regular expressions; they compare a variable to the regex after ??, or examine the exit status of an external command, respectively. There are a few other special conditions like this in Procmail; arguably, all of them are rather obscure.
Newcomers to shell scripting should also notice that each shell command pipeline has two distinct results: whatever is being printed on standard output, and, completely separate from that, an exit code which reveals whether or not the command completed successfully. (Conventionally, a zero exit status signals success, and anything else is an error. For example, the exit status from grep is 0 if it finds at least one match, 1 if it doesn't, and usually some other nonzero exit code if you passed in an invalid regular expression, or you don't have permission to read the input file, etc.)
For further details, perhaps see also http://www.iki.fi/era/procmail/ which has an old "mini-FAQ" which covers several of the topics here, and a "quick reference" for looking up details of the syntax.
I'm not familiar with Virtualmin but https://docs.virtualmin.com/Webmin/PostgreSQL_Database_Server shows how to set up Postgres and as per https://docs.virtualmin.com/Webmin/Procmail_Mail_Filter I guess you will want to use the option to put this code in an include file.

Related

Perl interface with Aspell

I am trying to identify misspelled words with Aspell via Perl. I am working on a Linux server without administrator privileges which means I have access to Perl and Aspell but not, for example, Text::Aspell which is a Perl interface for Aspell.
I want to do the very simple task of passing a list of words to Aspell and having it return the words that are misspelled. If the words I want to check are "dad word lkjlkjlkj" I can do this through the command line with the following commands:
aspell list
dad word lkjlkjlkj
Aspell requires CTRL + D at the end to submit the word list. It would then return "lkjlkjlkj", as this isn't in the dictionary.
In order to do the exact same thing, but submitted via Perl (because I need to do this for thousands of documents) I have tried:
my $list = q(dad word lkjlkjlkj):
my #arguments = ("aspell list", $list, "^D");
my $aspell_out=`#arguments`;
print "Aspell output = $aspell_out\n";
The expected output is "Aspell output = lkjlkjlkj" because this is the output that Aspell gives when you submit these commands via the command line. However, the actual output is just "Aspell output = ". That is, Perl does not capture any output from Aspell. No errors are thrown.
I am not an expert programmer, but I thought this would be a fairly simple task. I've tried various iterations of this code and nothing works. I did some digging and I'm concerned that perhaps because Aspell is interactive, I need to use something like Expect, but I cannot figure out how to use it. Nor am I sure that it is actually the solution to my problem. I also think ^D should be an appropriate replacement for CTRL+D at the end of the commands, but all I know is it doesn't throw an error. I also tried \cd instead. Whatever it is, there is obviously an issue in either submitting the command or capturing the output.
The complication with using aspell out of a program is that it is an interactive and command-line driver tool, as you suspect. However, there is a simple way to do what you need.
In order to use aspell's command list one needs to pass it words via STDIN, as its man page says. While I find the GNU Aspell manual a little difficult to get going with, passing input to a program via its STDIN is easy enough and we can rewrite the invocation as
echo dad word lkj | aspell list
We get lkj printed back, as due. Now this can run out of a program just as it stands
my $word_list = q(word lkj good asdf);
my $cmd = qq(echo $word_list | aspell list);
my #aspell_out = qx($cmd);
print for #aspell_out;
This prints lines lkj and asdf.
I assemble the command in a string (as opposed to an array) for specific reasons, explained below. The qx is the operator form of backticks, which I prefer for its far superior readability.
Note that qx can return all output in a string, if in scalar context (assigned to a scalar for example), or in a list when in list context. Here I assign to an array so you get each word as an element (alas, each also comes with a newline, so may want to do chomp #aspell_out;).
Comment on a list vs string form of a command
I think that it's safe to recommend to use a list-form for a command, in general. So we'd say
my #cmd = ('ls', '-l', $dir); # to be run as an external command
instead of
my $cmd = "ls -l $dir"; # to be run as an external command
The list form generally makes it easier to manage the command, and it avoids the shell altogether.
However, this case is a little different
The qx operator doesn't really behave differently -- the array gets concatenated into a string, and that runs. The very fact that we can pass it an array is incidental, and not even documented
We need to pipe input to aspell's STDIN, and shell does that for us simply. We can use a shell with command's LIST form as well, but then we'd need to invoke it explicitly. We can also go for aspell's STDIN by means other than the shell but that's more complex
With a command in a list the command name must be the first word, so that "aspell list" from the question is wrong and it should fail (there is no command named that) ... except that in this case it wouldn't (if the rest were correct), since for qx the array gets collapsed into a string
Finally, apsell nicely exposes its API in a C library and that's been utilized for the module you mention. I'd suggest to install it as a user (no privileges needed) and use that.
You should take a step back and investigate if you can install Text::Aspell without administrator privilige. In most cases that's perfectly possible.
You can install modules into your home directory. If there is no C-compiler available on the server you can install the module on a compatible machine, compile and copy the files.

Variable not being recognized after "read"

-- Edit : Resolved. See answer.
Background:
I'm writing a shell that will perform some extra actions required on our system when someone resizes a database.
The shell is written in ksh (requirement), the OS is Solaris 5.10 .
The problem is with one of the checks, which verifies there's enough free space on the underlying OS.
Problem:
The check reads the df -k line for root, which is what I check in this step, and prints it to a file. I then "read" the contents into variables which I use in calculations.
Unfortunately, when I try to run an arithmetic operation on one of the variables, I get an error indicating it is null. And a debug output line I've placed after that line verifies that it is null... It lost it's value...
I've tried every method of doing this I could find online, they work when I run it manually, but not inside the shell file.
(* The file does have #!/usr/bin/ksh)
Code:
df -k | grep "rpool/ROOT" > dftest.out
RPOOL_NAME=""; declare -i TOTAL_SIZE=0; USED_SPACE=0; AVAILABLE_SPACE=0; AVAILABLE_PERCENT=0; RSIGN=""
read RPOOL_NAME TOTAL_SIZE USED_SPACE AVAILABLE_SPACE AVAILABLE_PERCENT RSIGN < dftest.out
\rm dftest.out
echo $RPOOL_NAME $TOTAL_SIZE $USED_SPACE $AVAILABLE_SPACE $AVAILABLE_PERCENT $RSIGN
((TOTAL_SIZE=$TOTAL_SIZE/1024))
This is the result:
DBResize.sh[11]: TOTAL_SIZE=/1024: syntax error
I'm pulling hairs at this point, any help would be appreciated.
The code you posted cannot produce the output you posted. Most obviously, the error is signalled at line 11 but you posted fewer than 11 lines of code. The previous lines may matter. Always post complete code when you ask for help.
More concretely, the declare command doesn't exist in ksh, it's a bash thing. You can achieve the same result with typeset (declare is a bash equivalent to typeset, but not all options are the same). Either you're executing this script with bash, or there's another error message about declare, or you've defined some additional commands including declare which may change the behavior of this code.
None of this should have an impact on the particular problem that you're posting about, however. The variables created by read remain assigned until the end of the subshell, i.e. until the code hits a ), the end of a pipe (left-hand side of the pipe only in ksh), etc.
About the use of declare or typeset, note that you're only declaring TOTAL_SIZE as an integer. For the other variables, you're just assigning a value which happens to consist exclusively of digits. It doesn't matter for the code you posted, but it's probably not what you meant.
One thing that may be happening is that grep matches nothing, and therefore read reads an empty line. You should check for errors. Use set -e in scripts to exit at the first error. (There are cases where set -e doesn't catch errors, but it's a good start.)
Another thing that may be happening is that df is splitting its output onto multiple lines because the first column containing the filesystem name is too large. To prevent this splitting, pass the option -P.
Using a temporary file is fragile: the code may be executed in a read-only directory, another process may want to access the same file at the same time... Here a temporary file is useless. Just pipe directly into read. In ksh (unlike most other sh variants including bash), the right-hand side of a pipe runs in the main shell, so assignments to variables in the right-hand side of a pipe remain available in the following commands.
It doesn't matter in this particular script, but you can use a variable without $ in an arithmetic expression. Using $ substitutes a string which can have confusing results, e.g. a='1+2'; $((a*3)) expands to 7. Not using $ uses the numerical value (in ksh, a='1+2'; $((a*3)) expands to 9; in some sh implementations you get an error because a's value is not numeric).
#!/usr/bin/ksh
set -e
typeset -i TOTAL_SIZE=0 USED_SPACE=0 AVAILABLE_SPACE=0 AVAILABLE_PERCENT=0
df -Pk | grep "rpool/ROOT" | read RPOOL_NAME TOTAL_SIZE USED_SPACE AVAILABLE_SPACE AVAILABLE_PERCENT RSIGN
echo $RPOOL_NAME $TOTAL_SIZE $USED_SPACE $AVAILABLE_SPACE $AVAILABLE_PERCENT $RSIGN
((TOTAL_SIZE=TOTAL_SIZE/1024))
Strange...when I get rid of your "declare" line, your original code seems to work perfectly well (at least with ksh on Linux)
The code :
#!/bin/ksh
df -k | grep "/home" > dftest.out
read RPOOL_NAME TOTAL_SIZE USED_SPACE AVAILABLE_SPACE AVAILABLE_PERCENT RSIGN < dftest.out
\rm dftest.out
echo $RPOOL_NAME $TOTAL_SIZE $USED_SPACE $AVAILABLE_SPACE $AVAILABLE_PERCENT $RSIGN
((TOTAL_SIZE=$TOTAL_SIZE/1024))
print $TOTAL_SIZE
The result :
32962416 5732492 25552588 19% /home
5598
Which are the value a simple df -k is returning. The variables seem to last.
For those interested, I have figured out that it is not possible to use "read" the way I was using it.
The variable values assigned by "read" simply "do not last".
To remedy this, I have applied the less than ideal solution of using the standard "while read" format, and inside the loop, echo selected variables into a variable file.
Once said file was created, I just "loaded" it.
(pseudo code:)
LOOP START
echo "VAR_A="$VAR_A"; VAR_B="$VAR_B";" > somefile.out
LOOP END
. somefile.out

cURL filling up an HTML form with tcl

I need to make a Tcl program that logs into a web page and i need to fill up some information and get some information.
The page has lots of forms with diferent types of input, radio/check buttons, entry strings etc the usual.
i can log into the page no problem and fill up the forms without a problem but i have to fill EVERY input for that particular form or else it will be save as empty (the things i didnt specify)
Heres an example:
this is the form:
--- FORM report. Uses POST to URL "/goform/FormUpdateBridgeConfiguration"
Input: NAME="management_ipaddr" (TEXT)
Input: NAME="management_mask" (TEXT)
Input: NAME="upstr_addr_type" VALUE="DHCP" (RADIO)
Input: NAME="upstr_addr_type" VALUE="STATIC" (RADIO)
--- end of FORM
and this is the command i use to fill it up
eval exec curl $params -d upstr_addr_type=STATIC https://$MIP/goform/FormUpdateBridgeConfiguration -o /dev/null 2> /dev/null
where params is:
"\--noproxy $MIP \--connect-timeout 5 \-m 5 \-k \-S \-s \-d \-L \-b Data/curl_cookie_file "
yes i know is horrible but it is what it is .
In this case i want to change the value of upstr_addr_type to STATIC but when i sumit it i lose the info from management_ipaddr and management_mask.
This is a small example, i have to do this for every form and a gizillion more variables so its a real problem for me.
i figure its concept problem or something like that, i look and look and look some more, try -F -X GET -GET -almost every thing on cURL manual, can someone guide me here
If you know what the values of management_ipaddr and management_mask should be, you can just supply them as extra -d arguments. It probably makes sense to wrap this in a procedure
proc UpdateBridgeConfiguration {management_ipaddr management_mask upstr_addr_type} {
global MIP params
eval exec curl $params \
-d management_ipaddr=$management_ipaddr \
-d management_mask=$management_mask \
-d upstr_addr_type=$upstr_addr_type \
"https://$MIP/goform/FormUpdateBridgeConfiguration" \
-o /dev/null 2> /dev/null
# You ought to replace the first line of the above call with:
# exec curl {*}$params \
# Provided you're not on Tcl 8.4 or before...
}
Like that, you'll find it much easier to get the call correct. (You shouldn't need to specify -X POST for this; it's default behaviour when -d is provided.)
To get the existing values, you'll need to GET them from the right URL (which I can't guess for you) and extract them from the resulting HTML. Which might involve using a regular expression against the retrieved document. This is pretty awful, but it's what you're stuck with sometimes. (You can use tDOM to parse HTML properly — provided it isn't too ill-formed — and then use its XPath support to query for the values correctly, but that's rather more complex and introduces a dependency on an external package.) Knowing what the right RE to use is can be tricky, but it is likely to involve grabbing a copy of the form and doing something vaguely like this:
regexp -nocase {<input type="text" name="management_ipaddr" value="([^<"">]*)"} $formSource -> management_ipaddr
regexp -nocase {<input type="text" name="management_mask" value="([^<"">]*)"} $formSource -> management_mask
While in general it could be encoded all sorts of ways, that's very unlikely for IP addresses or masks! On the other hand, the order of the attributes can vary; you have to customize your RE to what you're really dealing with, not merely what it might be…
The curl invokation will be something like
set formSource [exec curl "http://$MIP/the/right/url/here" {*}$params 2>/dev/null]
It's much simpler when you're not having to send data up and you want to consume the result.

updating table rows based on txt file

I have been searching but so far I only found how to insert date into tables based on a csv files.
I have the following scenario:
Directory name = ticketID
Inside this directory I have a couple of files, like:
Description.txt
Summary.txt - Contains ticket header and has been imported succefully.
Progress_#.txt - this is everytime a ticket gets udpdated. I get a new file.
Solution.txt
Importing the Issue.txt was easy since this was actually a CSV.
Now my problem is with Description and Progress files.
I need to update the existing rows with the data from this files. Something on the line of
update table_ticket set table_ticket.description = Description.txt where ticket_number = directoryname
I'm using PostgreSQL and the COPY command is valid for new data and it would still fail due to the ',;/ special chars.
I wanted to do this using bash script, but it seem that it is it won't be possible:
for i in `find . -type d`
do
update table_ticket
set table_ticket.description = $i/Description.txt
where ticket_number = $i
done
Of course the above code would take into consideration connection to the database.
Anyone has a idea on how I could achieve this using shell script. Or would it be better to just make something in Java and read and update the record, although I would like to avoid this approach.
Thanks
Alex
Thanks for the answer, but I came across this:
psql -U dbuser -h dbhost db
\set content = `cat PATH/Description.txt`
update table_ticket set description = :'content' where ticketnr = TICKETNR;
Putting this into a simple script I created the following:
#!/bin/bash
for i in `find . -type d|grep ^./CS`
do
p=`echo $i|cut -b3-12 -`
echo $p
sed s/PATH/${p}/g cmd.sql > cmd.tmp.sql
ticketnr=`echo $p|cut -b5-10 -`
sed -i s/TICKETNR/${ticketnr}/g cmd.tmp.sql
cat cmd.tmp.sql
psql -U supportAdmin -h localhost supportdb -f cmd.tmp.sql
done
The downside is that it will create always a new connection, later I'll change to create a single file
But it does exactly what I was looking for, putting the contents inside a single column.
psql can't read the file in for you directly unless you intend to store it as a large object in which case you can use lo_import. See the psql command \lo_import.
Update: #AlexandreAlves points out that you can actually slurp file content in using
\set myvar = `cat somefile`
then reference it as a psql variable with :'myvar'. Handy.
While it's possible to read the file in using the shell and feed it to psql it's going to be awkward at best as the shell offers neither a native PostgreSQL database driver with parameterised query support nor any text escaping functions. You'd have to roll your own string escaping.
Even then, you need to know that the text encoding of the input file is valid for your client_encoding otherwise you'll insert garbage and/or get errors. It quickly lands up being easier to do it in a langage with proper integration with PostgreSQL like Python, Perl, Ruby or Java.
There is a way to do what you want in bash if you really must, though: use Pg's delimited dollar quoting with a randomized delimiter to help prevent SQL injection attacks. It's not perfect but it's pretty darn close. I'm writing an example now.
Given problematic file:
$ cat > difficult.txt <__END__
Shell metacharacters like: $!(){}*?"'
SQL-significant characters like "'()
__END__
and sample table:
psql -c 'CREATE TABLE testfile(filecontent text not null);'
You can:
#!/bin/bash
filetoread=$1
sep=$(printf '%04x%04x\n' $RANDOM $RANDOM)
psql <<__END__
INSERT INTO testfile(filecontent) VALUES (
\$x${sep}\$$(cat ${filetoread})\$x${sep}\$
);
__END__
This could be a little hard to read and the random string generation is bash specific, though I'm sure there are probably portable approaches.
A random tag string consisting of alphanumeric characters (I used hex for convenience) is generated and stored in seq.
psql is then invoked with a here-document tag that isn't quoted. The lack of quoting is important, as <<'__END__' would tell bash not to interpret shell metacharacters within the string, wheras plain <<__END__ allows the shell to interpret them. We need the shell to interpret metacharacters as we need to substitute sep into the here document and also need to use $(...) (equivalent to backticks) to insert the file text. The x before each substitution of seq is there because here-document tags must be valid PostgreSQL identifiers so they must start with a letter not a number. There's an escaped dollar sign at the start and end of each tag because PostgreSQL dollar quotes are of the form $taghere$quoted text$taghere$.
So when the script is invoked as bash testscript.sh difficult.txt the here document lands up expanding into something like:
INSERT INTO testfile(filecontent) VALUES (
$x0a305c82$Shell metacharacters like: $!(){}*?"'
SQL-significant characters like "'()$x0a305c82$
);
where the tags vary each time, making SQL injection exploits that rely on prematurely ending the quoting difficult.
I still advise you to use a real scripting language, but this shows that it is indeed possible.
The best thing to do is to create a temporary table, COPY those from the files in question, and then run your updates.
Your secondary option would be to create a function in a language like pl/perlu and do this in the stored procedure, but you will lose a lot of performance optimizations that you can do when you update from a temp table.

Same piped call to isql works on Solaris but not on RHEL

Background: I need to port a ksh script from SunOS 5.10 to RHEL 5.8. It makes a call to isql to retrieve some data and, quite contrary to the intended application of final endpoint client utilities such as isql, it parses its out to be used by a variable in the shell script. Please note that I just inherited this and by no means did design such a hack myself. I certainly never would be parsing isql out to assign value to a var in shell -- if the script needed that info, I would use Perl with some API like DBD::DBI that is designed to marshall data between the application and the data store. But I have what I have and must work within the parameters.
What is happening is that the following piped input does return data on SunOS but not in RHEL:
echo "SELECT some_field FROM some_table WHERE some_crtra = 'X' \ngo" | isql -U$USER -P$PASS -D$DB -S$SERVER
That output on Solaris being:
some_field
------
Y
(1 row affected)
From that point, the script uses awk to extract just the field value from the above stream but let's ignore that because that's not the problem.
Also please note that I am able to get the data executing the piped commands separately, i.e. by going manually into isql and running the SQL. So the SQL or the connection string are not the problem -- it is either how the piping streams data OR isql itself works differently on the different platforms.
Can anybody see why there is disparate response to the same input on the two systems? Any idea how I can change the piping to make it work?
Thanks
echo "SELECT some_field FROM some_table WHERE some_crtra = 'X' \ngo"
is non portable.
I would suggest instead:
printf "SELECT some_field FROM some_table WHERE some_crtra = 'X' \ngo\n"
From the ksh93 manual page:
When the first arg does not begin with a -, and none of the
arguments contain a \, then echo prints each of its arguments
separated by a space and terminated by a new-line. Otherwise,
the behavior of echo is system dependent and print or printf
described below should be used. See echo(1) for usage and
description.