how to get list of POSIX group members in Perl - perl

Is there any way to get a list of all the members of a POSIX group in Perl?
I can't use getgrent() and similar because it returns the list as a space delimited string, and some usernames can have spaces in them.
I have to handle spaces in user and group names, because I'm working in an AD environment that other organizations can create users and groups in, so I'm trying to account for possible edge cases.

I'd say just use getgrent() and don't worry about spaces.
It may be possible to create a user name with one or more spaces in it, perhaps by manually editing /etc/passwd, but it's going to cause other problems as well. For example, ~foo is foo's home directory, but ~foo bar isn't foo bar's home directory.
On Linux, the useradd and adduser commands don't even permit spaces in file names. On Linux Mint 14 (based on Ubuntu 12.10):
$ sudo adduser 'foo bar'
adduser: To avoid problems, the username should consist only of
letters, digits, underscores, periods, at signs and dashes, and not start with
a dash (as defined by IEEE Std 1003.1-2001). For compatibility with Samba
machine accounts $ is also supported at the end of the username
$ sudo useradd !$
sudo useradd 'foo bar'
useradd: invalid user name 'foo bar'
$
Do you actually have user names with spaces on your system?
UPDATE: I've found that it actually is possible to create user names with spaces. useradd and adduser don't allow it (and you should be using one of those commands, or something similar, to create new accounts). But if I manually edit /etc/passwd using sudo vipw, I can create a user named foo bar, and I can do:
su - 'foo bar'
ssh 'foo bar#localhost'
etc. But it's a Really Bad Idea. Perl's getgr*() cannot tell whether a group contains one entry for foo bar or two entries for foo and bar (which is what you're asking about), and I can't use the shell's ~name syntax to refer to the account's home directory. I could use other methods to get both pieces of information, but it's much easier to avoid creating such an account in the first place.
If you're seriously concerned about some admin being foolish enough to create such an account, then you can use some of the alternative methods that have been discussed. But as I said, I don't think it's worth the effort.
(Perl could have avoided this problem by delimiting the list with : characters rather than spaces, since those are actually incompatible with the format of /etc/passwd and /etc/group, which the system depends on. But it's too late to change it now.)
UPDATE 2:
As you say in a comment (which I've edited into your question):
I have to handle spaces in user and group names, because I'm working in an AD environment that other organizations can create users and groups in, so I'm trying to account for possible edge cases.
Your solution from the same comment:
map((getgrgrid($_))[0], split(/ /, `id -G $username`))
is probably the best workaround. (id -G prints numeric group ids; which obviously can't contain spaces.)
It's probably also worth checking whether you actually have user or group names with spaces in them (though of course that doesn't guard against such names being added in the future). I wonder how your POSIX system actually deals with such names. I wouldn't be astonished if they're automatically translates them somehow. Even so, your id -G solution will still work.

If I have /etc/group:
...
postgres:x:26:
fsniper:x:481:
clamupdate:x:480:
some spacey group:x:482:saml, some spacey user
I can use the following commands to see this group's members:
% getent group
...
postgres:x:26:
fsniper:x:481:
clamupdate:x:480:
some spacey group:x:482:saml,some spacey user
Or if you know the specific group that you're interested in:
% getent group "some spacey group"
some spacey group:x:482:saml,some spacey user
These could be wrapped inside of a Perl script like this:
#!/usr/bin/perl
use feature qw(say);
chomp (my $getent = `getent group "some spacey group" | sed 's/.*://'`);
my #users = split(/,/, $getent);
foreach my $i (#users) { say $i; }
Running it:
% ./b.pl
saml
some spacey user
Resources
How to list all users in a Linux group?

Related

Any way to filter email dynamically by taking 'from' and matching it with database ? (Using procmail or Virtualmin or Webmin)

I basically want to check the incoming 'From' in the email received and then either
Keep it and make it deliver to the intended mailbox if the email matches a Specified MySQL/PostgreSQL
Database User (eg. select email from users where exists ('from email address') )
If the 'From' address is blank or it is not found in the database, the email should be discarded
Any way I can achieve this before the e-mail is delivered to the intended mailbox?
I am using Procmail + Virtualmin + Webmin + PostgreSQL
PS: I want to apply this filter not to the wole server but to some specified mailboxes/users (i'm assuming 1 user = 1 mailbox here)
Procmail can easily run an external command in a condition and react to its exit status. How exactly to make your particular SQL client set its exit code will depend on which one you are using; perhaps its man page will reveal an option to make it exit with an error when a query produces an empty result set, for example? Or else write a shell wrapper to look for empty output.
A complication is that Procmail (or rather, the companion utility formail) can easily extract a string from e.g. the From: header; but you want to reduce this to just the email terminus. This is a common enough task that it's easy to find a canned solution - generate a reply and then extract the To: address (sic!) from that.
FROM=`formail -rtzxTo:`
:0
* FROM ?? ^(one#example\.com|two#site\.example\.net|third#example\.org)$
{
:0
* ? yoursql --no-headers --fail-if-empty-result \
--batch --query databasename \
--eval "select yada yada where address = '$FROM'"
{ }
:0E
/dev/null
}
The first condition examines the variable and succeeds if it contains one of the addresses (my original answer simply had the regex . which matches if the string contains at least one character, any character; I'm not convinced this is actually necessary or useful; there should be no way for From: to be empty). If it is true, Procmail enters the braces; if not, they will be skipped.
The first recipe inside the braces runs an external command and examines its exit code. I'm imagining your SQL client is called yoursql and that it has options to turn off human-friendly formatting (table headers etc) and for running a query directly from the command line on a specific database. We use double quotes so that the shell will interpolate the variable FROM before running this command (maybe there is a safer way to pass string variables which might contain SQL injection attempts with something like --variable from="$FROM" and then use that variable in the query? See below.)
If there is no option to directly set the exit code, but you can make sure standard output is completely empty in the case of no result, piping the command to grep -q . will produce the correct exit code. In a more complex case, maybe write a simple Awk script to identify an empty result set and set its exit status accordingly.
Scraping together information from
https://www.postgresql.org/docs/current/app-psql.html,
How do you use script variables in psql?,
Making an empty output from psql,
and from your question, I end up with the following attempt to implement this in psql; but as I don't have a Postgres instance to test with, or any information about your database schema, this is still approximate at best.
* ? psql --no-align --tuples-only --quiet \
--dbname=databasename --username=something --no-password \
--variable=from="$FROM" \
--command="select email from users where email = :'from'" \
| grep -q .
(We still can't use single quotes around the SQL query, to completely protect it from the shell, because Postgres insists on single quotes around :'from', and the shell offers no facility for embedding literal single quotes inside single quotes.)
The surrounding Procmail code should be reasonably self-explanatory, but here goes anyway. In the first recipe inside the braces, if the condition is true, the empty braces in its action line are a no-op; the E flag on the next recipe is a condition which is true only if any of the conditions on the previous recipe failed. This is a common idiom to avoid having to use a lot of negations; perhaps look up "de Morgan's law". The net result is that we discard the message by delivering it to /dev/null if either condition in the first recipe failed; and otherwise, we simply pass it through, and Procmail will eventually deliver it to its default destination.
The recipe was refactored in response to updates to your question; perhaps now it would make more sense to just negate the exit code from psql with a ! in front:
FROM=`formail -rtzxTo:`
:0
* FROM ?? ^(one#example\.com|two#site\.example\.net|third#example\.org)$
* ! ? psql --no-align --tuples-only --quiet \
--dbname=databasename --username=something --no-password \
--variable=from="$FROM" \
--command="select email from users where email = :'from'" \
| grep -q .
/dev/null
Tangentially, perhaps notice how Procmail's syntax exploits the fact that a leading ? or a doubled ?? are not valid in regular expressions. So the parser can unambiguously tell that these conditions are not regular expressions; they compare a variable to the regex after ??, or examine the exit status of an external command, respectively. There are a few other special conditions like this in Procmail; arguably, all of them are rather obscure.
Newcomers to shell scripting should also notice that each shell command pipeline has two distinct results: whatever is being printed on standard output, and, completely separate from that, an exit code which reveals whether or not the command completed successfully. (Conventionally, a zero exit status signals success, and anything else is an error. For example, the exit status from grep is 0 if it finds at least one match, 1 if it doesn't, and usually some other nonzero exit code if you passed in an invalid regular expression, or you don't have permission to read the input file, etc.)
For further details, perhaps see also http://www.iki.fi/era/procmail/ which has an old "mini-FAQ" which covers several of the topics here, and a "quick reference" for looking up details of the syntax.
I'm not familiar with Virtualmin but https://docs.virtualmin.com/Webmin/PostgreSQL_Database_Server shows how to set up Postgres and as per https://docs.virtualmin.com/Webmin/Procmail_Mail_Filter I guess you will want to use the option to put this code in an include file.

Perl interface with Aspell

I am trying to identify misspelled words with Aspell via Perl. I am working on a Linux server without administrator privileges which means I have access to Perl and Aspell but not, for example, Text::Aspell which is a Perl interface for Aspell.
I want to do the very simple task of passing a list of words to Aspell and having it return the words that are misspelled. If the words I want to check are "dad word lkjlkjlkj" I can do this through the command line with the following commands:
aspell list
dad word lkjlkjlkj
Aspell requires CTRL + D at the end to submit the word list. It would then return "lkjlkjlkj", as this isn't in the dictionary.
In order to do the exact same thing, but submitted via Perl (because I need to do this for thousands of documents) I have tried:
my $list = q(dad word lkjlkjlkj):
my #arguments = ("aspell list", $list, "^D");
my $aspell_out=`#arguments`;
print "Aspell output = $aspell_out\n";
The expected output is "Aspell output = lkjlkjlkj" because this is the output that Aspell gives when you submit these commands via the command line. However, the actual output is just "Aspell output = ". That is, Perl does not capture any output from Aspell. No errors are thrown.
I am not an expert programmer, but I thought this would be a fairly simple task. I've tried various iterations of this code and nothing works. I did some digging and I'm concerned that perhaps because Aspell is interactive, I need to use something like Expect, but I cannot figure out how to use it. Nor am I sure that it is actually the solution to my problem. I also think ^D should be an appropriate replacement for CTRL+D at the end of the commands, but all I know is it doesn't throw an error. I also tried \cd instead. Whatever it is, there is obviously an issue in either submitting the command or capturing the output.
The complication with using aspell out of a program is that it is an interactive and command-line driver tool, as you suspect. However, there is a simple way to do what you need.
In order to use aspell's command list one needs to pass it words via STDIN, as its man page says. While I find the GNU Aspell manual a little difficult to get going with, passing input to a program via its STDIN is easy enough and we can rewrite the invocation as
echo dad word lkj | aspell list
We get lkj printed back, as due. Now this can run out of a program just as it stands
my $word_list = q(word lkj good asdf);
my $cmd = qq(echo $word_list | aspell list);
my #aspell_out = qx($cmd);
print for #aspell_out;
This prints lines lkj and asdf.
I assemble the command in a string (as opposed to an array) for specific reasons, explained below. The qx is the operator form of backticks, which I prefer for its far superior readability.
Note that qx can return all output in a string, if in scalar context (assigned to a scalar for example), or in a list when in list context. Here I assign to an array so you get each word as an element (alas, each also comes with a newline, so may want to do chomp #aspell_out;).
Comment on a list vs string form of a command
I think that it's safe to recommend to use a list-form for a command, in general. So we'd say
my #cmd = ('ls', '-l', $dir); # to be run as an external command
instead of
my $cmd = "ls -l $dir"; # to be run as an external command
The list form generally makes it easier to manage the command, and it avoids the shell altogether.
However, this case is a little different
The qx operator doesn't really behave differently -- the array gets concatenated into a string, and that runs. The very fact that we can pass it an array is incidental, and not even documented
We need to pipe input to aspell's STDIN, and shell does that for us simply. We can use a shell with command's LIST form as well, but then we'd need to invoke it explicitly. We can also go for aspell's STDIN by means other than the shell but that's more complex
With a command in a list the command name must be the first word, so that "aspell list" from the question is wrong and it should fail (there is no command named that) ... except that in this case it wouldn't (if the rest were correct), since for qx the array gets collapsed into a string
Finally, apsell nicely exposes its API in a C library and that's been utilized for the module you mention. I'd suggest to install it as a user (no privileges needed) and use that.
You should take a step back and investigate if you can install Text::Aspell without administrator privilige. In most cases that's perfectly possible.
You can install modules into your home directory. If there is no C-compiler available on the server you can install the module on a compatible machine, compile and copy the files.

How to read a large number of LDAP entries in perl?

I already have an LDAP script in order to read LDAP user information one by one. My problem is that I am returning all users found in Active Directory. This will not work because currently our AD has around 100,000 users causing the script to crash due to memory limitations.
What I was thinking of doing was to try to process users by batches of X amount of users and if possible, using threads in order to process some users in parallel. The only thing is that I have just started using Perl, so I was wondering if anyone could give me a general idea of how to do this.
If you can get the executable ldapsearch to work in your environment (and it does work in *nix and Windows, although the syntax is often different), you can try something like this:
my $LDAP_SEARCH = "ldapsearch -h $LDAP_SERVER -p $LDAP_PORT -b $BASE -D uid=$LDAP_USERNAME -w $LDAP_PASSWORD -LLL";
my #LDAP_FIELDS = qw(uid mail Manager telephoneNumber CostCenter NTLogin displayName);
open (LDAP, "-|:utf8", "$LDAP_SEARCH \"$FILTER\" " . join(" ", #LDAP_FIELDS));
while (<LDAP>) {
# process each LDAP response
}
I use that to read nearly 100K LDAP entries without memory problems (although it still takes 30 minutes or more). You'll need to define $FILTER (or leave it blank) and of course all the LDAP server/username/password pieces.
If you want/need to do a more pure-Perl version, I've had better luck with Net::LDAP instead of Net::LDAP::Express, especially for large queries.

How do you search by DN in LDAP?

I'm pulling information about a user from LDAP. This includes directReports, which is in the full CN=cnBlah, OU=ouBlah, DC=dcBlah form. I'm trying to do another lookup to find info about the reportee.
So far the only way I've been able to actually find said user is to break out the CN= and set the remainder of the string as the base.
Is this the proper way of doing it? Or is there a way to search for an entry given the full DN?
Use the DN as the base object in the search and set the scope of the search to base.
Calling ldapsearch with the -f option would do pretty much what you want.
Save your first search results to a file, with only the value of the cn attribute. For example, your file would look like this :
users.txt:
user1
user2
cnBlah
john
jim
user883
Then call ldapsearch with a base that is high enough to encompass all users. This could be -b dc=users,dc=example,dc=com.
So if you saved your user list to a file named users.txt, your ldapsearch command line would look like this :
#I removed the hostname, port and authentification for clarity
ldapsearch -b "dc=users,dc=example,dc=com" -s sub "cn=%s" -f users.txt -LLL
Long lines will wrap at ~76 characters. Nothing that a pipe through perl -p00e 's/\r?\n //g' can't fix. (Or just add option -o ldif-wrap=no to your ldapsearch commandline.)
Closing the loop on this question, courtesy of https://www.openldap.org/lists/openldap-software/200503/msg00520.html
When you know the DN of an entry, there is no need to "search" for it all, just retrieve the entry directly:
ldapsearch -x -LLL -b "uid=droy,ou=people,dc=eclipse,dc=org"
So that answers the "how do you use ldapsearch to lookup() an item rather than search for it"

updating table rows based on txt file

I have been searching but so far I only found how to insert date into tables based on a csv files.
I have the following scenario:
Directory name = ticketID
Inside this directory I have a couple of files, like:
Description.txt
Summary.txt - Contains ticket header and has been imported succefully.
Progress_#.txt - this is everytime a ticket gets udpdated. I get a new file.
Solution.txt
Importing the Issue.txt was easy since this was actually a CSV.
Now my problem is with Description and Progress files.
I need to update the existing rows with the data from this files. Something on the line of
update table_ticket set table_ticket.description = Description.txt where ticket_number = directoryname
I'm using PostgreSQL and the COPY command is valid for new data and it would still fail due to the ',;/ special chars.
I wanted to do this using bash script, but it seem that it is it won't be possible:
for i in `find . -type d`
do
update table_ticket
set table_ticket.description = $i/Description.txt
where ticket_number = $i
done
Of course the above code would take into consideration connection to the database.
Anyone has a idea on how I could achieve this using shell script. Or would it be better to just make something in Java and read and update the record, although I would like to avoid this approach.
Thanks
Alex
Thanks for the answer, but I came across this:
psql -U dbuser -h dbhost db
\set content = `cat PATH/Description.txt`
update table_ticket set description = :'content' where ticketnr = TICKETNR;
Putting this into a simple script I created the following:
#!/bin/bash
for i in `find . -type d|grep ^./CS`
do
p=`echo $i|cut -b3-12 -`
echo $p
sed s/PATH/${p}/g cmd.sql > cmd.tmp.sql
ticketnr=`echo $p|cut -b5-10 -`
sed -i s/TICKETNR/${ticketnr}/g cmd.tmp.sql
cat cmd.tmp.sql
psql -U supportAdmin -h localhost supportdb -f cmd.tmp.sql
done
The downside is that it will create always a new connection, later I'll change to create a single file
But it does exactly what I was looking for, putting the contents inside a single column.
psql can't read the file in for you directly unless you intend to store it as a large object in which case you can use lo_import. See the psql command \lo_import.
Update: #AlexandreAlves points out that you can actually slurp file content in using
\set myvar = `cat somefile`
then reference it as a psql variable with :'myvar'. Handy.
While it's possible to read the file in using the shell and feed it to psql it's going to be awkward at best as the shell offers neither a native PostgreSQL database driver with parameterised query support nor any text escaping functions. You'd have to roll your own string escaping.
Even then, you need to know that the text encoding of the input file is valid for your client_encoding otherwise you'll insert garbage and/or get errors. It quickly lands up being easier to do it in a langage with proper integration with PostgreSQL like Python, Perl, Ruby or Java.
There is a way to do what you want in bash if you really must, though: use Pg's delimited dollar quoting with a randomized delimiter to help prevent SQL injection attacks. It's not perfect but it's pretty darn close. I'm writing an example now.
Given problematic file:
$ cat > difficult.txt <__END__
Shell metacharacters like: $!(){}*?"'
SQL-significant characters like "'()
__END__
and sample table:
psql -c 'CREATE TABLE testfile(filecontent text not null);'
You can:
#!/bin/bash
filetoread=$1
sep=$(printf '%04x%04x\n' $RANDOM $RANDOM)
psql <<__END__
INSERT INTO testfile(filecontent) VALUES (
\$x${sep}\$$(cat ${filetoread})\$x${sep}\$
);
__END__
This could be a little hard to read and the random string generation is bash specific, though I'm sure there are probably portable approaches.
A random tag string consisting of alphanumeric characters (I used hex for convenience) is generated and stored in seq.
psql is then invoked with a here-document tag that isn't quoted. The lack of quoting is important, as <<'__END__' would tell bash not to interpret shell metacharacters within the string, wheras plain <<__END__ allows the shell to interpret them. We need the shell to interpret metacharacters as we need to substitute sep into the here document and also need to use $(...) (equivalent to backticks) to insert the file text. The x before each substitution of seq is there because here-document tags must be valid PostgreSQL identifiers so they must start with a letter not a number. There's an escaped dollar sign at the start and end of each tag because PostgreSQL dollar quotes are of the form $taghere$quoted text$taghere$.
So when the script is invoked as bash testscript.sh difficult.txt the here document lands up expanding into something like:
INSERT INTO testfile(filecontent) VALUES (
$x0a305c82$Shell metacharacters like: $!(){}*?"'
SQL-significant characters like "'()$x0a305c82$
);
where the tags vary each time, making SQL injection exploits that rely on prematurely ending the quoting difficult.
I still advise you to use a real scripting language, but this shows that it is indeed possible.
The best thing to do is to create a temporary table, COPY those from the files in question, and then run your updates.
Your secondary option would be to create a function in a language like pl/perlu and do this in the stored procedure, but you will lose a lot of performance optimizations that you can do when you update from a temp table.