How can I open multiple attachments of the same name in an email, then move the sender of the attachment to a spreadsheet? - perl

I have an internship and was recently assigned the tedious task of cleaning the email lists. My employer has sent me a series of emails with email bounces as attachments, many at a time, all with the same name. I have been looking for the most efficient way to do this, because I want to avoid clicking through every attachment by hand. My first thought was to create a macro in AutoHotkey's language, but I suspect a batch file or some Perl might do the same thing. Could anybody give me an idea as to how to do this, specifically with a batch file? Thanks in advance!

Mail::DeliveryStatus::BounceParser parses bouncing email addresses out of delivery report messages.

If you don't know any perl, then I recommend that you first convert the mailbox into some format that stores each email in separate text files, like MH or similar.
At that point, you can run the command grep _pattern_ * | sed -e 's/:.*//' | sort | uniq > _list_ in the mail directory to obtain a list of all files matching _pattern_ (grep -l _pattern_ * > _list_ produces the same list more directly). You may inspect/edit this file _list_ to verify that the desired results were obtained.
You may then create another directory, junk or whatever, and move all the files listed in _list_ into junk with a command like perl -ne 'chomp; rename($_, "junk/$_");' < _list_.
If you'll need this regularly, then you could automate this further, likely using perl alone, but a one off task will probably involve more messing about with getting the right message list.
Alternatively, you could load all the emails into a single folder in a sane mail reader, like Mac OS X's Mail.app, and simply use its search, select all, and move/delete commands.
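For the spreadsheet part of the question, here is a minimal Python sketch rather than a batch file or Perl. It assumes each email has been saved as its own file in one directory (as suggested above) and that the bounces are standard delivery status notifications carrying a Final-Recipient field. The function names, the directory layout and the CSV column names are all hypothetical.

```python
import csv
import email
from pathlib import Path


def bounced_addresses(raw):
    """Return every Final-Recipient address found in one raw message (bytes)."""
    msg = email.message_from_bytes(raw)
    found = []
    # walk() also descends into attached messages and into the header
    # blocks of message/delivery-status parts.
    for part in msg.walk():
        value = part.get("Final-Recipient")
        if value:
            # The field looks like "rfc822; user@example.com".
            found.append(str(value).split(";", 1)[-1].strip())
    return found


def export_bounces(message_dir, csv_path):
    """Scan every file in message_dir and write unique addresses to csv_path."""
    unique = {}  # lower-cased address -> first file it was seen in
    for path in sorted(Path(message_dir).iterdir()):
        if path.is_file():
            for address in bounced_addresses(path.read_bytes()):
                unique.setdefault(address.lower(), path.name)
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["address", "source_file"])
        writer.writerows(unique.items())
    return len(unique)
```

Calling export_bounces("msgs", "bounces.csv") would produce a file that any spreadsheet program can open. Bounces that do not follow the delivery status format will be missed, which is the case a dedicated parser such as Mail::DeliveryStatus::BounceParser is meant to handle.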

Related

Splitting Emails with MIME::Parser

I got handed 4GB of emails concatenated into a single file and the suggestion that MIME::Parser could split the individual emails back out again. All my attempts to date end up with the parser just copying the original file without extracting any of the emails. So: Is this even something that MIME::Parser can handle? My code is very basic:
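use strict;
use warnings;
use IO::File;      # also exports the O_RDONLY constant
use MIME::Parser;

my $file = IO::File->new("somefile", O_RDONLY) or die "Cannot open somefile: $!";
my $parser = MIME::Parser->new;
$parser->output_dir("somedir");   # extracted parts are written here
my $entity = $parser->parse($file);
$file->close;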
Below is a link to sample data that some have requested. This is all SPAM and phishing emails. DO NOT CLICK ANY OF THE LINKS. Enjoy: Pastebin of 4KB of emails.
MIME::Parser is for reading a single mail to get the attachments etc. It can be used to extract mails which are attached inside another mail as message/rfc822, but it is not intended to extract mails from some kind of archive with lots of mails concatenated in it.
It is not clear what format your single file with mails has. But if it comes from a UNIX system or from a Thunderbird installation it might simply be in the classical Mbox format and there are several tools to split Mbox files into separate messages. Apart from several perl modules there are also other tools like git-mailsplit which help you extract the mails from Mbox-format.
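If the file does turn out to be in Mbox format, splitting it can be sketched with Python's standard mailbox module. This is only an illustration under that assumption; the function name, the output directory and the numbered .eml file names are made up for the example.

```python
import mailbox
from pathlib import Path


def split_mbox(mbox_path, out_dir):
    """Write each message of an Mbox file to its own numbered .eml file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for count, key in enumerate(box.iterkeys(), start=1):
            # get_bytes() returns the stored message without re-serializing
            # it, so malformed spam is copied through as it is.
            (out / f"{count:06d}.eml").write_bytes(box.get_bytes(key))
    finally:
        box.close()
    return count
```

The module scans the whole file once to locate the "From " separator lines and then reads one message at a time, so the 4GB file should not need to fit in memory, although the initial scan may take a while.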

Check email in shell. And make it look good

I want to output the email information as
(Who it's from)
(Subject)
Here's the code I found on the internet:
#!/bin/sh
curl -u username --silent "https://mail.google.com/mail/feed/atom" | perl -ne 'print "\t" if /<name>/; print "$2\n" if /<(title|name)>(.*)<\/\1>/;'
However, this isn't the format I want, and I don't know any Perl.
I was wondering if I could either store the Perl output in a shell variable so I can edit the layout, or write it to a text file and edit it there. By edit I mean change the order so that it is name, subject rather than subject, name. Any help would be appreciated. I may be completely wrong because I don't know any Perl.
Assuming you are logged in to Gmail, https://mail.google.com/mail/feed/atom will give you a dump of the latest few messages in your inbox. This is in XML (Atom) format and you can use whatever you want to read it. You can see what the raw data looks like by going there in a browser.
The code you have found uses Perl to parse and display a more readable form of this data. You are certainly not obligated to use Perl if you don't know it, so you can use whatever programming language you are familiar with.
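As one example of using another language, here is a hedged Python sketch that reads the feed XML and returns the sender before the subject. It assumes each message is an entry element containing a title and an author/name element, which is what the regular expression in the question suggests; the function name is hypothetical, and the {*} namespace wildcard needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def sender_and_subject(feed_xml):
    """Return a list of (sender name, subject) pairs from Atom feed XML."""
    root = ET.fromstring(feed_xml)
    rows = []
    # "{*}" matches the element in any namespace, so the sketch does not
    # depend on which Atom namespace URI the feed declares.
    for entry in root.findall(".//{*}entry"):
        name = entry.findtext("{*}author/{*}name", default="(unknown sender)")
        title = entry.findtext("{*}title", default="(no subject)")
        rows.append((name.strip(), title.strip()))
    return rows


def format_rows(rows):
    """Render each pair as two lines: who it is from, then the subject."""
    return "\n".join(f"{name}\n\t{title}" for name, title in rows)
```

The XML could be saved with curl first and then passed in as a string, for example print(format_rows(sender_and_subject(open("feed.xml").read()))).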

SAS- Reading multiple compressed data files

I hope you are all well.
So my question is about the procedure to open multiple raw data files that are compressed.
My files' names are ordered so I have for example : o_equities_20080528.tas.zip o_equities_20080529.tas.zip o_equities_20080530.tas.zip ...
Thank you all in advance.
How much work this will be depends on whether:
You have enough space to extract all the files simultaneously into one folder
You need to be able to keep track of which file each record has come from (i.e. you can't tell just from looking at a particular record).
If you have enough space to extract everything and you don't need to track which records came from which file, then the simplest option is to use a wildcard infile statement, allowing you to import the records from all of your files in one data step:
infile "c:\yourdir\o_equities_*.tas" <other infile options as per individual files>;
This syntax works regardless of OS - it's a SAS feature, not shell expansion.
If you have enough space to extract everything in advance but you need to keep track of which records came from each file, then please refer to this page for an example of how to do this using the filevar option on the infile statement:
http://www.ats.ucla.edu/stat/sas/faq/multi_file_read.htm
If you don't have enough space to extract everything in advance, but you have access to 7-zip or another archive utility, and you don't need to keep track of which records came from each file, you can use a pipe filename and extract to standard output. If you're on a Linux platform then this is very simple, as you can take advantage of shell expansion:
filename cmd pipe "nice -n 19 gunzip -c /yourdir/o_equities_*.tas.zip";
infile cmd <other infile options as per individual files>;
On Windows it's the same sort of idea, but as you can't use shell expansion, you have to construct a separate filename for each zip file, or use some of 7-Zip's more arcane command-line options, e.g.:
filename cmd pipe "7z.exe e -an -ai!C:\yourdir\o_equities_*.tas.zip -so -y";
Here, the -an tells 7-Zip not to read the zip file name from the command line, and the -ai tells it which archives to include, expanding the wildcard itself.
This will extract all files from all of the matching archives to standard output. You can narrow this down further via the 7-Zip command if necessary. You will have multiple header lines mixed in with the data - you can use findstr to filter these out in the pipe before SAS sees them, or you can just choose to tolerate the odd error message here and there.
If you need to keep track of what came from where and you can't extract everything at once, your best bet (as far as I know) is to write a macro to process one file at a time, using the above techniques and add this information while you're importing each dataset.
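To illustrate the one-file-at-a-time idea outside SAS, here is a Python sketch (not a SAS macro) that streams each archive in turn and tags every record with the archive it came from. The function name is hypothetical, and the UTF-8 encoding of the .tas files is an assumption.

```python
import glob
import io
import os
import zipfile


def records_with_source(pattern):
    """Yield (archive name, member name, line) for every line in every archive."""
    for archive in sorted(glob.glob(pattern)):
        with zipfile.ZipFile(archive) as zf:
            for member in zf.namelist():
                # Members are decompressed as a stream, so nothing is
                # extracted to disk and only one file is open at a time.
                with zf.open(member) as raw:
                    for line in io.TextIOWrapper(raw, encoding="utf-8"):
                        yield os.path.basename(archive), member, line.rstrip("\r\n")
```

Each yielded tuple keeps the source archive next to the record, which is the same information a SAS macro would add while importing each dataset.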

Best (quickest) way to parse and modify a file

Recently I have been using a lot of text files (csv) with 10-60k lines, something like this:
id1,id2
id3,id1
id81,id13
...
Most of the time, I need to extract this information in the form of an array:
id1,id2,id3,id1,id81,id13
Or at times, unique elements array:
id1,id2,id3,id81,id13
Then the result is used by my code (java) to do something.
Now, most of the time I write a Java function which does the whole task for me: reading the file, applying the logic and returning the list of IDs.
Is there a better and quicker way to achieve this, maybe via the command line?
Update:
If I were asked to build an app which was supposed to read a file and do something with it, I would surely write that logic in Java, but in my case I have to go through a lot of text files which I get from the data warehouse, extract the relevant info from them and then run it through my Java-based app.
Now, this is only for my experiment and evaluation of my app.
I copied your input in a file, test.csv:
$ cat test.csv
id1,id2
id3,id1
id81,id13
Now, with the 'tr' utility (plus sed to drop the trailing comma that tr leaves behind), you can do:
$ tr '\n' ',' < test.csv | tr -d ' ' | sed 's/,$//'
and you have:
id1,id2,id3,id1,id81,id13
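The same flattening, together with the unique-elements variant the question asks about, can be sketched in Python. The function names are hypothetical, and the sketch assumes a plain comma-separated file like test.csv above.

```python
import csv


def flatten_ids(path):
    """Return every non-empty cell of a CSV file as one flat list, in file order."""
    with open(path, newline="") as handle:
        return [
            cell.strip()
            for row in csv.reader(handle)
            for cell in row
            if cell.strip()
        ]


def unique_ids(ids):
    """Drop repeats while keeping the order of first appearance."""
    # dict keys are unique and preserve insertion order.
    return list(dict.fromkeys(ids))
```

For the sample file, ",".join(unique_ids(flatten_ids("test.csv"))) gives id1,id2,id3,id81,id13.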
Unless your Java code is doing something silly, it will be in the same speed ballpark as anything else.
There's nothing magic about command-line tools that will make them faster than your code.

Powershell - MS Exchange E-mail Autoresponder

We've currently got an issue where we're receiving a lot of bounced e-mails (from an auto generated e-mail) back from people where a specified e-mail address is not valid (failure notice). I need to identify certain messages in the mailbox and respond automatically to them - as a newbie to Powershell I'm struggling a bit! I think I understand how to check for the occurrence of a string but I don't know how to iterate through an inbox to look at/get a handle on each message in turn and I don't know how to extract the subject or body text in order to analyse the contents and perform a string comparison. I fear this should be easy - but I can't find anything on the web that might do the job - can anyone help?
So just to clarify what you're looking for.
Mailbox A receives a large number of failure notice/bounce messages.
You'd like your PowerShell script to search Mailbox A for every instance where the Subject line (or message body) contains "String X" and, if there is a match, take some action?
Also, what version of Exchange are you using? You need to be on at least 2007 to use the Exchange Management Shell. You'll then want to look over the shell commands that can be run.
Look at the Exchange Message Tracking Log, and pipe the results from one command you run to the next. Think of it like this...
(Run a command) | (Run another command on the results of the first command) | (Run a last command on the results of the second).
You can view an example on my website at:
http://www.technoctopus.com/?p=223
While not exactly the same, it might get you moving in the right direction.