Does Mail::Box::Manager handle Qmail maildir format? - perl

It's not obvious which formats this module supports. Its documentation mentions neither 'qmail' nor 'maildir'.
http://search.cpan.org/~markov/Mail-Box-2.120/lib/Mail/Box/Manager.pod
It has functions to move between folders, which sounds like maildir, but then it also hints that it encapsulates Mail::Box::Mbox, which sounds like the single file mbox format.
Here is a description of maildir format: http://wiki.dovecot.org/MailboxFormat/Maildir, and description of both: http://www.postfix.org/virtual.8.html

http://search.cpan.org/~markov/Mail-Box-2.120/lib/Mail/Box-Overview.pod
Each folder maintains a list of messages. Much effort is made to hide differences between folder types and kinds of messages. Your program can be used for MBOX, MH, Maildir, and POP3 folders with no change at all (as long as you stick to the rules).

Related

Adding Author Metadata to a TXT or CFG file

Basically, I am looking for a way of "branding" a cfg file, specifically a CSGO config, so yes, nothing too important. I'm just surprised that, after quite a bit of Google searching, I still haven't found anything.
In general, plain text files (and probably .cfg files) do not contain any metadata by default. The file system should keep track of some properties for these file types, but they won't otherwise transfer across filesystems.
If you would like to "brand" your file, perhaps you could add a comment to the top with your name. It would be about as permanent and immutable as any metadata anyways.

What is the difference between an `eml` file and an RFC822 email message?

I see these two terms used interchangeably in a lot of context, but some sources say that eml is a file format that originated with Microsoft Outlook.
Is eml the official file suffix of an RFC822 message saved as a flat text file?
There is no official definition of what "eml" means, and different programs use it in different ways. Sometimes people use it to mean RFC822, other times they mean something different. If you mean RFC822 format, say so.

Create Numbers file and open it with Numbers on iPad

I would like to do a task that is quite simple on other OS, but it is not so trivial on iOS. Namely, I want to create file and open it in Numbers.
I can preview the file with UIDocumentInteractionController and then offer the user the option to open it.
This seems to me quite a reasonable solution. However, I need to offer a proper file format. I suppose CSV and XLS would be reasonable to implement and would most probably work, but I would still like to do it in the native Numbers format if possible. However, I can't find any info about this file format.
Basically, this task is about exporting data to another app and then working further with them.
I don't know of a library that can create native Numbers files. There are, however, some libraries that allow creating XLS files. Since Numbers fully supports XLS, this is probably the way to go.
There is a commercial library available that might work on the iPhone (costs $200): http://www.libxl.com/
As for free XLS libraries, I only know xlwt, a Python module. You could set up a webservice that creates an XLS file for your app, using xlwt on the server side.
If you want to pass information to Numbers, you can probably also use CSV files, but you must be aware of a few things. There are two kinds of CSV files: the comma-separated version (used in English-speaking countries) and the semicolon-separated version (used in continental Europe).
The comma separated CSV files look for example like this:
"ID","First Name","Last Name","Salary"
1,"John","Malkovich",3400.20
2,"Fred","Astaire",2000.60
The second kind of CSV files are semicolon separated and use a comma as decimal mark. They look like this:
"ID";"First Name";"Last Name";"Salary"
1;"John";"Malkovich";3400,20
2;"Fred";"Astaire";2000,60
On the Macintosh, Numbers expects a different format depending on the Region setting. If you have your Region set to the US, it will expect the first kind. If you choose Germany, it will expect the second kind.
I don't know what kind of files Numbers on the iPad expects.
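The two CSV variants above can be generated from the same data. The following is a minimal Python sketch, not a definitive exporter: the helper names `format_field` and `to_csv` are made up for illustration, and it assumes that text fields are always quoted and numbers are written with two decimals, as in the samples above.

```python
HEADER = ("ID", "First Name", "Last Name", "Salary")
ROWS = [
    (1, "John", "Malkovich", 3400.20),
    (2, "Fred", "Astaire", 2000.60),
]


def format_field(value, decimal_mark):
    """Render one field: floats with two decimals, ints bare, text quoted."""
    if isinstance(value, float):
        return f"{value:.2f}".replace(".", decimal_mark)
    if isinstance(value, int):
        return str(value)
    # Double any embedded quote characters, as CSV quoting requires.
    return '"' + str(value).replace('"', '""') + '"'


def to_csv(header, rows, delimiter, decimal_mark):
    """Build CSV text using the given field delimiter and decimal mark."""
    lines = [
        delimiter.join(format_field(v, decimal_mark) for v in row)
        for row in [header, *rows]
    ]
    return "\n".join(lines) + "\n"


us_style = to_csv(HEADER, ROWS, delimiter=",", decimal_mark=".")
continental_style = to_csv(HEADER, ROWS, delimiter=";", decimal_mark=",")
```

Which of the two strings to hand over would depend on the Region setting of the receiving device, which the app would have to check or ask about.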
Another alternative would be using copy and paste. Try to copy tab separated text into the clipboard.
I hope this may help you. I've contacted libxl team and they responded with the link to the demo version of their iPhone library: http://www.libxl.com/download/libxl-iphone.zip

How to extract e-mail data into R?

How could I export my e-mail database from Gmail (or Thunderbird) into R?
Like there is the rgoogledocs package and twitteR, is there a gmailR package, or a standard format for exporting emails into stat packages ?
Tal
You need to install the package and load it with library(edeR) first. You may need to manually install 64-bit Java on Windows 8, and you may need to enable IMAP access in Gmail.
dat3 <-extractKeyword(username="YOURLOGIN#gmail.com",
password="YouRPaSS",
kw="adsense",
nmail=5)
This will download 5 emails with keyword 'adsense'.
Standard email (on a Unix system) is either an mbox file (containing several messages) or a maildir setup where each mail is a file in a directory.
Either way, it's ASCII text. That is how an MUA (mail user agent -- your mail reader) is orthogonal to your MTA (mail transport agent -- mail server software like exim, qmail, postfix, ...). The MTA may use a network protocol like POP3 or IMAP to serve the mail files to the client, in which case the client (which may be Gmail or Thunderbird) no longer sees the underlying files. So you may need to learn how to export your mail from whichever backend you employ and then read it.
This has nothing to do with R or programming so far --- unless you now feel you must extend R with POP3 or IMAP facilities to connect to a (remote) mail server.
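Once the mail has been exported to an mbox file, turning it into something R can read is a small scripting job. The following is a rough Python sketch using the standard-library `mailbox` and `csv` modules; the function names `summarize` and `export_headers` and the choice of the From, Date and Subject headers are assumptions made for illustration.

```python
import csv
import mailbox


def summarize(box):
    """Return (from, date, subject) tuples for every message in a mailbox."""
    rows = []
    for message in box:
        rows.append((
            str(message.get("From", "")),
            str(message.get("Date", "")),
            str(message.get("Subject", "")),
        ))
    return rows


def export_headers(mbox_path, csv_path):
    """Write one CSV row per message found in the mbox file at mbox_path."""
    # create=False: fail loudly instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        rows = summarize(box)
    finally:
        box.close()
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(("from", "date", "subject"))
        writer.writerows(rows)
    return len(rows)
```

The resulting file can be loaded in R with read.csv. For a maildir directory, a `mailbox.Maildir(path, create=False)` object can be passed to `summarize` in the same way.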
There is now an R package to extract email data. The package is still in a testing phase, but anyone can install it from GitHub; its name is edeR. Right now it can extract email data from IMAP-enabled Gmail accounts.
Gmail and Thunderbird are not the same... you can enable your Gmail account in Thunderbird, hence export each email to an ASCII file, hence write an R batch script that will take each file and import it into R as an object, hence... you get the point. =)
Usually I try to avoid "the pedestrian approach"... but I'm getting the impression that you're inclined to use R as a "general purpose" programming language... Python or Java, on the other hand, can be quite efficient, so you can write (or ask someone to write for you) a script that will "bring" you the data in the desired format, and then crunch it in R. R has matured a lot, and it's no longer solely a tool for statistical analysis, but it's always a good idea to use some widely known programming language to prepare your data.
So there... Roll up your sleeves, and dive into Python (Java, C... whatever you feel like diving into)!
P.S.
I reckon that this has something to do with your previous post with word cloud...
Once you have exported your e-mails in mbox format into your PC, you can make use of both tm and tm.plugin.mail packages in R. The latter makes it possible to export your e-mails into R.
require("tm")
require("tm.plugin.mail")
Then, to convert your e-mails from mbox (i.e., several mails in a single box) format to eml (i.e., every mail in a single file) format: convert_mbox_eml(mbox, dir). In the example below, mbox is represented by "yourmails.mbox" and it describes the mbox location. The output directory is given by "your_mails".
convert_mbox_eml("yourmails.mbox", "your_mails")
You can read in an electronic mail document and inspect with the following R commands.
mails <- VCorpus(DirSource("your_mails/"),
                 readerControl = list(reader = readMail))
inspect(mails)

How can I limit file types in CGI file uploads in Perl?

I am using CGI to allow the user to upload some files. I just want the user to be able to upload .txt or .csv files. If the user uploads a file in any other format, I want to be able to put out an error message.
I saw that this can be done by javascript: http://www.codestore.net/store.nsf/unid/DOMM-4Q8H9E
But is there a better way to achieve this? Is there some functionality in Perl that allows this?
The disclaimer on the site you link to is important:
Note: This is not entirely foolproof as people can easily change the extension of a file before uploading it, or do some other trickery, as in the case of the "LoveBug" virus.
If you really want to do this right, let the user upload the file, and
then use something like File::MimeInfo::Magic (or file(1), the
UNIX utility) to guess the actual file type. If you don't like the
file type, delete the file and give the user an error message.
I just want the user to be able to upload .txt or .csv files.
Sounds easy, doesn't it? It's not. And then some.
The simple approach is just to test that the file ends in ‘.txt’ or ‘.csv’ before storing it on the filesystem. This should be part of a much more in-depth validation of what the filename is allowed to contain before you let a user-submitted filename anywhere near the filesystem.
Because the rules about what can go in a filename are complex on some platforms (especially Windows) it's usually best to create your own filename independently with a known-good name and extension.
In any case there is no guarantee that the browser will send you a file with a usable name at all, and even if it does there is no guarantee that name will have ‘.txt’ or ‘.csv’ at the end, even if it is a text or CSV file. (Some platforms simply do not use extensions for file typing.)
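The idea of checking the extension but storing the upload under a name of your own can be sketched as follows. This is a Python illustration of the logic rather than Perl CGI code, and the names `ALLOWED_EXTENSIONS` and `storage_name_for` are hypothetical; the same steps carry over to any language.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".txt", ".csv"}


def storage_name_for(client_filename):
    """Return a server-generated file name, or None if the upload is rejected.

    Only the extension of the client-supplied name is looked at; the rest
    of that name never reaches the filesystem.
    """
    if not client_filename:
        return None  # the browser sent no usable name at all
    extension = os.path.splitext(client_filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return None
    # A random hex name holds no path separators or other special characters.
    return uuid.uuid4().hex + extension
```

A return value of None would be the point at which to show the error message. As noted above, this only checks the name the browser chose to send; it says nothing about what the file actually contains.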
Whilst you can try to sniff the contents of the file to see what type it might be, this is highly unreliable. For example:
<html>,<body>,</body>,</html>
could be plain text, CSV, HTML, XML, or a variety of other formats. Better to give the user an explicit control to say what file type they're uploading (or use one file upload field per type).
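To make the ambiguity concrete, here is a small Python sketch that feeds the one-line sample above to both a CSV parser and an HTML parser from the standard library. The helper names `parse_as_csv` and `html_start_tags` are invented for this illustration.

```python
import csv
import io
from html.parser import HTMLParser

SAMPLE = "<html>,<body>,</body>,</html>\n"


def parse_as_csv(text):
    """Parse text as comma-separated values and return the rows."""
    return list(csv.reader(io.StringIO(text)))


class _TagCollector(HTMLParser):
    """Record the name of every opening tag the HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(text):
    """Parse text as HTML and return the opening tags that were found."""
    collector = _TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


csv_rows = parse_as_csv(SAMPLE)      # one row with four fields
html_tags = html_start_tags(SAMPLE)  # the html and body tags
```

Both parsers accept the input without complaint, so neither result can settle which format the uploader intended.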
Now here's where it gets really nasty. Say you've accepted the upload and stored it as /data/mygoodfilename.txt, and the web server is correctly serving it as the Content-Type ‘text/plain’. What do you think the browser interprets it as? Plain text? You should be so lucky.
The problem is that browsers (primarily IE) don't trust your Content-Type header, and instead sniff the contents of the file to see if it looks like something else. Serve the above snippet as plain text, and IE will happily treat it as HTML. This can be a huge problem, because HTML can include client-side scripts that will take over the user's access to the site (a cross-site-scripting attack).
At this point you might be tempted to sniff the file on the server-side, for example using the ‘file’ command, to check it doesn't contain ‘<html>’. But this is doomed to failure. The ‘file’ command does not sniff for all the same HTML tags as IE does, and other browsers sniff differently anyway. It's quite easy to prepare a file that ‘file’ will claim is not HTML, but that IE will nevertheless treat as if it is (with security-disaster implications).
Content-sniffing approaches such as ‘file’ will give you only a false sense of security. This is a convenience tool for loose guessing of filetypes and not an effective security measure.
At this point your last desperate possibilities are things like:
serving all user-uploaded files from a separate hostname, so that a script injection attack can't purloin the credentials of your main site;
serving all user-uploaded files through a CGI wrapper, adding the header ‘Content-Disposition: attachment’ so that browsers won't attempt to display them directly;
only accepting uploads from trusted users.
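The wrapper option boils down to emitting the right response headers before the file body. Below is a minimal Python sketch of that header construction; the function name `attachment_headers`, the default content type and the file-name clean-up rule are assumptions for illustration, not part of any particular framework.

```python
def attachment_headers(download_name, content_type="text/plain; charset=utf-8"):
    """Build response headers that ask the browser to save, not display, a file."""
    # Keep only characters that cannot break out of the quoted file name
    # or inject extra header lines (quotes, CR and LF are dropped).
    safe_name = "".join(
        ch for ch in download_name if ch.isalnum() or ch in "._-"
    ) or "download"
    return [
        ("Content-Type", content_type),
        ("Content-Disposition", f'attachment; filename="{safe_name}"'),
    ]
```

A wrapper script would print these headers, then a blank line, then the stored file's bytes.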
On Unix the easiest way is to do as JRockway suggested. If you are not on Unix, your options are limited. You can examine the file extension, and you can examine the contents to verify them. I'm assuming for your specific case that you only want "* separated value" text files, so one of the Text::CSV::* modules may be useful in verifying that the file is the type you asked for.
Security for this operation is a whole other ball of wax.
try this:
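sub is_text_file {
    my ($file_name) = @_;    # pass a server-generated name, never raw user input
    # -b (brief) keeps the file name out of file(1)'s output, so a name like
    # "file.txt" cannot satisfy the match on its own
    my $file_type = `file -b "$file_name"`;
    return $file_type =~ /(ASCII|text)/i ? 1 : 0;
}
print is_text_file("file.txt") ? "ok\n" : "rejected\n";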