My client is constantly running out of server storage space, and he needs to archive emails going back a couple of years (preferably as far back as possible).
He has a shared server with the ability to set up cron jobs and the standard PHP/Apache stack. In his office there is an external NAS disk with mirrored RAID (static IP).
My first thought was to create a cron job that would run periodically and send all emails older than, let's say, 10 months to that disk as *.eml, or another format that can be opened with an installed program (such as Thunderbird).
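Roughly what I had in mind is something like the following, assuming the mailbox can be reached over IMAP and the NAS share is mounted on whatever machine runs the job (host name, credentials, and the archive path are placeholders):

    #!/usr/bin/env python3
    # Cron job sketch: archive messages older than roughly 10 months as .eml files.
    import imaplib
    import os
    from datetime import date, timedelta

    IMAP_HOST = "mail.example.com"         # placeholder
    USER, PASSWORD = "user@example.com", "secret"
    ARCHIVE_DIR = "/mnt/nas/mail-archive"  # the NAS share, mounted locally
    CUTOFF = (date.today() - timedelta(days=300)).strftime("%d-%b-%Y")

    imap = imaplib.IMAP4_SSL(IMAP_HOST)
    imap.login(USER, PASSWORD)
    imap.select("INBOX")

    # Find everything received before the cutoff date
    status, data = imap.search(None, "BEFORE", CUTOFF)
    for num in data[0].split():
        status, msg_data = imap.fetch(num, "(RFC822)")
        raw = msg_data[0][1]
        # Save as a plain .eml file that Thunderbird and friends can open
        with open(os.path.join(ARCHIVE_DIR, num.decode() + ".eml"), "wb") as fh:
            fh.write(raw)
        # Only delete from the server once the copy has been written
        imap.store(num, "+FLAGS", "\\Deleted")

    imap.expunge()
    imap.logout()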
After searching for a long time I could not find an existing solution along these lines, so I came here for pointers.
Is this possible? Or should I search for another solution?
Related
I have an inbox that receives lots of emails from my various systems, such as Nagios and Azure alerts about disk usage, exceptions, and job failures.
Since I get so many of these alerts, I was wondering whether there are any tools I could use to filter them so that I only receive the most important ones. A lot of these are noise affecting only my development environments, and I only want to be alerted when something goes wrong with my production environment.
Does anyone know of such tools, or of a better way of dealing with this sort of problem? There must be a better solution than manually reading through all my emails and checking their contents.
I've heard of a tool called LogRhythm, but I'm under the impression that it is purely a data security tool and am unsure whether it would be able to parse an inbox.
Thanks all in advance.
A good solution is IMAPfilter. It is a utility for Linux systems (though I run it on Windows under the Windows Subsystem for Linux) that uses the IMAP protocol to manage one or more mailboxes.
You define all your filters and actions in the Lua language (it's pretty simple, and there are many examples in the GitHub repository) and keep the program running all the time. If it is not running, mail simply arrives unfiltered and is reorganized once the program is started again.
It is not pretty software (no GUI, an unusual configuration language, and it needs an always-on machine for real-time filtering), but since you program the filters yourself you can do pretty much anything, and it is also very light.
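To give an idea of what the Lua side looks like, here is a rough sketch of a config; the account details and folder names are made up, and the real examples in the project's GitHub repository are more complete:

    -- ~/.imapfilter/config.lua (sketch; all values are placeholders)
    options.timeout = 120

    alerts = IMAP {
        server = 'imap.example.com',
        username = 'alerts@example.com',
        password = 'secret',
        ssl = 'tls1.2',
    }

    -- Match alerts that only concern development environments
    -- ('*' combines the two searches with a logical AND)
    noise = alerts.INBOX:contain_from('nagios@example.com') *
            alerts.INBOX:contain_body('development')

    -- Move them out of the inbox so only production alerts stay visible
    noise:move_messages(alerts['Alerts/Dev'])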
We currently have POP3 mail accounts where I work, and try as I might to convince my manager that we should be using hosted IMAP or Exchange, he won't budge because of the cost. The staff are mostly out of the office, so there is no domain server here; however, we do have a dedicated server and I wondered whether I could use it to collect the mail and distribute it from there in some way.
Effectively what I'm trying to do is ensure mail is stored somewhere other than the end users' machines, because backups are user-dependent at the moment. With hosted Exchange, or Exchange on this server, it would be simple, but my manager won't shell out for it. I have seen free mail servers called MailEnable and Axigen but am unsure whether they will do the job. Sorry if this seems like an easy or stupid question, but I've never needed to do this before.
I am assuming, given the reference to Exchange, that you are on Windows.
If you have an old box lying around that works, you could install Linux on it and then choose from a number of different IMAP servers. Dovecot and Courier are both good choices and I have worked with both before.
You could then use fetchmail to pick up the POP3 mailboxes and deliver them into the IMAP mailboxes, or have new mail delivered there directly.
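As a rough sketch of the fetchmail side (an ~/.fetchmailrc with made-up host names and accounts), pulling each POP3 account and handing the mail to the local MTA for delivery into the IMAP store:

    set daemon 300            # poll every 5 minutes

    poll pop.provider.example protocol POP3
        user "jsmith@provider.example" password "secret" is "jsmith" here
        ssl
        smtphost localhost    # hand the mail to the local MTA for delivery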
Setting up a Linux server like this for email was one of the first things I ever did on Linux. While initially daunting, once you get the hang of it, it is pretty straightforward and there are plenty of resources out there to help.
Ubuntu is probably the easiest to get used to; CentOS is also a reasonable choice.
You shouldn't be running your own server if you aren't willing to administer your own server, and mail servers are not easy to configure if you don't know what you are doing (e.g., one misconfiguration and you are being exploited for spamming).
Look into a service like Mailgun. In my application we use them for forwarding to REST endpoints as well as on to another SMTP server.
Competitors that ended up not meeting my needs, but may meet yours, include Dyn, Email Yak, SendGrid, etc.
Why not just set up the mail clients to store their mail files on a standard network drive or share? I understand that this approach seems pretty silly in your view, entirely because of the ridiculous constraints you are being asked to work within; I would ordinarily find the solution I am suggesting ridiculous too, but under the circumstances it seems like a simple answer to your problem: replacing distributed mail storage and backup with centralized storage and backup.
Don't POP3 email clients have the option to keep a copy on the server? Mine certainly does; see the second tick box in the pic.
You can then periodically take a backup of all the emails from the server to stop it getting clogged up.
We have command-line scripts (written in Python) that sit on customer machines and send us data as CSV every 24 hours. We are now at the point where we want to be able to tell the clients to send us data at any time. Almost all of the customers are on MS Windows machines, and the requirement is that we install very little software on the customer machines (most people cannot even log on to customer machines; only a few can).
I'm not sure how best to solve this problem. The following are three possible approaches (but I'm looking for better ones):
1. We build a daemon in Python and install it on the customer machine. The daemon talks to our servers and we send back configuration information, which includes a "sleep duration". The daemon sends us the data and then sleeps for the number of seconds defined in "sleep duration"; once that time is up, it pings us again, we send back the configuration information, and so on. Rinse and repeat.
2. We install a script on the customer machine that runs every hour. At our end we store how often each customer should send us data (every 24 hours, 12 hours, etc.), and when the script contacts us we determine how much time has passed and whether it is time for the script to send data. If it is, we tell the script to send it. (A rough sketch of this check-in approach is below.)
3. We install a very small server-side application (Django or Flask) that runs on the customer machine. Whenever we want data, we send a request to the customer machine and this small application serves it to us. For this we may have to ask our customers to reserve a port for us (and I'm not sure how many customers would actually allow that).
I'm sure there are better ways. Could you kindly let me know which of the above methods is most suitable, or point me to a better approach?
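For what it's worth, here is roughly what I picture the check-in script in option 2 looking like; the URLs, the response format, and the assumption that the requests library is available are all mine:

    # check_in.py -- sketch only; run hourly from Windows Task Scheduler
    import requests  # assumes the requests library ships with our installer

    CONTROL_URL = "https://collector.example.com/api/should-send"  # made up
    UPLOAD_URL = "https://collector.example.com/api/upload"        # made up
    CUSTOMER_ID = "customer-123"
    CSV_PATH = r"C:\ourapp\export.csv"

    # Ask our server whether enough time has passed since the last upload.
    resp = requests.get(CONTROL_URL, params={"customer": CUSTOMER_ID}, timeout=30)
    if resp.json().get("send_now"):
        with open(CSV_PATH, "rb") as fh:
            requests.post(UPLOAD_URL,
                          data={"customer": CUSTOMER_ID},
                          files={"data": fh},
                          timeout=300)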
I really appreciate all insights; thanks in advance for any help.
Option 3 may not work. Most machines sit behind a firewall or a router doing NAT, and in such a scenario a server listening for incoming requests would not normally be reachable from the public internet.
If they do have static IP addresses and the server is reachable from the public internet, then port scanners will find it and may attempt to do undesirable things. You really do not want someone hacking into your customers' systems and wreaking havoc on them. Please avoid this option if possible.
However, it is fine to have a program running on a customer system as long as it is the one logging into your server and sending data.
A better solution would be to have an app that continuously feeds data to your server as it is generated. It is relatively easy to do the equivalent of

    tail -f csv_file | send_data_home

where send_data_home is a program running on your customer's system. This way the impact is minimal: the CSV file creation is not affected, and send_data_home logs into your server and sends the data as it is generated.
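As a sketch of what send_data_home might look like, assuming it reads the tailed lines on stdin and ships them to an HTTPS endpoint you control (the URL, the token, and the presence of the requests library are assumptions):

    #!/usr/bin/env python3
    # send_data_home: read lines from stdin (fed by `tail -f`) and ship them home.
    import sys
    import requests  # assumed to be available on the customer machine

    ENDPOINT = "https://collector.example.com/api/rows"  # placeholder
    TOKEN = "per-customer-api-token"                     # placeholder

    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        # One small request per CSV row keeps the footprint minimal;
        # batching rows would be an easy improvement if volume grows.
        requests.post(ENDPOINT,
                      json={"row": line},
                      headers={"Authorization": "Bearer " + TOKEN},
                      timeout=30)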
Here is the problem I am facing. We have a Postfix server that needs to parse emails forwarded from users' accounts and extract some data from them. Usually there are around 200 emails per user. We have tested this with 5 users and all was good, but what do we do if the number of users reaches something much larger, for example 10,000 or 100,000? Do you have any ideas on how to make the Postfix setup scalable so it can support this heavier load?
Our current Postfix server is an Ubuntu 10.04 machine with 512 MB of RAM.
Best regards,
Mladjo
Postfix is a mailer, not a data miner, arbitrary string parser, or general-purpose light bulb. When receiving 10,000 letters, you, the mentally unstable postal worker, do not want to open the letters, read them, cut out some parts, re-seal them, and then deliver them.
You want to figure out whether they are yours to deliver and put them in the right pile. For the other task, you call on your buddy Cron, who is dating Ms. Perl and has all the right features for the previously mentioned tasks.
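A minimal sketch of that division of labour, using Python instead of Perl and assuming Postfix delivers the forwarded mail into a Maildir that a cron job can read (the path is made up):

    #!/usr/bin/env python3
    # Cron job sketch: Postfix only delivers; this script does the parsing afterwards.
    import mailbox

    MAILDIR = "/var/mail/vmail/parser/Maildir"  # assumed delivery location

    md = mailbox.Maildir(MAILDIR, factory=None)
    for key in list(md.keys()):        # snapshot the keys so we can delete as we go
        msg = md[key]
        subject = msg.get("Subject", "")
        if not msg.is_multipart():
            body = msg.get_payload(decode=True) or b""
            # ... extract whatever data you need from subject/body here ...
        md.remove(key)                 # or move it to an archive folder instead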
I'm looking for ways to gather files from clients. These clients run our software and we currently use FTP to collect files from them. The files are collected from the client's database, encrypted, and uploaded via FTP to our FTP server. The process is fraught with frustration and obstacles: the transfer is frequently blocked by common firewalls and often runs into difficulties with VPNs and NAT (switching to passive mode instead of active usually helps).
My question is: what other ideas do people have for getting files from clients programmatically and reliably? Most of the files they submit are under 1 MB in size; however, one of them ranges up to 25 MB.
I'd considered an HTTP POST; however, I'm concerned that a 25 MB file would often fail over a POST (the web server timing out before the file could be completely uploaded).
Thoughts?
AndrewG
EDIT: We can use any common web technology. We're on a shared host, which may make central configuration changes difficult. I'm familiar with PHP from a usage perspective, but not from a setup perspective (I've written lots of code but not gotten into anything too heavy-duty). Ruby on Rails is also possible, but I would be starting from scratch. Ideally I'm looking for a "web" way of doing it, as I'd like eventually to be ready to transition away from installed code.
Research scp and rsync.
One option is to have something running in the browser that breaks the upload into chunks, which should make it more reliable. A control that does this can also give the user feedback as the upload progresses, which you wouldn't get with a simple HTTP POST.
A quick Google found this free Java applet, which does just that; there will be plenty of other free and paid options that do the same thing.
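If you end up writing your own client instead of using a browser control, the chunking idea itself is straightforward; here is a rough sketch in Python, where the endpoint, headers, and chunk size are all assumptions (the server side would need to reassemble the pieces):

    # Chunked-upload sketch: send a large file in small, individually retried pieces.
    import os
    import requests  # assumed to be available on the client

    URL = "https://files.example.com/upload-chunk"  # made-up endpoint
    CHUNK = 256 * 1024                              # 256 KB per request

    def upload(path, retries=3):
        size = os.path.getsize(path)
        with open(path, "rb") as fh:
            offset = 0
            while offset < size:
                piece = fh.read(CHUNK)
                for attempt in range(retries):
                    r = requests.post(URL,
                                      data=piece,
                                      headers={"X-File-Name": os.path.basename(path),
                                               "X-Offset": str(offset)},
                                      timeout=60)
                    if r.ok:
                        break
                offset += len(piece)

    upload("export_25mb.bin")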
You probably mean an HTTP PUT. That should work like a charm if you have a decent web server, but as far as I know it is not restartable.
FTP is the right choice (passive mode to get through the firewalls). Use an FTP server that supports restartable transfers if you often run into VPN connection breakdowns (hotel networks are soooo crappy :-) ).
The FTP command that must be supported is REST.
From http://www.nsftools.com/tips/RawFTP.htm:
Syntax: REST position
Sets the point at which a file transfer should start; useful for resuming interrupted transfers. For nonstructured files, this is simply a decimal number. This command must immediately precede a data transfer command (RETR or STOR only); i.e. it must come after any PORT or PASV command.
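For what it's worth, Python's ftplib exposes this via the rest argument of storbinary/retrbinary, so a resumable transfer is only a few lines. A rough sketch of resuming an upload (host, credentials, and file names are made up, and the server must support REST with STOR):

    # Resuming an interrupted FTP upload using REST via Python's ftplib.
    from ftplib import FTP, error_perm

    LOCAL = REMOTE = "export.csv.enc"  # made-up file name

    ftp = FTP("ftp.example.com")       # made-up host
    ftp.login("user", "secret")
    ftp.set_pasv(True)                 # passive mode, to get through firewalls
    ftp.voidcmd("TYPE I")              # binary mode so SIZE reports exact bytes

    # How much of the file already made it to the server?
    try:
        offset = ftp.size(REMOTE) or 0
    except error_perm:
        offset = 0                     # file not there yet, start from scratch

    with open(LOCAL, "rb") as fh:
        fh.seek(offset)
        # rest=offset makes ftplib send "REST <offset>" before the STOR.
        ftp.storbinary("STOR " + REMOTE, fh, rest=offset)
    ftp.quit()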