How to get attached files from an email, using Pentaho Kettle?

I'm stuck on a difficult problem. My task is to download some emails from a server using the IMAP protocol. This is accomplished by using the "get mails (POP3 / IMAP)" job entry, which downloads the emails, but in binary format.
Files in binary format are .mail files containing the sender, subject, body, and encoded attachment files. I need to obtain separate files, because I must perform some steps with these files as input.
I've seen that there are third-party libraries and utilities to decode the .mail file and get the attachment file list. However, I want to do this without any additional utility (because that would require a shell step, which depends on the OS).
Is there any way or trick to get the attachments using only Pentaho job entries or transformation steps?
I'm using Pentaho Kettle version 5.1.

I will explain the whole process so that anybody can benefit from it.
1) Add START and Get mails (POP3/IMAP) job entries, and create a hop between them.
2) Edit the Get mails entry to use your IMAP server (host name, port number, username, password, etc), and click Test Connection to verify settings.
3) On the Target folder tab, uncheck Save message content, and check Get mail attachment and Different folder for attachment. Define a target folder for both the Target directory and the Attachment files folder.
4) On the Settings tab, select the IMAP folder that you want to download from. Change other settings as desired.
5) Click OK, save the job, and run it.
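For reference, this is essentially what the Get mails entry does internally: fetch each message over IMAP and write out the MIME parts that carry a filename. A minimal sketch with Python's standard imaplib and email modules (the host, credentials, and folder below are placeholders):

import email
import imaplib
import os

# Placeholder connection details; replace with your server settings.
conn = imaplib.IMAP4_SSL("imap.example.com", 993)
conn.login("user@example.com", "password")
conn.select("INBOX")

os.makedirs("attachments", exist_ok=True)

# Fetch each message and save any MIME part that carries a filename.
typ, data = conn.search(None, "ALL")
for num in data[0].split():
    typ, msg_data = conn.fetch(num, "(RFC822)")
    msg = email.message_from_bytes(msg_data[0][1])
    for part in msg.walk():
        filename = part.get_filename()
        if filename:  # parts with a filename are attachments
            with open(os.path.join("attachments", filename), "wb") as out:
                out.write(part.get_payload(decode=True))

conn.logout()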

Related

PowerBI/Powershell - Get list of datasources that a PBIX file uses

I am trying to get a list of datasources that a Power BI file is using. I've seen solutions online where I can use the ReportingService module to get a list, but this only works when the Power BI report is published online. Is there a solution that would work for a local file?
Here is the situation.
A user gives me a Power BI file. In order for me to get a list of datasources, I have to go in and look at the sources manually. Ideally, I would like to use PowerShell to get this list.
There isn't an API that can access the desktop application. You would have to brute force it.
The PBIX file is basically a zip file which contains separate files with JSON information. You would have to follow these steps:
Use Expand-Archive to get the files out of the PBIX (not sure if you will need to change the file extension first).
Read the "Connections" file (Which is Json). It will have the various connection strings used by the model.
You can do this manually by changing the file extension to .zip, opening the archive directly, and looking at the Connections file in Notepad.
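If PowerShell isn't a hard requirement, the same read is a few lines in Python; a minimal sketch, assuming the archive contains an entry literally named Connections as described above (report.pbix is a placeholder path):

import json
import zipfile

# Open the .pbix as a zip archive and read the "Connections" JSON entry.
with zipfile.ZipFile("report.pbix") as pbix:
    with pbix.open("Connections") as f:
        connections = json.load(f)

print(json.dumps(connections, indent=2))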

LotusNotes: saving documents as email files

I need to ask you about the possibility of saving LotusNotes documents (with their attachments) as separate files in EML format on a hard disk.
It's not important to keep the original document's look, but it is very important that the file contain the content of the Notes document, including all the attached files.
The reason is to be able to open the exported file in an email client.
Is it possible?
Do you have any experience with resolving a problem like this?
The easiest way to do this for a small number of documents is to use @MailSend to forward the documents to a Notes user account or to a mail-in database, and then go into that mailbox, select the message, and drag it to your desktop. Recent versions of the Notes client will save the document as a .eml file that can be opened in Outlook or other standard mail clients. Or instead of sending to something in Notes, you could send to a non-Domino email system, connect with Outlook, and do the same drag-to-desktop there, which I believe results in a .msg file instead of a .eml file, but they're essentially the same.
To automate it for a large number of documents in one batch, I might still use the @MailSend approach, but I'd do it on a dedicated Domino server. I'd address the email to an external address, and I'd set SMTPSaveOutboundToFile=1 in the notes.ini file of that dedicated Domino server.
I think the Notes-client drag to desktop operation results in somewhat higher fidelity in the .eml file than either of the other approaches, but it's been about ten years and three major Notes/Domino versions since I played around with any of these.
Yes, this can definitely be done programmatically. To do this, convert the document to MIME via convertToMIME(), then use the DxlExporter to do the rest of the work. It creates XML output that contains a <mime> tag in which the fully converted MIME-format document resides. See this for a full description: How to Programmatically Convert Lotus Notes email Document to MIME Format
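Once the DXL is exported, the MIME text still has to be pulled out of that <mime> element and written to disk. A rough sketch in Python; the exact element name, namespace, and encoding in the DXL may differ by Notes/Domino version, so treat this as an outline rather than a drop-in script:

import xml.etree.ElementTree as ET

# Parse the DxlExporter output and write the first <mime> element's
# text content out as an .eml file. Depending on the export options
# the content may instead be base64-encoded; decode it first if so.
tree = ET.parse("exported_document.xml")
for elem in tree.iter():
    if elem.tag.endswith("mime") and elem.text:
        with open("exported_document.eml", "w", encoding="utf-8") as out:
            out.write(elem.text)
        break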

Coldfusion 9 - respool mail

I had a bunch of mail queued to be sent out that never went. I realized that my mail settings in ColdFusion Administrator were incorrect and have since corrected them. I tested the new settings with new mail and they work. I am now trying to re-send the messages in the spool, but they go right back into the undelivered mail spool. I'm assuming that they are still using the old mail settings. Is there any way to force them to send using the new settings?
You'll need to edit the individual spool files, as they most likely have the mail server information in the file itself. If you open a few of the files in your badmail directory, you should be able to locate the server information and adjust accordingly.
If you can do a bulk find/replace on the files, it should make short work of it.
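A minimal sketch of that bulk find/replace in Python, assuming the old server name appears as literal text in each spool file (the path, host names, and .cfmail extension below are assumptions for a default ColdFusion 9 install; back up the directory first):

import pathlib

# Placeholder paths and host names; adjust for your installation.
spool_dir = pathlib.Path(r"C:\ColdFusion9\Mail\Undelivr")
old_server = "old.mailserver.example.com"
new_server = "new.mailserver.example.com"

for spool_file in spool_dir.glob("*.cfmail"):
    text = spool_file.read_text(encoding="utf-8", errors="replace")
    if old_server in text:
        spool_file.write_text(text.replace(old_server, new_server),
                              encoding="utf-8")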

How do I edit files in place that were uploaded to Moodle?

I would like a better workflow for debugging uploaded SCOs. As things are, I must edit a file in the activity, repackage, upload, and test. Often, I just need to change a single line of code. It would be VERY nice to be able to edit that file, that line of code, on the server. So far, all I've found is that Moodle manages the files, so it seems impractical to locate and decipher the renamed files after upload.
Is there a way to configure Moodle so that it doesn't rename and relocate files in SCOs upon extraction? Actually, I'm open to any suggestions on the best, fastest workflow for debugging SCOs.
Problem background
Since Moodle 2.0, files are no longer stored on the server in the conventional /this/is/the/path/to/my.file way. Instead, files are rehashed and stored in repositories (i.e. spread all over the moodledata folder as a collection of seemingly random data). This improves security and cross-OS compatibility, but complicates things for people who would like to simply upload a SCORM zip package via FTP. Here's more information on file handling in Moodle 2.0.
Path to the solution
Let's locate the file you want to update, then update it.
Run phpmyadmin, go to mdl_files table, find your file by name in the filename field (let's say it's portrait.jpg)
Look at the contenthash field, it'll look like abcde1234567890. This means your file is stored in moodledata/filedir/ab/cd/ folder under the name abcde1234567890.
Rename the updated portrait.jpg to abcde1234567890, upload and overwrite.
Go back to phpmyadmin and update the filesize field in record for portrait.jpg with the size of the updated file.
Obviously, this process can be automated. You'll have to write a script that allows you to upload a file; it'll then search for that file in mdl_files, save it to the correct folder, and update all fields accordingly.
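A minimal sketch of those manual steps in Python; the moodledata path, file names, and table prefix are placeholders, and the contenthash value would come from your own mdl_files query:

import os
import shutil

moodledata = "/var/moodledata"            # placeholder path
contenthash = "abcde1234567890"           # value from mdl_files.contenthash
updated_file = "portrait.jpg"             # your edited file

# Moodle stores the file under filedir/<first two>/<next two>/<full hash>.
target = os.path.join(moodledata, "filedir",
                      contenthash[0:2], contenthash[2:4], contenthash)
shutil.copyfile(updated_file, target)

# The filesize field in mdl_files must match the new file.
print("UPDATE mdl_files SET filesize = {} WHERE contenthash = '{}';"
      .format(os.path.getsize(updated_file), contenthash))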
Alternative idea
Enable the external package type (and also enable 'Update on every launch'). Go to Site administration / Plugins / Activities / SCORM and check the box down below. Now you'll be able to launch SCORM packages directly from another server, so Moodle won't mess with them. Of course, you can run into other (probably cross-domain related) problems.
Sergey's answer is very good, with one caveat:
In his example with the contenthash of abcde1234567890, the file is stored in the moodledata/filedir/ab/cd/ folder under the name abcde1234567890. Moodle uses the full contenthash to name the file.
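For reference, the contenthash is simply the SHA-1 digest of the file's content (this is how Moodle 2.x names stored files), so you can compute it yourself:

import hashlib

# Moodle's contenthash is the SHA-1 of the file's content.
with open("portrait.jpg", "rb") as f:
    print(hashlib.sha1(f.read()).hexdigest())

Note that the overwrite trick above deliberately keeps the old hash, so the stored name no longer matches the new content; Moodle doesn't normally recompute the hash on read, which is presumably why the trick works.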

Choosing the right tool

I have the following need:
1) Users will upload .xls or .csv files into an "uploads" folder.
2) The "uploads" folder has to be constantly monitored, and with each new file added to it, a job has to be started.
3) The job will process data from the .xls or .csv file so that it matches the DB table structure, and write this data into the DB table.
This has to be an automated process, and I'm looking for an all-in-one solution tool.
You didn't say which operating system you're on, or whether the users upload the files to a different server. If the upload goes through a web application (using an HTTP POST request), that is also different.
And I'm not sure that your approach scales well with many users.
You should take a look at Pentaho Data Integration, a.k.a. Kettle: http://sourceforge.net/projects/pentaho/
With Kettle you can design a job that polls the upload directory and, once a file is found, performs all the needed transformations and inserts the data into the desired database table.
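For comparison, if you'd rather script the polling yourself than run a Kettle job, here is a minimal sketch in Python (the folder, table, and column names are placeholders, and .xls files would additionally need a reader such as xlrd or pandas):

import csv
import os
import sqlite3
import time

UPLOADS = "uploads"
seen = set()

# Placeholder database and table; swap in your own DB connection.
conn = sqlite3.connect("data.db")
conn.execute("CREATE TABLE IF NOT EXISTS records (col1 TEXT, col2 TEXT)")

while True:
    for name in os.listdir(UPLOADS):
        if name.endswith(".csv") and name not in seen:
            with open(os.path.join(UPLOADS, name), newline="") as f:
                rows = [(r["col1"], r["col2"]) for r in csv.DictReader(f)]
            conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
            conn.commit()
            seen.add(name)
    time.sleep(10)  # poll every ten seconds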