Getting data from emails in SAS - email

I have emails that contain data in an unstructured format. I need to extract this data from these emails. Is there a way to do this in SAS?
Thanks for the help

Related

How to remove duplication with Talend Data Preparation?

I would like to remove duplicates with Talend Data Preparation. I have a column named HOURS, and I want to sum those hours and remove the duplicated emails and names. Here is an example of my table:
As you can see, many rows share the same user_name and email, but their hours differ. I want to add the hours together per user_name and email, and remove the duplicated user_name and email rows at the same time.
(I am not really into Data Prep, so perhaps there is an inside solution that I don't know of).
I don't think you can do a GROUP BY with a SUM operation in Talend Data Preparation: the tool can only correct individual rows of data and cannot perform aggregations.
You can sum your data with a tAggregateRow component in Talend Data Integration, after exporting your corrected data from Data Prep.
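For reference, the aggregation being asked for (group by user_name and email, sum HOURS) can be sketched in plain Python. This is only an illustration of the logic, not Talend code; the sample rows and the function name sum_hours are made up for the example.

```python
from collections import defaultdict


def sum_hours(rows):
    """Sum HOURS per (user_name, email) pair, keeping one row per pair."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["user_name"], row["email"])
        totals[key] += float(row["HOURS"])
    # Rebuild one output row per distinct (user_name, email) pair.
    return [
        {"user_name": user, "email": email, "HOURS": hours}
        for (user, email), hours in totals.items()
    ]


# Hypothetical sample data: two rows for the same user, one for another.
rows = [
    {"user_name": "alice", "email": "alice@example.com", "HOURS": 2},
    {"user_name": "alice", "email": "alice@example.com", "HOURS": 3.5},
    {"user_name": "bob", "email": "bob@example.com", "HOURS": 1},
]
result = sum_hours(rows)
```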

How to parse EDIFACT file data using apache spark?

Can someone suggest how to parse EDIFACT data using Apache Spark?
I have a requirement where EDIFACT data is written to an AWS S3 bucket every day, and I am trying to find the best way to convert this data to a structured format using Apache Spark.
If your invoices are in EDIFACT format, you can read each of them as one String per invoice using RDDs. You will then have an RDD[String] that represents the distributed collection of invoices. Take a look at https://github.com/CenPC434/java-tools, which lets you convert the EDIFACT strings to XML. The repo https://github.com/databricks/spark-xml shows how to use XML as an input source to create DataFrames and run multiple queries, aggregations, etc.
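As a rough illustration of what "structured" could mean here, the sketch below splits one EDIFACT message string into segments, data elements and components in plain Python. It is not the java-tools converter mentioned above. It assumes the default EDIFACT separators (' for segments, + for elements, : for components, ? as release character) and ignores any UNA override; the function names and the sample message are made up. A function like parse_edifact could in principle be passed to an RDD map over the invoice strings.

```python
def split_unescaped(text, sep, release="?"):
    """Split text on sep, skipping separators escaped by the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            # Keep the escape pair intact so deeper split levels still see it.
            current.append(text[i:i + 2])
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release="?"):
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_edifact(message):
    """Return a list of {'tag', 'elements'} dicts, one per segment."""
    segments = []
    for raw in split_unescaped(message, "'"):
        raw = raw.strip()
        if not raw:
            continue
        elements = split_unescaped(raw, "+")
        data = [
            [unescape(comp) for comp in split_unescaped(elem, ":")]
            for elem in elements[1:]
        ]
        segments.append({"tag": elements[0], "elements": data})
    return segments


# Hypothetical two-segment message; "?+" is an escaped plus sign.
parsed = parse_edifact("UNH+1+INVOIC:D:96A:UN'BGM+380+INV?+001+9'")
```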

Filling table from several input files

I have the following scenario: several CSV files contain different columns of the same table. Can I fill the Redshift table from them somehow, ideally with the help of Data Pipeline? I couldn't find a way to achieve this. Can anyone help with a solution, or a simple example if it's possible?
You can do it by converting your CSV files into JSON format before loading them. If a particular JSON key is not found in a file, COPY will simply skip it.
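The conversion step could look roughly like this in Python: each CSV row becomes one JSON object per line, keyed by the CSV header names. This is a minimal sketch; the function name and sample columns are made up, and it assumes the header names match the target table's column names.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text):
    """Convert CSV text (with a header row) to one JSON object per line."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)


# Hypothetical file holding only two of the table's columns.
sample = "id,name\n1,Ann\n2,Bob\n"
json_lines = csv_to_json_lines(sample)
```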

Encode to Quoted-Printable in TSQL (or FreeMarker)?

I store the message part of lots of emails in an MsSql database. Before sending an email with the message I need to encode it into Quoted-Printable format. I don't encode it before saving it to db because I want to have the original message. And I don't want to have both the original and the encoded one in the db.
I'm using third-party software for sending mails, so my only options are to encode the messages when reading them from the database or to encode them in FreeMarker.
So, does anyone know how to encode the messages from TSQL or FreeMarker? Preferably a solution that doesn't involve buying a license.
The options you have are as follows:
Select the original email from SQL Server and then encode it in the client application.
Create an extended stored procedure or function using CLR.
Create a SQL function without using CLR. In this case you will have to implement all the Quoted-Printable rules yourself. This solution would be really messy and may not be very efficient either.
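To illustrate the first option, here is a minimal sketch of encoding in a client application, assuming that client is written in Python and can use the standard-library quopri module. The function name, the sample message and the UTF-8 charset are assumptions for the example.

```python
import quopri


def to_quoted_printable(message, charset="utf-8"):
    """Encode a message string as Quoted-Printable text."""
    encoded = quopri.encodestring(message.encode(charset))
    # Quoted-Printable output is pure ASCII, so this decode is safe.
    return encoded.decode("ascii")


# Hypothetical message body as read from the database.
original = "Grüße = ok"
encoded = to_quoted_printable(original)
```

The original text stays untouched in the database; only the outgoing copy is encoded.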

Import data from Excel File to Core Data

I have thousands of student records in an Excel sheet. I want to import all of that data into Core Data (from the Excel sheet to Core Data) and build my iPhone application on top of it.
I have no idea how to import Excel file data into Core Data.
You're thinking too broadly. You need to decompose this problem further. Here's your real problem:
How do I read an Excel file into memory?
How do I create Core Data objects?
"Excel" has nothing to do with "Core Data". They are entirely disjoint topics.
For the first question, there are several options. You could try to find a library that reads .xls or .xlsx files directly, or you could require that the file be in a different format (such as CSV).
For the second question, that's easily answered by reading the Core Data documentation.
I would convert the file to XML.
There are plenty of code examples showing how to parse XML.