Parsing email "Received:" headers - email

We need to parse Received: email headers according to RFC 5321. We need to extract the domains or IPs through which the mail has passed. We also need to figure out whether an IP is an internal IP.
Is there already a library that can help, preferably in C/C++?
For example:
Received: from server.mymailhost.com (mail.mymailhost.com [126.43.75.123])
by pilot01.cl.msu.edu (8.10.2/8.10.2) with ESMTP id NAA23597;
Fri, 12 Jul 2002 16:11:20 -0400 (EDT)
We need to extract the "by" server.

The format used by 'Received' lines is defined in RFC 2821 (since obsoleted by RFC 5321), and regex can't parse it.
(You can try anyway, and for a limited subset of headers produced by known software you might succeed, but when you attach this to the range of strange stuff found in real-world mail it will fail.)
Use an existing RFC 2821 parser and you should be OK, but otherwise you should expect failure, and write the software to cope with it. Don't base anything important like a security system around it.
We need to extract the "by" server.
'from' is more likely to be of use. The hostname given in a 'by' clause is as seen by the host itself, so there is no guarantee it will be a publicly resolvable FQDN. And of course you don't tend to get valid (TCP-Info) there.
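As a rough illustration of pulling the address out of the 'from' clause, here is a minimal sketch. It assumes the common "(name [a.b.c.d])" layout shown in the question and simply takes the first bracketed literal; `extract_bracketed_ip` is a hypothetical helper, not part of any library, and it is not an RFC 5321 parser, so expect it to fail on unusual headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first '[' and the
 * next ']' of a Received: value into out.
 * Returns 1 on success, 0 if no bracketed literal is found or if it
 * does not fit into out (including the terminating '\0'). */
int extract_bracketed_ip(const char *received, char *out, size_t outlen)
{
    assert(received != NULL && out != NULL);

    const char *open = strchr(received, '[');
    if (open == NULL)
        return 0;

    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return 0;

    size_t len = (size_t)(close - open - 1);
    if (len == 0 || len >= outlen)
        return 0;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 1;
}
```

The caller passes the header value and a buffer, and must check the return value before trusting the buffer contents.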

There is a Perl Received module which is a fork of the SpamAssassin code. It returns a hash for a Received header with the relevant information. For example
{ ip => '64.12.136.4',
id => '875522',
by => 'xxx.com',
helo => 'imo-m01.mx.aol.com' }

vmime should be fine; more or less any mail library will let you do that.

You could possibly use a regular expression:
(?<=by).*(?=with)
This will give you pilot01.cl.msu.edu (8.10.2/8.10.2)
Edit:
I find it amusing that this was modded down when it actually gets what the OP asked for.
C#:
string header = "Received: from server.mymailhost.com (mail.mymailhost.com [126.43.75.123]) by pilot01.cl.msu.edu (8.10.2/8.10.2) with ESMTP id NAA23597; Fri, 12 Jul 2002 16:11:20 -0400 (EDT)";
System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(@"(?<=by).*(?=with)");
System.Text.RegularExpressions.Match m = r.Match(header);
Console.WriteLine(m.Captures[0].Value);
Console.ReadKey();
I didn't claim that it was complete, but I am wondering whether the person who gave it a -1 even tried it. Meh..

You can use regular expressions. It would look something like this (not tested):
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
regex_t re;
/* group 1: host name after "by"; group 2: the parenthesised text that follows it
   (in the example header that is the MTA version, not an IP) */
const char *restr = "by ([A-Za-z0-9.-]+) \\(([^)]*)\\)";
check(regcomp(&re, restr, REG_EXTENDED | REG_ICASE) != 0, "regcomp");
size_t nmatch = 3; /* whole match plus the two groups */
regmatch_t matches[3];
int ret = regexec(&re, YOUR_STRING, nmatch, matches, 0);
check(ret != 0, "regexec");
size_t size;
size = matches[1].rm_eo - matches[1].rm_so;
char *host = malloc(size + 1); /* +1 for the terminating '\0' */
strncpy(host, YOUR_STRING + matches[1].rm_so, size);
host[size] = '\0';
size = matches[2].rm_eo - matches[2].rm_so;
char *info = malloc(size + 1);
strncpy(info, YOUR_STRING + matches[2].rm_so, size);
info[size] = '\0';
regfree(&re);
check is a macro that reports a failure and exits (regcomp and regexec do not set errno, so it prints only the description):
#define check(condition, description) do { if (condition) { fprintf(stderr, "%s:%i - %s\n", __FILE__, __LINE__, description); exit(1); } } while (0)

#include <string.h>

#define MAX_HEADERS 30

typedef struct mailHeaders {
    char name[100];
    char value[2000];
} mailHeaders;

int header_count = 0;
mailHeaders headers[MAX_HEADERS]; // Holds the name/value pairs

char *GetMailHeader(const char *name)
{
    int i;
    for (i = 0; i < header_count; i++) {
        if (strcmp(name, headers[i].name) == 0)
            return headers[i].value; // First (topmost) header with that name
    }
    return NULL;
}

// mail must be a writable, NUL-terminated buffer holding the message; strtok modifies it.
void ReadMail(char *mail)
{
    // Loop through the message line by line and save each header as a name/value pair.
    char *Received = NULL; // Received header
    char *line = NULL;     // A line of text in the email
    char *value = NULL;    // Header value
    int index = -1;        // Index of the header being filled

    memset(headers, 0, sizeof(headers)); // Clear the whole array, not just one element
    header_count = 0;
    line = strtok(mail, "\n");
    while (line != NULL) {
        if (*line == '\r') // Empty CRLF line: end of headers (strtok skips empty LF-only lines)
            break;
        if ((*line == '\t' || *line == ' ') && index >= 0) {
            // Folded continuation line: append it to the current header's value
            strncat(headers[index].value, line,
                    sizeof(headers[index].value) - strlen(headers[index].value) - 1);
        } else if ((value = strchr(line, ':')) != NULL && index + 1 < MAX_HEADERS) {
            *value = '\0'; // Replace the colon with NUL to split name from value
            value++;       // Move past the NUL to the start of the value
            index++;
            strncpy(headers[index].name, line, sizeof(headers[index].name) - 1);
            strncpy(headers[index].value, value, sizeof(headers[index].value) - 1);
            header_count = index + 1;
        }
        line = strtok(NULL, "\n"); // Get next line
    }
    Received = GetMailHeader("Received");
    (void)Received; // Process the Received value here
}

It is not difficult to parse such headers, even manually line-by-line. A regex could help there by looking at by\s+(\w)+\(. For C++, you could try that library or that one.

Have you considered using regular expressions?
Here is a list of internal, non-routable address ranges.

Xilinx Echo Server Data Variable

I want to have my Zedboard return a numeric value using the Xilinx lwIP example as a base but no matter what I do I can't figure out what stores the data received or transmitted.
I have found the void type payload but I don't know what to do with it.
Snapshot of one instance of payload and a list of lwIP files
Below is the closest function to my goal:
err_t recv_callback(void *arg, struct tcp_pcb *tpcb,
struct pbuf *p, err_t err){
/* do not read the packet if we are not in ESTABLISHED state */
if (!p) {
tcp_close(tpcb);
tcp_recv(tpcb, NULL);
return ERR_OK;
}
/* indicate that the packet has been received */
tcp_recved(tpcb, p->len);
/* echo back the payload */
/* in this case, we assume that the payload is < TCP_SND_BUF */
if (tcp_sndbuf(tpcb) > p->len) {
err = tcp_write(tpcb, p->payload, p->len, 1);
//I need to change p->payload, but I don't know where it is given a value.
} else
xil_printf("no space in tcp_sndbuf\n\r");
/* free the received pbuf */
pbuf_free(p);
return ERR_OK;
}
Any guidance is appreciated.
Thanks,
Turtlemii
- I cheated and just made sure that the function has access to Global_tpcb from echo.c.
- tcp_write() takes the address of a buffer and, it seems, sends each char it finds there.
void Print_Code()
{
    /* Prepare for TRANSMISSION */
    char header[] = "\rSwitch: 1 2 3 4 5 6 7 8\n\r"; //header text
    //area for storing the data: 23 spaces so that indices 8..22 exist
    //(the spacing was lost in the original post; this width is an assumption)
    char data_t[] = "                       \n\r\r";
    unsigned char mask = 0x80; //mask to decode switches (binary 10000000)
    swc_value = XGpio_DiscreteRead(&SWCInst, 1); //Save switch values
    /* Write switch values to the LEDs for visual. */
    XGpio_DiscreteWrite(&LEDInst, LED_CHANNEL, swc_value);
    for (int i = 0; i <= 7; i++) //load data_t with switch values (0/1)
    {
        data_t[8+2*i] = '0' + ((swc_value & mask) ? 1 : 0); //convert one bit to 0/1
        mask = mask >> 1; //move to next bit
    }
    int len_header = sizeof(header) - 1; //length of the header string, without the terminating '\0'
    int len_data = sizeof(data_t) - 1; //length of the data string, without the terminating '\0'
    tcp_write(Global_tpcb, header, len_header, 1); //send the header
    tcp_write(Global_tpcb, data_t, len_data, 1); //send the data
}

PKCS#11 C_Encrypt fails with Bad Arguments for 128 bit AES key

I generate a 128-bit AES key object using "C_CreateObject".
I then do the following to encrypt a piece of data, and I get a "Bad Arguments" error on the call to "C_Encrypt" that is meant to return the encrypted data length.
char clear[] = "My name is Eric!";
buf_len = sizeof(clear) -1;
rv = pfunc11->C_EncryptInit(session, pMechanism, hObject);
if (rv != CKR_OK)
{
printf("ERROR: rv=0x%08X: initializing encryption:\n", (unsigned int)rv);
return false;
}
rv = pfunc11->C_Encrypt(session, (CK_BYTE_PTR)clear, (CK_ULONG)buf_len, NULL, pulEncryptedDataLen);
if (rv != CKR_OK)
{
printf("ERROR: rv=0x%08X: error getting encryption data buffer length:\n", (unsigned int)rv);
return false;
}
What am I doing wrong here?
Here is my mechanism definition -
CK_MECHANISM myMechanism = {CKM_AES_CBC_PAD, (CK_VOID_PTR)"01020304050607081122334455667788", (CK_ULONG)16};
CK_MECHANISM_PTR pMechanism = &myMechanism;
Your pulEncryptedDataLen is probably NULL which causes CKR_ARGUMENTS_BAD.
It is better to use e.g.:
CK_ULONG ulEncryptedDataLen;
...
rv = pfunc11->C_Encrypt(session, (CK_BYTE_PTR)clear, (CK_ULONG)buf_len, NULL, &ulEncryptedDataLen);
The number of bytes sufficient to store encryption result of a single-part encryption gets stored into ulEncryptedDataLen.
Also please note that your way of passing IV value is not correct as "01020304050607081122334455667788" results in an ASCII string (giving IV as 30313032303330343035303630373038 -- which is probably not what you want).
To get correct IV use "\x01\x02\x03\x04\x05\x06\x07\x08\x11\x22\x33\x44\x55\x66\x77\x88" instead.
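If the IV arrives as hex text rather than as a literal in the source, it has to be decoded into raw bytes first. A minimal sketch of that step follows; `hex_to_bytes` and `hex_digit` are hypothetical helpers written for this illustration and are not part of PKCS#11.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decodes exactly 2*n hex digits from hex into
 * out[0..n-1]. Returns 1 on success, 0 if the length is wrong or a
 * non-hex character is found. */
int hex_to_bytes(const char *hex, unsigned char *out, size_t n)
{
    assert(hex != NULL && out != NULL);

    if (strlen(hex) != 2 * n)
        return 0;

    for (size_t i = 0; i < n; i++) {
        int hi = hex_digit(hex[2 * i]);
        int lo = hex_digit(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return 0;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return 1;
}
```

The decoded 16-byte buffer, not the 32-character text, would then be passed as the mechanism parameter with a length of 16.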
Good luck!

Generating DXL documentation using Doxygen : if is shown as a function

I am trying to generate some DXL documentation using Doxygen, but the results are often incorrect. DXL is a scripting language with a C/C++-like syntax, with some differences; for example, semicolons can be omitted, which causes problems when generating the documentation. What should I do to correct this?
Here is an example from my DXL code base:
string replace (string sSource, string sSearch, string sReplace) {
int iLen = length sSource
if (iLen == 0) return ""
int iLenSearch = length(sSearch)
if (iLenSearch == 0) {
return ""
}
char firstChar = sSearch[0]
Buffer s = create()
int pos = 0, d1,d2;
int i
while (pos < iLen) {
char ch = sSource[pos];
bool found = true
if (ch != firstChar) {pos ++; s+= ch; continue}
for (i = 1; i < iLenSearch; i++) {
if (sSource[pos+i] != sSearch[i]) { found = false; break }
}
if (!found) {pos++; s+= ch; continue}
s += sReplace
pos += iLenSearch
}
string result = stringOf s
delete s
return result }
As I said, the main difference from C that may cause Doxygen to interpret this code incorrectly is that in DXL we don't have to use ";".
Thanks in advance.
You must do three things to apply Doxygen successfully on DXL scripts:
1.) In Doxygen-GUI, 'Wizard' tab, section 'Mode' choose 'Optimize for C or PHP'
2.) The DXL code must be C-conforming, i.e. each statement ends with a semicolon ';'
3.) In tab 'Expert' set language mapping for DXL and INC files in section 'Project' under 'EXTENSION_MAPPING':
dxl=C
inc=C
This all tells Doxygen to treat DXL scripts as C code.
Further, for DOORS to recognize a DXL file documented for Doxygen as valid and bind it to a menu item, the file must follow a certain header structure, consisting of a single-line comment followed by a multi-line comment, e.g.
// <dxl-file>
/**
* @file <dxl-file>
* @copyright (c) ...
* @author Th. Grosser
* @date 01 Dec 2017
* @brief ...
*/

Losing values with iterative realloc in C

I am working in C with NetBeans 8.0.
I have to read files iteratively to get a list of words. That is, in each iteration a file is read into an array of strings, and that array is then merged into a single array.
void merge_array(char** a,int* M, char** b,int N)
{
//............. Add extra memory to a ..............*/
void *tmp = realloc(a, (*M+N) * sizeof(*a));
if (tmp == NULL)
{
perror("Merging -> Could not reallocate");
exit(EXIT_FAILURE);
}
a = tmp;
memset(a+(*M), 0, N*sizeof(*a));
//............. copy strings in b to a ..............*/
int i,j=0;
for(i=*M; i<((*M)+N); i++)
{
size_t wlen = strlen(b[j]);
a[i] = malloc((wlen+1) * sizeof(char));
if (a[i] == NULL)
{
perror("Failed to replicate string");
exit(EXIT_FAILURE);
}
memcpy(a[i], b[j], wlen+1);
j++;
}
(*M) = (*M)+N; // resetting the count
printf("Confirm - %s, %d\n",a[0],*M);
}
The function above merges the words read from one file into the array. In main it is called iteratively, so that everything is merged into a single array named 'termlist'. The main code is given below:
char** termlist;
int termCount=0;
while(files[i]){
char **word_array;
int wdCnt,a;
char* tmp = (char*) malloc(strlen(path)*sizeof(char));
strcpy(tmp,path); strcat(tmp,files[i]); strcpy(files[i],tmp);
printf("\n\n******* Reading file %s...\n",files[i]);
word_array = getTerms_fscanf(files[i],&a); //reading contents of file
wdCnt = a;
if(i==0) // before reading the first file initializing the termlist
{
termlist = (char**) malloc(wdCnt*sizeof(char*));
}
merge_array(termlist,&termCount,word_array,wdCnt);
printf("CHECK - %s, %d\n",termlist[0],termCount);
free(word_array);
++i;
}
Now the problem is this:
After the first two iterations, everything works fine inside the function, but in main the values of termlist[0] and termlist[1] turn out to be junk. That is, the first two words read from the first file are lost. The third iteration fails at the merge_array function call.
Output is
******* Reading F:/Netbeans C/Test Docs/doc1.txt...
Confirm - tour, 52
CHECK - tour, 52
******* Reading F:/Netbeans C/Test Docs/doc2.txt...
Confirm - tour, 71
CHECK - Ôk'aÔk'a`œ€`œ€äk'aäk'aìk'aìk'aôk'aôk'aük'aük'ah“€, 71
I am not able to identify the problem. Please help.
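Judging only from the code shown, one likely cause is that merge_array receives `a` by value: when realloc moves the block, only the local copy is updated, and termlist in main still points at the freed memory. A minimal sketch of one way around that, assuming the caller is changed to assign the return value, is below; `merge_array_ret` is a hypothetical variant, not the asker's function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of merge_array: appends copies of the N strings
 * in b to the array a (currently *M entries) and returns the possibly
 * moved array. The caller must store the returned pointer. */
char **merge_array_ret(char **a, int *M, char **b, int N)
{
    assert(M != NULL && *M >= 0 && N >= 0);
    if (N == 0)
        return a; /* nothing to add, keep the array as it is */

    char **grown = realloc(a, (size_t)(*M + N) * sizeof *grown);
    if (grown == NULL) {
        perror("Merging -> Could not reallocate");
        exit(EXIT_FAILURE);
    }

    for (int j = 0; j < N; j++) {
        size_t wlen = strlen(b[j]);
        grown[*M + j] = malloc(wlen + 1);
        if (grown[*M + j] == NULL) {
            perror("Failed to replicate string");
            exit(EXIT_FAILURE);
        }
        memcpy(grown[*M + j], b[j], wlen + 1); /* copy including '\0' */
    }

    *M += N; /* update the caller's count */
    return grown;
}
```

With this shape the call in main would become `termlist = merge_array_ret(termlist, &termCount, word_array, wdCnt);`, and termlist can start out as NULL because realloc(NULL, n) behaves like malloc(n). Separately, the buffer in main allocated with `strlen(path)` bytes looks too small for path plus the file name plus the terminator, which could also corrupt memory.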

How do I send XML as the email body from a native iPhone app?

I am writing an app that ultimately wants to send some XML via email.
I have the mailto/URL thing sussed, thanks to various links on the interweb, including Brandon and Simon Maddox.
So I can send emails with the XML formatted using square brackets ([ ]) rather than the usual angle brackets (< >). But when I send angle brackets, with the XML mangled using the stringByAddingPercentEscapesUsingEncoding call, it treats it as HTML and just prints the values.
If I change them to "& lt;" and "& gt;" then it totally strips the XML out... (I know there should not be a space after the & - but the SO formatter turns them into <,>...)
I tried adding some HTML in front to see if that helped, to no avail.
I don't suppose anyone has done this?
Perhaps in-app email is the easy route for me to go... must look into that.
Thanks in advance.
The following code worked for me... I have SIP message data containing <> that needed escaping.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int must_escape(char c);

/* remember to call urlEscapeStringDone to free the malloced string.. */
char *urlEscapeString(const char *str)
{
    size_t i, l = 0, len = strlen(str);
    char *escStr;
    /* worst case: every character is '<' or '>' and becomes 8 characters */
    escStr = calloc(len*8 + 1, 1);
    if(!escStr) return NULL;
    for(i = 0; i < len; i++)
    {
        char c = str[i];
        /* < and > handling for HTML interpreters.. (apple mail) */
        if(c == '<')
        {
            memcpy(escStr + l, "%26lt%3b", 8);
            l += 8;
        }
        else if(c == '>')
        {
            memcpy(escStr + l, "%26gt%3b", 8);
            l += 8;
        }
        else if(must_escape(c))
        {
            /* go through unsigned char so bytes >= 0x80 still give two hex digits */
            sprintf(escStr + l, "%%%02x", (unsigned) (unsigned char) c);
            l += 3;
        }
        else
        {
            escStr[l] = c;
            l++;
        }
    }
    printf("escaped: %s\n", escStr);
    return escStr;
}
void urlEscapeStringDone(char *str)
{
    if(str) free(str);
}
int must_escape(char c)
{
    const char *allowedChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._";
    if(!strchr(allowedChars, c)) return 1;
    return 0;
}
Did you try replacing all the '<' and '>' characters with '&lt;' and '&gt;' after you had wrapped it in the basic HTML headers?
As I understand it, this is the usual technique to display XML on a web page.
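A minimal sketch of that entity replacement in C, assuming the XML is a NUL-terminated string and that '&' should be escaped as well so existing entities survive; `html_escape` is a hypothetical helper, not an iOS or libc API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of xml in which '&',
 * '<' and '>' are replaced by "&amp;", "&lt;" and "&gt;".
 * The caller frees the result; NULL is returned if allocation fails. */
char *html_escape(const char *xml)
{
    assert(xml != NULL);

    size_t len = strlen(xml);
    char *out = malloc(len * 5 + 1); /* worst case: all '&' -> "&amp;" */
    if (out == NULL)
        return NULL;

    size_t l = 0;
    for (size_t i = 0; i < len; i++) {
        const char *rep = NULL;
        switch (xml[i]) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(out + l, rep, n);
            l += n;
        } else {
            out[l++] = xml[i];
        }
    }
    out[l] = '\0';
    return out;
}
```

The escaped text would still need percent-escaping afterwards if it is placed into a mailto: URL.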