perl smtp datasend - perl

I'm testing a program that I'm writing in Perl to send automated emails by sending them to myself first, and I am noticing that all the newlines and tabs (\n and \t) that I am putting in the emails are turning up in Outlook as spaces when I read the emails. Any idea what could be going on here?

"\n" is the Unix end-of-line.
I think you need to use "\r\n", the Windows line ending.

There is a feature in Outlook to discard the extra lines. The alert is hard to see in the message. It should be above the From line when you double-click and open the message.
It should say something like "Extra line breaks in this message were removed -> Restore line breaks"
To prevent this problem, try formatting your lines with a carriage return and line feed (\r\n) instead.
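The normalization suggested above can be sketched as follows. This is a minimal illustration in Python rather than Perl, and the function name `to_crlf` is hypothetical; the same substitution idea should carry over to other languages.

```python
import re

def to_crlf(text: str) -> str:
    """Rewrite every line ending (CRLF, bare CR, or bare LF) as CRLF."""
    # The alternation tries "\r\n" first, so existing CRLF pairs are
    # matched as a unit and are not doubled.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)

body = to_crlf("Dear user,\nyour report is ready.\n")
```

Running the body through such a function just before handing it to the mail library keeps the rest of the program free to use "\n" internally.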

For starters, I would advise avoiding tab (\t) characters in email messages; there's no standardization among email clients controlling how they're displayed.

Related

JavaMail, adding line breaks in plain text emails

I am using the JavaMail API to send "text/plain" content-type emails over SMTP. I am using email templates that are stored in a database. In order to put line breaks in the email body, I am using \r\n. However, when the email is received, the \r\n are not converted to line breaks; instead they appear as the text \r\n.
For example:
This line is followed by a carriage return.\r\nThis is a new line.
in the template email body appears in the received email as
This line is followed by a carriage return.\r\nThis is a new line.
instead of
This line is followed by a carriage return.
This is a new line.
I have tried using just \n and that too does not work. How can I resolve this problem?
I had the same problem.
When I add a space before the newline, it works.
But that means creating a "pre" tag for each line.
If the template contains the line breaks as separate backslash and 'n' characters, you're going to need to do something to convert it to a real "newline" character. Ditto \r. Better would be to store the template with real carriage return and newline characters to begin with.
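As a rough sketch of the conversion described above, assuming the stored template really contains a backslash followed by the letter 'r' or 'n' (the function name `unescape_line_breaks` is hypothetical, and Python is used only for illustration):

```python
def unescape_line_breaks(template: str) -> str:
    """Turn literal backslash sequences into real CRLF line breaks."""
    # Handle the two-escape sequence first so that it becomes a single
    # CRLF rather than two separate line breaks.
    text = template.replace("\\r\\n", "\r\n")
    # Any remaining lone escapes are also mapped to CRLF.
    text = text.replace("\\n", "\r\n")
    text = text.replace("\\r", "\r\n")
    return text

stored = "This line is followed by a carriage return.\\r\\nThis is a new line."
body = unescape_line_breaks(stored)
```

This is deliberately naive: it would also rewrite a template that legitimately needs to show the characters \n as text. Storing real line breaks in the template avoids that ambiguity.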

SMTP dot stuffing.. when and where to do it?

I have found conflicting information about dot stuffing when transmitting an email.
1. Stuff a dot if the line contains only a single dot (to avoid premature termination).
2. Stuff a dot on every line that starts with a dot.
3. Stuff a dot as in (1), and on every line that is part of a quoted-printable message part only.
Can anyone clarify?
According to the SMTP standard RFC 5321, section 4.5.2:
https://www.rfc-editor.org/rfc/rfc5321#section-4.5.2
To allow all user composed text to be transmitted transparently, the following procedures are used:
Before sending a line of mail text, the SMTP client checks the first character of the line. If it is a period, one additional period is inserted at the beginning of the line.
When a line of mail text is received by the SMTP server, it checks the line. If the line is composed of a single period, it is treated as the end of mail indicator. If the first character is a period and there are other characters on the line, the first character is deleted.
So, from the three points of your question, the second one is right.
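The client-side rule quoted from the RFC can be sketched like this. It is a minimal illustration that assumes the body already uses CRLF line endings; the names `dot_stuff` and `frame_for_data` are hypothetical.

```python
def dot_stuff(body: str) -> str:
    """Prefix one extra period to every line that starts with a period."""
    lines = body.split("\r\n")
    return "\r\n".join("." + line if line.startswith(".") else line
                       for line in lines)

def frame_for_data(body: str) -> str:
    """Dot-stuff the body and append the end-of-data marker."""
    stuffed = dot_stuff(body)
    if not stuffed.endswith("\r\n"):
        stuffed += "\r\n"
    # A lone period on its own line tells the server the mail data is done.
    return stuffed + ".\r\n"
```

Note that the check looks only at the first character of each line, so a line consisting of a single dot and a line such as ".hidden" are handled by the same rule.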
The practical answer: If you're using quoted printable format then always translate a dot to =2E. You can't rely on all smtp servers doing the dot removal correctly.
If you want to assume the whole world is standards compliant then go with answer 2 above.
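One possible reading of the quoted-printable suggestion is to encode only dots that begin a line, since those are the ones dot stuffing affects. The sketch below makes that assumption; the function name `protect_leading_dots` is hypothetical, and the input is assumed to be an already encoded quoted-printable body with CRLF line endings.

```python
def protect_leading_dots(qp_body: str) -> str:
    """Replace a period at the start of a line with its QP escape, =2E."""
    lines = qp_body.split("\r\n")
    return "\r\n".join("=2E" + line[1:] if line.startswith(".") else line
                       for line in lines)

protected = protect_leading_dots("a\r\n.\r\n.b")
```

Each replacement makes the line two characters longer, so a line that was already at the 76-character limit would need to be re-wrapped afterwards.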
In SMTP protocol the mail is terminated by a single dot and a newline character(s)
In simple terms something like:
\r\n.\r\n
The characters:
CR LF DOT CR LF
Which corresponds to a single dot at the beginning of a line.
If the mail data contains a single "." at the beginning of a line, followed by a newline, the SMTP server will treat it as the end of the mail, and only part of the mail will be delivered.
So the whole idea is to avoid this type of situation by padding with an extra dot.
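The receiving side can be sketched as follows, to show why an unstuffed lone dot truncates the message. This is an illustration only: `receive_data` is a hypothetical name, and the input is assumed to be an iterable of lines with the CRLF already stripped.

```python
def receive_data(lines):
    """Collect mail data lines until the end-of-data marker is seen."""
    body = []
    for line in lines:
        if line == ".":
            # A lone period ends the mail; anything after it is not
            # part of this message.
            return body
        if line.startswith("."):
            # Undo the sender's dot stuffing.
            line = line[1:]
        body.append(line)
    raise ValueError("input ended before the end-of-data marker")

# The sender forgot to stuff the lone dot, so "lost" is never delivered.
truncated = receive_data(["first", ".", "lost", "."])
```

With stuffing applied, the same content arrives as "first", "..", "lost", "." and the receiver recovers all three lines.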

why is there a '=' at the end of a SMTP message body?

I receive email messages over sockets and see that long lines in the message body are broken up, separated by the following expression
'=\r\n'
I cannot find any documentation on this and wonder if someone just happens to know where I can find information on this behavior.
Also, please ONLY feedback on my question, no comments regarding email and sockets!
Thanks
Alex
From Wikipedia, regarding Quoted-printable:
Lines of quoted-printable encoded data must not be longer than 76 characters. To satisfy this requirement without altering the encoded text, soft line breaks may be added as desired. A soft line break consists of an "=" at the end of an encoded line, and does not appear as a line break in the decoded text.
The \r\n that follows the "=" is part of the soft line break: the encoder inserts the "=" together with the line ending, and a decoder removes both, so neither appears as a line break in the decoded text.
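A short sketch of what decoding does with these soft line breaks, using Python's standard `quopri` module (the wrapper name `decode_qp` is hypothetical):

```python
import quopri

def decode_qp(data: bytes) -> bytes:
    """Decode quoted-printable bytes, removing soft line breaks."""
    return quopri.decodestring(data)

raw = b"long lines in the message body are bro=\r\nken up"
text = decode_qp(raw)
```

After decoding, the "=" and the line ending that follows it are gone and the two halves of the line are joined again.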

Quoted Printable Email showing equal signs in certain email clients

I'm generating emails. They show up fine for me in Gmail and Outlook 2010. However, my client sees the = sign that the quoted-printable encoding adds to the end of lines. Her mail client also eats the first character of the next line while still displaying the equal sign.
Example:
line that en=
ds like this
shows up like
line that en=s like this
(Note: The EOL character in my emails is just LF. No CR.)
I'm confirming what outlook version my client is using, but I think it's 2007. The email headers from her appear to come through Exchange 6.5.
My emails are created in php using the HtmlMimeMail5 library. They are multipart emails, with the applicable section sent with:
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
It appears I could just make sure nothing in my email reaches the line wrap at 76 characters, but that seems like the wrong way to solve the problem. Should the EOL character be different? (In emails from the client, the EOL character is simply a LF) Any other ideas?
I do not know what the PHP library does, but in the end MIME mail must contain CR LF line endings. Obviously the client notices that = is not followed by a proper CR LF sequence, so it assumes that it is not a soft line break, but a character encoded in two hex digits, therefore it reads the next two bytes. It should notice that the next two bytes are not valid hex digits, so its behavior is wrong too, but we have to admit that at that point it does not have a chance to display something useful. They opted for the garbage in, garbage out approach.
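The behavior described above can be modeled with a small decoder. This is a hypothetical model of what the client might be doing, not the actual Outlook or Exchange code: it accepts only "=" plus CRLF as a soft line break, and otherwise consumes the next two bytes as hex digits.

```python
HEX_DIGITS = b"0123456789ABCDEFabcdef"

def strict_qp_decode(data: bytes) -> bytes:
    """Decode QP, treating only '=' followed by CRLF as a soft break."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] != b"=":
            out.append(data[i])
            i += 1
            continue
        pair = data[i + 1:i + 3]
        if pair == b"\r\n":
            pass  # proper soft line break: emit nothing
        elif len(pair) == 2 and pair[0] in HEX_DIGITS and pair[1] in HEX_DIGITS:
            out.append(int(pair, 16))  # normal =XX escape
        else:
            out += b"="  # invalid escape: keep "=", swallow the two bytes
        i += 3
    return bytes(out)
```

Fed the LF-only example from the question, this model swallows the line feed and the "d" and yields "line that en=s like this"; with a CRLF after the "=" it yields "line that ends like this".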

Why is it so important for CR and LF to appear together in Email?

From http://www.faqs.org/rfcs/rfc2822.html:
CR and LF MUST only occur together as CRLF; they MUST NOT appear independently in the body.
We have a web service that sends out confirmation emails, but one of our users pointed out that this does not adhere to the rfc2822 standard. So my question is, why is it important for CR and LF to appear together in email messages?
Because it's in the accepted RFC?
Implementations are derived from RFCs. If that was not the case, then there would be no guarantee of interoperability between different implementations. There may or may not be tangible, technical reasons of requiring them to appear together, but in this case those reasons are irrelevant. It's a simple matter of "because they said so."
Because in email CRLF is the line separator. If you use only CR or only LF, you will have all sorts of unexpected problems with various client and SMTP server combinations. Some servers will reject your emails, some will "fix" your emails. Fixed emails are some of the most fun to deal with.
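To check whether outgoing messages violate this rule before a server rejects or "fixes" them, something like the following sketch could be used. The function name `find_bare_line_endings` is hypothetical.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_LINE_ENDING = re.compile(rb"\r(?!\n)|(?<!\r)\n")

def find_bare_line_endings(message: bytes) -> list:
    """Return the byte offsets of every CR or LF that is not part of CRLF."""
    return [m.start() for m in BARE_LINE_ENDING.finditer(message)]

offsets = find_bare_line_endings(b"a\nb\r\nc\rd")
```

An empty list means every CR and LF in the message occurs as part of a CRLF pair.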
Think in terms of an old teletype. CR returns the write head to the beginning of the line, LF rolls the paper one line forward. You need both steps to begin a new line. If you use CR without LF, you will overwrite the same text, which is of course illegal.
Anyway, this is the historical reason for defining CR+LF as the ASCII code for a new line. Of course, in the end these are just arbitrary codes. Some systems use only CR to indicate a new line, some systems use only LF, some use a different character entirely. RFC 2822 had to choose one, and decided to allow only the sequence CRLF.
Since the RFC decided to use CRLF, it makes sense to disallow CR or LF separately, since this would be pretty useless and problematic to handle anyway.
If not you end up with a CR, which puts you on the same line, then whatever you write would be on top of the chars at the left on the same line, then comes the LF and you are in some column towards the middle and start writing again. Messy.