Before I start thinking about this programmatically, does anyone know if it is actually possible to extract the real URL from an email link that goes through a tracking service?
Our work email system automatically blocks tracking URLs in email, so I am thinking of writing something that lets people paste the tracking link into a program and get the real URL back.
Is this even possible with the way that email tracking works?
Here is an example of a URL from an email that I recently received:
http://t.dripemail2.com/c/eyJhY2NvdW50X2lkIjoiNTE0MTQ4NSIsImRlbGl2ZXJ5X2lkIjoiOTI0NzI2MTU0IiwidXJsIjoiaHR0cHM6Ly93d3cuYXhzaWVkLmNvbS9nY3NlLWNvbXB1dGVyLXNjaWVuY2Uvb2NyLW5lYS1ndWlkZS8_X19zPXphb2txcDVpaWN4NGkxZndtYmNnIn0
Our system blocks these. It eventually resolves to:
https://www.axsied.com/gcse-computer-science/ocr-nea-guide/?__s=zaokqp5iicx4i1fwmbcg
(got our network admin to check it for me)
I want a system that gets the right url from the ugly mess that is blocked so we can actually view links from emails.
Thanks in advance for any help.
The data in a tracking URL is typically either a unique ID pointing to some entry in a database or a value encrypted with a private key, so in those cases there's no way to obtain any meaningful information from it. (See the answers to this related question: Generate unique link for each website visitor.)
More naive approaches simply encode the data, in which case you may be able to extract useful information from it. Funnily enough, the last path segment of your example URL is a Base64-encoded JSON object (URL-safe alphabet, padding stripped) containing the link itself:
{
"account_id": "5141485",
"delivery_id": "924726154",
"url": "https://www.axsied.com/gcse-computer-science/ocr-nea-guide/?__s=zaokqp5iicx4i1fwmbcg"
}
In this case you could actually resolve the URL on your own, but this type of approach is uncommon for that very reason.
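For links built this way, the decoding can be scripted. A minimal sketch in Python, assuming the payload is the last path segment, uses the URL-safe Base64 alphabet with the padding stripped, and holds a JSON object with a "url" field, as in the example above. The function name is made up for illustration, and other tracking services will likely use different layouts:

```python
import base64
import json
from urllib.parse import urlsplit


def extract_target_url(tracking_url):
    """Return the embedded "url" field, or None if the payload cannot be decoded."""
    # Assumption: the encoded payload is the last segment of the path.
    payload = urlsplit(tracking_url).path.rsplit("/", 1)[-1]
    # Restore the "=" padding that was stripped from the Base64 text.
    payload += "=" * (-len(payload) % 4)
    try:
        data = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # bad Base64, bad UTF-8 and bad JSON all derive from ValueError
        return None
    return data.get("url") if isinstance(data, dict) else None


tracking = (
    "http://t.dripemail2.com/c/"
    "eyJhY2NvdW50X2lkIjoiNTE0MTQ4NSIsImRlbGl2ZXJ5X2lkIjoiOTI0NzI2MTU0IiwidXJsIjoi"
    "aHR0cHM6Ly93d3cuYXhzaWVkLmNvbS9nY3NlLWNvbXB1dGVyLXNjaWVuY2Uvb2NyLW5lYS1ndWlk"
    "ZS8_X19zPXphb2txcDVpaWN4NGkxZndtYmNnIn0"
)
print(extract_target_url(tracking))
```

A link whose payload is an opaque database ID or an encrypted value will simply return None here; those can only be resolved by following the redirect.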
I am currently using the following link in my email:
*|baseUrl|*/verifyEmail?token=*|token|*
This, however, causes one or two people to get strange links in the email that lead to a "not found" page, usually with less common email providers. E.g., if I use a 10 minute mail address (10minutemail.com), I get the following:
https://10minutemail.com/10MinuteMail/www.mywebsite.com/verifyEmail?token=b32fee82da59e7b4085269faca35ec7025122876
Correct link: www.mywebsite.com/verifyEmail?token=b32fee82da59e7b4085269faca35ec7025122876
Assuming this is due to baseUrl? Am I doing something fundamentally wrong when setting up my email link?
You need to include http:// or https:// with your baseUrl. Otherwise the email client may prepend a default base address instead of 'just' the missing protocol, especially if it is a webmail client.
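If the link is assembled in code before it reaches the template, a small guard can make sure the scheme is always present. This is only a sketch: the function name and the https default are assumptions, and the /verifyEmail path is taken from the question.

```python
from urllib.parse import quote


def build_verify_link(base_url, token, default_scheme="https"):
    """Build an absolute verification link, adding a scheme if base_url lacks one."""
    if "://" not in base_url:
        # Without a scheme, webmail clients may treat the link as relative.
        base_url = f"{default_scheme}://{base_url}"
    return f"{base_url.rstrip('/')}/verifyEmail?token={quote(token, safe='')}"


print(build_verify_link("www.mywebsite.com", "b32fee82da59e7b4085269faca35ec7025122876"))
```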
I'm trying to work out how difficult it would be to set up an email server that will accept a URL as the subject of an email and respond with an attached copy of said webpage, or element(s) of that webpage (i.e., an image from the page, or all of the videos on the page).
I don't necessarily need the code written for me, but would appreciate if someone could suggest a starting point.
I have very little web-programming knowledge (some C++, some Actionscript), which is partly why I don't even know where to begin.
There are several ways to achieve this.
In most Unix MTAs you can set up an alias to pipe all messages for some address through a program.
This program needs to parse the message headers for the "From" and "Subject" fields, fetch the URL, and send the result back.
You can also do this with a program like fetchmail, so you don't even need to set anything up on the server side.
Finally, several languages have wonderful libraries to fetch the mail using POP3, parse it, fetch the URL from the subject, and compose a new mail message. It should be no more than 100 lines of code in Perl or Python.
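As a starting point, the "pipe through a program" variant could look roughly like this in Python, using only the standard library. It is a sketch under assumptions: the sender address, the attachment name, and the local SMTP relay on localhost are placeholders, and it attaches the raw page only (extracting images or videos would need an HTML parser on top).

```python
import smtplib
import sys
import urllib.request
from email import message_from_string, policy
from email.message import EmailMessage
from email.utils import parseaddr


def fetch_url(url):
    """Download the resource; only http(s) URLs are accepted."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"refusing to fetch non-HTTP URL: {url!r}")
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def build_reply(raw_message, fetch=fetch_url, sender="fetcher@example.com"):
    """Parse an incoming message and build a reply with the page attached."""
    incoming = message_from_string(raw_message, policy=policy.default)
    _, reply_to = parseaddr(str(incoming.get("From", "")))
    url = str(incoming.get("Subject", "")).strip()

    reply = EmailMessage()
    reply["From"] = sender  # placeholder address
    reply["To"] = reply_to
    reply["Subject"] = f"Re: {url}"
    reply.set_content(f"Attached is a copy of {url}")
    reply.add_attachment(
        fetch(url),
        maintype="application",
        subtype="octet-stream",
        filename="page.html",  # placeholder file name
    )
    return reply


def main():
    """Entry point for the script the MTA alias pipes each message into."""
    reply = build_reply(sys.stdin.read())
    with smtplib.SMTP("localhost") as smtp:  # assumes a local relay
        smtp.send_message(reply)
```

The fetch function is passed in as a parameter so the parsing and reply-building can be tried out without touching the network.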
I have a web application which uses URLs that look like this:
http://library.example.com/Register.aspx?query=academic&key=586c70bb-5683-419c-aae9-e596af9ab66a
(The GUID is used instead of a plain int to discourage guessing, which is all we need for now.)
The problem: that long URL frequently breaks when sent via email. It's humans sending the emails, so I can't control the formatting. Sometimes it's the sending email program at fault, sometimes the receiving, but regardless I'm spending too much time on talking people through fixing problems.
Everything has to be from this domain, so I can't use a third-party shortener. I could host my own, but that seems like a kludge.
Any suggestions?
Edits
@Sunny: Thanks for elaborating, but my situation differs from what you assume. A corporate customer (of mine) passes this URL to its employees, and they use it to get to a branded Registration page. They need to give a working email as part of registration, and that gets forwarded to the corporate supervisor.
Registration gets them access to a database, but what they see is not specific to the corporate customer. So the occasional interloper is not a big deal; when they get weeded out by the corporate supervisor, we invite them to subscribe.
@Everybody: the email breakage is not on the punctuation (?&=), but at some predetermined line length. Surprised me, too. Note that the domain name is long, as is the path to the virtual directory, which is part of the problem.
After reading the responses, I'm going to use base64 as a pseudo-shortener, something like:
http://a.MyLongDomainName.com/?q=a&key=base64_encoded_GUID
...and see if that survives. Thanks to all.
You can at least shorten it a bit. Right now, you're sending a GUID, which is a 128-bit number, in a format that is essentially hexadecimal with extra dashes. If you view the GUID as a byte array and convert it to Base64, you can cut things down a bit. Likewise, "query=academic" could be "q=a".
The GUID is currently taking up 36 characters. Converting to Base-64 cuts this down to 22, saving 14 chars. Replacing "query=academic&key=" with "q=a&k=" shaves off another 13. Cutting a total of 27 characters may well keep your URL short enough not to wrap, despite the presence of ampersands and equal signs.
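One more detail: the Base-64 text of a 16-byte GUID is 24 characters long and ends with "==", which would then be percent-encoded as "%3D%3D". The solution is to cut the padding off, because it carries no data; that leaves the 22 characters counted above. Standard Base-64 can also contain "+" and "/", so use the URL-safe alphabet ("-" and "_" instead) to avoid further escaping.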
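The compaction is easy to try out. Here is a short Python sketch of the idea (the function names are mine, and the question's application is ASP.NET, so treat this as an illustration rather than drop-in code). Note that .NET's Guid.ToByteArray() orders the first three fields differently from the textual form, so the encoded text can differ between platforms unless both sides agree on the byte order.

```python
import base64
import uuid


def guid_to_key(guid_text):
    """Encode a GUID as 22 URL-safe Base64 characters, with padding removed."""
    raw = uuid.UUID(guid_text).bytes  # 16 bytes, in the order of the hex text
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def key_to_guid(key):
    """Reverse guid_to_key: restore the padding and rebuild the GUID."""
    padded = key + "=" * (-len(key) % 4)
    return str(uuid.UUID(bytes=base64.urlsafe_b64decode(padded)))


key = guid_to_key("586c70bb-5683-419c-aae9-e596af9ab66a")
print(key, len(key))  # WGxwu1aDQZyq6eWWr5q2ag 22
print(key_to_guid(key))
```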
With credit to the original posters, it looks like the best bet is a combination of things:
Compact GUID with base-64.
Shorten key names and, if possible, values.
Wrap URL in angle brackets to encourage client to parse it properly.
If possible, replace key names with URL-rewriting, so that it looks like a path.
If you can't use a third-party URL shortener, then your only option (besides changing the URL structure, as Sunny suggested) is to surround your URL with angle brackets, like this:
<http://library.YourDomainNameHere.com/Register.aspx?query=academic&key=586c70bb-5683-419c-aae9-e596af9ab66a>
Any email client that follows the guidelines found in the Uniform Resource Identifiers (URI): Generic Syntax document should display a clickable link. This is not a fool-proof solution, however, and you'll likely end up resorting to a URL shortening service or restructuring your URLs.
The only alternative to installing your own shortener service (which would be the ideal solution IMO), may be base64 encoding of the whole URL (and using a shorter key). But that would increase string length by 33% (very likely to break in E-Mail clients as well), and look ugly.
I would go with building a URL shortener service that shortens URLs on demand to something like this:
http://library.example.com/go/586c70bb-5683-419c-aae9-e596af9ab66a
There are some prepackaged URL shorteners that you could host on your own. Here's a CodePlex search:
http://www.codeplex.com/site/search?query=url%20shortener
This will give you the ability to keep your short URLs in-house.
Alternatively, you could somehow implement a RESTful URL that would be a lot harder to screw up:
http://library.example.com/Register/Academic/586c70bb-5683-419c-aae9-e596af9ab66a
This solution should work better than the query string simply because what usually breaks in email clients is the ?, the =, and the &.
I personally think a RESTful solution is best, as it creates the cleanest URLs that still make "some" sense.
How about replacing the GUIDs with YouTube-style keys?
e.g. http://library.example.com/Register.aspx?q=academic&k=jkGlkNu8
By using base-64 strings (instead of GUIDs, which are base-16) and dropping those pesky dashes, you can pack a decent range of unique keys into a small number of characters.
What about a combination of the methods described here?
Combining shorter URLs with Base64 encoding of the key would turn
http://library.example.com/Register.aspx?query=academic&key=586c70bb-5683-419c-aae9-e596af9ab66a
into
http://l.example.com/register/ac/WGxwu1aDQZyq6eWWr5q2ag
Much more readable, IMO. And lack of chars like ? and & reduces the risk of cut'n'paste errors.
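To make the transformation concrete, here is a rough Python sketch that rewrites the long form into the short one. The abbreviation table and the function name are assumptions for illustration; the short host and path layout are the ones from the example above, and the server would need matching routing to map the path back.

```python
import base64
import uuid
from urllib.parse import parse_qs, urlsplit

# Hypothetical abbreviation table for the "query" values.
QUERY_CODES = {"academic": "ac"}


def shorten(long_url, short_host="l.example.com"):
    """Turn the query-string URL into a short, path-style URL."""
    parts = urlsplit(long_url)
    params = parse_qs(parts.query)
    code = QUERY_CODES[params["query"][0]]
    guid = uuid.UUID(params["key"][0])
    # 16 raw bytes become 22 URL-safe Base64 characters once padding is dropped.
    key = base64.urlsafe_b64encode(guid.bytes).rstrip(b"=").decode("ascii")
    return f"{parts.scheme}://{short_host}/register/{code}/{key}"


long_url = (
    "http://library.example.com/Register.aspx"
    "?query=academic&key=586c70bb-5683-419c-aae9-e596af9ab66a"
)
print(shorten(long_url))  # http://l.example.com/register/ac/WGxwu1aDQZyq6eWWr5q2ag
```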
A RESTful URL like:
http://www.yourdomainhere.com/register/academic/{userName_here}
might help IMO.
If the user is not registered, this will do it & return a message confirming the fact
If the user has already been registered, there will be no action & perhaps a notification that the user has been registered can be shown.
The routing of the URL and/or validating the request etc. can be implementation detail best left to a module looking at the request pipeline...
HTH.
EDIT:
As pointed out by @Steven below, there is an additional step involved in this solution:
When the user clicks on the REST URL, launch the confirmation/login screen with the user name pre-filled. The user can log in to the account, and this confirms that the user is valid. Until the first login, the status of the account can be "not confirmed"; at the first login it becomes "confirmed", regardless of whether the click came from the email that was sent or from a request typed into a web browser.
This will also ensure that it only works for authentic email accounts, since the account will not be in "confirmed" status until the user actually does a valid login.
I am trying to embed an ID into an email so that when a recipient replies to an email that my system sends out, my system can pick it up and match the two together.
I have tried appending a custom header, however this is stripped out when the user replies.
I have tried embedding an HTML comment within the email, but Outlook does not seem to keep comments when a reply email is created.
Worst case scenario, I can manually try and match the sent and received emails by time span or have a visible tag within the message body.
Does anyone know of a more elegant solution?
Thanks in advance
Email messages already contain such an identifier, called Message-ID. And there's even a way to indicate which message you're replying to, by sending that ID in a header called In-Reply-To. Pretty much all email clients do that; it's how they usually do their threading.
It's defined in RFC 822 (yep, that's pretty old) and carried forward in its successors, RFC 2822 and RFC 5322.
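A sketch of how the matching might work in Python, assuming you record the Message-ID of every message you send. The in-memory dict, the addresses, and the function names are placeholders; a real system would keep the mapping in a database. It also checks the References header, which many clients fill in alongside In-Reply-To.

```python
from email import message_from_string
from email.message import EmailMessage
from email.utils import make_msgid

# Placeholder store: Message-ID of each sent message -> your own record ID.
sent_index = {}


def build_outgoing(record_id, to_addr, from_addr="system@example.com"):
    """Create a message with a known Message-ID and remember what it belongs to."""
    msg_id = make_msgid(domain="example.com")
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = "Your request"
    msg["Message-ID"] = msg_id
    msg.set_content("Please reply to this message.")
    sent_index[msg_id] = record_id
    return msg


def match_reply(raw_reply):
    """Return the record ID the reply refers to, or None if nothing matches."""
    reply = message_from_string(raw_reply)
    candidates = (reply.get("In-Reply-To") or "").split()
    candidates += (reply.get("References") or "").split()
    for msg_id in candidates:
        if msg_id in sent_index:
            return sent_index[msg_id]
    return None
```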
I have seen a method that includes a tiny image (typically a single pixel) with a unique name that's linked to the user. When they view the email and download the images, your HTTP server will record a hit for that unique image. Of course the user needs to display images, but you can include a message in the body asking them to do so. We actually put content in an image, so they need to show images to read it. Note that this tells you the message was opened; it does not tie a reply back to the original.
If your incoming e-mail can handle +foo or -foo suffixes, use that.
Many e-mail systems can route user+foo@example.com or user-foo@example.com to user@example.com. You can replace foo with some kind of identifier.
Several mailing list servers use this for tracking bounces.
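In code, this amounts to building a tagged Reply-To address for each outgoing message and pulling the tag back out of the recipient address when the reply arrives. A minimal Python sketch, assuming the "+" form (the function names are illustrative, and the mail server must be configured to accept such addresses):

```python
def tagged_address(base_address, tag):
    """Insert "+tag" before the "@" of base_address."""
    local, domain = base_address.rsplit("@", 1)
    return f"{local}+{tag}@{domain}"


def extract_tag(address):
    """Return the part after "+" in the local part, or None if there is none."""
    local, _, _ = address.rpartition("@")
    _, separator, tag = local.partition("+")
    return tag if separator else None


reply_to = tagged_address("user@example.com", "foo")
print(reply_to)               # user+foo@example.com
print(extract_tag(reply_to))  # foo
```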
While I can't say for certain, my investigation in that sort of matter some time ago yielded the following "conclusion":
Headers are transformed a lot
Message bodies are transformed a lot
I suspect this is partly because of:
The need to protect users from malicious intentions
The need to perform "targeted marketing"
I have seen "unique codes" flying around in clear text in the email body but I would suggest having a unique identifier embedded in the return address instead.
The usual approach is to place the ID in the subject line and/or somewhere visible in the message text, and to tell the recipient not to modify the subject and to quote the original mail when responding.
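For the subject-line variant, a bracketed token is easy to add and to find again, even after clients prepend "RE:" or "Fwd:". The "[ref:...]" format below is just an assumption for illustration; any pattern that is unlikely to occur by accident would do.

```python
import re

# Hypothetical tag format: [ref:ID], where ID is letters, digits or dashes.
TAG_RE = re.compile(r"\[ref:([A-Za-z0-9-]+)\]")


def tag_subject(subject, ref):
    """Append the reference tag to an outgoing subject line."""
    return f"{subject} [ref:{ref}]"


def find_ref(subject):
    """Return the reference from a subject line, or None if there is no tag."""
    match = TAG_RE.search(subject)
    return match.group(1) if match else None


print(find_ref("RE: " + tag_subject("Your order", "A1B2")))  # A1B2
```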