How does Litmus track their email analytics?

How does Litmus track their email analytics? - email

So, 'Litmus', a web app for testing emails and webpages across browsers and email clients, has a proprietary method that they claim is able to track not just opens, clicks, browsers, etc (standard with an embedded image and pass-through link tracking.)
What's unique is they claim that they are able to track what actions the end user took, how long the end user read it for, and if they deleted or forwarded the email. They claim they do this without JavaScript, and purely using embedded images. They claim that the method works across most major email clients.
What could they be doing to track this? Obviously, if they're doing it with third party applications that they don't control, whatever they are doing should be replicable.
I'm thinking that they realized that when an email client forwards or deletes an email, it 'opens' the email in a different way then normal, creating a unique user string on the server log of some kind? I'm grasping at strings, though.
http://litmusapp.com/email-analytics
Details here http://litmusapp.com/help/analytics/how-it-works
EDIT: It also looks like they track Prints. Maybe they do this by tracking calls to the 'print' css?

It's all done with good ol' image bugs. Breaking down how they find out...
Which client was used: Check the user-agent
Whether an email was forwarded: Done by attaching image bugs to divs that are loaded only when the message is forwarded.
Whether an email was printed: bug attached to print stylesheet
How long it takes to read an email: A connection that's kept open, as pointed out by Forrest (this is also how Facebook tracks(ed?) whether or not you are online on chat).
Whether an email was deleted: Check If a message was read for a short period of time or not opened. In fact, they group "glanced" and "deleted" together.
Of course none of this will work if email clients disable images in emails.
EDIT: Here's another question on this:
The OP actually has their tracking code, and this answer here explains how it works.

One way I can think of doing that is having an embedded image that loads from a script on a server. The script would not return anything or maybe send data really slowly to keep the connection open. Once the email is deleted the connection would be closed. This way they could know how long the email was open. Maybe they just assume if it's open for less than 10 seconds it was deleted?
Another way is tracking the referrer - this would give a lot of data on what a webmail client is doing, but I doubt it would be useful with a desktop client.

They know when the email is opened (it's when the image is called from their http server).
They also know what the user do and when since they can easily replace all links with their own tracking URLs redirecting to the original link.
There is nothing exceptional here. They are just a bit more advanced than their compatitors. There is no magic.
I have only one doubt: how they track delete. Technically, there is no way to know what happened to the message after it was read.
I suspect that a "deleted" mail is a mail that is never opened.

Related

Are URLs in emails indexed by search engines so they become publicly searchable?

I have read a few questions on here about e-mail clients prefetching URLs in e-mails. An answer to this seems to be to add a new confirmation page, where the user has to click a button to confirm the desired action.
But, this answer states the following:
As of Feb 2017 Outlook (https://outlook.live.com/) scans emails
arriving in your inbox and it sends all found URLs to Bing, to be
indexed by Bing crawler.
This effectively makes all one-time use links like
login/pass-reset/etc useless.
(Users of my service were complaining that one-time login links don't
work for some of them and it appeared that BingPreview/1.0b is hitting
the URL before the user even opens the inbox)
Drupal seems to be experiencing the same problem:
https://www.drupal.org/node/2828034
My major concern is with this statement:
As of Feb 2017 Outlook (https://outlook.live.com/) scans emails
arriving in your inbox and it sends all found URLs to Bing, to be
indexed by Bing crawler.
If this is the case, any URL in an e-mail meant to confirm an action, e.g. confirming a login, subscription, or unsubscription, can end up searchable in a search engine, if that's whats meant by indexed in the quote above. In this case, it's Bing. Not even a dedicated confirmation page where the user confirms the desired action truly mitigates this.
Scenario #1
If I email the user a login link with a one-time token in the URL, that URL will end up in Bing. This token will have a short lifetime, lets say 5 minutes, so I doubt anyone will manage to search on Bing and find the URL before the user clicks it or it expires.
Scenario #2
The user gets an e-mail with a link to confirm a subscription. This link is perhaps valid for 24 hours. This might(?) be long enough for someone else to stumble over the link on a search engine and accidentally (or on purpose) confirm the subscription on behalf of the user.
Scenario #2 is not uncommon, it's even best practice to use double opt-in as far as I am aware.
Scenario #3
Unsubscribe URLs in the bottom of newsletters. Maybe valid for forever? You don't want this publicly searchable in an search engine.
Assume all the one-time confirmation links land on a confirmation page where the user confirms the desired action.
Is it truly the issue that URLs in e-mails are indexed by search engines, at least Bing? And will they actually end up publicly searchable? If not, what is meant by indexed in the quote above?
I'll add for the sake of completion that I don't think I've had much of a problem with this in my own use of the web, so my gut feeling is that this is unlikely the case.

Is it truly the issue that URLs in e-mails are indexed by search engines, at least Bing?
I can't definitely say if they are being indexed or not, only Bing could answer this question, but they are surely being visited, at least with a simple GET request. I just tested this sending myself a link to a page on my website that logs the requests that are made against it, and indeed I'm seeing a GET coming from 207.46.13.181 (reverse DNS says msnbot-207-46-13-181.search.msn.com), which suggests that an automated program from search.msn.com is crawling the link. This leads me to believe that yes, they are trying to index the link's content somehow, but it's only my opinion really.
And will they actually end up publicly searchable? If not, what is meant by "indexed" in the quote above?
Well, again, impossible to say unless you work for Bing. In any case, "indexing" means exactly what you think it does: parsing the content of a page to potentially include it in search results.
The real question here is: does this somehow represent a security problem or will it compromise my website's functionality?
It surely has the potential to: if your confirmation/reset/subscription/whatever process only relies on a single GET request with the appropriate GET parameter, then you should definitely revisit the strategy, as it obviously allows anyone to perform the action (even maliciously for example enumerating possible IDs for your GET parameters).
If the link you are trying to send contains sensible information or can be used to alter important data for an user of your website, then you should at least put it behind a login page only giving access to the interested user. This way, anyone who wants to access it (including search engines) will be redirected to a login page if not already logged in.
If the link you are trying to send is just some kind of harmless confirmation link (e.g. subscribe/unsubscribe from a newsletter), then at least use a form inside the web page to do the actual confirmation through a POST request (possibly also using a CSRF token), otherwise you will unequivocally end up with false positives.

Prevent form data from being cached, and re-accessing with back button

I am considering making a very simple form for clients to use in a sort of web browser kiosk fashion, where they submit some of their information through the computer in the lobby at their option instead of writing something out by hand. This would be used if they come in person rather than calling or going to the web site first. I already have a form on our site for clients to use from their home computers so this would be very similar but tailored for and only used for the in-person clients.
Since the form will sort of just loop back to itself (not really "back" but just have a link to go back to a fresh form) for a fresh form after every client, how can I ensure that one can't hit back a few times to see the previous client's info? It's not really sensitive data, I just would like to provide that bit of privacy. Of course clients using our web site and the form there from their own computer are responsible for their own privacy.
Apart from having customer service walk to the computer and close and reopen the browser, or using AJAX, what should I do?
The other topics I've read related to this all have someone basically saying "you're not supposed to do that, you bad person". This seems like a valid reason to me. Any ideas?
Thanks!

Disable autocomplete by adding autocomplete="off" to the input tags or form tag.

Keeping track of whether an email has been opened

I'm using rails 2 for this app, with ActionMailer, but this is a general question about emails.
When we send out emails, i save a record corresponding to the email in a database table. I'd like to keep track of whether people have read the emails, and am wondering the best way to do it. On initial googling, it seems like i've stumbled into an ongoing battle between spammers and email clients!
My first thought was to use the "read receipt" header, but i know that this isn't supported by a lot of clients and is therefore unreliable. After that, i read of the tactic of including an image in the mail, and of detecting that image being loaded. I was thinking that i could put a parameter with the email record's id in the image url, so that when i get a request for that image i can see if it has a (for example) email_id param and if so, mark the corresponding email as having been read.
But, then i remembered that many clients are wise to this tactic and specifically ask the viewer of the mail if they want to display images. Obviously they might say no.
Am i right in thinking that i can't pull in other resources, such as stylesheets, in my mail? Because if i can pull them in, i could do that same trick but with the stylesheet rather than an image.
Grateful for any advice, max

Externally-hosted stylesheets are generally treated the same way as images. The client will not download them without prompting the user, if that works at all with HTML-formatted emails.
One thing to consider- you're looking to determine whether the email was read, not necessarily just received, right? Format your email so that it can't be easily read without viewing the images, and include a "view in browser" link at the top. Track image and page-format views and I think you'll have a fairly reliable way to measure actual reads.

Bit late on this, but we've got a similar problem.
We're tracking the links to our site that are included within the email. We're doing this by, like you, having a DB record per email sent out. We've generated a unique hash key per email and are including that as a parameter on all the links included in the email.
We simply then have a before_filter that looks for the parameter and records the fact against the correct email record by using the unique hash to identify the correct one.
We use a unique hash key (rather than the DB's primary key) just so it is a little bit more secure / reliable.
Obviously this method only helps us track the clicks our emails have generated (and not if they've been read) but it is still useful as we can see which of ours users has clicked on which links.

We are having major problems with this as well.
We have task wek portal, where users create tasks (like paint my house) and then we invite painters to give the task creator an price on painting his house.
For that we had a very advanced email system, that sends an invitation and if they accept the invitation we send them the contact info of the task creator.
We need to be able to track if the email was opened, and then once it's opened, we know that the company got the contact info, and we can now send another email to the task creator, telling them that they can expect to be contacted by that company.
The problem is that tracking if the email was opened is not reliable at all. There are different systems for this like msgtag (which does not support a wide range of mail clients like yahoo and other major clients) and our email API client (elastic email) even offer some API call back functions to tell us if each email was opened or bounced or whatever. But again, it's not reliable. To track if it's open, elastic email just includes a 1x1 px image and track if it's opened. So if people don't click "show images in this email" it's not tracked as opened.
So basically we are down to two options.
Have vital portions of the content printed on images, that they have to view to get the info we want to track if they got (in this case contact info)
Just have a link in the email "click here to get the contact info" and then track if that is clicked.
So in conclusion, the "track if opened" is totally useless and unreliable, unless you can fully control which email clients your recipients are using and how they are using them (like if they are all your employees or something).

using ajax to make email addresses safe?

I'm sorry if this is duplicating another post. I have a possible answer to a question in another post but I'm not sure if its a good solution and I wanted to ask people for their views.
The problem is the one raised in this post, how to protect emails from spam bots.
Rather than have the addresses on the page, split into different vars and then assembled by JavaScript, I send an ajax request to the server (just a GET to the welcome_controller) with a key ie 'address_id_42' in the params and it returns a mailto link which is then inserted into the page.
Is there any gain by not having any address data on the page initially? Is any advantage immediately lost by the fact the server will just hand out mailto links if you send it the right address id?
I could easily extend it so the server replies with some custom structure which gets unraveled by the js, but I agree that really this is not the right place to focus and that better spam filtering is the way forward, but I'm interesting in what people think to using ajax as a level of obfuscation?
Cheers :)

It depends on the kind of website it is.
Is the page only accessible after authentication (login)?
Is there another (simpler) way of doing it rather than getting it using AJAX?
The answer to your question really depends on these things.
But in a general way, yes, it might help. But such AJAX requests should only be triggered by some "humanly" action like clicking on "show email" button or something like that.
Also you could convert the email text to an image (which I believe is pretty easy to do with PHP).
Also other solutions could involve separating the two parts of the email address (part before and after '#' symbol) by putting them in different 'spans' etc.

I think obfuscating content through AJAX is a great idea. However, you can also try ready to use third party implementations like Mailhide instead of building all of this yourself. You get an additional layer of security by making the user fill up a CAPTCHA before the email address is revealed.

How do email analytics work?

While I haven't actually used it, an email analytics web app called Litmus claims to be able to track:
How long someone takes to read an email.
Whether it is forwarded.
Whether it is deleted.
Whether it is printed.
What email client was used to read it.
I'm curious about where it gets this information from. Most email clients i've seen don't even load external images without explicitly loading them, let alone javascript.
Even if a lot just support images, that wouldn't give away items 1-3.

Here's my best guess. As this link rockinthesixstring posted says, it relies on images but not javascript.
How long someone takes to read an email.
Place several images in the email that take a while to load, if the email is read for a long time, more are loaded.
Whether it is forwarded.
Is the image loaded from more than one IP/user agent?
Whether it is deleted.
The screenshots show this combined with glanced.
Whether it is printed.
Add a background image bug to a print stylesheet.
What email client was used to read it.
Check the user agent.

Any email analytics application I've seen use an image tracker. Basically if you attach a code generated 1x1 px image somewhere in the email, then during the loading of that image on the server side, you capture all of the ServerVariables
http://msdn.microsoft.com/en-us/library/ms524602%28VS.90%29.aspx
EDIT:
I just read some of the information on the Litmus website an it looks as though it confirms what I wrote above regarding image tracking

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse