I'm trying to find a way to identify deleted user accounts in my system based on their email address without violating GDPR / privacy

My business recruits people for focus groups and one of our main selling points is that we ensure that recruits don't see the same researcher more than once.
Often we will be given customer lists from our clients where a condition of the job is that we delete the user records at the end of the project. Whilst we are able to keep the associated data that ties them to a project (for business stats etc.), we need to remove identifying and contact information: email addresses and phone numbers, which are how we identify a specific person's account.
My issue is:
What can I do to ensure that, if these deleted users show up in my system again, I can identify their association with old projects / focus groups, so that we can prevent them from signing up again and being placed in a focus group with a researcher they have already seen?
My first thought was, upon "deletion", to hash their email address and remove the plaintext address, and check this hashed address against new accounts, to link their old db associations with this new account.
I am fairly new to security / privacy concepts, so I'm not sure whether this would be secure, or if being able to identify the link to the old account is a violation of privacy.

You're on the right track here. Hashing the email address or phone number means that you've effectively put that data "beyond use". So long as you delete all the other data relating to it, it does not represent "personal data" in the GDPR sense.
Also consider the basis for processing – if you are legally obliged (either by statute or by the original contract with your users) to implement this suppression mechanism, then you would be permitted to do so even if it was personal data.
Note this email marketing industry body that recommends and promotes hash-based suppression lists, and this site suggesting the same.
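As a rough illustration of the hash-and-compare idea discussed above, here is a minimal Ruby sketch. It makes two assumptions that go beyond the original posts: it uses a keyed hash (HMAC-SHA256) rather than a plain hash, because a plain hash of an email address can be reproduced by anyone who hashes a list of candidate addresses, and it normalizes addresses by trimming and lower-casing them. The names SUPPRESSION_KEY, email_fingerprint and previous_projects are hypothetical.

```ruby
require 'openssl'

# Hypothetical secret; in practice load it from configuration, not source code.
SUPPRESSION_KEY = 'replace-with-a-long-random-secret'

# Normalize so that " Jane@Example.com " and "jane@example.com" compare equal.
def normalize_email(email)
  email.strip.downcase
end

# Keyed hash (HMAC-SHA256) of the normalized address, hex-encoded.
def email_fingerprint(email, key = SUPPRESSION_KEY)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA256'), key, normalize_email(email))
end

# On signup: look up the fingerprint of the new address among the
# fingerprints kept for "deleted" users (fingerprint => project ids).
def previous_projects(fingerprints, email)
  fingerprints.fetch(email_fingerprint(email), [])
end

# On "deletion": keep only the fingerprint next to the project associations.
deleted = { email_fingerprint('jane@example.com') => [:project_17] }
p previous_projects(deleted, 'Jane@Example.com')
```

The same function could be applied to phone numbers after normalizing them to a single format. Whether the stored fingerprint counts as personal data is a legal question this sketch does not settle.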

Related

Keycloak: How to customize user email to be mandatory and immutable

Our old authentication mechanism had a mandatory and immutable email for each user by design. After handing authentication over to Keycloak 4.6.Final, we are left with old references to users by email, as this was in fact used as an id from the beginning of this system.
The Keycloak User Management UI is delivered to the client as part of the whole system. Now we're facing a problem where the user administrator on the customer's side is able to create users with no email and, even worse, to give a user one email and change it over time. Leaving this option open is likely to create bugs for the client as the user base grows.
I've been digging around Google, SO and the Keycloak mailing list search engine, and couldn't find any documentation on how a developer can apply configuration on top of a particular Keycloak distribution to make some user attributes, which are optional and editable by default, mandatory and immutable.
I know this question is old, but maybe someone will need an answer.
It's 2022-11 and there is an experimental feature in Keycloak 20. You can enable declarative-user-profile and then customize your user profile and set required fields and other options.
This feature may be removed in the future, because it's experimental.
It also has bugs (tried with 20.0.1). For example, if you add a required attribute group, you can see groups while creating a new user and select them, but when you try to save the user an error appears saying that the group is required.

How do I secure pro membership features in a Chrome App?

I need to know if an installation has been paid for in the past so I can provide some premium features.
Storing a payment flag in IndexedDB or the file system sounds easy to defeat. Periodically asking a server and caching the response could do the trick, but I guess the user would have to be logged in at all times (through Google or otherwise) and I'd rather not impose that restriction.
Maybe if there's a way to uniquely identify a user's machine (uuid, mac address, etc) that could allow me to determine if they've made that payment?
Ultimately, this is client-side JavaScript. The only way to prevent use of certain features is to put them on your server and charge for the service.
Some weak methods for preventing access include license validation, and asking the server for non-essential information (if it was essential, then see the above).
For license validation, you could create an algorithm that takes data from the user and transforms it into something else. For example, say they create an account on your website, which your server knows is a 'pro' account. You could then take their first name and email address and do some magic on it.
Here's a simple example that takes those inputs and gives us a key. In this example, if our first name is "John" and our email is "john@domain.org", then our key will be fcumnflqjpBfqockp0qtifcufLqjp. However, Tony, with the email "tony@domain.org", would receive fcumnfvqp{Bfqockp0qtifcufVqp{
You can send this key to the user, and have your code decide whether it can extract the name and email by applying the reverse algorithm.
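The answer's own code is not reproduced in this thread, so the following is only a reconstruction that happens to be consistent with the two sample keys: each key decodes to a filler string, the email (written with @), a second filler string and the first name, with every character code shifted up by two. It is written in Ruby to match the other code on this page rather than in the Chrome App's JavaScript, and the names make_key and read_key are hypothetical.

```ruby
KEY_SHIFT = 2            # every character code is moved up by this amount
KEY_PREFIX = 'daskld'    # filler placed before the email
KEY_SEPARATOR = 'dasd'   # filler placed between the email and the first name

# Shift every character code in text by offset.
def shift_chars(text, offset)
  text.each_char.map { |c| (c.ord + offset).chr(Encoding::UTF_8) }.join
end

# Build the key that would be sent to a paying user.
def make_key(first_name, email)
  shift_chars("#{KEY_PREFIX}#{email}#{KEY_SEPARATOR}#{first_name}", KEY_SHIFT)
end

# Reverse the algorithm; returns nil when the key does not decode cleanly.
def read_key(key)
  plain = shift_chars(key, -KEY_SHIFT)
  match = plain.match(/\A#{KEY_PREFIX}(.+)#{KEY_SEPARATOR}(.+)\z/)
  match && { email: match[1], first_name: match[2] }
rescue RangeError
  nil
end

puts make_key('John', 'john@domain.org')  # prints fcumnflqjpBfqockp0qtifcufLqjp
p read_key('fcumnfvqp{Bfqockp0qtifcufVqp{')
```

Because both functions ship with the client, anyone who reads the code can write make_key themselves, which is exactly the key-generator weakness of this approach.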
You can reverse the strings, do various bit math, etc. It's security by obscurity. Other than an account, this is the most common method. It's used by nearly all offline software. Its kryptonite is key generators, which reverse engineer your code, and generate keys by the algorithm you use to verify them.
All the methods such as uuid, mac address etc can be easily forged imo. I think you cannot escape keeping track of user's logged-in status. Implementing something like a cookie based mechanism would be the right way to go.

Is it possible to manage users/identities in a data store that exhibits eventual consistency?

Is it possible to create/store user accounts in a data store that exhibits eventual consistency?
It seems impossible to manage account creation without a heap of architectural complexity to avoid situations where two accounts with the same UID (e.g. email address) can occur.
Do users of eventual consistency stores use a separate consistent DB as an identity store, or are there solutions/patterns that I should be exploring?
Thanks in advance,
Jamie
It is possible to do user management in an eventually consistent data store. We do it. It works under the following assumptions:
Conflicts shouldn't happen and when they do there's a clear path to conflict resolution. If the account ID is a person's email address, then if two separate people try to register under the same email there's a bigger problem here. What we do in this case is block both new accounts as soon as the conflict is discovered and send an email to the address in conflict explaining to the user that there's an issue (possible fraud). You can either ask the user to reset the account or ask them to contact support.
Repeated accesses by the same user within the timeframe in which the data is inconsistent go to the same replica. For instance, if a person just registered and the next request is a login, you must validate that login against the data replica where the new registration details exist. So if the eventual consistency is due to multiple data centers in different geographic locations and under normal conditions a request goes to the closest data center geographically, you're OK.
There are some edge cases, such as when a user registers against one data center, that center then crashes, and the user cannot log in even though they can still see the application, served from some other data center. You can calculate the expected frequency of this case based on your number of daily new users and average data center downtime. Then decide whether it's worth worrying about one user in a (million/billion/whatever your number is) having a problem and possibly contacting support. I faced the same decision not long ago and decided that from a cost-benefit perspective the answer is no.
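The conflict-handling rule described above (block both accounts once a duplicate is discovered, then notify the address) could be sketched roughly as follows. This is only an illustration under assumptions: the PendingAccount struct, the replica labels and resolve_conflicts are hypothetical names, and a plain array stands in for whatever the store returns once the replicas have synced.

```ruby
# Hypothetical account record; status is :active or :blocked.
PendingAccount = Struct.new(:email, :replica, :status)

# Given the accounts visible after the replicas have synced, block every
# account whose email was registered more than once and return the list
# of addresses that should receive a "possible fraud" notification.
def resolve_conflicts(accounts)
  conflicted = accounts.group_by { |a| a.email.downcase }
                       .select { |_email, group| group.size > 1 }
  conflicted.each_value { |group| group.each { |a| a.status = :blocked } }
  conflicted.keys
end

accounts = [
  PendingAccount.new('ann@example.com', 'eu', :active),
  PendingAccount.new('ann@example.com', 'us', :active),
  PendingAccount.new('bob@example.com', 'us', :active)
]
p resolve_conflicts(accounts)   # addresses to notify
p accounts.map(&:status)        # both "ann" accounts are now blocked
```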

Keeping track of whether an email has been opened

I'm using Rails 2 for this app, with ActionMailer, but this is a general question about emails.
When we send out emails, I save a record corresponding to the email in a database table. I'd like to keep track of whether people have read the emails, and am wondering about the best way to do it. On initial googling, it seems like I've stumbled into an ongoing battle between spammers and email clients!
My first thought was to use the "read receipt" header, but I know that this isn't supported by a lot of clients and is therefore unreliable. After that, I read of the tactic of including an image in the mail, and of detecting that image being loaded. I was thinking that I could put a parameter with the email record's id in the image URL, so that when I get a request for that image I can see if it has a (for example) email_id param and, if so, mark the corresponding email as having been read.
But then I remembered that many clients are wise to this tactic and specifically ask the viewer of the mail if they want to display images. Obviously they might say no.
Am I right in thinking that I can't pull in other resources, such as stylesheets, in my mail? Because if I can pull them in, I could do that same trick but with the stylesheet rather than an image.
Grateful for any advice, max
Externally-hosted stylesheets are generally treated the same way as images. The client will not download them without prompting the user, if that works at all with HTML-formatted emails.
One thing to consider- you're looking to determine whether the email was read, not necessarily just received, right? Format your email so that it can't be easily read without viewing the images, and include a "view in browser" link at the top. Track image and page-format views and I think you'll have a fairly reliable way to measure actual reads.
Bit late on this, but we've got a similar problem.
We're tracking the links to our site that are included within the email. We're doing this by, like you, having a DB record per email sent out. We've generated a unique hash key per email and are including that as a parameter on all the links included in the email.
We simply then have a before_filter that looks for the parameter and records the fact against the correct email record by using the unique hash to identify the correct one.
We use a unique hash key (rather than the DB's primary key) just so it is a little bit more secure / reliable.
Obviously this method only helps us track the clicks our emails have generated (and not whether they've been read), but it is still useful as we can see which of our users have clicked on which links.
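The per-email token approach in this answer might look something like the sketch below. It is not the poster's code: the EmailLinkTracker class, the in-memory hash standing in for the database table, and the "et" parameter name are all assumptions made for illustration. In a Rails app the record_click step would live in the before_filter mentioned above.

```ruby
require 'securerandom'
require 'uri'

class EmailLinkTracker
  TOKEN_PARAM = 'et'   # hypothetical query parameter name

  def initialize
    @records = {}      # token => record; stands in for the emails table
  end

  # Create a record for an outgoing email and return its random token.
  def register(recipient)
    token = SecureRandom.hex(16)
    @records[token] = { recipient: recipient, clicked: false }
    token
  end

  # Append the token to a link that will be placed in the email body.
  def tracked_link(url, token)
    uri = URI.parse(url)
    params = URI.decode_www_form(uri.query || '')
    params << [TOKEN_PARAM, token]
    uri.query = URI.encode_www_form(params)
    uri.to_s
  end

  # Called for each incoming request; returns true if a known token was seen.
  def record_click(requested_url)
    params = URI.decode_www_form(URI.parse(requested_url).query || '').to_h
    record = @records[params[TOKEN_PARAM]]
    return false unless record
    record[:clicked] = true
    true
  end

  def clicked?(token)
    @records.fetch(token)[:clicked]
  end
end
```

A random token, unlike a sequential primary key, cannot be guessed by incrementing a number in the URL, which is the reliability point made above.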
We are having major problems with this as well.
We have a task web portal, where users create tasks (like "paint my house") and then we invite painters to give the task creator a price for painting the house.
For that we had a very advanced email system, that sends an invitation and if they accept the invitation we send them the contact info of the task creator.
We need to be able to track if the email was opened, and then once it's opened, we know that the company got the contact info, and we can now send another email to the task creator, telling them that they can expect to be contacted by that company.
The problem is that tracking whether the email was opened is not reliable at all. There are different systems for this, like msgtag (which does not support a wide range of mail clients, such as Yahoo and other major clients), and our email API provider (Elastic Email) even offers some API callback functions to tell us whether each email was opened, bounced or whatever. But again, it's not reliable. To track opens, Elastic Email just includes a 1x1 px image and tracks whether it is loaded. So if people don't click "show images in this email", it's not tracked as opened.
So basically we are down to two options.
Have vital portions of the content printed on images, that they have to view to get the info we want to track if they got (in this case contact info)
Just have a link in the email "click here to get the contact info" and then track if that is clicked.
So in conclusion, the "track if opened" is totally useless and unreliable, unless you can fully control which email clients your recipients are using and how they are using them (like if they are all your employees or something).

What are the pros and cons of using an email address as a user id? [closed]

I'm creating a web app that requires registration/authentication, and I'm considering using an email address as the sole user id. Here are what I see as the pros and cons (updated with responses):
PROS
One less field to fill out during registration (it would just be email address, password, and verify password). I'm a big fan of minimalistic registration.
An email address is easier to remember. (thanks Mitch, Jeremy)
You don't have to worry about your favorite username being taken already - you're the only one who uses your email address. (thanks TStamper)
CONS
User has more to type every time they log in.
What if a user wants multiple accounts? They'll need another email address. (Do I even want a user to be able to create multiple accounts?)
Easy for a potential attacker to guess (if they know the target's email address, they know the login id). (thanks Vasil)
Users may be tempted to use the same password they use for their email account, which is bad security. (thanks Thomas)
If you change email addresses frequently, it may be difficult to remember which address you used to sign up for a site after a long hiatus. (thanks Software Monkey)
A hacker could spam the registration form and use "email already taken" responses to generate a list of valid emails. (thanks David)
Not everyone has an email address. (thanks Nicholas)
If I went with email as id, I would provide a mechanism to allow it to be changed in the event a user changes address. In this case users would not be posting content to a public site, so a separate username won't be necessary to protect the email addresses (but it is something to consider for other sites).
Another option is to implement OpenID (which is a whole other debate).
This seems to work for Google, but their services are tightly integrated. What have I missed in my analysis? Do you have any recommendations? Does anyone have experiences to share?
FINAL EDIT
Thank you all for your responses. I have decided to use email as an id, but then allow the creation of a username for login purposes after registration. This allows a little flexibility while keeping registration as short as possible. It also prevents problems when a user changes email addresses (they can just log in with their username and update it). I will also be implementing methods to prevent brute-forcing of email addresses out of the registration and login systems (mainly a cool-down period after repeated attempts).
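A cool-down period after repeated attempts, as mentioned in the final edit, could be sketched along these lines. The thresholds (5 attempts per 60 seconds, 300 seconds of cool-down), the AttemptLimiter class and the idea of keying on a client identifier such as an IP address are all assumptions for illustration, not details from the post.

```ruby
class AttemptLimiter
  def initialize(max_attempts: 5, window: 60, cooldown: 300)
    @max_attempts = max_attempts   # attempts allowed per window
    @window = window               # window length in seconds
    @cooldown = cooldown           # lock-out length in seconds
    @attempts = Hash.new { |hash, key| hash[key] = [] }
    @blocked_until = {}
  end

  # Returns true if the attempt may proceed, false while cooling down.
  def allow?(client_id, now = Time.now)
    blocked = @blocked_until[client_id]
    return false if blocked && now < blocked

    # Keep only attempts inside the current window, then add this one.
    recent = @attempts[client_id].select { |t| now - t < @window }
    recent << now
    @attempts[client_id] = recent

    if recent.size > @max_attempts
      @blocked_until[client_id] = now + @cooldown
      @attempts[client_id] = []
      return false
    end
    true
  end
end

limiter = AttemptLimiter.new
puts limiter.allow?('203.0.113.7')
```

Applying the same limiter to both the registration and login forms would slow down attempts to harvest valid addresses from "email already taken" responses.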
Personally, I prefer just using my email address as a username. It's one less thing to remember, and I never have to worry about my preferred name being already taken.
Just my 2 cents!
I think you missed a PRO:
Users are likely to remember their email address; and as email addresses are unique, they never have to worry about their preferred username being taken already.
As a user of websites, I can tell you that I hate memorizing unnecessary usernames. I don't use a unique handle or anything so I can never remember which variation of my name I used that wasn't already taken. I'd much rather type my email address.
Also, I like OpenID.
CONS
When the same password is used for the e-mail account, compromising the one automatically means compromising the other.
CON: Not everyone has an e-mail address. Consider if your database is ever accessed by an internal application. If you are running a store, people will call up and place an order by phone and refuse to provide an e-mail address. So while having an e-mail address as the default user ID is cool, be sure to allow alternates to get into the system. (Of course, this depends on the context.)
Learned this one the hard way.
I tend to not prefer pro/con lists, and instead try to think of benefits and challenges.
Challenge:
Some users will be tempted to use the email address from their ISP. Linking an account to an email alone may cause difficulty for users who change ISPs and forget to update their email on all the web sites they have signed up for.
Instead:
You should consider allowing a user to provide multiple addresses, as well as a user-selected id, and then let the user decide what they wish to do. Perhaps also consider allowing the user to provide an OpenID account.
CON: If I change my email address, suddenly all my account names are invalid. My name doesn't change, but my email often does. I have occasionally revisited a site after a number of years, and been stuck... what was my email address two years ago???
One setup you may want to consider: Have both a username and an email. The email is used to login and is always kept private, the username is used to identify the user in any public interaction, such as posting a comment. It winds up being slightly more secure as both halves of the user login credentials are kept private, whereas if you use a username for both login and public identification, half of the login is already known.
I definitely agree with you about having minimal registration for most cases, but depending on what you're doing you may want to balance that against added security for your users. Four fields isn't outrageous for registration, (username, email, password, confirm password), and if you're feeling particularly adventurous, you could cut it down to three by dropping the confirm password field, or two by emailing them a password that they can change later.
PRO
People hate having to create a unique name that fits their id and that has not already been taken in order to register for a site. This is why using the EMAIL ADDRESS as the user id is so widely embraced.
e.g. TStamper1930: who actually wants to remember the 1930 tacked onto the end of the name they really wanted?
CON: If a hacker can try registering random email addresses en masse, he or she will be able to figure out which of those addresses are valid based on which registrations fail. This is a tactic that can be used to put together lists of known valid email addresses, which are a hot commodity on the spam black market.
Although now that I think about it, that's a problem that affects any website which asks for an email address as part of the registration process, regardless of whether or not there's a separate username. But it's still something to think about.
Stick to email addresses. They are used everywhere, and most of the major websites use them. They are unique, so they save the user from struggling to find a name that's not used by others. Users also won't forget their email addresses (in most cases at least :)), unlike usernames, which they will keep forgetting if they don't visit your site very often.
You shouldn't be worried about them being too long as all the major browsers (IE, FF .. etc) offer autocomplete to forms which is enabled by default, so you type the first letters in your email and you get a drop down list (ie. autocomplete list) where you just click to enter the whole email, personally I almost never type the email address in full, I always type the first letters then select the email from the autocomplete drop down list. Besides, if you allow users to be remembered (using a Remember Me checkbox and persistent cookies), it will be another reason to not worry about it.
I don't know about your app but usually users having multiple accounts is not desirable in most apps.
One con might be that if it's an email address the login can be guessed by people and brute force attacks attempted. Which is not really a big issue, since on most sites today the logins are publicly displayed.
The biggest pro is that logins are easier to remember this way.
A good setup is to require username and email. Allowing the user to login with either email address or username is very user friendly. An added benefit is the user can change their email address. It would also allow multiple accounts for one email.
To address your con about the email being too long to type in every time, I used the StringScanner class from Ruby's strscan library.
require 'strscan'
def signup!(user, &block)
  self.email = user[:email] unless user[:email].blank?
  # Scan up to the "@"; pre_match is then the part of the address before it.
  # (pre_match is nil if the address contains no "@".)
  scanner = StringScanner.new(email)
  scanner.scan_until(/@/)
  self.login = scanner.pre_match
  # etc..
end
Then just change login method to allow either email or login to match password.
This works just like Google or MobileMe. A user can choose to type just the username part of their email (i.e. username instead of username@gmail.com).
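The "allow either email or login" lookup could be sketched as below. This is a standalone illustration rather than the poster's model code: the Member struct, find_member and login_from_email are hypothetical names, and an array stands in for the users table.

```ruby
# Hypothetical user record with a short login and a full email address.
Member = Struct.new(:login, :email)

# Derive a default login from the part of the address before the "@".
def login_from_email(email)
  email.split('@', 2).first
end

# Find a member by whatever they typed: the full email or just the login.
def find_member(members, identifier)
  key = identifier.strip.downcase
  members.find { |m| m.email.downcase == key || m.login.downcase == key }
end

members = [Member.new(login_from_email('john@gmail.com'), 'john@gmail.com')]
p find_member(members, 'john')
p find_member(members, 'John@Gmail.com')
```

One caveat: logins derived this way are not guaranteed to be unique, since john@gmail.com and john@example.org both yield "john", so a real signup would need a uniqueness check or a fallback.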
I'm fighting with removing this right now. Here's a newer CON from the current era.
An email address is considered Personally Identifiable Information (PII) by many governments. Hence extra care needs to be taken any time you display it on a page, or even return it from an endpoint.
Consider that many sites allow interactions between different users. This often means the site will provide a list of users to choose from (e.g. a drop-down list, or search results). This can actually cause the site to leak PII.
Usernames, on the other hand, can be completely anonymous. Given the prevalence today of password managers, users really don't need to actually remember their username and password.
If you don't care about forcing your users to login to your application with Facebook or some other social network (most people don't seem to care), then you can just use their social network email as their 'user id' when referencing other tables/documents (MySQL, Mongo, etc).
I've noticed the bonus to using social media logins is that all the security has been taken care of by said social network, including not allowing 2 users to have the same email or username in their database thus saving you the hassle of having to code for all of that. This is just my personal preference.