Restrict users from programmatically posting form data

I have a very old ASP.NET application with a web form that has one dropdown, two text boxes, and a Submit button.
All three are mandatory fields. Once the user clicks the Submit button, additional details based on the data entered are pulled from the database and shown on the next page.
On submit, the data is passed via a query string that looks like:
http://myserver/myapp/search.aspx?f1=1&f2=tom&f3=sales
Though the application does what it is supposed to do, lately we have run into a lot of issues:
A couple of entities that are interested in our data wrote programs to programmatically build the query strings and hit our server.
This is slowing down the server, and regular users who search records manually are experiencing a lot of slowness.
Due to some legal restrictions we can't implement a CAPTCHA or require users to authenticate.
I would appreciate it if you could let me know whether any of you have come across this kind of situation and how you dealt with it.
Thanks in advance.

You could implement source-based rate limiting, i.e. per IP address, only allow so many requests per minute. If the requester makes too many requests, you simply reject them. You could also blacklist the IP addresses that are hitting your app too aggressively. Both of these policies can be enforced by a load balancer like HAProxy or nginx.
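
If you cannot put a load balancer in front of the app straight away, the same idea can be approximated in application code. Below is a minimal sketch of a fixed-window, per-IP counter; all names and limits are illustrative, not tied to any particular framework:

    import time
    from collections import defaultdict

    # Illustrative limits: at most MAX_REQUESTS per client IP per window.
    MAX_REQUESTS = 30
    WINDOW_SECONDS = 60

    _hits = defaultdict(list)  # ip -> timestamps of recent requests

    def allow_request(ip: str) -> bool:
        """Return True if this IP is still under its per-window quota."""
        now = time.time()
        recent = [t for t in _hits[ip] if now - t < WINDOW_SECONDS]
        if len(recent) >= MAX_REQUESTS:
            _hits[ip] = recent
            return False  # reject: too many requests from this IP
        recent.append(now)
        _hits[ip] = recent
        return True

    # Usage at the top of the search handler (pseudocode):
    #   if not allow_request(client_ip):
    #       return "429 Too Many Requests"

In practice you would keep this state somewhere shared (the load balancer itself, or a cache such as Redis) rather than in process memory, which is exactly why HAProxy or nginx is the better place to enforce it.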

Related

Are URLs in emails indexed by search engines so they become publicly searchable?

I have read a few questions on here about e-mail clients prefetching URLs in e-mails. An answer to this seems to be to add a new confirmation page, where the user has to click a button to confirm the desired action.
But, this answer states the following:
As of Feb 2017 Outlook (https://outlook.live.com/) scans emails arriving in your inbox and it sends all found URLs to Bing, to be indexed by Bing crawler.
This effectively makes all one-time use links like login/pass-reset/etc useless.
(Users of my service were complaining that one-time login links don't work for some of them and it appeared that BingPreview/1.0b is hitting the URL before the user even opens the inbox)
Drupal seems to be experiencing the same problem:
https://www.drupal.org/node/2828034
My major concern is with this statement:
As of Feb 2017 Outlook (https://outlook.live.com/) scans emails arriving in your inbox and it sends all found URLs to Bing, to be indexed by Bing crawler.
If this is the case, any URL in an e-mail meant to confirm an action, e.g. confirming a login, subscription, or unsubscription, can end up searchable in a search engine, if that's what's meant by "indexed" in the quote above. In this case, it's Bing. Not even a dedicated confirmation page where the user confirms the desired action truly mitigates this.
Scenario #1
If I email the user a login link with a one-time token in the URL, that URL will end up in Bing. This token will have a short lifetime, let's say 5 minutes, so I doubt anyone will manage to search on Bing and find the URL before the user clicks it or it expires.
Scenario #2
The user gets an e-mail with a link to confirm a subscription. This link is perhaps valid for 24 hours. This might(?) be long enough for someone else to stumble upon the link on a search engine and accidentally (or on purpose) confirm the subscription on behalf of the user.
Scenario #2 is not uncommon; as far as I am aware, double opt-in is even considered best practice.
Scenario #3
Unsubscribe URLs at the bottom of newsletters. Maybe valid forever? You don't want these publicly searchable in a search engine.
Assume all the one-time confirmation links land on a confirmation page where the user confirms the desired action.
Is it truly the issue that URLs in e-mails are indexed by search engines, at least Bing? And will they actually end up publicly searchable? If not, what is meant by indexed in the quote above?
I'll add for the sake of completeness that I don't think I've had much of a problem with this in my own use of the web, so my gut feeling is that this is unlikely to be the case.
Is it truly the issue that URLs in e-mails are indexed by search engines, at least Bing?
I can't say definitively whether they are being indexed or not, only Bing could answer that question, but they are surely being visited, at least with a simple GET request. I just tested this by sending myself a link to a page on my website that logs the requests made against it, and indeed I'm seeing a GET coming from 207.46.13.181 (reverse DNS says msnbot-207-46-13-181.search.msn.com), which suggests that an automated program from search.msn.com is crawling the link. This leads me to believe that yes, they are trying to index the link's content somehow, but it's only my opinion really.
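
For anyone who wants to reproduce that test, all it takes is an endpoint that records who fetched it. A minimal sketch assuming Flask, with a made-up route name:

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/email-link-test")
    def email_link_test():
        # Log the caller's IP and User-Agent; if you only ever send this URL
        # by email, any hit that isn't yours is a prefetcher or crawler.
        print(request.remote_addr, request.headers.get("User-Agent"))
        return "ok"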
And will they actually end up publicly searchable? If not, what is meant by "indexed" in the quote above?
Well, again, impossible to say unless you work for Bing. In any case, "indexing" means exactly what you think it does: parsing the content of a page to potentially include it in search results.
The real question here is: does this somehow represent a security problem or will it compromise my website's functionality?
It surely has the potential to: if your confirmation/reset/subscription/whatever process relies on nothing more than a single GET request with the appropriate GET parameter, then you should definitely revisit the strategy, as it obviously allows anyone to perform the action (even maliciously, for example by enumerating possible IDs for your GET parameters).
If the link you are trying to send contains sensitive information or can be used to alter important data for a user of your website, then you should at least put it behind a login page that only gives access to the user concerned. This way, anyone who wants to access it (including search engines) will be redirected to a login page if not already logged in.
If the link you are trying to send is just some kind of harmless confirmation link (e.g. subscribe/unsubscribe from a newsletter), then at least use a form inside the web page to do the actual confirmation through a POST request (possibly also with a CSRF token); otherwise you will inevitably end up with false positives.
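
As a sketch of that last suggestion, assuming Flask and a hypothetical confirm_subscription() helper: the emailed GET link only renders a page containing a form, and the state change happens only on a POST carrying a matching CSRF token.

    import secrets
    from flask import Flask, request, session, abort

    app = Flask(__name__)
    app.secret_key = "change-me"

    @app.route("/confirm/<token>", methods=["GET"])
    def show_confirmation(token):
        # A crawler or prefetcher following the emailed link only gets this
        # harmless page; nothing is confirmed yet.
        csrf = secrets.token_urlsafe(32)
        session["csrf"] = csrf
        return (
            f'<form method="POST" action="/confirm/{token}">'
            f'<input type="hidden" name="csrf" value="{csrf}">'
            '<button type="submit">Confirm subscription</button></form>'
        )

    @app.route("/confirm/<token>", methods=["POST"])
    def do_confirmation(token):
        # Only a client that actually loaded the page (and holds the session
        # cookie) can submit a matching CSRF token.
        if request.form.get("csrf") != session.get("csrf"):
            abort(403)
        confirm_subscription(token)  # hypothetical: your own business logic
        return "Subscription confirmed."

A crawler that merely GETs the emailed URL never reaches the confirmation step.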

Open REST API attached to a database- what stops a bad actor spamming my db?

I'm a client-side developer with little server-side experience, and I'm struggling to understand how to make a database-backed website without requiring users to log in.
The use case is fairly straightforward. The user lands on a website, uploads an image, and performs some processing on that image. Clicking 'share' POSTs JSON to my endpoint, stores it in a DB, and returns a unique URL in a textbox (e.g. https://example.com/art/12345) which allows the user to share their artwork with others, or just to come back and do more editing later on.
What stops somebody from sending POST <data> to https://example.com/art 100 million times and filling up my pay-as-you-go database?
I've seen examples of this link-based method of sharing between users on plenty of sites, but I don't understand how to stop abuse, or whether it is safe to just open up an API which allows writes to a database. I do not want users to have to log in.
I believe the simplest method is having a quota, either by username for logged-in users or by IP if you don't require logins or only want to allow free usage up to a certain point. Perhaps you could have a smaller quota for non-logged-in users than for logged-in users, and an even larger one for paying users.
Your server-side code that handles the POSTs and stores data in the database would have to take care of that. On my site I'd add an extra column to a user_data table that tracks total space used (adding that to my to-do list).
Then, when the user adds new data, increase the total space used. When they delete old data (I have versioned web pages so that eventually the user will be able to roll back to previous versions), decrease it. Having another page that shows where they're using space makes it easier to decide what to delete to stay under a quota of X MBs/GBs/TBs, or you could just offer an /api/delete_old_pages endpoint (or notes, or comments, or all of the above).
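
A minimal sketch of that bookkeeping, assuming SQLite and made-up table/column names (a user_data table keyed by owner with a space_used column, plus an artworks table for the payloads):

    import sqlite3

    QUOTA_BYTES = 50 * 1024 * 1024  # e.g. 50 MB per owner -- pick your own limit

    def save_if_under_quota(db: sqlite3.Connection, owner: str, payload: bytes) -> bool:
        """Store the payload only if the owner stays under their quota.

        'owner' can be a username for logged-in users or an IP for anonymous ones.
        Assumes user_data.owner has a UNIQUE/PRIMARY KEY constraint.
        """
        row = db.execute(
            "SELECT space_used FROM user_data WHERE owner = ?", (owner,)
        ).fetchone()
        used = row[0] if row else 0
        if used + len(payload) > QUOTA_BYTES:
            return False  # reject the POST: quota exceeded

        db.execute(
            "INSERT INTO artworks (owner, data) VALUES (?, ?)", (owner, payload)
        )
        db.execute(
            "INSERT INTO user_data (owner, space_used) VALUES (?, ?) "
            "ON CONFLICT(owner) DO UPDATE SET space_used = space_used + ?",
            (owner, len(payload), len(payload)),
        )
        db.commit()
        return True

Deletion would decrement space_used in the same way, so the quota frees up as old data is removed.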

Anti-spamming in Google Forms using some form of Captcha

I have a Google Form that works in conjunction with a Facebook page of mine that has become quite popular. Recently I have become the victim of spamming on the form, to the point that I constantly have to hop between different forms as the flooded ones become unusable.
Since Google Forms does not offer reCAPTCHA elements, do any of you know of effective ways to avoid spamming of your forms by a bot/program? My form gathers 300 responses daily (normal traffic; spamming takes this to ~10,000 a day, making the form unusable), and I would like to avoid forcing users to log in with Google and limiting them to one response per user.
Things I have tried:
Shuffling question order (there are 4 of them; the spammer/attacker program can just parse the question titles and see where to submit what).
Placing a second section on the form, even if it just contains a "submit response" button and nothing else.
Placing a CAPTCHA-like question with data validation (usually a "what's 10+2" or similar, but since I have to change it manually, it's a chore to do every day), and the spammer can easily set his program to input the new answer and spam thousands of responses in a single hour.
Forcing users to log in and limiting them to one response each, using the "Limit responses to 1 per person" setting, but that goes against the goal of the form, which should accept as many responses from one user as they see fit (probably 2 or 3 per user per week).
Reporting a problem with my form to Google. Heh.
Any ideas?

How are websites like Facebook protected against bots without any CAPTCHA

How are websites like Facebook and Twitter protected against bots during registration? I mean, there's no CAPTCHA at all on the signup form.
I want to create a signup form for a project; I don't want bots registering, and CAPTCHAs are often ugly.
edit:
My question is really about registration, because I know Facebook uses CAPTCHAs once you are registered for the first time.
Facebook uses some sort of hidden spam protection; if you view the source of the sign-up form you will see things like:
class="hidden_elem"><div class="fsl fwb">Security Check</div>This is a standard security test that we use to prevent spammers from creating fake accounts and spamming users.
so the CAPTCHA becomes visible when the JavaScript decides that you are a bot.
There are a few methods of making it harder for bots to complete registration without a CAPTCHA, such as timing how long it takes to fill out the form, checking the originators of mouse click events, etc.
Also, random session-based values in the form (to prevent direct submissions without downloading the form first).
Some people also use hidden form elements with common names like 'email' that are styled invisible with CSS; common simple bots will try to fill out all form fields, so you can block the submission if this hidden element has any value.
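
A minimal sketch of those three checks together (session token, timing, and a hidden honeypot field), assuming Flask; route and field names are made up:

    import secrets
    import time
    from flask import Flask, request, session, abort

    app = Flask(__name__)
    app.secret_key = "change-me"

    MIN_FILL_SECONDS = 3  # humans rarely submit a signup form this fast

    @app.route("/signup", methods=["GET"])
    def signup_form():
        session["form_token"] = secrets.token_urlsafe(16)
        session["form_shown_at"] = time.time()
        return f"""
          <form method="POST" action="/signup">
            <input type="hidden" name="form_token" value="{session['form_token']}">
            <!-- honeypot: hidden via CSS, real users never fill it in -->
            <input type="text" name="email" style="display:none" autocomplete="off">
            <input type="text" name="real_email">
            <button type="submit">Sign up</button>
          </form>"""

    @app.route("/signup", methods=["POST"])
    def signup_submit():
        # 1. Token check: the form must have been downloaded first, in this session.
        if request.form.get("form_token") != session.get("form_token"):
            abort(403)
        # 2. Timing check: reject submissions that arrive implausibly fast.
        if time.time() - session.get("form_shown_at", 0) < MIN_FILL_SECONDS:
            abort(403)
        # 3. Honeypot check: the invisible 'email' field must stay empty.
        if request.form.get("email"):
            abort(403)
        return "Registered."

None of this stops a determined attacker driving a real browser, but it filters out the simple bots described above.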
Twitter and Facebook spend a lot of time developing techniques to block spammers; I don't think they will make them public, as that would be counterproductive in their fight against the spammers.
But you can download all the client-side JavaScript from Facebook or Twitter and study it if you want, because most of the protection happens on the client, not the server.
The server can only issue some random session variable, check for valid headers in the request, measure the overall time, etc.; it's really limited.
Some sites also use AJAX exchanges between server and client while the user is filling out the form, mostly just to make it harder for a bot developer to fake similar exchanges of data.
Anyway, unfortunately there is no easy solution for decent protection, especially without a CAPTCHA or some kind of question.
Also, for the submit button you can use an image map instead of a button:
dynamically create a big image with a submit-button image drawn on it at a random position (using something like GD in PHP), use CSS to display only the portion of that image with the actual button, and on the server side check the X and Y position of where the mouse was clicked. This will be hard for bots to break.
Unless they use real browsers and just emulate keyboard and mouse. Anyway, as I said, unfortunately there is no easy solution.
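
The suggestion above is phrased in terms of GD in PHP; here is a rough sketch of the same idea in Python with Pillow (all names are made up): draw the button at a random position, remember the rectangle server-side, and check the submitted click coordinates against it.

    import random
    from io import BytesIO
    from PIL import Image, ImageDraw

    IMG_W, IMG_H = 600, 200
    BTN_W, BTN_H = 120, 40

    def make_submit_image():
        """Return (png_bytes, button_rect); store button_rect in the session."""
        x = random.randint(0, IMG_W - BTN_W)
        y = random.randint(0, IMG_H - BTN_H)
        img = Image.new("RGB", (IMG_W, IMG_H), "white")
        draw = ImageDraw.Draw(img)
        draw.rectangle([x, y, x + BTN_W, y + BTN_H], fill="blue")
        draw.text((x + 10, y + 12), "Submit", fill="white")
        buf = BytesIO()
        img.save(buf, format="PNG")
        return buf.getvalue(), (x, y, x + BTN_W, y + BTN_H)

    def click_hits_button(click_x, click_y, rect):
        """Server-side check that the reported click landed on the button."""
        x1, y1, x2, y2 = rect
        return x1 <= click_x <= x2 and y1 <= click_y <= y2

An <input type="image"> submits the click position as name.x/name.y form fields, which is what the server would compare against the stored rectangle.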
One way would be to send a verification code to the user's email address or cell phone and require it to be entered (in that case, you would have to allow only one email address or cell phone number per account).
Another option is to use "Negative CAPTCHA" or "Honeypot Captcha"
I don't know how Facebook and Twitter do it, but if you want something simple that doesn't interfere with your site's aesthetics, I know that some websites just ask the user to enter the answer to a simple math problem like "what is 2 + 3?". This is not the most secure way to do it, but it's just a thought.
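
A sketch of that idea, generating the question server-side so it never has to be changed by hand (function names are made up; the expected answer would be kept in the session):

    import random

    def new_math_challenge():
        """Return (question_text, expected_answer); keep the answer server-side."""
        a, b = random.randint(1, 9), random.randint(1, 9)
        return f"What is {a} + {b}?", a + b

    def check_math_challenge(submitted: str, expected: int) -> bool:
        try:
            return int(submitted.strip()) == expected
        except ValueError:
            return False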
Well, you can always deploy hardware solutions as well, to create Layer 4-7 firewall rules. You can create specific rules that look for the well-known user agents of bots crawling the web. However, to stop newly created bots you need to know what user agent the bot is using.
Since you don't want CAPTCHA, you can use Keypic - keypic.com - which is an invisible protection; no CAPTCHA is needed. It's an efficient antispam method for any web form. Site users don't have to pass any tests, which is good for the site as it improves the quality of the user experience and thus raises user engagement. The solution is a kind of expert system that analyses the behaviour of users and checks its databases, then decides whether the request comes from a legitimate user or a robot.
BTW, Twitter and Facebook still use CAPTCHA for password verification, which is a very disputable method in terms of the efficiency of such protection.
I had a problem with tons of bots signing up for my Nintendo site so I put a single image of Mario on the sign-up page (making sure nothing in the image data said "Mario") with the text "Who is this? Answer in one word." Haven't had a single bot sign-up since. Not sure if this is actually a good solution though, not sure how smart bots are. I'm kind of surprised that it worked.
In theory it might be keeping out a few legitimate users, but it is hard to imagine many legitimate users of a Nintendo site not knowing who Mario is...

Prevent form data from being cached and re-accessed with the back button

I am considering making a very simple form for clients to use in a sort of web-browser-kiosk fashion, where they can optionally submit some of their information through the computer in the lobby instead of writing it out by hand. This would be used if they come in person rather than calling or going to the website first. I already have a form on our site for clients to use from their home computers, so this would be very similar but tailored for, and only used by, the in-person clients.
Since the form will sort of just loop back to itself (not really "back", but just have a link that goes back to a fresh form) after every client, how can I ensure that nobody can hit Back a few times to see the previous client's info? It's not really sensitive data; I just would like to provide that bit of privacy. Of course, clients using our website and the form there from their own computers are responsible for their own privacy.
Apart from having customer service walk to the computer and close and reopen the browser, or using AJAX, what should I do?
The other topics I've read related to this all have someone basically saying "you're not supposed to do that, you bad person". This seems like a valid reason to me. Any ideas?
Thanks!
Disable autocomplete by adding autocomplete="off" to the input tags or the form tag.
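
A minimal sketch of where that attribute goes, assuming a hypothetical Flask handler serving the kiosk form; it can be set on the whole form or on individual inputs:

    from flask import Flask

    app = Flask(__name__)

    @app.route("/kiosk-form")
    def kiosk_form():
        # autocomplete="off" keeps the browser from offering the previous
        # client's entries as suggestions in the shared kiosk browser.
        return """
          <form method="POST" action="/kiosk-form" autocomplete="off">
            <input type="text" name="name" autocomplete="off">
            <input type="text" name="phone" autocomplete="off">
            <button type="submit">Submit</button>
          </form>"""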