Time taken to brute-force a 6-digit PIN / OTP with Burp Intruder - brute-force

I would like to know the approximate time it takes to brute-force a 6-digit PIN (in my case an OTP) using the Burp Suite Intruder option. Is there a method to calculate it?
Since the OTP has an expiry time (say 5 minutes), will it be possible to find the right code within that period, keeping in mind that there are around 1,000,000 possible codes (000000 to 999999)?
Thanks in advance.
I would also welcome any other suggestions apart from intercepting traffic with Burp and using the Intruder option.
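One rough way to do the calculation is to divide the number of possible codes by the request rate. The sketch below assumes 10^6 possible codes for six decimal digits, a code that stays fixed for the whole expiry window, and no rate limiting or lockout on the server. The request rates are hypothetical placeholders, not measured Burp Intruder figures.

```python
def brute_force_estimate(keyspace: int, requests_per_second: float,
                         window_seconds: float) -> tuple[float, float]:
    """Return (seconds to try every code, chance of a hit inside the window)."""
    full_sweep_seconds = keyspace / requests_per_second
    # Guesses that fit into the expiry window, capped at the keyspace size.
    guesses_in_window = min(keyspace, int(requests_per_second * window_seconds))
    success_probability = guesses_in_window / keyspace
    return full_sweep_seconds, success_probability


KEYSPACE = 10 ** 6          # assumption: codes 000000 to 999999
WINDOW_SECONDS = 5 * 60     # the 5-minute expiry from the question

# Hypothetical request rates; the real rate depends on the tool and the target.
for rate in (10, 100, 1000):
    sweep, probability = brute_force_estimate(KEYSPACE, rate, WINDOW_SECONDS)
    print(f"{rate:>5} req/s: full sweep {sweep / 60:.1f} min, "
          f"P(success in window) = {probability:.1%}")
```

Under these assumptions the chance of success grows linearly with the request rate until the whole keyspace fits into the window.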

Related

Picking a check digit algorithm

I am generating random OTP-style strings that serve as a short-term identifier to link two otherwise unrelated systems (which have authentication at each end). These need to be read and re-entered by users, so in order to reduce the error rate and reduce the opportunities for forgery, I'd like to make one of the digits a check digit. At present my random string conforms to the pattern (removing I and O to avoid confusion):
^[ABCDEFGHJKLMNPQRSTUVWXYZ][0-9]{4}$
I want to append one extra decimal digit for the check. So far I've implemented this as a BLAKE2 hash (from libsodium) that's converted to decimal and truncated to 1 char. This gives only 10 possibilities for the check digit, which isn't much. My primary objective is to detect single character errors in the input.
This approach kind of works, but it seems that one digit is not enough to detect single char errors, and undetected errors are quite easy to find, for example K37705 and K36705 are both considered valid.
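The effect can be reproduced with a small experiment. This is a sketch only: it uses `hashlib.blake2b` from the Python standard library rather than libsodium, and it assumes the digest is reduced to one digit with a modulo-10 step, which may differ from the truncation actually used. The helper names are hypothetical.

```python
import hashlib
import string


def check_digit(code: str) -> str:
    """Derive one decimal check digit from a BLAKE2b digest (assumed scheme)."""
    digest = hashlib.blake2b(code.encode("ascii"), digest_size=8).digest()
    return str(int.from_bytes(digest, "big") % 10)


def undetected_single_digit_errors(code: str) -> list[str]:
    """List full strings that differ from `code` in one digit yet still validate."""
    expected = check_digit(code)
    misses = []
    for position in range(1, len(code)):        # position 0 is the letter
        for digit in string.digits:
            if digit == code[position]:
                continue
            mutated = code[:position] + digit + code[position + 1:]
            if check_digit(mutated) == expected:
                misses.append(mutated + expected)
    return misses


code = "K3770"
misses = undetected_single_digit_errors(code)
print(f"{code}{check_digit(code)}: {len(misses)} of 36 single-digit errors go undetected")
```

Because a hash spreads outputs roughly uniformly over the ten digits, about one in ten single-digit substitutions can be expected to produce the same check digit.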
I do not have a time value baked into this OTP; instead it's purely random and I'm relying on keeping a record of the OTPs that have been generated recently for each user, which are deleted periodically, and I'm reducing opportunities for brute-forcing by rate and attempt-count limiting.
I'm guessing that BLAKE2 isn't a good choice here, but given there are only 10 possibilities for the result, I don't know that others will be better. What would be a better algorithm/approach to use?
Frame challenge
Why do you need a check digit?
It doesn't improve security, and five digits are trivial for most humans to get correct. Check it server side and return an error message if it's wrong.
Normal TOTP tokens are commonly 6 digits, and actors such as Google have determined that people in general manage to get them correct.

Is there any speed benefit to performing your own algorithm to scramble IDs for security purposes? [duplicate]

I am planning to implement my own very simple "hashing" formula to add a layer of security to an app with multiple users. My current plan is as follows:
User creates an account at which point an ID is generated on the backend. The ID is run through a formula (let's say ID * 57 + 8926 - 36 * 7, or something equally random). I then send back to the frontend the new user ID and the new "hashed" number and store them in localStorage.
User tries to access a secured area (let's say a settings page so they can change their own settings).
I send the backend two values: their ID and the hashed number. I run the ID through the same formula to check it matches the hashed value I've received. If the check passes, they can get in. So if someone has tried, say, changing their ID in localStorage to get access to another user's settings page, the only way they could achieve that is if they guess what the formula was. They could easily guess a user ID, but guess that the corresponding number is the result of ID * 57 + 8926 - 36 * 7 seems pretty unlikely.
I'm doing this because it would be quicker/cheaper than a db lookup for an actual hashed value... I think? Would it make more sense to use a package to create some kind of primary key/uuid instead of "hashing" my own value and doing a db lookup each time?
Tech stack: React on FE, Python on BE, SQL db.
I see a lot of posts saying "don't roll your own" -- is this absolute?
Yes it is. The reason is that whenever a non-cryptographer tries their hand at developing their own algorithm, they invariably fall into a multitude of pitfalls which render the algorithm next to useless.
Your particular scheme, for example, can be trivially broken given two consecutive ID and "hash" pairs. (It's a simple arithmetic sequence; deriving the formula of an arithmetic sequence from two consecutive values is roughly grade 6 math.)
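To make the attack concrete, here is a minimal sketch that recovers the formula from two observed pairs. `toy_hash` reproduces the example formula from the question; the user IDs are hypothetical.

```python
def toy_hash(user_id: int) -> int:
    """The home-made formula from the question: ID * 57 + 8926 - 36 * 7."""
    return user_id * 57 + 8926 - 36 * 7


def recover_linear_formula(pair_a: tuple[int, int],
                           pair_b: tuple[int, int]) -> tuple[int, int]:
    """Recover (slope, intercept) of hash = slope * id + intercept from two pairs."""
    (id_a, hash_a), (id_b, hash_b) = pair_a, pair_b
    slope = (hash_b - hash_a) // (id_b - id_a)
    intercept = hash_a - slope * id_a
    return slope, intercept


# An attacker who sees two (ID, "hash") pairs, e.g. from two own accounts.
slope, intercept = recover_linear_formula((1000, toy_hash(1000)),
                                          (1001, toy_hash(1001)))

# With the formula recovered, a value for any other ID can be forged.
victim_id = 4242
forged = slope * victim_id + intercept
print(slope, intercept, forged == toy_hash(victim_id))
```

The pairs do not even need to be consecutive; any two distinct IDs determine the line.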
I'm doing this because it would be quicker/cheaper than a db lookup for an actual hashed value...
The performance difference would probably be negligible. Don't worry about it.
If the information is not particularly sensitive, just assign each user a randomly generated 128 bit number. The chances of someone guessing a valid user's number are practically zero.
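Since the backend is Python, one possible way to do this is with the standard `secrets` module. The function names below are hypothetical, and storing the token next to the user record in the SQL database is an assumption about how it would be looked up.

```python
import secrets


def new_user_token() -> str:
    """Generate a random 128-bit token, hex-encoded as 32 characters."""
    return secrets.token_hex(16)  # 16 bytes = 128 bits


def tokens_match(stored_token: str, presented_token: str) -> bool:
    """Compare tokens in constant time to avoid leaking information via timing."""
    return secrets.compare_digest(stored_token, presented_token)


token = new_user_token()   # store alongside the user row at sign-up
print(len(token), tokens_match(token, token))
```

Unlike the arithmetic formula, the token has no relationship to the user ID, so knowing one user's token reveals nothing about another's.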
Two properties of real hashes that your scheme lacks are:
a small change in the input causes a large change in the output
all hashes have the same length
This could be a problem if a user somehow knows their own ID and hash.
With your home-made hash I could easily find out the hash of any other user by reverse engineering the formula.

How can OTP (one time password) be protected against brute force attacks?

We have a feature in our application that ask for a six digit OTP before doing certain functions. It is sent via SMS and expiration is 5 mins. There has been an internal penetration test that exposed that this is vulnerable to brute-force attacks. What can we do programmatically to prevent this?
Use a longer OTP, such as 6-10 characters. The number of combinations grows exponentially with the length (alphabet size to the power N, not factorial(N)), which quickly becomes a number so large that no ordinary system can guess the OTP in 5 minutes.
Use not only digits but also letters, which makes your OTP stronger.
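The effect of length and alphabet on the number of possible codes can be sketched as follows. The alphabets and lengths are illustrative choices, and `generate_otp` is a hypothetical helper built on Python's `secrets` module.

```python
import secrets
import string

DIGITS = string.digits                                  # 10 symbols
ALPHANUMERIC = string.ascii_uppercase + string.digits   # 36 symbols


def keyspace(alphabet: str, length: int) -> int:
    """Number of distinct codes: alphabet size raised to the code length."""
    return len(alphabet) ** length


def generate_otp(alphabet: str, length: int) -> str:
    """Draw each character independently from a cryptographically secure source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


for name, alphabet, length in (("6 digits", DIGITS, 6),
                               ("8 digits", DIGITS, 8),
                               ("8 alphanumeric", ALPHANUMERIC, 8)):
    print(f"{name:>15}: {keyspace(alphabet, length):,} possible codes, "
          f"e.g. {generate_otp(alphabet, length)}")
```

A larger keyspace only lowers the odds per guess; it does not by itself cap how many guesses an attacker can make within the 5-minute window.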

Why is my identifier collision rate increasing?

I'm using a hash of IP + User Agent as a unique identifier for every user that visits a website. This is a simple scheme with a pretty clear pitfall: identifier collisions. Multiple individuals browse the internet with the same IP + user agent combination. Unique users identified by the same hash will be recognized as a single user. I want to know how frequently this identifier error will be made.
To calculate the frequency, I've created a two-step funnel that should theoretically convert at zero percent: publish.click > signup.complete. (Users have to sign up before they publish.) Running this funnel for 1 day gives me a conversion rate of 0.37%. That figure, I reasoned, is my unique identifier collision probability for that funnel. Looking at the raw data (a table about 10,000 rows long), I confirmed this hypothesis. 37 signups were completed by new users identified by the same hash as old users who completed publish.click during the funnel period (1 day). (I know this because hashes matched up across the funnel, while UIDs, which are assigned at signup, did not.)
I thought I had it all figured out...
But then I ran the funnel for 1 week, and the conversion rate increased to 0.78%. For 5 months, the conversion rate jumped to 1.71%.
What could be at play here? Why is my conversion (collision) rate increasing with widening experiment period?
I think it may have something to do with the fact that unique users typically only fire signup.complete once, while they may fire publish.click multiple times over the course of a period. I'm struggling however to put this hypothesis into words.
Any help would be appreciated.
Possible explanations starting with the simplest:
1. The collision rate is relatively stable, but your initial measurement isn't significant because of the low volume of positives that you got. 37 isn't very many. In this case, you've got two decent data points.
2. The collision rate isn't very stable and changes over time as usage changes (at work, at home, using mobile, etc.). The fact that you got three data points that show an upward trend is just a coincidence. This wouldn't surprise me, as funnel conversion rates change significantly over time, especially on a weekly basis. Bots that haven't been caught could also contribute.
3. If you really get multiple publishes, and sign-ups are absolutely a one-time thing, then your collision rate would increase as users who only signed up and didn't publish eventually publish. That won't increase their funnel conversion, but it will provide an extra publish for somebody else to convert on. Essentially, every additional publish raises the probability that I, as a new user, am going to get confused with a previous publish event.
Note from OP. Hypothesis 3 turned out to be the correct hypothesis.
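The third explanation can be illustrated with a toy model. It assumes that each earlier publisher independently shares a newcomer's hash with some small probability; the value used for that probability and the publisher counts are hypothetical, not taken from the data above.

```python
def collision_probability(prior_publishers: int, pair_probability: float) -> float:
    """Chance that a new sign-up's hash matches at least one earlier publisher."""
    return 1.0 - (1.0 - pair_probability) ** prior_publishers


PAIR_PROBABILITY = 1e-6   # hypothetical chance that two given users share a hash

# A longer funnel period accumulates more publishers to collide with.
for publishers in (1_000, 10_000, 100_000):
    rate = collision_probability(publishers, PAIR_PROBABILITY)
    print(f"{publishers:>7} prior publishers -> collision rate {rate:.2%}")
```

In this model the measured rate rises with the length of the funnel period simply because the pool of earlier publish events keeps growing.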

How Does The Google URL Shortener Generate A 5 Digit Hash Without Collisions

How can the Google URL shortener generate a unique hash with five characters without collisions? It seems like there are bound to be collisions, where different URLs generate the same hash.
stackoverflow.com => http://goo.gl/LQysz
What's also interesting, is the same URL, generates a completely different hash each time:
stackoverflow.com => http://goo.gl/Dl7sz
So, doing some math, using lower-case characters, upper-case characters, and digits, the total number of combinations is 62^5 = 916,132,832. Clearly, collisions are bound to happen.
How does Google do this?
They have a database which tracks all previously generated URLs and the longer URL that each of those maps to. Easy to make sure that newly generated URLs don't already exist in that table. A little tricky to scale out (they surely have multiple servers so each one needs to be assigned a bucket of values from which it can give out to users). If they ever reach the point of having generated 916,132,832 URLs, they'll just add another character.
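A minimal sketch of that idea follows. An in-memory dictionary stands in for the database table, the class and method names are hypothetical, and nothing here is meant to describe how Google actually implements it.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits   # 62 symbols


class Shortener:
    """Hands out random short codes and rejects any code already in the table."""

    def __init__(self, length: int = 5) -> None:
        self.length = length
        self.code_to_url: dict[str, str] = {}   # stand-in for the database table

    def shorten(self, long_url: str) -> str:
        while True:
            code = "".join(secrets.choice(ALPHABET) for _ in range(self.length))
            if code not in self.code_to_url:     # retry on the rare collision
                self.code_to_url[code] = long_url
                return code

    def resolve(self, code: str) -> str:
        return self.code_to_url[code]


shortener = Shortener()
first = shortener.shorten("stackoverflow.com")
second = shortener.shorten("stackoverflow.com")
print(first, second, shortener.resolve(first))
```

Because the code is random rather than derived from the URL, shortening the same URL twice yields two different codes, matching the behaviour seen in the question.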
They have a hash table mapping each hash to its URL.
Count the number of rows in that table and encrypt it with a stream cipher, then encode with base62.
Using a stream cipher instead of a hash will give you a short pseudo random output that doesn't collide with any previous output so you don't need to check the table.
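The encoding half of this can be sketched as below. The encryption step is deliberately left out, so the output here is sequential rather than pseudo-random; the point is only that distinct counter values map to distinct 5-character codes. The symbol order in `ALPHABET` is an arbitrary choice.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols


def base62_encode(number: int, width: int = 5) -> str:
    """Encode a counter as a fixed-width base62 string."""
    if not 0 <= number < 62 ** width:
        raise ValueError("number does not fit in the requested width")
    chars = []
    for _ in range(width):
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))   # most significant symbol first


# Row counts 0, 1, 2, ... each get their own code, so no lookup is needed.
for row_count in (0, 1, 61, 62, 62 ** 5 - 1):
    print(row_count, base62_encode(row_count))
```

Any scrambling step placed before the encoding has to be a one-to-one mapping on the counter range for the no-collision property to survive.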
It keeps track of previously used long URLs. This means that, when someone goes to create a short URL, if the place they are pointing to already has a short URL, it will just give them the pre-existing short URL.
Actually, it would be inefficient to have a system dedicated to creating 'hashes' from a given set of data. Rather, the short URL is simply a random string of five characters drawn from ten digits plus 26 lowercase letters plus 26 uppercase letters, i.e. 62 symbols, giving 62^5 = 916,132,832 permutations (not combinations). Random short URLs are the most efficient way to make it work, and that is why they are always different. (I suppose there could be some other component in the algorithm, like the time of day, but I don't think it's worth it; there's no point in spending processing power on that complexity just to produce a 5-character string that any permutation calculator could generate.)