Increasing primes producing last digits and order of efficiency

Adding two increasing primes can produce last digits of 0,2,4,6,8: there are six ways to do this,
Adding two increasing primes can produce last digits of 0, 2, 4, 6, or 8; there are six ways to do this: 1+9 and 3+7 give 0, 3+9 gives 2, 1+3 gives 4, 7+9 gives 6, and 1+7 gives 8. Of course, 1+9 is equivalent to 9+1. By efficiency I mean that, after finding a large number of sums with the desired last digit, one has reached the smallest prime in doing so. For example, starting at 3, 19, 23, 29, 43, 59, 73, 79, 83, 89, 103, I have found ten last digits of 2 by the time I reached 103. Do you think there is an order of efficiency that stays constant for each of the six ways of producing these last digits? Which of 1+9, 3+7, and so on will be the most efficient, the second most efficient, or the least efficient?
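For reference, here is a minimal Python sketch (an illustration added here, not part of the original question) that enumerates sums of two distinct prime last digits drawn from {1, 3, 7, 9} and reproduces the six combinations listed above:

    # Sums of two distinct last digits of primes greater than 5.
    from itertools import combinations

    for a, b in combinations([1, 3, 7, 9], 2):
        print(f"{a}+{b} -> last digit {(a + b) % 10}")
    # 1+3 -> 4, 1+7 -> 8, 1+9 -> 0, 3+7 -> 0, 3+9 -> 2, 7+9 -> 6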

Related

How to calculate the log of big numbers whose factors are not known, like prime numbers

I want to calculate the log of a big number. When I looked up how to evaluate the log of a big number, I only found the factorization method, but in my case I don't know the factors of that number and I don't want to compute the factors first.
Is there any way to evaluate the log of a big number, say 20 digits or more?
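For what it's worth, a log can be computed without any factorization. A minimal Python sketch (the 23-digit value below is just a made-up example) using standard library routines that accept arbitrarily large integers:

    import math
    from decimal import Decimal, getcontext

    n = 12345678901234567890123   # an arbitrary 23-digit integer, no factors known

    # math.log and math.log10 accept arbitrarily large Python ints directly.
    print(math.log10(n))
    print(math.log(n))

    # Higher precision, still without factoring:
    getcontext().prec = 50
    print(Decimal(n).ln())

    # A quick bound from the binary length: bit_length()-1 <= log2(n) < bit_length()
    print(n.bit_length() - 1, "<= log2(n) <", n.bit_length())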

quick sort is slower than merge sort

I think quick sort is less efficient when sorting an array with duplicate data, right? When the data type is char, the bigger the array (over 100,000 elements), the closer it gets to order n^2.
And assuming there is no duplicate data, to get the best case of a quick sort where the first element is chosen as the pivot, I think we can recursively swap the first and middle elements, dividing the already sorted array like a merge sort, right? Is there a general best case?
Lomuto partition scheme, which scans from one end to the other during partition, is slower with duplicates. If all the values are the same, then each partition step splits it into sizes 1 and n-1, a worst case scenario.
Hoare partition scheme, which scans from both ends towards each other until the indexes (or iterators or pointers) cross, is usually faster with duplicates. Even though duplicates result in more swaps, each swap occurs just after the two values have been read and compared to the pivot, so they are still in the cache for the swap (assuming object size is not huge). As the number of duplicates increases, the splitting improves towards the ideal case where each partition step splits the data into two equal halves. I ran a benchmark sorting 16 million 64-bit integers: with random data it took about 1.37 seconds, times improved as duplicates increased, and with all values the same it took about 0.288 seconds.
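As an illustration of the scheme described above (a sketch in Python, not the benchmark code referred to), a quicksort using Hoare partitioning with a middle-element pivot looks roughly like this:

    def hoare_partition(a, lo, hi):
        # Scan from both ends toward each other until the indexes cross.
        pivot = a[(lo + hi) // 2]
        i, j = lo - 1, hi + 1
        while True:
            i += 1
            while a[i] < pivot:
                i += 1
            j -= 1
            while a[j] > pivot:
                j -= 1
            if i >= j:
                return j
            a[i], a[j] = a[j], a[i]   # both values were just compared, still in cache

    def quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo < hi:
            p = hoare_partition(a, lo, hi)
            quicksort(a, lo, p)       # with Hoare, the left half includes index p
            quicksort(a, p + 1, hi)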
Another alternative is a 3-way partition, which splits a partition into elements < pivot, elements == pivot, and elements > pivot. If all the elements are the same, it's done in O(n) time. For n elements with only k possible values, the time complexity is O(n ⌈log3(k)⌉), and since k is constant, the time complexity is still O(n).
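A sketch of the 3-way (Dutch national flag) partition in Python, for illustration:

    def quicksort_3way(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        pivot = a[(lo + hi) // 2]
        # Invariant: a[lo:lt] < pivot, a[lt:i] == pivot, a[gt+1:hi+1] > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        quicksort_3way(a, lo, lt - 1)   # elements equal to the pivot are never revisited
        quicksort_3way(a, gt + 1, hi)

If every element equals the pivot, the main loop finishes in a single pass and neither recursive call has any work to do, which matches the O(n) claim above.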
Wiki links:
https://en.wikipedia.org/wiki/Quicksort#Repeated_elements
https://en.wikipedia.org/wiki/Dutch_national_flag_problem

How to count the number of significant digits?

For example, 5.020 would return 4. Preferably, it should work with vector inputs too.
I Googled around and found some answers, but none of them counted the last zero in 5.020.
From the given information, it is not possible.
The problem is that when you enter a number it is (per the standard) stored as a double, so it has a precision of eps and the precision you typed in is lost. Moreover, since one is typically not interested in seeing all ~15 digits, Matlab uses a couple of different display rules that are independent of the originally entered number; typically this shows the integer part plus 4 digits.
Additionally, the standard rule when converting a number to a string (num2str) is to cut off trailing zeros, which is why you do not get the last zero.
Your only option is to count the number of significant digits at the moment you obtain the data, which leads back to the question @Beaker asks you in the comments.
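The same point can be illustrated in Python rather than Matlab (the significant_digits helper below is a deliberately naive sketch that counts from the text form, not an existing library function):

    s = "5.020"
    x = float(s)
    print(repr(x))    # 5.02 -- once parsed into a double, the trailing zero is gone

    def significant_digits(text):
        # Count from the string the user typed; ignores ambiguous cases like "500".
        digits = text.replace("-", "").replace(".", "")
        return len(digits.lstrip("0"))

    print(significant_digits("5.020"))   # 4
    print(significant_digits("0.0050"))  # 2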

How many 32-bit numbers have five 1's? Combinatorics

For each bit (binary digit) that you have, there are two possibilities: Either it can be a zero, or it can be a one.
Therefore, if you have one bit, you have two possible numbers. If you have two bits, each of them can be either a zero or a one, and since there are two possibilities for the first, and two possibilities for the second, there are 2^2=4 total possibilities.
Similarly, if you have some number n of bits, each of them can be a zero or a one, and there will therefore be 2^n possibilities.
I understand this. Because of this fundamental counting principle, I know that there are 2^32 total combinations of 32 bit numbers, but how many have just five 1's?
How do I go about solving this? Count everything that doesn't include five 1's?
You have 32 bits total. Pick 5 to be "1". Order doesn't matter.
32C5 = 32!/(5!27!) = 201376
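A quick Python check of that count (the brute-force loop over 8-bit values is just a small sanity test):

    from math import comb

    print(comb(32, 5))   # 201376 -- choose which 5 of the 32 bit positions are 1

    # Sanity check on a smaller case: 8-bit values with exactly three 1 bits.
    brute = sum(1 for x in range(2 ** 8) if bin(x).count("1") == 3)
    print(brute, comb(8, 3))   # 56 56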

Hash a Sequence of positive/negative integers

I have a file with millions of lines (actually it's an online stream of data, which means we receive it line by line); each line consists of an unsorted array of integers (positive and negative), there is no limit on the size of each number, the lengths differ from line to line, and we might have duplicate values within one line.
I want to remove the duplicate lines (if 2 lines have the same values, regardless of how they are ordered, we consider them duplicates). Is there any good hashing function?
We want to do this in O(n), where n is the number of lines (we can assume that the maximum possible number of elements in each line is constant, e.g. we have a maximum of 100 elements in each line).
I've read some of the questions posted here on Stack Overflow and I also googled it; most of them cover the cases where the arrays have the same length, or the integers are positive, or even, or sorted. Is there any way to solve this in the general case?
My solution:
First we sort each line using an O(n) sorting algorithm, e.g. counting sort, then we put the values into a string and use MD5 hashing to put them into a hash set. If the hash is not in the set we add it; if it's already in the set we compare the arrays that have the same hash value.
Problem with this solution: sorting with counting sort takes a lot of space, as there is no limit on the numbers, and collisions are possible.
The problem with using a hashing algorithm on a set of data this large is that you have a high probability of two different lines hashing to the same value. You want to stay in O(n), but I am not sure that is possible with the size of the data and the accuracy needed. If you use heapsort, which is space efficient, and then traverse the newly sorted data removing consecutive lines that are the same, you could accomplish this in O(n log n).
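Given the stated bound of at most ~100 elements per line, one way to keep the per-line work constant is to canonicalize each line by sorting it and storing the resulting tuple in a set, rather than hashing with MD5. A rough Python sketch (assuming whitespace-separated integers on each line; an illustration, not the answer above):

    import sys

    seen = set()   # canonical forms of the lines kept so far

    def canonical(values):
        # Sorting at most ~100 integers per line is effectively constant work,
        # so processing the whole stream stays O(n) in the number of lines.
        return tuple(sorted(values))

    for line in sys.stdin:
        values = [int(tok) for tok in line.split()]
        key = canonical(values)
        if key not in seen:
            seen.add(key)
            sys.stdout.write(line)   # keep the first occurrence, drop later duplicates

Storing the tuples themselves (rather than an MD5 digest) trades memory for the certainty that two different lines are never treated as duplicates.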