Why is Geohash named `xxxhash` when it's actually an encoding algorithm? - hash

A geohash is a convenient way of expressing a location (anywhere in the world) using a short alphanumeric string, with greater precision obtained with longer strings.
When I first learned about it, I was confused by its name. It is quite different from other hashing algorithms because it preserves the location information: a geohash can be decoded back into coordinates. It is really an encoding algorithm rather than a hashing algorithm.
So how did the algorithm get its name? Why is it called Geohash?
Comments
For the difference between encoding and hashing, see: Encoding vs. Encryption vs. Hashing vs. Obfuscation
For a Java implementation of the Geohash algorithm, see: Geohash Encoding and Decoding Algorithm
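To make the "encoding, not hashing" point concrete, here is a minimal Python sketch of the standard geohash scheme (alternating longitude/latitude bisection bits, packed five at a time into a base-32 alphabet). The function names `geohash_encode` and `geohash_decode` are illustrative, not taken from any particular library.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=9):
    """Encode a latitude/longitude pair as a geohash string of `length` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    value, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1  # upper half of the interval -> bit 1
            rng[0] = mid
        else:
            value <<= 1               # lower half of the interval -> bit 0
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:            # five bits make one base-32 character
            chars.append(BASE32[value])
            value, bit_count = 0, 0
    return "".join(chars)


def geohash_decode(code):
    """Return the (lat, lon) centre of the cell described by a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    use_lon = True
    for ch in code:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):  # most significant bit first
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return (sum(lat_range) / 2, sum(lon_range) / 2)


code = geohash_encode(57.64911, 10.40744, 11)
print(code, geohash_decode(code))
```

Unlike a cryptographic hash, the string can be decoded back to a cell around the original point, and truncating it simply yields a larger enclosing cell.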

Related

How are embeddings used for fully homomorphic encryption?

How exactly do you perform one way encryption using embeddings from a deep neural network?
Fully homomorphic encryption (FHE) benefits society by ensuring full privacy. The Private Identity recognition algorithm uses FHE to enable encrypted match and search operations on an encrypted dataset without any requirement to store, transmit or use plaintext biometrics or biometric templates. The biometric data is irreversibly anonymized using a 1-way cryptographic hash algorithm and then discarded without the data ever leaving the local device.
My question is how exactly does this use embeddings to accomplish this? Where do embeddings come in?
An embedding is a set of floating point numbers taken from the N-1 layer of a softmax Deep Neural Network (DNN). Initially, the community used DNNs to get a resulting class (softmax), but the values at the layer just before the softmax layer turned out to be interesting in their own right.
These values have useful properties. They may function as a 1-way encryption. They also closely relate to the initial input: under a geometric distance (cosine, Euclidean), the embeddings of similar inputs lie close together. This means two pictures of my face will be closer (geometrically) than pictures of two different people. This property allows operations on the resulting encryption.
One of the operations allowed is match. In the encrypted space, using the distance properties, we can match using only the embedding. Since we are only working in the encrypted space, we have an implementation of FHE and the embedding comes from the DNN.
Subsequently, we have found that a second DNN allows the classification, but only using embeddings. We now have privacy and performance.
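The distance-based match described above can be sketched in a few lines of Python. The four-dimensional vectors and the 0.8 threshold are made up for illustration (real embeddings typically have hundreds of dimensions), and the sketch only shows matching by cosine similarity; it does not demonstrate any encryption or homomorphic property.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_match(a, b, threshold=0.8):
    """Treat two embeddings as the same identity if they are close enough.

    The threshold is a hypothetical value chosen for this toy data.
    """
    return cosine_similarity(a, b) >= threshold


# Hypothetical embeddings: two photos of one person, one photo of another.
alice_photo_1 = [0.90, 0.10, 0.30, 0.20]
alice_photo_2 = [0.85, 0.15, 0.35, 0.18]
bob_photo = [0.10, 0.90, 0.20, 0.70]

print(is_match(alice_photo_1, alice_photo_2))  # same person: high similarity
print(is_match(alice_photo_1, bob_photo))      # different people: low similarity
```

The match uses only the embeddings; the original images are not needed at comparison time.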

Why should we use bag of visual words (or vlad) instead of storing descriptors?

I have read a lot about image encoding techniques, e.g. Bag of Visual Words, VLAD or Fisher Vectors.
However, I have a very basic question: we know that we can perform descriptor matching (brute force or by exploiting ANN techniques). My question is: why don't we just use them?
From my knowledge, a Bag of Visual Words representation needs hundreds of thousands of dimensions per image to be accurate. If we consider an image with 1 thousand SIFT descriptors (which is already a considerable number), we have 128 thousand floating point numbers, which is usually fewer than the number of dimensions of BoVW, so it's not for a memory reason (at least when we are not considering large-scale problems, where VLAD/FV codes are preferred).
Then why do we use such encoding techniques? Is it for performance reasons?
I had a hard time understanding your question.
Concerning descriptor matching: both brute force and ANN matching techniques are used in retrieval systems. Recent matching techniques include k-d trees, hashing, etc.
BoVW is a traditional representation scheme. At one time BoVW combined with an inverted index was the state of the art in information retrieval systems. But the dimension (memory usage per image) of the BoVW representation (up to millions) limits the actual number of images that can be indexed in practice.
FV and VLAD are both compact visual representations with high discriminative ability, something which BoVW lacked. VLAD is known to be extremely compact (32Kb per image), very discriminative and efficient in retrieval and classification tasks.
So yes, such encoding techniques are used for performance reasons.
You may check this paper for deeper understanding: Aggregating local descriptors into a compact image representation.
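As a rough illustration of why an aggregated code is more compact than the raw descriptors, here is a minimal pure-Python sketch of VLAD-style encoding: each descriptor is assigned to its nearest centroid, the residuals are summed per centroid, and the concatenated result is L2-normalised. The toy 2-D data, the function names, and the choice of 64 centroids with 4-byte floats in the size comparison are assumptions for illustration, not values taken from the answer or the paper.

```python
import math


def nearest_centroid(descriptor, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, centroids[k])),
    )


def vlad_encode(descriptors, centroids):
    """Aggregate local descriptors into one fixed-length, L2-normalised vector."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for desc in descriptors:
        k = nearest_centroid(desc, centroids)
        for j in range(dim):
            residuals[k][j] += desc[j] - centroids[k][j]  # accumulate residual
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy example: three 2-D "descriptors" and two centroids.
descriptors = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0]]
centroids = [[1.0, 1.0], [5.0, 4.0]]
code = vlad_encode(descriptors, centroids)
print(len(code), code)

# Size comparison under the stated assumptions (SIFT dimension 128).
raw_floats = 1000 * 128   # 1000 raw SIFT descriptors per image
vlad_floats = 64 * 128    # assumed 64 centroids
print(raw_floats, vlad_floats, vlad_floats * 4, "bytes at 4 bytes per float")
```

The code length depends only on the number of centroids and the descriptor dimension, not on how many descriptors the image produced, so every image is compared with a single vector distance instead of a descriptor-by-descriptor match.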

Calculation of hash of a string (MD5, SHA) as a basis for CPU benchmarking

I know that there are many applications and tools available for benchmarking the computational power of CPUs, especially in terms of floating point and integer calculations.
What I want to know is how suitable hashing functions such as MD5, SHA, ... are for benchmarking CPUs. Do these functions include enough floating point and integer calculations that applying a series of those hashing functions could be a good basis for CPU benchmarking?
In case platform matters, I'm concerned with Windows and .Net.
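MD5 and SHA hash functions do not use floating point at all. They are implemented entirely with integer operations (bitwise logic, rotations and modular addition), so at best they exercise the integer side of a CPU.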
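For reference, a hash-throughput measurement might look like the following sketch. It is written in Python with the standard `hashlib` module rather than .NET, and `hash_throughput` is an illustrative helper name. The number it reports reflects only the integer and bitwise speed of the underlying hash implementation (which may be hardware-accelerated on some CPUs), not floating point performance.

```python
import hashlib
import time


def hash_throughput(algorithm, payload, rounds=20):
    """Return the approximate hashing speed in megabytes per second."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(algorithm, payload).digest()
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return len(payload) * rounds / elapsed / 1e6


payload = b"\x00" * (1 << 20)  # 1 MiB of input data
for name in ("md5", "sha1", "sha256"):
    print(f"{name}: {hash_throughput(name, payload):.1f} MB/s")
```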

Are there architectures which are not using two's complement for representation of negative values?

The benefits of using two's complement for storing negative values in memory are well known and have been discussed at length on this board.
Hence, I'm wondering:
Are there, or were there, architectures that chose a way of representing negative values in memory other than two's complement?
If so: What were the reasons?
Signed-magnitude existed as the most obvious, naive implementation of signed numbers.
One's complement has also been used on real machines.
In both of those representations, there's a benefit that the positive and negative ranges span equal intervals. A downside is that they both contain a negative zero representation that doesn't naturally occur in the sort of integer arithmetic commonly used in computation. And of course, the hardware for two's complement turns out to be much simpler to build.
Note that the above applies to integers. Common IEEE-style floating point representations are effectively sign-magnitude, with some more details layered into the magnitude representation.
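The three integer representations, and the sign bit of an IEEE 754 float, can be compared with a short Python sketch. The helper names are illustrative, and 8-bit width is chosen only to keep the bit patterns readable.

```python
import struct


def sign_magnitude(value, bits=8):
    """Top bit is the sign, remaining bits hold the absolute value."""
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def ones_complement(value, bits=8):
    """Negative values are the bitwise inverse of the positive value."""
    mask = (1 << bits) - 1
    return value if value >= 0 else (~abs(value)) & mask


def twos_complement(value, bits=8):
    """Negative values wrap around modulo 2**bits."""
    return value & ((1 << bits) - 1)


def decode_sign_magnitude(pattern, bits=8):
    magnitude = pattern & ((1 << (bits - 1)) - 1)
    return -magnitude if pattern >> (bits - 1) else magnitude


def decode_ones_complement(pattern, bits=8):
    mask = (1 << bits) - 1
    return -((~pattern) & mask) if pattern >> (bits - 1) else pattern


def decode_twos_complement(pattern, bits=8):
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


for encode in (sign_magnitude, ones_complement, twos_complement):
    print(f"{encode.__name__:16s} -5 -> {encode(-5):08b}")

# "Negative zero": a second bit pattern that also decodes to 0.
print(decode_sign_magnitude(0b10000000), decode_ones_complement(0b11111111))
# In two's complement the same kind of pattern is an ordinary value.
print(decode_twos_complement(0b10000000))


def float_bits(x):
    """Raw 32-bit pattern of a single-precision IEEE 754 float."""
    return int.from_bytes(struct.pack(">f", x), "big")


# Negating a float flips only the top (sign) bit, as in sign-magnitude.
print(f"{float_bits(1.5):032b}")
print(f"{float_bits(-1.5):032b}")
```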

AES Algorithm S Box uniqueness

This is regarding AES algorithm.
Suppose I have implemented the AES algorithm and I encrypt data using my implementation. Now suppose somebody else has also implemented the same AES algorithm (128 bit). If I encrypt data using my implementation, is it possible to decrypt it and get back the original data using the second implementation that the other person has developed? What is the underlying difference between the implementations?
Is it something related to the S-box?
Thanks
AES is a specified algorithm. If you have two different implementations they both should be able to encrypt and decrypt without any difference. If there is a difference then at least one of them wouldn't be AES.
For such things you either assume that all implementations of an encryption algorithm you want to be interoperable with are correct, including yours, or you don't reinvent the wheel unless you actually want to learn something about wheels.
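On the S-box part of the question: the AES S-box is fixed by the specification, so it is not something that differs between conforming implementations. As a sketch, the table can be derived in Python from its definition (multiplicative inverse in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1, followed by a fixed affine transformation). The helper names are illustrative, and the brute-force inverse is chosen for clarity rather than speed.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce by x^8 + x^4 + x^3 + x + 1
        b >>= 1
    return result


def gf_inverse(x):
    """Multiplicative inverse in GF(2^8); 0 maps to 0 by convention."""
    if x == 0:
        return 0
    return next(y for y in range(1, 256) if gf_mul(x, y) == 1)


def rotl8(x, shift):
    """Rotate an 8-bit value left by `shift` bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF


def build_sbox():
    """Derive the 256-entry AES S-box from its mathematical definition."""
    sbox = []
    for x in range(256):
        inv = gf_inverse(x)
        # Fixed affine transformation from the AES specification.
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


sbox = build_sbox()
print(" ".join(f"{v:02x}" for v in sbox[:16]))  # first row of the table
```

Every correct implementation ends up with this same table, whether it hard-codes it or computes it, which is why ciphertext produced by one conforming AES implementation can be decrypted by another given the same key, mode and padding.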