Lexicographic ordering of triplets of integers in Matlab

I have the following problem: I have an array of N integer triplets (i.e. an Nx3 matrix) and I would like to order it lexicographically in Matlab. In order to do so I thought of using the built-in sort algorithm of Matlab, but I wanted to ask if the way I thought of doing it is right or if there exists a simpler way (preferably using Matlab routines).
I thought of converting every triplet into a single number and then sorting these numbers with sort(). If my integers were between 0 and 9, I could just read each triplet as a three-digit decimal number. However, they are bigger. If their maximum absolute value is M, I thought of converting them into the (M+1)-ary system like this: if (a,b,c) is the triplet, the corresponding integer is a*(M+1)^2 + b*(M+1) + c. Would sorting these transformed integers solve the problem, or am I making a logical mistake in my reasoning?
Thank you!
PS: I know that sort() in Matlab does have a lexicographic option for strings, but my integers do not have the same digit length. Maybe padding them with leading zeros and concatenating them would do the trick?
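Here is a quick sketch of what I mean (the example matrix T is made up, and I assume the entries are nonnegative integers in [0, M]; for negative values I would first shift every entry by M and use base 2*M+1 instead, so each "digit" stays within one place value):
T = [3 1 2; 3 0 9; 1 7 7];                      % made-up N-by-3 example
M = max(abs(T(:)));
key = T(:,1)*(M+1)^2 + T(:,2)*(M+1) + T(:,3);   % one scalar key per triplet
[~, idx] = sort(key);
sortedT = T(idx, :);                            % rows in lexicographic order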

Have you considered using sortrows?
It should enable you to sort your three columns of data lexicographically in a straightforward way.
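A minimal example (the matrix A is made up):
A = [3 1 2; 3 0 9; 1 7 7];   % N-by-3 matrix of integer triplets
B = sortrows(A);             % sorts by column 1, with ties broken by columns 2 and 3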

Purpose of matrix length

Matlab defines the matrix function length to return
Length of largest array dimension
What is an example of a use of knowing the largest dimension? Knowing number of rows or columns has obvious uses... but I don't know why someone would want the largest dimension regardless of whether it is rows or cols.
Thank You
In fact, most of my code wants to do things exactly once for each row, for each column or for each element.
Therefore, I typically use one of these
size(M,1)   % number of rows
size(M,2)   % number of columns
numel(V)    % number of elements
In particular do not depend on length to match the number of elements in a vector!
The only real convenience that I found (in older versions of Matlab) for length is if I need a repeat-style loop rather than a while; then it is convenient that length of a vector usually returns at least one.
Some other uses that I had for length:
A quick rough check whether something is big.
Making something square, as mentioned by @Mike
This question addresses a good point, and I have seen programs fail because of applying the length command to matrices (for looping), especially when one expects to get size(M, n) because the n-th dimension should be the largest. In total, I cannot see an advantage of allowing length to be applied to matrices; in fact, I only see risks from unexpected behavior.
If I want to know the largest dimension of any matrix, I would prefer to be more explicit and use max(size(M)), which should also be much clearer for anyone reading the code.
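For instance (a toy example of my own):
M = rand(3, 5);
length(M)          % 5: the largest dimension, whichever one it is
max(size(M))       % 5: the same value, but the intent is explicit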
I am not sure whether the following example should be in this answer, but it somehow addresses the same point.
It is also useful to be explicit about the dimension when averaging over matrices. Consider the case where you always want to average over the first dimension, i.e. over the columns of a matrix. As long as your matrix is of size n x m, where n is greater than 1, you do not have to care about specifying a dimension. But for unforeseen cases where your matrix happens to be a row vector, things get messy:
%// good case, where num of rows is 2 or greater
size(mean(rand(2, 4), 1)) %// [1, 4]
size(mean(rand(2, 4))) %// [1, 4]
%// bad case, where num of rows is 1
size(mean(rand(1, 4), 1)) %// [1, 4]
size(mean(rand(1, 4))) %// [1, 1], returns the average of that row
If you want to create a square matrix B that can contain a non-square input matrix A, you can take A's length, initialize B as a matrix of zeros whose number of rows and columns both equal that length, and then copy A into the new zeroed matrix, as sketched below.
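A minimal sketch of that (variable names are my own):
A = rand(2, 5);                      % non-square input
n = length(A);                       % 5: the larger of the two dimensions
B = zeros(n);                        % n-by-n matrix of zeros
B(1:size(A,1), 1:size(A,2)) = A;     % copy A into the top-left corner of B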
Another example - the one I use most - is when working with vectors. There it is very convenient to work with length instead of size(vec,1) or size(vec,2) as it doesn't matter if it is a row or a column vector.
As @Dennis Jaheruddin pointed out, length gave wrong results for empty vectors in some versions of MATLAB. Using numel instead of length might therefore be convenient for better backward compatibility. The readability of the code is almost the same IMHO.
This question compares length and numel and their performance, and comes to the conclusion that they perform similarly up to about 100k elements in a vector; with more than 100k elements, numel appears to be faster. I tried to verify this with MATLAB R2014a.
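A minimal timing sketch along these lines (my own, using timeit; the vector size is an arbitrary choice):
v = rand(1, 5e5);                            % test vector with more than 100k elements
tLength = timeit(@() length(v));
tNumel  = timeit(@() numel(v));
fprintf('length: %.3g s, numel: %.3g s\n', tLength, tNumel);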
In my measurements, length was a bit slower, but as both calls are in the range of microseconds, I guess it won't make a real difference in speed.

Speed up gf(eye(x)) a.k.a. Speed up Galois field creation for sparse matrices

From the documentation (communications toolbox)
x_gf = gf(x,m) creates a Galois field array from the matrix x. The Galois field has 2^m elements, where m is an integer between 1 and 16.
Fine. The effort for big matrices grows with the number of elements of x. No surprise, as every element must be "touched" at some point.
Unfortunately, this means that the cost of gf(eye(n)) grows quadratically with n. Is there a way to profit from all the zeros in there?
PS: I need this to delete a row from a gf matrix, as the usual m(r,:) = [] way does not work, and my idea of multiplying the gf matrix by an identity matrix with the corresponding row removed was surprisingly slow.
I don't have this toolbox, but maybe gf supports sparse-data inputs, which could drastically reduce your execution time in such a case.
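If it does, something along these lines might work (untested, since I cannot run the Communications Toolbox here; the field order 2^8 is an arbitrary choice, and the try/catch falls back to a full matrix if sparse input is rejected):
n = 1000;
I = speye(n);                % sparse identity: n stored entries instead of n^2
try
    A = gf(I, 8);            % only helps if gf accepts sparse input
catch
    A = gf(full(I), 8);      % otherwise fall back to the dense identity
end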

MATLAB: why transpose single dimension array

I have the following Matlab code snippet that I'm having to translate to VBScript. However, I'm not understanding why the last line is even necessary.
clear i
for i = 1:numb_days
doy(i) = floor(dt_daily(i) - datenum(2012,12,31,0,0,0));
end
doy = doy';
Looking over the rest of the code, this happens in a lot of other places where there are single dimension arrays (?) being transposed in place. I'm a newbie when it comes to both these languages, as well as posting a question on Stack, as I'm a sleuth when it comes to finding answers, just not in this case. Thanks in advance.
All "arrays" in MATLAB have at least two dimensions, and can be treated as having any number of dimensions you wish. The transpose operator here is converting between a row (size [1 N] array) and a column (size [N 1] array). This can be significant when it comes to either concatenating the arrays, or performing other operations.
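For example (a small illustration of my own), stacking two columns vertically works, while mixing a row and a column does not:
r = 1:3;          % 1-by-3 row vector
c = r';           % 3-by-1 column vector
x = [c; c];       % 6-by-1: vertical concatenation of two columns
% [r; c]          % error: dimensions are not consistent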
Conceptually, the dimension vector of a MATLAB array has as many trailing 1s as is required to perform an operation. This means that you can index any MATLAB array with any number of subscripts, providing you don't exceed the bounds, like so:
x = magic(4); % 4-by-4 square matrix
x(2,3,1,1,1) % pick an element
One final note: the ' operator is the complex-conjugate transpose CTRANSPOSE. The .' operator is the ordinary TRANSPOSE operator.
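For real data such as the doy values above, the two operators give the same result; the difference only shows up for complex inputs (my own example):
v = [1+2i, 3];
v'                % [1-2i; 3]: conjugates and transposes
v.'               % [1+2i; 3]: transposes only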

Sparse matrices in SciPy defined by a function

Is it possible to define a sparse matrix in scipy from a function rather than laying out all the possible values? In the docs I see that a sparse matrix can be created as one of the following:
There are seven available sparse matrix types:
csc_matrix: Compressed Sparse Column format
csr_matrix: Compressed Sparse Row format
bsr_matrix: Block Sparse Row format
lil_matrix: List of Lists format
dok_matrix: Dictionary of Keys format
coo_matrix: COOrdinate format (aka IJV, triplet format)
dia_matrix: DIAgonal format
All of these force you to specify the matrix beforehand, which takes up memory. Is there a way I can simply supply a function that calculates entry (i,j) when it is needed? The end goal is to calculate the few largest eigenvectors of the matrix through something like a Lanczos method.
The short answer is "no", but I think it is pretty easy to roll your own matrix-like object. If you are using eigsh to get your answer (which appears to be an implementation of the Lanczos algorithm), then your matrix-like object needs a matvec(x) method, which may or may not be easy to provide.
I realize this is not a complete answer, but I hope this sets you on your way.

Associative noncommutative hash function

Is there a hash function with the following properties?
it is associative
it is not commutative
it is easily implementable on 32-bit integers: int32 hash(int32, int32)
If I am correct, such a function would allow achieving the following goals:
calculate the hash of a concatenated string from the hashes of its substrings
calculate the hash concurrently
calculate the hash of a list stored in a binary tree, depending on the order of the elements but not on how the tree is balanced
The best I found so far is multiplication of 4x4 matrices of bits, but that's awkward to implement and reduces the space to 16 bits.
I am grateful for any help.
A polynomial rolling hash could help:
H(A1,...,An) = (H(A1,...,An-1) * Base + An) Mod P
It is easy to concatenate two results or to subtract a prefix/suffix from a result, as long as the length is known.
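A small MATLAB sketch of this idea (all names and constants are my own choices; each piece of data is represented as a pair [hash, length] so that two pieces can be combined, and I use uint64 internally to avoid overflow):
BASE = uint64(131);                      % arbitrary base
P    = uint64(2^31 - 1);                 % arbitrary prime modulus
hashOne = @(a) [mod(uint64(typecast(int32(a), 'uint32')), P), uint64(1)];
combine = @(x, y) [mod(x(1) * powmod(BASE, y(2), P) + y(1), P), x(2) + y(2)];

hAB = combine(hashOne(7), hashOne(42));  % hash of the sequence (7, 42)
hBA = combine(hashOne(42), hashOne(7));  % hash of (42, 7); differs from hAB in general
isequal(combine(combine(hashOne(7), hashOne(42)), hashOne(9)), ...
        combine(hashOne(7), combine(hashOne(42), hashOne(9))))   % 1: associative

with a small helper for modular exponentiation (e.g. in a file powmod.m):
function r = powmod(b, e, p)
% modular exponentiation by squaring; intermediates fit in uint64 for p < 2^32
b = mod(uint64(b), p); e = uint64(e); r = uint64(1);
while e > 0
    if mod(e, 2) == 1
        r = mod(r * b, p);
    end
    b = mod(b * b, p);
    e = idivide(e, uint64(2));
end
end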
Matrix multiplication is associative and non-commutative.
You could try representing your hashes as matrices, but this will result in a loss of information if they have determinant 0 (which is likely!).
So instead you should use a triangular matrix with 1s on the diagonal, which ensures a determinant of 1 (this guarantees that composition does not lose information).
Furthermore, the product of two such triangular matrices is again a triangular matrix with 1s on the diagonal, so a composed hash has the same form as a freshly generated one.
Note: to use this method, the length of your hash must be a triangular number, since an n-by-n matrix has n(n-1)/2 entries strictly above the diagonal!
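A small sketch of this construction (the size and modulus are my own choices), using 3x3 matrices with 1s on the diagonal, i.e. a hash consisting of 3 entries (a triangular number):
p = 251;                                        % arbitrary modulus for the entries
toMat = @(v) [1 v(1) v(2); 0 1 v(3); 0 0 1];    % pack 3 hash values into a unitriangular matrix
combine = @(A, B) mod(A * B, p);                % associative, not commutative

A = toMat([17 99 4]);
B = toMat([8 61 200]);
C = toMat([123 5 77]);
isequal(combine(combine(A, B), C), combine(A, combine(B, C)))   % 1: associative
isequal(combine(A, B), combine(B, A))                           % 0: order matters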