How to generate all possible k-element arrays from an n-element array in Swift - swift

I want to create all possible k-element arrays from an n-element array. k may be larger or smaller than n. The elements in each output array don't have to be unique.
For example:
given this array
let a = [1,2]
the function given a desired size 3, should return :
[1,1,1]
[2,1,1]
[1,2,1]
[1,1,2]
[2,2,1]
[2,1,2]
[1,2,2]
[2,2,2]
example 2
given this array
let b = [[0,1], [2,3]]
the function given a desired size 3, should return :
[[0,1], [0,1], [0,1]]
[[2,3], [0,1], [0,1]]
[[0,1], [2,3], [0,1]]
[[0,1], [0,1], [2,3]]
[[2,3], [2,3], [0,1]]
[[2,3], [0,1], [2,3]]
[[0,1], [2,3], [2,3]]
[[2,3], [2,3], [2,3]]
How to do that in Swift?

So you want all k-tuples with elements from a given set. This can be done recursively
by taking each element of the set in turn as the first tuple element and combining it
with all (k-1)-tuples:
func allTuplesFrom<T>(_ elements: [T], withLength k: UInt,
                      combinedWith prefix: [T] = []) -> [[T]] {
    // Base case: no positions left to fill, so the prefix is a complete tuple.
    if k == 0 {
        return [prefix]
    }
    var result: [[T]] = []
    // Put every element in the next position and recurse for the remaining ones.
    for e in elements {
        result += allTuplesFrom(elements, withLength: k - 1, combinedWith: prefix + [e])
    }
    return result
}
Examples:
let result1 = allTuplesFrom([1, 2], withLength: 3)
print(result1)
// [[1, 1, 1], [1, 1, 2], [1, 2, 1], [1, 2, 2], [2, 1, 1], [2, 1, 2], [2, 2, 1], [2, 2, 2]]
let result2 = allTuplesFrom(["a", "b", "c", "d"], withLength: 4)
print(result2)
// [["a", "a", "a", "a"], ["a", "a", "a", "b"], ... , ["d", "d", "d", "c"], ["d", "d", "d", "d"]]
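The same result can also be built without recursion. The following is a minimal sketch, not part of the original answer: the name `allTuplesIteratively(from:length:)` is hypothetical, and it assumes current Swift syntax. It starts from the single empty tuple and extends every partial tuple by every element, k times.

```swift
/// Hypothetical helper: all length-k sequences over `elements`, built iteratively.
func allTuplesIteratively<T>(from elements: [T], length k: Int) -> [[T]] {
    precondition(k >= 0, "length must be non-negative")
    // Start with the single empty tuple.
    var partials: [[T]] = [[]]
    // Each pass appends one more position to every partial tuple.
    for _ in 0..<k {
        partials = partials.flatMap { partial -> [[T]] in
            elements.map { partial + [$0] }
        }
    }
    return partials
}

let tuples = allTuplesIteratively(from: [1, 2], length: 3)
print(tuples.count) // 8, i.e. 2 choices for each of 3 positions
print(tuples)
```

Either way the result has n^k entries, so it grows quickly with k.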

Related

How to invoke nested loops one loop at a time?

I want to compare each element against all others like following. The number of variables like a, b, c is dynamic. However, each variable's array size is uniform.
let a = [1, 2, 3]
let b = [3, 4, 5]
let c = [4, 5, 6]
for i in a {
    for j in b {
        for k in c {
            /// comparison
        }
    }
}
Instead of looping from start to finish at once, what would be a way to make each comparison on call? For example:
compare(iteration: 0)
/// compares a[0], b[0], c[0]
compare(iteration: 1)
/// compares a[0], b[0], c[1]
/// all the way to
/// compares a[2], b[2], c[2]
Or it could even be like following:
next()
/// compares a[0], b[0], c[0]
next()
/// compares a[0], b[0], c[1]
almost like an iterator stepping through each cycle dictated by my invocation.
Let the number of arrays be n. And let the number of elements in each array, which is guaranteed the same for all of them, be k.
Then create an array consisting of the integers 0 through k-1, repeated n times. For example, in your case, n is 3, and k is 3, so generate the array
[0, 1, 2, 0, 1, 2, 0, 1, 2]
Now obtain all combinations of n elements of that array. You can do this using the algorithm at https://github.com/apple/swift-algorithms/blob/main/Guides/Combinations.md. Unique the result (by, for example, coercing to a Set and then back to an Array). This will give you a result equivalent, in some order or other, to
[[0, 1, 2], [0, 1, 0], [0, 1, 1], [0, 2, 0], [0, 2, 1], [0, 2, 2], [0, 0, 1], [0, 0, 2], [0, 0, 0], [1, 2, 0], [1, 2, 1], [1, 2, 2], [1, 0, 1], [1, 0, 2], [1, 0, 0], [1, 1, 2], [1, 1, 0], [1, 1, 1], [2, 0, 1], [2, 0, 2], [2, 0, 0], [2, 1, 2], [2, 1, 0], [2, 1, 1], [2, 2, 0], [2, 2, 1], [2, 2, 2]]
You can readily see that those are all 27 possible combinations of the numbers 0, 1, and 2. But that is exactly what you were doing with your for loops! So now, use those subarrays as indexes into each of your original arrays respectively.
So for instance, using my result and your original example, the first subarray [0, 1, 2] yields [1, 4, 6] — the first value from the first array, the second value from the second array, and the third value from the third array. And so on.
In this way you will have generated all possible n-tuples by choosing one value from each of your original arrays, which is the desired result; and we are in no way bound to fixed values of n and k, which was what you wanted to achieve. You will then be able to "compare" the elements of each n-tuple, whatever that may mean to you (you did not say in your question what it means).
In the case of your original values, we will get these n-tuples (expressed as arrays):
[1, 4, 6]
[1, 4, 4]
[1, 4, 5]
[1, 5, 4]
[1, 5, 5]
[1, 5, 6]
[1, 3, 5]
[1, 3, 6]
[1, 3, 4]
[2, 5, 4]
[2, 5, 5]
[2, 5, 6]
[2, 3, 5]
[2, 3, 6]
[2, 3, 4]
[2, 4, 6]
[2, 4, 4]
[2, 4, 5]
[3, 3, 5]
[3, 3, 6]
[3, 3, 4]
[3, 4, 6]
[3, 4, 4]
[3, 4, 5]
[3, 5, 4]
[3, 5, 5]
[3, 5, 6]
Those are precisely the triples of values you are after.
Actual code:
import Algorithms // swift-algorithms package, provides combinations(ofCount:)

// your original conditions
let a = [1, 2, 3]
let b = [3, 4, 5]
let c = [4, 5, 6]
let originals = [a, b, c]
// The actual solution starts here. Note that I never use any hard
// coded numbers.
let n = originals.count
let k = originals[0].count
// Build [0, 1, ..., k-1] repeated n times.
var indices = [Int]()
for _ in 0..<n {
    for i in 0..<k {
        indices.append(i)
    }
}
let combos = Array(indices.combinations(ofCount: n))
// Remove duplicates while keeping the first occurrence of each combination.
var combosUniq = [[Int]]()
var combosSet = Set<[Int]>()
for combo in combos {
    let success = combosSet.insert(combo)
    if success.inserted {
        combosUniq.append(combo)
    }
}
// And here's how to generate your actual desired values.
for combo in combosUniq {
    var tuple = [Int]()
    for (outerIndex, innerIndex) in combo.enumerated() {
        tuple.append(originals[outerIndex][innerIndex])
    }
    print(tuple) // in real life, do something useful here
}
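For the `compare(iteration:)` style of call described in the question, another option is to decode the iteration number directly into one index per array, treating it as a number whose digits are the loop counters. This is a minimal sketch of that swapped-in technique, not the approach of the answer above; the function name `combination(at:from:)` is hypothetical, and it needs no external package.

```swift
/// Hypothetical helper: the values visited at step `iteration` of the
/// nested loops, one value from each array, or nil when out of range.
func combination<T>(at iteration: Int, from arrays: [[T]]) -> [T]? {
    // Total number of steps the nested loops would take.
    let total = arrays.reduce(1) { $0 * $1.count }
    guard iteration >= 0, iteration < total else { return nil }
    var remainder = iteration
    var result: [T] = []
    // The last array varies fastest, like the innermost loop.
    for array in arrays.reversed() {
        result.append(array[remainder % array.count])
        remainder /= array.count
    }
    return Array(result.reversed())
}

let a = [1, 2, 3]
let b = [3, 4, 5]
let c = [4, 5, 6]
print(combination(at: 0, from: [a, b, c])!)  // [1, 3, 4], i.e. a[0], b[0], c[0]
print(combination(at: 1, from: [a, b, c])!)  // [1, 3, 5], i.e. a[0], b[0], c[1]
print(combination(at: 26, from: [a, b, c])!) // [3, 5, 6], i.e. a[2], b[2], c[2]
```

A `next()` style interface could keep a counter and call this with increasing values until it returns nil. The arrays do not need to have the same length for this to work.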

How to encode column of list for catboost?

I have a dataset where some columns contain lists:
import pandas as pd
df = pd.DataFrame(
    {'var1': [1, 2, 3, 1, 2, 3],
     'var2': [1, 1, 1, 2, 2, 2],
     'var3': [["A", "B", "C"], ["A", "C"], None, ["A", "B"], ["C", "A"], ["D", "A"]]
    }
)
   var1  var2       var3
0     1     1  [A, B, C]
1     2     1     [A, C]
2     3     1       None
3     1     2     [A, B]
4     2     2     [C, A]
5     3     2     [D, A]
As the values within the lists of var3 can be shuffled and we can't assume any specific order, the only way I can think of to prepare the column for modelling is one-hot encoding. It could be done quite easily:
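from sklearn.preprocessing import MultiLabelBinarizer

# Wrap non-list values (here None) in a one-element list so every row is a list of labels.
df["var3"] = df["var3"].apply(lambda x: [str(x)] if type(x) is not list else x)
mlb = MultiLabelBinarizer()
mlb.fit_transform(df["var3"])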
resulting in:
array([[1, 1, 1, 0, 0],
[1, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[1, 1, 0, 0, 0],
[1, 0, 1, 0, 0],
[1, 0, 0, 1, 0]])
However, quoting catboost documentation:
Attention. Do not use one-hot encoding during preprocessing. This
affects both the training speed and the resulting quality.
Therefore, I'd like to ask if there's any other way I could encode this column for modelling with catboost?

CoffeeScript - transform object into a matrix using comprehensions

I have an object of the form: {a: [1,2,3,4], b: [5,6,7,8]} and I want to transform it, using comprehensions, into an array of arrays of 3 items:
[
['a', 0, 1], ['a', 1, 2], ['a', 2, 3], ['a', 3, 4],
['b', 0, 5], ['b', 1, 6], ['b', 2, 7], ['b', 3, 8]
]
I tried this: ( [x,y,v] for v, y in h for x, h of obj ), but it gives an array of two arrays, each containing 4 elements:
[
[ [], [], [], [] ],
[ [], [], [], [] ]
]
How can I skip the array of the second level?
It's easier to see when you split out the two comprehensions:
result = for x, h of obj
  for v, y in h
    [x,y,v]
Your result will be 2 levels deep, because each comprehension returns an array.
The outer array has one array for each key in your object.
Each of these inner arrays contains the results for one key in your object.
The best way around this is to push each of the 3-element arrays you want into a separate array.
result = []
for x, h of obj
  for v, y in h
    result.push [x,y,v]
Alternatively with the compact formatting:
result = []
result.push [x,y,v] for v, y in h for x, h of obj
If you already have a utility library like lodash, you could use the flatten method. But this would involve another iteration over your arrays, so it is not practical if performance matters and you have very large datasets.

Sum 2 consecutive elements in an array

I have a pointer to array of floats: arr = [a0, a1, a2, a3, ..., an].
I want the result to be: result = [a0+a1, a0+a1, a2+a3, a2+a3, a4+a5, a4+a5, ...].
Now I'm doing it with map() function:
let multiArrayValue: MLMultiArray = someMulityArray
let pointer = (multiArrayValue.dataPointer).bindMemory(to: Float.self, capacity: multiArrayValue.count)
let sums = (0..<multiArrayValue.count/2).map { (index) -> [Float] in
let sum = pointer[index * 2] + pointer[index * 2 + 1]
return [sum, sum]
}.flatMap { $0 }
How to do it in an efficient way with Accelerate framework?
EDIT:
I do manage to get res = [a0+a1, a2+a3, a4+a5, ..., an+an]:
let k = multiArrayValue.count/2
let n = vDSP_Length(k)
var res = [Float](repeating: 0, count: k)
vDSP_vadd(&pointer, vDSP_Stride(2),
&pointer+1, vDSP_Stride(2),
&res, vDSP_Stride(1),
n)
So the remaining question is how to get repeated values with Accelerate: [a1, a2, a3, ... an] => [a1, a1, a2, a2, ..., an, an]
The solution is achieved in 2 steps. The key in both steps is to play with the strides. First, calculate the sums vector:
let k = multiArrayValue.count/2
let n = vDSP_Length(k)
var sums = [Float](repeating: 0, count: k)
vDSP_vadd(&pointer, vDSP_Stride(2),
&pointer+1, vDSP_Stride(2),
&sums, vDSP_Stride(1),
n)
The second step is to get the repeated sums:
var resSparse = [Float](repeating: 0.0, count: k * 2)
vDSP_vmax(pointerOpt, 2, &sums + 1, 2, &resSparse, 2, k)
var res = [Float](repeating: 0.0, count: k * 2)
catlas_saxpby(k * 2 - 1, 1.0, &resSparse, 1, 1.0, &res + 1, 1)
catlas_saxpby(k * 2, 1.0, &resSparse, 1, 1.0, &res, 1)
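To check the output of an Accelerate version, it can help to have a plain reference implementation. The sketch below is not from the original answer and does not use Accelerate; the function name `repeatedPairSums` is hypothetical, and it takes a Swift array rather than a pointer. It mirrors the stride idea: inputs are read at stride 2 from offsets 0 and 1, and each sum is written at stride 2 to offsets 0 and 1 of the result.

```swift
/// Hypothetical reference implementation (no Accelerate):
/// [a0, a1, a2, a3, ...] -> [a0+a1, a0+a1, a2+a3, a2+a3, ...]
func repeatedPairSums(_ values: [Float]) -> [Float] {
    // A trailing unpaired element, if any, is ignored.
    let pairCount = values.count / 2
    var result = [Float](repeating: 0, count: pairCount * 2)
    for i in 0..<pairCount {
        // Read the pair at stride 2, offsets 0 and 1.
        let sum = values[2 * i] + values[2 * i + 1]
        // Write the same sum at stride 2, offsets 0 and 1.
        result[2 * i] = sum
        result[2 * i + 1] = sum
    }
    return result
}

print(repeatedPairSums([1, 2, 3, 4, 5, 6])) // [3.0, 3.0, 7.0, 7.0, 11.0, 11.0]
```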

Create a matrix according to a binary matrix

Here I have
A = [1, 2, 3]
B = [1, 0, 0, 1, 0, 1]
I want to create a matrix
C = [1, 0, 0, 2, 0, 3]
You can see that B acts like a mask. The number of ones in B is equal to the number of elements in A. What I want is to place the elements of A at the positions where B is 1.
Is there a method without a loop?
Untested, but should be close:
C = zeros(size(B));
C(logical(B)) = A;
This relies on logical indexing.
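For readers more familiar with Swift than MATLAB, the effect of that masked assignment can be sketched as follows. This is only an illustration of what logical indexing does here, assuming integer arrays; the function name `scatter(_:into:)` is hypothetical.

```swift
/// Hypothetical helper: places `values`, in order, at the positions where
/// `mask` is non-zero, and leaves 0 everywhere else.
func scatter(_ values: [Int], into mask: [Int]) -> [Int] {
    precondition(mask.filter { $0 != 0 }.count == values.count,
                 "mask must contain exactly one non-zero entry per value")
    var next = values.makeIterator()
    // Consume one value for each non-zero mask entry.
    return mask.map { $0 != 0 ? next.next()! : 0 }
}

print(scatter([1, 2, 3], into: [1, 0, 0, 1, 0, 1])) // [1, 0, 0, 2, 0, 3]
```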