I have coded for class balancing loss - loss

Other class balancing loss
class_weights = 1.0/np.mean(mdlParams['labels_array'][mdlParams['trainInd'],:],axis=0)
print("Current class weights",class_weights)
class_weights = class_weights*mdlParams['extra_fac']
print("Current class weights with extra",class_weights)
elif mdlParams['balance_classes'] == 9:
error says:
class_weights_ = 1.0/np.mean(mdlParams['labels_array'][indices_ham,:8],axis=0)
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

mdlParams is a 1 dimension array so calling
mdlParams['labels_array'][mdlParams['trainInd']
will yield you an error.
To explain a bit better, mdlParams is lets say an array of size [10]
but you are trying to read mdlParams[10][1] while there is no second dimension to accept the 1

Related

Mean of values before and after a specific element

I have an array of 1 x 400, where all element values are above 1500. However, I have some elements that have values<50 which are wrong measures and I would like to have the mean of the elements before and after the wrong measured data points and replace it in the main array.
For instance, element number 17 is below 50 so I want to take the mean of elements 16 and 18 and replace element 17 with the new mean.
Can someone help me, please? many thanks in advance.
No language is specified in the question, but for Python you could work with List Comprehension:
# array with 400 values, some of which are incorrect
arr = [...]
arr = [arr[i] if arr[i] >= 50 else (arr[i-1]+arr[i+1])/2 for i in range(len(arr))]
That is, if arr[i] is less than 50, it'll be replaced by the average value of the element before and after it. There are two issues with this approach.
If i is the first or last element, then one of the two values will be undefined, and no mean can be obtained. This can be fixed by just using the value of the available neighbour, as specified below
If two values in a row are very low, the leftmost one will use the rightmost one to calculate its value, which will result in a very low value. This is a problem that may not occur for you in practice, but it is an inherent result of the way you wish to recalculate values, and you might want to keep it in mind.
Improved version, keeping in mind the edge cases:
# don't alter the first and last item, even if they're low
arr = [arr[i] if arr[i] >= 50 or i == 0 or i+1 == len(arr) else (arr[i-1]+arr[i+1])/2 for i in range(len(arr))]
# replace the first and last element if needed
if arr[0] < 50:
arr[0] = arr[1]
if arr[len(arr)-1] < 50:
arr[len(arr)-1] = arr[len(arr)-2]
I hope this answer was useful for you, even if you intend to use another language or framework than python.

Check if assigned elements satisfy a condition in OR-Tools

I have say 100 elements that I want to assign to say 10 spots.
# the elements list holds 100 variables that signify the assignment to a spot
elements = [model.NewIntVar(1, 10) for i in range(100)]
Each of my element has a specific size. Now I want to model one (set of) constraint(s) per spot that says: The added sizes of all elements assigned to this spot lies in a fixed range.
So if spot 1 gets elements 1, 16 and 64 assigned, and their sizes are 1521, 1732, 1431 and my range is (3000, 6000) that would be ok. But if too many or too large elements (or too few/small) get assigned to spot 1, that would not be ok.
Something like the following, which does not work:
for spot in range(10):
sum_ = sum([get_size(e) for e in elements if e == spot]) # if think if e == spot is what fails
model.Add(sum_ >= 3000)
model.Add(sum_ <= 6000)
How can I model such a thing? I have looked at channeling constraints but I can't quite wrap my head around it.
I think it is better to model the assignment as a boolean:
from ortools.sat.python import cp_model
model = cp_model.CpModel()
solver = cp_model.CpSolver()
all_spots = range(10)
all_elems = range(100)
elements = {
(elem, spot): model.NewBoolVar(f"{elem} in spot {spot}")
for elem in all_elems
for spot in all_spots
}
# only one spot for element
for elem in all_elems:
model.Add(sum(elements[elem, spot] for spot in all_spots) == 1)
for spot in all_spots:
# taking the element id as its size
sum_ = sum(elements[elem, spot] * elem for elem in all_elems)
model.Add(sum_ >= 0)
model.Add(sum_ <= 500)
solver.Solve(model)
for (elem, spot), boolean in elements.items():
if solver.Value(boolean):
print(boolean)
See:
https://github.com/google/or-tools/blob/stable/ortools/sat/samples/multiple_knapsack_sat.py
https://github.com/google/or-tools/blob/stable/ortools/sat/samples/binpacking_problem_sat.py
https://github.com/google/or-tools/blob/stable/examples/python/balance_group_sat.py#L102

Data ranged subscribe strange behavior

I was playing with swift's Data in the following a small code:
var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]
print("\(d2[0])")
To my surprise, this code throws exception on print() while the following code does not:
var d = Data(count: 10)
d[5] = 3
let d2 = d.subdata(in: 5..<8)
print("\(d2[0])")
I somehow understand why this happens, but I don't get why this is designed like this. When I use subdata() I get a whole copy of range, so indexing is valid from 0. But when I use range subscribe [], I get access to the requested range while indexing is the same as before. So in my first example d2[5] is 3.
But I wonder why it is designed like this? I don't want to make a copy of my data by using subdata() method. I just wanted to access a portion of my data with better indexing.
This is especially creates unexpected behaviors if you pass it to a function. For example, following code creates unexpected results and exceptions and you may not find out easily why:
func testit(idata: Data) {
if idata.count > 0 {
print("\(idata.count)")
print("\(idata[0])")
}
}
//...
var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]
testit(idata: d2)
This code is really strange. Because if you debug your code, you see that print("\(idata.count)") prints 3 as size of idata which is correct, but accessing it with idata[0] creates exception.
Is there any reason for this design? I was expecting that I could access resulting Data from subscribe starting index 0 while it is not true. Can I do this without using subdata() which creates copy of data or using additional arguments to pass base of data slice?
d[5..<8] returns Data.Slice – which happens to be Data. Generally, slices share the indices with their base collection, as documented in Slice.
One possible reason for this design decision is that it guarantees that subscripting a slice is a O(1) operation (adding an offset for accessing the base collection is not necessarily O(1), e.g. not for strings.)
It is also convenient, as in this example to locate the text after the second occurrence of a character in a string:
let string = "abcdefgabcdefg"
// Find first occurrence of "d":
if let r1 = string.range(of: "d") {
// Find second occurrence of "d":
if let r2 = string[r1.upperBound...].range(of: "d") {
print(string[r2.upperBound...]) // efg
}
}
As a consequence, you must never assume that the indices of a collection are zero-based (unless documented, as for Array.startIndex). Use startIndex to get the first index, or first to get the first element.

Most efficient way to store dictionaries in Matlab

I have a set of IDs associated with costs which is just a double value. IDs are integers and unique. Two IDs may have same costs. I stored them as:-
a=containers.Map('KeyType','uint32','ValueType','double');
a(1)=7.3
a(2)=8.4
a(3)=7.3
Now i want to find the minimum cost.
b=[];
c=values(a);
b=[b,c{:}];
cost_min=min(b);
Now i want to find all IDs associated i.e. 1 and 3 with the minimum cost i.e. 7.3. I can collect all the keys into an array and then do a for loop over this array. Is there a better way to do this entire thing in Matlab so that for loops are not required?
sparse matrix can work as a hashmap, just do this:
a= sparse(1:3,1,[7.3 8.4 7.3])
find(a == min(nonzeros(a))
There are methods which can be used on maps for this kind of operations
http://se.mathworks.com/help/matlab/ref/containers.map-class.html
The approach finding minimum values and minimum keys can be done something like this,
a=containers.Map('KeyType','uint32','ValueType','double');
a(1)=7.3;
a(3)=8.4;
a(4)=7.3;
minval = inf;
minkeys = -1;
for k = keys(a)
val = a.values(k);
val = val{1};
if (val < minval(1))
minkeys = k;
minval = val;
elseif (val == minval(1))
minkeys = [minkeys,k];
end
end
disp(minval);
disp(minkeys);
This is not efficient though and value search is clumsy for maps. This is not what they are intended for. Maps is supposed to do efficient key lookup. In case you are going to do a lot of lookups and this is what takes time, then use a map. If you need to do a lot of value searches, I would recommend that you use a matrix (or two arrays) for this instead.
idx = [1;3;4];
val = [7.3,8.3,7.3];
minval = min(val);
minidx = idx(val==minval);
disp(minval);
disp(minidx);
There is also another post with an example where it is shown how a sparse matrix can be used as a hashmap. Let the index become the key. This will take about 3 times the memory as all non-zero elements an ordinary array, but a map uses more memory than an array as well.

Why does passing an unnamed function as transform to array add one to iteration count in playground in furthest abstractions

I'm in the process of getting comfortable passing unnamed functions as arguments and I am using this to practice with, based off of the examples in the Swift Programming Guide.
So we have an array of Ints:
var numbers: Int[] = [1, 2, 3, 4, 5, 6, 7]
And I apply a transform like so: (7)
func transformNumber(number: Int) -> Int {
let result = number * 3
return result
}
numbers = numbers.map(transformNumber)
Which is equal to: (7)
numbers = numbers.map({(number: Int) -> Int in
let result = number * 3
return result;
})
Which is equal to: (8)
numbers = numbers.map({number in number * 3})
Which is equal to: (8)
numbers = numbers.map({$0 * 3})
Which is equal to: (8)
numbers = numbers.map() {$0 * 3}
As you can see in the following graphic, the iteration count in the playground sidebar shows that in the furthest abstraction of a function declaration, it has an 8 count.
Question
Why is it showing as 8 iterations for the last two examples?
It's not showing 8 iterations, really. It's showing that 8 things executed on that line. There were 7 executions as part of the map function, and an 8th to do the assignment back into the numbers variable.
It looks like this could probably provide more helpful diagnostics. I would highly encourage you to provide feedback via https://bugreport.apple.com.
Slightly rewriting your experiment to use only closures, the call counts still differ by one:
Case 1: Explicitly specifying argument types (visit count is 7)
var f1 = {(number: Int) -> Int in
let result = number * 3
return result
}
numbers.map(f1)
Case 2: Implicit argument types (visit count is 8)
var f2 = {$0 * 3}
numbers.map(f2)
If the (x times) count reported by the REPL does indeed represent a count of visits to that code location, and noting that the count is greater by one in cases where the closure type arguments are not explicitly specified (e.g. f2), my guess is that at least in the playground REPL, the extra visit is to establish actual parameter types and fill that gap in the underlying AST.