I have a 1 x 400 array in which all element values should be above 1500. However, some elements have values < 50; these are wrong measurements, and I would like to replace each of them in the main array with the mean of the elements before and after it.
For instance, element number 17 is below 50 so I want to take the mean of elements 16 and 18 and replace element 17 with the new mean.
Can someone help me, please? Many thanks in advance.
No language is specified in the question, but in Python you could use a list comprehension:
# array with 400 values, some of which are incorrect
arr = [...]
arr = [arr[i] if arr[i] >= 50 else (arr[i-1]+arr[i+1])/2 for i in range(len(arr))]
That is, if arr[i] is less than 50, it'll be replaced by the average value of the element before and after it. There are two issues with this approach.
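If i is the first or last index, one of the two neighbours does not exist, and no meaningful mean can be obtained: in Python, arr[i-1] silently wraps around to the last element when i is 0, and arr[i+1] raises an IndexError at the end of the list. This can be fixed by just using the value of the available neighbour, as specified below.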
If two values in a row are very low, the leftmost one will use the rightmost one to calculate its value, which will result in a very low value. This is a problem that may not occur for you in practice, but it is an inherent result of the way you wish to recalculate values, and you might want to keep it in mind.
Improved version, keeping in mind the edge cases:
# don't alter the first and last item, even if they're low
arr = [arr[i] if arr[i] >= 50 or i == 0 or i+1 == len(arr) else (arr[i-1]+arr[i+1])/2 for i in range(len(arr))]
# replace the first and last element if needed
if arr[0] < 50:
    arr[0] = arr[1]
if arr[-1] < 50:
    arr[-1] = arr[-2]
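If runs of consecutive low values do occur in your data, one possible variation is to average the nearest valid neighbours on each side instead of the immediate ones. This is only a sketch of that idea, not what the question asked for; the function name repair_low_values and the sample data are made up for illustration:

```python
def repair_low_values(values, threshold=50):
    """Replace entries below threshold with the mean of the nearest valid neighbours."""
    if all(v < threshold for v in values):
        raise ValueError("no valid measurements to interpolate from")

    repaired = list(values)
    for i, v in enumerate(values):
        if v >= threshold:
            continue
        # nearest valid value to the left and to the right, if any
        left = next((values[j] for j in range(i - 1, -1, -1)
                     if values[j] >= threshold), None)
        right = next((values[j] for j in range(i + 1, len(values))
                      if values[j] >= threshold), None)
        neighbours = [n for n in (left, right) if n is not None]
        repaired[i] = sum(neighbours) / len(neighbours)
    return repaired


# hypothetical sample: two bad values in a row, and a bad last value
sample = [1600, 10, 20, 1700, 1800, 5]
print(repair_low_values(sample))  # [1600, 1650.0, 1650.0, 1700, 1800, 1800.0]
```

A low value at either end simply takes the value of its single valid neighbour, which matches the edge-case handling above.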
I hope this answer was useful for you, even if you intend to use a language or framework other than Python.
I created a routing problem and added some dimension to it. A solution assignment is found and I want to know the cumulative value at each index. I noticed that the CumulVar of an assignment does not only have a Value method but also Min and Max methods. Apparently the cumulative variables are implemented in such a way that they can represent intervals. I can see how setting
slack_max>0
fix_start_cumul_to_zero=False
introduces an ambiguity for the cumulative variables, as there is a choice in how to start and how much slack to add at each stop.
Question: How are the Min and Max at each index computed?
You can get the Min and Max range of a given node index from solution.Min(dimension.CumulVar(index)) and solution.Max(dimension.CumulVar(index)).
Note you'll get Min and Max exactly the same when slack_max=0 unless you know something I don't ;)
Assuming you have an output solution object solution and a time dimension time_dimension, this will store them as a dict of (min, max) tuples per vehicle; adapt the output format however you wish:
time_dict = {}
for vehicle_id in range(num_vehicles):
    vehicle_time_dict = {}
    index = routing.Start(vehicle_id)
    # cumul range at the start node of this vehicle's route
    index_min = solution.Min(time_dimension.CumulVar(index))
    index_max = solution.Max(time_dimension.CumulVar(index))
    vehicle_time_dict[index] = (index_min, index_max)
    # follow the route until the end node, recording the range at each stop
    while not routing.IsEnd(index):
        index = solution.Value(routing.NextVar(index))
        index_min = solution.Min(time_dimension.CumulVar(index))
        index_max = solution.Max(time_dimension.CumulVar(index))
        vehicle_time_dict[index] = (index_min, index_max)
    time_dict[vehicle_id] = vehicle_time_dict
routing.IsEnd(index) returns True if it's the last index of that vehicle's route (or anywhere after the last index). So if it's 10 nodes long:
routing.IsEnd(8) will return False,
routing.IsEnd(9) will return True,
routing.IsEnd(10) will also return True, etc.
Say ...
you have about 20 Things
very often, you do a complex calculation running through a loop of say 1000 items. The end result is a varying number around 20 each time
you don't know how many there will be until you run through the whole loop
you then want to quickly (and of course elegantly!) access the result set in many places
for performance reasons you don't want to just make a new array each time. Note that unfortunately the count differs each time, so you can't just trivially reuse the same array.
What about ...
var thingsBacking = [Thing](repeating: Thing(), count: 100) // hard limit!
var things: ArraySlice<Thing> = []
func fatCalculation() {
    var pin: Int = 0
    // happily, no need to clean-out thingsBacking
    for c in .. some huge loop {
        ... only some of the items (roughly 20 say) become the result
        x = .. one of the result items
        thingsBacking[pin] = Thing(... x, y, z )
        pin += 1
    }
    // and then, magic of slices ...
    things = thingsBacking[0..<pin]
}
(Then, you can do this anywhere... for t in things { .. } )
What I am wondering, is there a way you can call to an ArraySlice<Thing> to do that in one step - to "append to" an ArraySlice and avoid having to bother setting the length at the end?
So, something like this ..
things = ... set it to zero length
things.quasiAppend(x)
things.quasiAppend(x2)
things.quasiAppend(x3)
With no further effort, things now has a length of three and indeed the three items are already in the backing array.
I'm particularly interested in performance here (unusually!)
Another approach,
var thingsBacking = [Thing?](repeating: Thing(), count: 100) // hard limit!
and just set the first one after your data to nil as an end-marker. Again, you don't have to waste time zeroing. But the end marker is a nuisance.
Is there a better way to solve this particular type of array-performance problem?
Based on MartinR's comments, it would seem that for the problem
the data points are incoming and
you don't know how many there will be until the last one (always less than a limit) and
you're having to redo the whole thing at high Hz
It would seem to be best to just:
(1) set up the array
var ra = [Thing](repeating: Thing(), count: 100) // hard limit!
(2) at the start of each run, call
ra.removeAll(keepingCapacity: true)
(3) just go ahead and .append each one.
(4) you don't have to especially mark the end or set a length once finished.
It seems it will indeed then use the same array backing. And it of course "increases the length" as it were each time you append - and you can iterate happily at any time.
Slices - get lost!
Consider the following example code:
spacing_Pin = transpose(-27:0.001:2);
Phi_intrp3 = interp1(P_in3,Phi_out_deg3,spacing_Pin,'spline');
% size(P_in3) and size(Phi_out_deg3) are both [40 1]
P_in1st = P_in3 -0.76;
thetah_1st = Phi_intrp3(ismember(spacing_Pin,P_in1st));
I think thetah_1st should now have 40 elements. It turns out, however, that it only has 20. This must be because only 20 values of P_in1st can be found in spacing_Pin. I know you're not given P_in3, but it only has up to 2 digits after the decimal point, and so should P_in1st. min(P_in3) = -25.9800 and max(P_in3) = -0.0200. These values should be included in spacing_Pin. Why does thetah_1st have only 20 entries?
It's impossible to say without more information and your actual variable values, but I suspect that the issue is with using ismember the way that you have.
Since you are looking for floating point numbers within an array rather than integers, you should use ismembertol (ismember with a tolerance) to handle any potential floating point errors.
thetah_1st = Phi_intrp3(ismembertol(spacing_Pin, P_in1st));
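To see why exact membership tests are fragile here, the same effect can be reproduced outside MATLAB. The following is only an illustrative Python sketch: the grid is built with a simple multiply-and-add (MATLAB's colon operator may round differently), the three target values are made up from the ranges given in the question, and the helper in_grid with its 1e-9 tolerance is an assumption, not what ismembertol does internally.

```python
import math

# Classic demonstration that decimal fractions are not exact in binary
print(0.1 + 0.2 == 0.3)  # False

# Rough stand-in for spacing_Pin = -27:0.001:2
grid = [-27 + k * 0.001 for k in range(29001)]

# Hypothetical values with two decimals, shifted by 0.76 as in the question
targets = [-25.98 - 0.76, -0.02 - 0.76, -10.5 - 0.76]


def in_grid(value, grid, tol=1e-9):
    """True if some grid point lies within an absolute tolerance of value."""
    return any(math.isclose(value, g, abs_tol=tol) for g in grid)


exact_hits = [t for t in targets if t in grid]               # may miss values
tolerant_hits = [t for t in targets if in_grid(t, grid)]     # finds all three

print(len(exact_hits), len(tolerant_hits))
```

How many exact matches survive depends on how the rounding errors happen to line up, which is consistent with only some of the 40 values being found in the question.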
As simple as in the title. I have an n x 1 vector p. I'm interested in the maximum value of r = p/foo - floor(p/foo), with foo being a scalar, so I just call:
max_value = max(p/foo-floor(p/foo))
How can I get which value of p gave out max_value?
I thought about calling:
[max_value, max_index] = max(p/foo-floor(p/foo))
but soon I realised that max_index is pretty useless. Sorry for asking this, real beginner here.
Having broken the issue into pieces, I realized there's no unique correspondence between the values of p and the values in my related vector p/foo-floor(p/foo), so there's a logical issue rather than a language one.
However, given my input data, I know that the solution is unique. How can I fix this?
I ended up doing:
result = p(p/foo-floor(p/foo) == max(p/foo-floor(p/foo)))
Looks terrible, so if you know any other way...
Once you have the index, use it:
result = p(max_index)
You can create a new vector with your, let's say, "transformed" values:
p2 = (p/foo-floor(p/foo))
and then just use find to find the max values on p2:
max_index = find(p2 == max(p2))
That will return the index or indices of p2 with the max value of that operation. Finally, just look up the original value in p:
p(max_index)
in 1 line, this is:
p(find((p/foo-floor(p/foo) == max((p/foo-floor(p/foo))))))
which is basically the same thing you did in the end :)
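For readers outside MATLAB, the same "compute the transformed values, take the index of the maximum, then index back into p" idea can be sketched in Python. The sample values of p and foo below are made up for illustration:

```python
import math

p = [3.2, 7.9, 5.5, 10.0]   # hypothetical data
foo = 2.0                   # hypothetical scalar

# fractional part of p/foo for every element
frac = [x / foo - math.floor(x / foo) for x in p]

# index of the largest fractional part (first one wins if there are ties)
max_index = max(range(len(p)), key=lambda i: frac[i])

# the element of p that produced the maximum
result = p[max_index]
print(max_index, result)  # 1 7.9
```

As in the MATLAB version, the index is only useless if you forget that it refers to positions in p as well as in the transformed vector.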
I am looking to take one particular number, or a range of numbers, from a set of numbers.
Example:
A = [-10,-2,-3,-8, 0 ,1, 2, 3, 4 ,5,7, 8, 9, 10, -100];
How can I take just the number 5 from the set above?
How can I take a range of numbers, for example from -3 to 4, from A?
Please help.
Thanks
I don't know what you are trying to accomplish by this, but you could check each entry of the set and test if it's in the specified range of numbers. The test for a single number could be accomplished by testing each number explicitly, or as a special case of the range check where the lower and the upper bound are the same number.
This is looping and testing, no matter what the programming language is, although most programming languages have built-in methods for accomplishing this type of task (so you may want to specify which language you are supposed to use for your homework):
procfun get_element:
    index = 0
    for element in set:
        if element is 5 then return (element, index)
        increment index
Your "5" is then in element and at set[index].
getting a range:
procfun getrange:
    subset = []
    index = 0
    for element in set:
        if element is -3:
            # found the lower bound; collect until the upper bound shows up
            while index < length(set):
                push set[index] in subset
                if set[index] is 4:
                    return subset
                increment index
            # if we met "-3" but we didn't meet "4" then there's no such range
            return None
        # keep searching for a "-3"
        increment index
    return None
If run against A, subset would be [-3, -8, 0, 1, 2, 3, 4]; this is a "first matched, first grabbed" poor man's algorithm. On sorted sets the algorithms can get smarter and faster.
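Since no language was given, here is how the pseudocode above might look in Python, as a rough sketch. The function names find_value, first_range and within_bounds are made up; within_bounds is an extra variant that keeps every entry whose value lies between the bounds, in case that is what "range" was meant to be:

```python
def find_value(values, target):
    """Return (element, index) of the first match, or None if absent."""
    for index, element in enumerate(values):
        if element == target:
            return element, index
    return None


def first_range(values, start, stop):
    """Positional slice from the first `start` to the next `stop`, inclusive."""
    for begin, element in enumerate(values):
        if element == start:
            for end in range(begin, len(values)):
                if values[end] == stop:
                    return values[begin:end + 1]
            return None  # met `start` but never met `stop`
    return None


def within_bounds(values, low, high):
    """All entries whose value lies in [low, high], in original order."""
    return [v for v in values if low <= v <= high]


A = [-10, -2, -3, -8, 0, 1, 2, 3, 4, 5, 7, 8, 9, 10, -100]
print(find_value(A, 5))         # (5, 9)
print(first_range(A, -3, 4))    # [-3, -8, 0, 1, 2, 3, 4]
print(within_bounds(A, -3, 4))  # [-2, -3, 0, 1, 2, 3, 4]
```

Note the difference between the last two results: the positional version includes -8 because it sits between -3 and 4 in the list, while the value-based version drops it and picks up -2 instead.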