I need to write a function that takes a List[Int], a start index, and an end index and returns the max element within that range. I have a working recursive solution, but I was wondering if it's possible to do it using a built-in function in the Scala collection library.
Additional criteria are:
1) O(N) run time
2) No intermediate structure creation
3) No mutable data structures
def sum(input:List[Int],startIndex:Int,endIndex:Int): Int = ???
This is easily possible with Scala's views:
def subListMax[A](list: List[A], start: Int, end: Int)(implicit cmp: Ordering[A]): A =
  list.view(start, end).max
view does not create an intermediate data structure, it only provides an interface into a slice of the original one. See the Views chapter in the documentation for the Scala collections library.
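As a small sketch of the view-based approach, the version below spells the slice as view.slice, which is available in both older and newer Scala releases. The object name SubListMax and the choice to treat end as exclusive are assumptions made for illustration.

```scala
object SubListMax {
  // Largest element of list in the half-open index range [start, end).
  // view.slice is lazy, so no intermediate List is built before max runs.
  // An empty range makes max throw, as it does on any empty collection.
  def subListMax[A](list: List[A], start: Int, end: Int)(implicit cmp: Ordering[A]): A =
    list.view.slice(start, end).max
}
```

For instance, SubListMax.subListMax(List(3, 9, 4, 7, 1), 2, 4) looks only at 4 and 7 and returns 7.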
I think it is not possible with your criteria.
I know of no higher-order function that works directly on a sublist of a list. Many just create intermediate collections.
This would be a simple O(n) solution, but it creates an intermediate structure.
input.slice(startIndex, endIndex + 1).max
There are of course also functions that traverse a collection and yield a value, such as reduce and foldLeft. The problem is that they traverse the whole collection, and you can't set a start or end.
You could iterate from startIndex to endIndex and use foldLeft on that to get a value via indexing. An example would be:
(startIndex to endIndex).foldLeft(Integer.MIN_VALUE)((curmax, i) => input(i).max(curmax))
Here we only have a range, which basically behaves like a for-loop and produces no heavy intermediate collections. However, iterating from startIndex to endIndex is O(n), and on each iteration we have an indexing operation (input(i)) that is itself O(n) on a List. So in the end, sum would be O(n^2) in the worst case rather than O(n).
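One way to avoid the repeated indexing, sketched here under the assumption that a short-lived iterator is acceptable under criterion 3, is to slice the list's iterator and fold over that. The names RangeMax and rangeMax are made up for this example, and endIndex is treated as inclusive to match the fold above.

```scala
object RangeMax {
  // Maximum of input(startIndex) .. input(endIndex), both ends inclusive.
  // The iterator skips ahead to startIndex once and then advances one
  // element at a time, so the list is never indexed by position.
  // An empty range yields Int.MinValue, the fold's starting value.
  def rangeMax(input: List[Int], startIndex: Int, endIndex: Int): Int =
    input.iterator
      .slice(startIndex, endIndex + 1)
      .foldLeft(Int.MinValue)((curMax, x) => math.max(curMax, x))
}
```

With this sketch, RangeMax.rangeMax(List(3, 9, 4, 7, 1), 2, 4) returns 7.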
This is of course only my experience with scala and maybe I am misguided.
Ah and on your 3): I'm not sure what you mean by that. Using mutable state inside the function should not be a problem at all, since it only exists to compute your result and then disappear.
Related
I need to have an equivalent of the compareWith(a,b) method in Java, in Scala.
I have a list of strings and I need to sort them by comparing them with each other. sortBy just takes one string and returns a score, but that's not enough in my case: I need to compare two strings with each other and then return a number based on which one is better.
It seems like the only option is to write a custom case class, convert the strings to it, and then convert them back. For performance reasons, I want to avoid this, as I have a large amount of data to process.
Is there a way to do this with just the strings?
I think you are looking for sortWith.
sortWith(lt: (A, A) ⇒ Boolean): Repr
Sorts this sequence according to a comparison function.
Note: will not terminate for infinite-sized collections.
The sort is stable. That is, elements that are equal (as determined by lt) appear in the same order in the sorted sequence as in the original.
lt: the comparison function, which tests whether its first argument precedes its second argument in the desired ordering.
Example:
List("Steve", "Tom", "John", "Bob").sortWith(_.compareTo(_) < 0) =
List("Bob", "John", "Steve", "Tom")
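To illustrate comparing two strings against each other with a custom rule, here is a minimal sketch. The rule itself (shorter strings first, ties broken alphabetically) and the names CustomSort and isBetter are hypothetical; the same predicate can also be wrapped in an Ordering when an API expects one.

```scala
object CustomSort {
  // Hypothetical rule: shorter strings come first; ties are broken alphabetically.
  def isBetter(a: String, b: String): Boolean =
    if (a.length != b.length) a.length < b.length
    else a.compareTo(b) < 0

  val names = List("Steve", "Tom", "John", "Bob")

  // sortWith takes the two-argument predicate directly; no wrapper class is needed.
  val byPredicate: List[String] = names.sortWith(isBetter)

  // The same rule packaged as an Ordering, for use with sorted.
  val byOrdering: List[String] = names.sorted(Ordering.fromLessThan[String](isBetter))
}
```

Both byPredicate and byOrdering come out as List("Bob", "Tom", "John", "Steve").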
In Scala I have a list of tuples List[(String, String)]. So now from this list I want to find how many times each unique tuple appears in the list.
One way to do this would be to apply groupBy { x => x } and then find the length of each group. But my data set is quite large, and it's taking a lot of time.
So is there any better way of doing this?
I would do the counting manually, using a Map. Iterate over your collection/list. During the iteration, build a count map. Keys in the count map are unique items from the original collection/list, values are number of occurrences of the key. If the item being processed during the iteration is in the count collection, increase its value by 1. If not, add value 1 to the count map. You can use getOrElse:
count(current_item) = count.getOrElse(current_item, 0) + 1;
This should be faster than groupBy followed by a length check, and it should also require less memory.
For other suggestions, see also this discussion.
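The counting loop described above could look roughly like the sketch below. The object and method names are assumptions, and the mutable map is confined to the function, which returns an immutable copy.

```scala
import scala.collection.mutable

object TupleCount {
  // Count how often each distinct tuple occurs, in a single pass.
  def countTuples(items: Seq[(String, String)]): Map[(String, String), Int] = {
    val count = mutable.HashMap.empty[(String, String), Int]
    for (item <- items) {
      // First sighting starts from 0, so the stored value becomes 1.
      count(item) = count.getOrElse(item, 0) + 1
    }
    count.toMap
  }
}
```

For example, TupleCount.countTuples(List(("a", "b"), ("c", "d"), ("a", "b"))) gives Map(("a", "b") -> 2, ("c", "d") -> 1).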
For example, I have a Scala RDD with 10000 elements, and I want to process the elements one by one. How do I do that? I tried using take(i).drop(i-1), but it is extraordinarily time-consuming.
According to what you said in the comments:
yourRDD.map(tuple => tuple._2.map(elem => doSomething(elem)))
The first map iterates over the tuples inside your RDD, which is why I called the variable tuple. For every tuple, we then take the second element with ._2 and apply a map that iterates over all the elements of your Iterable, which is why I called the variable elem.
doSomething() is just a random function of your choice to apply on each element.
I am trying to find a data structure that can do a constant-time lookup and then scan the next sorted elements from that point until an end element is reached. Basically, a linear scan on a sorted set, but starting from a specific element instead of the first one, so I can scan a range efficiently. TreeMap might be the right data structure for it; correct me if I'm wrong. I am trying to use its method
slice(from: Int, until: Int): TreeMap[A, B]
and supply the from and until values as the indexOf of the elements where the scan starts and ends. I can't find a method to get the index of a TreeMap element based on its key. I'm sure it exists internally, but is it exposed somewhere? Also, what's the performance of this method? Is it really better than doing a linear scan from the first element?
I think you are looking for TreeMap.from(), TreeMap.iteratorFrom(), or TreeMap.range(). These take keys rather than indices, so no indexOf is needed.
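A small sketch of scanning a key range this way follows; the sample data and the names TreeMapScan, entries, window, and scanned are invented for illustration. Note that locating the start key in a tree-based map is logarithmic rather than constant.

```scala
import scala.collection.immutable.TreeMap

object TreeMapScan {
  val entries = TreeMap(10 -> "a", 20 -> "b", 30 -> "c", 40 -> "d", 50 -> "e")

  // Entries with keys in [20, 40): lower bound inclusive, upper bound exclusive.
  val window: List[(Int, String)] = entries.range(20, 40).toList

  // Iterate in key order starting at the first key >= 25,
  // and stop as soon as a key greater than 40 is seen.
  val scanned: List[(Int, String)] =
    entries.iteratorFrom(25).takeWhile { case (k, _) => k <= 40 }.toList
}
```

Here window is List(20 -> "b", 30 -> "c") and scanned is List(30 -> "c", 40 -> "d").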
Basic question. What is the best way to selectively remove and delete items from a mutable Array in Swift?
There are options that do NOT seem to be suited for this like calling removeObject inside a
for in loop
enumeration block
and others that appear to work in general like
for loop using an index + calling removeObjectAtIndex, even inside the loop
for in loop for populating an arrayWithItemsToRemove and then use originalArray.removeObjectsInArray(arrayWithItemsToRemove)
using .filter to create a new array seems to be really nice, but I am not quite sure how I feel about replacing the whole original array
Is there a recommended, simple and secure way to remove items from an array? One of those I mentioned or something I am missing?
It would be nice to get different takes (with pros and cons) or preferences on this. I still struggle choosing the right one.
If you want to loop and remove elements from a NSMutableArray based on a condition, you can loop the array in reverse order (from last index to zero), and remove the objects satisfying the condition.
For example, if you have an array of integers and want to remove the numbers divisible by three, you can run the loop like this:
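import Foundation

let array: NSMutableArray = [1, 2, 3, 4, 5, 6, 7]
// Walk the indices backwards so a removal never shifts an element still to be checked.
for index in stride(from: array.count - 1, through: 0, by: -1) {
    if let value = array[index] as? Int, value % 3 == 0 {
        // removeObject(at:) is the Swift 3+ spelling of removeObjectAtIndex.
        array.removeObject(at: index)
    }
}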
Looping in reverse order ensures that the index of the array elements still to check doesn't change. In forward mode instead, if you remove for instance the first element, then the element previously at index 1 will change to index 0, and you have to account for that in the code.
Usage of removeObject (which doesn't work with the above code) is not recommended in a loop for performance reasons, because its implementation loops through all elements of the array and uses isEqual to determine whether to remove the object or not. The complexity rises from O(n) to O(n^2): in a worst-case scenario, where all elements of the array are removed, the array is traversed once in the main loop, and traversed again for each element of the array. So all solutions based on enumeration blocks, for-in, etc., should be avoided, unless you have a good reason.
filter instead is a good alternative, and it's what I'd use because:
it's concise and clear: 1 line of code as opposed to 5 lines (including closing brackets) of the index based solution
its performance is comparable to the index-based solution: it is a bit slower, but I think not by much
It might not be ideal in all cases though, because, as you said, it generates a new array rather than operating in place.
When working with NSMutableArray you shouldn't remove objects while you are looping along the mutable array itself (unless looping backwards, as pointed out by Antonio's answer).
A common solution is to make an immutable copy of the array, iterate on the copy, and remove objects selectively on the original mutable array by calling "removeObject" or by calling "removeObjectAtIndex", but you will have to calculate the index, since indexes in the original array and the copy will not match because of the removals (you will have to decrement the "removal index" each time an object is removed).
Another solution (better) is to loop the array once, create an NSIndexSet with the indexes of the objects to remove, and then call "removeObjectsAtIndexes:" on the mutable array.
See documentation on NSMutableArray's "removeObjectsAtIndexes:" in Swift.
Some of the options:
For loop over indexes and calling removeObjectAtIndex: 1) You will have to deal with the fact that when you remove, the index of the following object will become the current index, so you have to make sure to not increment the index in that case; you can avoid this by iterating backwards. 2) Each call to removeObjectAtIndex is O(n) (since it must shift all following elements forwards), so the algorithm is O(n^2).
For loop to build a set of elements to remove and then calling removeObjectsInArray: The first part is O(n). removeObjectsInArray uses a hash table to test elements for removal efficiently; hash table access is O(1) on average but O(n) worst-case, so the algorithm is O(n) on average, but O(n^2) worst-case.
Using filter to create a new array: This is O(n). It creates a new array.
For loop to build an index set of indexes of elements to remove (or with indexesOfObjectsPassingTest), then remove them using removeObjectsAtIndexes: I believe this is O(n). It does not create a new array.
Use filterUsingPredicate using a predicate based on a block of your test: I believe this is also O(n). It does not create a new array.