In pandas we can use the map function to map a dict over a Series, creating another Series with the mapped values. More generally, I believe it invokes the index operator of the argument, i.e. [].
import pandas as pd
dic = { 1: 'a', 2: 'b', 3: 'c' }
pd.Series([1, 2, 3, 4]).map(dic) # returns ["a", "b", "c", NaN]
I haven't found a way to do so directly in Polars, but have found a few alternatives. Would any of these be the recommended way to do so, or is there a better way?
import polars as pl
dic = { 1: 'a', 2: 'b', 3: 'c' }
# Approach 1 - apply
pl.Series([1, 2, 3, 4]).apply(lambda v: dic.get(v, None)) # returns ["a", "b", "c", null]
# Approach 2 - left join
(
    pl.Series([1, 2, 3, 4])
    .alias('key')
    .to_frame()
    .join(
        pl.DataFrame({
            'key': list(dic.keys()),
            'value': list(dic.values()),
        }),
        on='key', how='left',
    )['value']
) # returns ["a", "b", "c", null]
# Approach 3 - to pandas and back
pl.from_pandas(pl.Series([1, 2, 3, 4]).to_pandas().map(dic)) # returns ["a", "b", "c", null]
I saw this answer on mapping a dict of expressions, but since it chains when/then/otherwise it might not work well for huge dicts.
Mapping a python dictionary over a polars Series should always be considered an anti-pattern. This will be terribly slow and what you want is semantically equal to a join.
Use joins. They are heavily optimized, multithreaded and don't use python.
Example
import polars as pl
dic = { 1: 'a', 2: 'b', 3: 'c' }
mapper = pl.DataFrame({
    "keys": list(dic.keys()),
    "values": list(dic.values())
})
pl.Series([1, 2, 3, 4]).to_frame("keys").join(mapper, on="keys", how="left").to_series(1)
Series: 'values' [str]
[
"a"
"b"
"c"
null
]
Polars is an awesome tool but even awesome tools aren't meant for everything and this is one of those cases. Using a simple python list comprehension is going to be faster.
You could just do:
[dic[x] if x in dic.keys() else None for x in [1,2,3,4]]
On my computer, %%timeit measures that at 800 ns.
Compare that with
pl.Series([1, 2, 3, 4]).to_frame("keys").join(pl.DataFrame([{'keys':x, 'values':y} for x,y in dic.items()]), on="keys", how="left").to_series(1)
which takes 434µs.
Note that the first figure is in nanoseconds and the second in microseconds, so it's really 800 ns vs. 434,000 ns.
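To reproduce this kind of measurement outside IPython, a minimal sketch with the standard-library timeit module could look like the following. The helper name lookup and the repeat count are my own choices, not part of the answer above, and the figure will differ from machine to machine. With only four elements, the result mostly reflects fixed per-call overhead.

```python
import timeit

dic = {1: 'a', 2: 'b', 3: 'c'}
values = [1, 2, 3, 4]


def lookup(values, mapping):
    """Plain-Python lookup; keys missing from the mapping become None."""
    return [mapping.get(v) for v in values]


result = lookup(values, dic)
print(result)  # ['a', 'b', 'c', None]

# Average time per call in nanoseconds (hypothetical repeat count).
n = 100_000
per_call_ns = timeit.timeit(lambda: lookup(values, dic), number=n) / n * 1e9
print(f"{per_call_ns:.0f} ns per call")
```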
I had an exercise where I had to put the duplicate elements of a List into individual lists. Everything works fine, but my question is how to order the result alphabetically, so that it starts with List(a,a,a,a,a,a) rather than List(e,e,e,e).
I tried using sortBy at the end, but no combination worked for me.
My code:
val list2 = List('a', 'a', 'a', 'a', 'b', 'c', 'c', 'a', 'a', 'd', 'e', 'e', 'e', 'e')
def sortedSublist(l: List[Char]): List[List[Char]] = {
  l.groupBy(identity).map{case (key, values) => values}.toList
}
println(sortedSublist(list2))
Current result is:
List(List(e, e, e, e), List(a, a, a, a, a, a), List(b), List(c, c), List(d))
mck suggested a good answer.
You can also sort by key after groupBy:
l.groupBy(identity).toList.sortBy(_._1).map(_._2)
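For comparison, the same group-then-sort-by-key idea can be sketched in Python. This is only an illustration of the technique, not a translation anyone in this thread posted; the function name sorted_sublists is made up.

```python
from collections import defaultdict


def sorted_sublists(items):
    """Group equal items together, then order the groups by their key."""
    groups = defaultdict(list)
    for item in items:
        groups[item].append(item)
    # Sorting the keys plays the role of sortBy(_._1) in the Scala one-liner.
    return [groups[key] for key in sorted(groups)]


chars = list("aaaabccaadeeee")  # same characters as list2 in the question
grouped = sorted_sublists(chars)
print(grouped)
```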
I am trying to populate a ListView in Flutter with different sources. So, I have two lists,
list1 = ['a', 'b', 'c']; // the list isn't of numeric type
list2 = ['2', '4'];
Now, I can combine them using the spread operator and get the following output:
[a, b, c, 2, 4]
but I want the output to be:
[a, 2, b, 4, c]
How can this be achieved? What's the most idiomatic approach?
The built-in Iterable has no zip method, but you can write something like:
Iterable<T> zip<T>(Iterable<T> a, Iterable<T> b) sync* {
  final ita = a.iterator;
  final itb = b.iterator;
  bool hasa, hasb;
  // Non-short-circuit | so that both iterators advance on every pass.
  while ((hasa = ita.moveNext()) | (hasb = itb.moveNext())) {
    if (hasa) yield ita.current;
    if (hasb) yield itb.current;
  }
}
Then use zip:
final list1 = ['a', 'b', 'c'];
final list2 = ['2', '4'];
final res = zip(list1, list2);
print(res); // (a, 2, b, 4, c)
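The interleaving logic itself is language-independent. As a rough sketch of the same algorithm in Python (the name interleave and the sentinel object are assumptions made for illustration), itertools.zip_longest can pair the items up and pad the shorter input:

```python
from itertools import zip_longest

# Private sentinel marking "no item", so that None remains a legal list value.
_MISSING = object()


def interleave(a, b):
    """Yield items alternately from a and b; leftovers follow in order."""
    for x, y in zip_longest(a, b, fillvalue=_MISSING):
        if x is not _MISSING:
            yield x
        if y is not _MISSING:
            yield y


merged = list(interleave(['a', 'b', 'c'], ['2', '4']))
print(merged)  # ['a', '2', 'b', '4', 'c']
```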
I guess this:
List list1 = [1, 3, 5];
List list2 = [2, 4];
print([...list1, ...list2]..sort());
second list sorted according to the first
[['E', 'E', 'C'], ['B', 'C', 'A'], ['E', 'B', 'F'], ['D', 'F', 'E']]
ref_list = ['c','b','a']
# sort [1,2,3] according to ref_list... viz: ['c','a','b'] => [3,1,2]
ordering = sorted(range(len(ref_list)), key=lambda i: ref_list[i])
for j in range(len(list2)):
    list2[j] = [list2[j][i] for i in ordering]
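The snippet above does not define list2, so here is a self-contained sketch of the same argsort-style approach. The sample data and the name reordered are assumptions chosen for illustration, and this version builds a new list instead of modifying list2 in place.

```python
ref_list = ['C', 'B', 'A']
list2 = [['C', 'E', 'E'], ['C', 'B', 'A'], ['F', 'B', 'E'], ['E', 'F', 'D']]

# Positions of ref_list in sorted order: 'A' sits at index 2, 'B' at 1, 'C' at 0.
ordering = sorted(range(len(ref_list)), key=lambda i: ref_list[i])

# Apply the same permutation to every row of list2.
reordered = [[row[i] for i in ordering] for row in list2]
print(reordered)
# [['E', 'E', 'C'], ['A', 'B', 'C'], ['E', 'B', 'F'], ['D', 'F', 'E']]
```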
Here is some quick code I made; it works, but it may need some refactoring:
def getListIndexes(some_list):
    return [x for x in enumerate(some_list)]
list1 = [['C', 'B', 'A']]
list2 = [['C', 'E', 'E'], ['C', 'B', 'A'], ['F', 'B', 'E'], ['E', 'F', 'D']]
values_before_1 = getListIndexes(list1[0])
values_after_1 = getListIndexes(list(sorted(list1[0])))
mapping = dict.fromkeys(list(range(len(list1[0]))))
for item in values_before_1:
    before, then = [(item[0],x[0]) for i, x in enumerate(values_after_1) if item[1]==x[1]][0]
    mapping[then] = before
results = [[None, None, None] for l in range(len(list2))]
for i, item in enumerate(list2):
    values = getListIndexes(item)
    for j in range(len(item)):
        results[i][mapping[j]] = list2[i][j]
print("[*] list1 with content: {} has been sorted and now is like: {}\n".format(list1[0], sorted(list1[0])))
print("[*] list2 with content:\n\n{}\n".format(list2))
print("...has been sorted based on list1 sorting and now looks like this...\n")
print(results)
Output:
[*] list1 with content: ['C', 'B', 'A'] has been sorted and now is like: ['A', 'B', 'C']
[*] list2 with content:
[['C', 'E', 'E'], ['C', 'B', 'A'], ['F', 'B', 'E'], ['E', 'F', 'D']]
...has been sorted based on list1 sorting and now looks like this...
[['E', 'E', 'C'], ['A', 'B', 'C'], ['E', 'B', 'F'], ['D', 'F', 'E']]
I have a byte array (or more precisely a ByteString) of UTF8 strings, which are prefixed by their length as 2-bytes (msb, lsb). For example:
val z = akka.util.ByteString(0, 3, 'A', 'B', 'C', 0, 5,
'D', 'E', 'F', 'G', 'H',0,1,'I')
I would like to convert this to a list of strings, so the result should look like List("ABC", "DEFGH", "I").
Is there an elegant way to do this?
(EDIT) These strings are NOT null terminated, the 0 you are seeing in the array is just the MSB. If the strings were long enough, the MSB would be greater than zero.
Edit: Updated based on clarification in comments that first 2 bytes define an int. So I converted it manually.
def convert(bs: List[Byte]) : List[String] = {
  bs match {
    case count_b1 :: count_b2 :: t =>
      val count = ((count_b1 & 0xff) << 8) | (count_b2 & 0xff)
      val (chars, leftover) = t.splitAt(count)
      new String(chars.toArray, "UTF-8") :: convert(leftover)
    case _ => List()
  }
}
Call it as convert(z.toList).
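The framing itself (a 2-byte big-endian length followed by that many UTF-8 bytes) can also be sketched iteratively. The following Python version is only an illustration of the format described in the question; the function name parse_length_prefixed is made up, and like the splitAt version above it does not validate truncated input.

```python
def parse_length_prefixed(data):
    """Split bytes into strings, each preceded by a 2-byte big-endian length."""
    strings = []
    pos = 0
    while pos + 2 <= len(data):
        # MSB first, LSB second, as described in the question.
        length = int.from_bytes(data[pos:pos + 2], "big")
        pos += 2
        strings.append(data[pos:pos + length].decode("utf-8"))
        pos += length
    return strings


payload = bytes([0, 3]) + b"ABC" + bytes([0, 5]) + b"DEFGH" + bytes([0, 1]) + b"I"
decoded = parse_length_prefixed(payload)
print(decoded)  # ['ABC', 'DEFGH', 'I']
```

Because the length is read from both bytes, a string of 256 bytes or more (where the MSB is non-zero) is handled the same way.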
Consider the multiSpan method as defined here, which is a repeated application of span over a given list:
z.multiSpan(_ == 0).map( _.drop(2).map(_.toChar).mkString )
Here the spanning condition is whether an item equals 0, then we drop the first two prefixing bytes, and convert the remaining to a String.
Note: when using multiSpan, remember to import annotation.tailrec.
Here is my answer with foldLeft.
def convert(z : ByteString) = z.foldLeft((List() : List[String], ByteString(), 0, 0))((p, b : Byte) => {
  p._3 match {
    case 0 if p._2.nonEmpty => (p._2.utf8String :: p._1, ByteString(), -1, b.toInt)
    case 0 => (p._1, p._2, -1, b.toInt)
    case -1 => (p._1, p._2, (p._4 << 8) + b.toInt, 0)
    case _ => (p._1, p._2 :+ b, p._3 - 1, 0)
  }
})
It works like this:
scala> val bs = ByteString(0, 3, 'A', 'B', 'C', 0, 5, 'D', 'E', 'F', 'G', 'H',0,1,'I')
scala> val k = convert(bs); (k._2.utf8String :: k._1).reverse
k: (List[String], akka.util.ByteString, Int, Int) = (List(DEFGH, ABC),ByteString(73),0,0)
res20: List[String] = List(ABC, DEFGH, I)