Write a program in Scala that reads a String from the keyboard and counts the occurrences of each character, ignoring whether it is upper or lower case.
ex: Avocado
R: A = 2; v = 1; o = 2; c = 1; d = 1;
So, I tried to do it with two for loops iterating over the string, and a conditional that converts the character at position (x) to upper case and compares it with the character at position (y), also converted to upper case. Whenever they match I increment the counter, ex: Ava -> A = 2; v = 1;
But with this logic when i print the result it comes with:
ex: Avocado
R: A = 2; v = 1; o = 2; c = 1; a = 2; d = 1; o = 2;
It repeats the same character, upper or lower case, in the result...
My teacher asked us to solve this using the split method and yield of Scala, but I don't know how to use split without forEach(), which he doesn't allow us to use.
Sorry for the bad English.
object ex8 {
  def main(args: Array[String]): Unit = {
    println("Write a string")
    var string = readLine()
    var cont = 0
    for (x <- 0 to string.length - 1) {
      for (y <- 0 to string.length - 1) {
        if (string.charAt(x).toUpper == string.charAt(y).toUpper)
          cont += 1
      }
      print(string.charAt(x) + " = " + cont + "; ")
      cont = 0
    }
  }
}
Scala 2.13 has added a very handy method to cover this sort of thing.
inputStr.groupMapReduce(_.toUpper)(_ => 1)(_+_)
.foreach{case (k,v) => println(s"$k = $v")}
//A = 2
//V = 1
//C = 1
//O = 2
//D = 1
It might be easier to group the individual elements of the String (i.e. a collection of Chars, made case-insensitive with toLower) to aggregate their corresponding size using groupBy/mapValues:
"Avocado".groupBy(_.toLower).mapValues(_.size)
// res1: scala.collection.immutable.Map[Char,Int] =
// Map(a -> 2, v -> 1, c -> 1, o -> 2, d -> 1)
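Since the question asks for split and yield without foreach, here is a minimal sketch along those lines. The names CharCount and countChars are made up for illustration, and it assumes that splitting on the empty string is an acceptable way to satisfy the "use split" requirement. Everything is upper-cased first so that 'A' and 'a' are counted together, and each distinct letter is reported once, in order of first appearance.

```scala
object CharCount {
  // Hypothetical helper: counts characters case-insensitively,
  // keeping the order in which each letter first appears.
  def countChars(input: String): Seq[(Char, Int)] = {
    // split("") breaks the string into one-character strings;
    // the filter drops any empty piece an older JVM might produce.
    val letters = input.toUpperCase.split("").filter(_.nonEmpty)
    // for/yield builds the result without calling foreach.
    for (letter <- letters.distinct.toSeq)
      yield (letter.head, letters.count(_ == letter))
  }

  def main(args: Array[String]): Unit = {
    val result = countChars("Avocado")
    println(result.map { case (c, n) => s"$c = $n" }.mkString("; "))
  }
}
```

Iterating over letters.distinct rather than over every position is what avoids the repeated entries seen in the question's output.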
Scala 2.11
I tried the classic word-count approach of map => group => reduce:
val exampleStr = "Avocado R"
exampleStr.
toLowerCase.
trim.
replaceAll(" +","").
toCharArray.map(x => (x,1)).groupBy(_._1).
map(x => (x._1,x._2.length))
Answer :
exampleStr: String = Avocado R
res3: scala.collection.immutable.Map[Char,Int] =
Map(a -> 2, v -> 1, c -> 1, r -> 1, o -> 2, d -> 1)
There are a few things I "know" about Perl:
lists flatten
functions take and return lists
So if I have this:
sub my_test {
    my @x = qw(a b c);
    my @y = qw(x y z t);
    return (@x, @y);
}
say my_test; # a b c x y z t
say scalar my_test;
I expect either of two result values:
7, because that's how many items there are in the list qw(a b c x y z t). Indeed, this is what I get from scalar sub { @{[ qw(a b c x y z t) ]} }->().
't', because if you interpret the commas as the comma operator (sigh) you get ('a', 'b', 'c', 'x', 'y', 'z', 't') which evaluates to 't'. Indeed, this is what I get from scalar sub { qw(a b c x y z t) }->().
What you do get instead is… 4, without warning. Why did I get a mix of list flattening and comma operator?
Similar story with hashes and this rather popular pattern:
sub default_override_hash {
    my %defaults = (foo => 'bar', egg => 'quuz');
    my %values = @_;
    my %overrides = (__baz => '');
    return (%defaults, %values, %overrides);
}
scalar default_override_hash; # '1/8'
How does scalar know that default_override_hash returned three hashes, and it should not only just get %overrides (and not everything, and not ''), but its scalar representation as a hash?
The most important point is: Lists don't flatten. (Lists are flat, but they don't flatten, because to do that they'd have to be nested first.)
, (the comma operator) in list context is list concatenation. A , B in list context evaluates A and B in list context, then concatenates the results.
A , B in scalar context works like C (or JavaScript): It evaluates A in void context, then evaluates (and returns) B in scalar context. (A , B in void context works the same, but evaluates B in void context too.)
In return X, the context of X is the context of the function call itself. Thus sub my_test { return X } scalar my_test is like scalar X. It's like return dynamically looks at the context the current sub call is in, and evaluates its operand expression accordingly.
perldoc perlsub says:
A return statement may be used to exit a subroutine, optionally specifying the returned value, which will be evaluated in the appropriate context (list, scalar, or void) depending on the context of the subroutine call.
As described above, @x, @y in scalar context evaluates @x in void context (which for an array does nothing), then evaluates and returns @y in scalar context. An array in scalar context yields the number of elements it contains.
The same logic applies to %defaults, %values, %overrides. , is left-associative, so this parses as (%defaults , %values) , %overrides.
This evaluates %defaults , %values in void context (which in turn evaluates %defaults, then %values in void context, which has no effect), then evaluates and returns %overrides in scalar context. A hash in scalar context returns a (rather useless) string describing hash internals.
I know this question has been asked before here (compare multiple hashes for common keys merge values). As far as I can tell, it went unanswered. If you answer, please include an example that uses the List::Compare->new() constructor.
List::Compare has the ability to accept multiple arrays as input. However, there are no examples that explain how to do so if you do not know in advance how many will be passed to the constructor.
Example from the man page:
$lcm = List::Compare->new(\@Al, \@Bob, \@Carmen, \@Don, \@Ed);
or...
You may use the 'single hashref' constructor format to build a
List::Compare object to process three or more lists at once:
$lcm = List::Compare->new( { lists => [\@Al, \@Bob, \@Carmen, \@Don, \@Ed], } );
or
$lcm = List::Compare->new( {
    lists => [\@Al, \@Bob, \@Carmen, \@Don, \@Ed],
    unsorted => 1, } );
I need to use this 'single hashref' constructor above, because I don't know how many lists (arrays) will be passed to the constructor. The closest I come is this:
my %l;
my @a = ("fred", "barney", "pebbles", "bambam", "dino");
my @b = ("george", "jane", "elroy", "judy");
my @c = ("homer", "bart", "marge", "maggie");
my @d = ("fred", "barney", "pebbles", "bambam", "dino");
my @e = ("fred", "george", "jane", "elroy", "judy", "pebbles");
$l{'lists'}{'a'} = [ @a ];
$l{'lists'}{'b'} = [ @b ];
$l{'lists'}{'c'} = [ @c ];
$l{'lists'}{'d'} = [ @d ];
$l{'lists'}{'e'} = [ @e ];
my $lc = List::Compare->new(\%l);
my @intersection = $lc->get_intersection;
print @intersection . "\n";
I am getting:
Need to define 'lists' key properly: at /usr/local/share/perl5/List/Compare.pm line 21.
The Compare.pm code (line 21) is:
die "Need to define 'lists' key properly: $!"
unless ( ${$argref}{'lists'}
and (ref(${$argref}{'lists'}) eq 'ARRAY') );
Can someone tell me how to construct and name this hash from simple arrays? I need to be able to process a wide and varied number of them continuously. There could possibly be hundreds of arrays involved.
Update
@Borodin's answer was exactly what I needed. I apologize for the bad data; I was trying to come up with something concise. Here is what I derived from that code:
my @sets_to_process = qw( DOW SP500 has_trend_lines_day );
my @sets;
my $num_sets = $#sets_to_process;
for my $i (0 .. $num_sets) {
    my @set = get_ids_in_list( $dbh, $sets_to_process[$i] );
    push @sets, \@set;
}
my $lc = List::Compare->new(@sets);
my @intersection = $lc->get_intersection;
print "Sets:\n";
printf " %s\n", join ', ', @$_ for @sets;
print "\n";
print "Intersection:\n";
printf " %s\n", @intersection ? join(', ', @intersection) : 'None';
print "\n";
The problem with your parameter construction is that you are defining %l (a dreadful identifier, by the way) as a hash containing a hash of arrays, like this
(
lists => {
a => ["fred", "barney", "pebbles", "bambam", "dino"],
b => ["george", "jane", "elroy", "judy"],
c => ["homer", "bart", "marge", "maggie"],
d => ["fred", "barney", "pebbles", "bambam", "dino"],
e => ["fred", "george", "jane", "elroy", "judy", "pebbles"],
},
)
but the documentation is clear that it should be a simple hash containing an array of arrays
(
lists => [
["fred", "barney", "pebbles", "bambam", "dino"],
["george", "jane", "elroy", "judy"],
["homer", "bart", "marge", "maggie"],
["fred", "barney", "pebbles", "bambam", "dino"],
["fred", "george", "jane", "elroy", "judy", "pebbles"],
],
)
Furthermore, it doesn't help your problem that “[You] don't know how many lists (arrays) will be passed to the constructor” because all that you are doing is pushing the problem inside a data structure instead of keeping it at the parameter level
It is difficult to help you with the data you have given because the intersection is the empty set, so here's a sample program you can experiment with that generates between five and ten sets of seventeen random letters of the alphabet. It creates different data to work with each time it is run, and passes the list of array references directly as parameters to the new constructor instead of using a reference to a hash with a single lists element
use strict;
use warnings;
use List::Util 'shuffle';
use List::Compare;
my @sets;
my $num_sets = 5 + rand(6);
for (1 .. $num_sets) {
    my @set = (shuffle 'A' .. 'Z')[0..16];
    push @sets, [ sort @set ];
}
my $lc = List::Compare->new(@sets);
my @overlap = $lc->get_intersection;
print "Sets:\n";
printf " %s\n", join ', ', @$_ for @sets;
print "\n";
print "Intersection:\n";
printf " %s\n", @overlap ? join(', ', @overlap) : 'None';
print "\n";
sample outputs
Sets:
B, C, D, E, F, G, K, L, M, O, P, Q, S, U, V, W, X
B, C, D, F, G, I, J, L, M, P, R, T, U, V, W, X, Y
A, B, C, D, F, G, H, K, L, M, O, R, T, U, V, W, Y
A, B, D, G, H, I, K, L, M, O, R, T, U, V, W, Y, Z
A, B, C, D, E, F, H, J, K, L, M, P, Q, S, U, V, Z
Intersection:
B, D, L, M, U, V
Sets:
A, B, C, D, F, J, K, L, M, N, Q, R, U, V, W, X, Y
A, E, F, G, H, I, J, L, O, P, Q, R, S, T, V, X, Z
B, E, G, H, J, K, L, M, N, P, S, T, U, V, W, Y, Z
B, C, D, E, F, G, H, I, J, N, O, Q, R, T, V, W, Z
A, B, C, E, F, G, H, I, L, N, O, Q, T, U, W, X, Y
Intersection:
None
Update
With regard to the updated code in your question, your identifier $num_sets is wrongly named, as it is the index of the final element in @sets, or one less than the actual number of sets
If you want to use a variable then you should say
my $num_sets = @sets_to_process;
and then loop like this
for my $i ( 0 .. $num_sets-1 ) { ... }
But in this case you don't need the indices at all, and it's probably best to forget about $num_sets and write just this
for my $set ( @sets_to_process ) {
    my @set = get_ids_in_list($dbh, $set);
    push @sets, \@set;
}
or even just use map like this
my @sets = map [ get_ids_in_list($dbh, $_) ], @sets_to_process;
I have a huge file (does not fit into memory) which is tab separated with two columns (key and value), and pre-sorted on the key column. I need to call a function on all values for a key and write out the result. For simplicity, one can assume that the values are numbers and the function is addition.
So, given an input:
A 1
A 2
B 1
B 3
The output would be:
A 3
B 4
For this question, I'm not so much interested in reading/writing the file, but more in the list comprehension side. It is important though that the whole content (input as well as output) doesn't fit into memory. I'm new to Scala, and coming from Java I'm interested what would be the functional/Scala way to do that.
Update:
Based on AmigoNico's comment, I came up with the below constant memory solution.
Any comments / improvements are appreciated!
val writeAggr = (kv : (String, Int)) => {println(kv._1 + " " + kv._2)}
writeAggr(
  ( ("", 0) /: scala.io.Source.fromFile("/tmp/xx").getLines ) { (keyAggr, line) =>
    val Array(k,v) = line split ' '
    if (keyAggr._1.equals(k)) {
      (k, keyAggr._2 + v.toInt)
    } else {
      if (!keyAggr._1.equals("")) {
        writeAggr(keyAggr)
      }
      (k, v.toInt)
    }
  }
)
This can be done quite elegantly with Scalaz streams (and unlike iterator-based solutions, it's "truly" functional):
import scalaz.stream._
val process =
  io.linesR("input.txt")
    .map { _.split("\\s") }
    .map { case Array(k, v) => k -> v.toInt }
    .pipe(process1.chunkBy2(_._1 == _._1))
    .map { kvs => s"${ kvs.head._1 } ${ kvs.map(_._2).sum }\n" }
    .pipe(text.utf8Encode)
    .to(io.fileChunkW("output.txt"))
Not only will this read from the input, aggregate the lines, and write to the output in constant memory, but you also get nice guarantees about resource management that e.g. source.getLines can't offer.
You probably want to use a fold, like so:
scala> ( ( Map[String,Int]() withDefaultValue 0 ) /: scala.io.Source.fromFile("/tmp/xx").getLines ) { (map,line) =>
val Array(k,v) = line split ' '
map + ( k -> ( map(k) + v.toInt ) )
}
res12: scala.collection.immutable.Map[String,Int] = Map(A -> 3, B -> 4)
Folds are great for accumulating results (unlike for-comprehensions). And since getLines returns an Iterator, only one line is held in memory at a time.
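As a side note, the /: operator is deprecated as of Scala 2.13, so the same fold might be spelled with foldLeft. The sketch below is only an illustration: FoldSum and sumByKey are made-up names, the lines come from an in-memory Iterator rather than a file, and it assumes key and value are separated by whitespace.

```scala
object FoldSum {
  // Hypothetical helper: folds the lines into a Map of per-key sums.
  // The Map of results is held in memory; the lines themselves are not.
  def sumByKey(lines: Iterator[String]): Map[String, Int] =
    lines.foldLeft(Map.empty[String, Int]) { (acc, line) =>
      // Assumes exactly two whitespace-separated fields per line.
      val Array(k, v) = line.split("\\s+")
      acc.updated(k, acc.getOrElse(k, 0) + v.toInt)
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for scala.io.Source.fromFile(...).getLines
    val lines = Iterator("A 1", "A 2", "B 1", "B 3")
    println(sumByKey(lines))
  }
}
```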
UPDATE: OK, there is a new requirement that we not hold the results in memory either. In that case I think I'd just write a recursive function and use it like so:
scala> val kvPairs = scala.io.Source.fromFile("/tmp/xx").getLines map { line =>
val Array(k,v) = line split ' '
( k, v.toInt )
}
kvPairs: Iterator[(String, Int)] = non-empty iterator
scala> final def loop( key:String, soFar:Int ) {
  if ( kvPairs.hasNext ) {
    val (k,v) = kvPairs.next
    if ( k == key )
      loop( k, soFar+v )
    else {
      println( s"$key $soFar" )
      loop(k,v)
    }
  } else println( s"$key $soFar" )
}
loop: (key: String, soFar: Int)Unit
scala> val (k,v) = kvPairs.next
k: String = A
v: Int = 1
scala> loop(k,v)
A 3
B 4
But the only thing functional about that is that it uses a recursive function rather than a loop. If you are OK with holding all of the values for a particular key in memory you could write a function that iterates over the lines of the file producing an Iterator of Iterators of like-keyed pairs, which you could then just sum and print, but the code would still not be particularly functional and it would be slower.
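A variation on the iterator idea can be sketched with a buffered iterator that yields one (key, sum) pair per run of like-keyed lines. Rather than collecting the values for a key, this version keeps only a running total, so memory use stays constant. SortedAggregate and sumByKey are hypothetical names, the input is an in-memory Iterator standing in for getLines, and it assumes the pairs arrive already sorted (or at least grouped) by key.

```scala
object SortedAggregate {
  // Hypothetical helper: lazily sums the values of consecutive pairs
  // that share a key. Only the current running total is kept in memory.
  def sumByKey(pairs: Iterator[(String, Int)]): Iterator[(String, Int)] =
    new Iterator[(String, Int)] {
      // buffered lets us peek at the next key without consuming it
      private val in = pairs.buffered
      def hasNext: Boolean = in.hasNext
      def next(): (String, Int) = {
        val (key, first) = in.next()
        var total = first
        // Consume following pairs only while the key stays the same.
        while (in.hasNext && in.head._1 == key) total += in.next()._2
        (key, total)
      }
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for the tab-separated lines of the real file
    val lines = Iterator("A\t1", "A\t2", "B\t1", "B\t3")
    val pairs = lines.map { line =>
      val Array(k, v) = line.split("\t")
      (k, v.toInt)
    }
    sumByKey(pairs).foreach { case (k, total) => println(s"$k $total") }
  }
}
```

Because the result is itself an Iterator, it can be written out pair by pair without ever holding the whole output.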
Travis's Scalaz pipeline solution looks like an interesting one along those lines, but with the iteration hidden behind some handy constructs. If you specifically want a functional solution, I'd say his is the best answer.