How to create a hash of arrays or hash of hashes - perl

I have a file named Jobs.conf which contains:
JobName: A
JobSize: 100
JobArrival:1
JobExe:100
JobName: B
JobSize: 100
JobArrival:2
JobExe:100
JobName: C
JobSize: 100
JobArrival:3
JobExe:100
JobName: D
JobSize: 100
JobArrival:4
JobExe:100
Is it possible to read this file so each Job is stored in an array and then the 4 arrays are stored in a hash? Would it make more sense to store this as a hash of hashes and is that possible?

The key is to set $/ (the input record separator) properly:
This influences Perl's idea of what a "line" is. Works like awk's RS
variable, including treating empty lines as a terminator if set to the
null string (an empty line cannot contain any spaces or tabs). You may
set it to a multi-character string to match a multi-character
terminator, or to undef to read through the end of file. Setting it
to "\n\n" means something slightly different than setting to "",
if the file contains consecutive empty lines. Setting to "" will
treat two or more consecutive empty lines as a single empty line.
Setting to "\n\n" will blindly assume that the next input character
belongs to the next paragraph, even if it's a newline.
Then we just take advantage of how the records are laid out by splitting each one directly into a hash reference:
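use strict;
use warnings;
use Data::Dump;
# Paragraph mode: each read returns one blank-line-separated record.
local $/ = "";
my @jobs;
while (<DATA>) {
    # Splitting on ":" separators and newlines yields a flat key/value
    # list, which the braces turn into a hash reference.
    push(@jobs, {split(/:\s*|\n/)});
}
dd(\@jobs);
__DATA__
JobName: A
JobSize: 100
JobArrival:1
JobExe:100

JobName: B
JobSize: 100
JobArrival:2
JobExe:100

JobName: C
JobSize: 100
JobArrival:3
JobExe:100

JobName: D
JobSize: 100
JobArrival:4
JobExe:100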
Or just:
my @jobs = map { +{ split(/:\s*|\n/) } } <DATA>;
Output:
[
{ JobArrival => 1, JobExe => 100, JobName => "A", JobSize => 100 },
{ JobArrival => 2, JobExe => 100, JobName => "B", JobSize => 100 },
{ JobArrival => 3, JobExe => 100, JobName => "C", JobSize => 100 },
{ JobArrival => 4, JobExe => 100, JobName => "D", JobSize => 100 },
]
An array of hashes (the previous example) would be my preference, but if you wanted a hash of hashes, you'd need to modify the code slightly:
my %jobs;
while (<DATA>) {
my %temp = split(/:\s*|\n/);
$jobs{delete($temp{JobName})} = \%temp;
}
dd(\%jobs);
Output:
{
A => { JobArrival => 1, JobExe => 100, JobSize => 100 },
B => { JobArrival => 2, JobExe => 100, JobSize => 100 },
C => { JobArrival => 3, JobExe => 100, JobSize => 100 },
D => { JobArrival => 4, JobExe => 100, JobSize => 100 },
}
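For readers who want to see the two shapes side by side outside Perl, here is a minimal sketch in Python of the same idea. It assumes, as the paragraph-mode code does, that records are separated by blank lines; `parse_jobs`, `index_by_name`, and `SAMPLE` are hypothetical names used only for illustration, and values are kept as strings.

```python
import re

# Two sample records, separated by a blank line (an assumption about the file).
SAMPLE = """\
JobName: A
JobSize: 100
JobArrival:1
JobExe:100

JobName: B
JobSize: 100
JobArrival:2
JobExe:100
"""


def parse_jobs(text):
    """Return one dict per blank-line-separated record (a list of dicts)."""
    jobs = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for line in paragraph.splitlines():
            # Split on the first colon only; whitespace around it is optional.
            key, _, value = line.partition(":")
            record[key.strip()] = value.strip()
        jobs.append(record)
    return jobs


def index_by_name(jobs):
    """Turn the list of records into a dict of dicts keyed by JobName."""
    indexed = {}
    for job in jobs:
        rest = dict(job)  # copy so the input list is not modified
        name = rest.pop("JobName")
        indexed[name] = rest
    return indexed


jobs = parse_jobs(SAMPLE)
by_name = index_by_name(jobs)
```

The list form keeps the file order and tolerates duplicate job names; the keyed form gives direct lookup by name but silently overwrites duplicates.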

How to find the combination count in scala?

My dataset contains 5 columns, with the last column as the class index. I want the count of each combination of a column value with a class index value.
"sunny", "hot", "high", "false","no"
"sunny", "hot", "high", "true","no"
"overcast", "hot", "high", "false","yes"
"rainy", "mild", "high", "false","yes"
I want the combination sunny & yes = 0, sunny & no = 2, overcast & yes = 1, rainy & yes = 2.
Gather each row into a case class Weather with 5 properties,
case class Weather(p1: String, p2: String, p3: String, p4: String, p5: String)
and then, given
val xs = Array(
Weather("sunny", "hot", "high", "false","no"),
Weather("sunny", "hot", "high", "true","no"),
Weather("overcast", "hot", "high", "false","yes"),
Weather("rainy", "mild", "high", "false","yes"))
group the entries by the first and last properties and count the size of each group, for instance like this:
xs.groupBy( w => (w.p1,w.p5) ).mapValues(_.size)
which delivers
Map((overcast,yes) -> 1, (sunny,no) -> 2, (rainy,yes) -> 1)
However, this approach does not account for missing or undeclared groups such as "sunny" and "yes".
The description of your dataset seems a little vague to me. Which data structure are you using to represent it?
Assuming each row is a list, you could map it to a pair of its first and last elements:
l => (l.head, l.last)
Applying this to the entire set:
val dataset = List(
"sunny"::"hot"::"high"::"no"::Nil,
"sunny"::"hot"::"high"::"no"::Nil,
"overcast"::"hot"::"high"::"yes"::Nil,
"rainy"::"mild"::"high"::"yes"::Nil
)
val qualified = dataset.map(l => (l.head, l.last))
Once you have your elements qualified with the "yes"/"no" class, you can group the occurrences and count the number of elements in each group:
val countMap = qualified.groupBy(x => x).map(kv => (kv._1, kv._2.size))
Or the shorter form:
val countMap = qualified.groupBy(x => x).mapValues(_.size)
To list all possibilities, including those whose count is 0, you can generate all possible combinations and use the map to look up each count:
(
for(
st <- dataset.map(_.head).toSet[String];
q <- dataset.map(_.last).toSet[String]
) yield (st,q)
).map(k => (k, countMap.getOrElse(k,0)))
> Set(((rainy,no),0), ((sunny,yes),0), ((sunny,no),2), ((rainy,yes),1), ((overcast,yes),1), ((overcast,no),0))

CoffeeScript reduce skips values that are the same

I am using CoffeeScript to aggregate elements from a list into a combined object. However, when I have two values that are the same, one of the values gets left out. Instead of skipping one of these values, how can I get their sum?
metals = [
  { metal: 'silver', amount: 10 }
  { metal: 'gold', amount: 16 }
  { metal: 'iron', amount: 17 }
  { metal: 'iron', amount: 3 }
]
reduction = metals.reduce (x, y) ->
  x[y.metal] = y.amount
  x
, {}
console.log reduction
# => { silver: 10, gold: 16, iron: 3 }, but I would like to get iron: 20
Here is a jsfiddle to help solve the problem: https://jsfiddle.net/822trwez/
If you want reduce to sum things then you have to say so:
reduction = metals.reduce (x, y) ->
  x[y.metal] = (x[y.metal] ? 0) + y.amount
  x
, { }
The x[y.metal] ? 0 is just saying "if x[y.metal] is defined then use it, otherwise use 0". You could also say:
reduction = metals.reduce (x, y) ->
  x[y.metal] = (x[y.metal] || 0) + y.amount
  x
, { }
since you don't care about falsey values for x[y.metal] such as 0, '', false, null, or undefined; in your case you can convert all those to zero.
You could also be more explicit about what you're doing:
reduction = metals.reduce (x, y) ->
  x[y.metal] = 0 if(y.metal !of x)
  x[y.metal] += y.amount
  x
, {}
The x[y.metal] = 0 if(y.metal !of x) just initializes x[y.metal] to zero if x doesn't have a y.metal property already. You could also use unless if you don't like !of:
reduction = metals.reduce (x, y) ->
  x[y.metal] = 0 unless(y.metal of x)
  x[y.metal] += y.amount
  x
, {}
Keep in mind that all reduce does is run the function you give it and feed the function's output back into the next call, so:
[1,2,3].reduce f, i
is just:
f(f(f(i, 1), 2), 3)
What the function f does with its inputs and what it returns is up to you.
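To make that unrolling concrete, here is a small sketch in Python rather than CoffeeScript, where `functools.reduce` plays the role of the array's reduce method. The helper name `add_metal` is hypothetical; it returns a fresh dict on each step so that the nested-call form and the reduce form can be compared directly.

```python
from functools import reduce

metals = [
    {"metal": "silver", "amount": 10},
    {"metal": "gold", "amount": 16},
    {"metal": "iron", "amount": 17},
    {"metal": "iron", "amount": 3},
]


def add_metal(totals, entry):
    """Return a copy of totals with entry's amount added under its metal."""
    updated = dict(totals)
    updated[entry["metal"]] = updated.get(entry["metal"], 0) + entry["amount"]
    return updated


initial = {}

# reduce feeds each result back in as the first argument of the next call...
folded = reduce(add_metal, metals, initial)

# ...which is the same as nesting the calls by hand: f(f(f(f(i, a), b), c), d).
nested = add_metal(
    add_metal(add_metal(add_metal(initial, metals[0]), metals[1]), metals[2]),
    metals[3],
)

print(folded)
```

Both forms produce the same totals, with the two iron entries summed instead of the later one overwriting the earlier one.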

Perl dynamic hash traversal

I have a hash of (0 or more levels of hash refs of) array refs of hash refs. Note that the level above the leaf nodes will always be array refs, even if they only have one element.
I need to fetch the aggregate sum of VALUE (as an array of array refs) while preserving the order of the hash keys (the order of insertion).
Examples :
1)
(
A => {
A1 => [
{ VALUE => 10 },
{ VALUE => 20 }
],
B1 => [
{ VALUE => 30 }
],
},
B => {
A1 => [
{ VALUE => 10 }
],
B1 => [
{ VALUE => 5 }
],
},
C => {
A1 => [
{ VALUE => 100 }
],
},
)
The required output of the above structure will be -
(
[A, A1, 30],
[A, B1, 30],
[B, A1, 10],
[B, B1, 5],
.
.
.
.
)
2)
(
A => [
{ VALUE => 10 },
{ VALUE => 20 }
],
B => [
{ VALUE => 30 }
],
)
The required output of the above structure will be -
(
[A, 30],
[B, 30]
)
You need to write a function that will walk your hash structure and compute the necessary sums. For each key in the hash, it needs to make this decision:
If the value of this key is a list ref, then sum up the VALUE elements in the hashes in this list and return [key, sum]
If the value of this key is a hash ref, then recurse into that hash. If we got a list back from it, append it to our current output and continue.
At the top level (depth 0), print out each list that's returned.
There are a number of details that still need to be resolved, but that ought to get you started on the right track.
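The steps above can be sketched as follows. This is an illustration in Python rather than Perl, chosen because Python dicts keep insertion order by default; a plain Perl hash does not, so a Perl version would need an order-preserving structure (for example a tied hash such as Tie::IxHash). The function name `aggregate` and its `path` parameter are hypothetical.

```python
def aggregate(tree, path=()):
    """Walk nested dicts and return [key..., sum] rows in insertion order."""
    rows = []
    for key, value in tree.items():
        current = path + (key,)
        if isinstance(value, list):
            # Leaf level: sum the VALUE field of every hash in the list.
            rows.append([*current, sum(item["VALUE"] for item in value)])
        elif isinstance(value, dict):
            # Intermediate level: recurse and append whatever comes back.
            rows.extend(aggregate(value, current))
        else:
            raise TypeError(f"unexpected value under {current!r}")
    return rows


example_1 = {
    "A": {"A1": [{"VALUE": 10}, {"VALUE": 20}], "B1": [{"VALUE": 30}]},
    "B": {"A1": [{"VALUE": 10}], "B1": [{"VALUE": 5}]},
    "C": {"A1": [{"VALUE": 100}]},
}
example_2 = {"A": [{"VALUE": 10}, {"VALUE": 20}], "B": [{"VALUE": 30}]}

rows_1 = aggregate(example_1)
rows_2 = aggregate(example_2)

for row in rows_1 + rows_2:
    print(row)
```

Passing the path down as a tuple means the same function handles any nesting depth, including the flat case in the second example.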

what does this ruby code mean?

this code fails:
@user_pages, @users = paginate :users, :per_page => 40, :order => :name
rewriting it like this works:
@users = User.all.paginate(:page => params[:page], :per_page => 40)
but what does @user_pages, @users mean?
I take it that @users is being assigned to @user_pages?
joey
No, @user_pages and @users are two different variables, each assigned one value from the array of values returned on the right-hand side. They are, in effect, value[0] and value[1].
An irb example should help:
MacBook-Pro:~ me$ irb
1.9.3-p429 :001 > a,b = [1,2]
=> [1, 2]
1.9.3-p429 :002 > a
=> 1
1.9.3-p429 :003 > b
=> 2

why do they want to save a hash in an array?

I saw a very strange piece of code in a perl script used in my project, it's something like:
my $arrayRef = [
A => {AA => 11, AAA => 111},
B => {BB => 11, BBB => 111},
];
IMO, it tries to construct an anonymous array from a hash table. I tried to print the array elements, and here is what I got:
foreach (@$arrayRef)
{
    print;
    print "\n";
}
A
HASH(0x1e60220)
B
HASH(0x1e71bd0)
which means it treats every element (key and value) in the hash table as a separate element in the anonymous array. However, I am really confused about why they would want to save a hash into an array. The ONLY benefit I can see is saving some memory if the hash table is really huge. Is this a widely used Perl trick?
Thanks!
"it tries to construct an anonymous array from a hash table."
No, it constructs an anonymous array from a four-element list.
A => {AA => 11, AAA => 111}, B => {BB => 11, BBB => 111}
is exactly the same thing as
'A', {AA => 11, AAA => 111}, 'B', {BB => 11, BBB => 111}
The use of => does imply some sort of relationship, so I suppose they could have used
{ A => {AA => 11, AAA => 111}, B => {BB => 11, BBB => 111} }
or
[ [ A => {AA => 11, AAA => 111} ], [ B => {BB => 11, BBB => 111} ] ]
or any of a million other data structures, but there's no way to know why one was chosen over another from what you gave.
It's an anonymous array in which string keys alternate with anonymous hashes. The answer to your question: it depends on context, since a concrete data structure is chosen to solve a concrete problem. If we have only the data structure and not the problem it was meant to solve, it's harder to imagine why they used this construction.
Perhaps they needed an "ordered hash of hashes": an array preserves the order of its elements, whereas a hash does not.