How do you compare hashes for equality that contain different key formats (some strings, some symbols) in Ruby? - ruby-1.9.3

I'm using Ruby 1.9.3 and I need to compare two hashes that have different key formats. For example, I want the following two hashes to compare as equal:
hash_1 = {:date => 2011-11-01, :value => 12}
hash_2 = {"date" => 2011-11-01, "value" => 12}
Any ideas on how these two hashes can be compared in one line of code?

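Stringify the keys on the hash that has symbols (stringify_keys comes from ActiveSupport, so it is available in Rails or wherever ActiveSupport is loaded):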
> hash_1.stringify_keys
=> {"date"=>"2011-11-01", "value"=>12}
Then compare. So, your answer, in one line, is:
> hash_1.stringify_keys == hash_2
=> true
You could also do it the other way around, symbolizing the string keys in hash_2 instead of stringifying them in hash_1:
> hash_1 == hash_2.symbolize_keys
=> true
If you want the stringification/symbolization to be a permanent change, use the bang (!) version: stringify_keys! or symbolize_keys!, respectively.
> hash_1.stringify_keys! # <- Permanently changes the keys in hash_1 into Strings
=> {"date"=>"2011-11-01", "value"=>12} # as opposed to temporarily changing them for comparison
Ref: http://as.rubyonrails.org/classes/HashWithIndifferentAccess.html
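Outside Rails, where ActiveSupport may not be loaded, the same comparison can be sketched in plain Ruby. This is only a minimal sketch for flat (non-nested) hashes: the helper name stringify is made up for this example, and the dates are assumed to be quoted strings.

```ruby
# Plain-Ruby sketch (no ActiveSupport); Hash[] with an array of pairs
# is available on Ruby 1.9.3 and later.
# stringify is a hypothetical helper name, not a built-in method.
def stringify(hash)
  # Convert every key to a String, leaving the values untouched.
  Hash[hash.map { |key, value| [key.to_s, value] }]
end

hash_1 = { :date => "2011-11-01", :value => 12 }
hash_2 = { "date" => "2011-11-01", "value" => 12 }

# Normalizing both sides means it does not matter which hash used symbols.
puts stringify(hash_1) == stringify(hash_2)
```

Normalizing both hashes rather than just one keeps the one-liner correct even when you don't know in advance which side has the symbol keys.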
Also, I'm guessing you meant to put quotes around the dates...
:date => "2011-11-01"
...or, explicitly instantiate them as Date objects?
:date => Date.parse("2011-11-01")
The way the date is written now, :date is set to the bare expression 2011-11-01, which Ruby interprets as a series of integers with subtraction between them.
That is:
> date = 2011-11-01
=> 1999 # <- integer value of 2011, minus 11, minus 1
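To make the difference concrete, here is a small sketch comparing the unquoted expression with two ways of building a real Date. It assumes only the standard date library; the variable names are illustrative.

```ruby
require 'date'

# Unquoted, this is integer arithmetic: 2011 - 11 - 01
# (the literal 01 is octal for 1).
as_integer = 2011-11-01

# Date.parse reads an ISO 8601 string;
# Date.new takes year, month and day as integers.
parsed = Date.parse("2011-11-01")
built  = Date.new(2011, 11, 1)

puts as_integer       # 1999
puts parsed == built  # true
puts parsed.to_s      # 2011-11-01
```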

Related

Is there a Raku method that allows a hash value to be sorted and split from its pair?

I am currently trying to use hashes in an array to find the keys and values of each specific item in the array. I am able to do this and both the keys and the values are separate when I haven't sorted the array, but when I create a sorted array such as:
my @sorted_pairs = %counts{$word}.sort(*.value);
It binds the values together. Is there a method for sorted hash values that allows the pairs to be split into separate entities within the array? I want to be able to access the "word" string and the count (the number of times that word was seen, as an integer) separately.
I am using this source as a reference. I have tried a handful of these methods and while it does seem to sort the array by numeric value given the output:
sorted array : [do => 1 rest => 1 look => 1 wanted => 1 give => 1
imagine => 2 read => 2 granted => 2 ever => 2 love => 2 gonna => 2
feel => 2 meant => 2 like => 2 you => 2 live => 3 wrote => 3 come => 3
know => 3 are => 3 mom => 4]
it doesn't separate the key and value from one another.
Word-counting in Raku
You might want to save your results as a (Bag-ged) array-of-hashes (pairs?), then print out either .keys or .values as necessary.
raku -e 'my @aoh = words.Bag.pairs.sort(*.values).reverse; \
.say for @aoh;' Ishmael.txt
Sample Input (Ishmael.txt):
Call me Ishmael. Some years ago--never mind how long precisely --having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen, and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people's hats off--then, I account it high time to get to sea as soon as I can.
Using the code above you get the following Output (full output truncated by specifying $_.value >= 4):
raku -e 'my @aoh = words.Bag.pairs.sort(*.values).reverse; \
.say if ($_.value >= 4) for @aoh ;' Ishmael.txt
I => 8
the => 7
and => 6
of => 4
to => 4
a => 4
And it's simple enough to return just .keys by changing the second statement to .keys.put for @aoh:
$ raku -e 'my @aoh = words.Bag.pairs.sort(*.values).reverse; \
.keys.put if ($_.value >= 4) for @aoh ;' Ishmael.txt
I
the
and
a
to
of
Or return just .values by changing the second statement to .values.put for @aoh:
$ raku -e 'my @aoh = words.Bag.pairs.sort(*.values).reverse; \
.values.put if ($_.value >= 4) for @aoh ;' Ishmael.txt
8
7
6
4
4
4
[Note: the above is a pretty quick-and-dirty example of word-counting code in Raku. It doesn't handle punctuation, capitalization, etc., but it's a start.]
https://docs.raku.org/language/hashmap#Looping_over_hash_keys_and_values

How to understand this kind of variable name that combines _ with other characters in Perl?

How should I understand this kind of value in Perl?
my %opt = ( _argv => join(" ", @ARGV), _cwd => cwd() );
Are _argv and _cwd both strings?
From the reference:
The => operator (sometimes pronounced "fat comma") is a synonym for the comma except that it causes a word on its left to be interpreted as a string if it begins with a letter or underscore and is composed only of letters, digits and underscores. This includes operands that might otherwise be interpreted as operators, constants, single number v-strings or function calls. If in doubt about this behavior, the left operand can be quoted explicitly.
my %hash = ('a' => 'b', 'c' => 'd');
can be written as
my %hash = (a => 'b', c => 'd');
Thanks, everyone. Now I think _argv and _cwd are both just barewords, equal to the strings "_argv" and "_cwd".

Removing a key from a Perl hash

I have a hash in which its keys are hashes. I want to rename some of the keys inside the primary hash by adding a key with the desired name and deleting the unwanted key. I succeeded in adding a key, but I'm unable to delete the original key.
This statement isn't working
delete $primary_hash{$sec_hash_key};
If I print the value of $primary_hash{$sec_hash_key} it's returning $HASH(0X*). I don't know what is missing in syntax?
In Perl, hash keys are always strings. If you specify a non-string object as a hash key, perl will stringify it to be able to use it as a key. Therefore, when you say:
I have hash in which it's [sic] keys are hashes
you are wrong. They are not hashes, they are strings.
Now, if you did something like:
my %h = (a => 1);
my %g = (%h => 2);
That would have created %g as:
(a => 1, 2 => undef);
If, instead, you did %g = (\%h => 2), that would have created something along the lines of:
%g = (
'HASH(0x7ff92882cbd8)' => 2
);
Note that the key is a string. You cannot go back to the data structure from that string.
What do you mean by 'delete'? Do you want to free the memory, or do you just want the
key to test as undefined when checking for it in an if statement?
The latter you can achieve by setting the key's value to undef:
$primary_hash{$sec_hash_key} = undef;
But please provide a full working example of your problem, so that
it can be reproduced.

Perl Xpath: search item before a date year

I have an xml database that contains films, for example:
<film id="5">
<title>The Avengers</title>
<date>2012-09-24</date>
<family>Comics</family>
</film>
From a Perl script I want to find film by date.
If I search for films from an exact year, for example:
my $query = "//collection/film[date = 2012]";
it works and returns all films from 2012. But if I search for all films before a given year, it doesn't work. For example:
my $query = "//collection/film[date < 2012]";
it returns all films.
Well, as usual, there's more than one way to do it. Either you let the XPath tool know that it should compare dates (it doesn't know from the start) with something like this:
my $query = '//collection/film[xs:date(./date) < xs:date("2012-01-01")]';
... or you just bite the bullet and just compare the 'yyyy' substrings:
my $query = '//collection/film[substring(date, 1, 4) < "2012"]';
The former is better semantically, I suppose, but requires an advanced XML parser tool which supports XPath 2.0. And the latter was successfully verified with XML::XPath.
UPDATE: I'd like to give my explanation of why your first query works. See, you don't compare dates there; you compare numbers, but only because of the '=' operator. Quote from the doc:
When neither object to be compared is a node-set and the operator is =
or !=, then the objects are compared by converting them to a common
type as follows and then comparing them. If at least one object to be
compared is a boolean, then each object to be compared is converted to
a boolean as if by applying the boolean function. Otherwise, if at
least one object to be compared is a number, then each object to be
compared is converted to a number as if by applying the number
function.
See? Your '2012-09-24' was converted to a number and became 2012. Which, of course, is equal to 2012.
This doesn't work with any other comparison operators, though: that's why you need to either use substring, or convert the date string to a number. I suppose the first approach would be more readable, and perhaps faster as well.
Use this XPath to check the year:
//collection/film[substring-before(date, '-') < '2012']
Your Perl script will be,
my $query = "//collection/film[substring-before(date, '-') < '2012']";
OR
my $query = "//collection/film[substring-before(date, '-') = '2012']";
Simply use:
//collection/film[translate(date, '-', '') < 20120101]
This removes the dashes from the date then compares it for being less than 2012-01-01 (with the dashes removed).
In the same way you can get all films with dates prior to a given date (not only a year):
//collection/film[translate(date, '-', '') < translate($theDate, '-', '')]

Perl Parsing Log/Storing Results/Reading Results

A while back I created a log parser. The logs can be several thousands of lines up to millions of lines. I store the parsed entries in an array of hash refs.
I am looking for suggestions on how to store my output, so that I can quickly read it back in if the script is run again (this prevents the need to re-parse the log).
The end goal is to have a web interface that will allow users to create queries (basically treating the parsed output like it existed within a database).
I have already considered writing the output of Data::Dumper to a file.
Here is an example array entry printed with Data::Dumper:
$VAR =
{
'weekday' => 'Sun',
'index' => 26417,
'timestamp' => '1316326961',
'text' => 'sys1 NSP
Test.cpp 1000
This is a example error message.
',
'errname' => 'EM_TEST',
'time' => {
'array' => [
2011,
9,
18,
'06',
22,
41
],
'stamp' => '20110918062241',
'whole' => '06:22:41',
'hour' => '06',
'sec' => 41,
'min' => 22
},
'month' => 'Sep',
'errno' => '2261703',
'dayofmonth' => 18,
'unknown2' => '1',
'unknown3' => '1',
'year' => 2011,
'unknown1' => '0',
'line' => 219154
},
Is there a more efficient way of accomplishing my goal?
If your output is an object (or if you want to make it into an object), then you can use KiokuDB (along with a database back end of your choice). If not, then you can use Storable. Of course, if your data structure essentially mimics a CSV file, then you can just write the output to file. Or you can output the data into a JSON object that you can store in a file. Or you can forgo the middleman and simply use a database.
You mentioned that your data structure is an "array of hashes" (presumably you mean an array of hash references). If the keys of each hash reference are the same, then you can store this in CSV.
You're unlikely to get a specific answer without being more specific about your data.
Edit: Now that you've posted some sample data, you can simply write this to a CSV file or a database with the values for index,timestamp,text,errname,errno,unknown1,unknown2,unknown3, and line.
use Storable;
# fill my hash
store \%hash, 'file';
%hash = ();
%hash = %{retrieve('file')};
# print my hash
You can always use KiokuDB, Storable or what have you, but if you are planning to do aggregation, a relational database (or some other data store that supports queries) may be the best solution in the long run. A lightweight data store with an SQL engine, such as SQLite, which doesn't require running a database server, could be a good starting point.