Cassandra: get_range_slices of TimeUUID super column? - perl

I have a schema with row keys 1-n. Each row holds a variable number of supercolumns whose 'name' is a TimeUUID. I'm hoping to query this data by time range.
Two issues have come up.
First issue: the values I put into KeyRange for 'start_key' and 'end_key' are being misunderstood (for lack of a better term) by Thrift. Experimenting with different groups of values, I'm not seeing what I expect and often get back something completely unexpected.
Example: my row keys run from 1 to 1000 with lots of random gaps. I put start_key = 50 and end_key = 20, and I get back rows with keys ranging from 99 to 414.
Example: I have a known row with key = 13. Putting this value into both start_key and end_key gives me no results.
Second issue: even when I do get results, the 'columns' portion of the 'keyslice' is always empty. I have checked via cassandra-cli and I know there is data.
I'm using Perl as follows:
my $slice_range = new Cassandra::SliceRange();
$slice_range->{ start } = create_UUID( UUID::Tiny::UUID_TIME, "2010-12-24 00:00:00" );
$slice_range->{ finish } = create_UUID( UUID::Tiny::UUID_TIME, "2011-12-25 00:00:00" );
my $slice_predicate = new Cassandra::SlicePredicate();
$slice_predicate->{ slice_range } = $slice_range;
my $key_range = new Cassandra::KeyRange();
$key_range->{ start_key } = 13;
$key_range->{ end_key } = 13;
my $result = $client->get_range_slices( $column_parent, $slice_predicate, $key_range, $consistency_level );
print Dumper( $result );
Clearly I'm misunderstanding some basic precept.
EDIT: It turns out that the Perl library I'm using is not properly documented. The UUID creation was not working as advertised. I opened it up, fixed it, and now it's all behaving more as I expected. I can slice my supercolumns by date/time range. I'm still working on getting the key range portion to work.

http://wiki.apache.org/cassandra/FAQ#range_rp covers why you're not seeing what you expect with key ranges.
You need to specify a SlicePredicate that contains the actual range of what you're trying to select. The default of no column_names and no slice_range will result in the empty columns list that you see.
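To see why a key range such as 50 to 20 can return seemingly unrelated rows, recall that RandomPartitioner orders rows by the MD5 hash of the key rather than by the key itself. The Python sketch below only illustrates that effect. It does not reproduce Cassandra's exact token computation, and `token_order` is a hypothetical helper, not part of any Cassandra client.

```python
import hashlib

def token_order(keys):
    """Sort row keys by the MD5 digest of their string form.

    Illustration only: this mimics the idea of hash-ordered rows,
    not Cassandra's exact token arithmetic.
    """
    return sorted(keys, key=lambda k: hashlib.md5(str(k).encode()).digest())

keys = list(range(1, 21))
hashed = token_order(keys)
print(hashed)  # same keys, but in hash order rather than numeric order
```

A range scan walks rows in that hash order, so start_key and end_key delimit a stretch of hashes, not a stretch of numbers.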

Mongoid query with AND and OR condition

I am using MongoDB for a new app I am creating, and I only need certain data from it. I require the following fields: current (boolean), position, sector, and name. The position field is populated with one of four values ("cash", "SPVR", "OFF", or "training"), while the sector field has one of two values ("Break" or "Working"). I want a query that will give me all records where the following conditions hold:
1. current = TRUE
2. Position = "SPVR" or "CASH" OR Sector = "BREAK" and
3. Name is not NULL
The problem I have is that there are hundreds of records that have a value of BREAK, but most of them do not have a value in the Name field. I don't want any records that do not have a name associated with them. So in that respect, in line 2 above, I'm guessing the OR could be AND?
This is my current attempt at the Mongoid statement:
get '/currentstate*' do
StateTransaction.where( current: true, :position.in =>["SPVR", "CASH"] ).to_json
end
Any help would be greatly appreciated.
Thanks!!
This should work:
StateTransaction.where(current: true, :name.nin => ["", nil]).and(
StateTransaction.or(
{:position.in =>["SPVR", "CASH"]},
{sector: "BREAK"}
).selector
)
The following is arguably simpler:
sts = StateTransaction.where(current: true, :name.nin => ["", nil])
sts.any_of({:position.in => ["SPVR", "CASH"]}, {sector: 'BREAK'})
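The boolean shape of the requested filter is: current AND (position is SPVR or CASH, OR sector is BREAK) AND name is present. As a rough check of that logic outside the database, here is a Python sketch over made-up records; the `wanted` helper and the sample data are hypothetical.

```python
def wanted(rec):
    """current AND (position in SPVR/CASH OR sector == BREAK) AND name present."""
    has_name = rec.get("name") not in ("", None)
    in_scope = rec.get("position") in ("SPVR", "CASH") or rec.get("sector") == "BREAK"
    return bool(rec.get("current")) and in_scope and has_name

# Hypothetical sample records, not taken from the question's data.
records = [
    {"current": True, "position": "SPVR", "sector": "Working", "name": "Ann"},
    {"current": True, "position": "OFF", "sector": "BREAK", "name": None},
    {"current": True, "position": "OFF", "sector": "BREAK", "name": "Bob"},
    {"current": False, "position": "CASH", "sector": "Working", "name": "Cy"},
]
selected = [r["name"] for r in records if wanted(r)]
print(selected)  # ['Ann', 'Bob']
```

The OR between position and sector stays an OR. The name requirement is a separate AND applied to the whole group, which is what drops the nameless BREAK records.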

Strange object value instead of float by using mapReduce in mongodb with Doctrine

I use a Mongo query to calculate the total price for every item.
My query looks like this:
$queryBuilder = new Query\Builder($this, $documentName);
$queryBuilder->field('created')->gte($startDate);
$queryBuilder->field('is_test_value')->notEqual(true);
..........
$queryBuilder->map('function() {emit(this.item, this.price)}');
$queryBuilder->reduce('function(item, valuesPrices) {
return {sum: Array.sum(valuesPrices)}
}');
This mostly works. But in some cases (roughly 20 out of 200 results) the sum field holds a strange result: instead of the sum I see something like
[objectObject]444444444444444
where 4 is the price of the item.
I tried replacing the body of the reduce function with this:
var sum = 0;
for (var i = 0; i < valuesPrices.length; i++) {
sum += parseFloat(valuesPrices[i]);
}
return {sum: sum}
In that case I get NaN.
I suspected that some price values had been inserted incorrectly (as a string or an object rather than a float). I ran my query from the mongo CLI and saw that all price values are integers.
It's not "strange" at all. You "broke the rules" and now you are paying for it.
"MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key."
The primary rule of mapReduce (as cited above) is that the "reducer" must return exactly the same structure that the "mapper" emits. This is because the "reducer" can actually run several times for the same "key". This is how mapReduce processes large lists.
You fix this by just returning a singular value, just like you did in the emit:
return Array.sum(valuesPrices);
And then there will not be a problem. Adding an object key to that makes the data inconsistent, and thus you get an error when the "reduced" result gets fed back into the "reducer" again.
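The re-reduce behaviour can be imitated outside MongoDB. The Python sketch below is a simulation under stated assumptions, not MongoDB's actual scheduling: `run_in_batches` is a hypothetical driver that reduces the first batch and then feeds that output back in alongside each later batch. Python raises an error where JavaScript would silently coerce the object into "[object Object]" and concatenate, but the cause is the same.

```python
def reduce_obj(key, values):
    # Mirrors the question's reducer: returns a dict, unlike the emitted numbers.
    return {"sum": sum(values)}

def reduce_num(key, values):
    # Returns the same kind of value that the map step emitted.
    return sum(values)

def run_in_batches(reducer, key, values, batch=3):
    """Reduce the first batch, then re-reduce the result with each later batch."""
    acc = reducer(key, values[:batch])
    for i in range(batch, len(values), batch):
        acc = reducer(key, [acc] + values[i:i + batch])
    return acc

prices = [4] * 7
print(run_in_batches(reduce_num, "item", prices))  # 28

try:
    run_in_batches(reduce_obj, "item", prices)
except TypeError as exc:
    print("object-returning reducer failed on re-reduce:", exc)
```

With only one batch the object-returning reducer appears to work, which matches the question's observation that most keys were fine and only some were broken.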

Specman coverage: Is there a way to define ranges using variable?

I have comp_val, which takes values between 1 and 100.
In addition, I have an input variable period (with the same range). I need to cover two ranges of comp_val: [1..period] and [period+1..100]. Something like this:
cover some_event_e is {
item period using no_collect;
item comp_val using no_collect,
ranges = {
range([1..period], "Smaller_than_period");
range([period+1..100], "Bigger_than_period");
};
};
(This code causes a compilation error, since a variable cannot be used inside range.)
Is there a way to collect the coverage?
Thank you for your help.
Ranges must be constant.
But if I understood your intent correctly, you can define new items like
cover some_event_e is {
item smaller_or_equal_than_period: bool = (comp_val in [1..period]) using
ignore = (not smaller_or_equal_than_period);
item greater_than_period: bool = (comp_val in [min(100,period+1)..100]) using
ignore = (not greater_than_period);
};
Assuming period is always in [1..100].

Why won't my factor column value change to a date value?

I know this is elementary but I can't seem to figure it out, even after reading other posts.
In a dataset, I want to convert an entire column to dates. The current class is factor.
The values in the field look like this: 12/25/2012
This is what I've tried.
C$DateofDeath=as.Date(C$DateofDeath,'%m/%d/%Y')
Error in as.Date.default(C$DateofDeath, "%m/%d/%Y") :
do not know how to convert 'C$DateofDeath' to class “Date”
C$DateofDeath=as.Date(C$DateofDeath,"%m/%d/%Y")
Error in as.Date.default(C$DateofDeath, "%m/%d/%Y") :
do not know how to convert 'C$DateofDeath' to class “Date”
Claims$DateofDeath=strptime(as.character(Claims$DateofDeath),format= '%m/%d/%Y')
Error in `$<-.data.frame`(`*tmp*`, "DateofDeath", value = list(sec = numeric(0), :
replacement has 0 rows, data has 71616
Claims$DateofDeath=strptime(as.character(Claims$DateofDeath),format= "%m/%d/%Y")
Error in `$<-.data.frame`(`*tmp*`, "DateofDeath", value = list(sec = numeric(0), :
replacement has 0 rows, data has 71616
Use as.POSIXct, with a format that matches the month/day/year order of your data:
C$DateofDeath <- as.POSIXct(as.character(C$DateofDeath), format = "%m/%d/%Y")
There are lots of R experts here but you have to specify R as one of your tags to get them to notice your question.
Looks like you have tried a bunch of combinations but not the right one.
> C <- data.frame(DateofDeath="12/25/2012",other=TRUE)
> as.Date(as.character(C$DateofDeath),format="%m/%d/%Y")
[1] "2012-12-25"
Notice that as.Date() takes a character input, not a factor. So you need to convert to character, then to Date.
Your strptime() versions seem fine to me, except that they refer to the data frame Claims instead of C. Actually, strptime() should convert the factor to character for you, so you don't need the as.character() part with those.
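Whichever conversion function is used, the format string has to match the field order of the data. The same %m, %d and %Y directives exist in Python's datetime.strptime, so the effect of swapping month and day can be sketched there. `parse_us_date` is a hypothetical helper, and note one difference: Python raises an error on a mismatch, whereas R's as.Date returns NA.

```python
from datetime import datetime

def parse_us_date(text):
    """Parse a month/day/year string such as '12/25/2012'."""
    return datetime.strptime(text, "%m/%d/%Y").date()

print(parse_us_date("12/25/2012"))  # 2012-12-25

try:
    # Day/month order cannot parse this value: there is no month 25.
    datetime.strptime("12/25/2012", "%d/%m/%Y")
except ValueError as exc:
    print("day/month order rejected:", exc)
```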

Cypher: Problems with comparison

I'm trying to execute a Cypher query against a Neo4j database from GWT.
I stored int values in some nodes as the property detail. If I'm using Neoclipse correctly, I have now noticed that these values are stored in the database as String values.
In my query I have the following part which does not work:
START ...
MATCH node-[:SomeTag]->intnode
WHERE intnode.detail < 10
RETURN ...
and I get:
Don't know how to compare that. Left: 15; Right: 10: Don't know how to compare that: Left: 15; Right: 10
So intnode.detail < 10 does not work. I also tried intnode.detail < STR(10), because I thought it would compare the hash values or ASCII values, but I got the same error.
EDIT:
I read that it is possible to set the @GraphProperty while storing data, but how can I do that in GWT?
I mean if I have a node and I could e.g. write
Object obj = (Object) 10;
node.setProperty("detail", obj);
How can I now tell neo4j, that obj is an int?
This answer focuses mostly on your initial question, not on the question you've added in the EDIT part.
I just had a similar problem with a comparison inside the WHERE part of a Cypher query. I tried to do something like
MATCH ...
WHERE value > 1
which caused an error message very similar to yours. After some testing I found that the query works if I add single quotes. This is my solution:
MATCH ...
WHERE value > '1'
(note the quotes)
I've also noticed that this doesn't work with double quotes.
I hope this helps you and/or anyone else who encounters this problem :)
I think the intnode.detail value is stored as a string, so you won't be able to compare it with an integer value.
You have to do it like this:
START ...
MATCH node-[:SomeTag]->intnode
WHERE intnode.detail < "10"
RETURN ...
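One caveat with quoting the literal: the comparison then happens between strings, and string ordering is generally lexicographic rather than numeric. The Python sketch below shows the difference. Whether the Cypher version in use orders strings the same way is an assumption here, and both helper functions are hypothetical.

```python
def less_than_as_text(a, b):
    """Compare the values character by character, as strings."""
    return str(a) < str(b)

def less_than_as_number(a, b):
    """Compare the values after converting both to integers."""
    return int(a) < int(b)

print(less_than_as_text("9", "10"))    # False: '9' sorts after '1'
print(less_than_as_number("9", "10"))  # True
```

If numeric ordering matters, storing the property as a number in the first place avoids the problem.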