Cassandra: Update multiple rows? - nosql

In mysql, I can do the following query, UPDATE mytable SET price = '100' WHERE manufacturer='22'
Can this be done with cassandra?
Thanks!

For that you will need to maintain your own index in a separate column family (you need a way to find all products (product ids) that are manufactured by a certatin manufacturer. When you have all the product ids doing a batch mutation is simple).
E.g
ManufacturerToProducts = { // this is a ColumnFamily
'22': { // this is the key to this Row inside the CF
// now we have an infinite # of columns in this row
'prod_1': "prod_1", //for simplicity I'm only showing the value (but in reality the values in the map are the entire Column.)
'prod_1911' : null
}, // end row
'23': { // this is the key to another row in the CF
// now we have another infinite # of columns in this row
'prod_1492': null,
},
}
(disclaimer: thanks Arin for the annotation used above)

Related

Wanna disable the column (row no:) from sorting usage by Tabulator js plugin

How can I make all columns on a Tabulator table auto sort except the first column
I have tried sortable:false & headerSort:false.
img(1) Initial table
img(2) Sorting table at Name: column.
(Target : Remaining the No: value from 1 to 6 by ascending order even Name: order change.)
Could you please help me to find a solution.
Thanks.
As per;
https://github.com/olifolkerd/tabulator/issues/861
"You need to set the headerSort property in the column definition object for the column you want to not be sortable, not on the table as a whole. the sortable property you are currently using in your column definition was removed in version 3.0"
$("#mytable").tabulator({
height:205, // Set height of table, this enables the Virtual DOM and improves render speed
//layout:"fitColumns", // Fit columns to width of table (optional)
resizableColumns:false, // Disable column resize
responsiveLayout:true, // Enable responsive layouts
placeholder:"No Data Available", // Display message to user on empty table
initialSort:[ // Define the sort order:
{column:"altitude", dir:"asc"}, // 1'st // THIS IS WHAT YOU'RE LOOKING FOR I ASSUMEN
],
columns:[
{title:"Flight", field:"flight", headerSort:false, responsive:0, align:"left"}, // , width:250},
{title:"CallSig", field:"callsign", headerSort:false, responsive:3},
...
Further reading: http://tabulator.info/docs/3.3#sorting
EDIT: You can set sorting programmatically;
$("#example-table").tabulator("setSort", "age", "asc");
Hope this helps.

Create column with aggregated value with calculation in PBI

Imagine you have two tables:
Table User:
ID, Name
Table Orders:
ID, UserID
I'm trying to create a new column in table User which should contain aggregated values of distinct count of Order.IDs.
Calculated column:
OrderCount = CALCULATE(DISTINCTCOUNT(Orders[Id]))
Alternatively if you don't/can't have a relationship between the two tables:
OrderCount2 = CALCULATE(DISTINCTCOUNT(Orders[Id]),FILTER(Orders, Orders[UserId] = User[Id]))
If all you need is to display it in some visualisation, you can use Orders[Id] directly by setting the aggregate option to Count (Distinct) in Values under Visualizations side pane.

filtering on a range of values in a db column with sqlalchemy orm

I have a postgresql database and in one particular table, with many rows. One column in this table, called data, is a float array, REAL[], and gets filled with an array of ~4500 elements. I want to access this table through some query via SQLAlchemy and the ORM.
How do I select all rows in the table where a subset of this column satisfies some condition, e.g.contains a range of values? Like I want to select all rows where the data contains values >= 10, or values between >=10 and <=20.
Can I do this with a straight session query like
rows = session.query(Table).filter(Table.data.(some conditional)).all()
where my conditional is something like "VALUES >= 10 and VALUES <= 20"?
Or do I need to define some special methods, or setup, when I'm defining my SQLAlchemy table class. For example, I have my table set up as
class Table(Base):
__tablename__ = 'table'
__table_args__ = {'autoload' : True, 'schema' : 'testdb', 'extend_existing':True}
data = deferred(Column(ARRAY(Float)))
def __repr__(self):
return '<Table (pk={0})>'.format(self.pk)
Ideally I'd like to set it up so I can just do simple filtering in my session.query calls. Is this possible? I'm not super familiar with the ORM, so maybe it is?
I've had a look at the ARRAY Comparator sqlalchemy docs but those only seem to work on exact values. My data is precise to 6 sigfigs, and I don't know the exact values ahead of time.
What's the best way to do this? Thanks.
EDIT:
Based on the below comment, here is the code I'm using in attempting to select all rows (out of 1000) that have data (from 1 column) >= 1.0. There should be 537 rows.
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).all()
This gives the correct subset number. len(rows) = 537. However, I don't understand the logic of with this operator, where to select data >=1.0 , I use the le operator? Also, along those same lines, there should be 234 rows that have data between the values >=1.0 and <1.0, but this statement fails to give the correct subset..
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).filter(datadb.Table.data.any(1.2,operator=operators.ge)).all()
* EDIT 2 *
Here's an example of my database Table with a few rows. pk is an integer, and data is a real[].
db datadb
schema Table
pk data
0 [0.0,0.0,0.5,0.3,1.3,1.9,0.3,0.0,0.0]
1 [0.1,0.0,1.0,0.7,1.1,1.5,1.2,0.3,1.4]
2 [0.0,0.6,0.4,0.3,1.6,1.7,0.4,1.3,0.0]
3 [0.0,0.1,0.2,0.4,1.0,1.1,1.2,0.9,0.0]
4 [0.0,0.0,0.5,0.3,0.2,0.1,0.7,0.3,0.1]
I have 5 rows, 4 of them have data with values >= 1.0, while just 2 have values in the range >= 1.0 and <= 1.2. The query I would do to grab the rows is in the first case
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).all()
This should return the 4 rows, at pk=0,1,2,3. This query does what I expect. The second case
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).filter(datadb.Table.data.any(1.2,operator=operators.ge)).all()
and should return the 2 rows at pk=1,3. However this query just returns the 4 rows from the first query. For the second query, I also tried
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le),datadb.Table.data.any(1.2,operator=operators.ge)).all()
which also didn't work.
Please read documentation on ARRAY.Comparator, according to which you should be able to do the following:
rows = (session.query(Table)
.filter(Table.data.any(10, operator=operators.le))
.filter(Table.data.any(20, operator=operators.ge)
).all()
EDIT:
# combined filter does not work,
# but applying one or the other is still useful as it reduces the result set
q = (session.query(MyTable)
.filter(MyTable.data.any(1.0, operator=operators.le))
# .filter(MyTable.data.any(1.2, operator=operators.ge))
)
# filter in memory
items = [_row for _row in q.all()
if any(1.0 <= item <= 1.2 for item in _row.data)]
for item in items:
print(item)

One to Many equivalent in Cassandra and data model optimization

I am modeling my database in Cassandra, coming from RDBMS. I want to know how can I create a one-to-many relationship which is embedded in the same Column Name and model my table to fit the following query needs.
For example:
Boxes:{
23442:{
belongs_to_user: user1,
box_title: 'the box title',
items:{
1: {
name: 'itemname1',
size: 44
},
2: {
name: 'itemname2',
size: 24
}
}
},
{ ... }
}
I read that its preferable to use composite columns instead of super columns, so I need an example of the best way to implement this. My queries are like:
Get items for box by Id
get top 20 boxes with their items (for displaying a range of boxes with their items on the page)
update items size by item id (increment size by a number)
get all boxes by userid (all boxes that belongs to a specific user)
I am expecting lots of writes to change the size of each item in the box. I want to know the best way to implement it without the need to use super columns. Furthermore, I don't mind getting a solution that takes Cassandra 1.2 new features into account, because I will use that in production.
Thanks
This particular model is somewhat challenging, for a number of reasons.
For example, with the box ID as a row key, querying for a range of boxes will require a range query in Cassandra (as opposed to a column slice), which means the use of an ordered partitioner. An ordered partitioner is almost always a Bad Idea.
Another challenge comes from the need to increment the item size, as this calls for the use of a counter column family. Counter column families store counter values only.
Setting aside the need for a range of box IDs for a moment, you could model this using multiple tables in CQL3 as follows:
CREATE TABLE boxes (
id int PRIMARY KEY,
belongs_to_user text,
box_title text,
);
CREATE INDEX useridx on boxes (belongs_to_user);
CREATE TABLE box_items (
id int,
item int,
size counter,
PRIMARY KEY(id, item)
);
CREATE TABLE box_item_names (
id int PRIMARY KEY,
item int,
name text
);
BEGIN BATCH
INSERT INTO boxes (id, belongs_to_user, box_title) VALUES (23442, 'user1', 'the box title');
INSERT INTO box_items (id, item, name) VALUES (23442, 1, 'itemname1');
INSERT INTO box_items (id, item, name) VALUES (23442, 1, 'itemname2');
UPDATE box_items SET size = size + 44 WHERE id = 23442 AND item = 1;
UPDATE box_items SET size = size + 24 WHERE id = 23442 AND item = 2;
APPLY BATCH
-- Get items for box by ID
SELECT size FROM box_items WHERE id = 23442 AND item = 1;
-- Boxes by user ID
SELECT * FROM boxes WHERE belongs_to_user = 'user1';
It's important to note that the BATCH mutation above is both atomic, and isolated.
Technically speaking, you could also denormalize all of this into a single table. For example:
CREATE TABLE boxes (
id int,
belongs_to_user text,
box_title text,
item int,
name text,
size counter,
PRIMARY KEY(id, item, belongs_to_user, box_title, name)
);
UPDATE boxes set size = item_size + 44 WHERE id = 23442 AND belongs_to_user = 'user1'
AND box_title = 'the box title' AND name = 'itemname1' AND item = 1;
SELECT item, name, size FROM boxes WHERE id = 23442;
However, this provides no guarantees of correctness. For example, this model makes it possible for items of the same box to have different users, or titles. And, since this makes boxes a counter column family, it limits how you can evolve the schema in the future.
I think in PlayOrm's objects first, then show the column model below....
Box {
#NoSqlId
String id;
#NoSqlEmbedded
List<Item> items;
}
User {
#NoSqlId
TimeUUID uuid;
#OneToMany
List<Box> boxes;
}
The User then is a row like so
rowkey = uuid=<someuuid> boxes.fkToBox35 = null, boxes.fktoBox37=null, boxes.fkToBox38=null
Note, the form of the above is columname=value where some of the columnnames are composite and some are not.
The box is more interesting and say Item has fields name and idnumber, then box row would be
rowkey = id=myid, items.item23.name=playdo, items.item23.idnumber=5634, itesm.item56.name=pencil, items.item56.idnumber=7894
I am not sure what you meant though on get the top 20 boxes? top boxes meaning by the number of items in them?
Dean
You can use Query-Driven Methodology, for data modeling.You have the three broad access paths:
1) partition per query
2) partition+ per query (one or more partitions)
3) table or table+ per query
The most efficient option is the “partition per query”. This article can help you in this case, step-by-step. it's sample is exactly a one-to-many relation.
And according to this, you will have several tables with some similar columns. You can manage this, by Materialized View or batch-log(as alternative approach).

How to set an object to null

How to set an object to null.
Ex:
my object samp contains three fileds
samp.field1,samp.field2.sampfield3
If i set samp:= null;
im getting errors is there a way to set the object value to null.
An sql database does not know about objects, it deals with rows in table.
To remove a row use DELETE :
e.g. :
DELETE FROM samp WHERE id = 12345;
DELETE FROM samp WHERE field1 = 'Delete Me';
The first example is typical to remove individual rows uing their primary key (id in this case)
The second example will remove a group of rows which have a speciic value for a field.