Cassandra 2.0.7 CQL: SELECT a specific value from a map

ALTER TABLE users ADD todo map<text, text>;
UPDATE users SET todo = { '1':'1111', '2':'2222', '3':'3' ,.... } WHERE user_id = 'frodo';
Now I want to run the following CQL, but it fails. Is there any other method?
SELECT user_id, todo['1'] FROM users WHERE user_id = 'frodo';
PS: The length of my map can change, for example: { '1':'1111', '2':'2222', '3':'3' } or { '1':'1111', '2':'2222', '3':'3', '4':'4444'} or { '1':'1111', '2':'2222', '3':'3', '4':'4444' ...}

If you want to use a map collection, you'll have the limitation that you can only select the collection as a whole (docs).
I think you could use the suggestion from the referenced question, even if the length of your map changes. If you store those key/value pairs for each user_id in separate fields, and make your primary key based on user_id and todo_k, you'll have access to them in the select query.
For example:
CREATE TABLE users (
user_id text,
todo_k text,
todo_v text,
PRIMARY KEY (user_id, todo_k)
);
-----------------------------
| user_id | todo_k | todo_v |
-----------------------------
| frodo | 1 | 1111 |
| frodo | 2 | 2222 |
| sam | 1 | 11 |
| sam | 2 | 22 |
| sam | 3 | 33 |
-----------------------------
Then you can do queries like:
select user_id,todo_k,todo_v from users where user_id = 'frodo';
select user_id,todo_k,todo_v from users where user_id = 'sam' and todo_k = '2';
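The one-row-per-key layout can be tried out without a cluster. The following is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in for Cassandra (an assumption made only so the example runs anywhere; CQL and SQLite differ in many other respects). The helper names build_db, get_todo and get_all_todos are hypothetical.

```python
import sqlite3


def build_db():
    """Create an in-memory table that mirrors the (user_id, todo_k) layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        " user_id TEXT, todo_k TEXT, todo_v TEXT,"
        " PRIMARY KEY (user_id, todo_k))"
    )
    rows = [
        ("frodo", "1", "1111"), ("frodo", "2", "2222"),
        ("sam", "1", "11"), ("sam", "2", "22"), ("sam", "3", "33"),
    ]
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    return conn


def get_todo(conn, user_id, key):
    """Return a single 'map' value, or None when the key is absent."""
    row = conn.execute(
        "SELECT todo_v FROM users WHERE user_id = ? AND todo_k = ?",
        (user_id, key),
    ).fetchone()
    return row[0] if row else None


def get_all_todos(conn, user_id):
    """Return the whole 'map' of one user as a dict."""
    cur = conn.execute(
        "SELECT todo_k, todo_v FROM users WHERE user_id = ? ORDER BY todo_k",
        (user_id,),
    )
    return dict(cur.fetchall())
```

Because every key is its own row, the number of entries per user can grow or shrink freely without any schema change.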

Related

How to query across multiple rows in postgres

I'm saving dynamic objects (objects of which I do not know the type upfront) using the following 2 tables in Postgres:
CREATE TABLE IF NOT EXISTS objects(
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY(id)
);
CREATE TABLE IF NOT EXISTS object_values(
id UUID NOT NULL DEFAULT gen_random_uuid(),
event_id UUID NOT NULL,
param TEXT NOT NULL,
value TEXT NOT NULL
);
So for instance, if I have the following objects:
dog = [
{ breed: "poodle", age: 15, ...},
{ breed: "husky", age: 9, ...},
]
monitors = [
{ manufacturer: "dell", ...},
]
It will live in the DB as follows:
-- objects
| id | user_id | name |
|----|---------|---------|
| 1 | 1 | dog |
| 2 | 2 | dog |
| 3 | 1 | monitor |
-- object_values
| id | event_id | param | value |
|----|----------|--------------|--------|
| 1 | 1 | breed | poodle |
| 2 | 1 | age | 15 |
| 3 | 2 | breed | husky |
| 4 | 2 | age | 9 |
| 5 | 3 | manufacturer | dell |
Note, these tables are big (hundreds of millions). Generally optimised for writing.
What would be a good way of querying/filtering objects based on multiple object params? For instance: Select the number of all husky dogs above the age of 10 per unique user.
I also wonder whether it would have been better to denormalise the tables and collapse the params onto a JSON column (and use gin indexes).
Are there any standards I can use here?
"Select the number of all husky dogs above the age of 10 per unique user" - The following query would do it.
SELECT o.user_id, COUNT(DISTINCT ov.event_id) AS num_husky_dogs_older_than_10
FROM objects o
INNER JOIN object_values ov
ON o.id = ov.event_id
AND o.name = 'dog'
GROUP BY o.user_id
HAVING MAX(CASE WHEN ov.param = 'age'
AND ov.value::integer > 10 THEN 1 END) = 1
AND MAX(CASE WHEN ov.param = 'breed'
AND ov.value = 'husky' THEN 1 END) = 1;
Since your queries will most likely always join these two tables on the same fields, it would be good to have indexes on:
the fields you join on ("objects.id", "object_values.event_id")
the fields you filter on ("objects.name", "object_values.param", "object_values.value")
Check the demo here.
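One caveat with grouping by user: the age condition and the breed condition can then be satisfied by two different dogs of the same user. A minimal sketch of a variant that first groups per object (event_id) and only then counts per user is shown below. It runs against SQLite through Python's sqlite3 module purely for illustration (an assumption; the question targets Postgres, where the cast would be written value::integer). The function names and the extra fourth object in the sample data are hypothetical.

```python
import sqlite3

QUERY = """
SELECT o.user_id, COUNT(*) AS num_dogs
FROM objects o
JOIN (
    -- One row per object that satisfies BOTH conditions.
    SELECT event_id
    FROM object_values
    GROUP BY event_id
    HAVING MAX(CASE WHEN param = 'breed' AND value = :breed THEN 1 END) = 1
       AND MAX(CASE WHEN param = 'age'
                     AND CAST(value AS INTEGER) > :min_age THEN 1 END) = 1
) m ON m.event_id = o.id
WHERE o.name = 'dog'
GROUP BY o.user_id
ORDER BY o.user_id
"""


def build_db():
    """Load the question's sample rows plus one extra dog (object 4)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE object_values ("
        " id INTEGER PRIMARY KEY, event_id INTEGER, param TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO objects VALUES (?, ?, ?)",
        [(1, 1, "dog"), (2, 2, "dog"), (3, 1, "monitor"), (4, 1, "dog")],
    )
    conn.executemany(
        "INSERT INTO object_values (event_id, param, value) VALUES (?, ?, ?)",
        [
            (1, "breed", "poodle"), (1, "age", "15"),
            (2, "breed", "husky"), (2, "age", "9"),
            (3, "manufacturer", "dell"),
            (4, "breed", "husky"), (4, "age", "12"),  # not in the question
        ],
    )
    return conn


def count_matching_dogs(conn, breed, min_age):
    """Return (user_id, count) rows for dogs of `breed` older than `min_age`."""
    return conn.execute(QUERY, {"breed": breed, "min_age": min_age}).fetchall()
```

With this data, user 1 owns an old poodle and an old husky, and only the husky is counted.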

Postgresql - Filter object array and extract required values in a json object

I have a PostgreSQL table like below:
| data |
| -------------- |
| {"name":"a","tag":[{"type":"country","value":"US"}]} |
| {"name":"b","tag":[{"type":"country","value":"US"}, {"type":"country","value":"UK"}]} |
| {"name":"c","tag":[{"type":"gender","value":"male"}]} |
The goal is to extract all the value in "tag" array with "type" = "country" and aggregate them into a text array. The expected result is as follows:
| result |
| -------------- |
| ["US"] |
| ["US", "UK"] |
| [] |
I've tried to expand the "tag" array and aggregate the desired result back; however, it requires a unique id to group up the results. Hence, I add a column with row number to serve as unique id. Here is what I've done:
SELECT ROW_NUMBER() OVER () AS id, * INTO data_table_with_id FROM data_table;
SELECT ARRAY_AGG(tag_value) AS result
FROM (
SELECT
id,
json_array_elements("data"::json->'tag')->>'type' as tag_type,
json_array_elements("data"::json->'tag')->>'value' as tag_value
FROM data_table_with_id
) tags
WHERE tag_type = 'country'
GROUP BY id;
Is it possible to use a single select to filter the object array and get the required results?
You can do this easily with a JSON path function:
select jsonb_path_query_array(data, '$.tag[*] ? (@.type == "country").value')
from data_table;
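To make the path expression easier to read, here is a rough Python equivalent of what it selects for each row: every element of "tag" whose "type" equals "country", reduced to its "value". This is only an illustration of the filter's meaning, not how Postgres evaluates it; the function name country_values is hypothetical.

```python
import json


def country_values(doc, tag_type="country"):
    """Collect the 'value' of every tag whose 'type' matches tag_type."""
    tags = json.loads(doc).get("tag", [])
    return [
        t["value"]
        for t in tags
        if t.get("type") == tag_type and "value" in t
    ]


# The three sample rows from the question.
rows = [
    '{"name":"a","tag":[{"type":"country","value":"US"}]}',
    '{"name":"b","tag":[{"type":"country","value":"US"},'
    ' {"type":"country","value":"UK"}]}',
    '{"name":"c","tag":[{"type":"gender","value":"male"}]}',
]

results = [country_values(r) for r in rows]
```

A row without any matching tag yields an empty list, which matches the expected [] in the question.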

Mysql- SELECT Column 'A' even with NULLS

Table A contains student names, table B and C contain classes and the presence of students.
I would like to display all students together with their attendance. The problem is that students whose attendance has not been checked are not displayed at all.
Where I checked the presence of students it is ok, but if there is no checked presence in a given class, on a given day and in a given subject- nothing is displayed.
My query:
SELECT student.id_student, CONCAT(student.name,' ' ,student.surname) as 'name_surname',pres_student_present, pres_student_absent, pres_student_justified, pres_student_late, pres_student_rel, pres_student_course, pres_student_delegation, pres_student_note FROM student
LEFT JOIN class ON student.no_classes = class.no_classes
LEFT JOIN pres_student ON student.id_student = pres_student.id_student
WHERE (class.no_classes = '$class' OR NULL AND pres_student_data = '$data' AND pres_student_id_subject = $id_subject OR NULL)
GROUP BY student.surname
ORDER BY student.surname ASC
I want to display name_surname always and any other column should have NULL or 1
like:
Name | present | absent | just | late | rel | delegation | note |
Donald Trump | 1 | | | | | | |
Bush | | | | | | | |
Someone | 1 | | | | | | |
etc...
You should move the restrictions on the class and pres_student tables from the WHERE clause to the ON clause of the LEFT JOIN.
When you restrict a column of an outer-joined table in the WHERE clause, the rows without a match (where that column is NULL) are filtered out, so the join effectively behaves like an INNER JOIN:
SELECT student.id_student
, CONCAT(student.name, ' ', student.surname) AS 'name_surname'
, pres_student_present
, pres_student_absent
, pres_student_justified
, pres_student_late
, pres_student_rel
, pres_student_course
, pres_student_delegation
, pres_student_note
FROM student
LEFT JOIN class
ON student.no_classes = class.no_classes
AND class.no_classes = '$class'
LEFT JOIN pres_student
ON student.id_student = pres_student.id_student
AND pres_student_data = '$data'
AND pres_student_id_subject = $id_subject
GROUP BY student.surname
ORDER BY student.surname ASC
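The difference between filtering in WHERE and filtering in ON can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 module with a simplified, hypothetical two-table schema (the column names pres_date and present are stand-ins, not the asker's real columns); the behaviour of LEFT JOIN shown here is the same in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id_student INTEGER PRIMARY KEY, surname TEXT);
CREATE TABLE pres_student (id_student INTEGER, pres_date TEXT, present INTEGER);
INSERT INTO student VALUES (1, 'Trump'), (2, 'Bush');
-- Only one student has an attendance row for this day.
INSERT INTO pres_student VALUES (1, '2020-01-01', 1);
""")

# Filter on the outer-joined table in WHERE: unmatched students disappear.
FILTER_IN_WHERE = """
SELECT s.surname, p.present
FROM student s
LEFT JOIN pres_student p ON s.id_student = p.id_student
WHERE p.pres_date = ?
ORDER BY s.surname
"""

# Same filter inside ON: every student is kept, missing data stays NULL.
FILTER_IN_ON = """
SELECT s.surname, p.present
FROM student s
LEFT JOIN pres_student p
       ON s.id_student = p.id_student AND p.pres_date = ?
ORDER BY s.surname
"""

where_rows = conn.execute(FILTER_IN_WHERE, ("2020-01-01",)).fetchall()
on_rows = conn.execute(FILTER_IN_ON, ("2020-01-01",)).fetchall()
```

Here where_rows contains only the student with a recorded presence, while on_rows lists both students and leaves the missing value as None.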

How to get back aggregate values across 2 dimensions using Python Cubes?

Situation
Using Python 3, Django 1.9, Cubes 1.1, and Postgres 9.5.
These are my data tables in text format:
Store table
------------------------------
| id | code | address |
|-----|------|---------------|
| 1 | S1 | Kings Row |
| 2 | S2 | Queens Street |
| 3 | S3 | Jacks Place |
| 4 | S4 | Diamonds Alley|
| 5 | S5 | Hearts Road |
------------------------------
Product table
------------------------------
| id | code | name |
|-----|------|---------------|
| 1 | P1 | Saucer 12 |
| 2 | P2 | Plate 15 |
| 3 | P3 | Saucer 13 |
| 4 | P4 | Saucer 14 |
| 5 | P5 | Plate 16 |
| and many more .... |
|1000 |P1000 | Bowl 25 |
|----------------------------|
Sales table
----------------------------------------
| id | product_id | store_id | amount |
|-----|------------|----------|--------|
| 1 | 1 | 1 |7.05 |
| 2 | 1 | 2 |9.00 |
| 3 | 2 | 3 |1.00 |
| 4 | 2 | 3 |1.00 |
| 5 | 2 | 5 |1.00 |
| and many more .... |
| 1000| 20 | 4 |1.00 |
|--------------------------------------|
The relationships are:
Sales belongs to Store
Sales belongs to Product
Store has many Sales
Product has many Sales
What I want to achieve
I want to use cubes to be able to do a display by pagination in the following manner:
Given the stores S1-S3:
-------------------------
| product | S1 | S2 | S3 |
|---------|----|----|----|
|Saucer 12|7.05|9 | 0 |
|Plate 15 |0 |0 | 2 |
| and many more .... |
|------------------------|
Note the following:
Even though there were no records in sales for Saucer 12 under Store S3, I displayed 0 instead of null or none.
I want to be able to do sort by store, say descending order for, S3.
The cells indicate the SUM total of that particular product spent in that particular store.
I also want to have pagination.
What I tried
This is the configuration I used:
"cubes": [
{
"name": "sales",
"dimensions": ["product", "store"],
"joins": [
{"master":"product_id", "detail":"product.id"},
{"master":"store_id", "detail":"store.id"}
]
}
],
"dimensions": [
{ "name": "product", "attributes": ["code", "name"] },
{ "name": "store", "attributes": ["code", "address"] }
]
This is the code I used:
result = browser.aggregate(drilldown=['Store','Product'],
order=[("Product.name","asc"), ("Store.name","desc"), ("total_products_sale", "desc")])
I didn't get what I want.
I got it like this:
----------------------------------------------
| product_id | store_id | total_products_sale |
|------------|----------|---------------------|
| 1 | 1 | 7.05 |
| 1 | 2 | 9 |
| 2 | 3 | 2.00 |
| and many more .... |
|---------------------------------------------|
This is the whole table with no pagination, and a product that was not sold in a given store does not show up as zero.
My question
How do I get what I want?
Do I need to create another data table that aggregates everything by store and product before I use cubes to run the query?
Update
I have read more. I realised that what I want is called dicing as I needed to go across 2 dimensions. See: https://en.wikipedia.org/wiki/OLAP_cube#Operations
Cross-posted at Cubes GitHub issues to get more attention.
This is a pure SQL solution using crosstab() from the additional tablefunc module to pivot the aggregated data. It typically performs better than any client-side alternative. If you are not familiar with crosstab(), read this first:
PostgreSQL Crosstab Query
And this about the "extra" column in the crosstab() output:
Pivot on Multiple Columns using Tablefunc
SELECT product_id, product
, COALESCE(s1, 0) AS s1 -- 1. ... displayed 0 instead of null
, COALESCE(s2, 0) AS s2
, COALESCE(s3, 0) AS s3
, COALESCE(s4, 0) AS s4
, COALESCE(s5, 0) AS s5
FROM crosstab(
'SELECT s.product_id, p.name, s.store_id, s.sum_amount
FROM product p
JOIN (
SELECT product_id, store_id
, sum(amount) AS sum_amount -- 3. SUM total of product spent in store
FROM sales
GROUP BY product_id, store_id
) s ON p.id = s.product_id
ORDER BY s.product_id, s.store_id;'
, 'VALUES (1),(2),(3),(4),(5)' -- desired store_id's
) AS ct (product_id int, product text -- "extra" column
, s1 numeric, s2 numeric, s3 numeric, s4 numeric, s5 numeric)
ORDER BY s3 DESC; -- 2. ... descending order for S3
Produces your desired result exactly (plus product_id).
To include products that have never been sold replace [INNER] JOIN with LEFT [OUTER] JOIN.
SQL Fiddle with base query.
The tablefunc module is not installed on sqlfiddle.
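Where tablefunc is not available, the same shape of result can be approximated with conditional aggregation, one SUM(CASE ...) column per store. The following is a minimal sketch of that alternative technique, not of crosstab() itself; it runs on SQLite through Python's sqlite3 module (an assumption made so it is runnable locally), and the function names are hypothetical.

```python
import sqlite3


def build_db():
    """Load a few of the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, code TEXT, name TEXT);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, product_id INTEGER,
                        store_id INTEGER, amount REAL);
    INSERT INTO product VALUES (1, 'P1', 'Saucer 12'), (2, 'P2', 'Plate 15');
    INSERT INTO sales VALUES (1, 1, 1, 7.05), (2, 1, 2, 9.00),
                             (3, 2, 3, 1.00), (4, 2, 3, 1.00), (5, 2, 5, 1.00);
    """)
    return conn


def pivot_sales(conn, store_ids):
    """Return (product name, total per requested store...) rows.

    Stores without sales for a product show 0 instead of NULL.
    """
    # One aggregated column per store; the ids are bound as parameters.
    cols = ", ".join(
        "COALESCE(SUM(CASE WHEN s.store_id = ? THEN s.amount END), 0)"
        for _ in store_ids
    )
    sql = (
        f"SELECT p.name, {cols} "
        "FROM product p LEFT JOIN sales s ON s.product_id = p.id "
        "GROUP BY p.id, p.name ORDER BY p.id"
    )
    return conn.execute(sql, list(store_ids)).fetchall()
```

Calling pivot_sales(build_db(), [1, 2, 3]) gives one row per product with the columns S1 to S3; sorting by a store column can be added with an ORDER BY on the corresponding output position.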
Major points
Read the basic explanation in the reference answer for crosstab().
I am including product_id because product.name is hardly unique. This might otherwise lead to sneaky errors conflating two different products.
You don't need the store table in the query if referential integrity is guaranteed.
ORDER BY s3 DESC works, because s3 references the output column where NULL values have been replaced with COALESCE. Else we would need DESC NULLS LAST to sort NULL values last:
PostgreSQL sort by datetime asc, null first?
For building crosstab() queries dynamically consider:
Dynamic alternative to pivot with CASE and GROUP BY
I also want to have pagination.
That last item is fuzzy. Simple pagination can be had with LIMIT and OFFSET:
Displaying data in grid view page by page
I would consider a MATERIALIZED VIEW to materialize results before pagination. If you have a stable page size I would add page numbers to the MV for easy and fast results.
To optimize performance for big result sets, consider:
SQL syntax term for 'WHERE (col1, col2) < (val1, val2)'
Optimize query with OFFSET on large table
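The two pagination styles mentioned here can be compared side by side. This is a minimal sketch on a hypothetical product table in SQLite (via Python's sqlite3 module, chosen only so the example runs locally): page_by_offset uses LIMIT and OFFSET, while page_after remembers the last id already shown and continues from there, which avoids scanning the skipped rows on large tables.

```python
import sqlite3


def build_db(n=10):
    """Create a product table with ids 1..n (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)",
        [(i, f"P{i}") for i in range(1, n + 1)],
    )
    return conn


def page_by_offset(conn, page, page_size):
    """Simple pagination; `page` is zero-based."""
    return conn.execute(
        "SELECT id, name FROM product ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()


def page_after(conn, last_id, page_size):
    """Keyset pagination: fetch the rows that follow the last id seen."""
    return conn.execute(
        "SELECT id, name FROM product WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()
```

Both calls return the same page as long as the sort key is unique; with a non-unique sort column the keyset variant needs a tie-breaker, which is where the row-value comparison referenced above comes in.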

Sql Query to select missing records based on multiple hard coded ranges

Creating a SQL query that performs math with variables from multiple tables
This question I asked previously will help a bit as far as layout, for the sake of saving time I'll include the important bits and add in more detail for scenarios pertaining to this:
A mockup of what the tables look like:
Inventory
ID | lowrange | highrange | ItemType
----------------------------------------
1 | 15 | 20 | 1
2 | 21 | 30 | 1
3 | null | null | 1
4 | 100 | 105 | 2
MissingOrVoid
ID | Item | Missing | Void
---------------------------------
1 | 17 | 1 | 0
1 | 19 | 1 | 0
4 | 102 | 0 | 1
4 | 103 | 1 | 0
4 | 104 | 1 | 0
TableWithDataEnteredForItemType1
InventoryID| ItemID | Detail1 | Detail2 | Detail3
-------------------------------------------------
1 | 16 | Some | Info | Here
1 | 18 | More | Info | Here
1 | 20 | Data | Is | Here
2 | 21 | .... | .... | ....
2 | 24 | .... | .... | ....
2 | 28 | .... | .... | ....
2 | 29 | .... | .... | ....
2 | 30 | .... | .... | ....
TableWithDataEnteredForItemType2
InventoryID| ItemID | Col1 | Col2 | Col3
----------------------------------------
4 | 101 | .... | .... | ....
I attempted this. I know it is not functional but it illustrates what I'm trying to do and I personally haven't seen anything written up like this before:
SELECT CASE WHEN (I.ItemType = 1) THEN SELECT TONE.ItemID FROM
TableWithDataEnteredForItemType1 TONE WHEN (I.ItemType = 2)
THEN SELECT TTWO.ItemID FROM TableWithDataEnteredForItemType2
TTWO END AS ItemMissing Inventory I JOIN CASE WHEN (I.ItemType = 1) THEN
TableWithDataEnteredForItem1 T WHEN (I.ItemType = 2) THEN
TableWithDataEnteredForItem2 T END ON
I.ID = T.InventoryID WHERE ItemMissing NOT BETWEEN IN (SELECT
I.lowrange FROM Inventory WHERE I.lowrange IS NOT NULL) AND IN
(SELECT I.highrange FROM Inventory WHERE I.highrange IS NOT NULL)
AND ItemMissing NOT IN (SELECT Item from MissingOrVoid)
The result should be:
ItemMissing
----
15
22
23
25
26
27
105
I know I'm probably not even going in the right direction with my query, but I was hoping I could get some direction as to fixing it to get the results that are needed.
Thanks
Edit:
Specific requirements (thought I included this but appears I didn't) - return list of all items not accounted for in the system. There are two ways of something being accounted for: 1) an entry in the corresponding ItemType table 2) located in MissingOrVoid (known items to be missing or removed).
DDL (I had to look this up as I wasn't sure what was meant here. Being that I have very little experience creating tables by scripting, this will probably be pseudo-DDL):
Inventory
ID - int, identifier/primary key, not nullable
lowrange - int, nullable
highrange - int, nullable
itemtype - int, not nullable
MissingOrVoid
ID - int, foreign key for Inventory.ID, not nullable
Item - int, identifier/primary key, not nullable
missing - bit, not nullable
void - bit, not nullable
Tables for Item types:
InventoryID - int, foreign key for Inventory.ID, not nullable
ItemID - int, primary key, not nullable
Everything else - not needed for querying, just data about the item (included
to show that the tables aren't the same content)
Edit 2: Here's an incredibly inefficient C# and Linq way of doing it but maybe of some help:
List<int> Items = new List<int>();
List<int> MoV = (from c in db.MissingOrVoid Select c.Item).ToList();
foreach (Table...ItemType1 row in db.Table...ItemType1)
Items.Add(row.ItemID);
foreach (Table...ItemType2 row in db.Table...ItemType2)
Items.Add(row.ItemID);
List<Range> InventoryRanges = new List<Range>();
foreach (Inventory row in db.Inventories)
{
if (row.lowrange != null && row.highrange != null)
InventoryRanges.Add(new Range(row.lowrange, row.highrange));
}
foreach (int item in Items)
{
foreach (Range range in InventoryRanges)
{
if (range.lowrange <= item && range.highrange >= item)
Items.Remove(item);
}
if (MoV.Contains(item))
Items.Remove(item);
}
return Items;
There's a ready-made number table called master..spt_values, which can be quite helpful in this case. Note, though, that you can use this table only if the distance between lowrange and highrange cannot exceed 2047; otherwise create, populate and use your own number table instead.
Here's the method:
SELECT
ItemMissing = inv.Item
FROM (
SELECT
i.ID,
Item = i.lowrange + v.number,
i.ItemType
FROM Inventory i
INNER JOIN master..spt_values v
ON v.type = 'P' AND v.number BETWEEN 0 AND i.highrange - i.lowrange
) inv
LEFT JOIN MissingOrVoid m
ON inv.ID = m.ID AND inv.Item = m.Item
LEFT JOIN TableWithDataEnteredForItemType1 t1 ON inv.ItemType = 1
AND inv.ID = t1.InventoryID AND inv.Item = t1.ItemID
LEFT JOIN TableWithDataEnteredForItemType2 t2 ON inv.ItemType = 2
AND inv.ID = t2.InventoryID AND inv.Item = t2.ItemID
WHERE m.ID IS NULL AND t1.InventoryID IS NULL AND t2.InventoryID IS NULL
The subselect expands the Inventory table into a complete item list with item IDs as defined by lowrange and highrange (this is where the number table comes in handy). The obtained list is then compared against the other three tables to find and exclude those items that are present in them. The remaining items, then, constitute the list of 'items missing'.
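The same expand-then-exclude idea can be tried on the mock data from the question. The sketch below uses SQLite through Python's sqlite3 module instead of SQL Server (an assumption made so it runs locally), with a self-populated numbers table standing in for master..spt_values; the names build_db and find_unaccounted_items are hypothetical.

```python
import sqlite3

QUERY = """
SELECT inv.Item
FROM (
    -- Expand every range into one row per item number.
    SELECT i.ID, i.lowrange + n.number AS Item, i.ItemType
    FROM Inventory i
    JOIN numbers n ON n.number BETWEEN 0 AND i.highrange - i.lowrange
) inv
LEFT JOIN MissingOrVoid m ON inv.ID = m.ID AND inv.Item = m.Item
LEFT JOIN TableWithDataEnteredForItemType1 t1
       ON inv.ItemType = 1 AND inv.ID = t1.InventoryID AND inv.Item = t1.ItemID
LEFT JOIN TableWithDataEnteredForItemType2 t2
       ON inv.ItemType = 2 AND inv.ID = t2.InventoryID AND inv.Item = t2.ItemID
WHERE m.ID IS NULL AND t1.InventoryID IS NULL AND t2.InventoryID IS NULL
ORDER BY inv.Item
"""


def build_db():
    """Load the mock tables (detail columns omitted) and a numbers table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Inventory (ID INTEGER PRIMARY KEY, lowrange INTEGER,
                            highrange INTEGER, ItemType INTEGER NOT NULL);
    CREATE TABLE MissingOrVoid (ID INTEGER, Item INTEGER PRIMARY KEY,
                                Missing INTEGER, Void INTEGER);
    CREATE TABLE TableWithDataEnteredForItemType1 (
        InventoryID INTEGER, ItemID INTEGER PRIMARY KEY);
    CREATE TABLE TableWithDataEnteredForItemType2 (
        InventoryID INTEGER, ItemID INTEGER PRIMARY KEY);
    CREATE TABLE numbers (number INTEGER PRIMARY KEY);
    INSERT INTO Inventory VALUES
        (1, 15, 20, 1), (2, 21, 30, 1), (3, NULL, NULL, 1), (4, 100, 105, 2);
    INSERT INTO MissingOrVoid VALUES
        (1, 17, 1, 0), (1, 19, 1, 0), (4, 102, 0, 1),
        (4, 103, 1, 0), (4, 104, 1, 0);
    INSERT INTO TableWithDataEnteredForItemType1 VALUES
        (1, 16), (1, 18), (1, 20), (2, 21), (2, 24), (2, 28), (2, 29), (2, 30);
    INSERT INTO TableWithDataEnteredForItemType2 VALUES (4, 101);
    """)
    # Stand-in for master..spt_values: the integers 0..2047.
    conn.executemany(
        "INSERT INTO numbers VALUES (?)", [(n,) for n in range(2048)]
    )
    return conn


def find_unaccounted_items(conn):
    """Return item numbers inside a range that appear in no other table."""
    return [row[0] for row in conn.execute(QUERY)]
```

On this mock data the list also contains item 100, since no row in the mockup accounts for it. The inventory row with NULL ranges contributes nothing because the BETWEEN condition evaluates to NULL for it.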