I have a table in Cassandra with two columns: name of type text and item of type set<text>.
For example, I have these entries:
name | item
a | {item1, item3}
b | {item2, item3}
c | {item1, item2}
Now my question: Is there any way to get all names having item1?
I tried this, but it didn't work:
SELECT name
FROM table
WHERE item = 'item1';
I get an error that 'item1' is a string, but item is a set<text>.
I guess there is a way to do this, but I can't think of how.
Thanks in advance.
Unfortunately, this is not yet supported in Cassandra. Maybe in some upcoming version we will be able to index collection items.
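For later readers: collection indexing did arrive in Cassandra 2.1. On a recent enough version, a sketch like this should work (table and index names assumed here):

```sql
-- Requires Cassandra 2.1+: create a secondary index on the set column
CREATE INDEX item_idx ON mytable (item);

-- CONTAINS matches rows whose set includes the given element
SELECT name FROM mytable WHERE item CONTAINS 'item1';
```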
I started learning ksqldb for Kafka and ran into a problem, I have a Product stream and the structure is the following:
ID | VARCHAR(STRING)
NAME | VARCHAR(STRING)
...
INGREDIENTS | ARRAY<STRUCT<NAME VARCHAR(STRING), ISALLERGEN BOOLEAN, RELATIVEAMOUNT DOUBLE>>
And as a result, I want to create a table and get the last state of the product using 'LATEST_BY_OFFSET' function, but the problem is that I can't apply this function to Array or Struct.
Should I EXPLODE each property and create a separate table entry for each ingredient? (That seems very strange to me.) How can I deal with this situation? Do you have any ideas?
This is a known issue. It is documented here: https://github.com/confluentinc/ksql/issues/5437 and I happen to be working on it presently. Hopefully a solution will be available in an upcoming version!
I'm not sure how your workaround would work, given that after you explode the array there could be several instances of each component of the struct per key...
The problem was solved in release 0.25.1:
https://github.com/confluentinc/ksql/pull/8878
https://github.com/confluentinc/ksql/commit/adac45855dce1c413073e5cbeb474cb022013a2b
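With 0.25.1 or later, LATEST_BY_OFFSET should accept the complex column directly. A minimal sketch, assuming the PRODUCT stream from the question is keyed by ID:

```sql
-- Requires ksqlDB 0.25.1+ (LATEST_BY_OFFSET over ARRAY/STRUCT columns)
CREATE TABLE PRODUCT_LATEST AS
  SELECT ID,
         LATEST_BY_OFFSET(NAME)        AS NAME,
         LATEST_BY_OFFSET(INGREDIENTS) AS INGREDIENTS
  FROM PRODUCT
  GROUP BY ID
  EMIT CHANGES;
```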
I am using Postgres 9.5. If I update certain values of a row and commit, is there any way to fetch the old values afterwards? I am wondering whether there is something like a flashback - but it would be a selective flashback. I don't want to roll back the entire database; I just need to revert one row.
Short answer: it is not possible.
But for future readers, you can create an array field with historical data that will look something like this:
    Column     |   Type
---------------+-----------
 value         | integer
 value_history | integer[]
For more info, read the docs about arrays.
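A minimal sketch of that pattern, assuming a table named t: append the current value to the history array in the same UPDATE that overwrites it.

```sql
-- Keep the old value by pushing it onto value_history before overwriting
UPDATE t
SET value_history = array_append(value_history, value),
    value         = 42
WHERE id = 1;
```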
Currently I have a table schema that looks like this:
| id | visitor_ids | name |
|----|-------------|----------------|
| 1 | {abc,def} | Chris Houghton |
| 2 | {ghi} | Matt Quinn |
The visitor_ids are all GUIDs, I've just shortened them for simplicity.
A user can have multiple visitor ids, hence the array type.
I have a GIN index created on the visitor_ids field.
I want to be able to lookup users by a visitor id. Currently we're doing this:
SELECT *
FROM users
WHERE visitor_ids && array['abc'];
The above works, but it's really, really slow at scale - it takes around 45ms, which is ~700x slower than a lookup by the primary key (even with the GIN index).
Surely there's got to be a more efficient way of doing this? I've looked around and wasn't able to find anything.
Possible solutions I can think of could be:
The current query is just bad and needs improving
Using a separate user_visitor_ids table
Something smart with special indexes
Help appreciated :)
I tried the second solution - 700x faster. Bingo.
I still feel like this is an unsolved problem, though - what's the point of adding arrays to Postgres when the performance is so bad, even with indexes?
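For reference, the separate-table approach can look something like this (table and column names assumed). Each visitor id becomes one row, so the lookup is a plain B-tree primary-key probe instead of a GIN scan:

```sql
-- One row per visitor id; the primary key gives a fast exact lookup
CREATE TABLE user_visitor_ids (
    visitor_id text PRIMARY KEY,
    user_id    integer NOT NULL REFERENCES users (id)
);

-- Lookup by visitor id is now an index scan on the primary key
SELECT u.*
FROM users u
JOIN user_visitor_ids v ON v.user_id = u.id
WHERE v.visitor_id = 'abc';
```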
Let's say I have the following table(only bigger):
key | type
----------------
uuid1 | blue
uuid2 | red
uuid3 | blue
What I want to be able to do is change everything that is blue to green. How would I do this without specifying all the UUIDs with the CLI or CQL?
You have a couple of choices:
You can put a secondary index on the "type" column, then query for all items equal to "blue". That gives you their keys, and you can then do a batch mutation to set all the values to "green".
You can use the Hadoop integration to read in all the columns, then output the updated data in your reducer. Pig would be a good choice for this type of work.
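A rough sketch of the first option in CQL, with table and column names assumed:

```sql
-- Index the type column so it can appear in a WHERE clause
CREATE INDEX type_idx ON items (type);

-- Fetch the keys of all blue rows
SELECT key FROM items WHERE type = 'blue';

-- Then rewrite the returned keys in a single batch
BEGIN BATCH
  UPDATE items SET type = 'green' WHERE key = uuid1;
  UPDATE items SET type = 'green' WHERE key = uuid3;
APPLY BATCH;
```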
I'm trying to query multiple tables at once. Say I have a table named PRESCHOOLERS and another one called FAVORITE_GOOEY_TREATS, with a foreign key column in the PRESCHOOLERS table referencing the id field of FAVORITE_GOOEY_TREATS. What would I do if I wanted to get a list of preschoolers with their first names alongside their favorite treats? I mean something like:
first_name | treat
john | fudge
sally | ice-cream
Here's what I'm trying, but I get a syntax error on the WHERE part:
SELECT PRESCHOOLERS.first_name, FAVORITE_GOOEY_TREATS.name as treat
FROM PRESCHOOLERS, FAVORITE_GOOEY_TREATS
WHERE PRESCHOOLERS.favorite_treat = FAVORITE_GOOEY_TREATS.id and PRESCHOOLERS.age>15;
As far as I know this kind of thing is fine by the SQL standard, but sqlite3 doesn't much like it. Can someone point me to some examples of similar queries that work?
Try
SELECT PRESCHOOLERS.first_name, FAVORITE_GOOEY_TREATS.name as treat
FROM PRESCHOOLERS
JOIN FAVORITE_GOOEY_TREATS ON PRESCHOOLERS.favorite_treat = FAVORITE_GOOEY_TREATS.id
WHERE PRESCHOOLERS.age > 15;