I'm trying to setup and run Cassandra 3.10 in my local docker (https://hub.docker.com/_/cassandra/). Everything goes well until I try to select from one table.
This is the error I get everytime I run select whatever from whatever:
'Row' object has no attribute 'values'
The steps that I followed:
I created a new keyspace using the default superuser: cassandra. create keyspace test with replication = {'class':'SimpleStrategy','replication_factor' : 2}; and USE test;
I created a new table: create table usertable (userid int primary key, usergivenname varchar, userfamilyname varchar, userprofession varchar);
Insert some data: insert into usertable (userid, usergivenname, userfamilyname, userprofession) values (1, 'Oliver', 'Veits', 'Freelancer');
Try to select: select * from usertable where userid = 1;
I got this steps from: https://oliverveits.wordpress.com/2016/12/08/cassandra-hello-world-example/ just to copy & paste some working code (I was getting mad with the syntax and typos)
This are the logs of my docker image:
INFO [Native-Transport-Requests-1] 2017-04-23 19:09:12,543 MigrationManager.java:303 - Create new Keyspace: KeyspaceMetadata{name=test2, params=KeyspaceParams{durable_writes=true, replication=ReplicationParams{class=org.apache.cassandra.locator.SimpleStrategy, replication_factor=2}}, tables=[], views=[], functions=[], types=[]}
INFO [Native-Transport-Requests-1] 2017-04-23 19:09:41,415 MigrationManager.java:343 - Create new table: org.apache.cassandra.config.CFMetaData#1b484e82[cfId=6757f460-2858-11e7-9787-6d2c86545d91,ksName=test2,cfName=usertable,flags=[COMPOUND],params=TableParams{comment=, read_repair_chance=0.0, dclocal_read_repair_chance=0.1, bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=864000, default_time_to_live=0, memtable_flush_period_in_ms=0, min_index_interval=128, max_index_interval=2048, speculative_retry=99PERCENTILE, caching={'keys' : 'ALL', 'rows_per_partition' : 'NONE'}, compaction=CompactionParams{class=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, options={min_threshold=4, max_threshold=32}}, compression=org.apache.cassandra.schema.CompressionParams#4bce743f, extensions={}, cdc=false},comparator=comparator(),partitionColumns=[[] | [userfamilyname usergivenname userprofession]],partitionKeyColumns=[userid],clusteringColumns=[],keyValidator=org.apache.cassandra.db.marshal.Int32Type,columnMetadata=[usergivenname, userprofession, userid, userfamilyname],droppedColumns={},triggers=[],indexes=[]]
INFO [MigrationStage:1] 2017-04-23 19:09:41,484 ColumnFamilyStore.java:406 - Initializing test2.usertable
INFO [IndexSummaryManager:1] 2017-04-23 19:13:25,214 IndexSummaryRedistribution.java:75 - Redistributing index summaries
Thanks a lot!
UPDATE
I created another table with a uuid column like this: "uid uuid primary key". It works when the table is empty but after one insert, I get the same error
I had the same problem, I was using cqlsh. You just need to reload. Probably, it is a cqlsh bug the first time or after create a schema.
~$ sudo docker run --name ex1 -d cassandra:latest
9c1092938d29ec7f94bee773cc2adc0a23ff09344e32500bfeb1898466610d06
~$ sudo docker exec -ti ex1 /bin/bash
root#9c1092938d29:/# cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> CREATE SCHEMA simple1
... WITH replication={
... 'class': 'SimpleStrategy',
... 'replication_factor':1};
cqlsh> use simple1;
cqlsh:simple1> create table users(id varchar primary key, first_name varchar, last_name varchar);
cqlsh:simple1> select * from users;
id | first_name | last_name
----+------------+-----------
(0 rows)
cqlsh:simple1> insert into users(id, first_name, last_name) values ('U100001', 'James', 'Jones');
cqlsh:simple1> select * from users;
'Row' object has no attribute 'values'
cqlsh:simple1> quit
root#9c1092938d29:/# cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> use simple1;
cqlsh:simple1> select * from users;
id | first_name | last_name
---------+------------+-----------
U100001 | James | Jones
(1 rows)
Related
I am trying to copy a JSON data from Kafka to vertica. I am using the following query
COPY public.from_kafka
SOURCE KafkaSource(stream='example_data|0|-2, example_data|1|-2',
brokers='kafka01.example.com:9092',
duration=interval '10000 milliseconds') PARSER KafkaJSONParser()
REJECTED DATA AS TABLE public.rejections;
each message in the topic looks like that:
{"location_id":30277, "start_date":1667911800000}
when I am running the query, no new rows are created. when I am checking the rejections table I see the following rejected_reason:
Missing or null value for column with NOT NULL constraint [start_date]
however the rejected_data is {"location_id":30277, "start_date":1667911800000}
why does Vertica not recognize the start_date field and how can I solve it?
vertica table:
CREATE TABLE public.from_kafka
(
location_id int NOT NULL,
start_date timestamp NOT NULL
)
CREATE PROJECTION public.from_kafka /*+createtype(L)*/
(
location_id ENCODING RLE,
start_date ENCODING GCDDELTA
)
AS
SELECT from_kafka.location_id,
from_kafka.start_date,
FROM public.from_kafka
ORDER BY from_kafka.start_date,
from_kafka.location_id
SEGMENTED BY hash(from_kafka.location_id, from_kafka.start_date) ALL NODES KSAFE 1;
EDIT - SOLUTION
PARSER KafkaJSONParser() does not know how to convert long into timestamp, due to that I had to convert the JSON message with java, insert the updated JSON to a new topic and then use KafkaJSONParser() function
A timestamp, in any SQL database, is a timestamp, not an integer.
To load your JSON format and have a timestamp, redefine your table to receive an integer and convert it to a timestamp on the fly.
I do it from file, here, but it will work with a Kafka stream, too.
-- create your table like so:
DROP TABLE IF EXISTS public.from_kafka;
CREATE TABLE public.from_kafka
(
location_id int NOT NULL,
start_date int NOT NULL,
start_date_ts timestamp DEFAULT TO_TIMESTAMP(start_date//1000)
);
This is the JSON file I use:
$ cat kafka.json
{"location_id":30277, "start_date":1667911800000},
{"location_id":30278, "start_date":1667911900000},
{"location_id":30279, "start_date":1667912000000},
{"location_id":30280, "start_date":1667912100000},
{"location_id":30281, "start_date":1667912200000},
{"location_id":30282, "start_date":1667912300000}
And this is the copy command I use:
COPY public.from_kafka (
location_id
, start_date
)
FROM LOCAL 'kafka.json' PARSER FJsonParser(record_terminator=E'\n')
EXCEPTIONS 'kafka.log';
And this, finally, is what from_kafka will contain:
SELECT * FROM public.from_kafka;
-- out location_id | start_date | start_date_ts
-- out -------------+------------+---------------------
-- out 30277 | 1667911800 | 2022-11-08 13:50:00
-- out 30278 | 1667911900 | 2022-11-08 13:51:40
-- out 30279 | 1667912000 | 2022-11-08 13:53:20
-- out 30280 | 1667912100 | 2022-11-08 13:55:00
-- out 30281 | 1667912200 | 2022-11-08 13:56:40
-- out 30282 | 1667912300 | 2022-11-08 13:58:20
I'm following the example Class::DBI.
I create the cd table like that in my MariaDB database:
CREATE TABLE cd (
cdid INTEGER PRIMARY KEY,
artist INTEGER, # references 'artist'
title VARCHAR(255),
year CHAR(4)
);
The primary key cdid is not set to auto-incremental. I want to use a sequence in MariaDB. So, I configured the sequence:
mysql> CREATE SEQUENCE cd_seq START WITH 100 INCREMENT BY 10;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT NEXTVAL(cd_seq);
+-----------------+
| NEXTVAL(cd_seq) |
+-----------------+
| 100 |
+-----------------+
1 row in set (0.00 sec)
And set-up the Music::CD class to use it:
Music::CD->columns(Primary => qw/cdid/);
Music::CD->sequence('cd_seq');
Music::CD->columns(Others => qw/artist title year/);
After that, I try this inserts:
# NORMAL INSERT
my $cd = Music::CD->insert({
cdid => 4,
artist => 2,
title => 'October',
year => 1980,
});
# SEQUENCE INSERT
my $cd = Music::CD->insert({
artist => 2,
title => 'October',
year => 1980,
});
The "normal insert" succeed, but the "sequence insert" give me this error:
DBD::mysql::st execute failed: You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB server version for the right syntax to use near ''cd_seq')' at line
1 [for Statement "SELECT NEXTVAL ('cd_seq')
"] at /usr/local/share/perl5/site_perl/DBIx/ContextualFetch.pm line 52.
I think the quotation marks ('') are provoking the error, because when I put the command "SELECT NEXTVAL (cd_seq)" (without quotations) in mysql client it works (see above). I proved all combinations (', ", `, no quotation), but still...
Any idea?
My versions: perl 5.30.3, 10.5.4-MariaDB
The documentation for sequence() says this:
If you are using a database with AUTO_INCREMENT (e.g. MySQL) then you do not need this, and any call to insert() without a primary key specified will fill this in automagically.
MariaDB is based on MySQL. Therefore you do not need the call to sequence(). Use the AUTO_INCREMENT keyword in your table definition instead.
I recently started porting a SQLite database over to PostGreSQL for a Flask site built with SQLAlchemy. I have my schemas in PGSQL and even inserted the data into the database. However, I am unable to run my usual INSERT commands to add information to the database. Normally, I insert new records using SQL Alchemy by leaving the ID column to be NULL and then just setting the other columns. However, that results in the following error:
sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) null value in column "id" violates not-null constraint
DETAIL: Failing row contains (null, 2017-07-24 20:40:37.787393+00, 2017-07-24 20:40:37.787393+00, episode_length_list = [52, 51, 49, 50, 83]
sum_length = 0
for ..., 0, f, 101, 1, 0, 0, , null).
[SQL: 'INSERT INTO submission (date_created, date_modified, code, status, correct, assignment_id, course_id, user_id, assignment_version, version, url) VALUES (CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, %(code)s, %(status)s, %(correct)s, %(assignment_id)s, %(course_id)s, %(user_id)s, %(assignment_version)s, %(version)s, %(url)s) RETURNING submission.id'] [parameters: {'code': 'episode_length_list = [52, 51, 49, 50, 83]\n\nsum_length = 0\n\nfor episode_length in episode_length_list:\n pass\n\nsum_length = sum_length + episode_length\n\nprint(sum_length)\n', 'status': 0, 'correct': False, 'assignment_id': 101, 'course_id': None, 'user_id': 1, 'assignment_version': 0, 'version': 0, 'url': ''}]
Here is my SQL Alchemy table declarations:
class Base(Model):
__abstract__ = True
#declared_attr
def __tablename__(cls):
return cls.__name__.lower()
def __repr__(self):
return str(self)
id = Column(Integer(), primary_key=True)
date_created = Column(DateTime, default=func.current_timestamp())
date_modified = Column(DateTime, default=func.current_timestamp(),
onupdate=func.current_timestamp())
class Submission(Base):
code = Column(Text(), default="")
status = Column(Integer(), default=0)
correct = Column(Boolean(), default=False)
assignment_id = Column(Integer(), ForeignKey('assignment.id'))
course_id = Column(Integer(), ForeignKey('course.id'))
user_id = Column(Integer(), ForeignKey('user.id'))
assignment_version = Column(Integer(), default=0)
version = Column(Integer(), default=0)
url = Column(Text(), default="")
I created the schema by calling db.create_all() in a script.
Checking the PostGreSQL side, we can see the constructed table:
Table "public.submission"
Column | Type | Modifiers | Storage | Stats target | Description
--------------------+--------------------------+-----------+----------+--------------+-------------
id | bigint | not null | plain | |
date_created | timestamp with time zone | | plain | |
date_modified | timestamp with time zone | | plain | |
code | text | | extended | |
status | bigint | | plain | |
correct | boolean | | plain | |
assignment_id | bigint | | plain | |
user_id | bigint | | plain | |
assignment_version | bigint | | plain | |
version | bigint | | plain | |
url | text | | extended | |
course_id | bigint | | plain | |
Indexes:
"idx_16881_submission_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"submission_course_id_fkey" FOREIGN KEY (course_id) REFERENCES course(id)
"submission_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "user"(id)
Has OIDs: no
I'm still new to this, but shouldn't there be a sequence?
Any insight or suggestions on what to look for next would be super appreciated.
It is standard SQL that a PRIMARY KEY is UNIQUE and NOT NULL. PostgreSQL enforces the standard, and does not allow you to have any (not a single one) NULL on the table. Other databases allow you to have one NULL, therefore, the different behaviour.
PostgreSQL current documentation on Primary Keys clearly states it:
5.3.4. Primary Keys
A primary key constraint indicates that a column, or group of columns, can be used as a unique identifier for rows in the table. This requires that the values be both unique and not null.
If you want your PRIMARY KEY to be a synthetic (i.e.: not natural) sequence number, you should define it with type BIGSERIAL instead of BIGINT. I don't know the details on how this is achieved using SQLAlchemy, but look at the references.
When you then INSERT into your table, the id should NOT be in the INSERT column list (it should not be set to null, just not be there). I.e.:
This will generate a new id:
INSERT INTO public.submission (code) VALUES ('Some code') ;
will work.
This won't:
INSERT INTO public.submission (id, code) VALUES (NULL, 'Some code') ;
I guess SQLAlchemy should be smart enough to generate the proper SQL INSERT statements, once properly configured.
Reference:
Why isn't SQLAlchemy creating serial columns?
Ultimately, I discovered what went wrong, and it was definitely my fault. The process I used to load the old data into the database (pgloader) was doing more than just loading data - it was somehow overwriting parts of the table definitions! I was able to pg_dump the data out, reset the tables, and then load it back in - everything works as expected. Thanks for sanity checks!
I need to migrate data from MySQL to DB2. Both DBs are up and running.
I tried to mysqldump with --no-create-info --extended-insert=FALSE --complete-insert and with a few changes on the output (e.g. change ` to "), I get to a satisfactory result but sometimes I have weird exceptions, like
does not have an
ending string delimiter. SQLSTATE=42603
Ideally I would want to have a routine that is as general as possible, but as an example here, let's say I have a DB2 table that looks like:
db2 => describe table "mytable"
Data type Column
Column name schema Data type name Length Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
id SYSIBM BIGINT 8 0 No
name SYSIBM VARCHAR 512 0 No
2 record(s) selected.
Its MySQL counterpart being
mysql> describe mytable;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| name | varchar(512) | NO | | NULL | |
+-------+--------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)
Let's assume the DB2 and MySQL databases are called mydb.
Now, if I do
mysqldump -uroot mydb mytable --no-create-info --extended-insert=FALSE --complete-insert | # mysldump, with options (see below): # do not output table create statement # one insert statement per record# ouput table column names
sed -n -e '/^INSERT/p' | # only keep lines beginning with "INSERT"
sed 's/`/"/g' | # replace ` with "
sed 's/;$//g' | # remove `;` at end of insert query
sed "s/\\\'/''/g" # replace `\'` with `''` , see http://stackoverflow.com/questions/2442205/how-does-one-escape-an-apostrophe-in-db2-sql and http://stackoverflow.com/questions/2369314/why-does-sed-require-3-backslashes-for-a-regular-backslash
, I get:
INSERT INTO "mytable" ("id", "name") VALUES (1,'record 1')
INSERT INTO "mytable" ("id", "name") VALUES (2,'record 2')
INSERT INTO "mytable" ("id", "name") VALUES (3,'record 3')
INSERT INTO "mytable" ("id", "name") VALUES (4,'record 4')
INSERT INTO "mytable" ("id", "name") VALUES (5,'" "" '' '''' \"\" ')
This ouput can be used as a DB2 query and it works well.
Any idea on how to solve this more efficiently/generally? Any other suggestions?
After having played around a bit I came with the following routine which I believe to be fairly general, robust and scalable.
1 Run the following command:
mysqldump -uroot mydb mytable --no-create-info --extended-insert=FALSE --complete-insert | # mysldump, with options (see below): # do not output table create statement # one insert statement per record# ouput table column names
sed -n -e '/^INSERT/p' | # only keep lines beginning with "INSERT"
sed 's/`/"/g' | # replace ` with "
sed -e 's/\\"/"/g' | # replace `\"` with `#` (mysql escapes double quotes)
sed "s/\\\'/''/g" > out.sql # replace `\'` with `''` , see http://stackoverflow.com/questions/2442205/how-does-one-escape-an-apostrophe-in-db2-sql and http://stackoverflow.com/questions/2369314/why-does-sed-require-3-backslashes-for-a-regular-backslash
Note: here unlike in the question ; are not being removed.
2 upload the file to DB2 server
scp out.sql user#myserver:out.sql
3 run queries from the file
db2 -tvsf /path/to/query/file/out.sql
I am trying to query a postgresql (v 9.3.6) table with a tstzrange to determine if a given timestamp exists within the table defined as
CREATE TABLE sensor(
id serial,
hostname varchar(64) NOT NULL,
ip varchar(15) NOT NULL,
period tstzrange NOT NULL,
PRIMARY KEY(id),
EXCLUDE USING gist (hostname WITH =, period with &&)
);
I am using psycopg2 and when I try the query:
sql = "SELECT id FROM sensor WHERE %s <# period;"
cursor.execute(sql,(isotimestamp,))
I get the error
psycopg2.DataError: malformed range literal:
...
DETAIL: Missing left parenthesis or bracket.
I've tried various type castings to no avail.
I've managed a workaround using the following query:
sql = "SELECT * FROM sensor WHERE %s BETWEEN lower(period) AND upper(period);"
but would like to know why I am having problem with the range operators. Is it my code or psycopg2 or what?
Any help is appreciated.
EDIT 1:
In response to the comments, I have attempted the same query on a simple 1-row table in postgresql like below
=> select * from sensor;
session_id | hostname | ip | period
------------+----------+-----------+-------------------------------------------------------------------
1 | bob | 127.0.0.1 | ["2015-02-08 19:26:42.032637+00","2015-02-08 19:27:28.562341+00")
(1 row)
Now by using the "#>" operator I get the following error:
=> select * from sensor where period #> '2015-02-08 19:26:43.04+00';
ERROR: malformed range literal: "2015-02-08 19:26:43.04+00"
LINE 1: select * from sensor where period #> '2015-02-08 19:26:42.03...
Which appears to be the same as the psycopg2 error, a malformed range literal, so I thought I would try typecasting to timestamp as below
=> select * from sensor where sensor.period #> '2015-02-08 19:26:42.032637+00'::timestamptz;
session_id | hostname | ip | period
------------+----------+-----------+-------------------------------------------------------------------
1 | feral | 127.0.0.1 | ["2015-02-08 19:26:42.032637+00","2015-02-08 19:27:28.562341+00")
So it appears that it is my mistake, the literal has to be typecast or it is assumed to be a range. Using psycopg2, the query can be executed with:
sql="select * from sensor where period #> %s::timestamptz"