Query not working with ENUM field in Oracle NoSQL Database

I encountered issues executing a SELECT query with a WHERE clause on an ENUM field.
Here's a sample query that is not working:
kv-> execute "select * from Table1_TBL where col1 < 100 and col1 >10 and Table1Summaries.values($value.col2 = 'VAL1')"

In general, enum columns behave like strings in comparisons. So if "col2" is a column of table "Table1_TBL" that is declared as an enum, the query should be the following:
select * from Table1_TBL where col1 < 100 and col1 > 10 and col2 = 'VAL1'
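For reference, a minimal sketch of what such a table declaration could look like in Oracle NoSQL DDL (the id column and the enum values are assumptions for illustration):

CREATE TABLE Table1_TBL (
    id INTEGER,
    col1 INTEGER,
    col2 ENUM(VAL1, VAL2, VAL3),
    PRIMARY KEY (id)
)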

Related

DB2 - Concat all values in a column into a Single string

Let's say I have a table like this:
test
Col1 | Col2
-----+-----
A    |  1
B    |  1
C    |  1
D    |  2
I am running the query select col1 from test where col2 = 1;
This will return the values A, B and C in three separate rows.
I want the SQL to return a single row with the value A|B|C. Is this possible? If so, how should I do it?
You can use the LISTAGG function, with '|' as the separator to match the desired output:
SELECT LISTAGG(col1, '|') FROM test WHERE col2 = 1
If LISTAGG is not available, it can be reproduced with XMLAGG (note that XMLSERIALIZE needs a target type in DB2):
SELECT SUBSTR(XMLSERIALIZE(XMLAGG(XMLTEXT('|' || col1)) AS VARCHAR(1000)), 2)
FROM test
WHERE col2 = 1
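If the order of the concatenated values matters, DB2's LISTAGG also accepts a WITHIN GROUP clause; a sketch, assuming DB2 9.7 or later:
SELECT LISTAGG(col1, '|') WITHIN GROUP (ORDER BY col1)
FROM test
WHERE col2 = 1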

How to update new empty column with data that depends on mathematical operation of different data types?

[beginner]
I have a table that looks like this:
colA colB
1 <null>
2 <null>
3 <null>
colB is the new empty column I added to the table. colA is varchar and colB is double precision (float).
I want to update colB with colA multiplied by 2.
The new table should look like this:
colA colB
1 2
2 4
3 6
When I go to update colB like so:
update tablename set colB = colA * 2
I get this error:
Invalid operation: Invalid input syntax for type numeric
I've tried to work around this with solutions like this:
update tablename set colB = COALESCE(colA::numeric::text,'') * 2
but get the same error.
In a select statement on the same table, this works on colA which is varchar:
select colA * 2 from tablename
How can I update a column using mathematical operations on reference columns of a different data type? I can't change the data type of colA.
I suppose that Laurenz Albe is correct and there are non-numeric values in colA.
So the UPDATE must be guarded:
UPDATE tablename
SET colB =
    CASE
        WHEN colA ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$' THEN colA::numeric * 2
    END;
-- or this way (leaves non-matching rows untouched instead of setting them to NULL)
UPDATE tablename
SET colB = colA::numeric * 2
WHERE colA ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$';
See the fiddle: https://www.db-fiddle.com/f/4wFynf9WiEuiE499XMcsCT/1
Recipes for an "isnumeric" check can be found here: isnumeric() with PostgreSQL
There is a value in the string column that is not a valid number. You will have to fix the data or exclude certain rows with a WHERE condition.
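A quick way to locate the offending rows, reusing the numeric pattern from the other answer (a sketch; tighten the pattern to whatever your data should contain):
SELECT colA FROM tablename WHERE colA !~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$';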
If you say that running the query from your client works, that leads me to suspect that your client doesn't actually execute the whole query, but slaps a LIMIT on it (some client tools do that).
The following query will have to process all rows and should fail:
SELECT colA * 2 AS double
FROM tablename
ORDER BY double;
If all the values in colA are in fact numeric, a plain cast is enough:
update tablename set colB = colA::numeric * 2

Multi-column index

I have 4 columns and I would like to match a record if any of the 4 columns match any of an array of values, something like this (syntax is not correct, but this is the idea):
SELECT * FROM y WHERE (col1,col2,col3,col4) IN (val1,val2,val3,val4)
Right now I'm using this syntax:
SELECT *
FROM y
WHERE col1 IN (val1,val2,val3,val4)
   OR col2 IN (val1,val2,val3,val4)
   OR col3 IN (val1,val2,val3,val4)
   OR col4 IN (val1,val2,val3,val4)
I have 4 individual indexes on each column, but I'm wondering if there's a better type of multi-column index I could use.
So, two questions:
Is there a better type of index than individual ones on each of col1, col2, col3 and col4?
What's the syntax in the WHERE clause?
One index I can recommend in such cases is a bloom filter.
For that, you'll need PostgreSQL 9.6 or later and the bloom extension:
CREATE EXTENSION bloom;
Then you can create a multicolumn index:
CREATE INDEX ON y USING bloom (col1, col2, col3, col4);
This index will support your query.
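The signature length and the bits generated per column can be tuned when the index is created; a sketch following the pattern in the PostgreSQL documentation (the numbers are illustrative, not a recommendation):
CREATE INDEX ON y USING bloom (col1, col2, col3, col4)
    WITH (length = 80, col1 = 2, col2 = 2, col3 = 2, col4 = 2);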
If the OR creates a performance problem, try using UNION; each branch can then use the single-column index on its column, and UNION (unlike UNION ALL) removes the duplicates that arise when a row matches on more than one column:
SELECT * FROM y WHERE col1 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col2 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col3 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col4 IN (val1,val2,val3,val4);
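When in doubt, compare the plans of the two formulations with EXPLAIN before settling on one (val1 through val4 are placeholders, as above):
EXPLAIN
SELECT * FROM y WHERE col1 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col2 IN (val1,val2,val3,val4);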

Optimal approach to bulk insert of pandas dataframe into PostgreSQL table

I need to upload multiple Excel files to a PostgreSQL table, but they can overlap each other in several records, so I need to watch out for IntegrityErrors. I'm following two approaches:
cursor.copy_from: the fastest approach, but I don't know how to catch and control all IntegrityErrors due to duplicate records
streamCSV = StringIO()
streamCSV.write(invoicing_info.to_csv(index=None, header=None, sep=';'))
streamCSV.seek(0)
with conn.cursor() as c:
    c.copy_from(streamCSV, "staging.table_name", columns=invoicing_info.columns, sep=';')
conn.commit()
cursor.execute: I can count and handle each exception, but it is very slow.
data = invoicing_info.to_dict(orient='records')
with conn.cursor() as c:
    for entry in data:
        try:
            c.execute(DLL_INSERT, entry)
            successful_inserts += 1
            conn.commit()
            print('Successful insert. Operation number {}'.format(successful_inserts))
        except psycopg2.IntegrityError:
            duplicate_registers += 1
            conn.rollback()
            print('Duplicate entry. Operation number {}'.format(duplicate_registers))
At the end of the routine, I need to determine the following info:
print("Initial shape: {}".format(invoicing_info.shape))
print("Successful inserts: {}".format(successful_inserts))
print("Duplicate entries: {}".format(duplicate_registers))
How can I modify the first approach to control all exceptions? How can I optimize the second approach?
Since you have duplicate IDs across different Excel sheets, you first have to decide for yourself which sheet's data you trust.
Since you are using multiple tables and are happy to keep at least one row from each conflicting pair, you can always do the following:
create a temporary table for each Excel sheet
upload the data for each sheet into its temporary table (in bulk, like you do now)
make an insert from a select picking DISTINCT ON (id), in this manner:
INSERT INTO staging.table_name(id, col1, col2 ...)
SELECT DISTINCT ON (id)
       id, col1, col2
FROM
(
    SELECT id, col1, col2 ...
    FROM staging.temp_table_for_excel_sheet1
    UNION
    SELECT id, col1, col2 ...
    FROM staging.temp_table_for_excel_sheet2
    UNION
    SELECT id, col1, col2 ...
    FROM staging.temp_table_for_excel_sheet3
) AS data
With such an insert, PostgreSQL will take an arbitrary row out of each set of rows sharing an id.
If you would rather trust the first record, you can add some ordering:
INSERT INTO staging.table_name(id, col1, col2 ...)
SELECT DISTINCT ON (id)
       id, col1, col2
FROM
(
    SELECT id, 1 AS ordering_column, col1, col2 ...
    FROM staging.temp_table_for_excel_sheet1
    UNION ALL
    SELECT id, 2 AS ordering_column, col1, col2 ...
    FROM staging.temp_table_for_excel_sheet2
    UNION ALL
    SELECT id, 3 AS ordering_column, col1, col2 ...
    FROM staging.temp_table_for_excel_sheet3
) AS data
ORDER BY id, ordering_column
(DISTINCT ON requires the ORDER BY to lead with id; ordering_column then decides which duplicate wins, so sheet 1 is preferred.)
For the initial count of rows (UNION ALL, because a plain UNION would collapse sheets that happen to have the same count):
SELECT sum(count)
FROM
(
    SELECT count(*) AS count FROM temp_table_for_excel_sheet1
    UNION ALL
    SELECT count(*) AS count FROM temp_table_for_excel_sheet2
    UNION ALL
    SELECT count(*) AS count FROM temp_table_for_excel_sheet3
) AS data
After finishing the bulk inserts you can run select count(*) FROM staging.table_name to get the total number of inserted records.
For the duplicate count (again UNION ALL, so equal counts are not collapsed) you can run:
SELECT sum(count)
FROM
(
    SELECT count(*) AS count
    FROM temp_table_for_excel_sheet2
    WHERE id IN (SELECT id FROM temp_table_for_excel_sheet1)
    UNION ALL
    SELECT count(*) AS count
    FROM temp_table_for_excel_sheet3
    WHERE id IN (SELECT id FROM temp_table_for_excel_sheet1)
    UNION ALL
    SELECT count(*) AS count
    FROM temp_table_for_excel_sheet3
    WHERE id IN (SELECT id FROM temp_table_for_excel_sheet2)
) AS data
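If the target table has a primary key or unique constraint on id, the same merge can also be expressed with ON CONFLICT (available since PostgreSQL 9.5); a sketch, assuming such a constraint exists:
INSERT INTO staging.table_name (id, col1, col2)
SELECT id, col1, col2
FROM staging.temp_table_for_excel_sheet1
ON CONFLICT (id) DO NOTHING;
This keeps copy_from for the fast bulk load into the temporary tables while the database silently skips duplicates, so no IntegrityError ever reaches the Python code.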
If the Excel sheets contain duplicate records, pandas seems a likely choice for identifying and eliminating dupes: https://33sticks.com/python-for-business-identifying-duplicate-data/. Or is the issue that different records in different sheets have the same id/index? If so, a similar approach could work: use pandas to isolate the ids that are used multiple times, then correct them with unique identifiers before attempting to upload to the SQL db.
For a bulk upload, I'd use an ORM. SQLAlchemy has some great info on bulk uploads: http://docs.sqlalchemy.org/en/rel_1_0/orm/persistence_techniques.html#bulk-operations, and there's a related discussion here: Bulk insert with SQLAlchemy ORM

Multiple WHERE clauses in report query (iReport/Jasper)

I want to generate a report chart, for which data from two tables is required. So I have a query like this (I am using OrientDB):
select col1,col2 from (select col11,col22 from t1 where col11 = $P{col11}) where col1 = $P{col1} and col2 = $P{col2}
When I run this report I get the following exception:
Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at com.orientechnologies.orient.core.sql.filter.OSQLPredicate.bindParameters(OSQLPredicate.java:366)
    at com.orientechnologies.orient.core.sql.OCommandExecutorSQLResultsetAbstract.assignTarget(OCommandExecutorSQLResultsetAbstract.java:182) 
    at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.assignTarget(OCommandExecutorSQLSelect.java:435) 
    at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.executeSearch(OCommandExecutorSQLSelect.java:417) 
    at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.execute(OCommandExecutorSQLSelect.java:388) 
    at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:64) 
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.executeCommand(OAbstractPaginatedStorage.java:1163) 
    ... 8 more
From my observation, if there is a single WHERE condition, i.e. either in the subquery or in the outer query, it works; if both have a WHERE clause, this exception is thrown.
I don't know if it solves your problem, but using aliases could maybe help. With proper SQL aliases on the columns and the derived table (note the alias t, since table is a reserved word):
select col1, col2
from (select col11 as col1, col22 as col2
      from t1
      where col11 = $P{col11}) as t
where col1 = $P{col1} and col2 = $P{col2}