SPLIT_PART() not returning results in Redshift - amazon-redshift

There are strings in a Redshift database that I'm trying to parse. The strings look like this in the object field:
yada \n foobar\n thisthing: xyz\nvegetable: amazing
The value I'm trying to get at is xyz.
I'm trying:
SELECT split_part(v.object::varchar,'\n',3) as first_parse
FROM table_name as v
I expect that to return thisthing: xyz, which I can then split again on ': '.
The Redshift documentation makes me think that's valid Redshift SQL:
http://docs.aws.amazon.com/redshift/latest/dg/SPLIT_PART.html
This answer on StackOverflow also makes me believe this is valid Redshift SQL:
https://stackoverflow.com/a/20811724/1807668
However, that query returns a blank first_parse field (an empty string, not NULL).
How should I go about getting to the xyz part of my sample string above using Redshift SQL? Any help would be appreciated.

Related

Postgres : Unable to extract data from a bytea column which stores json array data

I'm trying to extract data from a bytea column that stores JSON data in Postgres 11.9.
However, my code throws an error:
ERROR: invalid input syntax for type json
DETAIL: Token "" is invalid.
CONTEXT: JSON data, line 1: ...
Here is the sample data:
create table EMPLOYEE (PAYMENT bytea,NAME character varying);
insert into EMPLOYEE
values ('[{"totalCode":{"code":"EMPLOYER_TAXES"},"totalValue":{"amount":122.5,"currencyCode":"USD"}},{"totalCode":{"code":"OTHER_PAYMENTS"},"totalValue":{"amount":0.0,"currencyCode":"USD"}},{"totalCode":{"code":"GROSS_PAY"},"totalValue":{"amount":1000.0,"currencyCode":"USD"}},{"totalCode":{"code":"TOTAL_HOURS"},"totalValue":{"amount":40.0}}]'::bytea,'Tom')
;
Here is my query:
SELECT *
FROM EMPLOYEE left outer join lateral
jsonb_array_elements(PAYMENT::text::jsonb) element1 on true ;
Please help me access the data in this array. The data is always in JSON format.
I am restricted to using bytea for this column.
You are making your life unnecessarily hard by storing JSON values in a bytea column. Just because this is the recommended way in Oracle doesn't mean it is a good choice for Postgres.
The correct solution is to change that column to jsonb. You will need a DBMS-specific layer in your application anyway, as the actual functions and operators you use are very different.
Having said that, you can get away with this awful choice by using the convert_from() function:
select e.name, element1.*
from employee e
left join lateral jsonb_array_elements(convert_from(PAYMENT, 'UTF-8')::jsonb) element1 on true;
I also think you should change your INSERT statement to do an explicit conversion from text to bytea so that you can be sure the correct encoding is used:
insert into employee (payment, name)
values (convert_to('[{...}]', 'UTF-8'),'Tom');
But again: the only correct solution is to change that column to jsonb (or at least json).
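Why the original PAYMENT::text::jsonb cast fails can be sketched outside the database. With the default bytea_output = 'hex' setting, casting bytea to text yields the hex rendering (a \x prefix followed by hex digits), not the JSON text, whereas convert_from() decodes the bytes with the given encoding. The Python below is only a client-side analogy under that assumption; the shortened sample payload and the variable names are made up for illustration.

```python
import json

# Shortened, hypothetical version of the sample payload stored as bytes.
raw = b'[{"totalCode":{"code":"GROSS_PAY"},"totalValue":{"amount":1000.0,"currencyCode":"USD"}}]'

# Roughly what PAYMENT::text looks like when bytea_output = 'hex' (assumption: default setting).
hex_text = "\\x" + raw.hex()

try:
    json.loads(hex_text)
    hex_parses = True
except json.JSONDecodeError:
    # The hex rendering is not JSON, which mirrors the "invalid input syntax" error.
    hex_parses = False

# Analogue of convert_from(PAYMENT, 'UTF-8')::jsonb: decode the bytes first, then parse.
decoded = json.loads(raw.decode("utf-8"))
codes = [element["totalCode"]["code"] for element in decoded]
print(hex_parses, codes)
```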

PostgreSQL, allow to filter by not existing fields

I'm using PostgreSQL with a Go driver. Sometimes I need to query fields that may not exist, just to check whether something is in the DB. Before querying, I can't tell whether a given field exists. Example:
where size=10 or length=10
By default I get the error column "length" does not exist; however, the size column could exist, and I could get some results from it.
Is it possible to handle such cases to return what is possible?
EDIT:
Yes, I could get all the existing columns first. But the initial queries can be rather complex and not created by me directly, I can only modify them.
That means the query can be simple like the previous example and can be much more complex like this:
WHERE size=10 OR (length=10 AND n='example') OR (c BETWEEN 1 and 5 AND p='Mars')
If the missing columns are length and c, does that mean I have to parse the SQL, split it by OR (or other operators), check every part of the query, remove any part with missing columns, and finally generate a new SQL query?
Any easier way?
I would check the information schema first:
select column_name from INFORMATION_SCHEMA.COLUMNS where table_name = 'table_name';
Then build the query based on the result.
Why don't you get a list of the columns that are in the table first? Like this:
select column_name
from information_schema.columns
where table_name = 'table_name' and (column_name = 'size' or column_name = 'length');
The result will be the columns that exist.
There is no way to do what you want, except for constructing an SQL string from the list of available columns, which you can get by querying information_schema.columns.
SQL statements are parsed before they are executed, and there is no conditional compilation or short-circuiting, so you get an error if a non-existent column is referenced.
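As a rough illustration of building the SQL string from the available columns, here is a minimal Python sketch. It assumes SQLite as a stand-in (PRAGMA table_info plays the role of information_schema.columns in PostgreSQL) and handles only a flat list of column = value conditions joined with OR; the helper names, the items table, and its data are hypothetical. Nested conditions like the complex example in the question would still need real parsing.

```python
import sqlite3

def existing_columns(conn, table):
    # SQLite stand-in for querying information_schema.columns in PostgreSQL.
    # The table name is assumed to be trusted, since it is interpolated directly.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return {row[1] for row in rows}  # row[1] is the column name

def build_filter(conn, table, conditions):
    """Keep only the (column, value) conditions whose column exists; join them with OR."""
    columns = existing_columns(conn, table)
    kept = [(col, val) for col, val in conditions if col in columns]
    if not kept:
        return None, []
    where = " OR ".join(f'"{col}" = ?' for col, _ in kept)
    return f'SELECT * FROM "{table}" WHERE {where}', [val for _, val in kept]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, size INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 10), (2, 20)])

# "length" does not exist in this table, so it is dropped before the query is built.
sql, params = build_filter(conn, "items", [("size", 10), ("length", 10)])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Values are passed as bound parameters, and only column names that were confirmed to exist are interpolated into the statement.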

Snowflake : Unsupported subquery type cannot be evaluated

I am using Snowflake as a data warehouse. I have a CSV file in AWS S3, and I am writing a MERGE statement to merge the CSV data into a table in Snowflake. I have a column in the time dimension table with data type NUMBER(38,0) in Snowflake. This table holds all times; for example, one row has:
time_id= 232 and time=12:00
In the CSV I get a column labeled time with the value 12:00.
In the MERGE statement I fetch this value and try to look up the time_id for it:
update table_name set start_time_dim_id = (select time_id from time_dim t where t.time_name = csv_data.start_time_dim_id)
This statement fails with the error "SQL compilation error: Unsupported subquery type cannot be evaluated".
I am struggling to solve it. While searching, I found one reference:
https://github.com/snowflakedb/snowflake-connector-python/issues/251
Has anyone else encountered this issue? If so, I would appreciate any pointers.
It seems like a conversion issue. I suggest checking the data in the CSV file; maybe there is a wrong or missing value. Please check your data and make sure it returns numeric values. For example:
create table simpleone ( id number );
insert into simpleone values ( True );
The last statement fails with:
SQL compilation error: Expression type does not match column data type, expecting NUMBER(38,0) but got BOOLEAN for column ID
If you provide sample data, and SQL to produce this error, maybe we can provide a solution.
Unfortunately, correlated and nested subqueries in Snowflake are a bit limited at this stage.
I would try running something like this:
update table_name
set start_time_dim_id = t.time_id
from time_dim t
where t.time_name = csv_data.start_time_dim_id

When using 'Replace' built in function in T-SQL within a Select Query, does the data on the table get modified?

I have the following query
SELECT
[DocID],
[Docunum],
[Comments] = REPLACE(REPLACE([Comments], CHAR(13), ''), CHAR(10), '')
FROM
[Billy].[dbo].[order]
WHERE
DocDate = '2017-12-20 00:00:00.000'
I was wondering whether the REPLACE function actually changes the value in the database. My concern is that this is an ERP system and I do not want referential integrity problems. I only want to strip the carriage returns and line feeds from the NVARCHAR column to avoid spacing issues when pasting into Excel. I do not want any values changed in the database.
Any feedback would be appreciated. I have searched and did not find anything that answered this specifically. If I missed something, please post a link for reference if possible.
Here you are using REPLACE in a SELECT query, so it does not affect your database. It only affects the result set returned by this query, so you are safe.
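A quick way to convince yourself is to run the SELECT and then read the column back. The Python sketch below uses an in-memory SQLite database as a stand-in for SQL Server (assumption: SQLite's replace() and char() behave like T-SQL's REPLACE and CHAR for this purpose); the orders table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (doc_id INTEGER, comments TEXT)")
conn.execute("INSERT INTO orders VALUES (?, ?)", (1, "line one\r\nline two"))

# REPLACE inside a SELECT only transforms the value in the result set.
cleaned = conn.execute(
    "SELECT REPLACE(REPLACE(comments, char(13), ''), char(10), '') FROM orders"
).fetchone()[0]

# Reading the column again shows the stored value still contains the line break.
stored = conn.execute("SELECT comments FROM orders").fetchone()[0]

print(repr(cleaned))
print(repr(stored))
```

Only statements such as UPDATE would change the stored value; the SELECT leaves it as it was.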

SQL Server 2000 query that omits commas in resulting rows?

I'm wondering if there is a way to query a SQL Server database and format columns so that any commas in the data are omitted.
The reason for asking is that I have 10000+ records, and throughout the data the varchar columns hold values like 3,25% and others like 1%.
I'd prefer not to alter the data in the original table, hence my question of whether a SELECT with other functions would do the trick.
I have thought about selecting all the data into a temp table and stripping the commas, but that is a lot of work to do every time I run the query.
If you have any info on whether this is possible, please reply.
Take a look at the REPLACE function:
SELECT REPLACE(YourColumn, ',', '')
FROM YourTable
Use SQL REPLACE:
REPLACE(YourField,',','')