How to refer to a Camel SQL component named parameter where the parameter name is in mixed case and contains spaces? - postgresql

I am loading a CSV file into a Postgresql database using the Camel SQL component.
The original CSV file header names (columns) are mixed case with spaces, e.g. "Cost Price".
The SQL component refers to an SQL insert statement in a properties file,
e.g.
insert into upload_data(year,month,cost) values (:#year,:#month,:#Cost Price)
I get this error:
Caused by: [org.springframework.jdbc.BadSqlGrammarException - PreparedStatementCallback; bad SQL grammar []; nested exception is org.postgresql.util.PSQLException: ERROR: syntax error at or near ":" at position...
(the position refers to the : before #Cost Price)
If I change the parameter name to cost_price (i.e. :#cost_price) and modify the CSV header to match, the file uploads without error.
I have tried surrounding the parameter with quotes (", ', \") and with {} in the insert statement, with no luck.
Is it possible to use mixed case and spaces in named parameters, via escaping or otherwise, or do I need to intervene and modify the CSV header?

The SQL component does not support this; in fact, using spaces in header names is plain bad design. So after you read the CSV file, rename the headers before calling the SQL component, as in the sketch below.
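Whether you do it inside the route or as a pre-processing step, the idea is the same: normalize the header before the SQL component sees it. For illustration, a minimal pre-processing sketch in Python (the file names and the snake_case convention are my own choices, not part of the original setup):

import csv

def snake_case(name):
    # e.g. "Cost Price" -> "cost_price"
    return name.strip().lower().replace(" ", "_")

with open("upload.csv", newline="") as src, \
     open("upload_clean.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)                 # original mixed-case header row
    writer.writerow([snake_case(h) for h in header])
    writer.writerows(reader)              # data rows pass through untouched

With the header rewritten this way, the properties file can reference :#cost_price and friends, which the SQL component parses without complaint.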

Related

Getting Python to accept a csv into postgreSQL table with ":" in the headers

I receive a .csv export every 10 minutes that I'd like to import into a PostgreSQL server. Working with a test csv, I got everything to work, but I hadn't noticed that my actual csv file has a ":" forced onto the end of each column header (though not the first header, for some reason). It's built into the back end of the exporter, so I can't get it removed; I've already asked the company. Once I added the ":"s to my test csv to match, my INSERT INTO statements no longer worked and gave me syntax errors. I'm trying to insert the rows using the following code:
print("Reading file contents and copying into table...")
with open('C:\\Users\\admin\\Desktop\\test2.csv') as csvfile:
readCSV = csv.reader(csvfile, delimiter=',')
columns = next(readCSV) #skips the header row
query = 'insert into test({0}) values ({1})'
query = query.format(','.join(columns), ','.join('?' * len(columns)))
for data in readCSV:
cursor.execute(query, data)
con.commit()
This results in a '42601' syntax error near the ":" in the second column header.
The results are the same when I write the column headers and the ? placeholders out literally in the INSERT INTO statement.
What is the syntax to get the script to accept ":" in column headers? If there's no way, is there a way to scan through the headers and remove the ":" at the end of each?
Because : is a special character, if your column is named year: in the DB, you must double-quote its name, e.g. select "year:" from test;
You are getting a PG error because you are referencing the unquoted column name (insert into test({0})), so add double quotes there.
query = 'insert into test("year:","day:", "etc:") values (...)'
That being said, it might be simpler to remove every occurrence of : from your csv's first line.
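Applied to the script above, that quoting fix could look like this (a sketch; it keeps the original ? placeholders and simply wraps each header in double quotes):

columns = next(readCSV)
quoted = ','.join('"{}"'.format(c) for c in columns)   # -> "year:","day:",...
query = 'insert into test({0}) values ({1})'
query = query.format(quoted, ','.join('?' * len(columns)))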
Much appreciated, JGH and Adrian. I went with the suggestion to remove every occurrence of : by adding the following line after the first columns = ... statement:
columns = [column.strip(':') for column in columns]
It worked well.

Postgresql Nested functions

I'm trying to nest functions in PostgreSQL to extract a directory name (specifically the third one) from a path.
My starting SQL code was:
SELECT "Path", regexp_matches("Path", '^([^/]*/){3}.*') FROM ...
Since the returned value of my regex is delimited by curly braces and double quotes (when it contains spaces), I'm trying to remove these delimiters by nesting functions:
SELECT "Path", right(regexp_matches("Path", '^([^/]*/){3}.*'),2) FROM
But I get the following error:
ERROR: function right(text[], integer) does not exist
LINE 2: SELECT "Path", right(regexp_matches("Path", '^([^/]*/){3}.*'...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
SQL state: 42883
Character: 17
I don't know how to format the expression so that it's accepted...
Of course, if someone can give me a way to extract exactly the nth field of a slash-separated string, I would be happy!
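For the record: regexp_matches() returns text[] (an array), which is why there is no right(text[], integer); indexing the result with [1] first yields plain text. And PostgreSQL's split_part() fetches the nth slash-separated field directly, with no regex at all. A sketch from Python, assuming psycopg2 and a hypothetical table named files:

import psycopg2

con = psycopg2.connect("dbname=mydb")  # hypothetical connection string
cur = con.cursor()

# Index the text[] from regexp_matches to get plain text
# (the capture here is the third path segment plus its trailing slash)
cur.execute('''SELECT "Path", (regexp_matches("Path", '^([^/]*/){3}.*'))[1] FROM files''')

# Or fetch the nth slash-separated field directly; if "Path" starts
# with '/', field 1 is the empty string, so the third directory name
# is field 4 rather than field 3
cur.execute('''SELECT "Path", split_part("Path", '/', 3) FROM files''')
print(cur.fetchall())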

IBM DB2 Load failure to IBM DB2 Z/Os

Connect to server user myuser using mypass;
LOAD CLIENT from "Text_File.TXT" OF DEL
MODIFIED BY CHARDEL0x22 coldel0x09 KEEPBLANKS USEDEFAULTS
TIMESTAMPFORMAT="YYYY-MM-DD HH:MM:SS.UUUUUUUUU" MESSAGES "Log_Text_File.TXT"
INSERT INTO SCHEMA.Table NONRECOVERABLE;
This is my current command (above); the single record produced in the text file is below (note that it spills across two lines):
"int" "AND 8 / 2010.
" "int" "int" "string" "2014-03-12 14:52:29" "name" "int"
The error I'm getting is:
SQL3116W The field value in row "F8-8245" and column "6" is missing, but the
target column is not nullable.
SQL3185W The previous error occurred while processing data from row "F8-8245"
of the input file.
I'm using a text qualifier of " (double quote), and it's a tab-delimited file.
I'm not sure why the file is failing as the 6th column is filled.
Any help would be greatly appreciated.
If your input data file can contain a newline character inside a character-string value, then add DELPRIORITYCHAR to the modified-by list like this:
MODIFIED BY CHARDEL0x22 coldel0x09 delprioritychar
Then retry and check the output. Remember to erase your message file before each load (or archive it) so you only see fresh messages.
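Putting it together with the original command, only the modified-by list changes (a sketch reusing the same file names):

Connect to server user myuser using mypass;
LOAD CLIENT from "Text_File.TXT" OF DEL
MODIFIED BY CHARDEL0x22 coldel0x09 KEEPBLANKS USEDEFAULTS delprioritychar
TIMESTAMPFORMAT="YYYY-MM-DD HH:MM:SS.UUUUUUUUU" MESSAGES "Log_Text_File.TXT"
INSERT INTO SCHEMA.Table NONRECOVERABLE;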

Use of column names in Redshift COPY command which is a reserved keyword

I have a table in Redshift where the column names are 'begin' and 'end'. They are Redshift keywords. I want to use them explicitly in the Redshift COPY command. Is there a workaround other than renaming the columns in the table? That would be my last resort.
I tried enclosing them in single/double quotes, but it looks like the COPY command only accepts comma-separated column names.
The COPY command fails if you don't escape keywords used as column names, e.g. begin or end:
copy test1(col1,begin,end,col2) from 's3://example/file/data1.csv' credentials 'aws_access_key_id=XXXXXXXXXXXXXXX;aws_secret_access_key=XXXXXXXXXXX' delimiter ',';
ERROR: syntax error at or near "end"
But it works fine if begin and end are enclosed in double quotes (") as below:
copy test1(col1,"begin","end",col2) from 's3://example/file/data1.csv' credentials 'aws_access_key_id=XXXXXXXXXXXXXXX;aws_secret_access_key=XXXXXXXXXXX' delimiter ',';
I hope this helps. If you are getting a different error, please update your question.

Converting csv to parquet in spark gives error if csv column headers contain spaces

I have a csv file which I am converting to parquet using the Databricks library in Scala. I am using the code below:
val spark = SparkSession.builder().master("local[*]").config("spark.sql.warehouse.dir", "local").getOrCreate()
var csvdf = spark.read.format("org.apache.spark.csv").option("header", true).csv(csvfile)
csvdf.write.parquet(csvfile + "parquet")
Now the above code works fine if I don't have spaces in my column headers. But if any csv file has spaces in the column headers, it doesn't work and errors out citing invalid column headers. My csv files are delimited by ,.
Also, I cannot change the spaces in the csv's column names. The column names have to stay as they are, even if they contain spaces, since they are supplied by the end user.
Any idea on how to fix this?
Per @CodeHunter's request:
Sadly, the parquet file format does not allow for spaces in column names; the error it'll spit out when you try is: contains invalid character(s) among " ,;{}()\n\t=".
ORC also does not allow for spaces in column names :(
Most SQL engines don't support column names with spaces, so you'll probably be best off converting your columns to your preference of foo_bar or fooBar, or something along those lines.
I would rename the offending columns in the dataframe, changing space to underscore, before saving. This could be done with select "foo bar" as "foo_bar" or with .withColumnRenamed("foo bar", "foo_bar"); see the sketch below.
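A sketch of that rename-everything approach, in PySpark for brevity (the question is in Scala, but the DataFrame API has the same shape; in Scala you could fold withColumnRenamed over csvdf.columns). The file names here are hypothetical:

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()
csvdf = spark.read.option("header", "true").csv("input.csv")

# Replace spaces with underscores in every column name before writing,
# since the parquet writer rejects " ,;{}()\n\t=" in column names
renamed = csvdf.toDF(*[c.replace(" ", "_") for c in csvdf.columns])
renamed.write.parquet("output.parquet")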