OData changeset transaction with Teiid - jboss

I am trying to insert data into a SQL Server database by using an OData batch request changeset with Teiid. Surprisingly, the inserts don't seem to be atomic. I am using Teiid 13.0.2 and mssql-jdbc-7.4.1.jre8.jar.
To demonstrate the problem, I created a TestTable with one column, 'Id', which is the primary key. The table already contains the value 1. Using the following batch request, I try to insert 41 and 1 in a single atomic operation. The batch fails with status code 400, which I expect due to the primary key constraint. But the 41 is still inserted into the SQL Server database, which, as I understand it, shouldn't happen.
--batch_a
Content-Type: multipart/mixed; boundary=changeset_a

--changeset_a
Content-Type: application/http
Content-ID: 1
Content-Transfer-Encoding: binary

POST TestTable HTTP/1.1
Content-Type: application/json

{"Id":41}
--changeset_a
Content-Type: application/http
Content-ID: 2
Content-Transfer-Encoding: binary

POST TestTable HTTP/1.1
Content-Type: application/json

{"Id":1}
--changeset_a--
--batch_a--
Any ideas about what I am doing wrong? I created a Teiid trace to see what is going on, and it looks like there are 2 transactions rather than 1. The trace is available at https://gist.github.com/mbankdmt/ec1465e22f71c00dab6db13483da66c1.

I'm using Teiid 17.0.0 and Olingo 4.8.0, and I encountered the same issue as you.
After debugging, I found that there are some bugs inside the TeiidServiceHandler class; nested transactions cause this issue.

Related

Database calls, 484ms apart, are producing incorrect results in Postgres

We have "things" sending data to AWS IoT. A rule forwards the payloads to a Lambda which is responsible for inserting or updating the data into Postgres (AWS RDS). The Lambda is written in python and uses PG8000 for interacting with the db. The lambda event looks like this:
{
  "event_uuid": "8cd0b9b1-be93-49f8-1234-af4381052672",
  "date": "2021-07-08T16:09:25.138809Z",
  "serial_number": "a1b2c3",
  "temp": "34"
}
Before inserting the data into Postgres, a query is run on the table to look for any existing event_uuids which are required to be unique. For a specific reason, there is no UNIQUE constraint on the event_uuid column. If the event_uuid does not exist, the data is inserted. If the event_uuid does exist, the data is updated. This all works great, except for the following case.
THE ISSUE: one of our things is sending two of the same payloads in very quick succession. It's an issue with one of our things but it's not something we can resolve at the moment and we need to account for it. Here are the timestamps from CloudWatch of when each payload was received:
2021-07-08T12:10:09.288-04:00
2021-07-08T12:10:09.772-04:00
As a result of the payloads being received 484ms apart, the Lambda is inserting both payloads instead of inserting the first and performing an update with the second one.
Any ideas on how to get around this?
Here is part of the Lambda code...
conn = make_conn()

event_query = f"""
    SELECT json_build_object('uuid', uuid)
    FROM samples
    WHERE event_uuid='{event_uuid}'
    AND serial_number='{serial_number}'
"""

event_resp = fetch_one(conn, event_query)

if event_resp:
    update_sample_query = f"""
        UPDATE samples SET temp={temp} WHERE uuid='{event_resp["uuid"]}'
    """
else:
    insert_sample_query = f"""
        INSERT INTO samples (uuid, event_uuid, temp)
        VALUES ('{uuid4()}', '{event_uuid}', {temp})
    """

PostgREST / PostgreSQL Cannot enlarge string buffer message

I run into a "Cannot enlarge string buffer" message on my running PostgREST API. I guess some tables are too large to work successfully with the API.
I am using the docker postgrest/postgrest container from https://hub.docker.com/r/postgrest/postgrest with version PostgREST 5.1.0.
Everything is working as expected, but once a table gets too large, I get the following error message.
hint null
details "Cannot enlarge string buffer containing 1073741822 bytes by 1 more bytes."
code "54000"
message "out of memory"
I can't determine the threshold at which it stops working.
Is there a possibility to enlarge the string buffer in some config file, or is this hardcoded?
Are there any limits on table size when working with the API? So far I couldn't find any information in the docs.
=========== Update
The Postgres logs give me the following SQL query:
WITH pg_source AS (
SELECT "public"."n_osm_bawue_line".*
FROM "public"."n_osm_bawue_line"
)
SELECT null AS total_result_set,
pg_catalog.count(_postgrest_t) AS page_total,
array[]::text[] AS header,
coalesce(json_agg(_postgrest_t), '[]')::character varying AS body
FROM (
SELECT *
FROM pg_source
) _postgrest_t
I use the following Postgres version:
"PostgreSQL 11.1 (Debian 11.1-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit"
Unless you recompile PostgreSQL, it is not possible to raise the limit (defined here).
My suggestion would be to reduce the size of the payload (are you sure you need all the data?) or to fetch the payload in multiple requests.
With PostgREST you can do vertical filtering (select just the columns that you need) or paginate to reduce the number of rows you get in one request.
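For illustration, this is roughly what the generated query from the question would shrink to with vertical filtering and pagination applied; the column names and page size below are hypothetical, and the exact SQL PostgREST emits differs by version, but it shows why json_agg then stays far below the 1 GB limit:

WITH pg_source AS (
  SELECT "public"."n_osm_bawue_line"."id",    -- hypothetical: only the columns you actually need
         "public"."n_osm_bawue_line"."name"
  FROM "public"."n_osm_bawue_line"
  LIMIT 1000 OFFSET 0                         -- hypothetical page size
)
SELECT null AS total_result_set,
       pg_catalog.count(_postgrest_t) AS page_total,
       array[]::text[] AS header,
       coalesce(json_agg(_postgrest_t), '[]')::character varying AS body
FROM (
  SELECT *
  FROM pg_source
) _postgrest_t

On the HTTP side this corresponds to the ?select= parameter for the columns and limit/offset (or the Range header) for the rows.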
The error message comes from PostgreSQL. PostgREST just wraps the message in JSON and sends the HTTP response.
As a first step in finding the problem, look at the exact HTTP request you make to trigger the error.
Then enable PostgreSQL logging, repeat the request, and check the logs; you'll see the SQL query that causes this error. Run the query through pgAdmin or psql to make sure you have isolated the problematic query.
Update your question with your findings. The SQL query is what is needed to continue.
After that you could add a postgresql tag to your question.
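If you have the required privileges, one way to turn statement logging on temporarily while you reproduce the request (a sketch; adjust to however your instance manages configuration) is:

ALTER SYSTEM SET log_statement = 'all';  -- log every statement; verbose, enable only while debugging
SELECT pg_reload_conf();                 -- apply the change without restarting the server

Afterwards, ALTER SYSTEM RESET log_statement; plus another pg_reload_conf() turns it back off.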
There is always the possibility that the file being imported is either corrupted or malformed because of any number of reasons.
I just happened to have discovered in my case that my file had something like incorrect line endings (long story, unnecessary here) which caused the whole file to appear as one line, thus causing the obvious result. You may have something similar in your case that requires a find+replace kind of solution.
For whatever benefit to anyone else, I used this to resolve it:
tr -d '\0' < bad_file.csv > bad_file.csv.fixed

JSON Data loading into Redshift Table

I am trying to load JSON data into a Redshift table. Below are the sample code, table structure, and JSON data.
I have gone through many posts on this site and on AWS; however, my issue is not yet resolved.
The JSON data is below; I copied it into test.json and uploaded it to S3...
{backslash: "a",newline: "ab",tab: "dd"}
Table structure is as below
create table escapes (backslash varchar(25), newline varchar(35), tab varchar(35));
Copy command is as below
copy escapes from 's3://dev/test.json'
credentials 'aws_access_key_id=******;aws_secret_access_key=$$$$$'
format as JSON 'auto';
However it throws the below error
Amazon Invalid operation: Load into table 'escapes' failed. Check 'stl_load_errors' system table for details.;
1 statement failed.
In the 'stl_load_errors' table, the error reason is "Invalid value."
Seems like the issue is with your JSON data. Ideally it should be:
{
  "backslash": "a",
  "newline": "ab",
  "tab": "dd"
}
I hope this resolves your issue; if not, update your question and I can take another look at the answer.
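If the load still fails after fixing the JSON, the offending value is usually visible in stl_load_errors. A query along these lines (using that system table's standard columns) shows the most recent failures:

SELECT starttime,
       filename,
       line_number,
       colname,
       raw_field_value,
       err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;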

BigQuery - create table via UI from cloud storage results in integer error

I am trying to test out BigQuery but am getting stuck on creating a table from data stored in Google Cloud Storage. I am able to reduce the data down to just one value, but it still does not make sense.
I have a text file I uploaded to Google Cloud Storage with just one integer value in it, 177790884
I am trying to create a table via the BigQuery web UI, and go through the wizard. When I get to the schema definition section, I enter...
ID:INTEGER
The load always fails with...
Errors:
File: 0 / Line:1 / Field:1: Invalid argument: 177790884 (error code: invalid)
Too many errors encountered. Limit is: 0. (error code: invalid)
Job ID trusty-hangar-120519:job_LREZ5lA8QNdGoG2usU4Q1jeMvvU
Start Time Jan 30, 2016, 12:43:31 AM
End Time Jan 30, 2016, 12:43:34 AM
Destination Table trusty-hangar-120519:.onevalue
Source Format CSV
Allow Jagged Rows true
Ignore Unknown Values true
Source URI gs:///onevalue.txt
Schema
ID: INTEGER
If I load with a schema of ID:STRING, it works fine. The number 177790884 is not larger than a 64-bit signed int; I am really unsure what is going on.
Thanks,
Craig
Your input file likely contains a UTF-8 byte order mark (3 "invisible" bytes at the beginning of the file that indicate the encoding) that can cause BigQuery's CSV parser to fail.
https://en.wikipedia.org/wiki/Byte_order_mark
I'd suggest Googling for a platform-specific method for viewing and removing the byte order mark. (A hex editor would do.)
The issue is definitely with the file's encoding. I was able to reproduce the error.
I then "fixed" it by saving the "problematic" file as ANSI (just for a test), and it then loaded successfully.

TSQL XML Parsing

I'm calling a web service (URL below) and am getting the XML results back within SQL Server as a varchar(8000), then converting that to XML. This works perfectly. I want to parse this XML information out into its individual values but keep getting NULL values. This is my first attempt at using XML on my SQL 2008 server, so I know I'm missing a very trivial item.
http://dev.virtualearth.net/Services/v1/GeocodeService/GeocodeService.asmx/Geocode?culture=en-us&count=10&query=1%20microsoft%20way,%20redmond,%20wa&landmark=&addressLine=&locality=&postalTown=&adminDistrict=&district=&postalCode=&countryRegion=&mapBounds=&currentLocation=&curLocAccuracy=&entityTypes=&rankBy=
I'm taking the response received and storing it in @XML.
SET @XML = CAST(@Response AS XML)
I'm next trying to pull out the Postal Code to get my results and receive a NULL or the wrong node.
Returns NULL
SELECT @XML.value('(/GeocodingResult/Results/Address/PostalCode) [1]', 'varchar(50)')
Returns "Copyright © 2010 Microsoft and its suppliers. All " (without the quotes)
SELECT @XML.value('(/) [1]', 'varchar(50)')
Your XPath is wrong - your root node is GeocodingResponse (not GeocodingResult), and you're missing a GeocodingResult along the way.
Try this XPath:
/GeocodingResponse/Results/GeocodingResult/Address/PostalCode
or this SQL XQuery:
SELECT
    @XML.value('(/GeocodingResponse/Results/GeocodingResult/Address/PostalCode) [1]',
               'varchar(50)')
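Putting it together, a minimal self-contained version (with a shortened, hypothetical response body standing in for the real web-service result) would be:

DECLARE @Response varchar(8000);
DECLARE @XML xml;

-- Hypothetical, abbreviated payload; in your case this comes from the web service call.
SET @Response = '<GeocodingResponse><Results><GeocodingResult><Address><PostalCode>98052</PostalCode></Address></GeocodingResult></Results></GeocodingResponse>';

SET @XML = CAST(@Response AS XML);

-- Returns 98052 with the corrected XPath.
SELECT @XML.value(
    '(/GeocodingResponse/Results/GeocodingResult/Address/PostalCode)[1]',
    'varchar(50)') AS PostalCode;

If the real response declares an XML namespace, the path also needs to be qualified with WITH XMLNAMESPACES; the snippet above assumes a namespace-free payload.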