Error while importing beers database using CSV - orientdb

I am using the latest community edition, 2.2.17. While importing the beers database from CSV, I get errors on the beers file (categories, styles, etc. were all imported fine).
The errors look like this:
OrientDB etl v.2.2.17 (build 2.2.x#rd9bace82ea8437117fd48114fc255e791056014b; 2017-02-16 17:20:27+0000) www.orientdb.com
[csv] INFO column types: {last_mod=ANY, abv=ANY, filepath=ANY, name=ANY, cat_id=ANY, upc=ANY, id=ANY, brewery_id=ANY, style_id=ANY, descript=ANY, ibu=ANY, srm=ANY}
BEGIN ETL PROCESSOR
[file] INFO Reading from file C:/Database_studies/nosql/orientdb/Import/OrientDB_self_study_files/beerdb/openbeerdb_csv/beers.csv with encoding UTF-8
Started execution with 1 worker threads
[csv] ERROR Error on converting row 1 field 'last_mod' , value '2010-07-22 20:00:20' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'abv' , value '4.5' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'filepath' , value '' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'name' , value 'Hocus Pocus' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'cat_id' , value '11' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'upc' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'id' , value '1' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'brewery_id' , value '812' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'style_id' , value '116' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'descript' , value 'Our take on a classic summer ale. A toast to weeds, rays, and summer haze. A light, crisp ale for mowing lawns, hitting lazy fly balls, and communing with nature, Hocus Pocus is offered up as a summer sacrifice to clodless days.
Its malty sweetness finishes tart and crisp and is best apprediated with a wedge of orange.' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'ibu' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'srm' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'last_mod' , value '2010-07-22 20:00:20' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'abv' , value '6.7' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'filepath' , value '' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'name' , value 'Grimbergen Blonde' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'cat_id' , value '-1' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'upc' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'id' , value '2' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'brewery_id' , value '264' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'style_id' , value '-1' (class:java.lang.String) to type: ANY
The command I used to import is the same as the one given in the documentation:
./oetl.sh /temp/openbeer/beers.json
(with the directory name changed to the actual one in my system).
Can someone please suggest a fix?
Here is my beers.json file:
{
"config" : { "haltOnError": false },
"source": { "file": { "path": "C:/Database_studies/nosql/orientdb/Import/OrientDB_self_study_files/beerdb/openbeerdb_csv/beers.csv" } },
"extractor": { "csv": { "columns": ["id","brewery_id","name","cat_id","style_id","abv","ibu","srm","upc","filepath","descript","last_mod"],
"columnsOnFirstLine": true } },
"transformers": [
{ "vertex": { "class": "Beer" } },
{ "edge": { "class": "HasCategory", "joinFieldName": "cat_id", "lookup": "Category.id" } },
{ "edge": { "class": "HasBrewery", "joinFieldName": "brewery_id", "lookup": "Brewery.id" } },
{ "edge": { "class": "HasStyle", "joinFieldName": "style_id", "lookup": "Style.id" } }
],
"loader": {
"orientdb": {
"dbURL": "plocal:C:/orientdb_install_031217/orientdb-community-2.2.17/databases/openbeerdb",
"dbType": "graph",
"classes": [
{"name": "Beer", "extends": "V"},
{"name": "HasCategory", "extends": "E"},
{"name": "HasStyle", "extends": "E"},
{"name": "HasBrewery", "extends": "E"}
], "indexes": [
{"class":"Beer", "fields":["id:integer"], "type":"UNIQUE" }
]
}
}
}
Thanks,
DBuserN

My suggestion is to declare the type of each column explicitly. Note that abv holds decimal values such as 4.5, so it needs a floating-point type such as float rather than integer:
"extractor": { "csv": { "columns": ["id:integer","brewery_id:integer","name:string","cat_id:integer","style_id:integer","abv:float","ibu:integer","srm:integer","upc:integer","filepath:string","descript:string","last_mod:dateTime"],
"columnsOnFirstLine": true } },
Check the CSV extractor documentation:
http://orientdb.com/docs/last/Extractor.html
Also make sure the default dateTimeFormat matches the dates in your input file.
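Before rerunning the import, it can help to check that the sample values really parse as the types you plan to declare. The sketch below is plain Python, not part of OrientDB ETL: the converter table, the COLUMN_TYPES mapping (including abv as float) and the date pattern are assumptions based on the rows shown in the error log.

```python
from datetime import datetime

# Hypothetical converters standing in for the declared column types.
CONVERTERS = {
    "integer": int,
    "float": float,
    "string": str,
    # Assumed pattern, matching values such as '2010-07-22 20:00:20'.
    "dateTime": lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"),
}

# Assumed type per column; adjust to your data.
COLUMN_TYPES = {
    "id": "integer", "brewery_id": "integer", "name": "string",
    "cat_id": "integer", "style_id": "integer", "abv": "float",
    "ibu": "integer", "srm": "integer", "upc": "integer",
    "filepath": "string", "last_mod": "dateTime",
}

def failing_columns(row, column_types=COLUMN_TYPES):
    """Return (column, value) pairs whose value does not parse as the declared type."""
    failures = []
    for column, type_name in column_types.items():
        value = row.get(column, "")
        if value == "":
            continue  # empty cells are skipped in this sketch
        try:
            CONVERTERS[type_name](value)
        except ValueError:
            failures.append((column, value))
    return failures

# First data row, copied from the error log in the question.
row = {
    "id": "1", "brewery_id": "812", "name": "Hocus Pocus", "cat_id": "11",
    "style_id": "116", "abv": "4.5", "ibu": "0", "srm": "0", "upc": "0",
    "filepath": "", "last_mod": "2010-07-22 20:00:20",
}

print(failing_columns(row))
# Declaring abv as integer would be flagged, because '4.5' is not an integer.
print(failing_columns(row, {**COLUMN_TYPES, "abv": "integer"}))
```

Running the same check over the whole CSV (for example with csv.DictReader) would show whether other columns, such as ibu or srm, also contain decimals.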

Querying array of nested objects with nested array of objects in Redshift

Let's say I have the following JSON
{
"id": 1,
"sets": [
{
"values": [
{
"value": 1
},
{
"value": 2
}
]
},
{
"values": [
{
"value": 5
},
{
"value": 6
}
]
}
]
}
If the table name is X I expect the query
SELECT x.id, v.value
FROM X as x,
x.sets as sets,
sets.values as v
to give me
id, value
1, 1
1, 2
1, 5
1, 6
and it does work if both sets and values have one object each. When there are more, the query fails with: column 'id' had 0 remaining values but expected 2. It seems I'm not iterating over "sets" properly.
So my question is: what's the proper way to query data structured like my example above in Redshift (using PartiQL)?
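To make the intended result concrete, here is a small Python sketch of the flattening the query is meant to perform: one output row per innermost "value", paired with the id of its document. It only illustrates the expected rows and says nothing about how Redshift evaluates the query; the function name flatten is made up for this example.

```python
# The example document from the question.
document = {
    "id": 1,
    "sets": [
        {"values": [{"value": 1}, {"value": 2}]},
        {"values": [{"value": 5}, {"value": 6}]},
    ],
}

def flatten(doc):
    """Yield one (id, value) pair per innermost element (a nested unnest)."""
    for value_set in doc["sets"]:        # outer array: sets
        for item in value_set["values"]:  # inner array: values
            yield doc["id"], item["value"]

rows = list(flatten(document))
for row in rows:
    print(row)
```

Every row carries the id of the single document, because both nested arrays belong to it.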

Malformed array literal when converting a jsonb array of jsonb items to a Postgres array of jsonb with jsonb_array_elements

I have a jsonb array:
element_values := '[
{
"element_id": "a7993f3d-9256-4354-a147-5b9d18d7812b",
"value": true
},
{
"element_id": "ceeb364e-bb88-4f41-9c56-9e5f4d0bc1fb",
"value": None
},
...
]'::JSONB
I want to convert it into an array of jsonb objects (JSONB[]).
I tried this method:
<<elements_n_values_relationship_create>>
DECLARE
elements_n_values_relationship JSONB[];
BEGIN
SELECT * FROM jsonb_array_elements(element_values) INTO elements_n_values_relationship;
...
END;
But I got the following error:
ERROR: malformed array literal: "{"value": true, "element_id": "a7993f3d-9256-4354-a147-5b9d18d7812b"}"
DETAIL: Unexpected array element.
Why does it not work?
You have to use null in place of None to make your statement work.
EDIT:
Try this in pgAdmin or any SQL client; it works as expected:
select jsonb_array_elements('[{
"element_id": "a7993f3d-9256-4354-a147-5b9d18d7812b",
"value": true
},
{
"element_id": "ceeb364e-bb88-4f41-9c56-9e5f4d0bc1fb",
"value": null
}]'::JSONB);
jsonb_array_elements
{"value": true, "element_id": "a7993f3d-9256-4354-a147-5b9d18d7812b"}
{"value": null, "element_id": "ceeb364e-bb88-4f41-9c56-9e5f4d0bc1fb"}
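The point about null versus None can be checked outside the database as well. The sketch below uses Python's json module as a stand-in for a JSON parser; it is not PostgreSQL, but both follow the JSON grammar, in which None is not a literal. The helper name is_valid_json is made up for this example.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Same element as in the question, once with None and once with null.
with_none = '[{"element_id": "ceeb364e-bb88-4f41-9c56-9e5f4d0bc1fb", "value": None}]'
with_null = '[{"element_id": "ceeb364e-bb88-4f41-9c56-9e5f4d0bc1fb", "value": null}]'

print(is_valid_json(with_none))
print(is_valid_json(with_null))
```

Only the second string is valid JSON, which is why the literal has to be written as null before casting to JSONB.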

Can MongoDB resolve empty string as property name?

Does anybody know how to query a value in MongoDB from this (valid, pretty-printed JSON) object:
var a = JSON.parse(`
{
"vnut_okraj_podmienky": {
"": {
"standart_podmienky": {
"type": "radio",
"value": "on"
},
"nestand_teplota": {
"type": "number",
"value": "24"
},
"nestand_vlhkost": {
"type": "number",
"value": "70"
}
}
}
}
`);
In the browser console I can obtain the value (24) with:
a.vnut_okraj_podmienky[""].nestand_teplota.value
but mongosh returns [] for this (db name irrelevant):
db.isover_projects.distinct("vnut_okraj_podmienky.''.nestand_teplota.value")
and it raises the error MongoServerError: FieldPath field names may not be empty strings.
for:
db.isover_projects.distinct("vnut_okraj_podmienky..nestand_teplota.value")
The MongoDB server stores data in BSON.
According to the specification at https://bsonspec.org/spec.html, a field name must be
Zero or more modified UTF-8 encoded characters followed by '\x00'. The (byte*) MUST NOT contain '\x00', hence it is not full UTF-8.
So it technically can store the empty string as a field name.
This works in simple queries as well:
>db.collection.find({"":{a:1}})
[ { _id: ObjectId("616c4783e3be8ecf36d5e932"), '': { a: 1 } } ]
This also works with dotted notation:
>db.collection.find({".a":1})
[ { _id: ObjectId("616c4783e3be8ecf36d5e932"), '': { a: 1 } } ]
However, that does not work if you try to use that empty field name with update, projection, or aggregation operators:
>db.collection.aggregate([{$match:{".a":1}},{$set:{".b":2}}])
MongoError: Invalid $set :: caused by :: FieldPath field names may not be empty strings.
So while it is technically permitted to store a document with a field whose name is the empty string, not all operations are supported on such fields.
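Given that limitation, one possible workaround is to fetch the document and walk it in client code, where an empty string is an ordinary key. The sketch below uses a plain Python dict standing in for a fetched document; no MongoDB call is made, and the helper get_path is made up for this example.

```python
# A plain dict standing in for a document fetched from the collection.
document = {
    "vnut_okraj_podmienky": {
        "": {
            "standart_podmienky": {"type": "radio", "value": "on"},
            "nestand_teplota": {"type": "number", "value": "24"},
            "nestand_vlhkost": {"type": "number", "value": "70"},
        }
    }
}

def get_path(doc, keys):
    """Follow a list of keys through nested dicts; '' is a valid key."""
    current = doc
    for key in keys:
        current = current[key]
    return current

value = get_path(document, ["vnut_okraj_podmienky", "", "nestand_teplota", "value"])
print(value)
```

Passing the path as a list of keys avoids the dotted-string syntax, which has no way to express an empty segment unambiguously.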

Access value inside a mongo object, when you know the key but not the value

I am trying to find the value of a key inside of a mongo object. This is what my document in mongo looks like:
_id: 123456
▼ Item_Data: Object
▼ Payload: Object
▼ Items: Array
▸ 0: Object
▼ 1: Object
key1: "string1"
key2: "string2"
key3: "string3"
▸ 2: Object
▼ 3: Object
key1: "string4"
key2: "string5"
key3: "string6"
What I am trying to do:
I am given the value of a key, for example, in this case, I am given a string value "string1" for key1
I search the "Items" array for the object that has a key1 value of "string1". (each object has a unique key1 value)
I want to find the value of key2 that is in the same object I found in the previous step
So I input the "string1", find the object that has "string1" value in key1, and it outputs the value of key2 in that same object. In this case, it would output "string2"
I have tried a couple of methods (all of which didn't work) until I found this, which seemed the most promising.
test = myCollection.find_one({ "_id": 123456, "Item_Data.payload.items.key1" : "string1" },
{ "Item_Data.payload.items.$": 1 })
If I do print(test), I get:
{'_id': 123456, 'Item_Data': {'payload': {'items': [{'key1': 'string1', 'key2': 'string2', 'key3': 'string3'}]}}}
I tried to hone in on the relevant data by using
test = myCollection.find_one({ "_id": 123456, "Item_Data.payload.items.key1" : "string1" },
{ "Item_Data.payload.items.$": 1 })['Item_Data']['payload']['items']
but this just prints
[{'key1': 'string1', 'key2': 'string2', 'key3': 'string3'}]
Now this thing above ^^ is a list (I checked using type()), but when I do len(), it returns 1.
I also tried the following code:
var1 = json.dumps(test)
var2 = json.loads(var1)
print(type(test))
print(type(var1))
print(type(var2))
print(len(test))
print(len(var2))
which returns
<class 'list'>
<class 'str'>
<class 'list'>
1
1
Essentially I am getting the whole data as a single element list, and thus cannot do something like
print(var2['key2'])
to get the value of key2.
I feel like I am doing this completely wrong.
Try the following:
db.collection.aggregate([
{
$match: {
"Item_Data.payload.items.key1": "string1"
},
},
{
$unwind: "$Item_Data.payload.items"
},
{
$match: {
"Item_Data.payload.items.key1": "string1"
},
},
{
$project: {
_id: 1,
key2: "$Item_Data.payload.items.key2"
}
}
])
Output
[
{
"_id": 123456,
"key2": "string2"
}
]
mongoplayground
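As a side note on the find_one attempt in the question: the result there is a list with one element, so the dictionary has to be taken out of the list before reading key2. A minimal Python sketch, using the printed result from the question as data (the helper key2_for is made up for this example):

```python
# The single-element list printed in the question.
items = [{'key1': 'string1', 'key2': 'string2', 'key3': 'string3'}]

# Index into the list first, then read the key from the dict.
key2 = items[0]["key2"]
print(key2)

def key2_for(items, key1_value):
    """Return key2 of the first item whose key1 matches, or None if there is none."""
    return next(
        (item["key2"] for item in items if item.get("key1") == key1_value),
        None,
    )

print(key2_for(items, "string1"))
```

The helper also covers the case where no item matches, which plain indexing with [0] would not.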

What does it mean in JSON

{
"messageshow": [
{
"message_id": "497",
"message": "http://flur.p-sites.info/api/messages/voice/1360076234.caff",
"message_pic": "<UIImage: 0xa29e160>",
"uid": "44",
"created": "4 hours ago",
"username": "pari",
"first_name": "pp",
"last_name": "pp",
"profile_pic": "http://flur.p-sites.info/api/uploads/13599968121.jpg",
"tag_user": {
"tags": [
{
"message": "false"
}
]
},
"boos_list": {
"booslist": [
{
"message": "false"
}
]
},
"aplouds_list": {
"aploudslist": [
{
"message": "false"
}
]
},
"total_comments": 0,
"total_boos": 0,
"total_applouds": 0
},
{
"message_id": "496",
"message": "http://flur.p-sites.info/api/messages/voice/1360076182.caff",
"message_pic": "<UIImage: 0xa3b0610>",
"uid": "44",
"created": "4 hours ago",
"username": "pari",
"first_name": "pp",
"last_name": "pp",
"profile_pic": "http://flur.p-sites.info/api/uploads/13599968121.jpg",
"tag_user": {
"tags": [
{
"message": "false"
}
]
},
"boos_list": {
"booslist": [
{
"message": "false"
}
]
},
"aplouds_list": {
"aploudslist": [
{
"message": "false"
}
]
},
"total_comments": 0,
"total_boos": 0,
"total_applouds": 0
}
]
}
In this JSON, most values are in double quotes, but a few are not quoted at all. What does that indicate?
A value displayed without quotes in JSON is treated as a numeric value.
For JSON beginners:
JSON Syntax Rules
JSON syntax is a subset of the JavaScript object notation syntax:
Data is in name/value pairs
Data is separated by commas
Curly braces hold objects
Square brackets hold arrays
JSON data is written as name/value pairs.
A name/value pair consists of a field name (in double quotes), followed by a colon, followed by a value:
"firstName" : "John"
This is simple to understand, and is equivalent to the JavaScript statement:
firstName = "John"
JSON values can be:
A number (integer or floating point)
A string (in double quotes)
A Boolean (true or false)
An array (in square brackets)
An object (in curly brackets)
null
JSON Objects :
JSON objects are written inside curly brackets.
Objects can contain multiple name/value pairs:
{ "firstName":"John" , "lastName":"Doe" }
This is also simple to understand, and is equivalent to the JavaScript statements:
firstName = "John"
lastName = "Doe"
JSON Arrays :
JSON arrays are written inside square brackets.
An array can contain multiple objects:
{
"employees": [
{ "firstName":"John" , "lastName":"Doe" },
{ "firstName":"Anna" , "lastName":"Smith" },
{ "firstName":"Peter" , "lastName":"Jones" }
]
}
In the example above, the "employees" member is an array containing three objects. Each object is a record of a person (with a first name and a last name).
These are the basics of JSON.
For more detail, refer to this site.
Thanks
Values without double quotes are numbers, Booleans (true or false), or null.
Values that start with a square bracket [ ] are arrays.
Values that start with a curly brace { } are JSON objects nested as the value of an attribute.
That depends on the type of the value. If the value is a numeric type, it is written WITHOUT quotes.
If it is not a numeric type, it is written WITH quotes (for example strings, like most values in your example).
In addition to strings JSON supports numerical values. So in this case the values without quotes are simply considered numbers.
They are numeric values. As per the JSON docs:
A value can be a string in double quotes, or a number, or true or
false or null, or an object or an array. These structures can be
nested.
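The difference is easy to see by parsing a small excerpt and looking at the resulting types. The sketch below uses Python's json module on a few fields taken from the JSON in the question; the variable names are made up for this example.

```python
import json

# A few fields from the JSON in the question.
text = """
{
  "message_id": "497",
  "total_comments": 0,
  "tag_user": { "tags": [ { "message": "false" } ] }
}
"""

parsed = json.loads(text)

# Quoted values become strings; the unquoted 0 becomes a number.
types = {key: type(value).__name__ for key, value in parsed.items()}
print(types)

# "false" is in quotes, so it is a string, not a Boolean.
flag = parsed["tag_user"]["tags"][0]["message"]
print(type(flag).__name__, flag)
```

Note that "message_id": "497" stays a string even though it looks like a number, because the quotes decide the type, not the content.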