Is there a function in n1ql similar to mysql's concatenation that will allow me to concatenate two fields, such as:
select
firstName + " " + lastName
from table
From the N1QL documentation (https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/stringops.html):
N1QL provides the concatenation string operator. The result of the
concatenation operator is also a string.
expression || expression

The following example shows concatenation of two strings.
Query:
SELECT fname || " " || lname AS full_name
FROM tutorial
Result:
{
  "results": [
    {
      "full_name": "Dave Smith"
    },
    {
      "full_name": "Earl Johnson"
    },
    {
      "full_name": "Fred Jackson"
    },
    {
      "full_name": "Harry Jackson"
    },
    {
      "full_name": "Ian Taylor"
    },
    {
      "full_name": "Jane Edwards"
    }
  ]
}
If you want to concatenate integers, convert them first with TO_STRING():
SELECT 'My name is ' || fname || ' and my age is ' || TO_STRING(30) || ' years old' AS age
FROM app
{
  "results": [
    {
      "age": "My name is Dave and my age is 30 years old"
    },
    {
      "age": "My name is Earl and my age is 30 years old"
    },
    {
      "age": "My name is Fred and my age is 30 years old"
    },
    {
      "age": "My name is Harry and my age is 30 years old"
    },
    {
      "age": "My name is Ian and my age is 30 years old"
    },
    {
      "age": "My name is Jane and my age is 30 years old"
    }
  ]
}
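Applied to the field names in the question, the query would look something like this (a sketch using the question's own names; the identifier table may need backtick-escaping if it collides with a reserved word):

SELECT firstName || " " || lastName AS fullName
FROM `table`;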
In my data flow I have a column with an array and I need to map it to columns.
Here is an example of the data:
["title:mr","name:jon","surname:smith"]
["surname:jane"]
["title:mrs","surname:peters"]
["title:mr"]
and here is an example of the desired result (each key:value pair mapped to its own column, one row per array):

title | name | surname
mr    | jon  | smith
      |      | jane
mrs   |      | peters
mr    |      |

What's the best approach to achieve this?
You can do this using the combination of derived column, rank and pivot transformations.
Let's say I have the given sample data (array of strings) as a column mycol.
Now, I have used a rank transformation, with id as the name of the rank column and the mycol column as the sort condition (ascending order).
Next, I used a derived column to create a new column, new, with the dynamic expression unfold(mycol).
For some reason this new column's type was not being rendered properly, so I used a cast transformation to make it a complex type, with the complex type definition string[].
In another derived column, I created 2 new columns, key and value, with the following dynamic expressions:
key: split(new[1],':')[1]
value: split(new[1],':')[2]
Finally, I used a pivot transformation: group by on id, key as the pivot column, and max(value) as the pivoted column (an aggregate has to be used).
This gives the required result. The following is the entire data flow JSON (the actual transformations start from rank1, since you already have the array column):
{
  "name": "dataflow1",
  "properties": {
    "type": "MappingDataFlow",
    "typeProperties": {
      "sources": [
        {
          "dataset": {
            "referenceName": "csv1",
            "type": "DatasetReference"
          },
          "name": "source1"
        }
      ],
      "sinks": [
        {
          "dataset": {
            "referenceName": "dest",
            "type": "DatasetReference"
          },
          "name": "sink1"
        }
      ],
      "transformations": [
        {
          "name": "derivedColumn1"
        },
        {
          "name": "rank1"
        },
        {
          "name": "derivedColumn2"
        },
        {
          "name": "cast1"
        },
        {
          "name": "derivedColumn3"
        },
        {
          "name": "pivot1"
        }
      ],
      "scriptLines": [
        "source(output(",
        " mycol as string",
        " ),",
        " allowSchemaDrift: true,",
        " validateSchema: false,",
        " ignoreNoFilesFound: false) ~> source1",
        "source1 derive(mycol = split(replace(replace(replace(mycol,'[',''),']',''),'\"',''),',')) ~> derivedColumn1",
        "derivedColumn1 rank(asc(mycol, true),",
        " output(id as long)) ~> rank1",
        "rank1 derive(new = unfold(mycol)) ~> derivedColumn2",
        "derivedColumn2 cast(output(",
        " new as string[]",
        " ),",
        " errors: true) ~> cast1",
        "cast1 derive(key = split(new[1],':')[1],",
        " value = split(new[1],':')[2]) ~> derivedColumn3",
        "derivedColumn3 pivot(groupBy(id),",
        " pivotBy(key),",
        " {} = max(value),",
        " columnNaming: '$N$V',",
        " lateral: true) ~> pivot1",
        "pivot1 sink(allowSchemaDrift: true,",
        " validateSchema: false,",
        " partitionFileNames:['op.csv'],",
        " umask: 0022,",
        " preCommands: [],",
        " postCommands: [],",
        " skipDuplicateMapInputs: true,",
        " skipDuplicateMapOutputs: true,",
        " saveOrder: 1,",
        " partitionBy('hash', 1)) ~> sink1"
      ]
    }
  }
}
I have a PostgreSQL column that looks like this:
{
  "table": false,
  "time": {
    "user": {
      "type": "admin"
    },
    "end": {
      "Always": null
    },
    "sent": {
      "Never": 1356
    },
    "increments": 5,
    "increment_type": "weeks",
    "type": "days"
  }
}
I would like to extract "increments" (5) and "increment_type" ("weeks") from the JSON. The result would be: column_a = 5 weeks.
Use the dereferencing operators -> and ->> to get what you need:
select concat(
  colname->'time'->>'increments',
  ' ',
  colname->'time'->>'increment_type'
) as column_a
from tablename;
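Note that -> returns jsonb while ->> returns text, which is why only the final step uses ->>. Equivalently, the #>> path operator walks to the nested keys and returns text in one step (a sketch assuming the same colname and tablename):

select colname#>>'{time,increments}'
    || ' '
    || colname#>>'{time,increment_type}' as column_a
from tablename;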
In the documentation of MongoDB Atlas search, it says the following for the autocomplete operator:
query: String or strings to search for. If there are multiple terms in
a string, Atlas Search also looks for a match for each term in the
string separately.
For the text operator, the same thing applies:
query: The string or strings to search for. If there are multiple
terms in a string, Atlas Search also looks for a match for each term
in the string separately.
Matching each term separately seems odd behaviour to me. We need multiple searches in our app, and for each we expect fewer results the more words you type, not more.
Example: When searching for "John Doe", I expect only results with both "John" and "Doe". Currently, I get results that match either "John" or "Doe".
Is this not possible using MongoDB Atlas Search, or am I doing something wrong?
Update
Currently, I have solved it by splitting the search term on spaces (' ') and adding each individual keyword to a separate must sub-clause (with the compound operator). However, the search query then no longer returns any results if one of the keywords is only a single character. To account for that, I treat single-character keywords differently from multi-character ones.
The snippet below works, but for this I need to save two generated fields on each document:
searchString: a string with all the searchable fields concatenated, e.g. "John Doe Man Streetstreet Citycity"
searchArray: the above string uppercased and split on spaces (' ') into an array
const must = [];
const searchTerms = 'John D'.split(' ');
for (let i = 0; i < searchTerms.length; i += 1) {
  if (searchTerms[i].length === 1) {
    must.push({
      regex: {
        path: 'searchArray',
        query: `${searchTerms[i].toUpperCase()}.*`,
      },
    });
  } else if (searchTerms[i].length > 1) {
    must.push({
      autocomplete: {
        query: searchTerms[i],
        path: 'searchString',
        fuzzy: {
          maxEdits: 1,
          prefixLength: 4,
          maxExpansions: 20,
        },
      },
    });
  }
}

db.getCollection('someCollection').aggregate([
  {
    $search: {
      compound: { must },
    },
  },
]).toArray();
Update 2 - Full example of unexpected behaviour
Create collection with following documents:
db.getCollection('testing').insertMany([{
  "searchString": "John Doe ExtraTextHere"
}, {
  "searchString": "Jane Doe OtherName"
}, {
  "searchString": "Doem Sarah Thisistestdata"
}])
Create search index 'default' on this collection:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "searchString": {
        "type": "autocomplete"
      }
    }
  }
}
Do the following query:
db.getCollection('testing').aggregate([
  {
    $search: {
      autocomplete: {
        query: "John Doe",
        path: 'searchString',
        fuzzy: {
          maxEdits: 1,
          prefixLength: 4,
          maxExpansions: 20,
        },
      },
    },
  },
]).toArray();
When a user searches for "John Doe", this query returns all the documents that have either "John" OR "Doe" in the path "searchString". In this example, that means all 3 documents. The more words the user types, the more results are returned. This is not expected behaviour. I would expect more words to match less results because the search term gets more precise.
An edgeGram tokenization strategy might be better for your use case because it works left-to-right.
Try this index definition taken from the docs:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "searchString": [
        {
          "type": "autocomplete",
          "tokenization": "edgeGram",
          "minGrams": 3,
          "maxGrams": 10,
          "foldDiacritics": true
        }
      ]
    }
  }
}
Also, change your query clause from must to filter. That will exclude the documents that do not contain all the tokens.
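For example, the "John Doe" query from Update 2 could be restructured roughly like this (a sketch: the input is split on spaces and each term becomes its own filter clause; collection and field names are taken from the example above):

db.getCollection('testing').aggregate([
  {
    $search: {
      compound: {
        filter: [
          { autocomplete: { query: 'John', path: 'searchString' } },
          { autocomplete: { query: 'Doe', path: 'searchString' } },
        ],
      },
    },
  },
]).toArray();

Since every filter clause has to match, only the "John Doe ExtraTextHere" document should come back, and adding more words narrows the result set instead of widening it.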
Given a jsonb value and a set of keys, how can I get a new jsonb containing only the required keys?
I've tried extracting the key-value pairs into a text[] and then using jsonb_object(text[]). That works well, but the problem comes when a key holds an array of JSON objects.
create table my_jsonb_table
(
  data_col jsonb
);

insert into my_jsonb_table (data_col) values ('{
  "schemaVersion": "1",
  "Id": "20180601550002",
  "Domains": [
    {
      "UID": "29aa2923",
      "quantity": 1,
      "item": "book",
      "DepartmentDomain": {
        "type": "paper",
        "departId": "10"
      },
      "PriceDomain": {
        "Price": 79.00,
        "taxA": 6.500,
        "discount": 0
      }
    },
    {
      "UID": "bbaa2923",
      "quantity": 2,
      "item": "pencil",
      "DepartmentDomain": {
        "type": "wood",
        "departId": "11"
      },
      "PriceDomain": {
        "Price": 7.00,
        "taxA": 1.5175,
        "discount": 1
      }
    }
  ],
  "finalPrice": {
    "totalTax": 13.50,
    "total": 85.0
  },
  "MetaData": {
    "shopId": "1405596346",
    "locId": "95014",
    "countryId": "USA",
    "regId": "255",
    "Date": "20180601"
  }
}');
This is what I am trying to achieve:
SELECT some_magic_fun(data_col,'Id,Domains.UID,Domains.DepartmentDomain.departId,finalPrice.total')::jsonb FROM my_jsonb_table;
I am trying to create that magic function, which extracts the given keys as jsonb. As of now I am able to extract scalar items, put them in a text[], and use jsonb_object, but I don't know how to extract all the elements of an array.
Expected output:
{
  "Id": "20180601550002",
  "Domains": [
    {
      "UID": "29aa2923",
      "DepartmentDomain": {
        "departId": "10"
      }
    },
    {
      "UID": "bbaa2923",
      "DepartmentDomain": {
        "departId": "11"
      }
    }
  ],
  "finalPrice": {
    "total": 85.0
  }
}
I don't know of any magic. You have to rebuild it yourself.
select jsonb_build_object(
  -- Straightforward
  'Id', data_col->'Id',
  'Domains', (
    -- Aggregate all the "rows" back together into an array.
    select jsonb_agg(
      -- Turn each array element into a new object.
      jsonb_build_object(
        'UID', domain->'UID',
        'DepartmentDomain', jsonb_build_object(
          'departId', domain#>'{DepartmentDomain,departId}'
        )
      )
    )
    -- Turn each element of the Domains array into a row.
    from jsonb_array_elements( data_col->'Domains' ) d(domain)
  ),
  -- Also pretty straightforward
  'finalPrice', jsonb_build_object(
    'total', data_col#>'{finalPrice,total}'
  )
) from my_jsonb_table;
This probably is not a good use of a JSON column. Your data is relational and would better fit traditional relational tables.
Suppose I have the 3 collections in MongoDB below. Assume the "commitment_id" value is the ObjectId of "My Commitment" and the "id" values in the funds array refer to "My Fund" and "His Fund". How do I get the list of funds referred to in the funds array of the commitment_attributes collection? From everything I've read it seems like I would need to loop through the cursor and fetch each fund one by one, but is there a way to do this in a single query?
db.commitments.insert({
  commitment_nm: "My Commitment",
  due_dt: "1/1/2016",
  ccy: "USD"
});

// Pass an array so both funds are inserted (a second plain object
// would be treated as an options document).
db.funds.insert([{
  name: "My Fund",
  ccy_iso_cd: "USD"
}, {
  name: "His Fund",
  ccy_iso_cd: "EUR"
}]);

db.commitment_attributes.insert({
  "commitment_id": ObjectId("56fdd7e371d53780e4a63d58"),
  "attribute_nm": "Original Commitment",
  "effective_dt": "1/10/2016",
  "funds": [
    {
      "id": ObjectId("56fdd78c71d53780e4a63d57"),
      "amount": 1000
    },
    {
      "id": ObjectId("56fdd78c71d53780e4a63d57"),
      "amount": 500
    }
  ]
})
var commitment = db.commitments.findOne({ commitment_nm: "My Commitment" });
var attribute = db.commitment_attributes.findOne({ "commitment_id": commitment._id });
var funds = ...... // want to query for only the funds which have their ID mentioned in the attribute var
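One way to avoid fetching each fund one by one is to collect the ids from the funds array and run a single query with $in (a sketch in mongo shell JavaScript, continuing from the variables above):

// Gather the fund ObjectIds referenced by the attribute document.
var fundIds = attribute.funds.map(function (f) { return f.id; });

// Fetch every referenced fund in one query.
var funds = db.funds.find({ _id: { $in: fundIds } }).toArray();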