I want to write a Groovy function in the "transformConfigs" of a table config in Apache Pinot. I want to assign the value 'A' to the column if my timestamp is later than 7:00:00, and 'B' otherwise.
This is what I have tried, but none of these work:
Groovy({LocalDateTime.parse(t, \"yyyy-MM-dd hh:mm:ss.SSS\").toDate().format('HH:mm:ss') > LocalTime.of(7, 0, 0) ? 'A' : 'B'}, t)
Groovy({java.time.format.DateTimeFormatter.LocalDateTime.parse(t, "yyyy-MM-dd hh:mm:ss.SSS").toDate().format('HH:mm:ss') > java.time.format.DateTimeFormatter.LocalTime.of(7, 0, 0) ? 'A' : 'B'}, t)
Groovy({SimpleDateFormat time_now = new SimpleDateFormat(\"yyyy-MM-dd hh:mm:ss.SSS\"); SimpleDateFormat output = new SimpleDateFormat(\"hh:mm:ss.SSS\"); Date d = time_now.parse(t); String f_time_now = output.format(d); String f_shift_time = output.format(\"07:00:00.000\"); def result = formattedTime > f_shift_time ? 'A' : 'B'; return result}, t)
Table config:
{
"REALTIME": {
"tableName": "test_REALTIME",
"tableType": "REALTIME",
"segmentsConfig": {
"schemaName": "test",
"replication": "1",
"timeColumnName": "t",
"allowNullTimeValue": false,
"replicasPerPartition": "1"
},
"tenants": {
"broker": "DefaultTenant",
"server": "DefaultTenant",
"tagOverrideConfig": {}
},
"tableIndexConfig": {
"invertedIndexColumns": [],
"noDictionaryColumns": [],
"streamConfigs": {
"streamType": "kafka",
"stream.kafka.topic.name": "test_topic",
"stream.kafka.broker.list": "localhost:9092",
"stream.kafka.consumer.type": "lowlevel",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
...
},
"rangeIndexColumns": [],
"rangeIndexVersion": 2,
"autoGeneratedInvertedIndex": false,
"createInvertedIndexDuringSegmentGeneration": false,
"sortedColumn": [],
"bloomFilterColumns": [],
"loadMode": "MMAP",
"onHeapDictionaryColumns": [],
"varLengthDictionaryColumns": [],
"enableDefaultStarTree": false,
"enableDynamicStarTreeCreation": false,
"aggregateMetrics": false,
"nullHandlingEnabled": false
},
"metadata": {},
"quota": {},
"routing": {},
"query": {},
"ingestionConfig": {
"transformConfigs": [
{
"columnName": "shift",
"transformFunction": "Groovy({LocalDateTime.parse(t, \"yyyy-MM-dd hh:mm:ss.SSS\").toDate().format('HH:mm:ss') > LocalTime.of(7, 0, 0) ? 'A' : 'B'}, t)"
}
]
},
"isDimTable": false
}
}
Schema:
{
"schemaName": "test",
"dimensionFieldSpecs": [
{
"name": "shift",
"dataType": "STRING"
}
],
"dateTimeFieldSpecs": [
{
"name": "t",
"dataType": "TIMESTAMP",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:MILLISECONDS"
}
]
}
Output: no ingestion takes place.
Thank you in advance. Let me know if I can provide more details.
In our project we use a serverSideDataSource, so the getRows method is called with request params that follow this model:
{
"startRow": 0,
"endRow": 50,
"rowGroupCols": [],
"valueCols": [],
"pivotCols": [],
"pivotMode": false,
"groupKeys": [],
"filterModel": {
"columnA": {
"filterType": "number",
"type": "equals",
"filter": 123
},
"columnB": {
"filterType": "number",
"type": "equals",
"filter": 676
}
},
"sortModel": []
}
The most recent filter was applied to columnB, but nothing in the request indicates which filter was applied last. Please help!
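One possible workaround, sketched under the assumption that the caller keeps the filterModel from the previous request: compare it with the current one and report the columns whose filter was added or changed. The names `changedFilters`, `lastFilterModel`, and `onRequest` are hypothetical and not part of the grid's API.

```javascript
// Hypothetical helper: return the column ids whose filter was added or changed
// between two filterModel objects.
const changedFilters = (previous, current) =>
  Object.keys(current).filter(
    (col) => JSON.stringify(previous[col]) !== JSON.stringify(current[col])
  );

// Remember the filterModel of the previous request (assumption: one grid, one cache).
let lastFilterModel = {};
const onRequest = (request) => {
  const changed = changedFilters(lastFilterModel, request.filterModel);
  lastFilterModel = request.filterModel;
  return changed;
};

// First request filters columnA only, the second one adds columnB.
onRequest({
  filterModel: { columnA: { filterType: "number", type: "equals", filter: 123 } },
});
const latest = onRequest({
  filterModel: {
    columnA: { filterType: "number", type: "equals", filter: 123 },
    columnB: { filterType: "number", type: "equals", filter: 676 },
  },
});
console.log(latest); // [ 'columnB' ]
```

This only detects added or modified filters; a removed filter would need a second pass over the keys of the previous model.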
I read from: https://aws.amazon.com/blogs/database/using-the-data-api-to-interact-with-an-amazon-aurora-serverless-mysql-database/
The RDSDataService client also supports parameterized queries by allowing you to use placeholder parameters in SQL statements. Escaped input values permit the resolution of these parameters at runtime. Parameterized queries are useful to prevent SQL injection attacks.
But when I use it with Postgres and pass the string myname's, it breaks my SQL syntax. I am not sure how RDSDataService deals with SQL injection attacks, as described in the documentation.
Could anyone help explain this, and how to safely handle SQL strings in this case?
UPDATE: My mistake. RDSDataService already escapes string literals when parameterized queries are used.
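As a rough illustration of the update above, the value is passed as a named parameter instead of being concatenated into the SQL text. The table `users`, the column `name`, and the helper `buildStatement` are assumptions made for this sketch; the object it returns follows the `sql` plus `parameters` shape of the Data API's ExecuteStatement input (the real call also needs `resourceArn` and `secretArn`).

```javascript
// Hypothetical helper: build ExecuteStatement input with a named parameter.
// Table and column names are illustrative assumptions.
const buildStatement = (userName) => ({
  sql: "SELECT * FROM users WHERE name = :name",
  parameters: [{ name: "name", value: { stringValue: userName } }],
});

// The quote stays inside the parameter value and never touches the SQL text.
const statement = buildStatement("myname's");
console.log(JSON.stringify(statement, null, 2));
```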
Here is some basic code that takes the values returned from Redshift or Aurora and transforms them into a batch of parameterSets for insertion into the database:
Pass your response, including the metadata, into this function. It parses values as strings or ints. If you need more data types supported, you will have to add more if statements to the function below:
const data =
{
"ColumnMetadata": [
{
"isCaseSensitive": true,
"isCurrency": false,
"isSigned": false,
"label": "dealer_name",
"length": 0,
"name": "dealer_name",
"nullable": 1,
"precision": 255,
"scale": 0,
"schemaName": "raw_data",
"tableName": "xxxxxxxxxxxxxxxxx",
"typeName": "varchar"
},
{
"isCaseSensitive": true,
"isCurrency": false,
"isSigned": false,
"label": "city",
"length": 0,
"name": "city",
"nullable": 1,
"precision": 255,
"scale": 0,
"schemaName": "raw_data",
"tableName": "xxxxxxxxxxxxxxxxx",
"typeName": "varchar"
},
{
"isCaseSensitive": false,
"isCurrency": false,
"isSigned": true,
"label": "vehicle_count",
"length": 0,
"name": "vehicle_count",
"nullable": 1,
"precision": 19,
"scale": 0,
"schemaName": "",
"tableName": "",
"typeName": "int8"
}
],
"Records": [
[
{
"stringValue": "Grand Prairie Ford Inc."
},
{
"stringValue": "Grand Prairie"
},
{
"longValue": 18
}
],
[
{
"stringValue": "Currie Motors Ford of Valpo"
},
{
"stringValue": "Valparaiso"
},
{
"longValue": 16
}
]
],
"TotalNumRows": 2
}
const buildParameterSets = (res) => {
let columns = res.ColumnMetadata.map((c) => [c.name, c.typeName] );//get type and name of column
let data = res.Records.map((r) => {
let arr = r.map((v, i) => {
if (columns[i][1].includes("int")) {
return {
name: columns[i][0],
value: {
longValue: Object.values(v)[0]
}
}
} else {
return {
name: columns[i][0],
value: {
stringValue: Object.values(v)[0]
}
}
}
});
return arr;
});
return data;
};
console.log(buildParameterSets(data));
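If more data types are needed, an alternative to adding further if statements is a lookup from the column's typeName to the Data API field key. The sketch below is only one way to do it: the mapping `TYPE_TO_FIELD` and the helper names are hypothetical, the list of type names is deliberately incomplete, and anything unknown falls back to `stringValue`.

```javascript
// Hypothetical mapping from column typeName to the Data API field key.
// Incomplete on purpose; extend it for the types your tables use.
const TYPE_TO_FIELD = {
  int2: "longValue",
  int4: "longValue",
  int8: "longValue",
  float4: "doubleValue",
  float8: "doubleValue",
  bool: "booleanValue",
  varchar: "stringValue",
  text: "stringValue",
};

const toParameter = (column, field) => {
  // A NULL column comes back as { isNull: true }; pass it through as-is.
  if (field.isNull) {
    return { name: column.name, value: { isNull: true } };
  }
  const key = TYPE_TO_FIELD[column.typeName] || "stringValue";
  return { name: column.name, value: { [key]: Object.values(field)[0] } };
};

const buildTypedParameterSets = (res) =>
  res.Records.map((record) =>
    record.map((field, i) => toParameter(res.ColumnMetadata[i], field))
  );

// Small sample in the same shape as the response above (values are made up).
const sample = {
  ColumnMetadata: [
    { name: "city", typeName: "varchar" },
    { name: "vehicle_count", typeName: "int8" },
    { name: "in_stock", typeName: "bool" },
  ],
  Records: [[{ stringValue: "Valparaiso" }, { longValue: 16 }, { isNull: true }]],
};
console.log(JSON.stringify(buildTypedParameterSets(sample)));
```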
Then you can insert using the BatchExecuteStatementCommand from the AWS SDK:
https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-rds-data/classes/batchexecutestatementcommand.html
import { RDSDataClient, BatchExecuteStatementCommand } from "@aws-sdk/client-rds-data";
const rds_client = new RDSDataClient({ region: "us-east-2" });
// build the parameter sets from the response shown above
const parameterSets = buildParameterSets(data);
let insert_sql = `INSERT INTO dealer_inventory (
dealer_name,
city,
vehicle_count
) VALUES (
:dealer_name,
:city,
:vehicle_count
)`;
try {
// insert data (top-level await requires an ES module)
const insert_params = {
database: "dev",
parameterSets: parameterSets,
sql: insert_sql,
secretArn: process.env.SECRET_ARN,
resourceArn: process.env.RESOURCE_ARN,
};
const insert_command = new BatchExecuteStatementCommand(insert_params);
var insert_response = await rds_client.send(insert_command);
} catch (error) {
console.log("RDS INSERT ERROR");
console.log(error.message);
} finally {
console.log("Inserted: ");
console.log(insert_response);
}
This is my schema:
db.createCollection("user_clicks", {
validator: {
$jsonSchema: {
bsonType: "object",
required: [ "session_id", "country", "browser", "url", "date"],
properties: {
session_id: {
bsonType: "string",
description: "must be a string and is required" },
country: {
bsonType: "string",
description: "country name and is required"},
browser: {
bsonType: "string",
description: "browser name and is required"},
url : {
bsonType: "string",
description: "user click url and is required"},
date: {
bsonType: "date",
description: "localdatetime and is required"}}}})
This is the code I am using to generate data:
mgeneratejs `{
"session_id": "$oid",
"country": "$country",
"browser": {
"$choose": {
"from": [
"Firefox",
"Chrome",
"Safari",
"Explorer"
],
"weights": [
1,
2,
2,
1
]
}
},
"url": {
"$choose": {
"from": [
"google.com/images",
"facebook.com/profile1538713",
"soundcloud.com/playlist03",
"some-url.com/home",
"sinoptik.ua/kyiv"
],
"weights": [
1,
2,
2,
1,
3
]
}
},
"date": {
"$date": {
"min": "2016-08-01T23:59:59.999Z",
"max": "2016-10-01T23:59:59.999Z"
}
}
}` -n 5 | mongoimport --uri="mongodb://localhost:27017/events" --collection user_clicks --mode=insert
I'm trying to generate random dates using mgeneratejs and mongoimport. The problem is that I can't insert any date such as "3/13/2019" or "2019-03-26T23:44:26Z" (the latter is what I actually need). The error is:
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
If I insert the value as new Date("2019-03-26T23:44:26Z"), it works! Please help me either wrap every generated date in new Date(date) automatically on insert, or fix this some other way.
I am confused about what your issue is; everything is working as it should. Dates are stored in Mongo as Date objects (with the type Date).
If you want to store a date such as 3/13/2019 in Mongo, you can do something like this:
// javascript
let dateToInsert = new Date("3/13/2019");
Apply the following to the random date before inserting and it should work:
new Date(randomDate);
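To automate that conversion, a minimal sketch, assuming the generated documents are available as plain objects whose `date` is still a string (the helper name `withDate` and the sample document are hypothetical): map over the documents and wrap each string in `new Date(...)` before inserting, so the value satisfies the `bsonType: "date"` validator.

```javascript
// Hypothetical helper: replace a date string with a Date object so that
// the document passes a bsonType "date" validator.
const withDate = (doc) => ({ ...doc, date: new Date(doc.date) });

// Made-up sample in the shape required by the user_clicks schema.
const generated = [
  {
    session_id: "abc",
    country: "Ukraine",
    browser: "Chrome",
    url: "sinoptik.ua/kyiv",
    date: "2019-03-26T23:44:26Z",
  },
];

const docs = generated.map(withDate);
console.log(docs[0].date instanceof Date); // true
console.log(docs[0].date.toISOString()); // 2019-03-26T23:44:26.000Z
// In the mongo shell the converted documents could then be inserted with:
// db.user_clicks.insertMany(docs)
```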
Using V6 reporting, I created a bubble chart with simple percentage values. I'm able to format the axis values using its "format" property, but in the bubble tooltip the raw value is still shown. Is there a way to show the formatted value instead of the raw value?
Here is the code of the report based on "Sales" schema to reproduce easily:
{
"classID": "ic3.ReportGuts",
"guts_": {
"schemaName": "Sales",
"cubeName": "Sales",
"themeId": "ic3-elegant",
"ic3Rev": 4015,
"layout": {
"classID": "ic3.FixedLayout",
"guts_": {
"grid": 10,
"boxes": [
{
"classID": "ic3.FixedLayoutBox",
"guts_": {
"behaviour": "Fixed Box",
"position": {
"top": 10,
"left": 10,
"width": 930,
"height": 460
},
"advanced": {
"zIndex": 15
},
"header": "#{bubbleTitle}",
"boxStyle": "zoneRequired",
"ic3_uid": "ic3-5",
"widgetAdapterUid": "w1"
}
}
]
}
},
"widgetMgr": {
"classID": "ic3.WidgetAdapterContainerMgr",
"guts_": {
"items": [
{
"classID": "ic3.GoogleChartsAdapter",
"guts_": {
"configState": 3,
"navigationGuts": {
"classID": "ic3.NavigationStrategy",
"guts_": {
"menuVisibility": {
"back": false,
"reset": false
},
"maxAxisMemberCount": 25
}
},
"dataRenderOptions": {
"chartType": {
"label": "Bubble",
"id": "bubble-google-chart",
"proto": {
"options": {
"width": "100%",
"height": "100%",
"bubble": {
"textStyle": {
"fontSize": 10
}
}
},
"chartType": "BubbleChart"
}
},
"axesConfiguration": null,
"graphsConfiguration": null,
"advanced": {
"hAxis": {
"format": "\"##.##%\""
},
"vAxis": {
"format": null
},
"legend": {
"position": "none"
},
"colorAxis": {
"colors": [
"#1F77B4",
"#FF7F0E",
"#2CA02C",
"#D62728",
"#9467BD",
"#8C564B",
"#E377C2",
"#7F7F7F",
"#BCBD22",
"#17BECF"
],
"legend": {
"position": "none"
}
},
"explorer": {},
"sizeAxis": {
"minSize": 7
},
"tooltip": {
"format": null
}
}
},
"ic3_name": "widget-12",
"ic3_eventMapper": {
"classID": "ic3.EventWidgetMapper",
"guts_": {
"__ic3_widgetEventsDescription": {}
}
},
"navigationOptions": {
"menuVisibility": {
"back": false,
"reset": false
}
},
"hooks": {
"beforeData": "/**\n * Return data object\n */\nfunction(context, data, $box) {\n //bubble on data received\n debugger\n\treturn data;\n}",
"beforeRender": "/**\n * Return patched \n * options object.\n */\nfunction(context, options) {\n //bubble before render\n\treturn options;\n}"
},
"ic3_uid": "w1",
"ic3_mdxBuilderUid": "m1"
}
}
]
}
},
"constantMgr": {
"classID": "ic3.ConstantsMgr",
"guts_": {}
},
"cssMgr": {
"classID": "ic3.CssMgr",
"guts_": {}
},
"javascriptMgr": {
"classID": "ic3.ReportJavascriptMgr",
"guts_": {
"js": "/** \n * A function called each time an event is generated. \n * \n * #param context the same object is passed between consumeEvent calls. \n * Can be used to store information. \n * { \n * $report : jQuery context of the report container \n * fireEvent : a function( name, value ) triggering an event \n * } \n * \n * #param event the event information \n * \n { \n * name : as specified in the 'Events' tab \n * value : (optional) actual event value \n * type : (optional) e.g., ic3selection \n * } \n * \n * Check the 'Report Event Names' menu for the list of available events. \n */ \n/* \nfunction consumeEvent( context, event ) { \n if (event.name == 'ic3-report-init') { \n // add your code here \n } \n} \n*/ \n"
}
},
"calcMeasureMgr": {
"classID": "ic3.CalcMeasureMgr",
"guts_": {
"measures": []
}
},
"mdxQueriesMgr": {
"classID": "ic3.MdxQueriesContainerMgr",
"guts_": {
"mdxQueries": {
"classID": "ic3.BaseContainerMgr",
"guts_": {
"items": [
{
"classID": "ic3.QueryBuilderWidget",
"guts_": {
"mode": "MDX",
"options": {
"WIZARD": {
"cubeName": null,
"measures": [],
"rows": [],
"rowsNonEmpty": false,
"columns": [],
"columnsNonEmpty": false,
"filter": []
},
"MDX": {
"statement": "with\nmember [PDM Sejours] as 0.5072 ,format_string=\"percent\"\nmember [Evo PDM] as 0.00291 ,format_string=\"percent\"\nmember [Activité Etablissements] as 8113 ,format_string=\"#,###\"\nselect \n NON EMPTY {[Measures].[PDM Sejours], [Measures].[Evo PDM] , [Measures].[Activité Etablissements]} ON COLUMNS, \nNON EMPTY {[Product].[Product].[Company].[icCube]} on ROWS \nfrom [Sales]\n"
}
},
"ic3_name": "mdx Query-0",
"ic3_uid": "m1",
"schemaSettings": {}
}
}
]
}
},
"mdxFilter": {
"classID": "ic3.BaseContainerMgr",
"guts_": {
"items": []
}
},
"actionBuilders": {
"classID": "ic3.BaseContainerMgr",
"guts_": {
"items": []
}
}
}
},
"customLocalizations": []
}
}
formatting the data for the chart should flow through to the tooltip
I'm not familiar with icCube; however, when loading the data, you can use google's object notation
to provide both the value (v:) and the formatted value (f:)
for example, instead of loading the following data row...
['Sub-Saharan Africa', 80, 1.023],
use object notation...
[{v: 'Sub-Saharan Africa'}, {v: 80, f: 'test 80'}, {v: 1.023, f: 'test 1.023000000'}],
the tooltip should display the value for f:
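for percentage values like the ones in the question, the formatted value can be generated while the rows are built. a small sketch, assuming a hypothetical helper `pctCell` and that two decimals are wanted (the row label and numbers are taken from the question's MDX):

```javascript
// Hypothetical helper: build a {v, f} cell whose formatted text is a percentage.
const pctCell = (value) => ({ v: value, f: (value * 100).toFixed(2) + "%" });

// One row in the DataTable literal format: ID, X, Y.
const rows = [{ c: [{ v: "icCube" }, pctCell(0.5072), pctCell(0.00291)] }];
console.log(JSON.stringify(rows));
```

the `rows` array can be passed as the `rows` property of the `google.visualization.DataTable` literal.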
see following working snippet...
google.charts.load('current', {
callback: function () {
var data = new google.visualization.DataTable({
cols: [
{label: 'ID', type: 'string'},
{label: 'X', type: 'number'},
{label: 'Y', type: 'number'}
],
rows: [
{c:[{v: 'Sub-Saharan Africa'}, {v: 80, f: 'test 80'}, {v: 1.023, f: 'test 1.023000000'}]},
{c:[{v: 'Arab States'}, {v: 80, f: 'test 80'}, {v: 1.022, f: 'test 1.0220000000'}]},
{c:[{v: 'East Asia and the Pacific'}, {v: 80, f: 'test 80'}, {v: 1.21, f: 'test 1.2100000000'}]}
]
});
var container = document.getElementById('chart_div');
var chart = new google.visualization.BubbleChart(container);
chart.draw(data);
},
packages: ['corechart']
});
<script src="https://www.gstatic.com/charts/loader.js"></script>
<div id="chart_div"></div>
I'm trying to implement a service in my play2 app that uses elastic4s to get a document by Id.
My document in elasticsearch:
curl -XGET 'http://localhost:9200/test/venues/3659653'
{
"_index": "test",
"_type": "venues",
"_id": "3659653",
"_version": 1,
"found": true,
"_source": {
"id": 3659653,
"name": "Salong Anna och Jag",
"description": "",
"telephoneNumber": "0811111",
"postalCode": "16440",
"streetAddress": "Kistagången 12",
"city": "Kista",
"lastReview": null,
"location": {
"lat": 59.4045675,
"lon": 17.9502138
},
"pictures": [],
"employees": [],
"reviews": [],
"strongTags": [
"skönhet ",
"skönhet ",
"skönhetssalong"
],
"weakTags": [
"Frisörsalong",
"Frisörer"
],
"reviewCount": 0,
"averageGrade": 0,
"roundedGrade": 0,
"recoScore": 0
}
}
My Service:
@Singleton
class VenueSearchService extends ElasticSearchService[IndexableVenue] {
/**
* Elastic search conf
*/
override def path = "test/venues"
def getVenue(companyId: String) = {
val resp = client.execute(
get id companyId from path
).map { response =>
// transform response to IndexableVenue
response
}
resp
}
If I use getFields() on the response object I get an empty object. But if I call response.getSourceAsString I get the document as json:
{
"id": 3659653,
"name": "Salong Anna och Jag ",
"description": "",
"telephoneNumber": "0811111",
"postalCode": "16440",
"streetAddress": "Kistagången 12",
"city": "Kista",
"lastReview": null,
"location": {
"lat": 59.4045675,
"lon": 17.9502138
},
"pictures": [],
"employees": [],
"reviews": [],
"strongTags": [
"skönhet ",
"skönhet ",
"skönhetssalong"
],
"weakTags": [
"Frisörsalong",
"Frisörer"
],
"reviewCount": 0,
"averageGrade": 0,
"roundedGrade": 0,
"recoScore": 0
}
As you can see, the get request omits this info:
"_index": "test",
"_type": "venues",
"_id": "3659653",
"_version": 1,
"found": true,
"_source": {}
If I try to do a regular search:
def getVenue(companyId: String) = {
val resp = client.execute(
search in "test"->"venues" query s"id:${companyId}"
//get id companyId from path
).map { response =>
Logger.info("response: "+response.toString)
}
resp
}
I get:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "venues",
"_id": "3659653",
"_score": 1,
"_source": {
"id": 3659653,
"name": "Salong Anna och Jag ",
"description": "",
"telephoneNumber": "0811111",
"postalCode": "16440",
"streetAddress": "Kistagången 12",
"city": "Kista",
"lastReview": null,
"location": {
"lat": 59.4045675,
"lon": 17.9502138
},
"pictures": [],
"employees": [],
"reviews": [],
"strongTags": [
"skönhet ",
"skönhet ",
"skönhetssalong"
],
"weakTags": [
"Frisörsalong",
"Frisörer"
],
"reviewCount": 0,
"averageGrade": 0,
"roundedGrade": 0,
"recoScore": 0
}
}
]
}
}
My Index Service:
trait ElasticIndexService [T <: ElasticDocument] {
val clientProvider: ElasticClientProvider
def path: String
def indexInto[T](document: T, id: String)(implicit writes: Writes[T]) : Future[IndexResponse] = {
Logger.debug(s"indexing into $path document: $document")
clientProvider.getClient.execute {
index into path doc JsonSource(document) id id
}
}
}
case class JsonSource[T](document: T)(implicit writes: Writes[T]) extends DocumentSource {
def json: String = {
val js = Json.toJson(document)
Json.stringify(js)
}
}
and indexing:
@Singleton
class VenueIndexService @Inject()(
stuff...) extends ElasticIndexService[IndexableVenue] {
def indexVenue(indexableVenue: IndexableVenue) = {
indexInto(indexableVenue, s"${indexableVenue.id.get}")
}
Why is getFields empty when doing get?
Why is query info left out when doing getSourceAsString in a get request?
Thank you!
What you're hitting in question 1 is that you're not specifying which fields to return. By default ES will return the source and not fields (other than type and _id). See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
I've added a test to elastic4s to show how to retrieve fields, see:
https://github.com/sksamuel/elastic4s/blob/master/src%2Ftest%2Fscala%2Fcom%2Fsksamuel%2Felastic4s%2FSearchTest.scala
I am not sure about question 2.
The fields are empty because Elasticsearch doesn't return them by default.
If you need fields, you must indicate in the query which fields you need.
This is your search query without fields:
search in "test"->"venues" query s"id:${companyId}"
And in this query we indicate which fields we want, in this case 'name' and 'description':
search in "test"->"venues" fields ("name","description") query s"id:${companyId}"
Now you can retrieve the fields:
for(x <- response.getHits.hits())
{
println(x.getFields.get("name").getValue)
}
getSourceAsString returns the document in a get request because the _source parameter is 'on' by default and fields is 'off' by default.
I hope this helps.