CQL SELECT fails with prepareStatement - select

In my sample app, the following snippet works fine.
SELECT_CQL = "SELECT * FROM " + STREAM_NAME_IN_CASSANDRA + " WHERE '" + CONTEXT_ID_COLUMN + "'=?";
Connection connection = getConnection();
statement = connection.prepareStatement(SELECT_CQL);
statement.setString(1, "123");
resultSet = statement.executeQuery();
But when I try to add another parameter to the WHERE clause, the query returns nothing!
SELECT_CQL = "SELECT * FROM " + STREAM_NAME_IN_CASSANDRA + " WHERE '" + CONTEXT_ID_COLUMN + "'=? AND '"+TIMESTAMP_COLUMN+"'=?";
Connection connection = getConnection();
statement = connection.prepareStatement(SELECT_CQL);
statement.setString(1, "123");
statement.setString(2, "1390996577514");
resultSet = statement.executeQuery();
When I try the exact query in the cqlsh terminal, it works fine.

statement.setString(2, "1390996577514");
Double-check the datatype of your TIMESTAMP_COLUMN and make sure that it's a string. Otherwise you'll need to use the appropriate "set" method. Ex:
statement.setLong(2, 1390996577514L);
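For reference, a minimal sketch of the two-parameter version with the timestamp bound as a long (this assumes TIMESTAMP_COLUMN is a bigint in Cassandra; if it is a different type, use the matching setter):
SELECT_CQL = "SELECT * FROM " + STREAM_NAME_IN_CASSANDRA + " WHERE '" + CONTEXT_ID_COLUMN + "'=? AND '" + TIMESTAMP_COLUMN + "'=?";
Connection connection = getConnection();
statement = connection.prepareStatement(SELECT_CQL);
statement.setString(1, "123");            // context id column is text
statement.setLong(2, 1390996577514L);     // timestamp column assumed to be bigint
resultSet = statement.executeQuery();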

Related

Flink SQL Client connect to secured kafka cluster

I want to execute a query on a Flink SQL table backed by a Kafka topic in a secured Kafka cluster. I'm able to execute the query programmatically, but unable to do the same through the Flink SQL client. I'm not sure how to pass the JAAS config (java.security.auth.login.config) and other system properties through the Flink SQL client.
Flink SQL query programmatically
private static void simpleExec_auth() {
// Create the execution environment.
final EnvironmentSettings settings = EnvironmentSettings.newInstance()
.inStreamingMode()
.withBuiltInCatalogName("default_catalog")
.withBuiltInDatabaseName("default_database")
.build();
System.setProperty("java.security.auth.login.config","client_jaas.conf");
System.setProperty("sun.security.jgss.native", "true");
System.setProperty("sun.security.jgss.lib", "/usr/libexec/libgsswrap.so");
System.setProperty("javax.security.auth.useSubjectCredsOnly","false");
TableEnvironment tableEnvironment = TableEnvironment.create(settings);
String createQuery = "CREATE TABLE test_flink11 ( " + "`keyid` STRING, " + "`id` STRING, "
+ "`name` STRING, " + "`age` INT, " + "`color` STRING, " + "`rowtime` TIMESTAMP(3) METADATA FROM 'timestamp', " + "`proctime` AS PROCTIME(), " + "`address` STRING) " + "WITH ( "
+ "'connector' = 'kafka', "
+ "'topic' = 'test_flink10', "
+ "'scan.startup.mode' = 'latest-offset', "
+ "'properties.bootstrap.servers' = 'kafka01.nyc.com:9092', "
+ "'value.format' = 'avro-confluent', "
+ "'key.format' = 'avro-confluent', "
+ "'key.fields' = 'keyid', "
+ "'value.fields-include' = 'EXCEPT_KEY', "
+ "'properties.security.protocol' = 'SASL_PLAINTEXT', 'properties.sasl.kerberos.service.name' = 'kafka', 'properties.sasl.kerberos.kinit.cmd' = '/usr/local/bin/skinit --quiet', 'properties.sasl.mechanism' = 'GSSAPI', "
+ "'key.avro-confluent.schema-registry.url' = 'http://kafka-schema-registry:5037', "
+ "'key.avro-confluent.schema-registry.subject' = 'test_flink6', "
+ "'value.avro-confluent.schema-registry.url' = 'http://kafka-schema-registry:5037', "
+ "'value.avro-confluent.schema-registry.subject' = 'test_flink4')";
System.out.println(createQuery);
tableEnvironment.executeSql(createQuery);
TableResult result = tableEnvironment
.executeSql("SELECT name,rowtime FROM test_flink11");
result.print();
}
This is working fine.
Flink SQL query through SQL client
Running this gives the following error.
Flink SQL> CREATE TABLE test_flink11 (`keyid` STRING,`id` STRING,`name` STRING,`address` STRING,`age` INT,`color` STRING) WITH('connector' = 'kafka', 'topic' = 'test_flink10','scan.startup.mode' = 'earliest-offset','properties.bootstrap.servers' = 'kafka01.nyc.com:9092','value.format' = 'avro-confluent','key.format' = 'avro-confluent','key.fields' = 'keyid', 'value.avro-confluent.schema-registry.url' = 'http://kafka-schema-registry:5037', 'value.avro-confluent.schema-registry.subject' = 'test_flink4', 'value.fields-include' = 'EXCEPT_KEY', 'key.avro-confluent.schema-registry.url' = 'http://kafka-schema-registry:5037', 'key.avro-confluent.schema-registry.subject' = 'test_flink6', 'properties.security.protocol' = 'SASL_PLAINTEXT', 'properties.sasl.kerberos.service.name' = 'kafka', 'properties.sasl.kerberos.kinit.cmd' = '/usr/local/bin/skinit --quiet', 'properties.sasl.mechanism' = 'GSSAPI');
Flink SQL> select * from test_flink11;
[ERROR] Could not execute SQL statement. Reason:
java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is /tmp/jaas-6309821891889949793.conf
There is nothing in /tmp/jaas-6309821891889949793.conf except the following comment
# We are using this file as an workaround for the Kafka and ZK SASL implementation
# since they explicitly look for java.security.auth.login.config property
# Please do not edit/delete this file - See FLINK-3929
SQL client run command
bin/sql-client.sh embedded --jar flink-sql-connector-kafka_2.11-1.12.0.jar --jar flink-sql-avro-confluent-registry-1.12.0.jar
Flink cluster command
bin/start-cluster.sh
How do I pass this java.security.auth.login.config and the other system properties (that I'm setting in the Java code snippet above) to the SQL client?
flink-conf.yaml
# either: use the Kerberos ticket cache
security.kerberos.login.use-ticket-cache: true
security.kerberos.login.principal: XXXXX@HADOOP.COM
# or: use a keytab
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/kafka.keytab
security.kerberos.login.principal: XXXX@HADOOP.COM
security.kerberos.login.contexts: Client,KafkaClient
I haven't actually tested whether this solution works, but you can try it out; I hope it helps.

Flink table sink doesn't work with debezium-avro-confluent source

I'm using Flink SQL to read Debezium Avro data from Kafka and store it as Parquet files in S3. Here is my code:
import os
from pyflink.datastream import StreamExecutionEnvironment, FsStateBackend
from pyflink.table import TableConfig, DataTypes, BatchTableEnvironment, StreamTableEnvironment, \
ScalarFunction
exec_env = StreamExecutionEnvironment.get_execution_environment()
exec_env.set_parallelism(1)
# start a checkpoint every 12 s
exec_env.enable_checkpointing(12000)
t_config = TableConfig()
t_env = StreamTableEnvironment.create(exec_env, t_config)
INPUT_TABLE = 'source'
KAFKA_TOPIC = os.environ['KAFKA_TOPIC']
KAFKA_BOOTSTRAP_SERVER = os.environ['KAFKA_BOOTSTRAP_SERVER']
OUTPUT_TABLE = 'sink'
S3_BUCKET = os.environ['S3_BUCKET']
OUTPUT_S3_LOCATION = os.environ['OUTPUT_S3_LOCATION']
ddl_source = f"""
CREATE TABLE {INPUT_TABLE} (
`event_time` TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL,
`id` BIGINT,
`price` DOUBLE,
`type` INT,
`is_reinvite` INT
) WITH (
'connector' = 'kafka',
'topic' = '{KAFKA_TOPIC}',
'properties.bootstrap.servers' = '{KAFKA_BOOTSTRAP_SERVER}',
'scan.startup.mode' = 'earliest-offset',
'format' = 'debezium-avro-confluent',
'debezium-avro-confluent.schema-registry.url' = 'http://kafka-production-schema-registry:8081'
)
"""
ddl_sink = f"""
CREATE TABLE {OUTPUT_TABLE} (
`event_time` TIMESTAMP,
`id` BIGINT,
`price` DOUBLE,
`type` INT,
`is_reinvite` INT
) WITH (
'connector' = 'filesystem',
'path' = 's3://{S3_BUCKET}/{OUTPUT_S3_LOCATION}',
'format' = 'parquet'
)
"""
t_env.sql_update(ddl_source)
t_env.sql_update(ddl_sink)
t_env.execute_sql(f"""
INSERT INTO {OUTPUT_TABLE}
SELECT *
FROM {INPUT_TABLE}
""")
When I submit the job, I get the following error message,
pyflink.util.exceptions.TableException: Table sink 'default_catalog.default_database.sink' doesn't support consuming update and delete changes which is produced by node TableSourceScan(table=[[default_catalog, default_database, source]], fields=[id, price, type, is_reinvite, timestamp])
I'm using Flink 1.12.1. The source is working properly; I have tested it using a 'print' connector in the sink. Here is a sample data set extracted from the task manager logs when using the 'print' connector as the table sink:
-D(2021-02-20T17:07:27.298,14091764,26.0,9,0)
-D(2021-02-20T17:07:27.298,14099765,26.0,9,0)
-D(2021-02-20T17:07:27.299,14189806,16.0,9,0)
-D(2021-02-20T17:07:27.299,14189838,37.0,9,0)
-D(2021-02-20T17:07:27.299,14089840,26.0,9,0)
-D(2021-02-20T17:07:27.299,14089847,26.0,9,0)
-D(2021-02-20T17:07:27.300,14189859,26.0,9,0)
-D(2021-02-20T17:07:27.301,14091808,37.0,9,0)
-D(2021-02-20T17:07:27.301,14089911,37.0,9,0)
-D(2021-02-20T17:07:27.301,14099937,26.0,9,0)
-D(2021-02-20T17:07:27.302,14091851,37.0,9,0)
How can I make my table sink work with the filesystem connector?
What happens is that:
when receiving the Debezium records, Flink maintains a logical table by adding, updating and removing rows based on their primary key.
the only sinks that can handle that kind of changelog are those that have a concept of update by key. JDBC would be a typical example, in which case it's straightforward for Flink to translate "the row with key foo has been updated to bar" into "the JDBC row with key foo should be updated with value bar". The filesystem sink does not support that kind of operation, since files are append-only.
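For illustration, here is a sketch of what such an update-capable JDBC sink could look like for this data, using the same tableEnv.executeSql(...) pattern as the example further down. The sink table name, the connection URL and the choice of id as primary key are assumptions, and the Flink JDBC connector plus a matching driver would need to be on the classpath; the declared primary key is what lets Flink turn the Debezium changelog into UPDATE/DELETE statements on the target table:
// sketch only: an upsert sink keyed on id, so update/delete changes can be applied in place
tableEnv.executeSql(
    "CREATE TABLE claims_jdbc_sink (\n" +
    "  id BIGINT,\n" +
    "  price DOUBLE,\n" +
    "  `type` INT,\n" +
    "  is_reinvite INT,\n" +
    "  PRIMARY KEY (id) NOT ENFORCED\n" +
    ") WITH (\n" +
    "  'connector' = 'jdbc',\n" +
    "  'url' = 'jdbc:postgresql://localhost:5432/claims_db',\n" +   // hypothetical URL
    "  'table-name' = 'claims'\n" +                                 // hypothetical target table
    ")");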
See also Flink documentation on append and update queries
In practice, in order to do the conversion, we first have to decide what exactly we want to end up with in this append-only file.
If what we want is to have in the file the latest version of each item any time an id is updated, then to my knowledge the way to go would be to convert the table to a DataStream first, and then write that out with a FileSink. Note that in that case the resulting stream contains a boolean saying whether the row is an update or a deletion, and we have to decide how we want this information to appear in the resulting file.
Note: I used this other CDC example from the Flink SQL cookbook to reproduce a similar setup:
// assuming a Flink retract table of claims built from a CDC stream:
tableEnv.executeSql("" +
" CREATE TABLE accident_claims (\n" +
" claim_id INT,\n" +
" claim_total FLOAT,\n" +
" claim_total_receipt VARCHAR(50),\n" +
" claim_currency VARCHAR(3),\n" +
" member_id INT,\n" +
" accident_date VARCHAR(20),\n" +
" accident_type VARCHAR(20),\n" +
" accident_detail VARCHAR(20),\n" +
" claim_date VARCHAR(20),\n" +
" claim_status VARCHAR(10),\n" +
" ts_created VARCHAR(20),\n" +
" ts_updated VARCHAR(20)" +
") WITH (\n" +
" 'connector' = 'postgres-cdc',\n" +
" 'hostname' = 'localhost',\n" +
" 'port' = '5432',\n" +
" 'username' = 'postgres',\n" +
" 'password' = 'postgres',\n" +
" 'database-name' = 'postgres',\n" +
" 'schema-name' = 'claims',\n" +
" 'table-name' = 'accident_claims'\n" +
" )"
);
// convert it to a stream
Table accidentClaims = tableEnv.from("accident_claims");
DataStream<Tuple2<Boolean, Row>> accidentClaimsStream = tableEnv
.toRetractStream(accidentClaims, Row.class);
// and write to file
final FileSink<Tuple2<Boolean, Row>> sink = FileSink
// TODO: adapt the output format here:
.forRowFormat(new Path("/tmp/flink-demo"),
(Encoder<Tuple2<Boolean, Row>>) (element, stream) -> stream.write((element.toString() + "\n").getBytes(StandardCharsets.UTF_8)))
.build();
accidentClaimsStream.sinkTo(sink);
streamEnv.execute(); // streamEnv is the surrounding StreamExecutionEnvironment
Note that during the conversion, you obtain a boolean telling you whether that row is a new value for that accident claim, or a deletion of that claim. My basic FileSink config there just includes that boolean in the output, although how to handle deletions is to be decided case by case.
The result in the file then looks like this:
head /tmp/flink-demo/2021-03-09--09/.part-c7cdb74e-893c-4b0e-8f69-1e8f02505199-0.inprogress.f0f7263e-ec24-4474-b953-4d8ef4641998
(true,1,4153.92,null,AUD,412,2020-06-18 18:49:19,Permanent Injury,Saltwater Crocodile,2020-06-06 03:42:25,IN REVIEW,2021-03-09 06:39:28,2021-03-09 06:39:28)
(true,2,8940.53,IpsumPrimis.tiff,AUD,323,2019-03-18 15:48:16,Collision,Blue Ringed Octopus,2020-05-26 14:59:19,IN REVIEW,2021-03-09 06:39:28,2021-03-09 06:39:28)
(true,3,9406.86,null,USD,39,2019-04-28 21:15:09,Death,Great White Shark,2020-03-06 11:20:54,INITIAL,2021-03-09 06:39:28,2021-03-09 06:39:28)
(true,4,3997.9,null,AUD,315,2019-10-26 21:24:04,Permanent Injury,Saltwater Crocodile,2020-06-25 20:43:32,IN REVIEW,2021-03-09 06:39:28,2021-03-09 06:39:28)
(true,5,2647.35,null,AUD,74,2019-12-07 04:21:37,Light Injury,Cassowary,2020-07-30 10:28:53,REIMBURSED,2021-03-09 06:39:28,2021-03-09 06:39:28)
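If, for example, you decide you don't want deletions in the file at all, you could filter the retract stream on that boolean before handing it to the sink. A small sketch reusing the names from the snippet above:
// keep only inserts/updates (flag = true) and drop retractions/deletions
accidentClaimsStream
    .filter(change -> change.f0)
    .sinkTo(sink);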

parameterize queries in OrientDB using placeholders

I am using pyorient and want to parameterize queries -- I am assuming that command() allows placeholders, but I am not able to find any documentation. In particular I'd like to use dict() args as per postgres' %(name)s construct, but could make tuples/lists work as well.
I tried your case with my python_test database:
Dataset:
I used two parameters:
name ---> string;
surname ---> string;
and I passed them into the command() function.
PyOrient Code:
import pyorient
db_name = 'python_test'
user = 'root'
pwd = 'root'
print("Connecting...")
client = pyorient.OrientDB("localhost", 2424)
session_id = client.connect(user, pwd)
print("OK - sessionID: ", session_id, "\n")
if client.db_exists(db_name, pyorient.STORAGE_TYPE_PLOCAL):
    print("DB " + db_name + " already exists\n")
    client.db_open(db_name, user, pwd, pyorient.DB_TYPE_GRAPH, pyorient.STORAGE_TYPE_PLOCAL)
    name = 'test name'
    surname = 'test surname'
    vertexes = client.command("SELECT FROM TestClass WHERE name = '" + name + "' AND surname = '" + surname + "'")
    for vertex in vertexes:
        print(vertex)
else:
    print("Creating DB " + db_name + "...")
    client.db_create(db_name, pyorient.DB_TYPE_GRAPH, pyorient.STORAGE_TYPE_PLOCAL)
    print("DB " + db_name + " created\n")
client.db_close()
Output:
Connecting...
OK - sessionID: 40
DB python_test already exists
{'#TestClass':{'surname': 'test surname', 'name': 'test name'},'version':1,'rid':'#12:0'}
Hope it helps

Logging mongodb query in java

I am using mongo-java-driver-2.9.1 for interacting with MongoDB, and I want to log the queries that are fired against the MongoDB server. For example, in Java this is the code I write for inserting a document:
DBCollection coll = db.getCollection("mycollection");
BasicDBObject doc = new BasicDBObject("name", "MongoDB")
.append("type", "database")
.append("count", 1);
coll.insert(doc);
For this, the equivalent code in the "mongo" client for inserting the document is:
db.mycollection.insert({
"name" : "MongoDB",
"type" : "database",
"count" : 1
})
I want to log this second form; is there any way to do it?
I think the MongoDB Java driver does not have logging support for this, so you have to build the log message yourself. Here is an example:
DBCollection coll = db.getCollection("mycollection");
BasicDBObject doc = new BasicDBObject("name", "MongoDB")
        .append("type", "database")
        .append("count", 1);
WriteResult insert = coll.insert(doc);
String msg = "";
if (insert.getError() == null) {
    msg = "insert into: " + coll.toString() + " ; Object " + doc.toString();
} else {
    msg = "ERROR by insert into: " + coll.toString() + " ; Object " + doc.toString();
    msg = msg + " Error message: " + insert.getError();
}
//log the message
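If what you want in the log is the mongo-shell-equivalent statement itself, you can assemble it from the collection name and the document. A minimal sketch (the helper name is mine; it uses the same com.mongodb.DBCollection and com.mongodb.DBObject types as above, and note that BasicDBObject.toString() produces JSON that is close to, but not byte-for-byte identical to, what you would type in the shell):
private static String toShellInsert(DBCollection coll, DBObject doc) {
    // e.g. db.mycollection.insert({ "name" : "MongoDB" , "type" : "database" , "count" : 1})
    return "db." + coll.getName() + ".insert(" + doc.toString() + ")";
}
String msg = toShellInsert(coll, doc); // then log msg with your logger of choice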

Enable Caching for all reports in SSRS Report Server

I have more than 100 reports on the SSRS report server and I need to enable caching for all of them. Right now I am enabling caching through Report Manager for each and every report.
Can we add caching in any of the report server's config files, so that caching can be enabled for all reports in a single place?
Any help will be appreciated.
Thanks
AJ
Below is the script that I used to enable caching in minutes on a list of reports.
Save it as setreportscaching.rss and then run it from the command line:
rs.exe -i setreportscaching.rss -e Mgmt2010 -t -s http://mySsrsBox:8080/ReportServer -v ReportNamesList="OneReport,AnotherReport,YetAnotherOne" -v CacheTimeMinutes="333" -v TargetFolder="ReportsFolderOnServer"
It is easy to modify it to loop through files in some folder rather than take a CSV list of reports. It has some simple diagnostic output that can be commented out for speed.
Public Sub Main()
Dim reportNames As String() = Nothing
Dim reportName As String
Dim texp As TimeExpiration
Dim reportPath As String
Console.WriteLine("Looping through reports: {0}", ReportNamesList)
reportNames = ReportNamesList.Split(","c)
For Each reportName In reportNames
texp = New TimeExpiration()
texp.Minutes = CacheTimeMinutes
reportPath = "/" + TargetFolder + "/" + reportName
'feel free to comment out this diagnostics to speed things up
Console.WriteLine("Current caching for " + reportName + DisplayReportCachingSettings(reportPath))
'this call sets desired caching option
rs.SetCacheOptions(reportPath, true, texp)
'feel free to comment out this diagnostics to speed things up
Console.WriteLine("New caching for " + reportName + DisplayReportCachingSettings(reportPath))
Next
End Sub
Private Function DisplayReportCachingSettings(reportPath As String) As String
Dim isCacheSet As Boolean
Dim expItem As ExpirationDefinition = New ExpirationDefinition()
Dim theResult As String
isCacheSet = rs.GetCacheOptions(reportPath, expItem)
If isCacheSet = false Or expItem is Nothing Then
theResult = " is not defined."
Else
If expItem.GetType.Name = "TimeExpiration" Then
theResult = " is " + (CType(expItem, TimeExpiration)).Minutes.ToString() + " minutes."
ElseIf expItem.GetType.Name = "ScheduleExpiration" Then
theResult = " is a schedule"
Else
theResult = " is " + expItem.GetType.Name
End If
End If
DisplayReportCachingSettings = theResult
End Function