MongoDB Scala - query document for a specific field value - mongodb

So I know that in Mongo Shell, you use dot notation to get the field you want in any document.
How is dot notation achieved in MongoDB Scala. I'm confused as to how it works. Here is the code that fetches a document from a collection:
val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)
EDIT:
I'm trying to work on a mechanism to basically re-consume Kafka records at a point where the consumer was shutdown. To do this, I store my kafka records in an external database, and then try to fetch the most recent offset from there and start consuming from that point. Here is my Scala method that should do that:
def getLatestCommitOffsetFromDB(collectionName: String): Long = {
import com.mongodb.Block
import org.bson.Document
val printBlock = new Block[Document]() {
override def apply(document: Document): Unit = {
println(document.toJson)
}
}
import com.mongodb.async.SingleResultCallback
val callbackWhenFinished = new SingleResultCallback[Void]() {
override def onResult(result: Void, t: Throwable): Unit = {
System.out.println("Latest offset fetched from database.")
}
}
var obj: String = " "
try {
val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)
//TODO FIND A WAY TO GET THE VALUE AND STORE IT IN A VARIABLE
} catch {
case e: RuntimeException =>
logger.error(s"MongoDB Server Error : Unable to fetch data from collection : $collection")
logger.error(e.printStackTrace().toString())
}
obj.toLong
}
The problem isn't that I can fetch documents from Mongo, more-so that I'm trying to access a particular field in Mongo. The Document has four fields in it: topic, partition, message, and offset. I want to get the "offset" field and store that in a variable, so I can use it as a restarting point to re-consume Kafka records.
where do I go from there?
POM.xml
<?xml version="1.0" encoding="UTF-8"?>
http://maven.apache.org/xsd/maven-4.0.0.xsd">
4.0.0
<groupId>OffsetManagementPoC</groupId>
<artifactId>OffsetManagementPoC</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.12</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.8</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.25</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>0.10.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.6.5</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.6.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.6.5</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>casbah_2.12</artifactId>
<version>3.1.1</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>org.mongodb.scala</groupId>
<artifactId>mongo-scala-driver_2.12</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.8</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongo-java-driver</artifactId>
<version>3.4.2</version>
</dependency>
<dependency>
<groupId>org.mongodb.scala</groupId>
<artifactId>mongo-scala-driver_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>bson</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-async</artifactId>
<version>3.4.3</version>
</dependency>
<dependency>
<groupId>org.mongodb.scala</groupId>
<artifactId>mongo-scala-bson_2.11</artifactId>
<version>2.1.0</version>
</dependency>
</dependencies>

You can modify your query this way:
import com.mongodb.MongoClient
import com.mongodb.client.MongoCollection
import com.mongodb.client.model.Projections
def getLatestCommitOffsetFromDB(
databaseName: String,
collectionName: String
): Long = {
val mongoClient = new MongoClient("localhost", 27017);
val collection =
mongoClient.getDatabase(databaseName).getCollection(collectionName)
val record = collection
.find()
.projection(
Projections
.fields(Projections.include("offset"), Projections.excludeId()))
.first
record.get("offset").asInstanceOf[Double].toLong
}
I think you were missing the com.mongodb.client.model.Projections imports in order to use fields, include and excludeId
I used first instead of limit(1) to make it easier to extract the result.
first returns a Document object on which you can call get to retrieve the value of the requested field.
But in fact, since you just want one record and one field, you can remove the projection!:
val record = collection.find().first

According to the documentation, collection.find() accepts a com.mongodb.DBObject
One of the implementations of that interface that you can use is BasicDBObject which is basically like a mutable.Map[String, Object]. You can use the constructor which accepts a map like:
val query = new com.mongodb.BasicDBObject(Map(
"foo.bar" -> "value1"
"bar.foo" -> "value2"
))
val record = collection.find(query)....

Related

Custom database query doesn't work, what could be the problem?

I found out that this code doesn't injects the parameters to the SQL query.
#Query(value = "SELECT * FROM schema.table WHERE schema.table.name = ':name'", nativeQuery = true)
Optional findByName(#Param("name") String name);
In this case the database receives this:
SELECT * FROM schema.table WHERE schema.table.name = ':name'
I tried with #Param too, then this is the case:
#Query(value = "SELECT * FROM schema.table WHERE schema.table.name = '?1'", nativeQuery = true)
Optional findByName(String name);
Then the database receives this:
SELECT * FROM schema.table WHERE schema.table.name = '?1'
I tried without the single apostrophes (') as well.
Without #Query:
Optional findByName(String name);
The DB receives this:
... WHERE myclass.name = $1
What could be the problem? I saw this solution many times and it worked for them. Basis queries are working, so that's not the problem I can't access the DB or something like that.
My application.properties:
spring.datasource.url=jdbc:postgresql://localhost:5432/postgres
spring.datasource.username=
spring.datasource.password=
spring.jpa-hibernate.ddl-auto=create-drop
spring-jpa.show-sql=false
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgresPlusDialect
spring.jpa.properties.hibernate.format_sql=true
And my pom.xml:
<properties>
<maven.compiler.source>18</maven.compiler.source>
<maven.compiler.target>18</maven.compiler.target>
<spring.version>2.6.6</spring.version>
<tomcat7-maven-plugin.version>2.2</tomcat7-maven-plugin.version>
<maven-war-plugin.version>3.3.2</maven-war-plugin.version>
<javax.servlet-api.version>4.0.1</javax.servlet-api.version>
<javax.persistence-api.version>2.2</javax.persistence-api.version>
<spring-data-jpa.version>2.6.4</spring-data-jpa.version>
<postgresql.version>42.3.4</postgresql.version>
<commons-lang3.version>3.12.0</commons-lang3.version>
</properties>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<version>${spring.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
<version>${spring.version}</version>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>${postgresql.version}</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>javax.xml.bind</groupId>
<artifactId>jaxb-api</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>javax.persistence</groupId>
<artifactId>javax.persistence-api</artifactId>
<version>${javax.persistence-api.version}</version>
</dependency>
<dependency>
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<version>${javax.servlet-api.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>${commons-lang3.version}</version>
</dependency>
</dependencies>
</dependencyManagement>

gRPC spring-boot-starter can not bind #GRpcGlobalInterceptor?

I m using
<dependency>
<groupId>org.lognet</groupId>
<artifactId>grpc-spring-boot-starter</artifactId>
<version>2.1.4</version>
<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.zipkin.brave</groupId>
<artifactId>brave-instrumentation-grpc</artifactId>
<version>4.13.1</version>
</dependency>
<dependency>
<groupId>io.zipkin.brave</groupId>
<artifactId>brave</artifactId>
<version>4.13.1</version>
</dependency>
and I want to use brave-instrumentation-grpc monitor my grpc server application. So I followed advice below:
https://github.com/LogNet/grpc-spring-boot-starter#interceptors-support
https://github.com/openzipkin/brave/tree/master/instrumentation/grpc
#Configuration
public class GrpcFilterConfig{
#GRpcGlobalInterceptor
#Bean
public ServerInterceptor globalInterceptor(){
Tracing tracing = Tracing.newBuilder().build();
GrpcTracing grpcTracing = GrpcTracing.create(tracing);
return grpcTracing.newServerInterceptor();
}
}
The question is : The GlobalInterceptor i defined does not bind to grpcserver.
debug deep insde the GRpcServerRunner.class, it seems that the code does not return beansWithAnnotation
Map<String, Object> beansWithAnnotation = this.applicationContext.getBeansWithAnnotation(annotationType);
beansWithAnnotation is null.
Is there something wrong with my usage?

Kafka-spark Streaming processing jobs synchronically

Im trying a simple test where i use Kafka-connect and spark
I wrote a custom kafka-connect that creates this source record
SourceRecord sr = new SourceRecord(null,
null,
destTopic,
Schema.STRING_SCHEMA,
cleanPath);
in the spark i receive this message like this
val kafkaConsumerParams = Map[String, String](
"metadata.broker.list" -> prop.getProperty("kafka_host"),
"zookeeper.connect" -> prop.getProperty("zookeeper_host"),
"group.id" -> prop.getProperty("kafka_group_id"),
"schema.registry.url" -> prop.getProperty("schema_registry_url"),
"auto.offset.reset" -> prop.getProperty("auto_offset_reset")
)
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaConsumerParams, topicsSet)
val ds = messages.foreachRDD(rdd => {
val toPrint = rdd.map(t => {
val file_path = t._2
val startTime = DateTime.now()
Thread.sleep(1000 * 60)
1
}).sum()
LogUtils.getLogger(classOf[DeviceManager]).info(" toPrint = " + toPrint +" (number of flows calculated)")
})
}
when i use the connector to send multiple message to the desired topic ( in my test it had 6 partitions)
The sleep thread gets all the messages, but preforms them synchronically instead of asynchronically.
When i create a simple test producer, the sleeps are done asynchronically.
I Also created 2 simple consumers, and tried both the connector and a producer, and both task were consumed asynchronically
which means my problems lays with the way the spark is receiving the messages sent from the connector.
I cant figure why the tasks are not acting the same way as they do when i send it from a producer.
i even printed the record the spark recieves and they are exactly the same
producer sent record
1: {partition=2, offset=11, value=something, key=null}
2: {partition=5, offset=9, value=something2, key=null}
connect sent record
1: {partition=3, offset=9, value=something, key=null}
the versions used in my projects are
<scala.version>2.11.7</scala.version>
<confluent.version>4.0.0</confluent.version>
<kafka.version>1.0.0</kafka.version>
<java.version>1.8</java.version>
<spark.version>2.0.0</spark.version>
dependencies
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-avro-serializer</artifactId>
<version>${confluent.version}</version>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-schema-registry-client</artifactId>
<version>${confluent.version}</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.8.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.11</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_2.11</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.0.0-RC1</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.8.0</version>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-avro-serializer</artifactId>
<version>${confluent.version}</version>
<scope>${global.scope}</scope>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-connect-avro-converter</artifactId>
<version>${confluent.version}</version>
<scope>${global.scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>connect-api</artifactId>
<version>${kafka.version}</version>
</dependency>
We cannot run Spark-Kafka streaming jobs asynchronously. But we can run them in parallel, as Kafka consumer(s) do. For that, we need to set following configuration in SparkConf():
sparkConf.set("spark.streaming.concurrentJobs","4")
By default, its value is "1". But we can override it to a higher value.
I hope this helps!

java.lang.ClassNotFoundException: io.jsonwebtoken.Jwts when using JJWT JSON Web Token

When I am trying to use JJWT from Stormpath, it is throwing a run time Exception java.lang.ClassNotFoundException: io.jsonwebtoken.Jwts. I am using Jersey2 embedded on GlassFish 4.1; here is the code that is throwing the exception:
private String issueToken(String login) {
Key key = keyGenerator.generateKey();
//Key key = MacProvider.generateKey();
String jwtToken = Jwts.builder()
.setIssuer(uriInfo.getAbsolutePath().toString())
//.setIssuer("http://trustyapp.com/")
.setSubject(login)
.setIssuedAt(new Date())
.setExpiration(toDate(LocalDateTime.now().plusMinutes(15L)))
.signWith(SignatureAlgorithm.HS512, key)
.compact();
logger.info("#### generating token for a key : " + jwtToken + " - " + key);
return jwtToken;
}
I have imported io.jsonwebtoken.Jwts and my pom.xml has :
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.8.2</version>
<scope>compile</scope>
</dependency>
i also tried it without the above dependency in case the below dependency which is on my pom.xml is enough:
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt</artifactId>
<version>0.7.0</version>
<scope>compile</scope>
</dependency>
I tried the recommendations from this and this but it did not work, please help
The problem is solved after adding the following dependencies into my pom.xml:
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-common</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-jdk-http</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-client</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-server</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-servlet</artifactId>
<version>${version.jersey}</version>
</dependency>
I assumed that such dependencies are not required since i am using Jersey 2 which is embedded on the GlassFish4.1.1 Server.

Scala: java.lang.ClassNotFoundException: com.sun.jersey.core.provider.jaxb.AbstractRootElementProvider

I have the following piece of code to make a call to receive a token and i am getting the jersey error.Caused by: java.lang.ClassNotFoundException: com.sun.jersey.core.provider.jaxb.AbstractRootElementProvider i am not able to figure out why. I have given the pom as well.. Can anyone suggest whats the issue.?
import javax.ws.rs.client.{ClientBuilder, Entity}
import javax.ws.rs.core.{MediaType, Response}
 
import com.google.gson.JsonParser
import com.rms.execution._
import com.rms.transform.constants.ApiConstants
import com.rms.transform.constants.ImportPayloadSettingParameters._
import com.rms.transform.task.api.TaskConstants
import com.rms.transform.task.api.TaskConstants._
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.json.{JSONException, JSONObject}
import org.junit.runner.RunWith
import org.scalatest.junit.JUnitRunner
import org.scalatest.{BeforeAndAfter, FunSuite}
import org.slf4j.{Logger, LoggerFactory}
 
import scala.collection.immutable.HashMap
import scala.collection.mutable.ListBuffer
import scala.util.control.Breaks._
test("Do a datastore commit ) {
val jobId: Integer = 1
 
val token: String = getToken
}
#throws[JSONException]
private def getToken : String = {
val body = HashMap("username" -> “a”,”tenant”->”b”,”password"->“c”)
var bearerToken : String = ""
val client = ClientBuilder.newClient
val response = client.target(“http://”).request(MediaType.APPLICATION_JSON).post(Entity.json(body))
if (response != null){
val tokenResponse = response.readEntity(classOf[String])
val json :JSONObject = new JSONObject(tokenResponse);
val token :String = json.get("token").toString();
bearerToken = "Bearer " + token
}
 
bearerToken
}
POM is below..
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>${commons-lang3.version}</version>
</dependency>
<dependency>
<groupId>javax.ws.rs</groupId>
<artifactId>javax.ws.rs-api</artifactId>
<version>${javax.ws.rs-api.version}</version>
</dependency>
<dependency>
<groupId>com.rms</groupId>
<artifactId>import-hdfs</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<artifactId>jersey-core</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-server</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-client</artifactId>
<groupId>com.sun.jersey</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>uk.co.datumedge</groupId>
<artifactId>hamcrest-json</artifactId>
<version>${hamcrest-json.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.rms</groupId>
<artifactId>execution-common</artifactId>
<version>${jobs.version}</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>${scalatest.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
</dependencies>