Spring Data MongoDB logging mongoTemplate query

I want to see the native mongodb query created by Spring Data.
I tried the following.
Created an application.properties file under META-INF/resources and added the following:
logging.level.org.springframework.data.mongodb.core.MongoTemplate=DEBUG
Created a log4j.properties file under META-INF/resources and added the following:
log4j.category.org.springframework.data.document.mongodb=DEBUG
log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %5p %40.40c:%4L - %m%n
In my code:
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
...
private static final Log log = LogFactory.getLog(abc.class);
...
log.info(mongoTemplate.aggregate(aggregation, xyz.class, xyz.class));
I am not able to find any query in my Jetty console. Not sure what I am missing.


Spring Batch / Postgres : ERROR: relation "batch_job_instance" does not exist

I am trying to configure Spring Batch to use PostGres DB. I have included the following dependencies in my build.gradle.kts file:
implementation("org.springframework.boot:spring-boot-starter-data-jpa")
implementation("org.postgresql:postgresql")
My application.yml for my SpringBatch module has the following included:
spring:
  datasource:
    url: jdbc:postgresql://postgres:5432/springbatchdb
    username: postgres
    password: root
    driverClassName: org.postgresql.Driver
docker-compose.yml:
postgres:
  restart: always
  image: postgres:12-alpine
  container_name: postgres
  environment:
    - POSTGRES_USER=postgres
    - POSTGRES_PASSWORD=root
    - POSTGRES_DB=springbatchdb
  ports:
    - "5432:5432"
  volumes:
    - postgresql:/var/lib/postgresql
    - postgresql_data:/var/lib/postgresql/data
However, when I try to add a data file I see the following error in the logs of both my SpringBatch Docker container, and the PostGres container:
Spring Batch:
<<< Exception in method: org.meanwhileinhell.spring.batch.server.SpringBatchController.handle Error Message: PreparedStatementCallback; bad SQL grammar [SELECT JOB_INSTANCE_ID, JOB_NAME from BATCH_JOB_INSTANCE where JOB_NAME = ? and JOB_KEY = ?]; nested exception is org.postgresql.util.PSQLException: ERROR: relation "batch_job_instance" does not exist
PostGres:
LOG: database system is ready to accept connections
2021-01-08 09:54:56.778 UTC [56] ERROR: relation "batch_job_instance" does not exist at character 39
2021-01-08 09:54:56.778 UTC [56] STATEMENT: SELECT JOB_INSTANCE_ID, JOB_NAME from BATCH_JOB_INSTANCE where JOB_NAME = $1 and JOB_KEY = $2
2021-01-08 09:55:27.033 UTC [56] ERROR: relation "batch_job_instance" does not exist at character 39
2021-01-08 09:55:27.033 UTC [56] STATEMENT: SELECT JOB_INSTANCE_ID, JOB_NAME from BATCH_JOB_INSTANCE where JOB_NAME = $1 and JOB_KEY = $2
I can see that the Spring Batch server is picking up POSTGRES from my metadata OK.
JobRepositoryFactoryBean : No database type set, using meta data indicating: POSTGRES
What am I missing to get the initial db configured during the server start?
Edit: I've tried adding spring.datasource.initialize=true explicitly, but no change.
Please check that the following is added in application.yml:
spring.batch.initialize-schema: always
Please check that the below dependency is added:
<artifactId>spring-boot-starter-batch</artifactId>
The configuration file is:
spring.datasource.url=jdbc:postgresql://localhost:5432/postgres
spring.datasource.username=postgres
spring.datasource.password=1234
spring.datasource.driver-class-name=org.postgresql.Driver
spring.batch.jdbc.initialize-schema=always
gradle dependencies
dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-jdbc'
    implementation 'org.springframework.boot:spring-boot-starter-batch'
    implementation 'org.projectlombok:lombok-maven-plugin:1.18.6.0'
    implementation group: 'org.postgresql', name: 'postgresql', version: '42.3.1'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testImplementation 'org.springframework.batch:spring-batch-test'
}
You need to set spring.batch.initialize-schema=always property to tell Spring Boot to create Spring Batch tables automatically. Please refer to the Initialize a Spring Batch Database section of Spring Boot's reference documentation for more details.
For anyone who has spring.batch.initialize-schema=always set already and it's still not working, also verify that you are connecting to the database with a user that has sufficient privileges, including to create the necessary tables.
In application.properties:
Prior to Spring Boot 2.5, use
spring.batch.initialize-schema=ALWAYS
From Spring Boot 2.5 onwards, use
spring.batch.jdbc.initialize-schema=ALWAYS
Solution that worked for me with Spring Boot 3.0 / Spring Batch 5.0!
I spent a lot of time resolving issues like ERROR: relation "X" does not exist when using the latest Spring Boot Starter 3.0 and Spring Batch 5.0. I had set
spring.batch.jdbc.initialize-schema=always
but it still didn't create the necessary tables for me, even though, as per the documentation, it should have.
After a lot of research, I found that the latest Spring Batch 5.0 brings a lot of improvements, and I was doing a lot of things wrong when migrating to it.
Remove @EnableBatchProcessing from your configuration classes, as you don't need it anymore with the latest Spring Batch 5.
Example:
@Configuration
@AllArgsConstructor
@EnableBatchProcessing // please remove it
public class SpringBatchConfiguration {}
change it to:
@Configuration
@AllArgsConstructor
public class SpringBatchConfiguration {}
PlatformTransactionManager: the second thing I was doing wrong was using an incorrect transaction manager. If you are using JPA for persisting entities, you need a corresponding transaction manager.
I was using ResourcelessTransactionManager(), which was wrong in my case and was creating a lot of headaches at runtime.
For JPA you need a JpaTransactionManager().
Something like:
@Bean
public PlatformTransactionManager transactionManager() {
    return new JpaTransactionManager();
}
The third thing I learned from my mistakes: we don't need to create a DataSource bean ourselves unless we are doing something complex, like having two DataSources, one for the Spring Batch metadata tables and another for persisting our business data (a sketch of that case is included after the example below).
Wherever a JobRepository is required, just inject it.
Something like:
@Bean
Job job(JobRepository jobRepository) {
    JobBuilder jobBuilder = new JobBuilder("somename", jobRepository);
    return jobBuilder.flow(step1(jobRepository)).end()
            .build();
}
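The step1(jobRepository) used above isn't shown in the answer. As a minimal sketch only (assuming the step lives in the same configuration class as the transactionManager() bean from earlier, and using a do-nothing tasklet as a placeholder for the real work), a Spring Batch 5 style Step bean can be built straight from the JobRepository:
// Imports needed for this snippet:
// org.springframework.batch.core.Step
// org.springframework.batch.core.repository.JobRepository
// org.springframework.batch.core.step.builder.StepBuilder
// org.springframework.batch.repeat.RepeatStatus
@Bean
Step step1(JobRepository jobRepository) {
    // Placeholder tasklet; replace with real tasklet or chunk-oriented logic.
    // transactionManager() is the JpaTransactionManager bean defined above.
    return new StepBuilder("step1", jobRepository)
            .tasklet((contribution, chunkContext) -> RepeatStatus.FINISHED, transactionManager())
            .build();
}
And for the "two DataSources" case mentioned earlier (one for the Spring Batch metadata tables, one for business data), a hedged sketch using Spring Boot's @BatchDataSource qualifier could look like the following; the property prefixes app.datasource and batch.datasource are made up for the example, and the JPA side (entity manager factory and so on) is not shown:
import javax.sql.DataSource;
import org.springframework.boot.autoconfigure.batch.BatchDataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DataSourceConfiguration {

    // Primary DataSource for business data.
    @Bean
    @Primary
    @ConfigurationProperties("app.datasource")
    DataSource appDataSource() {
        return DataSourceBuilder.create().build();
    }

    // Dedicated DataSource for the BATCH_* metadata tables; Spring Boot's
    // Batch auto-configuration and schema initialization should use this one.
    @Bean
    @BatchDataSource
    @ConfigurationProperties("batch.datasource")
    DataSource batchDataSource() {
        return DataSourceBuilder.create().build();
    }
}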
For more details on migration, see the Spring Boot 3.0 and Spring Batch 5.0 migration guides.

export data from mongo to hive

My input: a collection ("demo1") in MongoDB (version 3.4.4)
My output: my data imported into a database in Hive ("demo2") (version 1.2.1.2.3.4.7-4)
Purpose: create a connector between Mongo and Hive
Error:
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com/mongodb/util/JSON
I tried 2 solutions, following these steps (but the error remains):
1) I created a local collection in Mongo (via Robomongo) connected to Docker
2) I uploaded these versions of the jars and added them in Hive:
ADD JAR /home/.../mongo-hadoop-hive-2.0.2.jar;
ADD JAR /home/.../mongo-hadoop-core-2.0.2.jar;
ADD JAR /home/.../mongo-java-driver-3.4.2.jar;
Unfortunately the error doesn't change, so I uploaded these other versions instead; I was unsure which versions were right for my export, so I tried this:
ADD JAR /home/.../mongo-hadoop-hive-1.3.0.jar;
ADD JAR /home/.../mongo-hadoop-core-1.3.0.jar;
ADD JAR /home/.../mongo-java-driver-2.13.2.jar;
3) I created an external table:
CREATE EXTERNAL TABLE demo2
(
id INT,
name STRING,
password STRING,
email STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"name","password":"password","email":"email"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/local.demo1');
Error returned in hive :
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com/mongodb/util/JSON
How can I resolve this problem?
Copying the correct jar files (mongo-hadoop-core-2.0.2.jar, mongo-hadoop-hive-2.0.2.jar, mongo-java-driver-3.2.2.jar) onto ALL the nodes of the cluster did the trick for me. (The com/mongodb/util/JSON in the error is a class from the MongoDB Java driver, so that jar in particular has to be on the classpath of every node.)
Other points to take care of:
Follow all steps mentioned here religiously - https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage#installation
Adhere to the requirements given here - https://github.com/mongodb/mongo-hadoop#requirements
Other useful links
https://github.com/mongodb/mongo-hadoop/wiki/FAQ#i-get-a-classnotfoundexceptionnoclassdeffounderror-when-using-the-connector-what-do-i-do
https://groups.google.com/forum/#!topic/mongodb-user/xMVoTSePgg0

Read application.conf configuration in Play for Scala

I have the following data source configured in application.conf that I use to run Slick statements.
I need to access the same database through an OLAP engine that will use the same user and password.
Since it's already configured, I'd like to get these two values from there. Is it possible to read application.conf from Scala? I know I can read the physical file, but is there a library to get the parsed data?
## JDBC Datasource
# ~~~~~
#
dbConfig = {
  url = "jdbc:mysql://localhost:3306/db"
  driver = com.mysql.jdbc.Driver
  connectionPool = disabled
  keepAliveConnection = true
  user = root
  password = xxxx
}
Working with Play, you simply inject the configuration like this:
import javax.inject.Inject
import play.api.Configuration
class Something @Inject()(configuration: Configuration) {
  val url: Option[String] = configuration.getString("dbConfig.url")
  val keepAliveConnection: Option[Boolean] = configuration.getBoolean("dbConfig.keepAliveConnection")
  ...
}
Also see Configuration API on how to get your properties in various types and formats.

Moving HDFS data into MongoDB

I am trying to move HDFS data into MongoDB. I know how to export data into MySQL by using Sqoop. I don't think I can use Sqoop for MongoDB. I need help understanding how to do that.
This recipe will use the MongoOutputFormat class to load data from an HDFS instance into a MongoDB collection.
Getting ready
The easiest way to get started with the Mongo Hadoop Adaptor is to clone the Mongo-Hadoop project from GitHub and build the project configured for a specific version of Hadoop. A Git client must be installed to clone this project.
This recipe assumes that you are using the CDH3 distribution of Hadoop.
The official Git client can be found at http://git-scm.com/downloads.
The Mongo Hadoop Adaptor can be found on GitHub at https://github.com/mongodb/mongo-hadoop. This project needs to be built for a specific version of Hadoop. The resulting JAR file must be installed on each node in the $HADOOP_HOME/lib folder.
The Mongo Java Driver is required to be installed on each node in the $HADOOP_HOME/lib folder. It can be found at https://github.com/mongodb/mongo-java-driver/downloads.
How to do it...
Complete the following steps to copy data from HDFS into MongoDB:
1. Clone the mongo-hadoop repository with the following command line:
git clone https://github.com/mongodb/mongo-hadoop.git
2. Switch to the stable release 1.0 branch:
git checkout release-1.0
3. Set the Hadoop version which mongo-hadoop should target. In the folder that mongo-hadoop was cloned to, open the build.sbt file with a text editor. Change the following line:
hadoopRelease in ThisBuild := "default"
to
hadoopRelease in ThisBuild := "cdh3"
4. Build mongo-hadoop:
./sbt package
This will create a file named mongo-hadoop-core_cdh3u3-1.0.0.jar in the core/target folder.
5. Download the MongoDB Java Driver Version 2.8.0 from https://github.com/mongodb/mongo-java-driver/downloads.
6. Copy mongo-hadoop and the MongoDB Java Driver to $HADOOP_HOME/lib on each node:
cp mongo-hadoop-core_cdh3u3-1.0.0.jar mongo-2.8.0.jar $HADOOP_HOME/lib
7. Create a Java MapReduce program that will read the weblog_entries.txt file from HDFS and write the entries to MongoDB using the MongoOutputFormat class:
import java.io.*;
import org.apache.commons.logging.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.*;
import org.bson.*;
import org.bson.types.ObjectId;
import com.mongodb.hadoop.*;
import com.mongodb.hadoop.util.*;

public class ExportToMongoDBFromHDFS {

    private static final Log log = LogFactory.getLog(ExportToMongoDBFromHDFS.class);

    public static class ReadWeblogs extends Mapper<LongWritable, Text, ObjectId, BSONObject> {

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {

            System.out.println("Key: " + key);
            System.out.println("Value: " + value);

            // Each line of weblog_entries.txt is tab-delimited: md5, url, date, time, ip
            String[] fields = value.toString().split("\t");
            String md5 = fields[0];
            String url = fields[1];
            String date = fields[2];
            String time = fields[3];
            String ip = fields[4];

            BSONObject b = new BasicBSONObject();
            b.put("md5", md5);
            b.put("url", url);
            b.put("date", date);
            b.put("time", time);
            b.put("ip", ip);

            context.write(new ObjectId(), b);
        }
    }

    public static void main(String[] args) throws Exception {

        final Configuration conf = new Configuration();
        // Write to the "weblogs" collection of the "test" database
        MongoConfigUtil.setOutputURI(conf, "mongodb://<HOST>:<PORT>/test.weblogs");
        System.out.println("Configuration: " + conf);

        final Job job = new Job(conf, "Export to Mongo");

        Path in = new Path("/data/weblogs/weblog_entries.txt");
        FileInputFormat.setInputPaths(job, in);

        job.setJarByClass(ExportToMongoDBFromHDFS.class);
        job.setMapperClass(ReadWeblogs.class);

        job.setOutputKeyClass(ObjectId.class);
        job.setOutputValueClass(BSONObject.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(MongoOutputFormat.class);

        job.setNumReduceTasks(0);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
8. Export as a runnable JAR file and run the job:
hadoop jar ExportToMongoDBFromHDFS.jar
9. Verify that the weblogs MongoDB collection was populated from the Mongo shell:
db.weblogs.find();
The basic problem is that Mongo stores its data in BSON format (binary JSON), while your HDFS data may have different formats (text, sequence files, Avro). The easiest thing to do would be to use Pig to load your results into MongoDB using this driver:
https://github.com/mongodb/mongo-hadoop/tree/master/pig
You'll have to map your values to your collection; there's a good example on the GitHub page.

Soap UI, REST API, update database

I am going to use SoapUI to test a REST API framework.
Is there a way through which I can insert/update records inside MongoDB with data in a file (csv, txt, etc.) using the SoapUI tool?
What I am trying to do is validate the API calls and update the database from a data file.
If you are willing to use a Groovy script, then you can do this pretty easily.
Put your jdbc driver in SoapUI's bin\ext directory.
https://github.com/mongodb/mongo-java-driver/downloads
(probably where you can get it for MongoDB)
Then you need roughly these things in your script:
import groovy.sql.Sql
def groovyUtils = new com.eviware.soapui.support.GroovyUtils(context)
groovyUtils.registerJdbcDriver("org.postgresql.Driver") // NOT SURE WHAT STRING FOR MONGODB
def connectString = "....."
sql = Sql.newInstance(connectString) // TEST YOUR CONNECT STRING IN A SQL BROWSER
def misc = sql.firstRow("SELECT * from table")
groovy.sql.Sql is very nice!
http://groovy.codehaus.org/api/groovy/sql/Sql.html
You can easily use the Groovy code below in a "Groovy test step" of your test case to connect to MongoDB. Before that, please ensure that the MongoDB Java client jar file and GMongo are in the {Installation Directory}\bin\ext folder of your SoapUI installation.
Gmongo: http://mvnrepository.com/artifact/com.gmongo/gmongo/1.5
Mongodb Java Client : http://mvnrepository.com/artifact/org.mongodb/mongo-java-driver/3.2.2
import com.gmongo.GMongoClient
import com.gmongo.GMongo
import com.mongodb.MongoCredential
import com.mongodb.ServerAddress
//def credentials = MongoCredential.createMongoCRCredential('admin', 'students', 'admin' as char[])
//def client = new GMongoClient(new ServerAddress("127.0.0.1:27017"))
context.gmongo = new GMongo()
def db = context.gmongo.getDB("test")
log.info db.fruit.find().count()
db.fruit.find().each { doc ->
    log.info doc
}