Spark custom log4j integration for Scala application

I am new to Spark and Scala.
The problem is that I am not able to debug my application.
I have developed a Spark application in Scala, built with Maven,
but I am not able to find the logged details: the log file is not generated at the path given in the log4j properties, so I cannot tell where the log output is going.
What specific changes do I need to make to get that log file?
I am testing my application on Hortonworks.
Command for submitting the app:
bin/spark-submit --master yarn-cluster --class com.examples.MainExample lib/Test.jar
The log4j.properties file is kept in the src/resources folder. Here it is:
log4j.appender.myConsoleAppender=org.apache.log4j.ConsoleAppender
log4j.appender.myConsoleAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.myConsoleAppender.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/var/log/spark.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n
# By default, everything goes to console and file
log4j.rootLogger=INFO, myConsoleAppender, RollingAppender
# The noisier spark logs go to file only
log4j.logger.spark.storage=INFO, RollingAppender
log4j.additivity.spark.storage=false
log4j.logger.spark.scheduler=INFO, RollingAppender
log4j.additivity.spark.scheduler=false
log4j.logger.spark.CacheTracker=INFO, RollingAppender
log4j.additivity.spark.CacheTracker=false
log4j.logger.spark.CacheTrackerActor=INFO, RollingAppender
log4j.additivity.spark.CacheTrackerActor=false
log4j.logger.spark.MapOutputTrackerActor=INFO, RollingAppender
log4j.additivity.spark.MapOutputTrackerActor=false
log4j.logger.spark.MapOutputTracker=INFO, RollingAppender
log4j.additivity.spark.MapOutputTracker=false

I was not able to solve this from within the application, but if you change log4j.properties in the conf folder as below, it will write the logs to the given file.
Make sure the path has write access.
log4j.rootLogger=INFO, FILE
# Set everything to be logged to the file
log4j.rootCategory=INFO, FILE
log4j.appender.FILE=org.apache.log4j.FileAppender
log4j.appender.FILE.File=/tmp/sparkLog/SparkOut.log
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
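If the conf folder only contains the template that ships with Spark, copy it first:
cp $SPARK_HOME/conf/log4j.properties.template $SPARK_HOME/conf/log4j.properties
and then apply the changes above to the copy.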

Try placing log4j.properties under src/main/resources (the standard Maven resources folder, rather than src/resources) so that it ends up on the classpath inside the jar.
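Another approach, in case the classpath copy is still not picked up in yarn-cluster mode (this is standard Spark practice rather than part of the original answers), is to ship the file explicitly and point both driver and executors at it:
bin/spark-submit --master yarn-cluster \
  --files log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.examples.MainExample lib/Test.jar
The bare file name works because --files places a copy in each YARN container's working directory. Also note that a file appender path such as /var/log/spark.log is resolved on the node where each JVM runs, so the log ends up on the cluster machines (and only if the YARN user can write there), not on the machine you submit from.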

Related

Custom logger using log4j for spark scala application when executed using oozie

I have developed a Spark Scala application that uses log4j for logging, and it works fine when I execute it using spark-submit as below:
spark-submit --name "Test" --class com.comp.test --conf spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/home/myid/log4j.properties' --queue=root.user /home/myid/dev/data.jar
This works fine, and the log file is created at the directory specified in log4j.properties.
Now, when I run the same application using an Oozie Spark action, the log file is not created at that directory.
log4j.properties:
log4j.appender.myConsoleAppender=org.apache.log4j.ConsoleAppender
log4j.appender.myConsoleAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.myConsoleAppender.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/home/myid/dev/log/dev.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n
# By default, everything goes to console and file
log4j.rootLogger=INFO, myConsoleAppender, RollingAppender
# The noisier spark logs go to file only
log4j.logger.spark.storage=INFO, RollingAppender
log4j.additivity.spark.storage=false
log4j.logger.spark.scheduler=INFO, RollingAppender
log4j.additivity.spark.scheduler=false
log4j.logger.spark.CacheTracker=INFO, RollingAppender
log4j.additivity.spark.CacheTracker=false
log4j.logger.spark.CacheTrackerActor=INFO, RollingAppender
log4j.additivity.spark.CacheTrackerActor=false
log4j.logger.spark.MapOutputTrackerActor=INFO, RollingAppender
log4j.additivity.spark.MapOutputTrackerActor=false
log4j.logger.spark.MapOutputTracker=INFO, RollingAppender
log4j.additivity.spark.MapOutputTracker=false
Oozie workflow:
<workflow-app name="OozieApp" xmlns="uri:oozie:workflow:0.5">
    <start to="LoadTable"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="LoadTable">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapreduce.job.queuename</name>
                    <value>root.user</value>
                </property>
            </configuration>
            <master>yarn</master>
            <mode>client</mode>
            <name>OozieApp</name>
            <class>com.comp.test</class>
            <jar>data.jar</jar>
            <spark-opts>--queue=root.user --conf spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/home/myid/log4j.properties'</spark-opts>
        </spark>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
Could you help me get the custom log created in the log directory when the job is executed via the Oozie Spark action?
I could use a shell action and call spark-submit, but I would prefer the Spark action itself.
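No answer is reproduced here, but one point is worth noting: with <mode>client</mode> the driver runs inside the Oozie launcher container on an arbitrary cluster node, so file:/home/myid/log4j.properties (and the /home/myid/dev/log output directory) must exist on whichever node that happens to be. A hedged, untested sketch of a common workaround: run in cluster mode, upload log4j.properties to HDFS, ship it with --files, and reference it by bare name (the HDFS path below is a placeholder):
<mode>cluster</mode>
<spark-opts>--queue=root.user --files hdfs:///user/myid/log4j.properties --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties</spark-opts>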

Turn off logging of Solr in "sbt test"

I am using Solr in Scala. I have a test case that adds some documents into a Solr core.
When running sbt test, the following information is shown repeatedly:
15/12/03 01:17:50 INFO LogUpdateProcessor: [test] webapp=null path=/update params={} {add=[(null)]} 0 2
In an attempt to suppress it, I put a log4j.properties file with this content:
.level=WARNING
org.apache.solr.core.level=WARNING
org.apache.solr.update.processor.level=WARNING
under both ${project_dir}/src/main/resources and ${project_dir}/src/test/resources
However, the log message is still there.
I am using:
Scala 2.11.5
solr-solrj 5.3.1
solr-core 5.3.1
sbt 0.1.0
The log4j.properties file is malformed: it uses java.util.logging-style syntax (.level=WARNING) rather than log4j syntax. The following content works:
log4j.rootLogger=WARN, stdout
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
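If you would rather silence only Solr while keeping everything else at INFO, a narrower sketch (same log4j syntax, keeping the stdout appender definition above; org.apache.solr is the parent package of the LogUpdateProcessor shown in the question):
log4j.rootLogger=INFO, stdout
log4j.logger.org.apache.solr=WARN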

Filter log4j appender to only include messages from class

I am trying to create a log4j file appender that includes only messages for a specific class. Here is an example message:
2015-08-06 16:41:43,773 [pool-3-thread-8] INFO ACME.log- new test message
I want all log messages for the ACME.log logger to go to a specific appender. Here is the code I currently have in my properties file, which is not working:
log4j.appender.myLog = org.apache.log4j.DailyRollingFileAppender
log4j.appender.myLog.File = /opt/netiq/idm/apps/tomcat/logs/acme.log
log4j.appender.myLog.layout = org.apache.log4j.PatternLayout
log4j.appender.myLog.layout.ConversionPattern = %d{yyyy-MM-dd HH:mm:ss} %-5p %m%n
log4j.appender.myLog.filter.filter.ID=ACME.log
log4j.appender.myLog.filter.ID.levelMin=INFO
log4j.appender.myLog.filter.ID.levelMax=ERROR
log4j.appender.myLog.filter.2 = org.apache.log4j.varia.DenyAllFilter
If you're using log4j 1.x, we strongly recommend using org.apache.log4j.rolling.RollingFileAppender from the log4j extras companion instead of org.apache.log4j.DailyRollingFileAppender, which may lose messages (Bug 43374).
So the configuration of your appender can be:
log4j.appender.myLog=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.myLog.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.myLog.rollingPolicy.fileNamePattern=/path/acme%d{yyyy-MM-dd}.log
log4j.appender.myLog.layout=org.apache.log4j.PatternLayout
log4j.appender.myLog.layout.ConversionPattern=%d %-5p (%c.java:%L).%M - %m%n
log4j.appender.myLog.filter.A=org.apache.log4j.varia.LevelRangeFilter
log4j.appender.myLog.filter.A.LevelMin=INFO
log4j.appender.myLog.filter.A.LevelMax=ERROR
log4j.appender.myLog.filter.A.AcceptOnMatch=true
If you only want the logs of a specific class (e.g. com.company.MyClass), specify it as follows:
log4j.logger.com.company.MyClass=TRACE, myLog
Note: in that case, you need to add the log4j extras jar (apache-log4j-extras-1.2.17.jar) to the classpath.
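One detail beyond the original answer (standard log4j 1.x behavior): with the line above, com.company.MyClass messages still propagate to the root logger's appenders as well. To send them only to myLog, disable additivity for that logger:
log4j.logger.com.company.MyClass=TRACE, myLog
log4j.additivity.com.company.MyClass=false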

How do I suppress the bloat of useless information when using the DUMP command while using grunt via 'pig -x local'?

I'm working with Pig Latin, using grunt, and every time I dump something, my console gets clobbered with non-info noise. Is there a way to suppress all that?
grunt> A = LOAD 'testingData' USING PigStorage(':'); dump A;
2013-05-06 19:42:04,146 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2013-05-06 19:42:04,147 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
...
...
--- another like 50 lines of useless context clobbering junk here... till ---
...
...
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
then the roughly four lines of info I'm actually looking for:
(daemon,*,1,1,System Services,/var/root,/usr/bin/false)
(uucp,*,,,/var/spool/uucp,/usr/sbin/uucico)
(taskgated,*,13,13,Task Gate Daemon,/var/empty,/usr/bin/false)
(networkd,*,24,24,Network Services,/var/empty,/usr/bin/false)
(installassistant,*,25,25,/usr/bin/false)
grunt>
Obviously, if it errors, lots of info is helpful, but not when it basically works.
You need to set the log4j properties. For example, in $PIG_HOME/conf/pig.properties, enable (uncomment) the line:
log4jconf=./conf/log4j.properties
and rename log4j.properties.template to log4j.properties.
Then, in log4j.properties, change the org.apache.pig level from info to error:
log4j.logger.org.apache.pig=error, A
You may also set the Hadoop-related logging level the same way:
log4j.logger.org.apache.hadoop=error, A
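Both lines above send output to an appender named A, which must also be defined in the same file. A minimal sketch, assuming a console appender is wanted (the name A is arbitrary):
log4j.appender.A=org.apache.log4j.ConsoleAppender
log4j.appender.A.layout=org.apache.log4j.PatternLayout
log4j.appender.A.layout.ConversionPattern=%d [%t] %-5p %c - %m%n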
An easy way to do this is to redirect standard error, as below, but note that it will suppress all errors:
pig -x local 2> /dev/null
I also found that if you remove or rename your Hadoop install directory, making it inaccessible to Pig, all those INFO messages go away. Changing the logging levels in Hadoop did not help, just so you know.
When you start pig, pass it a log4j.properties file with pig -4 <filename>.
In my case there was a log4j.properties in the conf directory, and setting the level of the logger named org.apache.pig to ERROR was sufficient to make the logging less verbose:
log4j.logger.org.apache.pig=ERROR, A
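For example (quiet.properties is a placeholder name for a file containing the line above):
pig -x local -4 quiet.properties myscript.pig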
Pig also has a debug log level, which you set in the pig.properties file:
# Logging level. debug=OFF|ERROR|WARN|INFO|DEBUG (default: INFO)
#
# debug=INFO
The INFO default is the reason you get large logs on the console; uncomment that line and change it to ERROR:
debug=ERROR

Not able to log the Axis1 internal flow

I created a dynamic web project in Eclipse. I am not able to pass the commons-logging file to the factory. Please help me.
Thanks in advance
I got the resolution; I am able to log the Axis internal flow.
src/commons-logging.properties:
org.apache.commons.logging.Log=org.apache.commons.logging.impl.Log4JLogger
The input to the Log4JLogger class is the properties file below, placed in the classes folder under WEB-INF (WEB-INF/classes/log4j.properties).
log4j.properties:
log4j.rootCategory=DEBUG, CONSOLE
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=DEBUG
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=- %m%n
log4j.appender.LOGFILE=org.apache.log4j.FileAppender
log4j.appender.LOGFILE.File=axis.log
log4j.appender.LOGFILE.Append=true
log4j.appender.LOGFILE.Threshold=DEBUG
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n