IntelliJ IDEA - Disable INFO messages when running a Spark application - Scala

I'm getting a lot of messages when running an application that uses the Apache Spark and HBase/Hadoop libraries. For example:
0 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)])
How can I disable these messages, so that I only see my own output, like println(varABC)?

What you are seeing is log output produced by Spark through log4j; by default it prints quite a lot of messages to stderr. You can configure it the way you usually configure log4j behavior, e.g. through a log4j.properties configuration file. Refer to http://spark.apache.org/docs/latest/configuration.html#configuring-logging
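If you would rather not edit configuration files, the levels can also be raised programmatically at the top of your driver code. A minimal sketch, assuming the log4j 1.x API that Spark 2.x bundles:

import org.apache.log4j.{Level, Logger}

// Raise the threshold for Spark's chatty packages before the SparkContext starts
Logger.getLogger("org").setLevel(Level.ERROR)
Logger.getLogger("akka").setLevel(Level.ERROR)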

In the /spark-2.0.0-bin-hadoop2.6/conf folder you have a file named log4j.properties.template.
Rename log4j.properties.template to log4j.properties
and make the following change in log4j.properties
from: log4j.rootCategory=INFO, console
to: log4j.rootCategory=ERROR, console
Hope this helps!

Under the $SPARK_HOME/conf dir, modify the log4j.properties file - change the values from INFO to ERROR as below:
log4j.rootLogger=${root.logger}
root.logger=ERROR,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.logger.org.apache.spark.repl.Main=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=ERROR
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=ERROR
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
This will disable all INFO log messages and only print ERROR or FATAL messages. You can change these values according to your requirements.
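If you only want to quiet a single application rather than the whole installation, the same thing can be done at runtime. A minimal sketch, assuming a SparkSession named spark already exists:

// Overrides the log4j root level for this application only.
// Valid levels include ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN.
spark.sparkContext.setLogLevel("ERROR")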

Related

Embedded Undertow log output

I want to configure my embedded Undertow to save the server logs into a file
import io.undertow.Undertow;
import io.undertow.servlet.Servlets;
import io.undertow.servlet.api.DeploymentInfo;
import org.jboss.resteasy.core.ResteasyDeploymentImpl;
import org.jboss.resteasy.plugins.server.undertow.UndertowJaxrsServer;
import org.jboss.resteasy.spi.ResteasyDeployment;

public class Server {
    public static void main(String[] args) {
        UndertowJaxrsServer server = new UndertowJaxrsServer();
        ResteasyDeployment deployment = new ResteasyDeploymentImpl();
        deployment.setApplicationClass(ExampleApplication.class.getName());
        deployment.setInjectorFactoryClass("org.jboss.resteasy.cdi.CdiInjectorFactory");
        DeploymentInfo deploymentInfo = server.undertowDeployment(deployment, "/");
        deploymentInfo.setClassLoader(Server.class.getClassLoader());
        deploymentInfo.setDeploymentName("service");
        deploymentInfo.setContextPath("/service");
        deploymentInfo.addListener(Servlets.listener(Listener.class));
        server.deploy(deploymentInfo);
        Undertow.Builder builder = Undertow.builder()
                .addHttpListener(8080, "localhost"); // port is an int, not a String
        server.start(builder);
    }
}
The server logs are shown in the console, but I want to save all the server logs to a file (similar to the JBoss server log, where they are saved to log files on a daily basis). How can I configure that?
Given that you're using log4j as the log manager, you'd need to modify your configuration file. For a log4j.properties it would look something like:
log4j.rootLogger=DEBUG, file
# My Application Log
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=log4j.log
log4j.appender.file.Threshold=ALL
log4j.appender.file.MaxBackupIndex=100
log4j.appender.file.MaxFileSize=1Gb
log4j.appender.file.encoding=UTF8
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%p %t %c - %m%n
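Since you mentioned JBoss-style daily log files: log4j 1.x also provides a DailyRollingFileAppender that rolls by date rather than by size. A sketch of that variant (the file name and date pattern here are assumptions):

log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.File=logs/server.log
# Roll at midnight; old files get a .yyyy-MM-dd suffix
log4j.appender.file.DatePattern='.'yyyy-MM-dd
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%p %t %c - %m%n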

How to do effective logging in Spark application

I have a Spark application written in Scala that runs a series of Spark SQL statements. The results are computed by calling a 'count' action at the end against the final dataframe. I would like to know the best way to do logging from within a Spark Scala application job. Since all the dataframes (around 20 in number) are computed using a single action at the end, what are my options when it comes to logging the outputs/sequence/success of some statements?
The question is a little generic in nature. Since Spark works on lazy evaluation, the execution plan is decided by Spark, and I want to know up to what point the application statements ran successfully and what the intermediate results were at that stage.
The intention here is to monitor the long-running task, see up to which point it was fine, and find where the problems crept in.
If we try to put logging before/after transformations, it gets printed when the code is read, not when it runs. So the logging has to be done with custom messages during the actual execution (when the action is called at the end of the Scala code). If I try to put count/take/first etc. in between the code, the execution of the job slows down a lot.
I understand the problem you are facing. Let me put forward a simple solution.
You need to make use of org.apache.log4j.Logger. Use the following lines of code to generate log messages:
val logger = org.apache.log4j.Logger.getRootLogger
logger.error(errorMessage)
logger.info(infoMessage)
logger.debug(debugMessage)
Now, in order to redirect these messages to a log file, you need to create a log4j properties file with the below contents:
# Root logger option
# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=OFF
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=OFF
log4j.logger.org.spark-project.jetty.servlet.ServletHandler=OFF
log4j.logger.org.spark-project.jetty.server=OFF
log4j.logger.org.spark-project.jetty=OFF
log4j.category.org.spark_project.jetty=OFF
log4j.logger.Remoting=OFF
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
# Setting properties to have logger logs in local file system
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.encoding=UTF-8
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
log4j.appender.rolling.maxBackupIndex=5
log4j.appender.rolling.maxFileSize=50MB
log4j.logger.org.apache.spark=OFF
log4j.logger.org.spark-project=OFF
log4j.logger.org.apache.hadoop=OFF
log4j.logger.io.netty=OFF
log4j.logger.org.apache.zookeeper=OFF
# Note: log4j.rootLogger takes precedence over log4j.rootCategory above,
# so 'rolling' becomes the active root appender
log4j.rootLogger=INFO, rolling
log4j.appender.rolling.file=/tmp/logs/application.log
You can name the log file in the last line. Ensure that the folders exist on every node with appropriate permissions.
Now, we need to pass these configurations when submitting the Spark job, as follows:
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=spark-log4j.properties --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=spark-log4j.properties
And,
--files "location of spark-log4j.properties file"
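Putting the two options together, a full submit command would look something like this (the class name, jar, and file locations are placeholders):

spark-submit \
  --files /path/to/spark-log4j.properties \
  --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=spark-log4j.properties \
  --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=spark-log4j.properties \
  --class com.example.MyApp \
  my-app.jar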
Hope this helps!
You can use the log4j library from Maven:
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>${log4j.version}</version>
</dependency>
(You will typically also need the matching log4j-core artifact at runtime, since log4j-api only provides the interface.)
For logging, first you need to create a logger object, and then you can log at different levels like info, error, and warning. Below is an example of logging at the info level in Spark Scala using log4j:
import org.apache.logging.log4j.LogManager
val logger = LogManager.getLogger(this.getClass.getName)
logger.info("logging message")
So, to record progress at particular points in the job, call logger.info("logging message") at those points.
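One caveat worth knowing when logging from inside transformations: Logger instances are not serializable, so capturing one in a closure that runs on executors can fail. A common workaround, sketched here under that assumption, is to hold the logger in an object as a transient lazy val:

import org.apache.logging.log4j.LogManager

object JobLogging extends Serializable {
  // @transient lazy: the logger is re-created on each executor
  // instead of being serialized with the task closure.
  @transient lazy val logger = LogManager.getLogger(getClass.getName)
}

// usage inside an RDD/DataFrame operation, e.g.:
// df.foreach { row => JobLogging.logger.info(s"processed: $row") }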

Kafka Server - Could not find a 'KafkaServer' in JAAS

I have a standalone kafka broker that I'm trying to configure SASL for. Configurations are below. I'm trying to set up SASL_PLAIN authentication on the broker.
My understanding is that with the listener.name... configuration in the server.properties, I shouldn't need the jaas file. But I've experimented with one to see if that might be a better approach.
I have experimented with each of these commands, but both result in the same exception.
sudo bin/kafka-server-start etc/kafka/server.properties
sudo -Djava.security.auth.login.config=etc/kafka/kafka_server_jaas.conf bin/kafka-server-start etc/kafka/server.properties
The exception displayed is:
Fatal error during KafkaServer startup. Prepare to shutdown... Could not find a 'KafkaServer' or 'sasl_plaintext.KafkaServer' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
server.properties:
listeners=SASL_PLAINTEXT://0.0.0.0:9092
listener.security.protocol.map: SASL_PLAINTEXT:SASL_PLAINTEXT
listener.name.SASL_PLAINTEXT.plain.sasl.jaas.config:
org.apache.kafka.common.security.plain.PlainLoginModule required /
username="username" /
password="Password" /
user_username="Password";
advertised.listeners=SASL_PLAINTEXT://[ipaddress]:9092
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
secutiy.inter.broker.protocol=SASL_PLAINTEXT
kafka_server_jaas.conf:
KafkaServer {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="username"
password="Password"
user_username="Password";
};
I've spent a day looking at this already - has anyone else had experience with this problem?
You need to export a variable, not pass the -D flag inline to kafka-server-start (or sudo):
export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/kafka_server_jaas.conf"
bin/kafka-server-start /path/to/server.properties
See Confluent's documentation on Kafka security.
Putting my mistakes here for posterity:
Don't run your startup commands directly from the CLI; put them in a .sh file and run them from there.
For example, something like this:
zkstart:
export KAFKA_OPTS="-Djava.security.auth.login.config=etc/kafka/zookeeper_jaas.conf"
bin/zookeeper-server-start etc/kafka/zookeeper.properties &
kafkastart:
export KAFKA_OPTS="-Djava.security.auth.login.config=etc/kafka/kafka_server_jaas.conf"
bin/kafka-server-start etc/kafka/server.properties
If you still encounter an error related to the configs, check your _jaas files to ensure all the configuration sections named in the error messages are present. If they are, the format is likely not quite correct - check for the two semicolons in each section, and if that fails, try recreating the file entirely from scratch (or from a copy-paste of the documentation).
Edit:
So, the final solution for me was to add the export... lines to the beginning of the corresponding kafka-server-start and zookeeper-server-start scripts. It took me a while before 'everything is a file' finally clicked and I realized the script files were the actual basis for the services.

invalid console appender config detected, console stream is looping in Jboss 5.1

I am getting the below error in the JBoss console logs:
Error: invalid console appender config detected, console stream is looping.
I tried changing JAVA_OPTS as below, but it had no effect:
set JAVA_OPTS=%JAVA_OPTS% -Dprogram.name=%PROGNAME% -Dorg.jboss.logging.Log4jService.catchSystemOut=false

How can I set the logger level with Quartz Scheduler and Cocoon?

I have a project with an old version of Cocoon. There are two cron jobs.
The project has the following log4j config:
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.conversionPattern=%d %-5p - %-26.26c{1} - %m\n
log4j.rootLogger=WARN,CONSOLE
In the logs folder there is a file cron.log, but it contains some INFO entries. How can I set the log level for it?
You can try adding the following line to set the log level of the org.quartz package:
log4j.logger.org.quartz=WARN,CONSOLE
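If each Quartz entry then shows up twice on the console (once via this logger and once via the root logger), additivity is the usual culprit; you can switch it off for the package:

log4j.logger.org.quartz=WARN,CONSOLE
log4j.additivity.org.quartz=false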
BTW, you probably have something else that configures this file appender (cron.log), because by default Quartz (2.x) does not provide such a configuration.
Hope it helps!