Apache Flink doesn't connect to port 8081 - server

Hi, I am new to Apache Flink and I am trying to run a batch WordCount example to start learning about it. I ran
./bin/start-cluster.sh
and then executed
./bin/flink run ./examples/batch/WordCount.jar --input test.txt --output out.txt
and I get the following in the console:
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8081
So I think it is a server connection error. I tried a few things, such as XAMPP, but nothing helped.
What do you think is wrong?

It seems like your cluster is not starting. Try ./bin/start-cluster.sh again and go to http://localhost:8081/ to confirm that your cluster is up. After that, the WordCount example should run fine once you specify the appropriate input and output files.
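If the web UI does not load, a quick check from the shell can tell you whether the JobManager is listening at all. This is a sketch that assumes a local standalone cluster on the default port; the log path is Flink's usual layout:

```shell
# Probe the JobManager REST endpoint; any HTTP answer means the cluster is up.
if curl -sf http://localhost:8081/overview > /dev/null 2>&1; then
  status="up"
else
  status="down"
fi
echo "JobManager is $status"
# If it is down, look in Flink's log/ directory (log/flink-*-standalonesession-*.log)
# for the reason, e.g. another process already holding port 8081.
```

When the probe reports "down", the JobManager log usually names the actual cause, which is more informative than the client-side "Connection refused".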

Related

How to establish real time log output for running container when using docker-compose

I'm working on a Django REST framework API that is built in a docker image and launched/managed with docker-compose. When I launch my app I get the real time log of the Django application in the terminal. I accidentally closed the terminal and I want to re-establish a real time log output in the terminal without restarting my containers.
I tried docker-compose logs which will print a tail of the log but does not re-establish a real time output. I would have to rerun this every time I wanted to see new log information.
I think if you add the --follow flag to your command, you'll get the desired result. So:
docker-compose logs --follow ...

How to run the first example of Apache Flink

I am trying to run the first example from the O'Reilly book "Stream Processing with Apache Flink" and from the Flink project. Each gives a different error:
The example from the book gives a NoClassDefFoundError.
The example from the Flink project gives java.net.ConnectException: Connection refused (Connection refused), but it does create a Flink job (see screenshot).
Details below.
Book example
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction1$mcVI$sp
at io.github.streamingwithflink.chapter1.AverageSensorReadings$$anon$3.createSerializer(AverageSensorReadings.scala:50)
The instructions from the book are:
download flink-1.7.1-bin-scala_2.12.tgz
extract it
start the cluster: ./bin/start-cluster.sh
open Flink's web UI at http://localhost:8081
This all works fine.
Then download the jar file that includes the examples in this book and run the example:
./bin/flink run \
-c io.github.streamingwithflink.chapter1.AverageSensorReadings \
examples-scala.jar
It seems that the class is not found, per the error message at the top of this post.
I put the jar in the same directory where I am running the command.
java -version
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (Zulu 8.44.0.9-CA-macosx) (build 1.8.0_242-b20)
OpenJDK 64-Bit Server VM (Zulu 8.44.0.9-CA-macosx) (build 25.242-b20, mixed mode)
I also tried compiling the jar myself from
https://github.com/streaming-with-flink/examples-scala.git
with
mvn clean build
and the error is the same.
Flink project tutorial
running the SocketWindowWordCount
./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
I get a job, but it fails with
java.net.ConnectException: Connection refused (Connection refused)
It is not clear to me which connection is refused. I tried different ports with no change.
How can I run Flink code successfully?
I tried to reproduce the failing AverageSensorReadings example, but it was working on my setup. I'll try to look deeper into it tomorrow.
Regarding the SocketWindowWordCount example, the error message indicates that the Flink job failed to open a connection to the socket on port 9000. You need to open the socket before you start the job. You can do this for example with netcat:
nc -l 9000
After the job is running, you can send messages by typing, and these messages will be ingested into the Flink job. You can see the stats in the WebUI evolving according to the number of words your messages consist of.
Note that netcat closes the socket when you stop the Flink job.
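To make the ordering concrete: the job connects to the socket as a client, so a listener has to exist before the job starts. Here is a small probe sketch (bash-specific /dev/tcp; port 9000 as in the question) you can use to verify the listener is up before submitting:

```shell
# Probe port 9000; expect "closed" until `nc -l 9000` is running in another terminal.
# The subshell opens a TCP connection on fd 3 and closes it again immediately.
if (exec 3<>/dev/tcp/localhost/9000) 2>/dev/null; then
  port_state="open"
else
  port_state="closed"
fi
echo "port 9000 is $port_state"
```

Once the probe reports "open", submitting SocketWindowWordCount against --port 9000 should no longer produce "Connection refused".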
I am able to run the "Stream Processing with Apache Flink" code from IntelliJ.
See this post
I am able to run the "Stream Processing with Apache Flink" AverageSensorReadings code on my Flink cluster by using sbt. I had never used sbt before but thought I would try it. My project is here
Note that I moved AverageSensorReadings.scala to chapter5, since that is where the code is explained, and changed the package to com.mitzit.
Use sbt assembly to create the jar, then run it on the Flink cluster:
./bin/flink run \
-c com.mitzit.chapter5.AverageSensorReadings \
/path/to/project/sbt-flink172/target/scala-2.11/sbt-flink172-assembly-0.1.jar
This works fine. I have no idea why it works while the mvn-compiled jar does not.

Kafka Connect not working?

While going through the official Apache page
https://kafka.apache.org/quickstart
a text file is created with
echo -e "foo\nbar" > test.txt
and to use Kafka Connect the following command is used:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
But when the above command is executed, it shows a message that Kafka Connect stopped.
Something else is using the same port that Kafka Connect wants to use.
You can use netstat -plnt to identify the other program (you'll need to run it as root if the process is owned by a different user).
If you want to get Kafka Connect to use a different port, edit config/connect-standalone.properties to add:
rest.port=18083
Where 18083 is an available port.
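As a sketch of that edit (assuming you run it from the Kafka distribution root, where the config/ directory lives; 18083 is just an example port), you can append the setting from the shell:

```shell
# Append an alternative REST port for the standalone Connect worker.
mkdir -p config   # no-op inside a real Kafka distribution, where config/ already exists
echo "rest.port=18083" >> config/connect-standalone.properties
# Confirm the setting is present before restarting connect-standalone.sh
grep '^rest.port' config/connect-standalone.properties
```

After restarting the worker, its REST interface will bind to the new port instead of the conflicting default.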

zookeeper-server-start.sh config/zookeeper.properties is throwing "Already in use". How to solve this error in Ubuntu?

Before starting Kafka, I tried to start the ZooKeeper server, and it throws the following java.net.BindException.
I checked existing processes using:
netstat -nap | grep 4040 (and likewise for 8080)
I found no processes running there. Does anyone know what is going on?
screenshot
It's trying to start ZooKeeper on the default port, 2181. You probably have another ZooKeeper instance running. Try running the Java ps command, jps, and kill any other ZooKeeper process running on the same machine.
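Note that the ports you checked (4040 and 8080) are not the ones ZooKeeper binds. A sketch of checking the actual client port, using whichever of netstat or ss is installed:

```shell
# ZooKeeper's default client port is 2181, so that is the port to inspect.
if (netstat -nap 2>/dev/null; ss -ltnp 2>/dev/null) | grep -q ':2181'; then
  zk_port="busy"
else
  zk_port="free"
fi
echo "port 2181 is $zk_port"
```

If the port is busy, the owning PID shown by netstat/ss (or by jps, look for QuorumPeerMain) is the process to kill before starting ZooKeeper again.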

Debugging MapReduce Hadoop in local mode in Eclipse. Failed to connect to remote VM

I am new to Hadoop and I am trying to debug MapReduce in local mode in Eclipse on VirtualBox Ubuntu, following these articles: Debug Custom Java hadoop code in local environment and Hadoop MapReduce Debugging in Local Setup
In hadoop-env.sh I put:
export HADOOP_OPTS="$HADOOP_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=8008"
I tried to run Eclipse from the command line:
eclipse -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8008
I also changed from hdfs to file:/// in core-site.xml in the Hadoop configuration:
<name>fs.default.name</name>
<value>file:///localhost:8020</value>
I checked port 8080, and it seems to work okay:
netstat -atn | grep 8080
says tcp6 8080 LISTEN, and http://localhost:8080 opens in the browser and says "Required param job, map and reduce".
Still everything is useless: when I try to set up a debug configuration with port 8080 in Eclipse, it fails with "failed to connect to remote VM".
Can anyone suggest a possible solution?
That isn't the way to run Eclipse as a debugger.
Run Eclipse without any command-line options and set up a debug configuration for a remote Java application that connects to port 8008.
[EDIT]
I also think your Hadoop debug options are wrong. I use:
-agentlib:jdwp=transport=dt_socket,address=8008,server=y,suspend=n
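In hadoop-env.sh that would look like the following sketch (port 8008 as above; suspend=n means the JVM starts without waiting for the debugger, which is usually what you want for Hadoop daemons):

```shell
# hadoop-env.sh: start a JDWP debug server inside the Hadoop JVM on port 8008.
# Eclipse then attaches via a "Remote Java Application" debug configuration.
export HADOOP_OPTS="$HADOOP_OPTS -agentlib:jdwp=transport=dt_socket,address=8008,server=y,suspend=n"
echo "$HADOOP_OPTS"
```

The key difference from the question's version is -agentlib:jdwp (the current syntax) instead of the legacy -Xdebug -Xrunjdwp pair, and server=y so the JVM listens for Eclipse rather than the other way around.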