Can't submit a new job via the GUI on a standalone Kubernetes Flink deployment (session mode) - kubernetes

After deploying Flink in standalone Kubernetes mode (session cluster), I can't upload any new job using the Flink GUI. After clicking the +Add New button and choosing a JAR file, the progress bar finishes and nothing happens.
There is no information or error about this in the JobManager logs.
When I try to upload any other kind of file (e.g. a text file), I get an error and the log shows:
"Exception occured in REST handler: Only Jar files are allowed."
I've also tried uploading a fake JAR (an empty file with a .jar extension) and that works: I can upload this kind of file.
I have a brand new, clean Apache Flink cluster running on a Kubernetes cluster.
I used the Docker Hub image and tried two different versions:
1.13.2-scala_2.12-java8, and
1.13-scala_2.11-java8
But the result was the same on both versions.
My deployment is based on this how-to:
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/
and I've used the YAML files provided in the "Common cluster resource definitions" appendix of that article:
flink-configuration-configmap.yaml
jobmanager-service.yaml
taskmanager-session-deployment.yaml
jobmanager-session-deployment-non-ha.yaml
I've also used an ingress controller to publish the GUI running on port 8081 on the jobmanager.
I have three pods (1 jobmanager, 2 taskmanagers) and can't see any errors in the Flink logs.
Any suggestions on what I'm missing, or where to find the errors?

Problem solved. It was caused by the nginx upload limit (the default is 1024 kB). The Flink GUI is published outside Kubernetes using an ingress controller and nginx.
When we tried to upload job JARs bigger than 1 MB (1024 kB), the nginx limit prevented it. Files below this limit (for example, the fake 0 kB JAR) uploaded successfully.
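If the ingress is served by the ingress-nginx controller, the limit can be raised per ingress with the proxy-body-size annotation. Below is a minimal sketch, assuming the community NGINX ingress controller; the ingress name, host and size limit are placeholders, and the backend service name and port are assumed to match the service defined in jobmanager-service.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: flink-jobmanager-ingress            # placeholder name
  annotations:
    # raise nginx's request-body limit so job JARs larger than 1 MB can be uploaded
    nginx.ingress.kubernetes.io/proxy-body-size: "200m"
spec:
  ingressClassName: nginx
  rules:
  - host: flink.example.com                 # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: flink-jobmanager          # assumed to match jobmanager-service.yaml
            port:
              number: 8081
Setting the annotation value to "0" disables the body-size check in ingress-nginx entirely, which can be handy for very large fat JARs.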

Related

Is it possible for a container to send a Kafka event when it finishes?

We just migrated to a Kubernetes cluster, and I was wondering if it is possible to automatically send a Kafka event with the stdout as the message when a container/pod finishes. Right now we are using fluentd with Elasticsearch, but the output of one pod is used as input for the next one, so we have to constantly poll Elasticsearch to see when the output is ready, which causes performance issues in the overall execution.
I'm not sure of your current setup, but my first thought would be:
Use something such as fluentd or Logstash in its own pod per node
Configure volume access to the Kubernetes log folder /var/log/containers/*
Use the Kafka output for either fluentd or Logstash with a file (tail) input on the logging folder
This approach would require the setup above on each node, but it needs minimal configuration of logging locations, etc. (a rough sketch of the node-level piece follows this answer).
It's not something I've personally configured but have considered it for the future.
More info here
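A rough sketch of that per-node shipper as a Kubernetes DaemonSet; the name, namespace and image are placeholders, and the actual fluentd/Logstash pipeline (tail input on /var/log/containers/*.log plus the Kafka output) would still need to be mounted in, e.g. from a ConfigMap:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-shipper                          # placeholder name
  namespace: logging                         # placeholder namespace
spec:
  selector:
    matchLabels:
      app: log-shipper
  template:
    metadata:
      labels:
        app: log-shipper
    spec:
      containers:
      - name: shipper
        image: fluent/fluentd:v1.16          # placeholder; a Logstash or Filebeat image works the same way
        volumeMounts:
        - name: varlogcontainers
          mountPath: /var/log/containers
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers   # entries in /var/log/containers are symlinks into here
          readOnly: true
      volumes:
      - name: varlogcontainers
        hostPath:
          path: /var/log/containers
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
On the output side, fluentd's kafka2 output plugin or Logstash's kafka output plugin would then publish each finished pod's log lines to the topic your downstream job consumes.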

OpenShift deployment - pod console logs are truncated

We are using OpenShift Container Platform (v3.11) to host our Java application. We write application logs to the standard pod console. However, when I try to view the pod logs or save them to a file, I don't get the complete log, only a partial one (the logs appear truncated). I have tried different options while viewing the logs (like --since=48h, etc.), but none of them worked.
Is there any way I can increase the pod console buffer size or write the complete log contents to a file?
The better way is to configure log aggregation via fluentd/Elasticsearch (see elk_logging); however, there is also an option to change the Docker log driver settings on the node that runs the container (see managing_docker_container_logs or docker_logging_configure).

How to sync the user directory on Bitbucket Server to Jira with both running on AKS?

When trying to sync the user directories of Jira to other Atlassian products (Confluence and Bitbucket Server running on AKS), a 403 error is returned.
While looking into this error, the following steps have been attempted:
https://confluence.atlassian.com/stashkb/unable-to-connect-to-jira-for-authentication-forbidden-403-323391874.html
The IP addresses have been added to the whitelist in Jira. The next step in the solutions found online is to restart the Jira service.
This, however, causes issues: upon running the stop/start-jira.sh scripts inside the pod, the service comes back with none of the previous settings, and all configuration, including backups, is gone, taking us back to square one.
Cluster size (current set-up):
3 x Standard D8 v3 (8 vCPUs, 32 GiB memory) on AKS
We used the following images, installed through the UI:
atlassian/jira-software
cptactionhank/docker-atlassian-jira
Exec into the pod and go to /opt/atlassian/jira/bin
Run ./(start/stop)-jira.sh
What happens is that when going back to the URL, the Jira instance is reset and all configuration files for the service in the pod are lost.
The pod logs commonly show error 137 when restarting.
Update:
https://github.com/int128/devops-kompose/tree/master/atlassian-jira-software
The Helm chart above has also been used and gave the same result.

How to redirect Apache Spark logs from the driver and the slaves to the console of the machine that launches the Spark job using log4j?

I'm trying to build an Apache Spark application that normalizes CSV files from HDFS (changes the delimiter, fixes broken lines). I use log4j for logging, but all the logs are only written on the executors, so the only way I can check them is by using the yarn logs -applicationId command. Is there any way I can redirect all logs (from the driver and from the executors) to my gateway node (the one that launches the Spark job) so I can check them during execution?
You should configure the executors' log4j properties to write log files locally on their own nodes. Streaming the logs back to the driver will cause unnecessary latency in processing.
If you plan on being able to "tail" the logs in near real time, you would need to instrument a solution like Splunk or Elasticsearch, and use tools like Splunk Forwarders, Fluentd, or Filebeat, which are agents on each box that watch the configured log paths and push that data to a destination indexer, which then parses and extracts the log fields.
There are also alternatives like StreamSets, NiFi, or Knime (all open source), which offer more instrumentation for collecting event-processing failures and effectively allow for "dead letter queues" to handle errors in a specific way. The part I like about those tools: no programming required.
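As an illustration of the agent approach described above, here is a minimal filebeat.yml sketch; the log paths and the indexer host are placeholders, and the real container-log directory is whatever yarn.nodemanager.log-dirs points to in your YARN config:
filebeat.inputs:
- type: log
  paths:
    # placeholder paths: one glob per YARN container log directory on this node
    - /var/log/hadoop-yarn/containers/*/*/stdout
    - /var/log/hadoop-yarn/containers/*/*/stderr
output.elasticsearch:
  hosts: ["elasticsearch.example.com:9200"]   # placeholder indexer address
A Splunk Forwarder or Fluentd agent would play the same role: watch the per-node log directories and ship new lines to the central indexer as they are written.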
I think it is not possible. When you execute Spark in local mode, you are able to see the logs in the console. Otherwise, you have to alter the log4j properties for the log file path.
As per https://spark.apache.org/docs/preview/running-on-yarn.html#configuration,
YARN has two modes for handling container logs after an application has completed. If log aggregation is turned on (with the yarn.log-aggregation-enable config in yarn-site.xml file), container logs are copied to HDFS and deleted on the local machine.
You can also view the container log files directly in HDFS using the HDFS shell or API. The directory where they are located can be found by looking at your YARN configs (yarn.nodemanager.remote-app-log-dir and yarn.nodemanager.remote-app-log-dir-suffix in yarn-site.xml).
I am not sure whether the log aggregation from worker nodes happens in real time, though.
There is an indirect way to achieve this. Enable the following property in yarn-site.xml:
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
This will store all the logs of submitted applications in an HDFS location. Then, using the following command, you can download the logs into a single aggregated file.
yarn logs -applicationId application_id_example > app_logs.txt
I came across this GitHub repo, which downloads the driver and container logs separately. Clone the repository: https://github.com/hammerlab/yarn-logs-helpers
git clone --recursive https://github.com/hammerlab/yarn-logs-helpers.git
In your .bashrc (or equivalent), source .yarn-logs-helpers.sourceme:
$ source /path/to/repo/.yarn-logs-helpers.sourceme
Then download the aggregated logs, nicely segregated into driver and container logs, with this command:
yarn-container-logs application_example_id

Running a Spring Batch job with partitions in Cloud Foundry

I have created an app with a Spring Batch (partitioned) job, following the example at https://github.com/mminella/S3JDBC. My app reads some files from an object store, does some processing, and writes back to the object store. With local partitioning the app works fine on my machine.
I changed the Maven setup to run in Cloud Foundry, made the changes for the deployer partition handler and step execution listener, and deployed to PCF.
But while trying to push and run the app on PCF, I am getting an issue:
Failing URI /v2/info. I tried to log the error and found that there is one call, e.g. https://mypcf.com:443/v2/info, and after that it gives the error. I can't provide full logs because of some restrictions. So I want to know:
1. To deploy a Spring Batch job on PCF, is there any extra configuration needed besides the Maven dependency below and the code changes for the DeployerPartitionHandler, StepExecutionListener and #cloudtask?
<dependency>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-deployer-cloudfoundry</artifactId>
  <version>1.1.0.M1</version>
</dependency>
2. Is it mandatory to have a separate database service like MySQL for the partitioned job? Can't I use H2 (the default one, if I don't configure anything)?
3. Do I need to do any configuration in PCF to support running multiple partitions?
4. As I am running remote partitioning, can I run the app locally in STS or IntelliJ (not on PCF Dev) so that it runs my app in PCF (remote) and launches the workers? (Sorry for the stupid question, I am new to PCF.)
Thanks for checking out my example. To answer your questions:
1. You should be able to use the latest deployer release (instead of that rather old version).
2. Yes. Partitioned steps all need to be able to share the same job repository data store, so an in-memory database like H2 will not work for that use case.
3. Besides defining your datasource, that's all that is required to live on PCF. That being said, there are other things that need to be configured, but you can use other mechanisms to do so (Spring Cloud Config Server, application.properties/yml, etc.).
4. Yes, you should be able to run the master locally and have it deploy the workers onto PCF if you're using the CF deployer.
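For points 2 and 3, a minimal application.yml sketch pointing the shared job repository at a MySQL instance might look like the following; every value is a placeholder, and on PCF the connection details would normally come from a bound MySQL service instance (exposed via VCAP_SERVICES) rather than being hard-coded:
spring:
  datasource:
    # placeholder values: use the credentials of the MySQL service bound to the app
    url: jdbc:mysql://mysql.example.com:3306/batch_repo
    username: batch_user
    password: ${DB_PASSWORD}
    driver-class-name: com.mysql.cj.jdbc.Driver
Both the master and every worker partition need to point at this same database so they share one job repository.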