Building spark-jobserver Using SBT and Scala

Can anyone suggest better documentation for spark-jobserver? I have gone through the spark-jobserver URL but was unable to follow it. It would be great if someone could explain, step by step, how to use spark-jobserver.
Tools used in building the project:
sbt launcher version 0.13.5
Scala code runner version 2.11.6
With the above-mentioned tools I am getting errors while building spark-jobserver.

The documentation provided in the jobserver repo is indeed confusing.
Here are the steps I followed to manually build and run Spark Job Server on a local machine.
1. git clone https://github.com/spark-jobserver/spark-jobserver
2. sudo mkdir -p /var/log/job-server
3. sudo chown user1:user1 /var/log/job-server
4. cd spark-jobserver
5. sbt job-server/assembly
6. cd config
7. cp local.sh.template abc.sh # Note that the same name 'abc' is used in steps 8 and 10 as well
8. cp ec2.conf.template abc.conf
9. cd .. # The jobserver root dir
10. ./bin/server_package.sh abc # This script copies the files and packages necessary to run the job server into a single dir (default: /tmp/job-server)
11. cd /tmp/job-server # The publish dir from the previous step
12. ./server_start.sh
13. Run ./server_stop.sh to stop the server
Hope this helps
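Once the server is up, you can sanity-check it over HTTP. A minimal sketch, assuming the default port 8090 from the shipped config templates (adjust if your abc.conf overrides it):

# list the jar binaries deployed so far (empty on a fresh install)
curl localhost:8090/jars
# list submitted jobs
curl localhost:8090/jobs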

Here are the steps that I used to install:
Clone the jobserver repo.
Get sbt using wget https://dl.bintray.com/sbt/native-packages/sbt/0.13.8/sbt-0.13.8.tgz
Move "sbt-launch.jar" in sbt/bin to /bin
Create a script /bin/sbt, contents found here, making sure to change the pointer to java if necessary
Make the above script executable
Now cd into the spark jobserver directory, and run sbt publish-local
Assuming the above was successful, run sbt in the same directory
Finally, use the command re-start, and if it succeeds the server is now running!
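The linked script isn't reproduced above, but a typical minimal sbt launcher wrapper from that era looks roughly like this (an assumption based on the standard sbt setup docs, not the exact linked file; it presumes sbt-launch.jar was moved to /bin as in the step above):

#!/bin/bash
# Minimal sbt launcher wrapper; tune the JVM flags and the java path as needed.
SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled"
java $SBT_OPTS -jar /bin/sbt-launch.jar "$@"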

Related

How to fix diesel_cli link libpq.lib error with Postgres tools installed in Docker?

I'm trying (for hours now) to install the cargo crate diesel_cli for postgres. However, every time I run the recommended cargo command:
cargo install diesel_cli --no-default-features --features postgres
I wait a few minutes just to see the same build fail with this message:
note: LINK : fatal error LNK1181: cannot open input file 'libpq.lib'
error: aborting due to previous error
error: failed to compile `diesel_cli v1.4.1`, intermediate artifacts can be found at `C:\Users\<user name here>\AppData\Local\Temp\cargo-installUU2DtT`
Caused by:
could not compile `diesel_cli`.
I'm running postgres in a Docker container and have the binaries in C:\pgsql with the lib and bin directories both on the PATH, so I can't figure out why it's not linking. What else could be required that they didn't mention in the docs?
In my case the installation was successful, but when I tried to run it this error occurred. Maybe this will work for others who have the same problem:
open PowerShell
type in setx PQ_LIB_DIR "C:\Program Files\PostgreSQL\13\lib" (or any other path to your PostgreSQL lib)
restart your PC
run again
I had the same issue with WSL; if you're on Linux, you can probably find the PostgreSQL lib location and add it to your environment variables, as sketched below.
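A sketch of that on a Debian/Ubuntu-based system or WSL distro (assuming libpq-dev is installable and that pg_config reports the right lib dir):

sudo apt install libpq-dev                  # provides libpq and pg_config
export PQ_LIB_DIR="$(pg_config --libdir)"   # point the build at the PostgreSQL libs
cargo install diesel_cli --no-default-features --features postgres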
Update:
The answer below is a workaround for older versions. First check whether running cargo clean fixes the problem.
Original Version
Adding the folder to the PATH variable didn't help, at least in my case, as for some reason it is not used in the /LIBPATH parameter passed to link.exe.
In my case it was C:\Users\<username>\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\lib
You can see it in the beginning of the error message.
Copy libpq.lib in there and it will be used from there.
After installation, diesel will require some other DLLs. Copy libcrypto-1_1-x64.dll, libiconv-2.dll, and libssl-1_1-x64.dll into the folder shown when you run where diesel.
I had the same error on Ubuntu and for me the following install fixed the issue:
sudo apt install libpq-dev
No need to move files around; just add C:\Program Files\PostgreSQL\14\lib and C:\Program Files\PostgreSQL\14\bin to your PATH. Installing and running diesel should then work without problems.
Note: your paths may be different, and remember to close/reopen your terminal so the PATH variable is updated.
(Tested on Windows 10)
To give clear steps for Windows:
Add C:\Users\<username>\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\lib to the PATH environment variable
Copy libpq.lib from C:\Program Files\PostgreSQL\14\lib (obviously this is with version 14) and paste it into C:\Users\<username>\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\lib
If you've attempted cargo build (or anything that runs the build scripts for the libpq Rust crate) while your environment was invalid, then you need to run cargo clean after fixing your environment; otherwise you'll still get the libpq.lib not found error even when it's on your PATH. The other answers where you copy the file into another directory are just hacks.
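For example, after fixing PATH or PQ_LIB_DIR, a clean retry in a project that depends on diesel looks like:

cargo clean   # drop stale build-script output that cached the bad environment
cargo build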
You can instead add the path to .../lib to the compiler's library search paths using the RUSTFLAGS environment variable. It works both for installing diesel_cli and for building your projects.
RUSTFLAGS='-L /usr/local/pgsql/lib' cargo build
On Windows with the EDB installer, the path contains a space, so use CARGO_ENCODED_RUSTFLAGS instead; it separates flags with the 0x1F unit-separator byte, so a path containing spaces survives intact. For PowerShell:
$env:CARGO_ENCODED_RUSTFLAGS = "-L`u{1f}C:\Program Files\PostgreSQL\14\lib"

ERROR: The serve command requires to be run in an Angular project, but a project definition could not be found

I am trying to clone the git repository for Tour of Heros with NgRX (blove/ngrx-tour-of-heros)
However, I cannot seem to run the application.
I have updated my Angular CLI to 7.3. I installed yarn to try to help, and I also tried creating a new application and copying over files that were missing (like node_modules), but I am still getting this error.
How do I get rid of this error so that I can run the application?
Check your build prerequisites, as illustrated by blove/ngrx-tour-of-heros issue 2:
Breaking changes - node => 10
ERROR - "json-server requires at least version 4 of Node, please upgrade"
If your node -v is greater than 10, run npm upgrade json-server.
It appears that node-sass was optional on yarn install.
I had to install it separately: yarn add node-sass
Also, cd client and cd server are reversed in lines 13 and 15.
The Angular project is in the client directory, so after cloning the repo you have to get into the client directory before running the ng serve command:
git clone https://github.com/blove/ngrx-tour-of-heros.git
cd ./client
yarn install
npm run start
Seems like it's an older repo with Angular v5 and CLI v1.6; try downgrading if it doesn't work: https://github.com/blove/ngrx-tour-of-heros/blob/master/client/package.json#L32
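A rough sketch of that downgrade (the versions here are placeholders; take the exact ones from client/package.json):

npm install -g @angular/cli@1.6   # match the CLI version pinned by the repo
npm install
ng serve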

Docker: kafka confluent go client error

I am trying to use Apache Kafka with Go. Things look good when I execute the project with go run, but when I use docker build I get this error:
# pkg-config --cflags rdkafka
Package rdkafka was not found in the pkg-config search path.
Perhaps you should add the directory containing `rdkafka.pc'
to the PKG_CONFIG_PATH environment variable
No package 'rdkafka' found
pkg-config: exit status 1
I installed librdkafka following https://github.com/confluentinc/confluent-kafka-go:
git clone https://github.com/edenhill/librdkafka.git
cd librdkafka
./configure --prefix /usr
make
sudo make install
I tried
PKG_CONFIG_PATH=/usr/lib/pkgconfig
source ~/.bashrc
but no luck. Any help is appreciated.
Probably you should include librdkafka.dll, msvcr120.dll, and zlib.dll in your project root. At least this is what I had to do to get this working on Windows. Not sure about Linux.
The line below inside the Dockerfile worked for me: it sets the environment variable, which persists when a container is run from the resulting image.
ENV PKG_CONFIG_PATH ${PKG_CONFIG_PATH}:/usr/lib/pkgconfig/
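For context, a minimal Dockerfile sketch that sidesteps the problem by installing the distro's librdkafka development package (image tag and package names are assumptions; librdkafka-dev ships the rdkafka.pc file pkg-config looks for):

FROM golang:1.20-bullseye
# librdkafka-dev drops rdkafka.pc into the default pkg-config search path
RUN apt-get update && apt-get install -y --no-install-recommends librdkafka-dev pkg-config
WORKDIR /app
COPY . .
RUN go build -o app .
CMD ["./app"]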

How to build Spark from the sources from the Download Spark page?

I tried to install and build Spark 2.0.0 on an Ubuntu 16.04 VM as follows:
Install Java
sudo apt-add-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
Install Scala
Go to the Downloads tab on the Scala site: scala-lang.org/download/all.html
I used Scala 2.11.8.
sudo mkdir /usr/local/src/scala
sudo tar -xvf scala-2.11.8.tgz -C /usr/local/src/scala/
Modify the .bashrc file and include the path for scala:
export SCALA_HOME=/usr/local/src/scala/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
then type:
. .bashrc
Install git
sudo apt-get install git
Download and build spark
Go to: http://spark.apache.org/downloads.html
Download Spark 2.0.0 (Build from Source - for standalone mode).
tar -xvf spark-2.0.0.tgz
cd into the Spark folder (that has been extracted).
now type:
./build/sbt assembly
After it's done installing, I get the message:
[success] Total time: 1940 s, completed...
followed by the date and time.
Run Spark shell
bin/spark-shell
That's when all hell breaks loose and I start getting the error. I go into the assembly folder to look for a folder called target. But there's no such folder there. The only things visible in assembly are: pom.xml, README, and src.
I looked online for quite a while and haven't been able to find a single concrete solution that would help solve the error. Can someone please provide explicit step-by-step instructions on how to solve this? It's driving me nuts now... (T.T)
For some reason, Scala 2.11.8 does not work well for the build, but if I switch over to Scala 2.10.6 it builds properly. I guess the reason I need Scala in the first place is to get access to sbt to be able to build Spark. Once it's built, I need to go into the spark folder and type:
build/sbt package
This builds the missing JAR files for me using Scala 2.11... kinda weird, but that's how it's working (judging by the logs).
Once Spark builds again, type bin/spark-shell (while in the spark folder) and you'll have access to the Spark shell.
Type sbt package in the Spark directory, not in the build directory.
If your goal is really to build your custom Spark package from the sources you've downloaded from http://spark.apache.org/downloads.html, you should do the following instead:
./build/mvn -Phadoop-2.7,yarn,mesos,hive,hive-thriftserver -DskipTests clean install
You may want to read the official document Building Spark.
NB You don't have to install the Scala and git packages to build Spark, so you could have skipped the "Install Scala" and "Install git" steps.
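Once the Maven build succeeds, a quick smoke test from the Spark root directory could be:

./bin/spark-shell --version    # prints the freshly built version banner
./bin/run-example SparkPi 10   # runs a bundled example locally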

stuck at command "sbt compile" in docker ubuntu

I am trying to include sbt in a Docker image. However, it never works and always gets stuck at Getting org.scala-sbt sbt 0.13.7 ... Changing the sbt version doesn't help either.
Here is the snippet of docker file
FROM ubuntu:14.04
RUN echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
RUN sudo apt-get update
RUN sudo apt-get install sbt # I also tried with --force-yes
I also tried to install it in the container manually using:
wget http://dl.bintray.com/sbt/debian/sbt-0.13.5.deb
sudo apt-get update
sudo dpkg -i sbt-0.13.5.deb
When I run sbt compile, it also gets stuck at Getting org.scala-sbt ...,
but sbt --version works.
Basically, I don't know why sbt gets stuck at Getting org.scala-sbt ...
You will need a Java Virtual Machine for sbt, so I think it's a good thing to start from the official java Docker image. Here is a basic Dockerfile that uses the official Ubuntu installation method:
FROM java
RUN echo "deb http://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list
RUN apt-key update
RUN apt-get update
RUN apt-get -y --force-yes install sbt
NOTE: --force-yes has to be there because the package is not authenticated. I tried adding RUN apt-key update, but that didn't make a difference, so you can omit that line.
Then build a test image (docker build -t test/sbt .), create an interactive container (docker run -i -t test/sbt sbt), and play with it.
This works for me, but I noticed the download times were slow for launching SBT, so be patient at this step.
This is because the sbt executable itself is really light and fetches a bunch of libraries on the first run to accomplish its task. It's also how sbt supports multiple projects using multiple sbt versions. If you are stuck at library resolution, check your networking configuration. sbt errors are mostly printed on the command line, but you can configure logging if you want.
What's left for you to figure out is to add your project files and issue a compile command to test it; a sketch follows.
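For example, extending the test image built above with a project and compiling it at build time (a sketch; the paths are placeholders):

FROM test/sbt
WORKDIR /app
COPY . /app
# the first run fetches the sbt/Scala toolchain and project dependencies
RUN sbt compile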
sbt will try to download a higher version of itself if the project requires a higher version than the one currently installed. Usually there is a {projectFolder}/project/build.properties file that specifies the desired sbt version for the project; for example, sbt.version=0.13.7 requires version 0.13.7.
You seem to get stuck at Getting org.scala-sbt sbt 0.13.7 ..., but I believe sbt is actually trying to download sbt 0.13.7 to your local machine. As the package is not small, it may take a while depending on your network speed.
It is also possible that a network connectivity issue is preventing sbt from downloading the package, so first verify that your network connectivity is not the problem.
If your network is fine, another approach is to download the 0.13.7 package manually from the sbt site into your Docker container and install it there, following the instructions on the sbt site. A sketch of this is below.
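A sketch of that manual route inside the container, mirroring the .deb approach quoted in the question (the URL pattern follows the same Bintray repo used above and may need updating):

wget http://dl.bintray.com/sbt/debian/sbt-0.13.7.deb
sudo dpkg -i sbt-0.13.7.deb
sbt --version   # confirm the install before retrying sbt compile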
Hope this helps.
Sometimes sbt gets stuck when downloading files. You can periodically check the size of the ~/.ivy2 folder; if it isn't growing, kill the sbt process and rerun sbt.
For me, sbt only downloaded all the files after five restarts!