Requester cannot establish the connection: Jetty, Lift/Scala, iSeries DB2/400

I'm working my way through the Lift Application Development Cookbook by Gilberto T. Garcia Jr. and have run up against a problem I can't seem to resolve. I've copied the source code from Chap06-map-table and I'm trying to modify it to work with my IBM i (iSeries, AS/400, i5) database. I was able to make it work with the first type of connection using Squeryl Record. However, I can't figure out how to get this to work using a JNDI DataSource. I've spent a couple of days searching the internet for examples of setting this up and have not found a good example involving a DB2/400 database connection. Below is the error I get when I attempt to start the container, along with the code I've modified in an effort to make it work. Any help would be appreciated.
There seem to be several choices for the data source class in jt400.jar (JTOpen), and I'm not sure which is the best to use, or whether there's another one entirely. I've been trying each of the three below and am assuming the first is the correct one (a minimal standalone connection sketch follows the list).
com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource
com.ibm.as400.access.AS400JDBCConnectionPoolDataSource
com.ibm.as400.access.AS400JDBCDataSource
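One way to tell whether the problem is in the JNDI wiring or in the connection properties themselves is to build the data source directly and open a connection outside of Jetty. A minimal, untested sketch, assuming the managed pool data source exposes the usual bean setters (serverName, user, password, libraries) and using placeholder values:
import com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource

object As400SmokeTest extends App {
  val ds = new AS400JDBCManagedConnectionPoolDataSource()
  ds.setServerName("www.[server].com") // host name only, not a full jdbc:as400:// URL
  ds.setLibraries("PLAY2TEST")         // assumes a libraries property, mirroring libraries= in the URL
  ds.setUser("[user]")
  ds.setPassword("[password]")
  val conn = ds.getConnection()
  try println("Connected: " + conn.getMetaData.getDatabaseProductVersion)
  finally conn.close()
}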
Thanks. Bob
This is the start of the error:
> container:start
[info] jetty-8.0.4.v20111024
[info] No Transaction manager found - if your webapp requires one, please configure one.
[info] NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet
[info] started o.e.j.w.WebAppContext{/,[file:/C:/Users/Bob/Lift26Projects/scala_210/chap06-map-table/src/main/webapp/]}
[info] started o.e.j.w.WebAppContext{/,[file:/C:/Users/Bob/Lift26Projects/scala_210/chap06-map-table/src/main/webapp/]}
18:21:47.062 [pool-7-thread-1] ERROR n.liftweb.http.provider.HTTPProvider - Failed to Boot! Your application may not run properly
java.sql.SQLException: The application requester cannot establish the connection. ("jdbc:as400://www.busapp.com;libraries=PLAY2TEST";naming=system;errors=full;)
    at com.ibm.as400.access.JDError.throwSQLException(JDError.java:524) ~[jt400-6.7.jar:JTOpen 6.7]
    at com.ibm.as400.access.AS400JDBCConnection.setProperties(AS400JDBCConnection.java:3142) ~[jt400-6.7.jar:JTOpen 6.7]
    at com.ibm.as400.access.AS400JDBCManagedDataSource.createPhysicalConnect...
My build.sbt file:
name := "Lift 2.5 starter template"
version := "0.0.1"
organization := "net.liftweb"
scalaVersion := "2.10.0"
resolvers ++= Seq("snapshots" at "http://oss.sonatype.org/content/repositories/snapshots",
"staging" at "http://oss.sonatype.org/content/repositories/staging",
"releases" at "http://oss.sonatype.org/content/repositories/releases"
)
seq(com.github.siasia.WebPlugin.webSettings :_*)
unmanagedResourceDirectories in Test <+= (baseDirectory) { _ / "src/main/webapp" }
scalacOptions ++= Seq("-deprecation", "-unchecked")
env in Compile := Some(file("./src/main/webapp/WEB-INF/jetty-env.xml") asFile)
libraryDependencies ++= {
val liftVersion = "2.5"
Seq(
"net.liftweb" %% "lift-webkit" % liftVersion % "compile",
"net.liftmodules" %% "lift-jquery-module_2.5" % "2.3",
"org.eclipse.jetty" % "jetty-webapp" % "8.0.4.v20111024" % "container",
"org.eclipse.jetty" % "jetty-plus" % "8.0.4.v20111024" % "container",
"ch.qos.logback" % "logback-classic" % "1.0.6",
"org.specs2" %% "specs2" % "1.14" % "test",
"net.liftweb" %% "lift-squeryl-record" % liftVersion % "compile",
"net.sf.jt400" % "jt400" % "6.7",
"org.liquibase" % "liquibase-maven-plugin" % "3.0.2"
)
}
This is my boot.scala file:
package bootstrap.liftweb
import _root_.liquibase.database.DatabaseFactory
import _root_.liquibase.database.jvm.JdbcConnection
import _root_.liquibase.exception.DatabaseException
import _root_.liquibase.Liquibase
import _root_.liquibase.resource.FileSystemResourceAccessor
import net.liftweb._
import util._
import Helpers._
import common._
import http._
import sitemap._
import Loc._
import net.liftmodules.JQueryModule
import net.liftweb.http.js.jquery._
import net.liftweb.squerylrecord.SquerylRecord
import org.squeryl.Session
import java.sql.{SQLException, DriverManager}
import org.squeryl.adapters.DB2Adapter
import javax.naming.InitialContext
import javax.sql.DataSource
import code.model.LiftBookSchema
/**
* A class that's instantiated early and run. It allows the application
* to modify lift's environment
*/
class Boot {
def runChangeLog(ds: DataSource) {
val connection = ds.getConnection
try {
val database = DatabaseFactory.getInstance().
findCorrectDatabaseImplementation(new JdbcConnection(connection))
val liquibase = new Liquibase(
"database/changelog/db.changelog-master.xml",
new FileSystemResourceAccessor(),
database
)
liquibase.update(null)
} catch {
case e: SQLException => {
connection.rollback()
throw new DatabaseException(e)
}
}
}
def boot {
// where to search snippet
LiftRules.addToPackages("code")
prepareDb()
// Build SiteMap
val entries = List(
Menu.i("Home") / "index", // the simple way to declare a menu
// more complex because this menu allows anything in the
// /static path to be visible
Menu(Loc("Static", Link(List("static"), true, "/static/index"),
"Static Content")))
// set the sitemap. Note if you don't want access control for
// each page, just comment this line out.
LiftRules.setSiteMap(SiteMap(entries: _*))
//Show the spinny image when an Ajax call starts
LiftRules.ajaxStart =
Full(() => LiftRules.jsArtifacts.show("ajax-loader").cmd)
// Make the spinny image go away when it ends
LiftRules.ajaxEnd =
Full(() => LiftRules.jsArtifacts.hide("ajax-loader").cmd)
// Force the request to be UTF-8
LiftRules.early.append(_.setCharacterEncoding("UTF-8"))
// Use HTML5 for rendering
LiftRules.htmlProperties.default.set((r: Req) =>
new Html5Properties(r.userAgent))
//Init the jQuery module, see http://liftweb.net/jquery for more information.
LiftRules.jsArtifacts = JQueryArtifacts
JQueryModule.InitParam.JQuery = JQueryModule.JQuery172
JQueryModule.init()
}
def prepareDb() {
Class.forName("com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource")
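// Note: Class.forName is only needed for DriverManager-style JDBC drivers; with a
// JNDI DataSource lookup like the one below it has no effect and could be removed.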
val ds = new InitialContext().lookup("java:/comp/env/jdbc/dsliftbook").asInstanceOf[DataSource]
runChangeLog(ds)
SquerylRecord.initWithSquerylSession(
Session.create(
ds.getConnection,
new DB2Adapter)
)
}
}
This is my jetty-env.xml file:
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
<New id="dsliftbook" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg></Arg>
<Arg>jdbc/dsliftbook</Arg>
<Arg>
<New class="com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource">
<Set name="serverName">"jdbc:as400://www.[server].com;libraries=PLAY2TEST";naming=system;errors=full;</Set>
<Set name="user">[user]</Set>
<Set name="password">[password]</Set>
</New>
</Arg>
</New>
</Configure>

Okay, I've managed to get connected. One problem was the quotation marks in the jetty-env.xml file. Also, the user name/password I was using apparently did not have the authority required to make this work. I'm not sure why, since this is the same ID/password I use for all my iSeries development. So for now, I'm using another user profile with security officer authority until I can figure out what's happening or what authorities are required.
Once I got signed on, I was not able to set a library list for the user, and this was causing the SQL to fail: it was looking for a library with the same name as the user ID. For the time being, I've gotten around this by creating a new library named after the user ID.
One other problem is that even though I'm supplying both the ID and password, I'm still prompted for them before it will connect. The ID and URL are pre-filled, but the password always has to be re-keyed.
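Two possible follow-ups, as untested sketches: JTOpen's data sources generally expose the JDBC driver properties as bean properties, so assuming AS400JDBCManagedConnectionPoolDataSource supports the libraries and prompt properties, the library-list default and the GUI sign-on prompt might both be addressable directly in jetty-env.xml:
<New class="com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource">
  <Set name="serverName">www.[server].com</Set>
  <Set name="user">[user]</Set>
  <Set name="password">[password]</Set>
  <!-- assumptions: explicit library list and suppression of the GUI sign-on prompt -->
  <Set name="libraries">PLAY2TEST</Set>
  <Set name="prompt">false</Set>
</New>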
I've included the current source for the jetty-env.xml and boot.scala files below; hopefully this will help others.
Thanks to Dave and James for their help!
Bob
boot.scala:
package bootstrap.liftweb
// import _root_.liquibase.database.DatabaseFactory
// import _root_.liquibase.database.jvm.JdbcConnection
// import _root_.liquibase.exception.DatabaseException
// import _root_.liquibase.Liquibase
// import _root_.liquibase.resource.FileSystemResourceAccessor
import net.liftweb._
import util._
import Helpers._
import common._
import http._
import sitemap._
import Loc._
import net.liftmodules.JQueryModule
import net.liftweb.http.js.jquery._
import net.liftweb.squerylrecord.SquerylRecord
import org.squeryl.Session
import java.sql.{SQLException, DriverManager}
import org.squeryl.adapters.DB2Adapter
import javax.naming.InitialContext
import javax.sql.DataSource
import code.model.LiftBookSchema
import com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource
/**
* A class that's instantiated early and run. It allows the application
* to modify lift's environment
*/
class Boot {
// def runChangeLog(ds: DataSource) {
// val connection = ds.getConnection
// try {
// val database = DatabaseFactory.getInstance().
// findCorrectDatabaseImplementation(new JdbcConnection(connection))
// val liquibase = new Liquibase(
// "database/changelog/db.changelog-master.xml",
// new FileSystemResourceAccessor(),
// database
// )
// liquibase.update(null)
// } catch {
// case e: SQLException => {
// connection.rollback()
// throw new DatabaseException(e)
// }
// }
// }
def boot {
// where to search snippet
LiftRules.addToPackages("code")
prepareDb()
// Build SiteMap
val entries = List(
Menu.i("Home") / "index", // the simple way to declare a menu
// more complex because this menu allows anything in the
// /static path to be visible
Menu(Loc("Static", Link(List("static"), true, "/static/index"),
"Static Content")))
// set the sitemap. Note if you don't want access control for
// each page, just comment this line out.
LiftRules.setSiteMap(SiteMap(entries: _*))
//Show the spinny image when an Ajax call starts
LiftRules.ajaxStart =
Full(() => LiftRules.jsArtifacts.show("ajax-loader").cmd)
// Make the spinny image go away when it ends
LiftRules.ajaxEnd =
Full(() => LiftRules.jsArtifacts.hide("ajax-loader").cmd)
// Force the request to be UTF-8
LiftRules.early.append(_.setCharacterEncoding("UTF-8"))
// Use HTML5 for rendering
LiftRules.htmlProperties.default.set((r: Req) =>
new Html5Properties(r.userAgent))
//Init the jQuery module, see http://liftweb.net/jquery for more information.
LiftRules.jsArtifacts = JQueryArtifacts
JQueryModule.InitParam.JQuery = JQueryModule.JQuery172
JQueryModule.init()
}
def prepareDb() {
Class.forName("com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource")
val ds = new InitialContext().lookup("java:/comp/env/jdbc/dsliftbook").asInstanceOf[DataSource]
// runChangeLog(ds)
SquerylRecord.initWithSquerylSession(Session.create(ds.getConnection, new DB2Adapter)
)
}
}
jetty-env.xml:
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
<New id="dsliftbook" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg></Arg>
<Arg>jdbc/dsliftbook</Arg>
<Arg>
<New class="com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource">
<Set name="serverName">www.[server].com</Set>
<Set name="user">DBUSER</Set>
<Set name="password">DBUSER</Set>
</New>
</Arg>
</New>
</Configure>

Related

Connecting to HBase using Scala in the Play framework

Hi, I am trying to connect to HBase from a Scala application in the Play framework. I am following this link to establish the connection. My application is not running properly. I access HBase remotely through PuTTY, and the Play application is on my local Windows machine. Where and how do I specify the HBase server connection details in the application?
conf/application.conf:
# This is the main configuration file for the application.
# ~~~~~
# Secret key
# ~~~~~
# The secret key is used to secure cryptographics functions.
# If you deploy your application to several instances be sure to use the same key!
application.secret="?#3Y^s/S>oCNuO7If3Mq8]U285PqOG[bh/;^WVjZ#p5=`KljrbDrg4tBG6clCPuN"
# The application languages
# ~~~~~
application.langs="en"
# Global object class
# ~~~~~
# Define the Global object class for this application.
# Default to Global in the root package.
# application.global=Global
# Router
# ~~~~~
# Define the Router object to use for this application.
# This router will be looked up first when the application is starting up,
# so make sure this is the entry point.
# Furthermore, it's assumed your route file is named properly.
# So for an application router like `conf/my.application.Router`,
# you may need to define a router file `my.application.routes`.
# Default to Routes in the root package (and `conf/routes`)
# application.router=my.application.Routes
# Database configuration
# ~~~~~
# You can declare as many datasources as you want.
# By convention, the default datasource is named `default`
#
# db.default.driver=org.h2.Driver
# db.default.url="jdbc:h2:mem:play"
# db.default.user=sa
# db.default.password=""
#
# You can expose this datasource via JNDI if needed (Useful for JPA)
# db.default.jndiName=DefaultDS
# Evolutions
# ~~~~~
# You can disable evolutions if needed
# evolutionplugin=disabled
# Ebean configuration
# ~~~~~
# You can declare as many Ebean servers as you want.
# By convention, the default server is named `default`
#
# ebean.default="models.*"
# Logger
# ~~~~~
# You can also configure logback (http://logback.qos.ch/), by providing a logger.xml file in the conf directory .
# Root logger:
logger.root=ERROR
# Logger used by the framework:
logger.play=INFO
# Logger provided to your application:
logger.application=DEBUG
ERROR:
http://localhost:9000 gives me the web page with one form and an Add button. When I click the Add button, it redirects me to the http://localhost:9000/bars URL and shows the error below on the web page itself:
Bad request
For request 'POST /bars' [Expecting xml body]
There is no error log on the console.
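For context, the "[Expecting xml body]" message is the action's body parser rejecting the request before the action body runs, which matches the println output further down (only index ever prints). With Action(parse.json) as shown below, a plain HTML form post (application/x-www-form-urlencoded) would be rejected the same way; presumably the template's index.min.js is meant to intercept the form submit and re-post it as JSON. A hedged sketch for isolating this (the debug action and curl call are illustrative, not part of the project):
// Posting JSON directly should reach addBar:
//   curl -X POST -H "Content-Type: application/json" -d '{"name":"test"}' http://localhost:9000/bars
// A temporarily permissive parser can confirm what the browser actually sends:
def addBarDebug() = Action(parse.tolerantText) { request =>
  Logger.info(s"content-type=${request.contentType} body=${request.body}")
  Ok
}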
My \play-hbase\app\controllers\Application.scala looks like this:
package controllers
import play.api.mvc.{Action, Controller}
import org.apache.hadoop.hbase.{HColumnDescriptor, HTableDescriptor, HBaseConfiguration}
import org.apache.hadoop.hbase.client._
import org.apache.hadoop.hbase.util.Bytes
import play.api.Logger
import play.api.libs.json.Json
import java.util.UUID
import scala.collection.JavaConversions._
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.client.{ConnectionFactory,HTable,Put,HBaseAdmin}
object Application extends Controller {
val barsTableName = "bars"
val family = Bytes.toBytes("all")
val qualifier = Bytes.toBytes("json")
lazy val hbaseConfig = {
println("Hi .... hbaseConfig ... START")
val conf:Configuration = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "xxx.xxx.xxx.xxx") // xxx.xxx.xxx.xxx IP address of my Cloudera virtual machine.
conf.set("hbase.zookeeper.property.clientPort", "2181")
val hbaseAdmin = new HBaseAdmin(conf)
// create a table in HBase if it doesn't exist
if (!hbaseAdmin.tableExists(barsTableName)) {
val desc = new HTableDescriptor(barsTableName)
desc.addFamily(new HColumnDescriptor(family))
hbaseAdmin.createTable(desc)
Logger.info("bars table created")
}
// return the HBase config
println("Hi .... hbaseConfig ... END")
conf
}
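// Note: the HBaseAdmin created above is never closed; it implements Closeable, so a
// try/finally with hbaseAdmin.close() would release it once the table check is done.
// In the HBase 1.x client API an Admin is usually obtained via
// ConnectionFactory.createConnection(conf).getAdmin instead.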
def index = Action {
// return the server-side generated webpage from app/views/index.scala.html
println("Hi .... index ... START")
Ok(views.html.index("Play Framework + HBase"))
}
def addBar() = Action(parse.json) { request =>
// create a new row in the table that contains the JSON sent from the client
println("Hi .... addBar ... START")
val table = new HTable(hbaseConfig, barsTableName)
val put1 = new Put(Bytes.toBytes(UUID.randomUUID().toString))
put1.add(family, qualifier, Bytes.toBytes(request.body.toString()))
table.put(put1)
table.close()
println("Hi .... addBar ... END")
Ok
}
def getBars = Action {
// query the table and return a JSON list of the bars in the table
val table = new HTable(hbaseConfig, barsTableName)
val scan = new Scan()
scan.addColumn(family, qualifier)
val scanner = table.getScanner(scan)
val results = try {
scanner.toList.map {result =>
Json.parse(result.getValue(family, qualifier))
}
} finally {
scanner.close()
table.close()
}
Ok(Json.toJson(results))
}
}
My \play-hbase\conf\routes file looks like this:
# Routes
# This file defines all application routes (Higher priority routes first)
# ~~~~
GET / controllers.Application.index
GET /bars controllers.Application.getBars
POST /bars controllers.Application.addBar
# Map static resources from the /public folder to the /assets URL path
GET /assets/*file controllers.Assets.at(path="/public", file)
GET /webjars/*file controllers.WebJarAssets.at(file)
I added println() statements in my Application.scala file to check the flow. It just prints:
Hi .... index ... START
Hi .... index ... START
My \play-hbase\app\views\index.scala.html file looks like this:
@(title: String)
<!DOCTYPE html>
<html>
<head>
<title>@title</title>
<link rel='shortcut icon' type='image/png' href='@routes.Assets.at("images/favicon.png")'>
<link rel='stylesheet' href='@routes.WebJarAssets.at(WebJarAssets.locate("bootstrap.min.css"))'>
<link rel='stylesheet' href='@routes.Assets.at("stylesheets/index.css")'>
<script type='text/javascript' src='@routes.WebJarAssets.at(WebJarAssets.locate("jquery.min.js"))'></script>
<script type='text/javascript' src='@routes.Assets.at("javascripts/index.min.js")'></script>
</head>
<body>
<div class="navbar navbar-fixed-top">
<div class="navbar-inner">
<div class="container-fluid">
<a id="titleLink" class="brand" href="/">#title</a>
</div>
</div>
</div>
<div class="container">
<div class="well">
<h3>Bars</h3>
<ul id="bars"></ul>
<hr>
<h3>Add a Bar</h3>
<form id="addBar" action="#routes.Application.addBar()" method="post">
<input id="barName" placeholder="Name">
<button>Add Bar</button>
</form>
</div>
</div>
</body>
</html>
My \play-hbase\build.sbt looks like this:
name := "play-hbase"
version := "1.0-SNAPSHOT"
libraryDependencies ++= Seq(
// Select Play modules
//jdbc, // The JDBC connection pool and the play.api.db API
//anorm, // Scala RDBMS Library
//javaJdbc, // Java database API
//javaEbean, // Java Ebean plugin
//javaJpa, // Java JPA plugin
//filters, // A set of built-in filters
//javaCore, // The core Java API
// WebJars pull in client-side web libraries
"org.webjars" %% "webjars-play" % "2.2.0-RC1-1",
"org.webjars" % "bootstrap" % "2.3.1",
// HBase
//"org.apache.hadoop" % "hadoop-core" % "1.2.1",
//"org.apache.hbase" % "hbase" % "0.94.11",
"org.apache.hadoop" % "hadoop-common" % "2.6.0",
"org.apache.hadoop" % "hadoop-client" % "2.6.0",
"org.apache.hbase" % "hbase" % "1.2.0",
"org.apache.hbase" % "hbase-client" % "1.2.0",
"org.apache.hbase" % "hbase-common" % "1.2.0",
"org.apache.hbase" % "hbase-server" % "1.2.0",
"org.slf4j" % "slf4j-log4j12" % "1.7.5"
// Add your own project dependencies in the form:
// "group" % "artifact" % "version"
)
play.Project.playScalaSettings
My \play-hbase\project\plugins.sbt looks like this:
// Comment to get more information during initialization
logLevel := Level.Warn
// The Typesafe repository
resolvers += "Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/"
// Use the Play sbt plugin for Play projects
addSbtPlugin("com.typesafe.play" % "sbt-plugin" % "2.2.0-RC1")
My \play-hbase\project\build.properties looks like this:
#Activator-generated Properties
#Sat Oct 22 14:55:10 UTC 2016
template.uuid=148fc4a0-928a-42a0-81c8-98d83d1a656d
sbt.version=0.13.0
Thanks.

Access a publicly available Amazon S3 file from Apache Spark

I have a publicly available Amazon S3 resource (a text file) and want to access it from Spark. That means I don't have any Amazon credentials. It works fine if I just want to download it:
val bucket = "<my-bucket>"
val key = "<my-key>"
val client = new AmazonS3Client
val o = client.getObject(bucket, key)
val content = o.getObjectContent // <= can be read and used as input stream
However, when I try to access the same resource from the Spark context:
val conf = new SparkConf().setAppName("app").setMaster("local")
val sc = new SparkContext(conf)
val f = sc.textFile(s"s3a://$bucket/$key")
println(f.count())
I receive the following error with stacktrace:
Exception in thread "main" com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3521)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:221)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1781)
at org.apache.spark.rdd.RDD.count(RDD.scala:1099)
at com.example.Main$.main(Main.scala:14)
at com.example.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
I don't want to provide any AWS credentials; I just want to access the resource anonymously (for now). How can I achieve this? I probably need to make it use something like AnonymousAWSCredentialsProvider, but how do I plug that into Spark or Hadoop?
P.S. My build.sbt, just in case:
scalaVersion := "2.11.7"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.4.1",
"org.apache.hadoop" % "hadoop-aws" % "2.7.1"
)
UPDATED: After some investigation, I see the reason why it isn't working.
First of all, S3AFileSystem creates the AWS client with the following order of credential providers:
AWSCredentialsProviderChain credentials = new AWSCredentialsProviderChain(
new BasicAWSCredentialsProvider(accessKey, secretKey),
new InstanceProfileCredentialsProvider(),
new AnonymousAWSCredentialsProvider()
);
"accessKey" and "secretKey" values are taken from the spark conf instance (keys must be "fs.s3a.access.key" and "fs.s3a.secret.key" or org.apache.hadoop.fs.s3a.Constants.ACCESS_KEY and org.apache.hadoop.fs.s3a.Constants.SECRET_KEY constants, which is more convenient).
Second, you probably see that AnonymousAWSCredentialsProvider is the third option (last priority). What could possibly be wrong with that? See the implementation of AnonymousAWSCredentials:
public class AnonymousAWSCredentials implements AWSCredentials {
public String getAWSAccessKeyId() {
return null;
}
public String getAWSSecretKey() {
return null;
}
}
It simply returns null for both access key and secret key. Sounds reasonable. But look inside AWSCredentialsProviderChain:
AWSCredentials credentials = provider.getCredentials();
if (credentials.getAWSAccessKeyId() != null &&
credentials.getAWSSecretKey() != null) {
log.debug("Loading credentials from " + provider.toString());
lastUsedProvider = provider;
return credentials;
}
It doesn't choose a provider when both keys are null, which means anonymous credentials can't work. Looks like a bug inside aws-java-sdk-1.7.4. I tried to use the latest version, but it's incompatible with hadoop-aws-2.7.1.
Any other ideas?
I personally have never accessed public data from Spark. You can try using dummy credentials, or create some just for this purpose. Set them directly on the SparkConf object:
val sparkConf: SparkConf = ???
val accessKeyId: String = ???
val secretAccessKey: String = ???
sparkConf.set("spark.hadoop.fs.s3.awsAccessKeyId", accessKeyId)
sparkConf.set("spark.hadoop.fs.s3n.awsAccessKeyId", accessKeyId)
sparkConf.set("spark.hadoop.fs.s3.awsSecretAccessKey", secretAccessKey)
sparkConf.set("spark.hadoop.fs.s3n.awsSecretAccessKey", secretAccessKey)
As an alternative, read the documentation of DefaultAWSCredentialsProviderChain to see where the credentials are looked for. The list (order is important) is:
Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_KEY
Java System Properties - aws.accessKeyId and aws.secretKey
Credential profiles file at the default location (~/.aws/credentials) shared by all AWS SDKs and the AWS CLI
Instance profile credentials delivered through the Amazon EC2 metadata service
This is what helped me:
val session = SparkSession.builder()
.appName("App")
.master("local[*]")
.config("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider")
.getOrCreate()
val df = session.read.csv(filesFromS3:_*)
Versions:
"org.apache.spark" %% "spark-sql" % "2.4.0",
"org.apache.hadoop" % "hadoop-aws" % "2.8.5",
Documentation:
https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Authentication_properties
It seems you can now use the fs.s3a.aws.credentials.provider config key to get anonymous access via org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, which correctly special-cases the anonymous provider. However, you need a hadoop-aws newer than 2.7, which means you also need a Spark installation without a bundled Hadoop.
Here is how I did it in Colab:
!apt-get install openjdk-8-jdk-headless -qq > /dev/null
!wget -q http://apache.osuosl.org/spark/spark-2.3.1/spark-2.3.1-bin-without-hadoop.tgz
!tar xf spark-2.3.1-bin-without-hadoop.tgz
!pip install -q findspark
!pip install -q pyarrow
Now we install Hadoop on the side and set SPARK_DIST_CLASSPATH to the output of hadoop classpath, so Spark can see it.
import os
!wget -q http://mirror.nbtelecom.com.br/apache/hadoop/common/hadoop-2.8.4/hadoop-2.8.4.tar.gz
!tar xf hadoop-2.8.4.tar.gz
os.environ['HADOOP_HOME']= '/content/hadoop-2.8.4'
os.environ["SPARK_DIST_CLASSPATH"] = "/content/hadoop-2.8.4/etc/hadoop:/content/hadoop-2.8.4/share/hadoop/common/lib/*:/content/hadoop-2.8.4/share/hadoop/common/*:/content/hadoop-2.8.4/share/hadoop/hdfs:/content/hadoop-2.8.4/share/hadoop/hdfs/lib/*:/content/hadoop-2.8.4/share/hadoop/hdfs/*:/content/hadoop-2.8.4/share/hadoop/yarn/lib/*:/content/hadoop-2.8.4/share/hadoop/yarn/*:/content/hadoop-2.8.4/share/hadoop/mapreduce/lib/*:/content/hadoop-2.8.4/share/hadoop/mapreduce/*:/content/hadoop-2.8.4/contrib/capacity-scheduler/*.jar"
Then we proceed as in https://mikestaszel.com/2018/03/07/apache-spark-on-google-colaboratory/, but add s3a and anonymous read support, which is what the question is about.
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["SPARK_HOME"] = "/content/spark-2.3.1-bin-without-hadoop"
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.amazonaws:aws-java-sdk:1.10.6,org.apache.hadoop:hadoop-aws:2.8.4 --conf spark.sql.execution.arrow.enabled=true --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider pyspark-shell'
And finally we can create the session.
import findspark
findspark.init()
from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local[*]").getOrCreate()

ElasticSearch import issue

I'm trying to create a client for ES in Scala [school project],
but when I try to import Elasticsearch I get some problems.
I've written an sbt file:
libraryDependencies += "org.elasticsearch" %% "elasticsearch" % "1.4.2"
libraryDependencies += "org.apache.lucene" % "lucene-core" % "4.10.2"
along with the other Lucene dependencies.
And when I try to use it:
import org.elasticsearch.node.Nodebuilder.*
object Setup {
Node node = nodeBuilder().node();
Client client = node.client();
}
It recognizes org.elasticsearch.node but not .Nodebuilder.
Does anyone have an idea?
Solved:
import org.elasticsearch.node.NodeBuilder.nodeBuilder
val node = nodeBuilder().node()
val client = node.client()
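For completeness: instead of starting an embedded node, the ES 1.x Java API can also connect to a running cluster over the transport port. A minimal sketch, assuming a cluster reachable on localhost:9300 with the default cluster name:
import org.elasticsearch.client.transport.TransportClient
import org.elasticsearch.common.transport.InetSocketTransportAddress

// Host and port are placeholders; a non-default cluster.name would need to be
// passed in via settings when constructing the client.
val client = new TransportClient()
  .addTransportAddress(new InetSocketTransportAddress("localhost", 9300))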
I would suggest you use the following library: https://github.com/sksamuel/elastic4s

NoClassDefFoundError scala/reflect/ClassManifest using Play 2 bars tutorial

I'm trying to build the bars tutorial project from "Getting Started with Play 2, Scala, and Squeryl" by James Ward and Ryan Knight (July 11, 2012).
I can get it to work okay until I enter the URL for the bars list (//localhost:9000/bars). At that point I receive the following error in my browser: "Execution Exception [RuntimeException: java.lang.NoClassDefFoundError: scala/reflect/ClassManifest]". If I refresh my browser I then get the following error: "[RuntimeException: java.lang.NoClassDefFoundError: Could not initialize class com.codahale.jerkson.Json$]".
I've tried creating this project using Activator, Command Line Play and Eclipse.
I know there are differences in the versions of the dependencies, Scala, and Play 2 between the tutorial and what I'm using, and that may be causing my problem. But I don't know what that might be. I'm assuming that I'm missing something to do with scala/reflect/ClassManifest. But I do have a dependency included for "org.scala-lang" % "scala-reflect" % "2.10.3".
I'm running Play 2.2.1 built with Scala 2.10.3, on Java 1.7.0.
I've tried to include all the relevant code below.
Any help would be appreciated. Thanks
build.sbt
import AddSettings._

name := "mysquerylapp"

version := "1.0-SNAPSHOT"

libraryDependencies ++= Seq(
  "org.webjars" %% "webjars-play" % "2.2.0",
  "org.webjars" % "bootstrap" % "2.3.1",
  "org.scala-lang" % "scala-reflect" % "2.10.3",
  "org.scalatest" % "scalatest_2.10" % "2.0.M8",
  "com.codahale" % "jerkson_2.9.1" % "0.5.0",
  "org.squeryl" % "squeryl_2.10" % "0.9.5-6",
  "postgresql" % "postgresql" % "9.1-901-1.jdbc4",
  jdbc, anorm, cache
)

lazy val main = Project(id="main", base=file("."))
  .settings(play.Project.playScalaSettings:_*)
  .autoSettings(userSettings, allPlugins, defaultSbtFiles)

//play.Project.playScalaSettings
application.conf
# This is the main configuration file for the application.
# ~~~~~

# Secret key
# ~~~~~
# The secret key is used to secure cryptographics functions.
# If you deploy your application to several instances be sure to use the same key!
application.secret="GZXuoBh0fnkNJh8B__1IsG5_YIi/Fw:W?=8[WB^k?=F?gR?X^oQ1J]fTqJB<ZZi1"

# The application languages
# ~~~~~
application.langs="en"

# Global object class
# ~~~~~
# Define the Global object class for this application.
# Default to Global in the root package.
# application.global=Global

# Router
# ~~~~~
# Define the Router object to use for this application.
# This router will be looked up first when the application is starting up,
# so make sure this is the entry point.
# Furthermore, it's assumed your route file is named properly.
# So for an application router like `my.application.Router`,
# you may need to define a router file `conf/my.application.routes`.
# Default to Routes in the root package (and conf/routes)
# application.router=my.application.Routes

# Database configuration
# ~~~~~
# You can declare as many datasources as you want.
# By convention, the default datasource is named `default`
# db.default.driver=org.h2.Driver
db.default.url="jdbc:h2:mem:play"
# db.default.user=sa
# db.default.password=""

# Evolutions
# ~~~~~
# You can disable evolutions if needed
# evolutionplugin=disabled

# Logger
# ~~~~~
# You can also configure logback (http://logback.qos.ch/),
# by providing an application-logger.xml file in the conf directory.

# Root logger:
logger.root=ERROR

# Logger used by the framework:
logger.play=INFO

# Logger provided to your application:
logger.application=DEBUG
routes
# Routes
# This file defines all application routes (Higher priority routes first)
# ~~~~
# Home page
GET / controllers.Application.index
POST /bars controllers.Application.addBar
GET /bars controllers.Application.getBars
# Map static resources from the /public folder to the /assets URL path
GET /assets/*file controllers.Assets.at(path="/public", file)
1.sql
# --- First database schema
# --- !Ups
create sequence s_bar_id;
create table bar (
id bigint DEFAULT nextval('s_bar_id'),
name varchar(128)
);
# --- !Downs
drop table bar;
drop sequence s_bar_id;
Global.scala
import org.squeryl.adapters.{H2Adapter, PostgreSqlAdapter}
import org.squeryl.internals.DatabaseAdapter
import org.squeryl.{Session, SessionFactory}
import play.api.db.DB
import play.api.GlobalSettings
import play.api.Application
object Global extends GlobalSettings {
override def onStart(app: Application) {
SessionFactory.concreteFactory = app.configuration.getString("db.default.driver") match {
case Some("org.h2.Driver") => Some(() => getSession(new H2Adapter, app))
case Some("org.postgresql.Driver") => Some(() => getSession(new PostgreSqlAdapter, app))
case _ => sys.error("Database driver must be either org.h2.Driver or org.postgresql.Driver")
}
}
def getSession(adapter:DatabaseAdapter, app: Application) = Session.create(DB.getConnection()(app), adapter)
}
index.scala.html
@(form: play.api.data.Form[Bar])
@main("Welcome to Play 2.0") {
@helper.form(action = routes.Application.addBar) {
@helper.inputText(form("name"))
<input type="submit"/>
}
}
bar.scala
package models
import org.squeryl.{Schema, KeyedEntity}
case class Bar(name: Option[String]) extends KeyedEntity[Long] {
val id: Long = 0
}
object AppDB extends Schema {
val barTable = table[Bar]("bar")
}
application.scala
package controllers
import play.api.mvc._
import com.codahale.jerkson.Json
import play.api.data.Form
import play.api.data.Forms.{mapping, text, optional}
import org.squeryl.PrimitiveTypeMode._
import models.{AppDB, Bar}
object Application extends Controller {
val barForm = Form(
mapping(
"name" -> optional(text)
)(Bar.apply)(Bar.unapply)
)
def index = Action {
Ok(views.html.index(barForm))
}
def addBar = Action { implicit request =>
barForm.bindFromRequest.value map { bar =>
inTransaction(AppDB.barTable insert bar)
Redirect(routes.Application.index())
} getOrElse BadRequest
}
def getBars = Action {
val json = inTransaction {
val bars = from(AppDB.barTable)(barTable =>
select(barTable)
)
Json.generate(bars)
}
Ok(json).as(JSON)
}
Changed the "com.codahale" % "jerkson_2.9.1" % "0.5.0" dependency to "com.cloudphysics" % "jerkson_2.10" % "0.6.3" in the build.sbt file, and that allowed me to move on.
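For context: jerkson_2.9.1 is built against Scala 2.9, and 2.9 artifacts are not binary-compatible with the Scala 2.10 runtime that Play 2.2 uses, which is a plausible source of the missing scala/reflect/ClassManifest. The build.sbt change described above would look like:
libraryDependencies ++= Seq(
  // replaced: "com.codahale" % "jerkson_2.9.1" % "0.5.0",
  "com.cloudphysics" % "jerkson_2.10" % "0.6.3"
)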

Performing a simple HTTP GET with Dispatch

The following is a valid query in a browser (e.g. Firefox):
http://www.freesound.org/api/sounds/search/?q=barking&api_key=074c0b328aea46adb3ee76f6918f8fae
yielding a JSON document:
{
"num_results": 610,
"sounds": [
{
"analysis_stats": "http://www.freesound.org/api/sounds/115536/analysis/",
"analysis_frames": "http://www.freesound.org/data/analysis/115/115536_1956076_frames.json",
"preview-hq-mp3": "http://www.freesound.org/data/previews/115/115536_1956076-hq.mp3",
"original_filename": "Two Barks.wav",
"tags": [
"animal",
"bark",
"barking",
"dog",
"effects",
...
I am trying to perform this query with Dispatch 0.9.4. Here's a build.sbt:
scalaVersion := "2.10.0"
libraryDependencies += "net.databinder.dispatch" %% "dispatch-core" % "0.9.4"
From sbt console, I do the following:
import dispatch._
val q = url("http://www.freesound.org/api/sounds/search")
.addQueryParameter("q", "barking")
.addQueryParameter("api_key", "074c0b328aea46adb3ee76f6918f8fae")
val res = Http(q OK as.String)
But the promise always completes with the following error:
res0: dispatch.Promise[String] = Promise(!Unexpected response status: 301!)
So what am I doing wrong? Here is the API documentation in case it helps.
You can enable redirect following with the configure method on the Http executor:
Http.configure(_ setFollowRedirects true)(q OK as.String)
You could also pull the Location out of the 301 response manually, but that's going to be a lot less convenient.
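As a usage note, Http.configure returns a new executor, so the redirect-following setting can be kept and reused rather than rebuilt for every request. A sketch, assuming Dispatch 0.9.x:
import dispatch._

// One executor configured to follow redirects, reused for each request.
val http = Http.configure(_ setFollowRedirects true)
val res: Promise[String] = http(q OK as.String)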