I tried to get json schema for different types using this library.
https://index.scala-lang.org/andyglow/scala-jsonschema/scala-jsonschema/0.7.6?target=_2.13
case class Pizza(a:Int, b:String)
val stringSchema: json.Schema[Pizza] = Json.schema[Pizza]
But I want to skip/hide some unsupported or not required fields.
case class Pizza(a:Int, b:String c:ZonedDateTime)//want to skip field c
Is there any way to do this? Or any other library we can achieve the same.
Related
I would like to update the schema of an spark dataframe by first converting it to a dataset which contains less columns. Background: i would like to remove some deeply nested fields from a schema.
I tried the following but the schema does not change:
import org.apache.spark.sql.functions._
val initial_df = spark.range(10).withColumn("foo", lit("foo!")).withColumn("bar", lit("bar!"))
case class myCaseClass(bar: String)
val reduced_ds = initial_df.as[myCaseClass]
The schema still includes the other fields:
reduced_ds.schema // StructType(StructField(id,LongType,false),StructField(foo,StringType,false),StructField(bar,StringType,false))
Is there a way to update the schema that way?`
It also confuses me that when i collect the dataset it only returns the fields defined in the case class:
reduced_ds.limit(1).collect() // Array(myCaseClass(bar!))
Add a fake map operation to force the projection using the predefined identity function:
import org.apache.spark.sql.functions._
val initial_df = spark.range(10).withColumn("foo", lit("foo!")).withColumn("bar", lit("bar!"))
case class myCaseClass(bar: String)
val reduced_ds = initial_df.as[myCaseClass].map(identity)
This yields
reduced_ds.schema // StructType(StructField(bar,StringType,true))
in the doc: https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Dataset.html#as%5BU%5D(implicitevidence$2:org.apache.spark.sql.Encoder%5BU%5D):org.apache.spark.sql.Dataset%5BU%5D
it says:
Note that as[] only changes the view of the data that is passed into
typed operations, such as map(), and does not eagerly project away any
columns that are not present in the specified class.
To achieve what you want to do you need to
initial_df.select(the columns in myCaseClass).as[myCaseClass]
It is normal since when u collect reduced_ds it returns record of Type myCaseClass, myCaseClass has only one attribute named bar. That's not conflicting with the fact that the dataset schema is something else
I'm using Slick 3.1.0 and Slick-pg 0.10.0. I have an enum as:
object UserProviders extends Enumeration {
type Provider = Value
val Google, Facebook = Value
}
Following the test case, it works fine with the column mapper simply adding the following implicit mapper into my customized driver.
implicit val userProviderMapper = createEnumJdbcType("UserProvider", UserProviders, quoteName = true)
However, when using plain SQL, I encountered the following compilation error:
could not find implicit value for parameter e: slick.jdbc.SetParameter[Option[models.UserProviders.Provider]]
I could not find any document about this. How can I write plain SQL with enum in slick? Thanks.
You need to have an implicit of type SetParameter[T] in scope which tells slick how to set parameters from some custom type T that it doesn't already know about. For example:
implicit val setInstant: SetParameter[Instant] = SetParameter { (instant, pp) =>
pp.setTimestamp(new Timestamp(instant.toEpochMilli))
}
The type of pp is PositionedParameters.
You might also come across the need to tell slick how to extract a query result into some custom type T that it doesn't already know about. For this, you need an implicit GetResult[T] in scope. For example:
implicit def getInstant(implicit get: GetResult[Long]): GetResult[Instant] =
get andThen (Instant.ofEpochMilli(_))
I'm struggling using custom case classes to write to Cassandra (2.1.6) using Spark (1.4.0). So far, I've tried this by using the DataStax spark-cassandra-connector 1.4.0-M1 and the following case classes:
case class Event(event_id: String, event_name: String, event_url: String, time: Option[Long])
[...]
case class RsvpResponse(event: Event, group: Group, guests: Long, member: Member, mtime: Long, response: String, rsvp_id: Long, venue: Option[Venue])
In order to make this work, I've also implemented the following converter:
implicit object EventToUDTValueConverter extends TypeConverter[UDTValue] {
def targetTypeTag = typeTag[UDTValue]
def convertPF = {
case e: Event => UDTValue.fromMap(toMap(e)) // toMap just transforms the case class into a Map[String, Any]
}
}
TypeConverter.registerConverter(EventToUDTValueConverter)
If I look up the converter manually, I can use it to convert an instance of Event into UDTValue, however, when using sc.saveToCassandra passing it an instance of RsvpResponse with related objects, I get the following error:
15/06/23 23:56:29 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
com.datastax.spark.connector.types.TypeConversionException: Cannot convert object Event(EVENT9136830076436652815,First event,http://www.meetup.com/first-event,Some(1435100185774)) of type class model.Event to com.datastax.spark.connector.UDTValue.
at com.datastax.spark.connector.types.TypeConverter$$anonfun$convert$1.apply(TypeConverter.scala:42)
at com.datastax.spark.connector.types.UserDefinedType$$anon$1$$anonfun$convertPF$1.applyOrElse(UserDefinedType.scala:33)
at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:40)
at com.datastax.spark.connector.types.UserDefinedType$$anon$1.convert(UserDefinedType.scala:31)
at com.datastax.spark.connector.writer.DefaultRowWriter$$anonfun$readColumnValues$2.apply(DefaultRowWriter.scala:46)
at com.datastax.spark.connector.writer.DefaultRowWriter$$anonfun$readColumnValues$2.apply(DefaultRowWriter.scala:43)
It seems my converter is never even getting called because of the way the connector library is handling UDTValue internally. However, the solution described above does work for reading data from Cassandra tables (including user defined types). Based on the connector docs, I also replaced my nested case classes with com.datastax.spark.connector.UDTValue types directly, which then fixes the issue described, but breaks reading the data. I can't imagine I'm meant to define 2 separate models for reading and writing data. Or am I missing something obvious here?
Since version 1.3, there is no need to use custom type converters to load and save nested UDTs. Just model everything with case classes and stick to the field naming convention and you should be fine.
I've got a case class Game which I have no trouble serializing/deserializing using json4s.
case class Game(name: String,publisher: String,website: String, gameType: GameType.Value)
In my app I use mapperdao as my ORM. Because Game uses a Surrogate Id I do not have id has part of its constructor.
However, when mapperdao returns an entity from the DB it supplies the id of the persisted object using a trait.
Game with SurrogateIntId
The code for the trait is
trait SurrogateIntId extends DeclaredIds[Int]
{
def id: Int
}
trait DeclaredIds[ID] extends Persisted
trait Persisted
{
#transient
private var mapperDaoVM: ValuesMap = null
#transient
private var mapperDaoDetails: PersistedDetails = null
private[mapperdao] def mapperDaoPersistedDetails = mapperDaoDetails
private[mapperdao] def mapperDaoValuesMap = mapperDaoVM
private[mapperdao] def mapperDaoInit(vm: ValuesMap, details: PersistedDetails) {
mapperDaoVM = vm
mapperDaoDetails = details
}
.....
}
When I try to serialize Game with SurrogateIntId I get empty parenthesis returned, I assume this is because json4s doesn't know how to deal with the attached trait.
I need a way to serialize game with only id added to its properties , and almost as importantly a way to do this for any T with SurrogateIntId as I use these for all of my domain objects.
Can anyone help me out?
So this is an extremely specific solution since the origin of my problem comes from the way mapperDao returns DOs, however it may be helpful for general use since I'm delving into custom serializers in json4s.
The full discussion on this problem can be found on the mapperDao google group.
First, I found that calling copy() on any persisted Entity(returned from mapperDao) returned the clean copy(just case class) of my DO -- which is then serializable by json4s. However I did not want to have to remember to call copy() any time I wanted to serialize a DO or deal with mapping lists, etc. as this would be unwieldy and prone to errors.
So, I created a CustomSerializer that wraps around the returned Entity(case class DO + traits as an object) and gleans the class from generic type with an implicit manifest. Using this approach I then pattern match my domain objects to determine what was passed in and then use Extraction.decompose(myDO.copy()) to serialize and return the clean DO.
// Entity[Int, Persisted, Class[T]] is how my DOs are returned by mapperDao
class EntitySerializer[T: Manifest] extends CustomSerializer[Entity[Int, Persisted, Class[T]]](formats =>(
{PartialFunction.empty} //This PF is for extracting from JSON and not needed
,{
case g: Game => //Each type is one of my DOs
implicit val formats: Formats = DefaultFormats //include primitive formats for serialization
Extraction.decompose(g.copy()) //get plain DO and then serialize with json4s
case u : User =>
implicit val formats: Formats = DefaultFormats + new LinkObjectEntitySerializer //See below for explanation on LinkObject
Extraction.decompose(u.copy())
case t : Team =>
implicit val formats: Formats = DefaultFormats + new LinkObjectEntitySerializer
Extraction.decompose(t.copy())
...
}
The only need for a separate serializer is in the event that you have non-primitives as parameters of a case class being serialized because the serializer can't use itself to serialize. In this case you create a serializer for each basic class(IE one with only primitives) and then include it into the next serializer with objects that depend on those basic classes.
class LinkObjectEntitySerializer[T: Manifest] extends CustomSerializer[Entity[Int, Persisted, Class[T]]](formats =>(
{PartialFunction.empty},{
//Team and User have Set[TeamUser] parameters, need to define this "dependency"
//so it can be included in formats
case tu: TeamUser =>
implicit val formats: Formats = DefaultFormats
("Team" -> //Using custom-built representation of object
("name" -> tu.team.name) ~
("id" -> tu.team.id) ~
("resource" -> "/team/") ~
("isCaptain" -> tu.isCaptain)) ~
("User" ->
("name" -> tu.user.globalHandle) ~
("id" -> tu.user.id) ~
("resource" -> "/user/") ~
("isCaptain" -> tu.isCaptain))
}
))
This solution is hardly satisfying. Eventually I will need to replace mapperDao or json4s(or both) to find a simpler solution. However, for now, it seems to be the fix with the least amount of overhead.
I'm trying to implement a controller in Play2 which exposes a simple REST-style api for my db-tables. I'm using squeryl for database access and spray-json for converting objects to/from json
My idea is to have a single generic controller to do all the work, so I've set up the following routes in conf/routes:
GET /:tableName controllers.Crud.getAll(tableName)
GET /:tableName/:primaryKey controllers.Crud.getSingle(tableName, primaryKey)
.. and the following controller:
object Crud extends Controller {
def getAll(tableName: String) = Action {..}
def getSingle(tableName: String, primaryKey: Long) = Action {..}
}
(Yes, missing create/update/delete, but let's get read to work first)
I've mapped tables to case classes by extended squeryl's Schema:
object MyDB extends Schema {
val accountsTable = table[Account]("accounts")
val customersTable = table[Customer]("customers")
}
And I've told spray-json about my case classes so it knows how to convert them.
object MyJsonProtocol extends DefaultJsonProtocol {
implicit val accountFormat = jsonFormat8(Account)
implicit val customerFormat = jsonFormat4(Customer)
}
So far so good, it actually works pretty well as long as I'm using the table-instances directly. The problem surfaces when I'm trying to generify the code so that I end up with excatly one controller for accessing all tables: I'm stuck with some piece of code that doesn't compile and I am not sure what's the next step.
It seems to be a type issue with spray-json which occurs when I'm trying to convert the list of objects to json in my getAll function.
Here is my generic attempt:
def getAll(tableName: String) = Action {
val json = inTransaction {
// lookup table based on url
val table = MyDB.tables.find( t => t.name == tableName).get
// execute select all and convert to json
from(table)(t =>
select(t)
).toList.toJson // causes compile error
}
// convert json to string and set correct content type
Ok(json.compactPrint).as(JSON)
}
Compile error:
[error] /Users/code/api/app/controllers/Crud.scala:29:
Cannot find JsonWriter or JsonFormat type class for List[_$2]
[error] ).toList.toJson
[error] ^
[error] one error found
I'm guessing the problem could be that the json-library needs to know at compile-time which model type I'm throwing at it, but I'm not sure (notice the List[_$2] in that compile error). I have tried the following changes to the code which compile and return results:
Remove the generic table-lookup (MyDB.tables.find(.....).get) and instead use the specific table instance e.g. MyDB.accountsTable. Proves that JSON serialization for work . However this is not generic, will require a unique controller and route config per table in db.
Convert the list of objects from db query to a string before calling toJson. I.e: toList.toJson --> toList.toString.toJson. Proves that generic lookup of tables work But not a proper json response since it is a string-serialized list of objects..
Thoughts anyone?
Your guess is correct. MyDb.tables is a Seq[Table[_]], in other words it could hold any type of table. There is no way for the compiler to figure out the type of the table you locate using the find method, and it seems like the type is needed for the JSON conversion. There are ways to get around that, but you'd need to some type of access to the model class.