I want to create a Map with information about sales in one class (object) and then use it in another class, ProcessSales: iterate over the Map keys and use the values. I have already written the logic that creates the Map in an object, SalesData.
However since I've started learning Scala not long ago I'm not sure if it is a good approach to implement the logic in an object.
What will be the correct way to access the Map from another class?
Should the Map be created in an object or in a separate class? Or maybe it's better to create an object in the ProcessSales class that will be using it?
Could you share best practices and examples?
object SalesData {
val stream : InputStream = getClass.getResourceAsStream("/sales.csv")
val salesIterator: Iterator[String] = scala.io.Source.fromInputStream(stream).getLines
def getSales(salesData: Iterator[String]): Map[Int, String] = {
salesData
.map(_.split(","))
.map(line => (line(0).toInt, line(1)))
.toMap
}
val salesMap: Map[Int, String] = getSales(salesIterator)
}
If you wanted flexibility to "mix in" this map you could put the map and getSales() into a new trait.
If, on the other hand, you wanted to ensure that one and only one factory method existed to create the map, you could put getSales() into a companion object, which has to have the same name as your class and be defined in the same source file. For example,
object ProcessSales {
def getSales():Map[Int,String] = {...}
}
Remember that methods in a companion object are analogous to static methods in Java.
It is also possible to put the map instance itself into the companion object, if you want the map to be a singleton--one map instance per many instances of ProcessSales.
Or, if you want 1 such map per each instance of ProcessSales, you would make it a field within the ProcessSales class.
Or, if you wanted the map to be available to all members of a class hierarchy under ProcessSales, you could make ProcessSales an abstract class. But regarding use of an abstract class, remember that use of a trait affords greater flexibility in case you are not certain that all subclasses in the hierarchy will need the map.
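As a rough sketch of the companion-object option described above: the sample rows and the report method are made up for illustration, and the lines are assumed to be in memory already rather than read from sales.csv.

```scala
object ProcessSales {
  // Hypothetical sample rows standing in for the contents of sales.csv.
  private val sampleLines = List("1,apples", "2,pears")

  // Factory method: parses "id,product" lines into a Map.
  def getSales(lines: Iterator[String]): Map[Int, String] =
    lines.map(_.split(",")).map(cols => (cols(0).trim.toInt, cols(1).trim)).toMap

  // One shared map for every ProcessSales instance, built on first access.
  lazy val salesMap: Map[Int, String] = getSales(sampleLines.iterator)
}

class ProcessSales {
  // Walks the keys in ascending order and looks up each value.
  def report: List[String] =
    ProcessSales.salesMap.keys.toList.sorted
      .map(id => s"$id -> ${ProcessSales.salesMap(id)}")
}
```

Every instance of ProcessSales reads the same salesMap here; moving the val into the class body instead would give each instance its own copy.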
It all depends on how you want to use it. Scala leans towards a functional style, so as a best practice you could define getSales in one object and, from another object, pass in the parameters and call it.
For example,
import java.io.InputStream
// "packagename" is a placeholder for the package that contains SalesData
import packagename.SalesData._

object Check {
  val stream: InputStream = getClass.getResourceAsStream("/sales.csv")
  val salesIterator: Iterator[String] = scala.io.Source.fromInputStream(stream).getLines()
  val salesMap = getSales(salesIterator)
}
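One way to take this a step further, sketched under the assumption that the consumer only needs the finished Map, is to keep parsing as a pure function and hand the result to the class that uses it. SalesParser and SalesReport are hypothetical names, not part of the question's code.

```scala
// SalesParser and SalesReport are hypothetical names used for this sketch.
object SalesParser {
  // Pure function: no file access, so any source of lines can be passed in.
  def getSales(lines: Iterator[String]): Map[Int, String] =
    lines.map(_.split(",")).map(cols => (cols(0).toInt, cols(1))).toMap
}

// The consumer receives the finished Map and does not care where it came from.
class SalesReport(sales: Map[Int, String]) {
  def productFor(id: Int): Option[String] = sales.get(id)
  def ids: List[Int] = sales.keys.toList.sorted
}
```

Because the Map arrives through the constructor, the class can be exercised with a small hand-written Map in tests and with the real file contents in production.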
I have a very basic and simple Scala question. For example, I have a java class like that
class Dataset{
private List<Record> records;
Dataset(){
records = new ArrayList<Record>()
}
public void addItem(Record r){
records.add(r)
}
}
When I try to write the same class in Scala, I encounter some errors:
class RecordSet() {
private var dataset:List[Record]
def this(){
dataset = new List[Record]
}
def addRecord(rd: Record)={
dataset :+ rd
}
}
I cannot declare a List variable like ( private var dataset:List[Record])
and cannot write a default constructor.
Here is how you will replicate the Java code you mentioned in your question:
// defining Record so the code below compiles
case class Record()
// Here is the Scala implementation
class RecordSet(private var dataset: List[Record] = Nil) {
  def addRecord(rd: Record): Unit = {
    // :+ returns a new list, so the result has to be assigned back
    dataset = dataset :+ rd
  }
}
Some explanation:
In Scala, when you define a class, you can pass parameters in the class definition, e.g. class Foo(num:Int, descr:String). Scala automatically uses the given parameters to create a primary constructor for you, so you can instantiate Foo like so: new Foo(1, "One"). This is different from Java, where you have to explicitly define constructors that accept parameters.
Be aware that the parameters passed do not automatically become instance members of the class, although you can tell Scala to make them instance members. There are various ways to do this; one way is to prefix the parameter with either var or val. For example class Foo(val num:Int, val descr:String) or class Foo(var num:Int, var descr:String). The difference is that with val the instance members cannot be reassigned, while with var they can.
Also, by default the instance member Scala will generate would be public. That means they can be accessed directly from an instance of the object. For example:
val foo = new Foo(1, "One")
println(foo.num) // prints 1.
If you want them to be private, you add private keyword to the definition. So that would become:
class Foo(private var num:Int, private var desc:String)
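The four variants described above can be put side by side. This is only an illustrative sketch; the class names are invented.

```scala
// Plain parameter: usable inside the class body only.
class Plain(num: Int) { def doubled: Int = num * 2 }

// val parameter: public member that cannot be reassigned.
class ReadOnly(val num: Int)

// var parameter: public member that can be reassigned.
class Mutable(var num: Int)

// private var parameter: reassignable, but only from inside the class.
class Hidden(private var num: Int) {
  def increment(): Int = { num += 1; num }
}
```

With these definitions, new ReadOnly(3).num compiles, while new Plain(2).num and new Hidden(1).num do not.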
The reason your code fails to compile is that you define a method called this(), which is how Scala declares auxiliary (additional) constructors, not how you initialize a private field, which is your intention judging from the Java code you shared. You can search for multiple constructors or auxiliary constructors to learn more about this.
As dade said, the issue in your code is that with the this keyword you are actually creating an auxiliary constructor, which has some limitations: the first statement of an auxiliary constructor must be a call to another constructor (auxiliary or primary). Hence you cannot define the class that way.
Also, you cannot write a line such as private var dataset:List[Record] in a concrete Scala class, because a member without an initial value is considered abstract.
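For comparison, here is a sketch of an auxiliary constructor that does compile, because its first statement calls the primary constructor. The Record definition and the size method are placeholders added for the example.

```scala
case class Record(id: Int) // placeholder definition for this sketch

class RecordSet(private var dataset: List[Record]) {
  // Auxiliary constructor: its first statement must call another constructor.
  def this() = this(Nil)

  def addRecord(rd: Record): Unit = {
    dataset = dataset :+ rd
  }

  // Hypothetical accessor so the effect of addRecord can be observed.
  def size: Int = dataset.size
}
```

Both new RecordSet() and new RecordSet(List(Record(1))) are valid with this definition.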
Now with the code. Usually in Scala we don't prefer mutability because it introduces side-effects in our functions (which is not the functional way but as scala is not purely functional you can use mutability too).
In Scala way, the code should be something like this:
class RecordSet(private val dataset:List[Record]) {
def addRecord(rd: Record): RecordSet ={
new RecordSet(dataset :+ rd)
}
}
Now with the above class there is no mutability. Whenever you are adding on an element to the dataset a new instance of RecordSet is being created. Hence no mutability.
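A short usage sketch of the immutable version, assuming a placeholder Record and a size accessor that the original class does not have:

```scala
case class Record(id: Int) // placeholder definition for this sketch

class RecordSet(private val dataset: List[Record]) {
  def addRecord(rd: Record): RecordSet = new RecordSet(dataset :+ rd)
  def size: Int = dataset.size // hypothetical accessor for the example
}

object ImmutableDemo {
  val empty = new RecordSet(Nil)
  // The caller has to keep the returned instance; `empty` itself is unchanged.
  val one = empty.addRecord(Record(1))
}
```

The practical consequence is that ignoring the return value of addRecord silently discards the new record.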
However, if you have to keep using the same class reference in your application, use a mutable collection for your dataset, like below:
import scala.collection.mutable.ListBuffer
class RecordSet(private val dataset:ListBuffer[Record]) {
def addRecord(rd: Record): ListBuffer[Record] ={
dataset += rd
}
}
Above code will append the new record in the existing dataset with the same class reference.
While learning Scala, I came across the interesting concept of the companion object. A companion object can be used to define static methods in Scala. I need a few clarifications on the Spark Scala code below with regard to the companion object.
class BballStatCounter extends Serializable {
val stats: StatCounter = new StatCounter()
var missing: Long = 0
def add(x: Double): BballStatCounter = {
if (x.isNaN) {
missing += 1
} else {
stats.merge(x)
}
this
}
}
object BballStatCounter extends Serializable {
def apply(x: Double) = new BballStatCounter().add(x)
}
Above code is invoked using val stat3 = stats1.map(b=>BballStatCounter(b)).
What is the nature of the variables stats and missing declared in the class? Is it similar to class attributes in Python?
What is the significance of apply method in here?
Here stats and missing are instance members: each instance of BballStatCounter gets its own copy, like attributes assigned on self in a Python __init__ method (and unlike Python class attributes, which are shared across instances).
In Scala the apply method serves a special purpose: if an object has an apply method and is used with function-call notation such as Obj(), the compiler rewrites that as Obj.apply().
The apply method is commonly used as a factory method in a class's companion object.
All the collection classes in Scala have a companion object with an apply method, which is why you can create a list with List(1,2,3,4).
So in your code above, BballStatCounter(b) is compiled to BballStatCounter.apply(b).
stats and missing are members of the class BballStatCounter. stats is a val, so it cannot be reassigned once it has been defined (the StatCounter it refers to can still be updated, as merge does in the add method). missing is a var, so it is more like a traditional variable and can be updated, as it is in the add method. Every instance of BballStatCounter will have these members. (Unlike Python, you can't add or remove members from a Scala object.)
The apply method is a shortcut that makes objects look like functions. If you have an object x with an apply method, you write x(...) and the compiler will automatically convert this to x.apply(...). In this case it means that you can call BballStatCounter(1.0) and this will call the apply method on the BballStatCounter object.
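The same shape can be tried without Spark. Tally below is a hypothetical stand-in for BballStatCounter that keeps a running sum instead of a StatCounter.

```scala
// Tally is a hypothetical stand-in for BballStatCounter that needs no Spark.
class Tally {
  var missing: Long = 0 // each instance has its own counter
  var sum: Double = 0.0

  def add(x: Double): Tally = {
    if (x.isNaN) missing += 1 else sum += x
    this // returning this allows chained calls
  }
}

object Tally {
  // Tally(x) is rewritten by the compiler to Tally.apply(x).
  def apply(x: Double): Tally = new Tally().add(x)
}
```

Calling Tally(1.5) and Tally(Double.NaN) produces two separate instances, each with its own missing and sum.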
Neither of these questions is really about companion objects, this is just the normal Scala class framework.
Please note the remarks in the comments about asking multiple questions.
When I use ArrayBuffer, I should use:
val arr = new ArrayBuffer[Int]
but when I use Map, I should use:
val map = Map[Int, Int]()
To understand why you need to use Map[T, T]() and not new Map[T, T](...), you need to understand how the apply method on a companion object works.
A companion object is an object that has the same name as a class and is defined alongside it in the same file. It generally contains factory methods and other methods that make it easy to create objects of the class.
To avoid verbose code, Scala uses the apply method, which is invoked when you call the object as you would call a function.
So, the companion object of Map must look something like this:
object Map {
def apply[K, V](...) = new Map[K,V](...) // Or something like this
}
While the class would be something like
protected class Map[K, V](...) {
...
}
Now calling Map[String, String](...) you are actually calling the apply method of the Map companion object.
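ArrayBuffer, by contrast, is a concrete class, so new ArrayBuffer[Int] works by calling its constructor directly. It also has a companion object with an apply method, so ArrayBuffer[Int]() works as well. Map is a trait and cannot be instantiated with new, which is why you have to go through the companion object's apply method.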
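Both construction styles can be checked directly. The sketch below uses only the standard library; note that ArrayBuffer accepts the apply form too, since the standard library provides a companion object for it.

```scala
import scala.collection.mutable.ArrayBuffer

object ConstructionDemo {
  val viaNew = new ArrayBuffer[Int]   // calls the class constructor
  val viaApply = ArrayBuffer[Int]()   // calls apply on the ArrayBuffer companion
  val filled = ArrayBuffer(1, 2, 3)   // apply with initial elements
  val emptyMap = Map[Int, Int]()      // calls apply on the Map companion
  val pairs = Map(1 -> 10, 2 -> 20)
  // new Map[Int, Int]() does not compile: Map is a trait, not a concrete class.
}
```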
This is a follow up to the following question, which concerned serialization: How best to keep a cached list of member fields, one each for a family of case classes in Scala
I'm trying to generically support deserialization in the same way. One straightforward attempt is the following:
abstract class Serializer[T](implicit ctag: ClassTag[T]) {
private val fields = ctag.runtimeClass.getDeclaredFields.toList
fields foreach { _.setAccessible(true) }
implicit class AddSerializeMethod(obj: T) {
def serialize = fields.map(f => (f.getName, f.get(obj)))
}
def deserialize(data: List[(String, Any)]): T = {
val m = data toMap
val r: T = ctag.runtimeClass.newInstance // ???
fields.foreach { case f => f.set(r, m(f.getName)) }
r;
}
}
There are a couple of issues with the code:
The line with val r: T = ... has a compile error because the compiler thinks it's not guaranteed to have the right type. (I'm generally unsure of how to create a new instance of a generic class in a typesafe way -- not sure why this isn't safe since the instance of Serializer is created with a class tag whose type is checked by the compiler).
The objects I'm creating are expected to be immutable case class objects, which are guaranteed to be fully constructed if created in the usual way. However, since I'm mutating the fields of instances of these objects in the deserialize method, how can I be sure that the objects will not be seen as partially constructed (due to caching and instruction reordering) if they are published to other threads?
ClassTag's runtimeClass method returns Class[_], not Class[T], probably due to the fact generics in Scala and Java behave differently; you can try casting it forcefully: val r: T = ctag.runtimeClass.newInstance.asInstanceOf[T]
newInstance calls the default, parameterless constructor. If the class doesn't have one, newInstance will throw InstantiationException. There's no way around it, except for:
looking around for other constructors
writing custom serializers (see how Gson does that; BTW Gson can automatically serialize only classes with parameterless constructors and those classes it has predefined deserializers for)
for case classes, finding their companion object and calling its apply method
Anyhow, reflection allows for modifying final fields as well, so if you manage to create an immutable object, you'll be able to set its fields.
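A minimal sketch of the cast described above, assuming the target class has a public parameterless constructor. It uses getDeclaredConstructor().newInstance(), the replacement the JDK recommends since Class.newInstance was deprecated in Java 9; the Instantiate object is a hypothetical helper.

```scala
import scala.reflect.ClassTag

// Instantiate is a hypothetical helper name used for this sketch.
object Instantiate {
  // runtimeClass is a Class[_], so the result has to be cast back to T.
  // Throws NoSuchMethodException if T has no parameterless constructor.
  def newInstanceOf[T](implicit ctag: ClassTag[T]): T =
    ctag.runtimeClass.getDeclaredConstructor().newInstance().asInstanceOf[T]
}
```

For example, Instantiate.newInstanceOf[java.lang.StringBuilder] returns an empty builder, whereas a class such as java.lang.Integer, which has no parameterless constructor, makes the call throw.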
I'm new to scala and can't get my head around how the Lift guys implemented the Record API. However, the question is less about this API but more about Scala in general. I'm interested in how the object in class pattern works, used in Lift.
class MainDoc private() extends MongoRecord[MainDoc] with ObjectIdPk[MainDoc] {
def meta = MainDoc
object name extends StringField(this, 12)
object cnt extends IntField(this)
}
object MainDoc extends MainDoc with MongoMetaRecord[MainDoc]
In the upper snippet you can see how a record is defined in Lift. The interesting part is that the fields are defined as objects. The API allows you to create Instances like this:
val md1 = MainDoc.createRecord
.name("md1")
.cnt(5)
.save
This is probably done by using the apply method? But at the same time you are able to get the values by doing something like this:
val name = md1.name
How does this all work? Are the objects not really static when in the scope of a class? Or are they just constructor classes for some internal representation? How is it possible to iterate over all fields; do you use reflection?
Thanks,
Otto
Otto,
You are more or less on the right track. You actually don't need to define your fields as objects; you could have written your example as
class MainDoc private() extends MongoRecord[MainDoc] with ObjectIdPk[MainDoc] {
def meta = MainDoc
val name = new StringField(this, 12)
val cnt= new IntField(this)
}
object MainDoc extends MainDoc with MongoMetaRecord[MainDoc]
The net.liftweb.record.Field trait does contain an apply method that is the equivalent to set. That's why you can assign the fields by name after instantiating the object.
The field reference you mentioned:
val name = md1.name
Would type name as a StringField. If what you were thinking was
val name: String = md1.name
that would fail to compile (unless there was an implicit in scope to convert Field[T] => T). The proper way to retrieve the String value of the field would be
val name = md1.name.get
Record does use reflection to gather the fields. When you define an object within a class, the compiler will create a field to hold the object instance. From the standpoint of reflection, the object appears very similar to the alternate way to define a field that I mentioned before. Each of the definitions probably creates a subclass of the field type, but that's no different than
val name = new StringField(this, 12) {
override def label: NodeSeq = <span>My String Field</span>
}
You're right about it being the apply method. Record's Field base class defines a few apply methods.
def apply(in: Box[MyType]): OwnerType
def apply(in: MyType): OwnerType
By returning the OwnerType, you can chain invocations together.
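The chaining mechanism can be imitated in a few lines. Field and Doc below are simplified, hypothetical stand-ins and not Lift's real classes; they only show how an apply method that returns the owner lets calls be strung together.

```scala
// Field and Doc are simplified, hypothetical stand-ins, not Lift's real classes.
class Field[T, Owner](owner: Owner, initial: T) {
  private var value: T = initial

  // Acts as a setter and returns the owner so that calls can be chained.
  def apply(in: T): Owner = { value = in; owner }

  def get: T = value
}

class Doc {
  // Doc.this is the enclosing instance; Lift's code passes plain `this` here.
  object name extends Field[String, Doc](Doc.this, "")
  object cnt extends Field[Int, Doc](Doc.this, 0)
}
```

With these definitions, new Doc().name("md1").cnt(5) returns the Doc, and the stored values are read back with .name.get and .cnt.get.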
Regarding the use of object to define fields, that confused me at first, too. The object identifier defines an object within a particular scope. Even though it's convenient to think of object as a shortcut for the singleton pattern, it's more flexible than that. According to the Scala Language Spec (section 5.4):
It is roughly equivalent to the following definition of a lazy value:
lazy val m = new sc with mt1 with ... with mtn { this: m.type => stats }
<snip/>
The expansion given above is not accurate for top-level objects. It cannot be because variable and method definition cannot appear on the top-level outside of a package object (§9.3). Instead, top-level objects are translated to static fields.
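The lazy, per-instance behaviour of an object nested in a class can be observed with a small experiment. InitLog is a hypothetical counter introduced only for this sketch.

```scala
// InitLog is a hypothetical counter used to observe when `inner` is created.
object InitLog { var created: Int = 0 }

class Outer {
  // One `inner` per Outer instance, initialized on first access.
  object inner { InitLog.created += 1 }
}
```

Creating an Outer does not touch the counter; the first access to inner on a given instance increments it once, and a second Outer instance gets its own, separate inner.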
Regarding iterating over all the fields, Record objects define an allFields method which returns a List[net.liftweb.record.Field[_, MyType]].