Using Enumeration for shared variables in Scala - scala

Is it the right pattern to use Enumeration for holding shared variable values?
I am accepting arguments from the command line - arguments like "mongoUsername", "mongoPassword", "mongoDatabase" etc. - across a lot of different files, and want to remove the possibility of making a mistake while specifying the argument name.
I created an object as follows:
object CommonParams extends Enumeration {
  val MONGO_USERNAME = "mongoUsername"
  val MONGO_PASSWORD = "mongoPassword"
  ..
}
When accepting these parameters from the command line, the parameters will be read using CommonParams.MONGO_USERNAME rather than just "mongoUsername". This method works. My question is:
Is this the right way to do what I am trying to do?
I don't think I am using Enumeration correctly. What should I change?
What would I gain by declaring the CommonParams as follows:
object CommonParams extends Enumeration {
  val MONGO_USERNAME = Value("mongoUsername")
  val MONGO_PASSWORD = Value("mongoPassword")
  ..
}
If I declared CommonParams this way, I would have to use CommonParams.MONGO_USERNAME.toString each time instead of just using CommonParams.MONGO_USERNAME which is more verbose.
I understand that Enumeration can stand for a certain value being a "thing". However, I am holding a value inside an object attribute. What advantages would I get if I used the second way of declaring CommonParams?

In your first version, you should remove extends Enumeration, since you aren't actually using it.
The benefit of the second version is exactly that values of type CommonParams.Value aren't strings, so that if you have e.g. a method accepting CommonParams.Value, you can't accidentally pass an invalid string. You also get methods like CommonParams.values to list all values.
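As a minimal sketch of that benefit: the ArgReader object, its two methods, and the MONGO_DATABASE entry below are hypothetical names added for illustration, not part of the original code.

```scala
object CommonParams extends Enumeration {
  val MONGO_USERNAME = Value("mongoUsername")
  val MONGO_PASSWORD = Value("mongoPassword")
  val MONGO_DATABASE = Value("mongoDatabase") // hypothetical third entry
}

// Hypothetical helper, for illustration only.
object ArgReader {
  // Only a CommonParams.Value is accepted, so a mistyped string such as
  // "mongoUsernme" cannot be passed here: it would not compile.
  def read(args: Map[String, String], param: CommonParams.Value): Option[String] =
    args.get(param.toString)

  // CommonParams.values lists every declared parameter, which makes it
  // possible to report argument names that match none of them.
  def unknown(args: Map[String, String]): Set[String] =
    args.keySet.filterNot(name => CommonParams.values.exists(_.toString == name))
}
```

The .toString call then lives in one place (inside read) instead of at every call site, which addresses the verbosity concern from the question.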

Related

Pass a class name as string argument to create instance

Is there any possible way to pass a class name/path as String argument to call it in code in runtime?
I'm working with some legacy code and I have no way to change it globally. Creating a new integration for it means copying class X, renaming it, and passing in a new instance of Y that I have created manually. I would rather pass Y as some kind of argument and never copy X again.
I don't quite understand why you (think that) you need to do what you are trying to do (why copy class in the first place rather than just using it? why pass classname around instead of the class itself?), but, yeah, you can instantiate classes by (fully qualified) name using reflection.
First you get a handle to the class itself:
val clazz = Class.forName("foo.bar.X")
Then, if the constructor does not need any arguments, you can just do
val instance = clazz.getDeclaredConstructor().newInstance() // Class.newInstance is deprecated since Java 9
If you need to pass arguments to constructor, it gets a bit more complicated.
val constructor = clazz.getConstructors.find { c =>
  // compare as Seq: == on two arrays only checks reference equality
  c.getParameterTypes.toSeq == args.map(_.getClass).toSeq
}.getOrElse(throw new Exception("No suitable constructor found"))
// or if you know for sure there will be only one constructor,
// could just do clazz.getConstructors.headOption.getOrElse(...)
val instance = constructor.newInstance(args: _*)
Note though, that the resulting instance is of type Object (AnyRef), so there isn't much you can actually do with it without casting to some interface type your class is known to implement.
Let me just say it again: it is very likely not the best way to achieve what you are actually trying to do. If you open another question and describe your actual problem (not the solution to it you are trying to implement), you might get more helpful answers.
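For reference, the steps above can be put together roughly as follows. This is only a sketch: ReflectDemo and instantiate are hypothetical names, and java.lang.StringBuilder stands in for the legacy class because it is always on the classpath.

```scala
object ReflectDemo {
  // Instantiate a class by fully qualified name, picking the public
  // constructor whose parameter types equal the runtime classes of args.
  // Assumption: exact matching is enough. Primitive parameters (int etc.)
  // and subclass arguments would need a more lenient comparison.
  def instantiate(className: String, args: AnyRef*): AnyRef = {
    val clazz = Class.forName(className)
    val argTypes = args.map(_.getClass)
    val constructor = clazz.getConstructors
      .find(_.getParameterTypes.toSeq == argTypes)
      .getOrElse(throw new NoSuchMethodException(s"No suitable constructor in $className"))
    constructor.newInstance(args: _*).asInstanceOf[AnyRef]
  }
}

object ReflectDemoMain {
  def main(argv: Array[String]): Unit = {
    // The result is only known to be an AnyRef, so cast to an interface
    // the class is known to implement before using it.
    val cs = ReflectDemo.instantiate("java.lang.StringBuilder", "abc").asInstanceOf[CharSequence]
    println(cs.length)
  }
}
```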

How do I subtract an RDD[(Key,Object)] from another one?

I want to change the format of my data, from RDD(Label:String,(ID:String,Data:Array[Double])) to an RDD Object with the label, id and data as components.
But when I print my RDD consecutively twice, the references of objects change :
class Data_Object(private val id:String, private var vector:Vector) extends Serializable {
var label = ""
...
}
First print
(1,ms3.Data_Object@35062c11)
(2,ms3.Data_Object@25789aa9)
Second print
(2,ms3.Data_Object@6bf5d886)
(1,ms3.Data_Object@a4eb65)
I think that explains why the subtract method doesn't work. So can I use subtract with objects as values, or do I return to my classic model?
Unless you specify otherwise, objects in Scala (and Java) are compared using reference equality (i.e. whether they are the very same object in memory). The default toString is also derived from the object's identity rather than its contents, hence the Data_Object@6bf5d886 and so on.
Using reference equality means that two Data_Object instances with identical properties will NOT compare as equal unless they are exactly the same object. Also, their references will change from one run to the next.
Particularly in a distributed system like Spark, this is no good - we need to be able to tell whether two objects in two different JVMs are the same or not, according to their properties. Until this is fixed, RDD operations like subtract will not give the results you expect.
Fortunately, this is usually easy to fix in Scala/Spark: define your class as a case class. This automatically generates equals, hashCode and toString methods derived from all of the properties of the class. For example:
case class Data_Object(id:String, label:String, vector:Vector)
If you want to compare your objects according to only some of the properties, you'll have to define your own equals and hashCode methods, though. See Programming in Scala, for example.
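The difference can be seen without Spark. In the sketch below, Seq.diff stands in for RDD.subtract, Vector[Double] is an assumed stand-in for the vector type, and PlainData, DataObject and ById are hypothetical names used only for illustration.

```scala
// Plain class: inherits reference equality from AnyRef.
class PlainData(val id: String, val label: String)

// Case class: equals, hashCode and toString are generated from all fields.
case class DataObject(id: String, label: String, vector: Vector[Double])

// Hand-written equality that looks at only one property (id).
class ById(val id: String, val label: String) {
  override def equals(other: Any): Boolean = other match {
    case that: ById => id == that.id
    case _          => false
  }
  // Must stay consistent with equals: equal objects need equal hash codes.
  override def hashCode: Int = id.hashCode
}

object EqualityDemo {
  def main(args: Array[String]): Unit = {
    val all  = Seq(DataObject("1", "a", Vector(1.0)), DataObject("2", "b", Vector(2.0)))
    val drop = Seq(DataObject("1", "a", Vector(1.0)))
    // Structural equality lets the separately built "1" element be removed.
    println(all.diff(drop))
    // Two plain instances with identical fields are still different objects.
    println(new PlainData("1", "a") == new PlainData("1", "a"))
  }
}
```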

How to set up classes

I am an engineering student enrolled in a computer programming course, trying to understand a practice assignment for an upcoming lab, and was wondering if someone could help me with this step of my program. Step: The init method for the class takes the first formal parameter self and a list of [x, y] pairs v, and stores the list as a class instance variable.
It sounds like you are using Python, but next time you post a question, make sure you specify that and tag your question as such. You are looking for something like the following code:
class MyClassName(object):
    def __init__(self, pairs):
        self.pairs = pairs
Let's look at this line by line:
class MyClassName(object):
The first line declares a class called MyClassName. It extends object, which is not super important to understand right now, but is basically saying that MyClassName is a particular type of object.
def __init__(self, pairs):
The second line creates a function called __init__ which will be called when you instantiate an object of type MyClassName. This line also declares what parameters it takes. It sounds like you already know that the first parameter has to be self, and the second parameter, pairs, is the list of [x,y] pairs. In Python, we don't need to specify what type these parameters are, so we need only to name them (some languages would require us to specify that pairs is going to be a list of pairs).
self.pairs = pairs
Now all we have to do is set the instance variable. Inside a class, self refers to this particular instance of the object. In other words, every time we create a variable of type MyClassName, self (a naming convention rather than a keyword) will refer to that particular instance of the object, rather than to all instances of MyClassName. So in this case, self.pairs refers to the variable pairs in this particular instance of MyClassName. On the other hand, pairs simply refers to the argument passed into the function __init__.
So, to put all this together, we have defined a class called MyClassName, then defined the __init__ function, and in it, we set the instance variable self.pairs to be equal to the pairs variable passed into __init__.
Last, I'll give a quick example of how to instantiate MyClassName:
my_list = [(1,1),(2,4),(3,9),(4,16)]
my_instance = MyClassName(my_list)
Good luck!
[Edit] Also, I agree with the first comment on your question. You need to be more clear and verbose in exactly what you are trying to accomplish and not leave it up to guess work. In this case, I think I could tell what you were trying to do, but it may not always be clear.

Global Variable in Scala

I am trying to use a global variable in Scala that is accessible in the whole program.
val numMax: Int = 300
object Foo {.. }
case class Costumer { .. }
case class Client { .. }
object main {
var lst = List[Client]
// I would like to use Client as an object .
}
I got this error :
error: missing arguments for method apply in object List;
follow this method with `_' if you want to treat it as a partially applied function
var lst = List[A]
How can I deal with Global Variables in Scala to be accessible in the main program .
Should I use class or case class in this case ?
This isn't a global variable thing. Rather, you want to say this:
val lst = List(client1, client2)
However, I disagree somewhat with the other answers. Scala isn't just a functional language. It is both functional (maybe not as purely as it should be if you ask the Clojure fans) and object-oriented. Therefore, your OO expertise translates perfectly.
There is nothing wrong with global variables per se. The concern is mutability. Prefer val to var as I did. Also, you need to use object for singletons rather than the static paradigm you might be used to from Java.
The error you quote is unrelated to your attempt to create a global variable. You are missing the () after List[Client].
If you must create a global variable, you can put it in an object like Foo and reference it from other objects using Foo.numMax if the variable is called numMax.
However, global variables are discouraged. Maybe pass the data you need into the functions that need it instead. That is the functional way.
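A minimal sketch of both suggestions, assuming a Client with a single name field (the field and the Clients helper are hypothetical; only Foo.numMax comes from the text above):

```scala
case class Client(name: String)

// A singleton object plays the role of a global holder.
object Foo {
  val numMax: Int = 300 // immutable, so safe to share
}

// Hypothetical helper object, for illustration only.
object Clients {
  // Note the (): List[Client] on its own names the apply method
  // without calling it, which is what the compiler complained about.
  val empty: List[Client] = List[Client]()

  // Passing the limit in keeps the function independent of the global.
  def withinLimit(clients: List[Client], max: Int): Boolean =
    clients.size <= max
}

object GlobalsDemo {
  def main(args: Array[String]): Unit = {
    val lst = Client("client1") :: Clients.empty
    println(Clients.withinLimit(lst, Foo.numMax))
  }
}
```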

Scala instance value scoping

Note that this question and similar ones have been asked before, such as in Forward References - why does this code compile?, but I found the answers to still leave some questions open, so I'm having another go at this issue.
Within methods and functions, the effect of the val keyword appears to be lexical, i.e.
def foo {
  println(bar)
  val bar = 42
}
yielding
error: forward reference extends over definition of value bar
However, within classes, the scoping rules of val seem to change:
object Foo {
  def foo = bar
  println(bar)
  val bar = 42
}
Not only does this compile, but also the println in the constructor will yield 0 as its output, while calling foo after the instance is fully constructed will result in the expected value 42.
So it appears to be possible for methods to forward-reference instance values, which will, eventually, be initialised before the method can be called (unless, of course, you're calling it from the constructor), and for statements within the constructor to forward-reference values in the same way, accessing them before they've been initialised, resulting in a silly arbitrary value.
From this, a couple of questions arise:
Why does val lose its lexical compile-time effect within constructors?
Given that a constructor is really just a method, this seems rather inconsistent to entirely drop val's compile-time effect, giving it its usual run-time effect only.
Why does val, effectively, lose its effect of declaring an immutable value?
Accessing the value at different times may result in different results. To me, it very much seems like a compiler implementation detail leaking out.
What might legitimate usecases for this look like?
I'm having a hard time coming up with an example that absolutely requires the current semantics of val within constructors and wouldn't easily be implementable with a proper, lexical val, possibly in combination with lazy.
How would one work around this behaviour of val, getting back all the guarantees one is used to from using it within other methods?
One could, presumably, declare all instance vals to be lazy in order to get back to a val being immutable and yielding the same result no matter how they are accessed and to make the compile-time effect as observed within regular methods less relevant, but that seems like quite an awful hack to me for this sort of thing.
Given that this behaviour is unlikely to ever change within the actual language, would a compiler plugin be the right place to fix this issue, or is it possible to implement a val-alike keyword with, for someone who just spent an hour debugging an issue caused by this oddity, more sensible semantics within the language?
Only a partial answer:
Given that a constructor is really just a method ...
It isn't.
It doesn't return a result and doesn't declare a return type (or doesn't have a name)
It can't be called again for an object of said class like "foo".new ("bar")
You can't hide it from a derived class
You have to call them with 'new'
Their name is fixed by the name of the class
Ctors look a little like methods from the syntax, they take parameters and have a body, but that's about all.
Why does val, effectively, lose its effect of declaring an immutable value?
It doesn't. You have to take an elementary type which can't be null to get this illusion - with Objects, it looks different:
object Foo {
  def foo = bar
  println (bar.mkString)
  val bar = List(42)
}
// Exiting paste mode, now interpreting.
defined module Foo
scala> val foo=Foo
java.lang.NullPointerException
You can't change a val 2 times, you can't give it a different value than null or 0, you can't change it back, and a different value is only possible for the elementary types. So that's far away from being a variable - it's a - maybe uninitialized - final value.
What might legitimate usecases for this look like?
I guess working in the REPL with interactive feedback. You execute code without an explicit wrapping object or class. To give this instant feedback, the REPL cannot wait until the (implicit) object gets its closing }. Therefore the class/object isn't read in a two-pass fashion where all declarations and initialisations are performed first.
How would one work around this behaviour of val, getting back all the guarantees one is used to from using it within other methods?
Don't read attributes in the Ctor, like you don't read attributes in Java, which might get overwritten in subclasses.
update
Similar problems can occur in Java. A direct access to an uninitialized, final attribute is prevented by the compiler, but if you call it via another method:
public class FinalCheck
{
    final int foo;
    public FinalCheck ()
    {
        // does not compile:
        // variable foo might not have been initialized
        // System.out.println (foo);
        // Does compile -
        bar ();
        foo = 42;
        System.out.println (foo);
    }
    public void bar () {
        System.out.println (foo);
    }
    public static void main (String args[])
    {
        new FinalCheck ();
    }
}
... you see two values for foo.
0
42
I don't want to excuse this behaviour, and I agree that it would be nice if the compiler could warn consistently, in Java and Scala.
So it appears to be possible for methods to forward-reference instance
values, which will, eventually, be initialised before the method can
be called (unless, of course, you're calling it from the constructor),
and for statements within the constructor to forward-reference values
in the same way, accessing them before they've been initialised,
resulting in a silly arbitrary value.
A constructor is a constructor. You are constructing the object. All of its fields are initialized by the JVM (basically, zeroed), and then the constructor fills in whatever fields need filling in.
Why does val lose its lexical compile-time effect within constructors?
Given that a constructor is really just a method, this seems rather
inconsistent to entirely drop val's compile-time effect, giving it its
usual run-time effect only.
I have no idea what you are saying or asking here, but a constructor is not a method.
Why does val, effectively, lose its effect of declaring an immutable value?
Accessing the value at different times may result in different
results. To me, it very much seems like a compiler implementation
detail leaking out.
It doesn't. If you try to modify bar from the constructor, you'll see it is not possible. Accessing the value at different times in the constructor may result in different results, of course.
You are constructing the object: it starts not constructed, and ends constructed. For it not to change it would have to start out with its final value, but how can it do that without someone assigning that value?
Guess who does that? The constructor.
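A small sketch of that sequence, using a hypothetical class InitOrder (the compiler may warn about the references to uninitialized values, which is the point being illustrated):

```scala
class InitOrder {
  // These run first, while bar and list still hold the JVM defaults.
  val before: Int = bar             // Int field not assigned yet: 0
  val earlyList: List[Int] = list   // reference field not assigned yet: null

  // Only now does the constructor assign the final values.
  val bar: Int = 42
  val list: List[Int] = List(42)

  // Called after construction, so it sees the assigned value.
  def after: Int = bar
}

object InitOrderDemo {
  def main(args: Array[String]): Unit = {
    val o = new InitOrder
    println((o.before, o.earlyList, o.after))
  }
}
```

Each val is still assigned exactly once. What varies is only whether a read happens before or after that single assignment.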
What might legitimate usecases for this look like?
I'm having a hard time coming up with an example that absolutely
requires the current semantics of val within constructors and wouldn't
easily be implementable with a proper, lexical val, possibly in
combination with lazy.
There's no use case for accessing the val before its value has been filled in. It's just impossible to find out whether it has been initialized or not. For example:
class Foo {
  println(bar)
  val bar = 10
}
Do you think the compiler can guarantee it has not been initialized? Well, then open the REPL, put in the above class, and then this:
class Bar extends { override val bar = 42 } with Foo
new Bar
And see that bar was initialized when printed.
How would one work around this behaviour of val, getting back all the
guarantees one is used to from using it within other methods?
Declare your vals before using them. But note that a constructor is not a method. When you do:
println(bar)
inside a constructor, you are writing:
println(this.bar)
And this, the object of the class you are writing a constructor for, has a bar getter, so it is called.
When you do the same thing on a method where bar is a definition, there's no this with a bar getter.
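As a rough illustration of the two ways around this, with hypothetical class names: declaring the val first is the advice given above, while lazy val is the alternative the question itself mentions.

```scala
// Option 1: declare the val before anything in the constructor reads it.
class DeclaredFirst {
  val bar: Int = 42
  val seen: Int = bar // bar is already assigned here
}

// Option 2: a lazy val is computed on first access, so textual order
// inside the class body no longer matters.
class WithLazy {
  val seen: Int = bar // forces the lazy val
  lazy val bar: Int = 42
}

object WorkaroundDemo {
  def main(args: Array[String]): Unit = {
    println(new DeclaredFirst().seen)
    println(new WithLazy().seen)
  }
}
```

A lazy val adds a small synchronization cost on access, so reordering declarations is usually the simpler fix when it is possible.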