Hierarchy in Python 3.4 [duplicate] - class

I know, there are no 'real' private/protected methods in Python. This approach isn't meant to hide anything; I just want to understand what Python does.
class Parent(object):
def _protected(self):
pass
def __private(self):
pass
class Child(Parent):
def foo(self):
self._protected() # This works
def bar(self):
self.__private() # This doesn't work, I get a AttributeError:
# 'Child' object has no attribute '_Child__private'
So, does this behaviour mean, that 'protected' methods will be inherited but 'private' won't at all?
Or did I miss anything?

Python has no privacy model, there are no access modifiers like in C++, C# or Java. There are no truly 'protected' or 'private' attributes.
Names with a leading double underscore and no trailing double underscore are mangled to protect them from clashes when inherited. Subclasses can define their own __private() method and these will not interfere with the same name on the parent class. Such names are considered class private; they are still accessible from outside the class but are far less likely to accidentally clash.
Mangling is done by prepending any such name with an extra underscore and the class name (regardless of how the name is used or if it exists), effectively giving them a namespace. In the Parent class, any __private identifier is replaced (at compilation time) by the name _Parent__private, while in the Child class the identifier is replaced by _Child__private, everywhere in the class definition.
The following will work:
class Child(Parent):
def foo(self):
self._protected()
def bar(self):
self._Parent__private()
See Reserved classes of identifiers in the lexical analysis documentation:
__*
Class-private names. Names in this category, when used within the context of a class definition, are re-written to use a mangled form to help avoid name clashes between “private” attributes of base and derived classes.
and the referenced documentation on names:
Private name mangling: When an identifier that textually occurs in a class definition begins with two or more underscore characters and does not end in two or more underscores, it is considered a private name of that class. Private names are transformed to a longer form before code is generated for them. The transformation inserts the class name, with leading underscores removed and a single underscore inserted, in front of the name. For example, the identifier __spam occurring in a class named Ham will be transformed to _Ham__spam. This transformation is independent of the syntactical context in which the identifier is used.
Don't use class-private names unless you specifically want to avoid having to tell developers that want to subclass your class that they can't use certain names or risk breaking your class. Outside of published frameworks and libraries, there is little use for this feature.
The PEP 8 Python Style Guide has this to say about private name mangling:
If your class is intended to be subclassed, and you have attributes
that you do not want subclasses to use, consider naming them with
double leading underscores and no trailing underscores. This invokes
Python's name mangling algorithm, where the name of the class is
mangled into the attribute name. This helps avoid attribute name
collisions should subclasses inadvertently contain attributes with the
same name.
Note 1: Note that only the simple class name is used in the mangled
name, so if a subclass chooses both the same class name and attribute
name, you can still get name collisions.
Note 2: Name mangling can make certain uses, such as debugging and
__getattr__(), less convenient. However the name mangling algorithm
is well documented and easy to perform manually.
Note 3: Not everyone likes name mangling. Try to balance the need to
avoid accidental name clashes with potential use by advanced callers.

The double __ attribute is changed to _ClassName__method_name which makes it more private than the semantic privacy implied by _method_name.
You can technically still get at it if you'd really like to, but presumably no one is going to do that, so for maintenance of code abstraction reasons, the method might as well be private at that point.
class Parent(object):
def _protected(self):
pass
def __private(self):
print("Is it really private?")
class Child(Parent):
def foo(self):
self._protected()
def bar(self):
self.__private()
c = Child()
c._Parent__private()
This has the additional upside (or some would say primary upside) of allowing a method to not collide with child class method names.

By declaring your data member private :
__private()
you simply can't access it from outside the class
Python supports a technique called name mangling.
This feature turns class member prefixed with two underscores into:
_className.memberName
if you want to access it from Child() you can use: self._Parent__private()

Also PEP8 says
Use one leading underscore only for non-public methods and instance
variables.
To avoid name clashes with subclasses, use two leading underscores to
invoke Python's name mangling rules.
Python mangles these names with the class name: if class Foo has an
attribute named __a, it cannot be accessed by Foo.__a. (An insistent
user could still gain access by calling Foo._Foo__a.) Generally,
double leading underscores should be used only to avoid name conflicts
with attributes in classes designed to be subclassed.
You should stay away from _such_methods too, by convention. I mean you should treat them as private

Although this is an old question, I encountered it and found a nice workaround.
In the case you name mangled on the parent class because you wanted to mimic a protected function, but still wanted to access the function in an easy manner on the child class.
parent_class_private_func_list = [func for func in dir(Child) if func.startswith ('_Parent__')]
for parent_private_func in parent_class_private_func_list:
setattr(self, parent_private_func.replace("_Parent__", "_Child"), getattr(self, parent_private_func))
The idea is manually replacing the parents function name into one fitting to the current namespace.
After adding this in the init function of the child class, you can call the function in an easy manner.
self.__private()

AFAIK, in the second case Python perform "name mangling", so the name of the __private method of the parent class is really:
_Parent__private
And you cannot use it in child in this form neither

Related

Scala parameters for access modifiers?

What is the difference between
class Test {
private[this] val foo = 0
}
vs
class Test {
private val foo = 0
}
What all can go inside the []? Also, what should I search for when I want to look up the specs of this? I tried Googling various combinations of "scala access modifier arguments/parametrized scala access modifier" and nothing came up.
what should I search for when I want to look up the specs of this?
In The Scala Language Specification it is defined as "access modifier" and "access qualifier" (see BNF in §5.2).
What is the difference between
...
What all can go inside the []?
You can put class name, package name or this there. Here is a relevant quote from language specs that explains this (see §5.2 for more details):
The modifier can be qualified with an identifier C (e.g. private[C ]) that must
denote a class or package enclosing the definition. Members labeled with
such a modifier are accessible respectively only from code inside the package
C or only from code inside the class C and its companion module (§5.4).
An different form of qualification is private[this]. A member M marked
with this modifier is called object-protected; it can be accessed only from
within the object in which it is defined. That is, a selection p.M is only legal if the prefix is this or O.this, for some class O enclosing the reference. In
addition, the restrictions for unqualified private apply.
The first one is private for instance class, second is for class. If you use second version you have access from another instance of Test class (it's usefull for equals method or similiar).

Why are classes inside Scala package objects dispreferred?

Starting with 2.10, -Xlint complains about classes defined inside of package objects. But why? Defining a class inside a package object should be exactly equivalent to defining the classes inside of a separate package with the same name, except a lot more convenient.
In my opinion, one of the serious design flaws in Scala is the inability to put anything other than a class-like entity (e.g. variable declarations, function definitions) at top level of a file. Instead, you're forced to put them into a separate ''package object'' (often in package.scala), separate from the rest of the code that they belong with and violating a basic programming rule which is that conceptually related code should be physically related as well. I don't see any reason why Scala can't conceptually allow anything at top level that it allows at lower levels, and anything non-class-like automatically gets placed into the package object, so that users never have to worry about it.
For example, in my case I have a util package, and under it I have a number of subpackages (util.io, util.text, util.time, util.os, util.math, util.distances, etc.) that group heterogeneous collections of functions, classes and sometimes variables that are semantically related. I currently store all the various functions, classes, etc. in a package object sitting in a file called io.scala or text.scala or whatever, in the util directory. This works great and it's very convenient because of the way functions and classes can be mixed, e.g. I can do something like:
package object math {
// Coordinates on a sphere
case class SphereCoord(lat: Double, long: Double) { ... }
// great-circle distance between two points
def spheredist(a: SphereCoord, b: SphereCoord) = ...
// Area of rectangle running along latitude/longitude lines
def rectArea(topleft: SphereCoord, botright: SphereCoord) = ...
// ...
// ...
// Exact-decimal functions
class DecimalInexactError extends Exception
// Format floating point value in decimal, error if can't do exactly
formatDecimalExactly(val num: Double) = ...
// ...
// ...
}
Without this, I would have to split the code up inconveniently according to fun vs. class rather than by semantics. The alternative, I suppose, is to put them in a normal object -- kind of defeating the purpose of having package objects in the first place.
But why? Defining a class inside a package object should be exactly equivalent to defining the classes inside of a separate package with the same name,
Precisely. The semantics are (currently) the same, so if you favor defining a class inside a package object, there should be a good reason. But the reality is that there is at least one good reason no to (keep reading).
except a lot more convenient
How is that more convenient?
If you are doing this:
package object mypkg {
class MyClass
}
You can just as well do the following:
package mypkg {
class MyClass
}
You'll even save a few characters in the process :)
Now, a good and concrete reason not to go overboard with package objects is that while packages are open, package objects are not.
A common scenario would be to have your code dispatched among several projects, with each project defining classes in the same package. No problem here.
On the other hand, a package object is (like any object) closed (as the spec puts it "There can be only one package object per package"). In other words,
you will only be able to define a package object in one of your projects.
If you attempt to define a package object for the same package in two distinct projects, bad things will happen, as you will effectively end up with two
distinct versions of the same JVM class (n our case you would end up with two "mypkg.class" files).
Depending on the cases you might end up with the compiler complaining that it cannot find something that you defined in the first version of your package object,
or get a "bad symbolic reference" error, or potentially even a runtime error. This is a general limitation of package objects, so you have to be aware of it.
In the case of defining classes inside a package object, the solution is simple: don't do it (given that you won't gain anything substantial compared to just defining the class as a top level).
For type aliase, vals and vars, we don't have such a luxuary, so in this case it is a matter of weighing whether the syntactic convenience (compared to defining them in an object) is worth it, and then take care not to define duplicate package objects.
I have not found a good answer to why this semantically equivalent operation would generate a lint warning. Methinks this is a lint bug. The only thing that I have found that must not be placed inside a package object (vs inside a plain package) is an object that implements main (or extends App).
Note that -Xlint also complains about implicit classes declared inside package objects, even though they cannot be declared at package scope. (See http://docs.scala-lang.org/overviews/core/implicit-classes.html for the rules on implicit classes.)
I figured out a trick that allows for all the benefits of package objects without the complaints about deprecation. In place of
package object foo {
...
}
you can do
protected class FooPackage {
...
}
package object foo extends FooPackage { }
Works the same but no complaint. Clear sign that the complaint itself is bogus.

In Tritium, what's the difference between add_class and attribute

I can use add_class("classname") to add a class attribute to one of my elements, but I can also use attribute("class", "classname") to do the same.
What's the difference between the two functions? Any gotchas?
Yup, the tritium function add_class(...) will append the given argument to the class attribute in the node you're currently in (also prepending a space to separate it from other class names).
On the other hand, calling attribute("class", "classname") will actually clobber whatever class names already existed with the value you provided.
Below is an example illustrating both in tritium tester:
http://tritium.moovweb.com/43ecf5fdbc4bf6b07312372724df5a2522474cc3

What is the philosophy behind making instance variables public by default in Scala?

What is the philosophy behind making the instance variables public by default in Scala. Shouldn't making them private by default made developers make less mistakes and encourage composition?
First, you should know that when you write:
class Person( val name: String, val age: Int ) {
...
}
name and age aren't instance variables but accessors methods (getters), which are public by default.
If you write instead:
class Person( name: String, age: Int ) {
...
}
name and age are only instance variables, which are private as you can expect.
The philosophy of Scala is to prefer immutable instance variables, then having public accessors methods is no more a problem.
Private encourages monoliths. As soon as it's easier to put unrelated functionality into a class just because it needs to read some variables that happen to be private, classes start to grow.
It's just a bad default and one of the big reasons for classes with more than 1000 lines in Java.
Scala defaults to immutable, which removes a massive class of errors that people often use private to restrict (but not remove, as the class' own methods can still mutate the variables) in Java.
with immutables which are preferred in many places, public isn't so much of an problem
you can replace a public val with getters and setters without changing the client code, therefore you don't need the extra layer of getters and setters just in case you need it. (Actually you do get that layer but you don't notice it most of the time.)
the java anti pattern of private field + public setters and getters doesn't encapsulate much anyway
(An additional view supplementing the other answers:)
One major driver behind Java's encapsulation of fields was the uniform access policy, i.e. you didn't have to know or care whether something was implemented simply as a field, or calculated by a method on the fly. The big upside of this being that the maintainer of the class in question could switch between the two as required, without needing other classes to be modified.
In Java, this required that everything was accessed via a method, in order to provide the syntactic flexibility to calculate a value if needed.
In Scala, methods and fields can be accessed via equivalent syntax - so if you have a simple property now, there's no loss in encapsulation to expose it directly, since you can choose to expose it as a no-arg method later without your callers needing to know anything about the change.

What is exactly the point of auto-generating getters/setters for object fields in Scala?

As we know, Scala generates getters and setters automatically for any public field and make the actual field variable private. Why is it better than just making the field public ?
For one this allows swapping a public var/val with a (couple of) def(s) and still maintain binary compatibility. Secondly it allows overriding a var/val in derived classes.
First, keeping the field public allows a client to read and write the field. Since it's beneficial to have immutable objects, I'd recommend to make the field read only (which you can achieve in Scala by declaring it as "val" rather than "var").
Now back to your actual question. Scala allows you to define your own setters and getters if you need more than the trivial versions. This is useful to maintain invariants. For setters you might want to check the value the field is set to. If you keep the field itself public, you have no chance to do so.
This is also useful for fields declared as "val". Assume you have a field of type Array[X] to represent the internal state of your class. A client could now get a reference to this array and modify it--again you have no chance to ensure the invariant is maintained. But since you can define your own getter you can return a copy of the actual array.
The same argument applies when you make a field of a reference type "final public" in Java--clients can't reset the reference but still modify the object the reference points to.
On a related note: accessing a field via getters in Scala looks like accessing the field directly. The nice thing about this is that it allows to make accessing a field and calling a method without parameters on the object look like the same thing. So if you decide you don't want to store a value in a field anymore but calculate it on the fly, the client does not have to care because it looks like the same thing to him--this is known as the Uniform Access Principle
In short: the Uniform Access Principle.
You can use a val to implement an abstract method from a superclass. Imagine the following definition from some imaginary graphics package:
abstract class circle {
def bounds: Rectangle
def centre: Point
def radius: Double
}
There are two possible subclasses, one where the circle is defined in terms of a bounding box, and one where it's defined in terms of the centre and radius. Thanks to the UAP, details of the implementation can be completely abstracted away, and easily changed.
There's also a third possibility: lazy vals. These would be very useful to avoid recalculating the bounds of our circle again and again, but it's hard to imagine how lazy vals could be implemented without the uniform access principle.