I'm trying to learn/understand Rx, specifically RxJS, and keep seeing references to IObservable, IObserver, etc.
Can anyone tell me what the leading I means and/or where it comes from?
From my searching, it looks like the <T> is for the type. If this is wrong or naive, I'd appreciate some clarification on this as well.
Thanks!
In ye olden days of MFC for C++, Microsoft had Hungarian notation down to a very irritating artform, where all concrete classes were prefixed with C and their COM interfaces with I, this does help avoid the conflict where a COM interface and class might share the same name and so muddy your project.
Part of this notation carried over into .NET, except only interfaces kept the I prefix, but classes and other types dropped their Cs. This does make non-interface-heavy code easier to look at, but can cause ambiguity if you begin a class name with a 2-letter acronym beginning with I (as two-letter acronyms must be completely capitalised according to the the .NET style guidelines), but this is rare.
(I note that generic type name placeholders are prefixed with T too, e.g. TKey and TValue in Dictionary).
An example of why this is necessary is when dealing with collections in .NET, if you're building a reusable library and don't want to expose implementation details (e.g. if you use List<T> or T[] as an underlying collection field type), you can use IList<T> or IReadOnlyList<T> which are interfaces. If the interface was simply called List<T> it would conflict with the actual type List<T>, and ReadOnlyList<T> (an interface) might get confused with ReadOnlyCollection<T> (a class).
You might argue that this wouldn't be a problem if classes and interfaces had a different namespace. C does this: struct types and scalars exist in different namespaces, which unfortunately means that every time a struct type name is used, its usage must be prefixed with struct (e.g. a declaration: struct Foo foo). People workaround this by using typedef with anonymous structs, but I feel the end-result is messy (and the Linux kernel coding guidelines prohibit this too).
In Java, however, interfaces are not prefixed with I but instead have class-like names. Whether this is "correct" or "better" is entirely up for debate. C++ does not have interface types, just pure-abstract classes and multiple-inheritance, so the I prefix isn't typically seen at all outside of COM.
Related
In the C++ world, there are two well-known strategies for maintaining binary compatibility with a library:
Interfaces: all public classes are "interface" classes (only pure virtual methods, no members), which are implemented by private subclasses in the library
PIMPL pattern: all public classes hold a single member which is a pointer to a forward-declared class, whose definition is private to the library
Both of these achieve binary stability, but #1 comes with some major disadvantages. The primary one, I believe, is that in the library, methods that accept instances of the public interface classes almost always must immediately force-downcast them to the private implementation classes. The use of interfaces incorrectly signals to clients that they are free to supply their own implementations of these interfaces, which if they ever do, will immediately fail on one of these force-downcasts. Unless polymorphism is the goal, the use of interfaces is arguably the wrong design.
Now let's consider "high-level" languages like Java, Kotlin, C# and Swift (and maybe even Typescript and Ruby). We can certainly adopt strategy #1. However this strategy suffers from the same concerns mentioned above.
But what about the PIMPL pattern? There's no such thing as "forward declaration" in these languages, but we can't even separate the class definition and implementation into different files. The compiler does this for us when it creates the package. So does an analogous pattern exist in these languages that "hides" the private details in the sense that it lets us freely modify private details without breaking binary compatibility?
Which leads to the next question...
Is it even necessary to begin with to "hide" class innards to achieve binary stability in those languages? This is necessary in C++ because of its on-stack value semantics, which makes compiled client code sensitive to the memory size of the library's classes. But to my knowledge, class instances in the "high-level" languages aren't moved around on the call stack, and instead work more like pointers/references would in C++, which may render the concern moot. If that's true, we can simply write classes "naively", and be sure that the binary compatibility remains stable as long as we don't mess with public methods/members. We could, however, do whatever we wish with private members, even if it entails changing the memory size of the public classes, and it wouldn't force client code to be recompiled.
So, in summary: are PIMPL patterns possible in these languages, or does the concept not even apply because there's no problem of private details "leaking" into the binary interface to begin with?
I often read some programming languages have "First Class" support for modules (OCaml, Scala, TypeScript[?]) and recently stumbled upon an answer on SO citing modules as first class citizens among the distinguishing features of Scala.
I thought I knew very well what Modular Programming means but after these incidents I'm beginning to doubt my understanding...
I think modules are nothing special but instances of certain classes that are acting as mini-libraries. The mini-library code goes into a class, objects of that class are the modules. You can pass them around as dependencies to any other class that requires the services provided by the module, so any decent OOPL has first class modules but apparently not!
What exactly is a module? How is it different than, say, a plain class or an object?
How is (1) related (or not) to the Modular Programming that we all know?
What exactly does it mean for a language to have first class modules? What are the benefits? what are the drawbacks if a languages lacks such feature?
A module, as well as a subroutine, is a way of organizing your code. When we develop programs, we pack instructions into subroutines, subroutines into structures, structures into packages, libraries, assemblies, frameworks, solutions, and so on. So, putting everything else aside, it is just a mechanism to organize your code.
The essential reason, why we use all those mechanisms, instead of just laying out our instructions linearly, is because the complexity of a program grows non-linearly with respect to its size. In other words, a program built from n pieces each having m instructions is easier to comprehend than a program which is built from n*m instructions. This is, of course, not always true (otherwise we can just split our program into arbitrary parts and be happy). In fact, for that to be true, we have to introduce one essential mechanism called abstraction. We can benefit from splitting a program into manageable subparts only if each part provides some sort of abstraction. For example, we can have, connect_to_database, query_for_students, sort_by_grade, and take_the_first_n abstractions packed as functions or subroutines, and is much easier to understand the code which is expressed in terms of those abstractions, rather than trying to understand the code in which all those functions are inlined.
So now we have functions and it is natural to introduce the next level of organization -- collections of functions. It is common to see that some functions build families around some common abstraction, e.g., student_name, student_grade, student_courses, etc, they all revolve around the same abstraction student. The same is for connection_establish, connection_close, etc. Therefore we need some mechanism that will tie together those functions. Here we are starting to have options. Some languages took the OOP path, in which objects and classes are the units of the organization. Where a bunch of functions and a state is called an object. Other languages took a different path and decided to combine functions into static structures called modules. The main difference is that a module is a static, compile-time structure, where objects are runtime structures that have to be created in runtime to be used. As a result, naturally, objects tend to contain state, while modules do not (and contain only code). And objects are inherently regular values, which you can assign to variables, store them in files, and do other manipulations which you can do with data. Classical modules, contrary to objects, do not have runtime representation, therefore you can't pass modules as parameters to your functions, store them in a list, and otherwise perform any computations on modules. This is basically what people mean by saying first class citizen - an ability to treat an entity as a simple value.
Back to composable programs. In order to make objects/modules composable, we need to be sure that they create abstractions. For functions abstraction boundary is clearly defined - it is the tuple of parameters. For objects, we have a notion of interfaces and classes. While for modules we have only interfaces. Since modules are inherently more simple (they do not include the state) we do not have to deal with their constructing and deconstructing, therefore we do not need a more complicated notion of a class. Both classes and interfaces is a way to classify objects and modules by some criteria so that we can reason about different modules without looking into the implementation, the same way as we did with connect_to_database, query_for_students, et al functions - we were reasoning about them only based on their name and interface (and probably documentation). Now we can have a class student or a module Student both defining an abstraction called student, so that we can save a lot of brain power, without having to deal with the way how are those students implemented.
And beyond making our programs easier to understand, abstractions give us another benefit -- generalization. Since we don't need to reason about the implementation of a function or a module, it means that all implementations are interchangeable to some degree. Therefore, we can write our programs so that they will express their behavior in a general way, without breaking the abstractions, and then choose particular instances when we run our programs. Objects are runtime instances and essentially it means that we can choose our implementation in runtime. Which is nice. Classes are, however, rarely first-class citizens, therefore we have to invent different cumbersome methods to make the selection, like the Abstract Factory and Builder design patterns. For modules, the situation is even worse, since they are inherently a compile-time structure, we have to choose our implementation at the program building/lining time. Which is not what people want to do in the modern world.
And here comes first-class modules, being an amalgamation of modules and objects, they give us the best of two worlds - an easy to reason about stateless structures, which are, at the same time, a pure first-class citizens, which you can store in a variable, put into list and select the desired implementation in runtime.
Speaking of OCaml, underneath the hood, first-class modules are simply a record of functions. In OCaml, you can even add state to the first-class module making it practically indistinguishable from an object. This brings us to another topic - in the real world, the separation between objects and structures is not that clear. For example, OCaml provides both modules and objects and you can put objects inside modules and even vice verse. In C/C++ we have compilation units, symbols visibility, opaque data types, and header files, which enables some sort of modular programming, as well as we have structures and namespaces. Therefore, the difference is sometimes hard to tell.
Therefore, to summarize. Modules are pieces of code with a well-defined interface to access this code. First class modules are modules which could be manipulated as a regular value, e.g., stored in a data structure, assigned a variable, and picked at runtime.
OCaml perspective here.
Modules and classes are very different.
First of all, classes in OCaml are a very specific (and complex) feature. To go into some detail, classes implement inheritance, row polymorphism and dynamic dispatch (aka virtual methods). It allows them to be highly flexible at the expense of some efficiency.
Modules, however, are quite a different thing altogether.
Indeed, you can see modules as atomic mini-libraries, and usually they are used to define a type and its accessors, but they are much more powerful than just that.
Modules allow you to create several types, as well as module types and submodules. Basically, they allow to create complex compartmentalization and abstraction.
Functors give you behavior similar to c++'s templates. Except they are safe. Basically, they are functions on modules, which allows you to parameterize a data structure or algorithm over some other module.
Modules are usually solved statically and therefore easy to inline, allowing you to write clear code without fear of a loss in efficiency.
Now, a first-class citizen is an entity that can be put in a variable, passed to a function and tested for equality. In a way, it means they will be dynamically evaluated.
For example, suppose you have a module Jpeg and a module Png that allow you to manipulate different kind of pics. Statically, you don't know what kind of image you'll need to display. So you can use first-class modules:
let get_img_type filename =
match Filename.extension filename with
| ".jpg" | ".jpeg" -> (module Jpeg : IMG_HANDLER)
| ".png" -> (module Png : IMG_HANDLER)
let display_img img_type filename =
let module Handler = (val img_type : IMG_HANDLER) in
Handler.display filename
The main differences between a module and an object usually are
Modules are second-class, i.e., they are rather static entities that cannot be passed around as values, while objects can.
Modules can contain types and all other forms of declarations (and types can be made abstract), while objects typically cannot.
However, as you note, there are languages where modules can be wrapped up as first-class values (e.g. Ocaml) and there are languages where objects can contain types (e.g. Scala). That blurs the line a little. There still tend to be various biases towards certain patterns, with different trade-offs made in the type systems. For example, objects focus on recursive types, while modules focus on type abstraction and allowing any definition. It is a very hard problem to support both at the same time without severe compromises, since that quickly leads to an undecidable type system.
As has been mentioned already, "modules", "classes" and "objects" are more like tendencies than strict formal definitions. And if you implement modules as objects for example, as I understand Scala does, then obviously there are no fundamental differences between them, but mostly just syntactic differences that make them more convenient for certain use cases.
In regards to OCaml specifically though, here's a practical example of something you cannot do with modules that you can do with classes because of fundamental differences in implementation:
Modules have functions, which can reference each other recursively using the rec and and keyword. A module can also "inherit" the implementation of another module using include and override its definitions. For example:
module Base = struct
let name = "base"
let print () = print_endline name
end
module Child = struct
include Base
let name = "child"
end
but because modules are early bound, that is, names are resolved at compile time, it's not possible to get Base.print to reference Child.name instead of Base.name. At least not without altering both Base and Child significantly to explicitly enable it:
module AbstractBase(T : sig val name : string end) = struct
let name = T.name
let print () = print_endline name
end
module Base = struct
include AbstractBase(struct let name = "base" end)
end
module Child = struct
include AbstractBase(struct let name = "child" end)
end
With classes on the other hand, overriding is trivial and the default:
class base = object(self)
method name = "base"
method print = print_endline self#name
end
class child = object
inherit base
method! name = "child"
end
Classes can reference themselves, through a variable conventionally named this or self (in OCaml you can name it whatever you want, but self is the convention). They are also late bound, meaning they are resolved at runtime and can therefore call method implementations that didn't exist when it was defined. This is called open recursion.
So why aren't modules late bound too? Primarily for performance reasons I think. Doing a dictionary search on the name of every function call will undoubtedly have a significant impact on execution time.
Interface (or an abstract class with all the methods abstract) is a powerful weapon in a static-typed language such as C#, JAVA. It allows different derived types to be used in a uniformed way. Design patterns encourage us to use interface as much as possible.
However, in a dynamic-typed language, all objects are not checked for their type at compile time. They don't have to implement an interface to be used in a specific way. You just need to make sure that they have some methods (attributes) defined. This makes interface not necessary, or at least not as useful as it is in a static language.
Does a typical dynamic language (e.g. ruby) have interface? If it does, then what are the benefits of having it? If it doesn't, then are we losing many of the beautiful design patterns that require an interface?
Thanks.
I guess there is no single answer for all dynamic languages. In Python, for instance, there are no interfaces, but there is multiple inheritance. Using interface-like classes is still useful:
Interface-like classes can provide default implementation of methods;
Duck-typing is good, but to an extent; sometimes it is useful to be able to write isinstance(x, SomeType), especially when SomeType contains many methods.
Interfaces in dynamic languages are useful as documentation of APIs that can be checked automatically, e.g. by development tools or asserts at runtime.
As an example, zope.interface is the de-facto standard for interfaces in Python. Projects such as Zope and Twisted that expose huge APIs for consumption find it useful, but as far as I know it's not used much outside this type of projects.
In Ruby, which is a dynamically-typed language and only allows single inheritance, you can mimic an "interface" via mixins, rather than polluting the class with the methods of the "interface".
Mixins partially mimic multiple inheritance, allowing an object to "inherit" from multiple sources, but without the ambiguity and complexity of actually having multiple parents. There is only one true parent.
To implement an interface (in the abstract sense, not an actual interface type as in statically-typed languages) You define a module as if it were an interface in a static language. You then include it in the class. Voila! You've gathered the duck type into what is essentially an interface.
Very simplified example:
module Equippable
def weapon
"broadsword"
end
end
class Hero
include Equippable
def hero_method_1
end
def hero_method_2
end
end
class Mount
include Equippable
def mount_method_1
end
end
h = Hero.new
h.weapon # outputs "broadsword"
m = Mount.new
m.weapon # outputs "broadsword"
Equippable is the interface for Hero, Mount, and any other class or model that includes it.
(Obviously, the weapon will most likely be dynamically set by an initializer, which has been simplified away in this example.)
I see the word thrown around often, and I may have used it myself in code and libraries over time, but I never really got it. In most write-ups I came across, they just went on expecting you to figure it out.
What is a Class Factory? Can someone explain the concept?
Here's some supplemental information that may help better understand several of the other shorter, although technically correct, answers.
In the strictest sense a Class Factory is a function or method that creates or selects a class and returns it, based on some condition determined from input parameters or global context. This is required when the type of object needed can't be determined until runtime. Implementation can be done directly when classes are themselves objects in the language being used, such as Python.
Since the primary use of any class is to create instances of itself, in languages such as C++ where classes are not objects that can be passed around and manipulated, a similar result can often be achieved by simulating "virtual constructors", where you call a base-class constructor but get back an instance of some derived class. This must be simulated because constructors can't really be virtual✶ in C++, which is why such object—not class—factories are usually implemented as standalone functions or static methods.
Although using object-factories is a simple and straight-forward scheme, they require the manual maintenance of a list of all supported types in the base class' make_object() function, which can be error-prone and labor-intensive (if not over-looked). It also violates encapsulation✶✶ since a member of base class must know about all of the base's concrete descendant classes (now and in the future).
✶ Virtual functions are normally resolved "late" by the actual type of object referenced, but in the case of constructors, the object doesn't exist yet, so the type must be determined by some other means.
✶✶ Encapsulation is a property of the design of a set of classes and functions where the knowledge of the implementation details of a particular class or function are hidden within it—and is one of the hallmarks of object-oriented programming.
Therefore the best/ideal implementations are those that can handle new candidate classes automatically when they're added, rather than having only a certain finite set currently hardcoded into the factory (although the trade-off is often deemed acceptable since the factory is the only place requiring modification).
James Coplien's 1991 book Advanced C++: Programming Styles and Idioms has details on one way to implement such virtual generic constructors in C++. There are even better ways to do this using C++ templates, but that's not covered in the book which predates their addition to the standard language definition. In fact, C++ templates are themselves class factories since they instantiate a new class whenever they're invoked with different actual type arguments.
Update: I located a 1998 paper Coplien wrote for EuroPLoP titled C++ Idioms where, among other things, he revises and regroups the idioms in his book into design-pattern form à la the 1994 Design Patterns: Elements of Re-Usable Object-Oriented Software book. Note especially the Virtual Constructor section (which uses his Envelope/Letter pattern structure).
Also see the related answers here to the question Class factory in Python as well as the 2001 Dr. Dobb's article about implementing them with C++ Templates titled Abstract Factory, Template Style.
A class factory constructs instances of other classes. Typically, the classes they create share a common base class or interface, but derived classes are returned.
For example, you could have a class factory that took a database connection string and returned a class implementing IDbConnection such as SqlConnection (class and interface from .Net)
A class factory is a method which (according to some parameters for example) returns you a customised class (not instantiated!).
The Wikipedia article gives a pretty good definition: http://en.wikipedia.org/wiki/Factory_pattern
But probably the most authoritative definition would be found in the Design Patterns book by Gamma et al. (commonly called the Gang of Four Book).
I felt that this explains it pretty well (for me, anyway). Class factories are used in the factory design pattern, I think.
Like other creational patterns, it [the factory design pattern]
deals with the problem of creating
objects (products) without specifying
the exact class of object that will be
created. The factory method design
pattern handles this problem by
defining a separate method for
creating the objects, which subclasses
can then override to specify the
derived type of product that will be
created. More generally, the term
factory method is often used to refer
to any method whose main purpose is
creation of objects.
http://en.wikipedia.org/wiki/Factory_method_pattern
Apologies if you've already read this and found it to be insufficient.
To my surprise as I am developing more interest towards dynamic languages like Ruby and Python. The claim is that they are 100% object oriented but as I read on several basic concepts like interfaces, method overloading, operator overloading are missing. Is it somehow in-built under the cover or do these languages just not need it? If the latter is true are, they 100% object oriented?
EDIT: Based on some answers I see that overloading is available in both Python and Ruby, is it the case in Ruby 1.8.6 and Python 2.5.2 ??
Dynamic languages use duck typing.
Any code can call methods on any object that support those methods, so the concept
of interfaces is extraneous.
Python does in fact support operator overloading(check - 3.3. Special method names) , as does Ruby.
Anyway, you seem to be focusing on aspects that are not essential to object oriented programming. The main focus is on concepts like encapsulation, inheritance, and polymorphism, which are 100% supported in Python and Ruby.
Thanks to late binding, they do not need it. In Java/C#, interfaces are used to declare that some class has certain methods and it is checked during compile time; in Python, whether a method exists is checked during runtime.
Method overloading in Python does work:
>>> class A:
... def foo(self):
... return "A"
...
>>> class B(A):
... def foo(self):
... return "B"
...
>>> B().foo()
'B'
Are they object-oriented? I'd say yes. It's more of an approach thing rather than if any concrete language has feature X or feature Y.
I can only speak for python, but there have been proposals for interfaces as well as home-written interface examples in the past.
However, the way python works with objects dynamically tends to reduce the need for (and the benefit of) interfaces to some extent.
With a dynamic language, your type binding happens at runtime - interfaces are mostly used for compile time constraints on objects - if this happens at runtime, it eliminates some of the need for interfaces.
name based polymorphism
"For those of you unfamiliar with Python, here's a quick intro to name-based polymorphism. Python objects have an internal dictionary that contains a string for every attribute and method. When you access an attribute or method in Python code, Python simply looks up the string in the dict. Therefore, if what you want is a class that works like a file, you don't need to inherit from file, you just create a class that has the file methods that are needed.
Python also defines a bunch of special methods that get called by the appropriate syntax. For example, a+b is equivalent to a.add(b). There are a few places in Python's internals where it directly manipulates built-in objects, but name-based polymorphism works as you expect about 98% of the time. "
Python does provide operator overloading, e.g. you can define a method __add__ if you want to overload +.
You typically don't need to provide method overloading, since you can pass arbitrary parameters into a single method. In many cases, that single method can have a single body that works for all kinds of objects in the same way. If you want to have different code for different parameter types, you can inspect the type, or double-dispatch.
Interfaces are mostly unnecessary because of duck typing, as rossfabricant points out. A few remaining cases are covered in Python by ABCs (abstract base classes) or Zope interfaces.