pytest: get a list of objects modified during a test execution / test that only particular objects were modified

TL;DR: How can I test that only particular objects (those that are allowed to be modified) were modified during a test?
I'm writing tests and just ran into a subtle bug that is hard to cover with a standard test strategy.
The bug:
class Foo:
    L = [1, 2, 3]
    def __init__(self):
        self.l = self.L  # copy() call missed!
foo1 = Foo()
foo2 = Foo()
foo1.l.append(4)
assert foo1.l is foo2.l  # True (wrong logic behavior).
(I know the error is easy to spot, but I made it because everywhere else l is an enum class rather than a list/typed dict.)
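For reference, a minimal sketch of the intended behavior, assuming a shallow copy of the class-level list is what was meant (`list(self.L)` would work equally well):

```python
class Foo:
    L = [1, 2, 3]

    def __init__(self):
        # Copy the class-level default so each instance owns its own list.
        self.l = self.L.copy()


foo1 = Foo()
foo2 = Foo()
foo1.l.append(4)

assert foo1.l is not foo2.l   # separate lists now
assert foo2.l == [1, 2, 3]    # foo2 is unaffected
assert Foo.L == [1, 2, 3]     # the class-level default is unaffected too
```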
The problem with this bug is that most tests check only one particular object;
they are not intended to trace the behavior of other objects.
I want to prevent bugs of this type in my code with corresponding tests.
Manually checking the state of every possibly related object is tedious and unreliable.
How it could look:
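import pytest
from foo import Foo  # assuming Foo is defined in a module named foo
@pytest.fixture
def foo() -> Foo:
    yield Foo()
# allowed_to_modify is the hypothetical marker being asked for, not an existing pytest feature
@pytest.mark.allowed_to_modify(objects_list=[foo])
def test_l_success(foo: Foo):
    """This test will fail if the state of any object other than "foo" changes."""
    foo.l.append('foo')
    assert foo.l ...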
P.S. I understand that tracing an object's state may be hard (for example, if it's an unhashable dictionary or a database connection), but the problem definitely exists, and bugs of this type can be pretty painful in production.
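One way to approximate this without a dedicated plugin is to snapshot the objects that must stay untouched and compare them afterwards. The sketch below is only an illustration: `assert_unchanged` is a hypothetical helper (not part of pytest), and it assumes the watched objects support `copy.deepcopy` and `==`, which rules out things like database connections.

```python
import copy
from contextlib import contextmanager


@contextmanager
def assert_unchanged(**watched):
    """Hypothetical helper: fail if a watched object differs from its snapshot."""
    # Deep-copy every watched object before the block runs.
    snapshots = {name: copy.deepcopy(obj) for name, obj in watched.items()}
    yield
    # After the block, report every object that no longer equals its snapshot.
    changed = [name for name, obj in watched.items() if obj != snapshots[name]]
    if changed:
        raise AssertionError(f"unexpectedly modified: {changed}")


class Foo:
    L = [1, 2, 3]

    def __init__(self):
        self.l = self.L  # the bug from the question: missing copy()


def run_buggy_case():
    """Return the failure message produced by the shared-list bug, or None."""
    foo1, foo2 = Foo(), Foo()
    try:
        # Only foo1 is supposed to change; foo2 and the class default are watched.
        with assert_unchanged(foo2_l=foo2.l, class_L=Foo.L):
            foo1.l.append(4)
    except AssertionError as exc:
        return str(exc)
    return None
```

In a pytest suite the same context manager could be wrapped in a fixture. The trade-off is that the objects to watch must be listed explicitly, so this inverts the "allowed to modify" idea rather than implementing it.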

Related

What is a "<refinement>" type gotten through a TypeTag?

I have a method:
import scala.reflect.runtime.universe.{TypeTag,typeOf}
def print[T:TypeTag] = println(typeOf[T].typeSymbol.name.toString)
Most of the time, print[MyClass] prints MyClass when invoked, but sometimes, it prints <refinement>?
I am working on a fairly complex system (multiple interconnecting jars, 100K lines of code), and I cannot seem to identify what determines if it is the one behaviour or the other. But if I knew what <refinement> means, or what triggers that, maybe I could?
A refinement can be explained as an anonymous class type. For example:
import scala.reflect.runtime.universe.{TypeTag,typeOf}
def print[T:TypeTag] = println(typeOf[T].typeSymbol.name.toString)
class C
trait T
print[C with T]
type A = C with T
print[A]
Output will be <refinement> in both cases.

scala programming practice with option

Maybe my design is flawed (most probably it is), but I have been thinking about the way Option is used in Scala and I am not very happy about it. Let's say I have 3 methods calling one another like this:
def A(): reads a file and returns something
def B(): returns something
def C(): Side effect (writes into DB)
and C() calls B() and in turn B() calls A()
Now, as A() depends on I/O operations, I had to handle the exceptions and return an Option, otherwise it won't compile (if A() does not return anything). As B() receives an Option from A() and has to return something, it is bound to return another Option to C(). So you can imagine that my code is flooded with match/case Some/case None (I don't always have the liberty to use getOrElse()). And if C() depends on other methods that also return Option, you would be scared to look at the definition of C().
So, am I missing something? Or how flawed is my design? How can I improve it?
Using match/case on type Option is often useful when you want to throw away the Option and produce some value after processing the Some(...) but a different value of the same type if you have a None. (Personally, I usually find fold to be cleaner for such situations.)
If, on the other hand, you're passing the Option along, then there are other ways to go about it.
def a(): Option[DataType] = {/*read new data or fail*/}
def b(): Option[DataType] = {
  ... //some setup
  a().map { inData =>
    ... //inData is real, process it for output
  }
}
def c(): Unit = {
  ... //some setup
  b().foreach { outData =>
    ... //outData is real, write it to DB
  }
}
am I missing something?
Option is one design decision, but there can be others. For example, what happens when you want to describe the error returned by the API? Option can only tell you one of two states: either I successfully read a value, or I failed. But sometimes you really want to know why you failed. If I return None, is it because the file isn't there, or because an exception was thrown (e.g. I don't have permission to read the file)?
Whichever path you choose, you'll usually be dealing with one or more effects. Option is one such effect, representing a partial function, i.e. an operation that may not yield a result. While pattern matching on Option, as others said, is one way of handling it, there are other operations that reduce the verbosity.
For example, if you want to invoke an operation in case the value exists and another in case it isn't and they both have the same return type, you can use Option.fold:
scala> val maybeValue = Some(1)
maybeValue: Some[Int] = Some(1)
scala> maybeValue.fold(0)(x => x + 1)
res0: Int = 2
Generally, there are many such combinators defined on Option and other effects. They might seem cumbersome at the beginning, but they grow on you, and you see their real power when you want to compose operations one after the other.

in scala what is cleaner code - to def (~member) versus pass function parameter?

Which is cleaner?
def version
trait Foo {
  def db: DB
  def save() = db.save()
  def load() = db.load()
}
versus parametric version:
trait Foo {
  def save(db: DB) = db.save()
  def load(db: DB) = db.load()
}
(I intentionally left out other parameters/members; I want to focus on this one.)
I have to say that when I look at complex projects, I am thankful when functions take all their dependencies as parameters:
I can unit test them easily without overriding members, and each function tells me everything it depends on in its signature.
I don't have to read the internal code to understand what the function does; I have its name, its input, and its output, all in the signature.
But I have also noticed that in Scala it's very conventional to use the def version, and I have to say that when such code comes bundled in complex projects, it is much less readable for me. Am I missing something?
I think in this case it highly depends on what the relationship is between Foo and DB. Would it ever be the case that a single instance of Foo would use one DB for load and another for save? If yes, then DB isn't really a dependency of Foo and the first example makes no sense. But it seems to me that the answer is no, that if you call load with one DB, you'll be using the same DB when you call save.
In your first example, that information is encoded into the type system. You're effectively letting the compiler do some correctness checking for you, since now you're enforcing at compile-time that for a single Foo, load and save will be called on the same DB (yes it's possible that db is a var, but that in itself is another issue).
Furthermore, it seems inevitable that you're just going to be passing around a DB every place you pass a Foo. Suppose you have a function that uses Foo. In the first example, your function would look like
def loadFoo(foo: Foo): Unit = {
  foo.load()
}
whereas in the second it would look like:
def loadFoo(foo: Foo, db: DB): Unit = {
  foo.load(db)
}
So all you've done is lengthened every function signature and opened up room for errors.
Lastly, I would argue that your points about unit testing and not needing to read a function's code are invalid. In the first example, it's true that you can't see all of load's dependencies just by looking at the function signature. But load is not an isolated function, it is a method that is part of a trait. A method is not identical to a plain old function and they exist in the context of their defining trait.
In other words, you should not be thinking about unit testing the functions, but rather unit testing the trait. They're a package deal, and you should have no expectation that their behavior is independent of each other. If you do want that kind of independence, then Foo should be an object, which basically makes load and save static methods (although even then objects can have internal state, but that is far less idiomatic).
Plus, you can never really tell what a function is doing just by looking at its dependencies. After all I could write a function:
def save(db: DB): Unit = {
  throw new Exception("hello!!")
}

Scala, Specs2, Mockito and null return values

I'm trying to test-drive some Scala code using Specs2 and Mockito. I'm relatively new to all three, and having difficulty with the mocked methods returning null.
In the following (transcribed with some name changes)
"My Component's process(File)" should {
"pass file to Parser" in new modules {
val file = mock[File]
myComponent.process(file)
there was one(mockParser).parse(file)
}
"pass parse result to Translator" in new modules {
val file = mock[File]
val myType1 = mock[MyType1]
mockParser.parse(file) returns (Some(myType1))
myComponent.process(file)
there was one(mockTranslator).translate(myType1)
}
}
The "pass file to Parser" works until I add the translator call in the SUT, and then dies because the mockParser.parse method has returned a null, which the translator code can't take.
Similarly, the "pass parse result to Translator" passes until I try to use the translation result in the SUT.
The real code for both of these methods can never return null, but I don't know how to tell Mockito to make the expectations return usable results.
I can of course work around this by putting null checks in the SUT, but I'd rather not, as I'm making sure to never return nulls and instead using Option, None and Some.
Pointers to a good Scala/Specs2/Mockito tutorial would be wonderful, as would a simple example of how to change a line like
there was one(mockParser).parse(file)
to make it return something that allows continued execution in the SUT when it doesn't deal with nulls.
Flailing about trying to figure this out, I have tried changing that line to
there was one(mockParser).parse(file) returns myResult
with a value for myResult that is of the type I want returned. That gave me a compile error as it expects to find a MatchResult there rather than my return type.
If it matters, I'm using Scala 2.9.0.
If you haven't seen it, have a look at the mock expectations page of the specs2 documentation.
In your code, the stub should be mockParser.parse(file) returns myResult
Edited after Don's edit:
There was a misunderstanding. The way you do it in your second example is the right one, and you should do exactly the same in the first test:
val file = mock[File]
val myType1 = mock[MyType1]
mockParser.parse(file) returns (Some(myType1))
myComponent.process(file)
there was one(mockParser).parse(file)
The idea of unit testing with mock is always the same: explain how your mocks work (stubbing), execute, verify.
That should answer the question. Now some personal advice:
Most of the time, unless you want to verify some algorithmic behavior (stop on first success, process a list in reverse order), you should not test expectations in your unit tests.
In your example, the process method should "translate things", so your unit tests should focus on that: mock your parsers and translators, stub them, and only check the result of the whole process. It's less fine-grained, but the goal of a unit test is not to check every step of a method. If you want to change the implementation, you should not have to modify a bunch of unit tests that verify each line of the method.
I have managed to solve this, though there may be a better solution, so I'm going to post my own answer, but not accept it immediately.
What I needed to do was supply a sensible default return value for the mock, in the form of an org.mockito.stubbing.Answer<T> with T being the return type.
I was able to do this with the following mock setup:
val defaultParseResult = new Answer[Option[MyType1]] {
  def answer(p1: InvocationOnMock): Option[MyType1] = None
}
val mockParser = org.mockito.Mockito.mock(implicitly[ClassManifest[Parser]].erasure,
  defaultParseResult).asInstanceOf[Parser]
after a bit of browsing of the source for the org.specs2.mock.Mockito trait and things it calls.
And now, instead of returning null, the parse returns None when not stubbed (including when it's expected as in the first test), which allows the test to pass with this value being used in the code under test.
I will likely make a test support method hiding the mess in the mockParser assignment, and letting me do the same for various return types, as I'm going to need the same capability with several return types just in this set of tests.
I couldn't locate support for a shorter way of doing this in org.specs2.mock.Mockito, but perhaps this will inspire Eric to add such. Nice to have the author in the conversation...
Edit
On further perusal of source, it occurred to me that I should be able to just call the method
def mock[T, A](implicit m: ClassManifest[T], a: org.mockito.stubbing.Answer[A]): T = org.mockito.Mockito.mock(implicitly[ClassManifest[T]].erasure, a).asInstanceOf[T]
defined in org.specs2.mock.MockitoMocker, which was in fact the inspiration for my solution above. But I can't figure out the call. mock is rather overloaded, and all my attempts seem to end up invoking a different version and not liking my parameters.
So it looks like Eric has already included support for this, but I don't understand how to get to it.
Update
I have defined a trait containing the following:
def mock[T, A](implicit m: ClassManifest[T], default: A): T = {
  org.mockito.Mockito.mock(
    implicitly[ClassManifest[T]].erasure,
    new Answer[A] {
      def answer(p1: InvocationOnMock): A = default
    }).asInstanceOf[T]
}
and now by using that trait I can setup my mock as
implicit val defaultParseResult = None
val mockParser = mock[Parser,Option[MyType1]]
I don't after all need more usages of this in this particular test, as supplying a usable value for this makes all my tests work without null checks in the code under test. But it might be needed in other tests.
I'd still be interested in how to handle this issue without adding this trait.
Without the full code it's difficult to say, but can you please check that the method you're trying to mock is not a final method? In that case Mockito won't be able to mock it and will return null.
Another piece of advice, when something doesn't work, is to rewrite the code with Mockito in a standard JUnit test. Then, if it fails, your question might be best answered by someone on the Mockito mailing list.

(Usage of Class Variables) Pythonic - or nasty habit learnt from java?

Hello Pythoneers: the following code is only a mock up of what I'm trying to do, but it should illustrate my question.
I would like to know if this is dirty trick I picked up from Java programming, or a valid and Pythonic way of doing things: basically I'm creating a load of instances, but I need to track 'static' data of all the instances as they are created.
class Myclass:
    counter=0
    last_value=None
    def __init__(self,name):
        self.name=name
        Myclass.counter+=1
        Myclass.last_value=name
And some output of using this simple class , showing that everything is working as I expected:
>>> x=Myclass("hello")
>>> print x.name
hello
>>> print Myclass.last_value
hello
>>> y=Myclass("goodbye")
>>> print y.name
goodbye
>>> print x.name
hello
>>> print Myclass.last_value
goodbye
So is this a generally acceptable way of doing this kind of thing, or an anti-pattern ?
[For instance, I'm not too happy that I can apparently set the counter from both within the class(good) and outside of it(bad); also not keen on having to use full namespace 'Myclass' from within the class code itself - just looks bulky; and lastly I'm initially setting values to 'None' - probably I'm aping static-typed languages by doing this?]
I'm using Python 2.6.2 and the program is single-threaded.
Class variables are perfectly Pythonic in my opinion.
Just watch out for one thing. An instance variable can hide a class variable:
x.counter = 5 # creates an instance variable in the object x.
print x.counter # instance variable, prints 5
print y.counter # class variable, prints 2
print Myclass.counter # class variable, prints 2
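The same shadowing can be checked step by step. This is a small sketch in Python 3 syntax (the question uses Python 2), reusing the question's Myclass with only the counter; `vars()` is used to show where the attribute actually lives:

```python
class Myclass:
    counter = 0

    def __init__(self, name):
        self.name = name
        Myclass.counter += 1


x = Myclass("hello")
y = Myclass("goodbye")

x.counter = 5                 # creates an instance attribute on x only
print(x.counter)              # 5 (instance attribute shadows the class one)
print(y.counter)              # 2 (falls back to the class attribute)
print(Myclass.counter)        # 2
print("counter" in vars(x))   # True
print("counter" in vars(y))   # False

del x.counter                 # removing the instance attribute unhides the class one
print(x.counter)              # 2
```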
Do. Not. Have. Stateful. Class. Variables.
It's a nightmare to debug, since the class object now has special features.
Stateful classes conflate two (2) unrelated responsibilities: state of object creation and the created objects. Do not conflate responsibilities because it "seems" like they belong together. In this example, the counting of created objects is the responsibility of a Factory. The objects which are created have completely unrelated responsibilities (which can't easily be deduced from the question).
Also, please use Upper Case Class Names.
class MyClass( object ):
    def __init__(self, name):
        self.name=name
def myClassFactory( iterable ):
    for i, name in enumerate( iterable ):
        yield MyClass( name )
The sequence counter is now part of the factory, where the state and counts should be maintained. In a separate factory.
[For folks playing Code Golf, this is shorter. But that's not the point. The point is that the class is no longer stateful.]
It's not clear from the question how Myclass instances get created. Lacking any clue, there isn't much more that can be said about how to use the factory. An iterable is the usual culprit. Perhaps something that iterates through a list or a file or some other iterable data structure.
Also -- for folks just off the boat from Java -- the factory object is just a function. Nothing more is needed.
Since the example on the question is perfectly unclear, it's hard to know why (1) two unique objects are created with (2) a counter. The two unique objects are already two unique objects and a counter isn't needed.
For example, the static variables in the Myclass are never referenced anywhere. That makes it very, very hard to understand the example.
x, y = myClassFactory( [ "hello", "goodbye" ] )
If the count or last value were actually used for something, then perhaps a meaningful example could be created.
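As an illustration only, here is one way the factory function could surface the count if a caller did need it. This is a sketch in Python 3 under the assumption that yielding (sequence number, instance) pairs is acceptable; the name `my_class_factory` and the pair-yielding design are not from the answer above.

```python
class MyClass:
    def __init__(self, name):
        self.name = name


def my_class_factory(names):
    """Yield (sequence_number, instance) pairs; the count lives in the factory."""
    for i, name in enumerate(names, start=1):
        yield i, MyClass(name)


created = list(my_class_factory(["hello", "goodbye"]))
count, last = created[-1]     # number of objects created and the most recent one
print(count, last.name)       # 2 goodbye
```

The class itself stays stateless; whoever consumes the factory decides whether to keep the count or the last value.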
You can solve this problem by splitting the code into two separate classes.
The first class will be for the object you are trying to create:
class MyClass(object):
    def __init__(self, name):
        self.Name = name
And the second class will create the objects and keep track of them:
class MyClassFactory(object):
    Counter = 0
    LastValue = None
    @classmethod
    def Build(cls, name):
        inst = MyClass(name)
        cls.Counter += 1
        cls.LastValue = inst.Name
        return inst
This way, you can create new instances of the class as needed, but the information about the created classes will still be correct.
>>> x = MyClassFactory.Build("Hello")
>>> MyClassFactory.Counter
1
>>> MyClassFactory.LastValue
'Hello'
>>> y = MyClassFactory.Build("Goodbye")
>>> MyClassFactory.Counter
2
>>> MyClassFactory.LastValue
'Goodbye'
>>> x.Name
'Hello'
>>> y.Name
'Goodbye'
Finally, this approach avoids the problem of instance variables hiding class variables, because MyClass instances have no knowledge of the factory that created them.
>>> x.Counter
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute 'Counter'
You don't have to use a class variable here; this is a perfectly valid case for using globals:
_counter = 0
_last_value = None
class Myclass(object):
    def __init__(self, name):
        self.name = name
        global _counter, _last_value
        _counter += 1
        _last_value = name
I have a feeling some people will knee-jerk against globals out of habit, so a quick review may be in order of what's wrong--and not wrong--with globals.
Globals traditionally are variables which are visible and changeable, unscoped, from anywhere in the program. This is a problem with globals in languages like C. It's completely irrelevant to Python; these "globals" are scoped to the module. The class name "Myclass" is equally global; both names are scoped identically, in the module they're contained in. Most variables--in Python as in C++--are logically part of instances of objects or locally scoped, but this is clearly shared state across all users of the class.
I don't have any strong inclination against using class variables for this (and using a factory is completely unnecessary), but globals are how I'd generally do it.
Is this pythonic? Well, it's definitely more pythonic than having global variables for a counter and the value of the most recent instance.
It's said in Python that there's only one right way to do anything. I can't think of a better way to implement this, so keep going. Despite the fact that many will criticize you for "non-pythonic" solutions to problems (like the needless object-orientation that Java coders like or the "do-it-yourself" attitude that many from C and C++ bring), in most cases your Java habits will not send you to Python hell.
And beyond that, who cares if it's "pythonic"? It works, and it's not a performance issue, is it?