Test MongoDB Interactions in a PHP Application With Mocking

What is the best practice to work efficiently with MongoDB and PHPUnit? What should (or could) I use to mock objects that access MongoDB? PHPUnit's built-in mocks, Mockery, Prophecy, Phactory?

If you look at mocking data for SQL databases, there are lots of opinions here.
Some people suggest using an in-memory SQL database.
Some people just mock the ORM calls and assume that the ORM to DB portion is tested.
Some people use a "local" DB for unit testing and ignore the whole "mocking" concept.
Given the lack of consensus on SQL, it's even less likely that you will find consensus on newer databases like MongoDB.
I think there are some important details to consider here.
Are you using some form of ORM / ODM? Just the driver directly?
Are you trying to mock all communications with the DB? Are you trying to mock the ODM?
If you are just trying to mock communications to DB, then the ideal solution is a "fake" implementation of the MongoDB driver. This is probably a lot of work as the driver was never written with "mockability" in mind.
If you have an ODM, then you can simply mock the ODM calls and assume the ODM is doing its job. Ideally the ODM should provide some mockable interface, but this is not always the case.
Again, this answer comes down to what you're really planning to test and what you consider a good unit test. Unfortunately, most of these products are still very new, so there is very little guidance in this space.
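To make the "mock the ODM/driver calls" option concrete, here is a minimal sketch using PHPUnit's built-in mocks; UserRepository is a hypothetical class under test that receives the collection through its constructor, which is what makes it mockable:

<?php

use PHPUnit\Framework\TestCase;

class UserRepositoryTest extends TestCase
{
    public function testFindByEmailQueriesTheCollection(): void
    {
        // Mock the driver-level collection instead of hitting a real database.
        $collection = $this->createMock(\MongoDB\Collection::class);
        $collection->expects($this->once())
            ->method('findOne')
            ->with(['email' => 'jane@example.com'])
            ->willReturn(['_id' => 1, 'email' => 'jane@example.com']);

        // UserRepository is a hypothetical class under test.
        $repository = new UserRepository($collection);
        $user = $repository->findByEmail('jane@example.com');

        $this->assertSame('jane@example.com', $user['email']);
    }
}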

Phactory provides direct support for mocking MongoDB.
Edit: Phactory is no longer maintained. However, I've found a new project called php-mongomock that seems to solve this problem:
<?php

use Helmich\MongoMock\MockCollection;

// An in-memory stand-in for MongoDB\Collection:
$collection = new MockCollection();
$collection->createIndex(['foo' => 1]);

// The mock mirrors the real driver's API: insert, then update by the generated id.
$documentId = $collection->insertOne(['foo' => 'bar'])->insertedId();
$collection->updateOne(['_id' => $documentId], ['$set' => ['foo' => 'baz']]);
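Assuming MockCollection also implements reads the way the real driver does (it is designed as an in-memory stand-in for MongoDB\Collection), you can read the document back and assert on the update:

$document = $collection->findOne(['_id' => $documentId]);
// $document['foo'] should now be 'baz'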

Related

.net core 2.1 multiple DbContext for same database

I am using .NET Core 2.1 + EF Core to build a WebApi. Based on this discussion: Entity Framework: One Database, Multiple DbContexts. Is this a bad idea? and the bounded context concept from DDD, I want to create multiple DbContexts for different functionalities in my app. There are several resources provided by Julie Lerman on PluralSight.com, but none of those cover dependency injection in .NET Core. So far I did something like this:
var connectionString = Configuration.GetConnectionString("DatabaseName");

services.AddDbContext<DatabaseContext>(options =>
    options.UseSqlServer(connectionString, optionsBuilder =>
        optionsBuilder.MigrationsAssembly("Database.Migrations")));

services.AddDbContext<ProductContext>(options =>
    options.UseSqlServer(connectionString));

services.AddDbContext<CountryContext>(options =>
    options.UseSqlServer(connectionString));
Here DatabaseContext is the context used for EF Core migrations (it does not actually query the data), and ProductContext and CountryContext are two of the contexts I use for data manipulation. My questions are:
Is this the right way to do it? It feels a bit odd to re-use the same connection string.
It feels like I would be using two separate connections (and this will grow). Could this cause data access issues in the future: locks, concurrency, etc.?
UPDATE (2022-11-11)
I have been using this for a few years now and have not had any issues related to DB connections so far. I am planning to use this approach in future projects as well.
The concept of having bounded contexts is fine, per se. It's based on the idea that a particular application shouldn't have access to things it doesn't need. Personally, I think it's a tad overkill, but reasonable people can disagree on the issue.
However, what you have here is simply wrong. If your application needs access to all these bounded contexts, it makes no sense to have them. Just feed it one context with all the access it needs. Yes, each context will have a separate connection, so yes, you will likely have multiple connections in play servicing requests. Again, this is why it makes no sense to divvy out your contexts in this scenario.
If, however, you were to take a microservice architecture approach, where each microservice dealt with one discrete unit of functionality, and you wanted to be a purist about it, employing bounded contexts would make sense; in a monolithic app, it does not.
Just do it for each context:
services.AddScoped<YourContext>(s =>
{
    var builder = new DbContextOptionsBuilder().UseSqlServer("ConnectionString");
    return new YourContext(builder.Options);
});
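For completeness, a sketch of how such a scoped context is then consumed; ProductService, Product, and the Products DbSet are assumptions for illustration, not part of the original answer:

using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class ProductService
{
    private readonly YourContext _context;

    // The container injects the scoped YourContext registered above,
    // so each request gets its own instance.
    public ProductService(YourContext context) => _context = context;

    public Task<List<Product>> GetAllAsync() => _context.Products.ToListAsync();
}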

How to manage test data for Hibernate Search integration tests

I have a Spring-based system that uses Hibernate Search 3.4 (on top of Hibernate 3.5.4). Integration tests are managed by Spring, with the @Transactional annotation. At the moment test data (entities that are to be indexed) is loaded by a Liquibase script; we use its Spring integration. It's very inconvenient to manage.
My new solution is to have test data defined as Spring beans and wire them as Resources, by name. This part works.
I tried to have these beans persisted and indexed in the setUp method of my test cases (and in the test methods themselves), but I failed. They get into the DB fine, but I can't get them indexed. I tried calling index() on FullTextEntityManager (with flushToIndexes); I tried createIndexer().startAndWait().
What else can I do?
Or maybe there is some better option for testing HS?
Thank you in advance.
My new solution is to have test data defined as Spring beans and wire them as Resources, by name. This part works.
sounds like a strange setup for a unit test. To be honest, I am not quite sure how you do this.
In Hibernate Search itself, an in-memory database (H2) is used together with a Lucene RAM directory. The benefit of such a setup is that it is fast and easy to avoid dependencies between tests.
I tried to have these beans persisted and indexed in setUp method of my test cases (and in test methods themselves) but I failed. They get into DB fine but I can't get them indexed.
If automatic indexing is enabled and the persisting of the test data occurs within a transaction, it should work. A common mistake in combination with Spring is to use the wrong transaction manager. The Hibernate Search forum has a lot of threads around this, for example this one - https://forum.hibernate.org/viewtopic.php?f=9&t=998155. Since you are not giving any concrete configuration and code examples, it is hard to give more specific advice.
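For reference, a minimal sketch of such a transactional test (the Book entity and the Spring config location are placeholders, not from the question):

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.transaction.annotation.Transactional;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("classpath:test-context.xml")
@Transactional
public class BookIndexingTest {

    @PersistenceContext
    private EntityManager entityManager;

    @Test
    public void persistedEntityIsIndexed() {
        entityManager.persist(new Book("Hibernate Search in Action")); // placeholder entity
        entityManager.flush();

        // Force pending index work to be written inside the test transaction:
        FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
        ftem.flushToIndexes();

        // ...then run a full-text query against ftem to assert the hit.
    }
}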
I tried createIndexer().startAndWait()
that is also a good approach. I would recommend it if you want to insert not just a couple of test entities but a whole set of data. In this case it can make sense to use a framework like DbUnit to insert the test data and then manually index it; createIndexer().startAndWait() is the right tool for that. Extracting all this loading/persisting/indexing functionality into a common test base class is the way to go. The base class can also be responsible for all the Spring bootstrapping.
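A sketch of what such a base class might look like (names are placeholders; the DbUnit fixture loading is omitted):

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;

public abstract class SearchTestBase {

    @PersistenceContext
    protected EntityManager entityManager;

    // Call after the fixtures have been inserted; rebuilds the whole index
    // from the current database contents.
    protected void indexTestData() throws InterruptedException {
        FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
        ftem.createIndexer().startAndWait();
    }
}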
Again, to give more specific feedback you have to refine your question.
I have a completely different approach: when I write any queries, I want to write a complete test suite, but data creation has always been a pain (special mention to when the test customer gets corrupted and your whole test suite breaks).
To solve this I created Random-JPA. It's simple and easy to integrate. The whole idea is that you create fresh data and test.
You can find the full documentation here.

Perl DAL Design Questions

Recently I've been working on some Perl projects and I'm a very novice Perl programmer. I've been experimenting with DBIx::Class, and so far I'm really pleased with the flexibility and the ease of use. I'm curious, though. I come from a .NET background, and it seems like we spend a lot of time abstracting our DAL to a certain degree. Is this a good idea with a language like Perl?
Where I want to get to shortly is the ability to start mocking my DAL so I can write unit tests for tasks. Right now, though, I'm struggling with how the overall structure and design of the application should look.
Re: Relationship of the ORM within the application...
Hopefully this is the kind of answer you are looking for...
With most web app frameworks in the "scripting" world (e.g. Perl, Ruby, Python, PHP), most of the time I've seen the business logic implemented at the ORM object level. E.g. in a Rails app it's at the ActiveRecord level; if you are using DBIx::Class it would be at the Result-class level.
More concretely, in the case of DBIx::Class, if you have a table named VENDOR there would be a class called MySchema::Result::Vendor which represents a single row in the table VENDOR. Simply add your business methods to this class.
One disadvantage of this approach is that it ties your business logic to the ORM class, which can make (unit) testing more difficult. One solution is to use a lightweight database for unit tests (e.g. SQLite); an ORM like DBIx::Class will facilitate switching between the two. Of course, this won't work if you rely on SQL features which are not implemented in SQLite.
Another approach is to place your business logic methods into a Moose role. Then those methods can be composed into either the DBIx::Class Result class or into a mock object for testing. I can elaborate with an example if you'd like.
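Since an example was offered, here is a minimal sketch of that role approach (package names are invented; composing a role into an actual DBIx::Class Result class typically also needs MooseX::NonMoose, which is omitted here):

# Business logic lives in a role, independent of the ORM.
package MyApp::Role::VendorLogic;
use Moose::Role;

requires 'name';   # provided by the Result class or by a mock

sub display_name {
    my $self = shift;
    return uc $self->name;
}

no Moose::Role;
1;

# A lightweight mock for unit tests, composing the same business logic:
package Test::MockVendor;
use Moose;

has name => (is => 'ro', isa => 'Str', required => 1);
with 'MyApp::Role::VendorLogic';

no Moose;
1;

# In a test:
#   my $vendor = Test::MockVendor->new(name => 'Acme');
#   is($vendor->display_name, 'ACME');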
One big assumption of the above is that your business object = one row in the database. If this is not the case (i.e. your business object spans more than one table), then you'll probably want to create a "shell" or container object which has as instance members each of the constituent ORM objects. Fortunately, Moose has a nice facility for delegating methods (search for Moose delegation and the handles attribute of instance member declarations), so it is relatively easy to make a composite business object out of two or more ORM objects. Again, I can give you an example of this if you'd like; a sketch of the delegation idea follows (attribute and method names are invented):
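package MyApp::VendorAccount;
use Moose;

has vendor_row => (
    is      => 'ro',
    handles => [qw(name address)],                # forwarded to the VENDOR row
);

has account_row => (
    is      => 'ro',
    handles => { balance => 'current_balance' },  # renamed on delegation
);

no Moose;
1;

# $va->name and $va->balance now call through to the underlying ORM rows.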
HTH
I used to work on Perl projects for the web long ago, but after working with things such as Django, Perl's tools like DBI now look rather rudimentary and outdated to me. Have a look at the Django ORM for example: it's elegant and very productive to use, and you can bypass it if your query is too complex or the ORM gets in the way...
These days I'd go python or ruby for that kind of projects.
For one liners, small text parsing or sysadmin stuff I still love to use small perl snippets. But I'm more into DRY than TMTOWTDI for more than a few lines of code these days.

Zend Framework Filter, prevent sql injection

For some important reasons I can't use the standard methods provided by ZF to prevent SQL injection. I have just written this (and I am using it on all POST/GET data from the user):
$filter = new Zend_Filter_PregReplace();
$filter->setMatchPattern(array("/[';`]/"))
->setReplacement(array(''));
I am using a MySQL database only. Is it enough? Is it secure now?
Never do stuff like this using regular expressions. If you can't use Zend's database methods, use whatever sanitization the database library offers you. For MySQL's procedural wrapper, that would be mysql_real_escape_string(). For PDO, parameterized queries will take care of it automatically. And so on.
That said, I really don't understand why this is necessary in the first place. Why can't you use what the framework offers? I bet there is a better workaround than doing sanitization on your own.
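To make the PDO suggestion concrete, a minimal sketch of a parameterized query (connection details and the table name are placeholders):

<?php

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// User input never touches the SQL string itself:
$statement = $pdo->prepare('SELECT * FROM users WHERE email = :email');
$statement->execute([':email' => $_POST['email']]);
$user = $statement->fetch(PDO::FETCH_ASSOC);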
You really should use sanitization provided by the framework - Zend (PDO, ORM). If you don't there is probably something already going wrong.
There are so many ways to inject malicious code that, to exclude all of them, you would have to find or roll your own framework of some kind to be safe.

Rules of thumb for writing "queries" using ADO.NET Entity Framework

I'm currently working on a prototype of a medium-size web application, and I thought that it would be good to also experiment with Entity Framework. The problem is that the major part of the application is not the data layer and logic, so I don't have much time to play with Entity Framework. On the other hand, the database schema is quite simple.
One of the problems I’m facing is that I cannot find a consistent way to "write queries". As far as I can tell, there are four "interfaces" for the job:
LINQ to Entities
LINQ to Entities using LINQ extension methods
Entity SQL
Query builder
OK, the first two are essentially the same, but it’s good to use just one for maintenance and consistency.
I’m mostly puzzled by the fact that none of them seems to be complete and the most general. I often find myself cornered and using some ugly looking combination of several of them. My guess is that Entity SQL is the most general one, but writing queries using strings feels like a step back. The main reason I’m experimenting with something like Entity Framework is that I like the compile time checking.
Some other random thought / issues:
I often also use the ObjectQuery.Include() method, but again it takes a string. Is this the only way?
When to use ObjectQuery.Execute() (vs. ToList())? Does it actually execute the query?
Should I execute queries as soon as possible (e.g. using ToList()), or should I not care and just leave the execution for the first enumeration that comes along?
Are ObjectQuery.Skip() and ObjectQuery.Take() available only as extension methods? Is there a better way to do paging? It’s 2009 and almost every web application deals with paging.
Overall, I understand there are many difficulties when implementing an ORM, and often one has to compromise. On the other hand, direct database access (e.g. ADO.NET) is plain and simple and has a well-defined interface (tabular results, data readers), so all code - no matter who writes it and when - is consistent. I don't want to be faced with too many choices whenever I write a database query. It's too tedious, and more than likely different developers will come up with different ways.
What are your rules of thumb?
I use LINQ-to-Entities as much as possible. I also try to standardize on the lambda form, as opposed to the extended SQL-style syntax. I have to admit to having had problems enforcing relationships and making compromises on efficiency just to expedite my coding of our application (e.g. Master->Child tables may need to be manually loaded), but all in all, EF is a good product.
I do use EF's .Include() method for eager loading, which, as you say, does require a string input. I find no problem with this, other than that of identifying the string to use, which is relatively simple. I guess if you're keen on compile-time checking of such relations, a model similar to Parent.GetChildren() might be more appropriate.
My application does require some "dynamic" queries to be performed, though. I have two ways of meeting this:
a) I create a mediator object, eg. ClientSearchMediator, which "knows" how to search for clients by name, etc. I can then put this through a SearchHandler.Search(ISearchMediator[] mediators) call (for example). This can be used to target specific data structures and sort results accordingly using LINQ-to-Entities.
b) For a looser experience, possibly as a result of a user designing their own query (using high level tools our application provides), eSQL is ideal for this purpose. It can be made to be injection-safe.
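As an illustration of what injection-safe eSQL can look like, a sketch using query parameters (the entity set and names are assumptions, not from the original answer):

using System.Data.Objects;

// Values travel as ObjectParameters, never spliced into the query text:
var query = context.CreateQuery<Client>(
    "SELECT VALUE c FROM MyEntities.Clients AS c WHERE c.Name = @name",
    new ObjectParameter("name", userInput));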
I don't have enough knowledge to address all of this, but I'll at least take a few stabs.
I don't know why you think ADO.NET is more consistent than Entity Framework. There are many different ways to use ADO.NET and I've definitely seen inconsistency within a single code base.
Entity Framework is currently a 1.0 release and it suffers from many 1.0 type problems (incomplete & inconsistent API, missing features, etc.).
In regards to Include, I assume you are referring to eager loading. Multiple people (outside of Microsoft) have developed solutions for getting "type safe" includes (try googling something like: Entity Framework ObjectQueryExtension Include). That said, Include is more of a hint than anything. You can't force eager loading and you have to always remember to call the IsLoaded() method to see if your request was fulfilled. As far as I know, the way "Include" works is not changing at all in the next version of Entity Framework (4.0 - to ship with VS 2010).
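For illustration, a minimal sketch of the kind of "type safe" Include extension those community solutions provide (the EF 1.0-era ObjectQuery API is real; the extension method itself is an assumption):

using System;
using System.Data.Objects;
using System.Linq.Expressions;

public static class ObjectQueryExtensions
{
    // Lets you write query.Include(o => o.Customer) instead of
    // query.Include("Customer"), recovering the name from the expression tree.
    public static ObjectQuery<T> Include<T, TProperty>(
        this ObjectQuery<T> query, Expression<Func<T, TProperty>> selector)
    {
        var member = (MemberExpression)selector.Body;
        return query.Include(member.Member.Name);
    }
}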
As far as executing the Linq query as soon as it's built vs. the last possible moment, that decision is situational. Personally, I would probably execute it as soon as it's built for the most part unless there was a compelling reason not to, but I can see other people going the opposite direction.
There are more mature ORMs on the market and Entity Framework isn't necessarily your best option. For the most part, you can bend Entity Framework to your will, but you may end up rolling your own implementation of features that come out of the box with other ORMs.