Perl DAL Design Questions - perl

Recently I've been working on some Perl projects, and I'm a very novice Perl programmer. I've been experimenting with DBIx::Class and so far I'm really pleased with its flexibility and ease of use. I'm curious, though: I come from a .NET background, where we spend a lot of time abstracting our DAL to a certain degree. Is this a good idea with a language like Perl?
What I want to do shortly is start mocking my DAL so I can write unit tests for tasks. Right now, though, I'm struggling with how the overall structure and design of the application should look.

Re: Relationship of the ORM within the application...
Hopefully this is the kind of answer you are looking for...
With most web app frameworks in the "scripting" world (e.g. Perl, Ruby, Python, PHP), I've mostly seen the business logic implemented at the ORM object level. E.g. in a Rails app it's at the ActiveRecord level; if you are using DBIx::Class it would be at the Result-class level.
More concretely, in the case of DBIx::Class, if you have a table named VENDOR there would be a class called MySchema::Result::Vendor which represents a single row in the table VENDOR. Simply add your business methods to this class.
One disadvantage of this approach is that it ties your business logic to the ORM class, which can make (unit) testing more difficult. One solution to this is to use a lightweight database for unit tests (e.g. SQLite); an ORM like DBIx::Class makes it easy to switch between the production database and the test database. Of course, this won't work if you rely on SQL features which are not implemented in SQLite.
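To illustrate the lightweight-database idea, here is a minimal sketch in Python (chosen only as a neutral language; the same shape applies with DBIx::Class and DBD::SQLite). The Vendor class, the invoice table and the total_owed() rule are all made up for the example.

```python
import sqlite3


class Vendor:
    """One row of a hypothetical VENDOR table, with a business method attached."""

    def __init__(self, conn, vendor_id):
        self.conn = conn
        self.vendor_id = vendor_id

    def total_owed(self):
        # Business rule (assumed for illustration): sum of unpaid invoices.
        row = self.conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM invoice "
            "WHERE vendor_id = ? AND paid = 0",
            (self.vendor_id,),
        ).fetchone()
        return row[0]


def make_test_db():
    # ":memory:" gives each test a throwaway database; no server is needed.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoice (
            id INTEGER PRIMARY KEY,
            vendor_id INTEGER REFERENCES vendor(id),
            amount INTEGER,
            paid INTEGER
        );
        INSERT INTO vendor (id, name) VALUES (1, 'Acme');
        INSERT INTO invoice (vendor_id, amount, paid) VALUES (1, 100, 0);
        INSERT INTO invoice (vendor_id, amount, paid) VALUES (1, 250, 0);
        INSERT INTO invoice (vendor_id, amount, paid) VALUES (1, 999, 1);
        """
    )
    return conn
```

A unit test would call make_test_db(), build Vendor(conn, 1) and assert on total_owed(); the business method itself never knows it is talking to SQLite rather than the production database.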
Another approach is to place your business logic methods into a Moose role. Then those methods can be composed into either the DBIx::Class Result class or into a mock object for testing. I can elaborate with an example if you'd like.
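A rough sketch of the role idea, using a Python mixin as a stand-in for a Moose role (not Moose itself). VendorLogic, VendorRow and MockVendor are hypothetical names; the point is only that the business methods live in one place and are composed into both the database-backed class and the test double.

```python
class VendorLogic:
    """Business rules only. The host class must provide `name` and
    `credit_limit` (comparable to what a role would `require`)."""

    def can_order(self, amount):
        return amount <= self.credit_limit

    def display_name(self):
        return self.name.strip().title()


class VendorRow(VendorLogic):
    """Stand-in for the ORM row class; it wraps a dict that would
    normally come from the database."""

    def __init__(self, row):
        self._row = row

    @property
    def name(self):
        return self._row["name"]

    @property
    def credit_limit(self):
        return self._row["credit_limit"]


class MockVendor(VendorLogic):
    """Test double: same business methods, no database behind it."""

    def __init__(self, name, credit_limit):
        self.name = name
        self.credit_limit = credit_limit
```

Tests for can_order() and display_name() can then run against MockVendor alone, with no schema or connection involved.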
One big assumption of the above is that your business object = one row in the database. If this is not the case (i.e. your business object spans more than one table), then you'll probably want to create a "shell" or container object which has each of the constituent ORM objects as an instance member. Fortunately, Moose has a nice facility for delegating methods (search for Moose delegation and the handles option of attribute declarations), so it is relatively easy to make a composite business object out of two or more ORM objects. Again, I can give you an example of this if you'd like.
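The container-with-delegation shape might look roughly like this. It is a Python approximation of what Moose's handles gives you, not Moose code; Delegate, Customer, Account and CustomerAccount are invented for the example.

```python
class Delegate:
    """Descriptor that forwards attribute lookups to a member object,
    roughly the job `handles` does for a Moose attribute."""

    def __init__(self, member, attr):
        self.member = member
        self.attr = attr

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(getattr(obj, self.member), self.attr)


class Customer:
    """Stand-in for one ORM row object."""

    def __init__(self, name):
        self.name = name


class Account:
    """Stand-in for a second ORM row object from another table."""

    def __init__(self, balance):
        self.balance = balance

    def can_afford(self, amount):
        return amount <= self.balance


class CustomerAccount:
    """Composite business object built from two row objects."""

    name = Delegate("customer", "name")
    balance = Delegate("account", "balance")
    can_afford = Delegate("account", "can_afford")

    def __init__(self, customer, account):
        self.customer = customer
        self.account = account

    def summary(self):
        return f"{self.name}: {self.balance}"
```

Callers only see CustomerAccount; which underlying row object answers each call is decided in one place.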
HTH

I used to work on Perl projects for the web long ago. But after working with things such as Django, Perl tools like DBI now look rather rudimentary and outdated to me. Have a look at the Django ORM, for example: it's elegant and very productive to use, and you can bypass it if your query is too complex or the ORM gets in the way...
These days I'd go with Python or Ruby for that kind of project.
For one liners, small text parsing or sysadmin stuff I still love to use small perl snippets. But I'm more into DRY than TMTOWTDI for more than a few lines of code these days.

Related

Replacements to hand-rolled ADO.NET POCO mapping?

I have written a wrapper around ADO.NET's DbProviderFactory that I use extensively throughout my applications. I also have written a lot of code that maps IDataReader rows to POCOs. However, as I have tons of classes the whole thing is getting to be a pain in the ass to maintain.
I have been looking at replacing the whole shebang with a micro-ORM like PetaPoco. I have a few queries though:
I have lots of POCOs that contain other POCOs in them as properties. How well does PetaPoco support this?
Should I use an ORM like Massive or Simple.Data that returns a dynamic object and map that to a POCO?
Are there any approaches I can take to the whole mapping of rows to POCOs? I can't really use convention-based tools as my database isn't particularly consistent in how it is designed.
How about using a text templating/code generator to build out a lightweight persistence layer? I have a battle-hardened open source project called TextMetal to generate the necessary persistence layer based on tried and true architectural decisions. The only lacking thing is object to object relations but it does support query expressions and works well with poorly designed data schemas.
You can see a real-world project that uses the above tool, called Can Do It For.
Feel free to ask me about any design decisions once you take a look-see.
Simple.Data automagically casts its dynamic type to static types. It will map nested properties as long as they have been eager-loaded using the .With method. So for example
Customer customer = db.Customer.WithOrders().Get(42);
would populate the Orders property of the customer object.
Could you use QueryFirst, or modify it? It takes your SQL and wraps it in vanilla ADO code, generated at design time. You get fresh POCOs from your result schema every time you save your file. Additionally, you can choose to test all queries and regenerate all wrappers via the option in the tools menu. It's dependent on SQL Server and SqlClient, so unless you do some modification, you'll lose DbProviderFactory.

ZF models correct use

I am struggling to understand the correct usage of models. Currently I inherit from Db_Table directly and declare all the business logic there. I know this is not the correct way to do it.
One solution would be to use Doctrine ORM, but this has a learning curve, and the components I currently use (paginator and auth) would need to be rewritten. Doctrine1 also adds another dozen classes which need to be loaded.
So the cleanest implementation I have seen so far is to use Data Mapper classes between the so-called model and DbTable. I haven't implemented this yet, as it seems to be heading towards writing another ORM. But an example could be something like this, for an SQL table User:
the model, a class with setters, getters and business logic: /model/User.php
the data mapper: /model/mapper/UserMapper.php; its functionality is basically all the update and save actions
the data source: /model/DbTable/User.php, which extends Db_Table_Abstract
Problems are with relationships between other models.
I have found it beneficial to not have my models extend Db_Table, but to use composition instead. That means my model 'has a' Db_Table rather than 'is a' Db_Table.
That way I find it much easier to reference multiple tables in the same model, which is a common requirement. This is enough for a simple project. I am currently developing a more complex application and have used the Data Mapper pattern and have found that it has simplified my code more than I would have believed.
Specifically, I have created a class which provides all access to the database and exposes methods such as getUser() etc. That way, if the DB changes, or my client wants something daft like storing records in XML, or we split the servers or something, I only have to rewrite one class.
Again, my models do not extend this class, but have an instance of it assigned as a property during construction.
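A small sketch of that arrangement, written in Python rather than PHP purely for brevity. InMemoryUserGateway, User and the activation rule are assumptions made for the example; in a ZF project the gateway would wrap a Db_Table instance instead of a dict.

```python
class InMemoryUserGateway:
    """Hypothetical data-access class: the only place that knows how
    users are stored. A SQL- or XML-backed class with the same methods
    could replace it without touching the model."""

    def __init__(self):
        self._rows = {}

    def get_user(self, user_id):
        return self._rows.get(user_id)

    def save_user(self, user_id, data):
        self._rows[user_id] = dict(data)


class User:
    """Model that 'has a' gateway rather than 'is a' table."""

    def __init__(self, gateway, user_id):
        # The gateway is handed in at construction time.
        self._gateway = gateway
        self.user_id = user_id
        row = gateway.get_user(user_id) or {"email": "", "active": False}
        self.email = row["email"]
        self.active = row["active"]

    def activate(self):
        # Business rule (assumed): a user needs an email before activation.
        if not self.email:
            raise ValueError("email required")
        self.active = True
        self._gateway.save_user(
            self.user_id, {"email": self.email, "active": self.active}
        )
```

Because the model only calls get_user() and save_user(), swapping the storage means rewriting the gateway class and nothing else.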
I would say the 'correct' way depends on the situation. Following the YAGNI and KISS principles, it is not good to over-complicate your model setup unless you really believe that it will benefit you in the long run.
What is the application you are developing? How is your current setup of extending Db_Table holding you back?

Database-independent queries to the extreme in Java... or in general

Let's say I have an app that should ideally be able to use a relational database, object database, XML files, or whatever to persist its data. In the spirit of coding to interfaces instead of implementations, I have a generic DataStore interface that specifies a contract for all I/O involving the data store. This interface can be implemented by concrete classes such as RDBMSDataStore, OODBMSDataStore, XMLFileDataStore, and so on.
This works well as long as I keep the contents of the DataStore interface simple - i.e. getThis(), getThose(), saveThat(), updateThis(), etc. But as soon as I require more complicated queries, it breaks down. The XMLFileDataStore class obviously doesn't understand SQL, and the RDBMSDataStore class obviously doesn't understand XPath/XQuery. And OODBMSDataStore understands something entirely different depending on the OODBMS in use.
I could adopt a language-independent object query language, write all my queries in that and then have the concrete classes translate them into their native language, but that's a huge task, if I want to be complete.
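To make the translation idea concrete, here is a deliberately tiny sketch in Python (not Java) of a neutral query format that two back ends interpret in their own way. The (field, operator, value) format, the class names and the fixed name/age columns are all invented for the example; a complete query language would be far larger, which is exactly the cost described above.

```python
import operator
import sqlite3

# A query is a list of (field, op, value) conditions, all ANDed together.
# Field names and operators must come from trusted code, never from user
# input, because the SQL back end interpolates them into the statement.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


class ListDataStore:
    """Back end that filters plain in-memory records."""

    def __init__(self, records):
        self.records = records

    def find(self, conditions):
        return [
            r for r in self.records
            if all(OPS[op](r[field], value) for field, op, value in conditions)
        ]


class SqlDataStore:
    """Back end that translates the same conditions into a WHERE clause."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def find(self, conditions):
        for _, op, _ in conditions:
            if op not in OPS:
                raise ValueError(f"unsupported operator: {op}")
        where = " AND ".join(f"{field} {op} ?" for field, op, _ in conditions)
        sql = f"SELECT name, age FROM {self.table}"
        if where:
            sql += f" WHERE {where}"
        sql += " ORDER BY name"
        rows = self.conn.execute(sql, [v for _, _, v in conditions]).fetchall()
        return [{"name": n, "age": a} for n, a in rows]
```

Both classes accept the same call, for example find([("age", ">", 18), ("age", "<", 40)]), and each decides how to evaluate it natively.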
Are there standards or best practices for handling this kind of situation in Java? Unfortunately it seems like 99% of the world interprets "database independence" to mean "relational database independence" and ignores the object databases, XML databases, document databases, etc. entirely.
From the way I read the question, this sounds a lot like the semantics that Hibernate brings to the table for Java. It even has a mode for dealing with XML as the backing store (using Dom4J). The Hibernate API has a number of extension points that could allow the addition of an OODBMS model. Even if Hibernate turns out not to be the best solution for you (implementation-wise), I think it provides a good example of the types of patterns that can be used to solve the problems you described.

Moving from Class::DBI to DBIx::Class

I'm currently doing some research on DBIx::Class in order to migrate my current application from Class::DBI. Honestly, I'm a bit disappointed with DBIx::Class when it comes to configuring the result classes. With Class::DBI I could set up metadata on models just by calling the on function, without a code generator. So my question is: can I do the same thing with DBIx::Class? Also, it seems that client-side triggers are not supported in DBIx::Class, or am I looking at the wrong docs?
Triggers can be implemented by redefining the appropriate method (new/create/update/delete etc) in the Result class, and calling the parent (via $self->next::method()) within it, either before or after your code. Admittedly it's a bit clumsy compared to the before/after triggers in Class::DBI.
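The override-and-call-the-parent pattern described above has this general shape. The sketch is in Python, where super() plays the role of $self->next::method(); Row and Article are stand-in classes, not DBIx::Class code.

```python
class Row:
    """Stand-in for the ORM's base row class."""

    def __init__(self, **columns):
        self.columns = dict(columns)
        self.saved = None

    def update(self, **changes):
        self.columns.update(changes)
        self.saved = dict(self.columns)  # pretend this writes to the database
        return self


class Article(Row):
    def update(self, **changes):
        # "before" trigger: normalise input before it is stored
        if "title" in changes:
            changes["title"] = changes["title"].strip()
        result = super().update(**changes)  # hand over to the parent method
        # "after" trigger: runs once the parent has done the write
        self.audit = f"updated {sorted(changes)}"
        return result
```

Code placed before the parent call behaves like a before trigger, and code placed after it behaves like an after trigger.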
As for metadata - are you talking about temporary columns on an object, i.e. data that won't be stored in the database row? These can be added easily using one of the Class::Accessor::* modules on CPAN.
One of the hardest changes to make when switching from CDBI to DBIC is to think in terms of ResultSets - often what would have been implemented via a Class method in CDBI becomes a method on a ResultSet - and code may need to be refactored considerably, it's not always a straightforward conversion from one to the other.

Rules of thumb for writing "queries" using ADO.NET Entity Framework

I’m currently working on a prototype of a medium-size web application, and I thought that it would be good to also experiment with Entity Framework. The problem is that the major part of the application is not the data layer and logic, so I don't have much time to play with Entity Framework. On the other hand, the database schema is quite simple.
One of the problems I’m facing is that I cannot find a consistent way to "write queries". As far as I can tell, there are four "interfaces" for the job:
LINQ to Entities
LINQ to Entities using LINQ extension methods
Entity SQL
Query builder
OK, the first two are essentially the same, but it’s good to use just one for maintenance and consistency.
I’m mostly puzzled by the fact that none of them seems to be complete and the most general. I often find myself cornered and using some ugly looking combination of several of them. My guess is that Entity SQL is the most general one, but writing queries using strings feels like a step back. The main reason I’m experimenting with something like Entity Framework is that I like the compile time checking.
Some other random thought / issues:
I often also use the ObjectQuery.Include() method, but again it takes a string. Is this the only way?
When to use ObjectQuery.Execute() (vs. ToList())? Does it actually execute the query?
Should I execute queries as soon as possible (e.g. using ToList()), or should I not care and just leave the execution to the first enumeration that happens to come along?
Are ObjectQuery.Skip() and ObjectQuery.Take() available only as extension methods? Is there a better way to do paging? It’s 2009 and almost every web application deals with paging.
Overall, I understand there are many difficulties when implementing an ORM, and often one has to compromise. On the other hand, direct database access (e.g. ADO.NET) is plain and simple and has a well-defined interface (tabular results, data readers), so all code is consistent, no matter who writes it or when. I don’t want to be faced with too many choices whenever I write a database query. It’s too tedious, and more than likely different developers will come up with different ways.
What are your rules of thumb?
I use LINQ-to-Entities as much as possible. I also try and formalise to the lambda-form, as opposed to the extended SQL-style syntax. I have to admit to have had problems enforcing relationships and making compromises on efficiency just to expedite my coding of our application (eg. Master->Child tables may need to be manually loaded) but all in all, EF is a good product.
I do use EF's .Include() method for eager loading, which, as you say, does require a string input. I find no problem with this, other than identifying the string to use, which is relatively simple. I guess if you're keen on compile-time checking of such relations, a model similar to Parent.GetChildren() might be more appropriate.
My application does require some "dynamic" queries to be performed, though. I have two ways of meeting this:
a) I create a mediator object, eg. ClientSearchMediator, which "knows" how to search for clients by name, etc. I can then put this through a SearchHandler.Search(ISearchMediator[] mediators) call (for example). This can be used to target specific data structures and sort results accordingly using LINQ-to-Entities.
b) For a looser experience, possibly as a result of a user designing their own query (using high level tools our application provides), eSQL is ideal for this purpose. It can be made to be injection-safe.
I don't have enough knowledge to address all of this, but I'll at least take a few stabs.
I don't know why you think ADO.NET is more consistent than Entity Framework. There are many different ways to use ADO.NET and I've definitely seen inconsistency within a single code base.
Entity Framework is currently a 1.0 release and it suffers from many 1.0 type problems (incomplete & inconsistent API, missing features, etc.).
In regards to Include, I assume you are referring to eager loading. Multiple people (outside of Microsoft) have developed solutions for getting "type safe" includes (try googling something like: Entity Framework ObjectQueryExtension Include). That said, Include is more of a hint than anything. You can't force eager loading, and you always have to remember to check the IsLoaded property to see if your request was fulfilled. As far as I know, the way "Include" works is not changing at all in the next version of Entity Framework (4.0 - to ship with VS 2010).
As far as executing the Linq query as soon as it's built vs. the last possible moment, that decision is situational. Personally, I would probably execute it as soon as it's built for the most part unless there was a compelling reason not to, but I can see other people going the opposite direction.
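The practical difference between the two choices can be shown with a Python generator standing in for a deferred LINQ query. This is only an analogy; in particular, a Python generator can be enumerated once, whereas a LINQ query runs again on each enumeration.

```python
def query(source, log):
    """Lazy 'query': nothing runs until something iterates over it."""
    for item in source:
        log.append(f"read {item}")
        yield item * 2


log = []
data = [1, 2, 3]

deferred = query(data, log)   # built, but not executed yet
log_before = list(log)        # still empty at this point

data.append(4)                # the source changes before enumeration
results = list(deferred)      # executes now, and sees the change
```

Materialising straight away (the ToList() choice) pins down when the data is read and where errors surface; deferring means the result depends on whatever the source looks like when it is finally enumerated.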
There are more mature ORMs on the market and Entity Framework isn't necessarily your best option. For the most part, you can bend Entity Framework to your will, but you may end up rolling your own implementation of features that come out of the box with other ORMs.