Is it good practice to use AccessBean or SQL to fetch data from OOTB table in IBM WCS - wcs

I want to get data from multiple OOTB WCS table for which there is no OOTB rest available. I am using multiple access bean in databean to get data from tables. Is this a good practice or we should use ServerJDBCHelperAccessBean make a single query with join to hit database. I understand that AccessBean are cached but there are techniques we can cache sql also.
Is there any other reason we should use AccessBean instead of ServerJDBCHelperAccessBean in case fetching data from multiple tables. or we should use ServerJDBCHelperAccessBean and get data in single sql query with joins.
And which will be more expensive in above approaches.
Thanks
Ankit

There is no hard and fast rule to choose between the above two methods for database interactions. Developer has to make a logical choice
AccessBeans
Caching is one of the advantage of access beans. That is a good performance improvement and is achieved by caching the home objects as the look up for home objects are costly. Another point in favour of access bean is handling optimistic updates. Your case is to get the data (not to update/insert) and hence you are safe here.
Session Bean
Like access bean , session beans are another way of reading data from DB when you want to get data from multiple tables. A session bean must implement BASEJDBCHelper class.
public class TestSessionBean extends
com.ibm.commerce.base.helpers.BaseJDBCHelper
implements SessionBean{
public Object fetchResults() throws
javax.naming.NamingException, SQLException
{
try {
// get a connection from the WebSphere Commerce data source
makeConnection();
PreparedStatement prepStatement = getPreparedStatement( "sql to execute");
ResultSet rs = executeQuery(prepStatement, false);
}
finally {
closeConnection();
}
}
}
Using ServerJDBCHelperAccessBean
This is used when you have to make a db transaction outside of EJBs. Keep in mind that it is highly recommended to use EJBs for update/delete for keeping the overall integrity.
In your case, as far as I understand it is a select involving multiple tables and you are not keen on the data to be really in sync (like you are OK to lose a data which was updated nano seconds back or so). Hence you can go ahead with second or third approach
A good reference :
http://deepakpadmakumar.blogspot.com.au/2012/05/session-beans-and-entity-beans-in-wcs.html

Related

JPA: how to map some entities to a different schema of another database instance?

JPA: is there a way to map some entities to a schema of another database instance? e.g.,
#Entity
public class Foo {
}
#Entity
#Table(schema="schema1")
public class Bar {
}
The Bar entity is mapped to the schema1 of the same database instance. Is there a way in JPA to map it to a schema in a remote database instance? It is useful for sharing entities among multiple applications.
Can the "catalog" be used for this purpose?
What do you mean by 'remote database'?
If you use #Table(schema = "myschema", name = "bar"), Hibernate will qualify all queries with the schema name (e.g. SELECT e FROM Bar will ultimately translate to SELECT * FROM myschema.bar). If the database user you're using to connect to the DB has access to myschema.bar (whatever such a DB object is), then the query will work; if not, then the query will fail.
If you mean 'a remote DB that is a separate server', then, of course, you can only connect to the DB using one JDBC connection per persistence context. If that's your scenario, perhaps you should consult the docs of the RDBMS for ways to connect two DB instances (in Oracle, for example, you could use database links and synonyms).
Make sure that you understand the implications, though, as such a solution introduces its own class of problems (including the fact that you suddenly have implicit distributed transactions in your system).
As a side note, I'm not sure how such an approach is 'useful for sharing entities among multiple applications' or why one would even think 'sharing entities among multiple applications' is somehow useful, but I'd seriously think through the idea of integrating multiple application via shared/linked DBs. It usually introduces more problems than it solves.
If I understand well what you mean, you should use two (or more) different persistence context

JPA: How to exclude certain fields from loading from database table

i have a class User which holds an email address and password for authentication users in my web application. This user is mapped to the database via JPA / Eclipselink.
My question is, how can i prevent JPA from loading the password field back from the database? Since i will access the user object in my web app, i'm uncomfortable regarding security with sending the password to the browser.
Is there any way i can prevent loading the field in JPA / EclipseLink? Declaring the field transient is not an option, since i want to store it on the database when i call persist() on the user object.
Thanks,
fredddmadison
JB Nizet has a valid point. Retrieving it and serializing it in the Http response are two separate concerns.
I'm not sure what you're using to serialize your data. If it this is a REST API, consider Jackson's #JsonIgnore annotation or Eclipselink MOXy's #XmlTransient equivalent. If this uses Java EL (facelets, jsps), you should be able to select only the bean properties of interest.
If you really must do this during retrieval, consider JPQL's/Criteria API's constructor functionality. Ensure that the object has a constructor that accepts the specified parameters, and keep in mind that it won't be managed in the persistence context if it's retrieved in this manner.
SELECT NEW my.package.User(u.id, u.name, u.etc) FROM User u
Alternatively, consider the #PostLoad lifecycle callback.
#PostLoad
private void postLoad() {
this.password = null;
}
Finally, this might not be the case, but I would like to reinforce the notion that passwords shouldn't be stored in plaintext. I mention this because returning a hashed salted password that used a secure algorithm (bCrypt, multiple iteration SHA-512, etc) wouldn't be that big a deal (but still isn't ideal).
I have the similar problem. But in my case I have many #OneToMany relationships inside of Entity class and some of them are EAGER. When I query against this Entity it loads all of them, although for web service I need only some of them.
I tried TupleQuery. But it's not the solution because to get needed OneToMany relationships I have to join and get many duplicate rows of the main query. It makes the result more heawy, than economic.

Refactoring application: Direct database access -> access through REST

we have a huge database application, which must get refactored (there are so many reasons for this. biggest one: security).
What we already have:
MySQL Database
JPA2 (Eclipselink) classes for over 100 tables
Client application that accesses the database directly
What needs to be there:
REST interface
Login/Logout with roles via database
What I've done so far:
Set up Spring MVC 3.2.1 with Spring Security 3.1.1
Using a custom UserDetailsService (contains just static data for testing atm)
Created a few Controllers for testing (simply receiving/providing data)
Design Problems:
We have maaaaany #OneToMany and #ManyToMany relations in our database
1.: (important)
If I'd send the whole object tree with all child objects as a response, I could probably send the whole database at once.
So I need a way to request for example 'all Articles'. But it should omit all the child objects. I've tried this yesterday and the objects I received were tons of megabytes:
#PersistenceContext
private EntityManager em;
#RequestMapping(method=RequestMethod.GET)
public #ResponseBody List<Article> index() {
List<Article> a = em.createQuery("SELECT a FROM Article a", Article.class).getResultList();
return a;
}
2.: (important)
If the client receives an Article, at the moment we can simply call article.getAuthor() and JPA will do a SELECT a FROM Author a JOIN Article ar WHERE ar.author_id = ?.
With REST we could make a request to /authors/{id}. But: This way we can't use our old JPA models on the client side, because the model contains Author author and not Long author_id.
Do we have to rewrite every model or is there a simpler approach?
3.: (less important)
Authentication: Make it stateless or not? I've never worked with stateless auth so far, but Spring seems to have some kind of support for it. When I look at some sample implementations on the web I have security concerns: With every request they send username and password. This can't be the right way.
If someone knows a nice solution for that, please tell me. Else I'd just go with standard HTTP Sessions.
4.:
What's the best way to design the client side model?
public class Book {
int id;
List<Author> authors; //option1
List<Integer> authorIds; //option2
Map<Integer, Author> idAuthorMap; //option3
}
(This is a Book which has multiple authors). All three options have different pros and cons:
I could directly access the corresponding Author model, but if I request a Book model via REST, I maybe don't want the model now, but later. So option 2 would be better:
I could request a Book model directly via REST. And use the authorIds to afterwards fetch the corresponding author(s). But now I can't simply use myBook.getAuthors().
This is a mixture of 1. and 2.: If I just request the Books with only the Author ids included, I could do something like: idAuthorMap.put(authorId, null).
But maybe there's a Java library that handles all the stuff for me?!
That's it for now. Thank you guys :)
The maybe solution(s):
Problem: Select only the data I need. This means more or less to ignore every #ManyToMany, #OneToMany, #ManyToOne relations.
Solution: Use #JsonIgnore and/or #JsonIgnoreProperties.
Problem: Every ignored relation should get fetched easily without modifying the data model.
Solution: Example models:
class Book {
int bId;
Author author; // has #ManyToOne
}
class Author {
int aId;
List<Book> books; // has #OneToMany
}
Now I can fetch a book via REST: GET /books/4 and the result will look like that ('cause I ignore all relations via #JsonIgnore): {"bId":4}
Then I have to create another route to receive the related author: GET /books/4/author. Will return: {"aId":6}.
Backwards: GET /authors/6/books -> [{"bId":4},{"bId":42}].
There will be a route for every #ManyToMany, #OneToMany, #ManyToOne, but nothing more. So this will not exist: GET /authors/6/books/42. The client should use GET /books/42.
First, you will want to control how the JPA layer handles your relationships. What I mean is using Lazy Loading vs. Eager loading. This can easily be controller via the "fetch" option on the annotation like thus:
#OneToMany(fetch=FetchType.Lazy)
What this tells JPA is that, for this related object, only load it when some code requests it. Behind the scenes, what is happening is that a dynamic "proxy" object is being made/created. When you try to access this proxy, it's smart enough to go out and do another SQL to gather that needed bit. In the case of Collection, its even smart enough to grab the underlying objects in batches are you iterate over the items in the Collection. But, be warned: access to these proxies has to happen all within the same general Session. The underlying ORM framework (don't know how Eclipselink works...I am a Hybernate user) will not know how to associate the sub-requests with the proper domain object. This has a bigger effect when you use transportation frameworks like Flex BlazeDS, which tries to marshal objects using bytecode instead of the interface, and usually gets tripped up when it sees these proxy objects.
You may also want to set your cascade policy, which can be done via the "cascade" option like
#OneToMany(cascade=CascadeType.ALL)
Or you can give it a list like:
#OneToMany(cascade={CascadeType.MERGE, CascadeType.REMOVE})
Once you control what is getting pulled from your database, then you need to look at how you are marshalling your domain objects. Are you sending this via JSON, XML, a mixture depending on the request? What frameworks are you using (Jackson, FlexJSON, XStream, something else)? The problem is, even if you set the fetch type to Lazy, these frameworks will still go after the related objects, thus negating all the work you did telling it to lazily load. This is where things get more specific to the mashalling/serializing scheme: you will need to figure out how to tell your framework what to marshal and what not to marshal. Again, this will be highly dependent on whatever framework is in use.

What are the benefits of ORM lazy loading?

I'm researching data layer underpinnings for a new web-based reporting system and have spent a lot of time evaluating ORM's over the last few days. That said, I've never dealt with "lazy loading" before and am confused at why its the default setting for LINQ queries in the Entity Framework. It seems like it creates a lot of network traffic and unnecessarily tasks the database with additional queries that could otherwise be resolved with joins.
Can someone describe a scenario in which lazy loading would be beneficial?
Some meta:
The new system will be working against a database with hundreds of tables and many terabytes of data in a production environment with over 3,000 concurrent users on the system 24 hours a day. They will be retrieving large datasets continuously. Is it possible that an ORM just isn't the right solution for our needs, especially since the app will be web-based?
When we talk about lazy loading we are talking about Navigation Properties (how we follow foreign keys). What lazy loading will do for us is to populate the entity from a remote table as we attempt to access that entity. For example if we have a model like this
public class TestEntity
{
public int Id{get;set;}
public AnotherEntity RemoteEntity{get;set;}
}
And call the following
var something = WhateverContext.TestEntities.First().RemoteEntity;
We will get 2 database calls, one for WhateverContext.TestEntities.First() and one for loading the remote entity.
I'm a web guy, (and more specifically an MVC guy) and for web stuff I don't think there is ever a good reason for wanting to do this, One database call is always going to be quicker than two if we require the same set of data.
The situation where I think that lazy loading is actually worth considering is when you don't know when you do your first query if you will need the second entity at all. In my opinion this is much more relevant for windows applications where we have a user who is performing actions in real time (rather than stateless MVC where users are requesting whole pages at once). For example I think lazy loading shines when we have a list of data with a details link, then we don't load the details until the user decides they want to see them.
I don't feel this extends to paging, sorting and filtering, IMO there should be one specifically crafted database query per page of data you are displaying, which returns exactly the data set required to display that page.
In terms of your performance question, I feel that EF (or another ORM) can probably meet your needs here but you want to be careful with how you are retrieving large datasets due to the way EF tracks entities. Check out my EF performance tuning cheat sheet, and read up on DetectChanges and AsNoTracking if you do decide to use EF with large queries.
Most ORMs will give you the option, when you're building up your object selections, to say "don't be lazy, go ahead and join", so if you're worried about it from an efficiency perspective, don't be. You can make it work (usually).
There are 2 particular cases I know of where lazy loading helps:
Chaining commands
What if you want to create a basic select, but then you want to run it through a sort and a filter function that's based on user input. You can simply pass the ORM object in, and attach the sort and filtering functionality to it. Instead of evaluating it each time, it only evaluates when it's actually used.
Avoiding huge, deep, highly-relational queries
What if you just need the IDs of some related fields? If it loads lazily, you don't have to worry about it joining a whole bunch of data and tables that you don't need, potentially slowing down the query and overusing bandwidth. Of course, if you DID want everything else, then you'll need to be explicit, or you may run into a problem where it lazily runs a query for each detail record. Like I mentioned at the outset, that's easily overcome in any ORM worth using.
A simple case is a result set of N records which you do not want to bring to the client at once. The benefit is that you are able to lazily load only what is needed for the clients demands, such as sorting, filtering, etc... An example would be a paging view where one could page through records and sort them accordingly, thus the client only needs N amount at a given time.
When you perform the LINQ query it translates that to SQL commands on the server side to provide only what is needed in the given context. It boils down to offloading work to the database and minimizing what you need to send back to the client.
Some will argue that ORM based lazy loading is wrong however that starts to move to semantics fairly quick and should be more about approach to design versus what is right and wrong.

Create new or update existing entity at one go with JPA

A have a JPA entity that has timestamp field and is distinguished by a complex identifier field. What I need is to update timestamp in an entity that has already been stored, otherwise create and store new entity with the current timestamp.
As it turns out the task is not as simple as it seems from the first sight. The problem is that in concurrent environment I get nasty "Unique index or primary key violation" exception. Here's my code:
// Load existing entity, if any.
Entity e = entityManager.find(Entity.class, id);
if (e == null) {
// Could not find entity with the specified id in the database, so create new one.
e = entityManager.merge(new Entity(id));
}
// Set current time...
e.setTimestamp(new Date());
// ...and finally save entity.
entityManager.flush();
Please note that in this example entity identifier is not generated on insert, it is known in advance.
When two or more of threads run this block of code in parallel, they may simultaneously get null from entityManager.find(Entity.class, id) method call, so they will attempt to save two or more entities at the same time, with the same identifier resulting in error.
I think that there are few solutions to the problem.
Sure I could synchronize this code block with a global lock to prevent concurrent access to the database, but would it be the most efficient way?
Some databases support very handy MERGE statement that updates existing or creates new row if none exists. But I doubt that OpenJPA (JPA implementation of my choice) supports it.
Event if JPA does not support SQL MERGE, I can always fall back to plain old JDBC and do whatever I want with the database. But I don't want to leave comfortable API and mess with hairy JDBC+SQL combination.
There is a magic trick to fix it using standard JPA API only, but I don't know it yet.
Please help.
You are referring to the transaction isolation of JPA transactions. I.e. what is the behaviour of transactions when they access other transactions' resources.
According to this article:
READ_COMMITTED is the expected default Transaction Isolation level for using [..] EJB3 JPA
This means that - yes, you will have problems with the above code.
But JPA doesn't support custom isolation levels.
This thread discusses the topic more extensively. Depending on whether you use Spring or EJB, I think you can make use of the proper transaction strategy.