Dynamic mask data for web project - jpa

Our web projects currently need to anonymize some data
(for example, a security number like 432-55-1111 might appear as 432-55-**).
The data may include emails, IDs, prices, dates, and so on.
The names of the tables and columns that need to be masked are stored in the DB.
We use Spring Security to decide whether a user may see the data or not.
The domain object (CMP) may be obtained via SQL, JPQL (named or native query), the JPA load method, or the mainframe.
We need to find the most efficient way (not on the DB side) to mask this data dynamically.
If we use an interceptor on the EJB methods, we need to annotate every object (DTO)
and every column. That may be inefficient.
Does anybody know how we can invoke a method (like an interceptor) after a SQL statement or a named (native) query has executed, so that we can mask the result based on the query and the user ID?
Other approaches are welcome too.
It would be good to have this at the lowest level, so that other applications such as reporting would not need a separate solution.
Our architecture is JSF + Spring + EJB 3.0 + JPA 1.0. We have many web projects. For JPA, some projects use EclipseLink 2.2 and some use Hibernate.
UPDATE:
More information about our projects: we have many web projects covering different features, so we have many EJB projects associated with them. Every EJB has a DAO that gets its CMP by calling JPQL or the get(class, primarykey) method, like below:
Query query = em.createNamedQuery(XXXCMP.FIND_XXX_BY_NAME);
query.setHint(QueryHints.READ_ONLY, HintValues.TRUE);
query.setParameter("shortName", "XXX").getSingleResult();
Or
XXXCMP screen = entityManager.find(XXXCMP.class, id);
The new EJB services use a converter to transform the data from CMP to DTO.
The converter is as below:
/**
* Convert to CMP.
*
*/
CMP convertToCMP(DTO dto, EntityManager em);
/**
* Convert CMP to domain object with all fields populated, the default scenario is
* <code>EConvertScenario.Detail</code>.
*
*/
DTO convertFromCMP(CMP cmp, EntityManager em);
But some old services use their own methods to convert the CMP. Also, some domain services used for lazy-paging search don't use the converter either.
We want to mask the data before the CMP is converted to a DTO.

You can try an EntityListener to intercept entity loading into the persistence context with the @PostLoad annotation.
Otherwise, you can do it within the accessor methods (getter/setter), which I think are suitable for masking, formatting, etc.
Edit: (based on comments & question update)
You can share the entity/DTO across applications:
public String getSomethingMasked(){
return mask(originalString);
}
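As a rough illustration of what such a mask(...) helper and masked getter might look like, here is a minimal sketch. The names Masker, Person and getSecurityNumberMasked, as well as the "keep the first N characters" rule, are assumptions made for this example and do not come from the question.

```java
// Hypothetical helper: keeps a visible prefix and stars out the rest.
final class Masker {
    private Masker() { }

    /** Keeps the first `visible` characters and replaces every later one with '*'. */
    static String mask(String original, int visible) {
        if (original == null) {
            return null;
        }
        // Clamp so that short inputs never cause an out-of-range index.
        int keep = Math.max(0, Math.min(visible, original.length()));
        StringBuilder sb = new StringBuilder(original.length());
        sb.append(original, 0, keep);
        for (int i = keep; i < original.length(); i++) {
            sb.append('*');
        }
        return sb.toString();
    }
}

// Hypothetical shared entity/DTO exposing a raw and a masked accessor.
final class Person {
    private final String securityNumber;

    Person(String securityNumber) {
        this.securityNumber = securityNumber;
    }

    String getSecurityNumber() {
        return securityNumber;
    }

    // Callers that must not see the full value use this accessor instead.
    String getSecurityNumberMasked() {
        return Masker.mask(securityNumber, 7);
    }
}
```

Whether the masked or the raw getter is called would still have to be decided by the caller, for example based on the Spring Security check mentioned in the question.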
The data retrieval pattern isn't uniform across applications. If all the applications are using same database, it must be generalized. There is no point of writing same thing again with different tools. Each application might apply business logic afterwards.
Probably, you can have a separate project meant for interacting with database & then including it in other applications for further use. So it will be a common point to change anything, debug, enhance etc.
You are using Eclipselink, Hibernate & other custom ways for fetching data & you require minimal workaround, which from my perspective seems difficult.
Either centralize the data retrieval or make changes all over separately if possible, which I think is not feasible, compromising consistency.

In this case, you may intercept JSF's conversion phase. This solution applies to JSF views, not to reports.
import javax.faces.component.UIComponent;
import javax.faces.context.FacesContext;
import javax.faces.convert.Converter;
import javax.faces.convert.FacesConverter;

@FacesConverter("AnonymizeDataConverter")
public class AnonymizeDataConverter implements Converter {
    @Override
    public Object getAsObject(FacesContext context, UIComponent component,
            String value) {
        return getAnonymisedData(value);
    }

    @Override
    public String getAsString(FacesContext context, UIComponent component,
            Object value) {
        return getAnonymisedData(value);
    }

    // Replaces the last four characters with "**"; shorter values are fully masked.
    public static String getAnonymisedData(Object data) {
        if (data == null)
            return "";
        String value = data.toString().trim();
        if (!value.isEmpty()) {
            // Guard against values shorter than four characters.
            int keep = Math.max(0, value.length() - 4);
            return value.substring(0, keep) + "**";
        }
        return "";
    }
}

Synchronization of multiple catalogs in hybris

I need help on this issue. I am upgrading hybris to version 2105, I ran into the problem that the 'DefaultSetupSyncJobService' class has changed its methods from 'SyncItemJob' to 'SyncItemJobModel'. How could I adapt these classes so that the catalogs are correctly synchronized ?
The new implementation for synchronization lives in CatalogSynchronizationService.synchronizeFully(source, target), which performs a full synchronization. The service also has other methods for partial syncs of a selected subset of items.
Do note that this will always create new SyncItemCronJobModel objects instead of reusing an existing sync job. Depending on what you want, you can modify the implementation somewhat to reuse the same sync job. The code then becomes something like this:
catalogSynchronizationService.getSyncJob(source, target, null);
instead of calling createSyncItemJob(...)
The rest is the same as the original implementation. So a customized implementation of synchronizeFully could look something like this (extending DefaultCatalogSynchronizationService):
public void synchronizeFully(final CatalogVersionModel source, final CatalogVersionModel target) {
    final SyncItemJobModel syncJob = getSyncJob(source, target, null);
    final SyncItemCronJobModel syncCronJob = createSyncCronJob(syncJob);
    cronJobService.performCronJob(syncCronJob, true);
}

How to Register Interests using 'ALL_KEYS' in Spring Data GemFire with ClientRegionFactoryBean

I am going to register interests in ALL_KEYS for my Pivotal GemFire client via Spring Data GemFire, but I find that ClientRegionFactoryBean has one method.
org.springframework.data.gemfire.client.ClientRegionFactoryBean.setInterests(Interest<MyRegionPojo>[] interests)
In this case, I only can set the exact keys, but I want to register interests for all keys. My key is not a simple class like String, or Long, but a complex object MyRegionPojo.
Is there any way to do the equivalent of the GemFire API call region.registerInterest("ALL_KEYS")?
Your problem statement is a bit vague, but I assume/suspect you are configuring your Spring (Data GemFire) (SDG) application using Spring JavaConfig?
However, I will quickly add that this is not unlike how you would register interests in all keys using SDG's XML namespace, as shown here.
The JavaConfig approach is similar, but clearly based on "strongly-typed arguments", namely 1 or more sub-type instances of the o.s.d.g.client.Interest class to the o.s.d.g.client.ClientRegionFactoryBean.setInterests(:Interest<K>[]) method.
By way of example, you might do the following...
@Bean("Example")
public ClientRegionFactoryBean<?, ?> exampleRegion(GemFireCache gemfireCache) {
    ClientRegionFactoryBean<MyRegionKey, MyRegionValue> exampleRegion =
        new ClientRegionFactoryBean<>();
    // A regex matching every key is the typed equivalent of "ALL_KEYS".
    RegexInterest regexInterest = new RegexInterest();
    regexInterest.setKey(".*");
    exampleRegion.setCache(gemfireCache);
    exampleRegion.setShortcut(ClientRegionShortcut.PROXY);
    exampleRegion.setInterests(new Interest[] { regexInterest });
    exampleRegion.setKeyConstraint(MyRegionKey.class);
    exampleRegion.setValueConstraint(MyRegionValue.class);
    return exampleRegion;
}
NOTE: I updated the example above to reflect the proper way to register (Regex) interests based on SDG 1.9 or earlier. Keep in mind that o.s.d.g.client.RegexInterest.getRegex() delegates to getKey(); therefore, you can set the regular expression using setKey(:String) as I have shown above.
Notice the o.s.d.g.client.RegexInterest sub-type registration, which is effectively the same as registering interest in "ALL_KEYS", as described here as well.
Hope this helps!
-John

MEF exports that require remote data (like DB data) in order to be created

Please excuse the long description at the beginning. The questions are at the end.
I have a Windows service that is supposed to read data from some data sources (represented by the IDataSource interface).
I'm using MEF in my project and I was thinking of injecting the required data sources via constructor injection like below:
[Export(typeof(Service))]
public class Service:ServiceBase{
[ImportingConstructor]
public Service([ImportMany]IEnumerable<IDataSource> dataSources){
//...
}
}
However, there is a problem in doing it like this. The service needs to use any combination of data sources: multiple data sources of the same type (ex: 2 CSVDataSource instances) or multiple data sources of different types (ex: 2 CSVDataSource instances and 1 SQLDataSource instance).
Each data source has properties that are retrieved from the DB in order to set it up properly. These settings might indicate where to read the data from and at what intervals. This is why, in my implementation, the data sources have a constructor that accepts an id. This id is used to identify the data source in the DB and to retrieve its specific settings. This can be seen below.
public class CSVDataSource: IDataSource{
public CSVDataSource(int dsId){
//call web service in order to get properties to
//properly set up the data source.
}
//...
}
I feel that the service definition presented above is not suited for this scenario. The other approach I can think of is to use some sort of factory that allows the service to create the data sources dynamically. This implementation might look like below.
public class Service:ServiceBase{
[ImportingConstructor]
public Service(IDataSourceFactory dsFactory)
{
if (dsFactory == null) throw new ArgumentNullException("dsFactory");
IEnumerable<IDataSource> dataSources = dsFactory.CreateAll();
}
}
[Export(typeof(IDataSourceFactory))]
[PartCreationPolicy(CreationPolicy.Shared)]
public class DataSourceFactory:IDataSourceFactory
{
private readonly int agentId;
[ImportingConstructor]
public DataSourceFactory([Import("AgentId")]int agentId)
{
this.agentId = agentId;
}
public IEnumerable<IDataSource> CreateAll()
{
List<IDataSource> dataSources = new List<IDataSource>();
//access web service and instantiate the data sources
return dataSources;
}
}
And now to my questions:
Is my factory approach a good idea, or should I look for another approach?
Is it OK to have exports that require data from a remote location in order to be created?
Did you come across ExportMetadataAttribute before? It will allow you to assign metadata to an export that you can view before the export is created. You'll be able to import your IDataSources as Lazy and then should be able to create them yourself with the required parameters.
There's a good breakdown of Lazy and ExportMetadata here

EntityFramework with Repository Pattern and no Database

I have a web api project that I'm building on an N-Tier system. Without causing too many changes to the overall system, I will not be touching the data server that has access to the database. Instead, I'm using .NET remoting to create a tcp channel that will allow me to send requests to the data server, which will then query the database and send back a response object.
On my application, I would like to use entity framework to create my datacontexts (unit of work), then create a repository pattern that interfaces with those contexts, which will be called by the web api project that I created.
However, I'm having problems with Entity Framework, as it requires me to have a connection to the database. Is there any way I can create a full Entity Framework project without any SQL connections to the database? I just need DbContexts, onto which I will map my response objects, and I figured that EF would do what I need (i.e. help with design and team collaboration, and provide a nice graphical designer); but it throws an error insisting that I need a connection string.
I've been searching high and low for tutorials where a database is not needed, nor any sql connection string (this means no localdb either).
Okay as promised, I have 3 solutions for this. I personally went with #3.
Note: Whenever there is a repository pattern present, and "datacontext" is used, this is interpreted as your UnitOfWork.
Solution 1: Create singletons to represent your datacontext.
http://www.breezejs.com/samples/nodb
I found this idea after going to the BreezeJS.com website and checking out their samples. They have a sample called NoDb, which creates a singleton that can create an item and a list of items, and has a method to populate the datacontext. You create singletons that lock a space in memory to prevent any kind of thread conflicts. Here is a tidbit of the code:
//generates singleton
public class TodoContext
{
static TodoContext() { }
private TodoContext() { }
public static TodoContext Instance
{
get
{
if (!__instance._initialized)
{
__instance.PopulateWithSampleData();
__instance._initialized = true;
}
return __instance;
}
}
public void PopulateWithSampleData()
{
var newList = new TodoItem { Title = "Before work"};
AddTodoList(newList);
var listId = newList.TodoListId;
var newItem = new TodoItem {
TodoListId = listId, Title = "Make coffee", IsDone = false };
AddTodoItem(newItem);
newItem = new TodoItem {
TodoListId = listId, Title = "Turn heater off", IsDone = false };
AddTodoItem(newItem);
}
//SaveChanges(), SaveTodoList(), AddTodoItem, etc.
{ ... }
private static readonly Object __lock = new Object();
private static readonly TodoContext __instance = new TodoContext();
private bool _initialized;
private readonly List<TodoItem> _todoLists = new List<TodoItem>();
private readonly List<KeyMapping> _keyMappings = new List<KeyMapping>();
}
There's a repository included which directs how to save the context and what needs to be done before the context is saved. It also allows the list of items to be queryable.
Problem I had with this:
I felt like there was higher maintenance when creating new datacontexts. If I have StateContext, CityContext and CountryContext, the overhead of creating them would be too great. I'd have problems trying to wrap my head around relating them to each other as well. Plus, I'm not sure how many people out there agree with using singletons. I've read articles saying that we should avoid singletons at all costs. I'm more concerned about anyone who'd have to read this much code.
Solution 2: Override the Seed() for DropCreateDatabaseAlways
http://www.itorian.com/2012/10/entity-frameworks-database-seed-method.html
For this trick, you have to create a class called SampleDatastoreInitializer that inherits from System.Data.Entity.DropCreateDatabaseAlways<T>, where T is the datacontext, which has a reference to a collection of your POCO model.
public class State
{
[Key()]
public string Abbr{ get; set; }
public string Name{ get; set; }
}
public class StateContext : DbContext
{
public virtual IDbSet<State> States { get; set; }
}
public class SampleDatastoreInitializer : DropCreateDatabaseAlways<StateContext>
{
protected override void Seed (StateContext context)
{
var states = new List<State>
{
new State { Abbr = "NY", Name = "New York" },
new State { Abbr = "CA", Name = "California" },
new State { Abbr = "AL", Name = "Alabama" },
new State { Abbr = "Tx", Name = "Texas" },
};
states.ForEach(s => context.States.Add(s));
context.SaveChanges();
}
}
This will actually embed the data in a cache; DropCreateDatabaseAlways means that it will drop the cache and recreate it no matter what. If you use some other IDatabaseInitializer and your model has a unique key, you might get an exception: the first run works, but running it again fails because you're violating the primary key constraint (since you're adding duplicate rows).
Problem I had with this:
This seems like it should only be used to provide sample data when you're testing the application, not for production. Plus, I'd have to continuously create a new initializer for each context, which raises the same maintainability problem noted in solution 1. There is nothing automatic happening here. But if you want a way to inject sample data without hooking up to a database, this is a great solution.
Solution 3: Entity framework with Repository (In-memory persistence)
I got this solution from this website:
http://www.roelvanlisdonk.nl/?p=2827
He first sets up an edmx file, using EF5 and the code generator templates for EF5 dbcontexts you can get from VS extension libraries.
He first uses the edmx to create the contexts and changes the tt templates to bind to the repository class he made, so that the repository will keep track of the datacontext, and provide the options of querying and accessing the data through the repository; in his website though he calls the repository as MemoryPersistenceDbSet.
The templates he modified will be used to create datacontexts that will bind to an interface (IEntity) shared by all. Doing it this way is nice because you are establishing a Dependency Injection, so that you can add any entity you want through the T4 templates, and there'd be no complaints.
Advantage of this solution:
Wrapping up the edmx in repository pattern allows you to leverage the n-tier architecture, so that any changes done to the backend won't affect the front end, and allows you to separate the interface between the front end and backend so there are no coupled dependencies. So maybe later on, I can replace my edmx with petapoco, or massive, or some other ORM, or switch from in-memory persistence to fetching data from a database.
I followed everything exactly as explained. I made one modification though:
In the t4 template for .Context.tt, where DbSetInConstructor is added, I had the code written like this:
public string DbSetInConstructor(EntitySet entitySet)
{
return string.Format(
CultureInfo.InvariantCulture,
"this.{1} = new BaseRepository();",
_typeMapper.GetTypeName(entitySet.ElementType), entitySet);
}
Because in my case I had the entityset = Persons and entityname = Person. So there’d be discrepancy. But this should cover all bases.
Final step:
So whether you picked solution 1, 2, or 3, you have a method to automatically populate your application. In these cases, the stubs are embedded in the code. In my case, what I've done is have my web server (containing my front end app) contact my data server, which queries the database. The data server receives a dataset, serializes it, and passes it back to the web server. The web server takes that dataset, deserializes it, and auto-maps it to an object collection (list, enumerable, objectcollection, etc.).
I would post the solutions more fully but there's way too much detail between all 3 of these solutions. Hopefully these solutions would point anyone in the right direction.
Dependency Injection
If anyone wants some information about how to enable DI for API controllers, Peter Provost has a very useful blog post that explains how to do it. He does a very good job.
http://www.peterprovost.org/blog/2012/06/19/adding-ninject-to-web-api/
few more helpful links of repository wrapping up edmx:
http://blogs.msdn.com/b/wriju/archive/2013/08/23/using-repository-pattern-in-entity-framework.aspx
http://www.codeproject.com/Articles/688929/Repository-Pattern-and-Unit-of

Unit testing EF - how to extract EF code out from BL?

I have read so much (dozens of posts) about one thing:
How to unit test business logic code that has Entity Framework code in it.
I have a WCF service with 3 layers :
Service Layer
Business Logic Layer
Data Access Layer
My business logic uses the DbContext for all the database operations.
All my entities are now POCOs (used to be ObjectContext, but I changed that).
I have read Ladislav Mrnka's answer here and here on the reasons why we should not mock \ fake the DbContext.
He said:
"That is the reason why I believe that code dealing with context / Linq-to-entities should be covered with integration tests and work against the real database."
and:
"Sure, your approach works in some cases but unit testing strategy must work in all cases - to make it work you must move EF and IQueryable completely from your tested method."
My question is - how do you achieve this ???
public class TaskManager
{
public void UpdateTaskStatus(
Guid loggedInUserId,
Guid clientId,
Guid taskId,
Guid chosenOptionId,
Boolean isTaskCompleted,
String notes,
Byte[] rowVersion
)
{
using (TransactionScope ts = new TransactionScope())
{
using (CloseDBEntities entities = new CloseDBEntities())
{
User currentUser = entities.Users.SingleOrDefault(us => us.Id == loggedInUserId);
if (currentUser == null)
throw new Exception("Logged user does not exist in the system.");
// Locate the task that is attached to this client
ClientTaskStatus taskStatus = entities.ClientTaskStatuses.SingleOrDefault(p => p.TaskId == taskId && p.Visit.ClientId == clientId);
if (taskStatus == null)
throw new Exception("Could not find this task for the client in the database.");
if (taskStatus.Visit.CustomerRepId.HasValue == false)
throw new Exception("No customer rep is assigned to the client yet.");
TaskOption option = entities.TaskOptions.SingleOrDefault(op => op.Id == chosenOptionId);
if (option == null)
throw new Exception("The chosen option was not found in the database.");
if (!taskStatus.RowVersion.SequenceEqual(rowVersion)) // compare contents; != on Byte[] compares references
throw new Exception("The task was updated by someone else. Please refresh the information and try again.");
taskStatus.ChosenOptionId = chosenOptionId;
taskStatus.IsCompleted = isTaskCompleted;
taskStatus.Notes = notes;
// Save changes to database
entities.SaveChanges();
}
// Complete the transaction scope
ts.Complete();
}
}
}
In the code attached there is a demonstration of a function from my business logic.
The function has several 'trips' to the database.
I don't understand how exactly I can strip the EF code from this function out to a separate assembly, so that I am able to unit test this function (by injecting some fake data instead of the EF data), and integrate test the assembly that contains the 'EF functions'.
Can Ladislav or anyone else help out?
[Edit]
Here is another example of code from my business logic, I don't understand how I can 'move the EF and IQueryable code' out from my tested method :
public List<UserDto> GetUsersByFilters(
String ssn,
List<Guid> orderIds,
List<MaritalStatusEnum> maritalStatuses,
String name,
int age
)
{
using (MyProjEntities entities = new MyProjEntities())
{
IQueryable<User> users = entities.Users;
// Filter By SSN (check if the user's ssn matches)
if (String.IsNullOrEmpty(ssn) == false)
users = users.Where(us => us.SSN == ssn);
// Filter By Orders (check if the user has all the orders in the list)
if (orderIds != null)
users = users.Where(us => UserContainsAllOrders(us, orderIds));
// Filter By Marital Status (check if the user has a marital status that is in the filter list)
if (maritalStatuses != null)
users = users.Where(us => maritalStatuses.Contains((MaritalStatusEnum)us.MaritalStatus));
// Filter By Name (check if the user's name matches)
if (String.IsNullOrEmpty(name) == false)
users = users.Where(us => us.name == name);
// Filter By Age (check if the user's age matches)
if (age > 0)
users = users.Where(us => us.Age == age);
return users.ToList();
}
}
private Boolean UserContainsAllOrders(User user, List<Guid> orderIds)
{
return orderIds.All(orderId => user.Orders.Any(order => order.Id == orderId));
}
If you want to unit test your TaskManager class, you should employ the Repository design pattern and inject repositories such as UserRepository or ClientTaskStatusRepository into this class. Then, instead of constructing a CloseDBEntities object, you will use these repositories and call their methods, for example:
User currentUser = userRepository.GetUser(loggedInUserId);
ClientTaskStatus taskStatus =
clientTaskStatusRepository.GetTaskStatus(taskId, clientId);
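To make the injection idea concrete, here is a minimal sketch of a repository interface with an in-memory fake. It is written in Java to match the earlier examples in this document rather than in C#, and every name in it (the User fields, InMemoryUserRepository, requireUser) is an assumption for illustration only; the same shape carries over to C# interfaces and constructor injection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical entity reduced to the one field the sketch needs.
final class User {
    final UUID id;

    User(UUID id) {
        this.id = id;
    }
}

// The business logic depends on this abstraction, not on the ORM context.
interface UserRepository {
    Optional<User> getUser(UUID id);
}

// In-memory fake used by unit tests in place of the database-backed repository.
final class InMemoryUserRepository implements UserRepository {
    private final Map<UUID, User> users = new HashMap<>();

    void add(User user) {
        users.put(user.id, user);
    }

    @Override
    public Optional<User> getUser(UUID id) {
        return Optional.ofNullable(users.get(id));
    }
}

final class TaskManager {
    private final UserRepository userRepository;

    // The repository is injected, so a test can pass the fake.
    TaskManager(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    // Mirrors the first check in UpdateTaskStatus: the logged-in user must exist.
    User requireUser(UUID loggedInUserId) {
        return userRepository.getUser(loggedInUserId)
                .orElseThrow(() -> new IllegalStateException(
                        "Logged user does not exist in the system."));
    }
}
```

A unit test then constructs TaskManager with the fake and never touches a database; the real repository would be covered separately by integration tests.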
If you want to integration test your TaskManager class, the solution is much simpler. You just need to initialize the CloseDBEntities object with a connection string pointing to the test database, and that's it. One way to achieve this is to inject the CloseDBEntities object into the TaskManager class.
You will also need to re-create the test database before each integration test run and populate it with some test data. This can be achieved using a Database Initializer.
There are several misunderstandings here.
First: The Repository Pattern. It's not just a facade over DbSet for unit testing! The repository is a pattern strongly related to the Aggregate and Aggregate Root concepts of Domain Driven Design. An aggregate is a set of related entities that should stay consistent with each other. I mean business consistency, not just foreign key validity. For example: a customer who has made 2 orders should get a 5% discount. So we should somehow manage the consistency between the number of order entities related to a customer entity and a discount property of the customer entity. The node responsible for this is the aggregate root. It is also the only node that should be accessible directly from outside of the aggregate. And the repository is a utility to obtain an aggregate root from some (maybe persistent) storage.
A typical use case is to create a UoW/Transaction/DbContext/WhateverYouNameIt, obtain one aggregate root entity from the repository, call some methods on it or access some other entities by traversing from the root, then Commit/SaveChanges/Whatever. Look how far that is from your samples.
Second: The Business Logic. I've already shown you one example: a customer who has made 2 orders should get a 5% discount. In contrast, your second code sample is not business logic. It's just a query. The responsibility of this code is to obtain some data from the storage. In such a case, the storage technology behind it does matter. So I would recommend integration tests here rather than pretending the storage doesn't matter when interacting with the storage is the sole purpose of this function.
I would also encapsulate that in a Query Object that was already suggested. Then - such a query object could be mocked. Not just DbContext behind it. The whole QO.
The first code sample is a bit better because it probably involves some business logic, but that's difficult to identify. Which leads us to the third problem.
Third: Anemic Domain Model. Your domain doesn't look very object oriented. You have some dumb entities and transaction scripts over them. With 7 parameters! That's pure procedural programming.
Moreover, in your UpdateTaskStatus use case, what is the aggregate root? Before you answer that, the most important question first: what exactly do you want to do? Is that... hmm... marking a current task of a user as done when he was visited? Then maybe there should be a method Visit() inside a Customer entity? And this method should have something like this.CurrentTaskStatus.IsCompleted = true?
That was just a random guess. If I missed, that would clearly show another issue. The domain model should use the ubiquitous language - something common for the programmer and a business. Your code doesn't have that expressive power that a common language gives. I just don't know what is going on there in UpdateTaskStatus with 7 parameters.
If you place proper expressive methods for performing business operations in your entities that will also enforce you to not use DbContext there at all, as you need your entities to stay persistence ignorant. Then the problem with mocking disappears. You can test the pure business logic without persistence concerns.
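As an illustration of a persistence-ignorant entity that enforces the discount rule mentioned earlier (2 orders lead to a 5% discount), here is a minimal sketch. It is in Java for consistency with the other examples in this document, and the class shapes, the placeOrder method and the constants are assumptions of this sketch, not something taken from the question's model.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical child entity of the aggregate.
final class Order {
    final BigDecimal total;

    Order(BigDecimal total) {
        this.total = total;
    }
}

// Aggregate root: the only entry point for changing the customer's orders,
// so the discount invariant cannot be bypassed from outside.
final class Customer {
    private static final int ORDERS_FOR_DISCOUNT = 2;
    private static final int DISCOUNT_PERCENT = 5;

    private final List<Order> orders = new ArrayList<>();
    private int discountPercent = 0;

    // Business operation expressed on the entity; no DbContext or ORM involved.
    void placeOrder(Order order) {
        orders.add(order);
        if (orders.size() >= ORDERS_FOR_DISCOUNT) {
            discountPercent = DISCOUNT_PERCENT;
        }
    }

    int getDiscountPercent() {
        return discountPercent;
    }

    // Read-only view, so callers cannot add orders behind the root's back.
    List<Order> getOrders() {
        return Collections.unmodifiableList(orders);
    }
}
```

Because Customer knows nothing about storage, the rule can be unit tested with plain objects, while loading and saving the aggregate stays in the repository.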
So the final word: Reconsider your model first. Make your API expressive by using ubiquitous language first.
PS: Please don't treat me as an authority. I may be completely wrong as I'm just starting to learn DDD.