How to implement this simple logic in Spring Batch? - spring-batch

I tried to make this as simple as possible. I'm new to Spring Batch, and I have a small issue understanding how to relate Spring Batch items to each other, especially in multi-step jobs. Below is my logic (simplified, not real code). I don't know how to implement it in Spring Batch, so I thought this might be the right structure:
reader_money
reader_details
tasklet
reader_profit
tasklet_calculation
writer
However, please correct me if I'm wrong, and provide some code if possible.
Thank you very much.
LOGIC:
sql = "select * from MONEY where id = user input"; // the user will input the condition
while (records are available) {
    int currency = resultset(currency column);
    sql = "select * from DETAILS where D_currency = currency";
    while (records are available) {
        int amount = resultset(amount column);
        int money_type = resultset(money_type column);
        sql = "select * from PROFIT where Mtypes = money_type";
        while (records are available) {
            int revenue = resultset(revenue);
            if (money_type == 1) {
                int net_profit = revenue * 3.75;
                sql = "update PROFIT set Nprofit = net_profit";
            }
            else if (money_type == 2) {
                int net_profit = (revenue - 5) * 3.7;
                sql = "update PROFIT set Nprofit = net_profit";
            }
        }
        sql = "update DETAILS set detail_falg = 001";
    }
    sql = "update MONEY set currency_flag = 009";
}

To fit this into a conventional Spring Batch configuration, you would need to flatten the three loops into one, if possible.
Perhaps a single SQL statement could return everything in one result set, similar to:
select p.revenue, d.amount, d.money_type from PROFIT p, DETAILS d, MONEY m where p.Mtypes = d.money_type and d.D_currency = m.currency and m.id = ?
Once you've flattened it, you fall into the conventional read/process/write chunk pattern: the reader retrieves a record from the result set, the processor performs the money_type logic, and the writer executes the update statement.
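As a rough illustration of the processor step described above, here is a minimal plain-Java sketch of the money_type calculation. The class and method names (NetProfitCalculator, netProfit) are made up for this example, and throwing on an unknown money_type is an assumption, since the original logic does not say what should happen in that case. In a real job this method body would sit inside an ItemProcessor's process method.

```java
import java.math.BigDecimal;

// Hypothetical helper: the per-record calculation a processor would perform.
final class NetProfitCalculator {

    private static final BigDecimal RATE_TYPE_1 = new BigDecimal("3.75");
    private static final BigDecimal RATE_TYPE_2 = new BigDecimal("3.7");
    private static final BigDecimal FEE_TYPE_2 = new BigDecimal("5");

    // Mirrors the if/else branches of the question's pseudocode.
    static BigDecimal netProfit(int moneyType, BigDecimal revenue) {
        switch (moneyType) {
            case 1:
                return revenue.multiply(RATE_TYPE_1);
            case 2:
                return revenue.subtract(FEE_TYPE_2).multiply(RATE_TYPE_2);
            default:
                // Assumption: the question does not define other money types.
                throw new IllegalArgumentException("Unsupported money_type: " + moneyType);
        }
    }

    public static void main(String[] args) {
        System.out.println(netProfit(1, new BigDecimal("100")));
        System.out.println(netProfit(2, new BigDecimal("100")));
    }
}
```

BigDecimal is used instead of int because the multipliers are fractional. If unknown money types should simply be skipped, a Spring Batch processor can return null for that item, which filters it out before the writer.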

Look into ItemReaderAdapter: you could put all your SQL in a DAO that returns a list of aggregated objects containing all the information you need for your calculation.
Or
You could use the CompositeItemReader pattern. You basically combine multiple ItemReaders into one master ItemReader. The read() method invokes all the inner ItemReaders before moving on to the processor/writer phase.
I could post an example, but I have to leave :-(
Leave a comment if you need one.
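To make the DAO idea a little more concrete, here is a minimal plain-Java sketch of a reader that hands out rows from a list one at a time. DaoBackedReader is a hypothetical class written for this example, not a Spring Batch type; it only imitates the ItemReader convention that read() returns null once the input is exhausted. Spring Batch's own ListItemReader plays a similar role for in-memory lists.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a reader fed by a DAO that returns a full list.
final class DaoBackedReader<T> {

    private final Iterator<T> rows;

    DaoBackedReader(List<T> rowsFromDao) {
        this.rows = rowsFromDao.iterator();
    }

    // Returns the next row, or null when there is nothing left to read.
    T read() {
        return rows.hasNext() ? rows.next() : null;
    }

    public static void main(String[] args) {
        // Assumption: the DAO already joined MONEY, DETAILS and PROFIT.
        DaoBackedReader<String> reader =
                new DaoBackedReader<>(Arrays.asList("row-1", "row-2"));
        String row;
        while ((row = reader.read()) != null) {
            System.out.println(row);
        }
    }
}
```

The trade-off is that the DAO loads the whole list up front, so this fits best when the aggregated result is small enough to hold in memory.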

Related

Entity Framework Core, Stored Procedure

I am totally confused about how to use stored procedures with Entity Framework Core. If the stored procedure returns an anonymous type, how do I retrieve the data? If the return type is not anonymous, what should I do? How do I add input/output parameters?
I am asking these questions because everywhere I look, I get a different answer. I guess EF Core is evolving rapidly and Microsoft is dabbling with a lot of ideas.
How do I add input/output parameters?
I'm going to answer this particular question of yours.
Below is a T-SQL stored procedure with two input and two output parameters:
CREATE PROCEDURE [dbo].[yourstoredprocedure]
    -- Add the parameters for the stored procedure here
    @varone bigint
    ,@vartwo Date
    ,@varthree double precision OUTPUT
    ,@varfour bigint OUTPUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- YOUR CODE HERE
    SET @varthree = 10.02;
    SET @varfour = @varone;
    return;
END
Now, to execute this stored procedure using Entity Framework Core:
MyContext.Database
    .ExecuteSqlCommand(@"EXECUTE [yourstoredprocedure] " +
        " {0} " +
        ", {1} " +
        ", @varthree OUTPUT " +
        ", @varfour OUTPUT ", dataOne, dataTwo, outputVarOne, outputVarTwo);
var outputResultOne = outputVarOne.Value as double?;
var outputResultTwo = outputVarTwo.Value as long?;
You can pass your inputs simply by using a parameterized query, as above. You can also create named parameters. For the output parameters, I created two named parameters (declare them before the call above):
var outputVarOne = new SqlParameter
{
    ParameterName = "@varthree",
    DbType = System.Data.DbType.Double,
    Direction = System.Data.ParameterDirection.Output
};
var outputVarTwo = new SqlParameter
{
    ParameterName = "@varfour",
    DbType = System.Data.DbType.Int64,
    Direction = System.Data.ParameterDirection.Output
};
And this is how you execute a stored procedure with input and output parameters using EF Core. Hope this helps someone.
This solution provides methods that call a stored procedure and map the returned values to a defined (non-model) entity: https://github.com/verdie-g/StoredProcedureDotNetCore
Microsoft addresses this issue:
"SQL queries can only be used to return entity types that are part of your model. There is an enhancement on our backlog to enable returning ad-hoc types from raw SQL queries." https://learn.microsoft.com/en-us/ef/core/querying/raw-sql
And here is the issue tracked on GitHub: https://github.com/aspnet/EntityFramework/issues/1862
You might use an extension like StoredProcedureEFCore.
Then the usage is more intuitive:
List<Model> rows = null;
ctx.LoadStoredProc("dbo.ListAll")
.AddParam("limit", 300L)
.AddParam("limitOut", out IOutParam<long> limitOut)
.Exec(r => rows = r.ToList<Model>());
long limitOutValue = limitOut.Value;
ctx.LoadStoredProc("dbo.ReturnBoolean")
.AddParam("boolean_to_return", true)
.ReturnValue(out IOutParam<bool> retParam)
.ExecNonQuery();
bool b = retParam.Value;
ctx.LoadStoredProc("dbo.ListAll")
.AddParam("limit", 1L)
.ExecScalar(out long l);

Batching large result sets using Rx

I've got an interesting question for Rx experts. I've a relational table keeping information about events. An event consists of id, type and time it happened. In my code, I need to fetch all the events within a certain, potentially wide, time range.
SELECT * FROM events WHERE time > :before AND time < :after ORDER BY time LIMIT :batch_size
To improve reliability and deal with large result sets, I query the records in batches of size :batch_size. Now, I want to write a function that, given :before and :after, will return an Observable representing the result set.
Observable<Event> getEvents(long before, long after);
Internally, the function should query the database in batches. The distribution of events along the time scale is unknown. So the natural way to address batching is this:
fetch first N records
if the result is not empty, use the last record's time as a new 'before' parameter, and fetch the next N records; otherwise terminate
if the result is not empty, use the last record's time as a new 'before' parameter, and fetch the next N records; otherwise terminate
... and so on (the idea should be clear)
My question is:
Is there a way to express this function in terms of higher-level Observable primitives (filter/map/flatMap/scan/range etc), without using the subscribers explicitly?
So far, I've failed to do this, and come up with the following straightforward code instead:
private void observeGetRecords(long before, long after, Subscriber<? super Event> subscriber) {
    long start = before;
    while (start < after) {
        final List<Event> records;
        try {
            records = getRecordsByRange(start, after);
        } catch (Exception e) {
            subscriber.onError(e);
            return;
        }
        if (records.isEmpty()) break;
        records.forEach(subscriber::onNext);
        start = Iterables.getLast(records).getTime();
    }
    subscriber.onCompleted();
}
public Observable<Event> getRecords(final long before, final long after) {
    return Observable.create(subscriber -> observeGetRecords(before, after, subscriber));
}
Here, getRecordsByRange implements the SELECT query using DBI and returns a List. This code works fine, but lacks elegance of high-level Rx constructs.
NB: I know that I can return Iterator as a result of SELECT query in DBI. However, I don't want to do that, and prefer to run multiple queries instead. This computation does not have to be atomic, so the issues of transaction isolation are not relevant.
Although I don't fully understand why you want such time-reuse, here is how I'd do it:
BehaviorSubject<Long> start = BehaviorSubject.create(0L);
start
.subscribeOn(Schedulers.trampoline())
.flatMap(tstart ->
getEvents(tstart, tstart + twindow)
.publish(o ->
o.takeLast(1)
.doOnNext(r -> start.onNext(r.time))
.ignoreElements()
.mergeWith(o)
)
)
.subscribe(...)
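For comparison, the batching steps from the question can also be written without explicit subscribers using java.util.stream instead of Rx. This is only a sketch of the same idea in a different library, not an Rx solution: PagedEvents, Event and fakeQuery are hypothetical names, fakeQuery stands in for the real SELECT with a batch size of 2, and the three-argument Stream.iterate requires Java 9 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PagedEvents {

    // Minimal stand-in for a row of the events table (hypothetical).
    static final class Event {
        final long id;
        final long time;
        Event(long id, long time) { this.id = id; this.time = time; }
    }

    // Fetches batch after batch; each batch starts after the last time seen.
    // The stream ends as soon as a batch comes back empty.
    static Stream<Event> events(long before, long after,
                                BiFunction<Long, Long, List<Event>> fetchBatch) {
        return Stream.iterate(
                        fetchBatch.apply(before, after),
                        batch -> !batch.isEmpty(),
                        batch -> fetchBatch.apply(batch.get(batch.size() - 1).time, after))
                .flatMap(List::stream);
    }

    // In-memory replacement for the SELECT ... ORDER BY time LIMIT 2 query.
    static List<Event> fakeQuery(List<Event> table, long before, long after) {
        return table.stream()
                .filter(e -> e.time > before && e.time < after)
                .limit(2)
                .collect(Collectors.toList());
    }

    // Runs the paging over five events and returns their times in order.
    static List<Long> demo() {
        List<Event> table = Arrays.asList(
                new Event(1, 1), new Event(2, 2), new Event(3, 3),
                new Event(4, 4), new Event(5, 5));
        return events(0, 10, (b, a) -> fakeQuery(table, b, a))
                .map(e -> e.time)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

One caveat applies to this sketch and to the original loop alike: because the query uses a strict time > :before, events that share a timestamp with the last row of a batch but did not fit into that batch are skipped by the next query.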

How to make Appstats show both small and read operations?

I'm profiling my application locally (using the dev server) to learn more about how GAE works. My tests compare a common full-entity query with a projection query. Both tests run the same query, but the projection specifies 2 properties. The test kind has 100 properties, all with the same value for each entity, with a total of 10 entities. An image with the Datastore viewer and the Appstats-generated data is shown below. In the Appstats image, Request 4 is a memcache flush, Request 3 is the test database creation (it was already created, so no costs here), Request 2 is the full-entity query and Request 1 is the projection query.
I'm surprised that both queries resulted in the same number of reads. My guess is that small and read operations are being reported as the same thing by Appstats. If this is the case, I want to separate them in the reports. These are the query-related functions:
// Full Entity Query
public ReturnCodes doQuery() {
DatastoreService dataStore = DatastoreServiceFactory.getDatastoreService();
for(int i = 0; i < numIters; ++i) {
Filter filter = new FilterPredicate(DBCreation.PROPERTY_NAME_PREFIX + i,
FilterOperator.NOT_EQUAL, i);
Query query = new Query(DBCreation.ENTITY_NAME).setFilter(filter);
PreparedQuery prepQuery = dataStore.prepare(query);
Iterable<Entity> results = prepQuery.asIterable();
for(Entity result : results) {
log.info(result.toString());
}
}
return ReturnCodes.SUCCESS;
}
// Projection Query
public ReturnCodes doQuery() {
DatastoreService dataStore = DatastoreServiceFactory.getDatastoreService();
for(int i = 0; i < numIters; ++i) {
String projectionPropName = DBCreation.PROPERTY_NAME_PREFIX + i;
Filter filter = new FilterPredicate(DBCreation.PROPERTY_NAME_PREFIX + i,
FilterOperator.NOT_EQUAL, i);
Query query = new Query(DBCreation.ENTITY_NAME).setFilter(filter);
query.addProjection(new PropertyProjection(DBCreation.PROPERTY_NAME_PREFIX + 0, Integer.class));
query.addProjection(new PropertyProjection(DBCreation.PROPERTY_NAME_PREFIX + 1, Integer.class));
PreparedQuery prepQuery = dataStore.prepare(query);
Iterable<Entity> results = prepQuery.asIterable();
for(Entity result : results) {
log.info(result.toString());
}
}
return ReturnCodes.SUCCESS;
}
Any ideas?
EDIT: To get a better overview of the problem, I created another test, which runs the same query as a keys-only query instead. For this case, Appstats correctly shows DATASTORE_SMALL operations in the report. I'm still pretty confused about the behavior of the projection query, which should also be reporting DATASTORE_SMALL operations. Please help!
[I wrote the go port of appstats, so this is based on my experience and recollection.]
My guess is this is a bug in appstats, which is a relatively unmaintained program. Projection queries are new, so appstats may not be aware of them, and treats them as normal read queries.
For some background, calculating costs is difficult. For write ops, the costs are returned with the results, as they must be, since the app has no way of knowing what changed (which is where the write costs happen). For reads and small ops, however, there is a formula to calculate the cost. Each appstats implementation (Python, Java, Go) must implement this calculation, including reflection or whatever is needed over the request object to determine what's going on. The APIs for doing this are not entirely obvious, and there are lots of little details, so it's easy to get it wrong and annoying to get it right.

Linq to Entities does not recognize the method System.DateTime.. and cannot translate this into a store expression

I have a problem that has taken me weeks to resolve and I have not been able to.
I have a class with two methods. The following one is supposed to take the latest date from the database. That date represents the latest payment that a customer has made for "something":
public DateTime getLatestPaymentDate(int? idCustomer)
{
DateTime lastDate;
lastDate = (from fp in ge.Payments
from cst in ge.Customers
from brs in ge.Records.AsEnumerable()
where (cst.idCustomer == brs.idCustomer && brs.idHardBox == fp.idHardbox
&& cst.idCustomer == idCustomer)
select fp.datePayment).AsEnumerable().Max();
return lastDate;
}//getLatestPaymentDate
And here I have the other method, which is supposed to call the previous one to complete a Linq query and pass it to a Crystal Report:
//Linq query to retrieve all those customers'data who have not paid their safebox(es) annuity in the last year.
public List<ReportObject> GetPendingPayers()
{
List<ReportObject> defaulterCustomers;
defaulterCustomers = (from c in ge.Customer
from br in ge.Records
from p in ge.Payments
where (c.idCustomer == br.idCustomer
&& br.idHardBox == p.idHardBox)
select new ReportObject
{
CustomerId = c.idCustomer,
CustomerName = c.nameCustomer,
HardBoxDateRecord = br.idHardRecord,
PaymentDate = getLatestPaymentDate(c.idCustomer),
}).Distinct().ToList();
}//GetPendingPayers
No compile error is thrown here, but when I run the application and the second method calls the first one for the PaymentDate field, the error mentioned in the title occurs:
Linq to Entities does not recognize the method System.DateTime.. and cannot translate this into a store expression
Can anybody offer some useful input to get me out of this messy error? Any help will be appreciated!
Thanks a lot!
Have a look at these other questions:
LINQ to Entities does not recognize the method
LINQ to Entities does not recognize the method 'System.DateTime Parse(System.String)' method
Basically, you cannot call a C# method inside the query and expect it to be translated into SQL. The first question offers a more thorough explanation; the second offers a simple solution to your problem.
EDIT:
Simply put: EF is asking the SQL server to perform the getLatestPaymentDate method, which it has no clue about. You need to execute it on the program side.
Perform your query first, put the results into a list, and then do your Select on the in-memory list:
List<ReportObject> defaulterCustomers;
// Run the part EF can translate to SQL and materialize it with ToList().
var queryResult = (from c in ge.Customer
                   from br in ge.Records
                   from p in ge.Payments
                   where (c.idCustomer == br.idCustomer
                       && br.idHardBox == p.idHardBox)
                   select new { c.idCustomer, c.nameCustomer, br.idHardRecord }).Distinct().ToList();
// Project in memory, where calling a C# method is allowed.
defaulterCustomers = (from r in queryResult
                      select new ReportObject
                      {
                          CustomerId = r.idCustomer,
                          CustomerName = r.nameCustomer,
                          HardBoxDateRecord = r.idHardRecord,
                          PaymentDate = getLatestPaymentDate(r.idCustomer),
                      }).Distinct().ToList();
I don't have access to your code, obviously, so try it out and tell me if it works for you!
You'll end up with an in-memory list.

Best way to check if object exists in Entity Framework? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
What is the best way to check if an object exists in the database from a performance point of view? I'm using Entity Framework 1.0 (ASP.NET 3.5 SP1).
If you don't want to execute SQL directly, the best way is to use Any(). This is because Any() will return as soon as it finds a match. Another option is Count(), but this might need to check every row before returning.
Here's an example of how to use it:
if (context.MyEntity.Any(o => o.Id == idToMatch))
{
// Match!
}
And in VB.NET:
If context.MyEntity.Any(function(o) o.Id = idToMatch) Then
' Match!
End If
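The short-circuit behaviour that makes Any() attractive can be illustrated with an in-memory analogy. The sketch below uses Java streams rather than Entity Framework, so it says nothing about the SQL that EF generates; it only shows that an "any match" test can stop at the first hit, while a count over a filter has to visit every element. ExistsCheck and its method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class ExistsCheck {

    // How many elements are inspected when existence is tested with anyMatch.
    static int inspectedByAnyMatch(List<Integer> ids, int target) {
        AtomicInteger inspected = new AtomicInteger();
        ids.stream()
                .peek(id -> inspected.incrementAndGet())
                .anyMatch(id -> id == target);
        return inspected.get();
    }

    // How many elements are inspected when existence is tested via a count.
    static int inspectedByCount(List<Integer> ids, int target) {
        AtomicInteger inspected = new AtomicInteger();
        ids.stream()
                .peek(id -> inspected.incrementAndGet())
                .filter(id -> id == target)
                .count();
        return inspected.get();
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(inspectedByAnyMatch(ids, 3)); // stops at the match
        System.out.println(inspectedByCount(ids, 3));    // visits every element
    }
}
```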
From a performance point of view, I guess that a direct SQL query using the EXISTS command would be appropriate. See here for how to execute SQL directly in Entity Framework: http://blogs.microsoft.co.il/blogs/gilf/archive/2009/11/25/execute-t-sql-statements-in-entity-framework-4.aspx
I had to manage a scenario where the percentage of duplicates in the new data records was very high, so many thousands of database calls were being made to check for duplicates (and the CPU spent a lot of time at 100%). In the end I decided to keep the last 100,000 records cached in memory. This way I could check for duplicates against the cached records, which was extremely fast compared to a LINQ query against the SQL database, and then write any genuinely new records to the database (as well as add them to the data cache, which I also sorted and trimmed to keep its length manageable).
Note that the raw data was a CSV file that contained many individual records that had to be parsed. The records in each consecutive file (which came at a rate of about 1 every 5 minutes) overlapped considerably, hence the high percentage of duplicates.
In short, if you have timestamped raw data coming in, pretty much in order, then using a memory cache might help with the record duplication check.
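A bounded in-memory cache of recently seen keys, as described above, could look roughly like the following Java sketch. RecentKeyCache is a hypothetical class; it evicts the oldest inserted key once the capacity is exceeded, which is a simplification of the "sorted and trimmed" cache in the answer, and it assumes each record can be reduced to a string key.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical bounded cache of recently seen record keys.
final class RecentKeyCache {

    private final Set<String> seen;

    RecentKeyCache(final int capacity) {
        // Insertion-ordered map that drops its eldest entry when over capacity.
        this.seen = Collections.newSetFromMap(new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        });
    }

    // Returns true if the key was not in the cache (and remembers it),
    // false if it is a duplicate of a recently seen key.
    boolean markIfNew(String key) {
        return seen.add(key);
    }

    public static void main(String[] args) {
        RecentKeyCache cache = new RecentKeyCache(100_000);
        for (String key : new String[] {"rec-1", "rec-2", "rec-1"}) {
            if (cache.markIfNew(key)) {
                System.out.println("new record, write to database: " + key);
            } else {
                System.out.println("duplicate, skip: " + key);
            }
        }
    }
}
```

A key that has already been evicted is reported as new again, so the database should still enforce uniqueness if duplicates can arrive far apart.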
I know this is a very old thread, but in case someone like me needs this solution in VB.NET, here's what I used, based on the answers above.
Private Function ValidateUniquePayroll(PropertyToCheck As String) As Boolean
    ' Return True if PropertyToCheck is unique
    Dim rtnValue = False
    Dim context = New CPMModel.CPMEntities
    If (context.Employees.Any()) Then ' Check if there are "any" records in the Employee table
        Dim employee = From c In context.Employees Select c.PayrollNumber ' Select just the PayrollNumber column to work with
        For Each item As Object In employee ' Loop through each employee in the Employees entity
            If (item = PropertyToCheck) Then ' Check if PayrollNumber in current row matches PropertyToCheck
                ' Found a match, so return False
                rtnValue = False
                Exit For
            Else
                ' No match so far, return True (unique)
                rtnValue = True
            End If
        Next
    Else
        ' There are currently no employees in the Employees entity, so return True (unique)
        rtnValue = True
    End If
    Return rtnValue
End Function
I had some trouble with this - my EntityKey consists of three properties (PK with 3 columns) and I didn't want to check each of the columns because that would be ugly.
I came up with a solution that works every time, with all entities.
Another reason for this is I don't like to catch UpdateExceptions every time.
A little bit of Reflection is needed to get the values of the key properties.
The code is implemented as an extension to simplify the usage as:
context.EntityExists<MyEntityType>(item);
Have a look:
public static bool EntityExists<T>(this ObjectContext context, T entity)
where T : EntityObject
{
object value;
var entityKeyValues = new List<KeyValuePair<string, object>>();
var objectSet = context.CreateObjectSet<T>().EntitySet;
foreach (var member in objectSet.ElementType.KeyMembers)
{
var info = entity.GetType().GetProperty(member.Name);
var tempValue = info.GetValue(entity, null);
var pair = new KeyValuePair<string, object>(member.Name, tempValue);
entityKeyValues.Add(pair);
}
var key = new EntityKey(objectSet.EntityContainer.Name + "." + objectSet.Name, entityKeyValues);
if (context.TryGetObjectByKey(key, out value))
{
return value != null;
}
return false;
}
I just check whether the object is null; it works 100% for me:
try
{
var ID = Convert.ToInt32(Request.Params["ID"]);
var Cert = (from cert in db.TblCompCertUploads where cert.CertID == ID select cert).FirstOrDefault();
if (Cert != null)
{
db.TblCompCertUploads.DeleteObject(Cert);
db.SaveChanges();
ViewBag.Msg = "Deleted Successfully";
}
else
{
ViewBag.Msg = "Not Found !!";
}
}
catch
{
ViewBag.Msg = "Something Went wrong";
}
Why not do it like this?
var result= ctx.table.Where(x => x.UserName == "Value").FirstOrDefault();
if(result?.field == value)
{
// Match!
}
Best way to do it
Regardless of what your object is and which database table it belongs to, the only thing the object needs is its primary key.
C# Code
var dbValue = EntityObject.Entry(obj).GetDatabaseValues();
if (dbValue == null)
{
    // Doesn't exist
}
VB.NET Code
Dim dbValue = EntityObject.Entry(obj).GetDatabaseValues()
If dbValue Is Nothing Then
    ' Doesn't exist
End If