I am parsing a JSON file to generate rows and relationships in a database with OrmLite. With the following code, it takes 20+ minutes to parse all of the content. Is there any way I can optimize this to take less time?
I have 3 tables making up a many-to-many relationship.
public class FirstTable {
int id;
ForeignCollection<IntermediateTable> intermediateTables;
}
public class IntermediateTable {
int id;
FirstTable firstTable;
SecondTable secondTable;
}
public class SecondTable {
int id;
ForeignCollection<IntermediateTable> intermediateTables;
}
After creating and populating the first and second tables, I parse a JSON file to create the relations between FirstTable and SecondTable. The JSON file stores a collection of FirstTable objects and the IDs of the related SecondTable entries.
My code looks something like this:
void setForeignRelations(JSONObject jsonObject, FirstTable firstTable) throws JSONException, SQLException {
JSONArray secondTables = jsonObject.getJSONArray(SECOND_TABLE_KEY);
for (int i = 0; i < secondTables.length(); i++) {
int secondTableId = ((Integer)secondTables.get(i)).intValue();
SecondTable secondTable = DbManager.getInstance().getHelper().getSecondTableDao().queryForId(secondTableId);
IntermediateTable intermediateTable = new IntermediateTable();
intermediateTable.setFirstTable(firstTable);
intermediateTable.setSecondTable(secondTable);
DbManager.getInstance().getHelper().getIntermediateTableDao().create(intermediateTable);
}
}
I never tried it, but this should be faster and it should work:
void setForeignRelations(JSONObject jsonObject, FirstTable firstTable) throws JSONException, SQLException {
JSONArray secondTables = jsonObject.getJSONArray(SECOND_TABLE_KEY);
for (int i = 0; i < secondTables.length(); i++) {
int secondTableId = ((Integer)secondTables.get(i)).intValue();
IntermediateTable intermediateTable = new IntermediateTable();
intermediateTable.setFirstTable(firstTable);
intermediateTable.setSecondTable(new SecondTable(secondTableId));
DbManager.getInstance().getHelper().getIntermediateTableDao().create(intermediateTable);
}
}
This works because only the id of the secondTable is saved in the intermediateTable row, so you don't have to query the database to fill in all the fields of your second table.
If you want to make it even faster, have a look here to speed up your object insertion:
ORMLite's createOrUpdate seems slow - what is normal speed?
You can use the callBatchTasks method, which increases insert speed in ORMLite.
With these two changes you speed things up on both sides: the SELECT disappears entirely, and the INSERTs are batched with callBatchTasks.
If you want an example of the callBatchTasks method, you can have a look here:
Android ORMLite slow create object
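For reference, here is a minimal sketch of what the batched version could look like. It assumes getIntermediateTableDao() returns a Dao<IntermediateTable, Integer> (adjust the generic types to your helper) and that SecondTable has an id-only constructor, as in the code above; it uses com.j256.ormlite.dao.Dao and java.util.concurrent.Callable.
void setForeignRelationsBatched(JSONObject jsonObject, final FirstTable firstTable) throws Exception {
    final JSONArray secondTables = jsonObject.getJSONArray(SECOND_TABLE_KEY);
    final Dao<IntermediateTable, Integer> dao =
            DbManager.getInstance().getHelper().getIntermediateTableDao();

    // callBatchTasks runs the whole loop as one batch (a single transaction,
    // or with auto-commit turned off) instead of committing after every create().
    dao.callBatchTasks(new Callable<Void>() {
        @Override
        public Void call() throws Exception {
            for (int i = 0; i < secondTables.length(); i++) {
                IntermediateTable intermediateTable = new IntermediateTable();
                intermediateTable.setFirstTable(firstTable);
                // only the id is needed to fill the foreign key column
                intermediateTable.setSecondTable(new SecondTable(secondTables.getInt(i)));
                dao.create(intermediateTable);
            }
            return null;
        }
    });
}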
I have an extremely large table that I'm trying to get the number of rows for. Using COUNT(*) is too slow, so I want to run this query using EF Core:
int count = _dbContext.Database.ExecuteSqlRaw(
"SELECT Total_Rows = SUM(st.row_count) " +
"FROM sys.dm_db_partition_stats st " +
"WHERE object_name(object_id) = 'MyLargeTable' AND(index_id < 2)");
The only problem is that the return value isn't the result of the query, but the number of rows affected, which is just 1.
Is there a way to get the correct value here, or will I need to use a different method?
Since you only need a scalar value, you can also use an output parameter to retrieve the data, e.g.
var sql = #"
SELECT #Total_Rows = SUM(st.row_count)
FROM sys.dm_db_partition_stats st
WHERE object_name(object_id) = 'MyLargeTable' AND(index_id < 2)
";
var pTotalRows = new SqlParameter("#Total_Rows", System.Data.SqlDbType.BigInt);
pTotalRows.Direction = System.Data.ParameterDirection.Output;
db.Database.ExecuteSqlRaw(sql, pTotalRows);
var totalRos = (long?)(pTotalRows.Value == DBNull.Value ? null:pTotalRows.Value) ;
Allow me to recreate a correct answer based on this blog post: https://erikej.github.io/efcore/2020/05/26/ef-core-fromsql-scalar.html
We need to create a virtual entity model that will hold the query result, together with a pseudo DbSet<> for that virtual model, so that we can use EF Core's FromSqlRaw method, which returns data, instead of ExecuteSqlRaw, which only returns the number of rows affected by the query.
The example is for returning an integer value, but you can easily adapt it:
Define a class to hold the return value:
public class IntReturn
{
public int Value { get; set; }
}
Fake a virtual DbSet<IntReturn>; it will not really be present in the database:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
...
modelBuilder.Entity<IntReturn>().HasNoKey();
base.OnModelCreating(modelBuilder);
}
Now we can call FromSqlRaw on this virtual set. In this example the calling method is inside MyContext : DbContext; to use it from elsewhere you'd instantiate your own context and use it instead of this.
NOTE the usage of "as Value": the column alias has the same name as the IntReturn.Value property. In some weird cases you'd have to do the opposite: name your virtual model property after the name of the value the database function returns.
public int ReserveNextCustomerId()
{
var sql = $"Select nextval(pg_get_serial_sequence('\"Customers\"', 'Id')) as Value;";
var i = this.Set<IntReturn>()
.FromSqlRaw(sql)
.AsEnumerable()
.First().Value;
return i;
}
I'm using FileHelpers to parse a very wide, fixed-format file and want to be able to take the resulting object and load it into a DB using EF. I'm getting a missing-key error when I try to load the object into the DB, and when I try to add an Id I get a FileHelpers error. So it seems like either fix breaks the other. I know I can map a FileHelpers object to a POCO object and load that, but I'm dealing with dozens (sometimes hundreds) of columns, so I would rather not have to go through that hassle.
I'm also open to other suggestions for parsing a fixed width file and loading the results into a DB. One option of course is to use an ETL tool but I'd rather do this in code.
Thanks!
This is the FileHelpers class:
public class AccountBalanceDetail
{
[FieldHidden]
public int Id; // Added to try and get EF to work
[FieldFixedLength(1)]
public string RecordNumber;
[FieldFixedLength(3)]
public string Branch;
// Additional fields below
}
And this is the method that's processing the file:
public static bool ProcessFile()
{
var dir = Properties.Settings.Default.DataDirectory;
var engine = new MultiRecordEngine(typeof(AccountBalanceHeader), typeof(AccountBalanceDetail), typeof(AccountBalanceTrailer));
engine.RecordSelector = new RecordTypeSelector(CustomSelector);
var fileName = dir + "\\MOCK_ACCTBAL_L1500.txt";
var res = engine.ReadFile(fileName);
foreach (var rec in res)
{
var type = rec.GetType();
if (type.Name == "AccountBalanceHeader") continue;
if (type.Name == "AccountBalanceTrailer") continue;
var data = rec as AccountBalanceDetail; // Throws an error if AccountBalanceDetail.Id has a getter and setter
using (var ctx = new ApplicationDbContext())
{
// Throws an error if there is no valid Id on AccountBalanceDetail
// EntityType 'AccountBalanceDetail' has no key defined. Define the key for this EntityType.
ctx.AccountBalanceDetails.Add(data);
ctx.SaveChanges();
}
//Console.WriteLine(rec.ToString());
}
return true;
}
Entity Framework needs the key to be a property, not a field, so you could try declaring it instead as:
public int Id { get; set; }
I suspect FileHelpers might well be confused by the autogenerated backing field, so you might need to do it long form in order to be able to mark the backing field with the [FieldHidden] attribute, i.e.,
[FieldHidden]
private int _Id;
public int Id
{
get { return _Id; }
set { _Id = value; }
}
However, you are trying to use the same class for two unrelated purposes and this is generally bad design. On the one hand AccountBalanceDetail is the spec for the import format. On the other you are also trying to use it to describe the Entity. Instead you should create separate classes and map between the two with a LINQ function or a library like AutoMapper.
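For example, a minimal hand-written projection could look like the sketch below; AccountBalanceDetailEntity is a hypothetical EF entity class, kept separate from the FileHelpers import spec, and the AccountBalanceDetails DbSet is assumed to be typed to it.
// Hypothetical EF entity, separate from the FileHelpers import class
public class AccountBalanceDetailEntity
{
    public int Id { get; set; }
    public string RecordNumber { get; set; }
    public string Branch { get; set; }
    // Additional columns below
}

// Inside ProcessFile(), after engine.ReadFile(fileName); requires using System.Linq;
var details = res.OfType<AccountBalanceDetail>()
    .Select(d => new AccountBalanceDetailEntity
    {
        RecordNumber = d.RecordNumber,
        Branch = d.Branch
        // map the remaining fields here, or let a library like AutoMapper do it by convention
    })
    .ToList();

using (var ctx = new ApplicationDbContext())
{
    ctx.AccountBalanceDetails.AddRange(details);
    ctx.SaveChanges();
}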
I was told to use automapper in the code below. I cannot get clarification for reasons that are too lengthy to go into. What object am I supposed to be mapping to what object? I don't see a "source" object, since the source is the database...
Would really appreciate any help on how to do this with automapper. Note, the actual fields are irrelevant, I need help with the general concept. I do understand how mapping works when mapping from one object to another.
public IQueryable<Object> ReturnDetailedSummaries(long orgId)
{
var summaries = from s in db.ReportSummaries
where s.OrganizationId == orgId
select new SummaryViewModel
{
Id = s.Id,
Name = s.Name,
AuditLocationId = s.AuditLocationId,
AuditLocationName = s.Location.Name,
CreatedOn = s.CreatedOn,
CreatedById = s.CreatedById,
CreatedByName = s.User.Name,
OfficeId = s.OfficeId,
OfficeName = s.Office.Name,
OrganizationId = s.OrganizationId,
OrganizationName = s.Organization.Name,
IsCompleted = s.IsCompleted,
isHidden = s.isHidden,
numberOfItemsInAuditLocations = s.numberOfItemsInAuditLocations,
numberOfLocationsScanned = s.numberOfLocationsScanned,
numberOfItemsScanned = s.numberOfItemsScanned,
numberofDiscrepanciesFound = s.numberofDiscrepanciesFound
};
return summaries;
}
It is handy and a time-saver, especially if you use one-to-one naming between translation layers. Here is how I use it.
For single item
public Domain.Data.User GetUserByUserName(string userName)
{
Mapper.CreateMap<User, Domain.Data.User>();
return (
from s in _dataContext.Users
where s.UserName==userName
select Mapper.Map<User, Domain.Data.User>(s)
).SingleOrDefault();
}
Multiple Items
public List<Domain.Data.User> GetUsersByProvider(int providerID)
{
Mapper.CreateMap<User, Domain.Data.User>();
return (
from s in _dataContext.Users
where s.ProviderID== providerID
select Mapper.Map<User, Domain.Data.User>(s)
).ToList();
}
It looks like you already have a model? SummaryViewModel?
If this isn't the DTO, then presumably you want to do:
Mapper.CreateMap<SummaryViewModel, SummaryViewModelDto>();
SummaryViewModelDto summaryViewModelDto =
Mapper.Map<SummaryViewModel, SummaryViewModelDto>(summaryViewModel);
AutoMapper will copy fields from one object to another, to save you having to do it all manually.
See https://github.com/AutoMapper/AutoMapper/wiki/Getting-started
The source is your entity class ReportSummary, the target is SummaryViewModel:
Mapper.CreateMap<ReportSummary, SummaryViewModel>();
The best way to use AutoMapper in combination with an IQueryable data source is through the Project.To API:
var summaries = db.ReportSummaries.Where(s => s.OrganizationId == orgId)
.Project().To<SummaryViewModel>();
Project.To translates the properties in the target model straight to the selected columns in the generated SQL.
Mapper.Map, on the other hand, only works on in-memory collections, so you can only use it when you first fetch complete ReportSummary objects from the database. (In this case there may not be much of a difference, but in other cases it can be substantial).
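For comparison, the in-memory variant described above could look roughly like this, using the same static Mapper API as the earlier examples (it materializes full ReportSummary rows first, then maps them):
Mapper.CreateMap<ReportSummary, SummaryViewModel>();
// Properties whose names don't line up by convention (e.g. AuditLocationName
// coming from Location.Name) would need explicit ForMember configuration.

var entities = db.ReportSummaries
    .Where(s => s.OrganizationId == orgId)
    .ToList(); // full entities are loaded into memory here

List<SummaryViewModel> summaries =
    Mapper.Map<List<ReportSummary>, List<SummaryViewModel>>(entities);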
I am trying the following, which results in an additional update being executed and fails my tests.
I have an entity like this.
@Entity
@SqlResultSetMapping(name = "tempfilenameRSMapping",
entities = { @EntityResult(entityClass = MyEntity.class) },
columns = { @ColumnResult(name = "TEMPFILENAME") })
// The reason for this mapping is to fetch an additional field through a join.
@Table(name = "MY_TABLE")
public class MyEntity {
@Id
@Column(name="ID")
private String id;
@Column(name="NAME")
private String name;
@Column(name="DESC")
private String description;
@Column(name="STATUS")
private String status;
// followed by getters and setters
}
I retrieve the entity with a native query, and for the retrieved entity I then execute a native update (the reason for the native update is that I want to update just one field). Note that I am not updating the retrieved entity directly.
What I observe is that my update is not executed properly. When I turn TRACE on, I notice that on flush OpenJPA executes an additional update query and therefore overwrites my original update.
e.g.
SELECT M.ID, M.NAME, M.DESC, O.TEMPFILENAME FROM MY_TABLE M, OTHER_TABLE O WHERE M.ID = ?
UPDATE MY_TABLE SET STATUS = ? WHERE ID = ?
UPDATE MY_TABLE SET ID=?, NAME=?, DESC=?, STATUS=? WHERE ID = ?
What can I do to prevent this automatic update?
Edit:
Here are the routines we use for executing the queries.
The following routine returns the SQL of a named native query.
public String getNamedNativeQuerySql(EntityManagerFactory emf, String qryName) {
MetamodelImpl metamodel = (MetamodelImpl) emf.getMetamodel();
QueryMetaData queryMetaData =
metamodel.getConfiguration().getMetaDataRepositoryInstance().getQueryMetaData(null, qryName, null, true);
String queryString = queryMetaData.getQueryString();
return queryString;
}
The code for retrieval:
Query query = entityManager.createNamedQuery("retrieveQry");
query.setParameter(1, id);
Object[] result = (Object[]) query.getSingleResult();
MyEntity entity = (MyEntity) result[0];
String tempFileName = (String) result[1];
The code for update that follows retrieval:
Query qry = entityManager.createNamedQuery("updateQry");
qry.setParameter(1, status);
qry.setParameter(2, entity.getId() );
qry.executeUpdate();
Edit:
I see the problem even without the update statement. OpenJPA is
executing an additional update query even if I do a simple find.
The problem was with runtime enhancement: OpenJPA was unable to properly detect the dirty state of runtime-enhanced entities.
It was resolved by switching to build-time enhancement.
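For completeness: build-time enhancement is usually wired in through the openjpa-maven-plugin's enhance goal or the PCEnhancer tool, and a quick sanity check that it actually ran is to see whether the entity class implements OpenJPA's PersistenceCapable interface. A minimal sketch:
import org.apache.openjpa.enhance.PersistenceCapable;

public class EnhancementCheck {
    public static void main(String[] args) {
        // An enhanced entity class implements PersistenceCapable. If this prints
        // false, no enhancement is in effect and OpenJPA falls back to runtime
        // subclassing, which is where dirty-state detection problems tend to show up.
        boolean enhanced = PersistenceCapable.class.isAssignableFrom(MyEntity.class);
        System.out.println("MyEntity enhanced: " + enhanced);
    }
}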
I am developing a Silverlight application using WCF and EF.
I am using Database First, as our database already exists.
I have a table that consists of 100 columns with datatype real. We want to generate a class which has a List<double> or List<float> instead of 100 discrete members, one per column.
Is this possible? Can someone give me an example?
There's no direct way. What you have to do is use reflection to convert it into a List<double>. Suppose your table is called MyObject; then EF will generate a class MyObject to represent a row in that table. You can then do:
Type type = typeof(MyObject);
// Get properties (columns) through reflection
PropertyInfo[] properties = type.GetProperties();
List<List<double>> allRows = new List<List<double>>();
using(var dbContext = MyDB.GetContext())
{
foreach(var row in dbContext.MyRows)
{
List<double> columnValues = new List<double>();
foreach (PropertyInfo property in properties)
{
// The SQL type REAL maps to float (FLOAT maps to double)
if (property.PropertyType == typeof(float) || property.PropertyType == typeof(double))
{
// Convert.ToDouble avoids an invalid unboxing cast when the boxed value is a float
columnValues.Add(Convert.ToDouble(property.GetValue(row, null)));
}
}
allRows.Add(columnValues);
}
}
Hope you get the idea.