Entity Framework bulk add of over 100,000 items

I am building an app that will store stock tick data. One of my methods needs to take an array of items (currently around 100,000 at a time, but eventually over 1,000,000) and add them to an Entity Framework database. This is my current add logic:
var tickContext = new DataAccess();
// the array has been populated; add it to the Entity Framework database
for (int j = 0; j < tickDataArray.Length; j++)
{
    MarketTickData temp2 = new MarketTickData();
    if (j > 1)
        temp2 = tickDataArray[j - 1];
    MarketTickData temp = tickDataArray[j];
    tickContext.TickBarData.Add(tickDataArray[j]);
    tickContext.Configuration.AutoDetectChangesEnabled = false;
    tickContext.Configuration.ValidateOnSaveEnabled = false;
    if (j % 200 == 0)
    {
        tickContext.SaveChanges();
        tickContext.Dispose();
        tickContext = new DataAccess();
    }
}
// add the remaining items to the database
tickContext.SaveChanges();
tickContext.Configuration.AutoDetectChangesEnabled = true;
tickContext.Configuration.ValidateOnSaveEnabled = true;
When I test with various batch sizes before saving, I am not seeing huge improvements in performance. With a batch size of 50 the adds take 47 seconds, with 1000 they take 54 seconds, and the best is 200 items at 44 seconds. But it is still relatively slow.
Is there a way to improve this?
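Not a definitive answer, but for comparison, here is the same batching pattern with the tracking and validation flags disabled before anything is added; in the code above they are only switched off after the first Add, so change detection still runs during the adds. DataAccess, TickBarData, MarketTickData and tickDataArray are the question's own names; batchSize is illustrative:
const int batchSize = 200;
var context = new DataAccess();
context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;
for (int j = 0; j < tickDataArray.Length; j++)
{
    context.TickBarData.Add(tickDataArray[j]);
    if ((j + 1) % batchSize == 0)
    {
        // flush this batch and start over with a fresh, empty change tracker
        context.SaveChanges();
        context.Dispose();
        context = new DataAccess();
        context.Configuration.AutoDetectChangesEnabled = false;
        context.Configuration.ValidateOnSaveEnabled = false;
    }
}
context.SaveChanges(); // remaining items
context.Dispose();
For millions of rows, anything that goes through the EF change tracker will stay comparatively slow; a bulk path such as SqlClient's SqlBulkCopy, which streams rows straight to the server, is usually far faster for this kind of load.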

Is there a better way to calculate the moving sum of a list in Flutter?

Is there a better way to calculate a moving sum of a list?
List<double?> rollingSum({int window = 3, List data = const []}) {
  List<double?> sum = [];
  int i = 0;
  int maxLength = data.length - window + 1;
  while (i < maxLength) {
    List tmpData = data.getRange(i, i + window).toList();
    double tmpSum = tmpData.reduce((a, b) => a + b);
    sum.add(tmpSum);
    i++;
  }
  // filling the first n values with null
  i = 0;
  while (i < window - 1) {
    sum.insert(0, null);
    i++;
  }
  return sum;
}
Well, the code is already clean for what you need. Maybe just some improvements like:
Use a for loop.
Use the method sublist, which copies the range in a single step instead of going through the intermediate Iterable that getRange(i, i + window).toList() creates.
To insert padding values at the start of a list, build the prefix with List.filled (first parameter: how many elements, second parameter: the value to fill with) and prepend it with insertAll. For example, to pad a list with X nulls on the left, use sum.insertAll(0, List.filled(X, null)). (Dart's padLeft exists only on String, not on List.)
List<double?> rollingSum({int window = 3, List data = const []}) {
  List<double?> sum = [];
  for (int i = 0; i < data.length - window + 1; i++) {
    List tmpData = data.sublist(i, i + window);
    double tmpSum = tmpData.reduce((a, b) => a + b);
    sum.add(tmpSum);
  }
  // prepend the window - 1 leading nulls
  sum.insertAll(0, List.filled(window - 1, null));
  return sum;
}
If I understand your problem correctly, you can do this in a single pass: on each iteration add the current element to a running sum and, once the window is full, record the sum and then subtract the element at index i - (window - 1), which has just left the window.
so for an input like this
data = [1,2,3,4,5,6]
window = 3
the code below will produce [6, 9, 12, 15]:
int sum = 0;
List<int> res = [];
for (int i = 0; i < data.length; i++) {
  sum += data[i];
  if (i < window - 1) {
    continue; // window not full yet
  }
  res.add(sum);
  sum -= data[i - (window - 1)]; // remove the element that left the window
}
This way you don't have to use getRange, sublist, or reduce, all of which add time and space overhead for every window.
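For completeness, a sketch (assuming a numeric List<double> input, which tightens the original untyped signature) of the same running-sum idea wrapped in the original rollingSum shape, leading nulls included:
List<double?> rollingSum({int window = 3, List<double> data = const []}) {
  // start with the window - 1 leading nulls the original version produces
  final List<double?> sums = List<double?>.filled(window - 1, null, growable: true);
  double running = 0;
  for (int i = 0; i < data.length; i++) {
    running += data[i];
    if (i < window - 1) continue; // window not full yet
    sums.add(running);
    running -= data[i - (window - 1)]; // drop the element leaving the window
  }
  return sums;
}
For data = [1, 2, 3, 4, 5, 6] and window = 3 this returns [null, null, 6, 9, 12, 15], the same as the original function.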

Insert failed on table with 300 integers and 300 char(20)

I was trying to test the TOAST functionality and wrote this code:
int length = 20;
using (NpgsqlConnection conn = new NpgsqlConnection(""))
{
    conn.Open();
    StringBuilder ct = new StringBuilder();
    ct.Append("CREATE TABLE t300 (");
    for (int i = 0; i < 300; i++)
    {
        ct.Append("i").Append(i).Append(" int not null, n").Append(i).Append(" varchar(").Append(length).Append(") not null, ");
    }
    ct.Remove(ct.Length - 2, 2).Append(");");
    using (NpgsqlCommand cmd = new NpgsqlCommand(ct.ToString(), conn))
    {
        cmd.ExecuteNonQuery();
    }
    StringBuilder isql = new StringBuilder();
    isql.Append("INSERT INTO t300 (");
    StringBuilder vsql = new StringBuilder();
    vsql.Append("VALUES (");
    for (int i = 0; i < 300; i++)
    {
        isql.Append("i").Append(i).Append(", n").Append(i).Append(", ");
        vsql.Append(":i").Append(i).Append(", :n").Append(i).Append(", ");
    }
    isql.Remove(isql.Length - 2, 2).Append(") ").Append(vsql).Remove(isql.Length - 2, 2).Append(");");
    using (NpgsqlCommand cmd = new NpgsqlCommand(isql.ToString(), conn))
    {
        for (int i = 0; i < 300; i++)
        {
            cmd.Parameters.AddWithValue("i" + i.ToString(), NpgsqlDbType.Integer, i);
            cmd.Parameters.AddWithValue("n" + i.ToString(), NpgsqlDbType.Varchar, length, i.ToString() + new string('n', length - i.ToString().Length));
        }
        for (int i = 0; i < 10000; i++)
        {
            cmd.ExecuteNonQuery();
        }
    }
}
This code fails on the INSERT with the exception '54000: row size (8424) exceeds limit (8160)'.
When I set the 'length' variable to 26, the code works fine. Is there a workaround for this situation?
PostgreSQL 12, Npgsql 4.1.5.
Perhaps you have a misconception of how TOAST storage works. PostgreSQL does not compress the whole row and store it in the TOAST table; it handles each column of a variable-length data type independently.
So after TOASTing, the row still consists of 600 columns: the 300 integer columns (4 bytes each) won't be toasted, and each of the 300 toasted varchar columns will still hold a TOAST header and a TOAST pointer in the row.
Together this happens to be more than fits into a single block, and a row cannot span more than one block. That causes the error.
The solution is not to use tables with so many columns. You should split the data across several tables (normalization usually takes care of that). If there truly are very many attributes on a single entity, chances are that not all of them will be used in join or WHERE conditions, so you could consider storing such attributes in a single jsonb column, where TOASTing will be much more efficient; a sketch follows.
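To illustrate the jsonb suggestion, a minimal sketch (the table name t300j and the attribute layout are made up for the example; conn is an open NpgsqlConnection as in the question):
using (NpgsqlCommand cmd = new NpgsqlCommand(
    "CREATE TABLE t300j (id int PRIMARY KEY, attrs jsonb NOT NULL);", conn))
{
    cmd.ExecuteNonQuery();
}
using (NpgsqlCommand cmd = new NpgsqlCommand(
    "INSERT INTO t300j (id, attrs) VALUES (:id, :attrs);", conn))
{
    cmd.Parameters.AddWithValue("id", NpgsqlDbType.Integer, 1);
    // all 600 former columns become keys inside one jsonb document,
    // which PostgreSQL can compress and toast as a single value
    cmd.Parameters.AddWithValue("attrs", NpgsqlDbType.Jsonb,
        "{\"i0\": 0, \"n0\": \"nnnnnnnnnnnnnnnnnnnn\"}");
    cmd.ExecuteNonQuery();
}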

How to collect all data from first column in table into array?

I am using NetBeans 8.
I need to loop over the first column of a JTable, collect all the employee IDs, and store them in an ArrayList.
if (jTabledetail.getRowCount() > 0) {
    String ecode = "";
    int ishasRow = jTabledetail.getRowCount(); // 1 row in total
    for (int r = 0; r <= ishasRow; r++) { // with one row this loops more than once; the extra iteration throws the error below
        ecode = jTabledetail.getValueAt(r, 0).toString();
        arrempcode.add(ecode);
    }
}
I also tried changing it to for (int r = 0; r < ishasRow; r++) but that did not work either.
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 1 >= 1
at java.util.Vector.elementAt(Vector.java:474)
at javax.swing.table.DefaultTableModel.getValueAt(DefaultTableModel.java:648)
at javax.swing.JTable.getValueAt(JTable.java:2717)
I don't understand the error. I know that it comes from the loop condition, but I am not sure how.
Right now my JTable named "jTabledetail" has exactly one row.
Do I need to change something for this case? I am not sure whether the loop condition is wrong.
Thank you very much.
Row indexes run from 0 to getRowCount() - 1, so the loop condition must be r < rowCount. With r <= ishasRow the last iteration asks for a row that does not exist, which is exactly what ArrayIndexOutOfBoundsException: 1 >= 1 is saying: row index 1 in a table with only 1 row.
Get the model, loop over the rows, and read only column 0:
DefaultTableModel tableModel = (DefaultTableModel) jTabledetail.getModel();
// get the row count of the table
int rowCount = tableModel.getRowCount();
// declare the ArrayList
ArrayList<Object> list = new ArrayList<Object>();
// traverse the rows, adding each first-column value into the list
for (int i = 0; i < rowCount; i++) {
    list.add(tableModel.getValueAt(i, 0));
}
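One caveat beyond the original answer (an assumption about your setup): if the table can be sorted or filtered, view row order and model row order differ, so it is safer to map view rows to model rows with convertRowIndexToModel:
// a sketch: collect the IDs in the order currently shown on screen
ArrayList<String> ids = new ArrayList<String>();
for (int viewRow = 0; viewRow < jTabledetail.getRowCount(); viewRow++) {
    int modelRow = jTabledetail.convertRowIndexToModel(viewRow); // map view row to model row
    ids.add(jTabledetail.getModel().getValueAt(modelRow, 0).toString());
}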

Merge sort performance compared to insertion sort

For any array of length greater than 10, is it safe to say that merge sort performs fewer comparisons among the array's elements than insertion sort does on the same array, because the best-case run time of merge sort is O(N log N) while for insertion sort it's O(N)?
My take on this: first off, you are talking about comparisons, but there are swaps as well, and they matter too.
In insertion sort the worst case (an array sorted in the opposite direction) costs n^2 - n comparisons and swaps (11^2 - 11 = 121 - 11 = 110 for 11 elements, for example). But if the array is even partially sorted in the needed order (many elements already at, or not far from, their correct positions), the number of comparisons and swaps can drop significantly, because the right position for each element is found quickly. So, as you can see for arr2 below, which is almost sorted, the amount of work becomes roughly linear in the input size: 16 actions (13 comparisons and 3 swaps) instead of 110.
var arr1 = [11,10,9,8,7,6,5,4,3,2,1];
var arr2 = [1,2,3,4,5,6,7,8,11,10,9];
function InsertionSort(arr) {
  var compNum = 0, swapNum = 0;
  for (var i = 1; i < arr.length; i++) {
    var temp = arr[i], j = i - 1;
    while (j >= 0) {
      compNum++; // every probe of arr[j] is a comparison, including the one that fails
      if (temp < arr[j]) { arr[j + 1] = arr[j]; swapNum++; } else break;
      j--;
    }
    arr[j + 1] = temp;
  }
  console.log(arr, "Number of comparisons: " + compNum, "Number of swaps: " + swapNum);
}
InsertionSort(arr1); // worst case: 11^2 - 11 = 110 actions
InsertionSort(arr2); // almost sorted array: 16 actions
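For a fully sorted input the inner loop breaks on its first comparison every time, so the function above reports n - 1 comparisons and no swaps, which is the O(N) best case the question refers to (arr3 is added here purely for illustration):
var arr3 = [1,2,3,4,5,6,7,8,9,10,11];
InsertionSort(arr3); // already sorted: 10 comparisons, 0 swaps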
In merge sort we always do approximately n*log n actions regardless of the input array's initial order: each element is copied once per merge it takes part in, and for 11 elements the merges move 11 + 6 + 5 + 3 + 3 + 3 + 2 + 2 + 2 + 2 = 39 elements in total. So, as you can see, both of our arrays get sorted in 39 actions:
var arr1 = [11,10,9,8,7,6,5,4,3,2,1];
var arr2 = [1,2,3,4,5,6,7,8,11,10,9];
var actions = 0;
function mergesort(arr, left, right) {
  if (left >= right) return;
  var middle = Math.floor((left + right) / 2);
  mergesort(arr, left, middle);
  mergesort(arr, middle + 1, right);
  merge(arr, left, middle, right);
}
function merge(arr, left, middle, right) {
  var l = middle - left + 1, r = right - middle, temp_l = [], temp_r = [];
  for (var i = 0; i < l; i++) temp_l[i] = arr[left + i];
  for (var i = 0; i < r; i++) temp_r[i] = arr[middle + i + 1];
  var i = 0, j = 0, k = left;
  while (i < l && j < r) {
    if (temp_l[i] <= temp_r[j]) {
      arr[k] = temp_l[i]; i++;
    } else {
      arr[k] = temp_r[j]; j++;
    }
    k++; actions++;
  }
  while (i < l) { arr[k] = temp_l[i]; i++; k++; actions++; }
  while (j < r) { arr[k] = temp_r[j]; j++; k++; actions++; }
}
mergesort(arr1, 0, arr1.length - 1);
console.log(arr1, "Number of actions: " + actions); // 11*log(11) = 39 (approx.)
actions = 0;
mergesort(arr2, 0, arr2.length - 1);
console.log(arr2, "Number of actions: " + actions); // 11*log(11) = 39 (approx.)
So, answering your question:
For any array of length greater than 10, is it safe to say that merge sort performs fewer comparisons among the array's elements than does insertion sort on the same array?
I would say that no, it isn't safe to say so. Merge sort can perform more actions than insertion sort in some cases. The size of the array isn't what matters here; what matters in this particular comparison of insertion sort vs. merge sort is how far your array is from its sorted state. I hope it helps :)
BTW, merge sort and insertion sort have been combined into a hybrid stable sorting algorithm called Timsort, which takes the best from both of them. Check it out if you are interested.

Merging geometries using a WebWorker?

Does anyone know if it's possible to merge a set of cube geometries in a web worker and pass the result back to the main thread? I was thinking this could reduce the lag when merging large numbers of cubes.
Does Three.JS work okay in a web worker, and if it does, would it be possible (and faster) to do this? I am not sure whether passing the geometry back would take just as long as merging it normally.
At the moment I'm using a timed for loop to reduce the lag:
// This array is populated by the server and contains the chunk position and data (which I do nothing with yet).
var sectionData = data.secData;
var section = 0;
var tick = function() {
  var start = new Date().getTime();
  for (; section < sectionData.length && (new Date().getTime()) - start < 1; section++) {
    var sectionXPos = sectionData[section][0] * 10;
    var sectionZPos = sectionData[section][1] * 10;
    var combinedGeometry = new THREE.Geometry();
    for (var layer = 0; layer < 1; layer++) { // Only 1 layer because of the lag...
      for (var x = 0; x < 10; x++) {
        for (var z = 0; z < 10; z++) {
          blockMesh.position.set(x - 4.5, layer - .5, z - 4.5);
          blockMesh.updateMatrix();
          THREE.GeometryUtils.merge(combinedGeometry, blockMesh);
        }
      }
    }
    var sectionMesh = new THREE.Mesh(combinedGeometry, grassBlockMat);
    sectionMesh.position.set(sectionXPos, 0, sectionZPos);
    sectionMesh.matrixAutoUpdate = false;
    sectionMesh.updateMatrix();
    scene.add(sectionMesh);
  }
  if (section < sectionData.length) {
    setTimeout(tick, 25);
  }
};
setTimeout(tick, 25);
Using Three.JS rev59-dev.
Merged cubes make up the terrain in chunks, and at the moment (due to the lag) each chunk only has 1 layer.
Any tips would be appreciated! Thanks.
THREE.JS will not work in a web worker as-is; however, you can copy the parts of the library that you need into the worker so they work in both the main thread and the worker.
Your first problem will be that you cannot send the geometry object itself back to the main thread.
Since postMessage passes only copies of plain data (not live JavaScript objects) or references to ArrayBuffers, you would have to decode the geometry down to its floats, pack them into an ArrayBuffer, and send a reference to that buffer back to the main thread.
Note that such buffers are called transferable objects, and once sent, they are no longer usable in the web worker / main thread they came from.
See here for more details:
http://www.html5rocks.com/en/tutorials/workers/basics/
https://developer.mozilla.org/en-US/docs/Web/Guide/Performance/Using_web_workers
Here is an example of packing position vertices into an array for a physics-type system:
// length * 3 axes * 4 bytes per vertex
var posBuffer = new Float32Array(new ArrayBuffer(len * 3 * 4));
// in a loop:
// ... do hard work
posBuffer[i * 3] = pos.x; // pos is a three.js vector
posBuffer[i * 3 + 1] = pos.y;
posBuffer[i * 3 + 2] = pos.z;
// after the loop, send the buffer to the main thread
self.postMessage({posBuffer: posBuffer}, [posBuffer.buffer]);
I copied the THREE.JS vector class into my web worker and cut out all the methods I didn't need to keep it nice and lean.
FYI it's not slow; for something like n-body collisions it works well.
The main thread sends a command to the web worker telling it to run the update, then listens for the response, kind of like a producer-consumer model in regular threading.
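To round out the picture, the receiving side could look something like this (a sketch; the worker variable, the 'update' command name, and the geometry-application step are assumptions, not part of the original answer):
// main thread: ask the worker for an update, then unpack the transferred buffer
worker.postMessage({cmd: 'update'});
worker.onmessage = function (e) {
  var posBuffer = e.data.posBuffer; // the Float32Array arrives intact
  for (var i = 0; i < posBuffer.length / 3; i++) {
    var v = new THREE.Vector3(
      posBuffer[i * 3],
      posBuffer[i * 3 + 1],
      posBuffer[i * 3 + 2]);
    // ... apply v to the corresponding vertex / mesh position ...
  }
};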