SqlBulkCopy: dealing with batch size in ADO.NET

I have the following code to insert some data into the database using the SqlBulkCopy class from ADO.NET:
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(DCISParameters.ConnectionString))
{
    bulkCopy.DestinationTableName = "tbzErgoAnalytical";
    bulkCopy.BatchSize = 250;
    bulkCopy.NotifyAfter = 250;                       // SqlRowsCopied only fires when NotifyAfter is set
    bulkCopy.SqlRowsCopied += bulkCopy_SqlRowsCopied; // subscribe before WriteToServer, not after
    bulkCopy.ColumnMappings.Add("Column1", "fldESPA");
    bulkCopy.ColumnMappings.Add("Column2", "fldEP");
    bulkCopy.ColumnMappings.Add("Column13", "fldMISCode");
    bulkCopy.WriteToServer(dbTable);
}
dbTable is a DataTable object that is passed in as a method parameter and contains 7,691 rows that I read from an Excel file. I have set the batch size to 250. The problem is that 7,500 (250 * 30) rows are transferred correctly to the database, but then I receive the following error: "Column 'fldMISCode' does not allow DBNull.Value." I am 100% sure that there is no null value in fldMISCode, and I suppose that in the last insert I have only 191 rows left, which is less than the batch size (not sure if my assumption is correct). Any idea how to deal with this error? Thanks in advance...
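One way to test the DBNull assumption before calling WriteToServer is to scan the DataTable directly. Below is a minimal sketch, assuming dbTable uses the source column names from the mappings above ("Column13" being the column mapped to fldMISCode); Excel imports often append blank trailing rows that arrive as DBNull or empty strings.

// Requires a reference to System.Data.DataSetExtensions for AsEnumerable().
var suspectRows = dbTable.AsEnumerable()
    .Select((row, index) => new { row, index })
    .Where(x => x.row.IsNull("Column13") ||
                string.IsNullOrWhiteSpace(x.row["Column13"].ToString()))
    .ToList();

foreach (var x in suspectRows)
{
    // These are the rows SqlBulkCopy would reject for the NOT NULL fldMISCode column.
    Console.WriteLine("Row " + x.index + ": source value for fldMISCode is missing");
}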

Related

Getting the total number of records in PagedList

The datagrid that I use on the client is based on SQL row number; it also requires the total number of pages for its paging. I also use PagedList on the server.
SQL Profiler shows that PagedList makes two DB calls: the first to get the total number of records and the second to get the current page. The thing is that I can't find a way to extract that total number of records from the PagedList, so currently I have to make an extra call to get it, which means three calls in total for each request, two of which are absolutely identical. I understand that I probably won't be able to get rid of the call that fetches the total, but I hate to make it twice. Here is an extract from my code; I'd really appreciate any help with this:
var t = from c in myDb.MyTypes.Filter<MyType>(filterXml) select c;
response.Total = t.Count(); // my first call to get the total

double d = uiRowNumber / uiRecordsPerPage;
int page = (int)Math.Ceiling(d) + 1;

var q = from c in myDb.MyTypes.Filter<MyType>(filterXml).OrderBy(someOrderString)
        select new ReturnType
        {
            Something = c.Something
        };

response.Items = q.ToPagedList(page, uiRecordsPerPage);
PagedList has a .TotalItemCount property which reflects the total number of records in the set (not the number in a particular page). Thus response.Items.TotalItemCount should do the trick.
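For example, based on the code above, the separate Count() query can be dropped and the total read from the paged result instead (the double cast is also worth adding so the page calculation doesn't use integer division):

double d = (double)uiRowNumber / uiRecordsPerPage;   // cast to avoid integer division
int page = (int)Math.Ceiling(d) + 1;

var q = from c in myDb.MyTypes.Filter<MyType>(filterXml).OrderBy(someOrderString)
        select new ReturnType
        {
            Something = c.Something
        };

response.Items = q.ToPagedList(page, uiRecordsPerPage);
response.Total = response.Items.TotalItemCount;      // no separate Count() round trip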

Is it correct to use -1 as a size for an output SqlParameter to retrieve a VARBINARY(max) value?

I have a stored procedure with an OUTPUT parameter of the varbinary(MAX) type:
ALTER PROCEDURE [dbo].[StoredProcedure1]
...
@FileData varbinary(MAX) OUTPUT
AS
...
I don't know what the actual size of the returned data will be, so I can't use an exact value for the size parameter of the SqlParameter constructor. On the other hand, the actual size could be more than 8 KB (if that matters).
When I create a SqlParameter without declaring a size:
var fileDataParameter = new SqlParameter("@FileData", SqlDbType.VarBinary)
    { Direction = ParameterDirection.Output };
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add(fileDataParameter);
command.ExecuteNonQuery();
var fileData = fileDataParameter.Value as byte[];
I'm getting the following exception on the command.ExecuteNonQuery() line:
Additional information: Byte[][0]: the Size property has an invalid
size of 0.
So I need to specify the size. Some people recommend passing -1 as the size:
var fileDataParameter = new SqlParameter("@FileData", SqlDbType.VarBinary, -1)
    { Direction = ParameterDirection.Output };
But I can't find a solid description of this either on the MSDN page or anywhere else.
In my case, the maximum size of the data returned in the @FileData parameter is no more than 10 MB.
So the question is: is passing -1 as the size of a SqlParameter mapped to a varbinary(MAX) OUTPUT parameter the correct approach, for example from a performance perspective?
Here is the MSDN documentation.
https://msdn.microsoft.com/en-us/library/bb399384.aspx
- See "Using Large Value Type Parameters" section
Passing a "-1" is the correct approach for "MAX" value size. Since it's a VarChar, it won't add add or return any extra chars, only the ones you set in that column. So it should be very efficient.

Dataset capacities

Is there any limit on the number of rows in a DataSet? Basically I need to generate Excel files with data extracted from SQL Server and add formatting. I see two approaches: either take the entire data set (around 450,000 rows) and loop through it in .NET code, or loop through around 160 records at a time, pass each record as an input to a proc, get the relevant data, generate the file, and move on to the next 160. Which is the best way? Is there any other way this can be handled?
If I take 450,000 records at a time, will my application crash?
Thanks,
Rohit
You should not try to read 450,000 rows into your application at one time. You should instead use a DataReader or another cursor-like method and look at the data a row at a time. Otherwise, even if your application does run, it will be extremely slow and use up all of the computer's resources.
Basically I need to generate Excel files with data extracted from SQL Server and add formatting
A DataSet is generally not ideal for this. A process that loads a DataSet, loops over it, and then discards it means that the memory from the first row processed won't be released until the last row is processed.
You should use a DataReader instead. It discards each row once it has been processed, via the subsequent call to Read.
Is there any limit on the number of rows in a DataSet?
At the very least, since the DataRowCollection.Count property is an int, it is limited to 2,147,483,647 rows; however, other constraints (not least available memory) will make the practical limit far smaller.
From your comments, here is an outline of how I might construct the loop:
using (connection)
{
    SqlCommand command = new SqlCommand(
        @"SELECT Company, Dept, EmpName
          FROM Table
          ORDER BY Company, Dept, EmpName", connection);
    connection.Open();

    string CurrentCompany = "";
    string CurrentDept = "";
    string LastCompany = "";
    string LastDept = "";
    SomeExcelObject xl = null;

    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            CurrentCompany = reader["Company"].ToString();
            CurrentDept = reader["Dept"].ToString();

            // Start a new Excel document whenever the Company/Dept group changes
            if (CurrentCompany != LastCompany || CurrentDept != LastDept)
            {
                xl = CreateNewExcelDocument(CurrentCompany, CurrentDept);
            }

            LastCompany = CurrentCompany;
            LastDept = CurrentDept;
            AddNewEmpName(xl, reader["EmpName"].ToString());
        }
    }
}

EF4: Object Context consuming too much memory

I have a reporting tool that runs against an MS SQL Server using EF4. The bulk of the report involves looping over around 5,000 rows and then pulling numerous other rows for each one of these.
I pull the initial rows through one data context. The code that pulls the related rows uses another data context, wrapped in a using statement. It appears, though, that the memory consumed by the second data context is never freed, and usage shoots up to 1.5 GB before an out-of-memory exception is thrown.
Here is a snippet of the code so you can get the idea:
var outlets = (from o in db.tblOutlets
               where o.OutletType == 3
                     && o.tblCalls.Count() > number
                     && o.BelongsToUser.HasValue
                     && o.tblUser.Active == true
               select new { outlet = o, callcount = o.tblCalls.Count() })
              .OrderByDescending(p => p.callcount);

var outletcount = outlets.Count();
//var outletcount = 0;
//var average = outlets.Average(p => p.callcount);

foreach (var outlet in outlets)
{
    using (relenster_v2Entities db_2 = new relenster_v2Entities())
    {
        //loop over calls and add history
        //check the last time the history table was added to for this call
        var lastEntry = (from h in db_2.tblOutletDistributionHistories
                         where h.OutletID == outlet.outlet.OutletID
                         orderby h.VisitDate descending
                         select h).FirstOrDefault();
        DateTime? beginLooking = null;
I had hoped that by using a second data context, memory could be released after each iteration. It would appear it is not (or the GC is not running in time).
With the input from @adrift I altered the code so that the changes are saved after each iteration of the loop, rather than all at the end. It would appear that there is a limit (in my case anyway) of around 150,000 pending writes that the data context can happily hold before consuming too much memory.
By allowing it to write changes after each iteration, it managed memory more effectively; although it still seemed to use about as much, it no longer threw an exception.
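As a sketch of that change, reusing the names from the snippet above (AddHistoryEntry is a hypothetical placeholder for the per-outlet work that is elided in the question):

foreach (var outlet in outlets)
{
    using (var db_2 = new relenster_v2Entities())
    {
        var lastEntry = (from h in db_2.tblOutletDistributionHistories
                         where h.OutletID == outlet.outlet.OutletID
                         orderby h.VisitDate descending
                         select h).FirstOrDefault();

        // AddHistoryEntry stands in for the history inserts/updates done per outlet.
        AddHistoryEntry(db_2, outlet.outlet, lastEntry);

        // Save on every iteration so the context never accumulates
        // hundreds of thousands of pending, tracked changes.
        db_2.SaveChanges();
    }
}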

ADO.NET Entity Data Model is not precise enough

I run this code:
var cos = from k in _db.klienci_do_trasy where k.klient_id == 5 select k;
but the query sent to the database is:
SELECT * FROM `klienci_do_trasy`
LIMIT 0, 30
Why is that? There is no condition on klient_id.
What database are you using? If this is really the case, there could be an error in the provider.
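One way to see what the provider really generates is to inspect the store command and force the filtered query to execute; a sketch, assuming an EF4 ObjectContext so the LINQ query can be cast to an ObjectQuery:

var cos = from k in _db.klienci_do_trasy where k.klient_id == 5 select k;

// ObjectQuery.ToTraceString() shows the SQL that EF will generate for this query.
var objectQuery = cos as System.Data.Objects.ObjectQuery;
if (objectQuery != null)
{
    Console.WriteLine(objectQuery.ToTraceString());
}

// Enumerating the query here ensures the filtered statement runs, rather than whatever
// a debugger visualizer or grid preview issues (which can add its own LIMIT clause).
var results = cos.ToList();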