How do I get the ordinal of a column in a DataReader - ADO.NET

How can I find out if a column exists in a DataReader's result set?
I tried:
int columnOrdinal = reader.GetOrdinal("LastName");
columnExists = (columnOrdinal >= 0);
but GetOrdinal throws an exception if the column does not exist. My case is not exceptional. It's the opposite. It's...ceptional.
Note: Not related to my question, but the real reason I want to know if a column exists is that I want to get the ordinal position of a column without throwing an exception if the column doesn't exist:
int columnOrdinal = reader.GetOrdinal("Lastname");
Note: Not related to my question, but the real reason I want to know if a column exists is that I want to know if the column contains null:
itIsNull = reader.IsDBNull(reader.GetOrdinal("Lastname"));
Unfortunately, IsDBNull only takes an ordinal, and GetOrdinal throws an exception. So I'm left with:
if (ColumnExists(reader, "Lastname"))
{
itIsNull = reader.IsDBNull(reader.GetOrdinal("Lastname"));
}
else
itIsNull = false;
Note: Not related to my question, but the real reason I want to know if a column exists is that there will be times when the column is not present in the result set, and I don't want to throw an exception while processing database results, since it's not exceptional.

There is a limit to what you can do, since IDataReader doesn't expose much that helps. You can use the loop shown in the answer to a similar question, "Check for column name in a SqlDataReader object".
You could, with the first row you process, build a simple dictionary that is keyed by column name with ordinals as values (or a HashSet if you don't care about the ordinal values). Then you can just use columnDictionary.ContainsKey("LastName") as your test. You would only build the dictionary once, for the first row encountered, then all the subsequent rows would be fast.
But to be honest, compared with database time, the time consumed by using the solution from that other Stack Overflow question as-is would probably be negligible.
Edit: additional possibilities here: Checking to see if a column exists in a data reader
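The dictionary idea described above could be sketched roughly as follows. This is written in Java rather than C#, purely as an illustration of the technique: `ColumnOrdinals` and its methods are hypothetical names, and the list of column names stands in for what a loop over reader.FieldCount and reader.GetName(i) would yield in ADO.NET. Making the lookup case-insensitive is an assumption, not something the answer specifies.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;
import java.util.TreeMap;

// Hypothetical helper: maps column names to ordinals, built once per result set.
final class ColumnOrdinals {
    // Case-insensitive keys (an assumption made for this sketch).
    private final Map<String, Integer> ordinals =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    // columnNames stands in for the names read from the reader, in ordinal order.
    public ColumnOrdinals(List<String> columnNames) {
        for (int i = 0; i < columnNames.size(); i++) {
            // putIfAbsent keeps the first ordinal if a name appears twice.
            ordinals.putIfAbsent(columnNames.get(i), i);
        }
    }

    // Cheap existence test; never throws for a missing column.
    public boolean contains(String name) {
        return ordinals.containsKey(name);
    }

    // Returns the ordinal, or an empty OptionalInt instead of throwing.
    public OptionalInt find(String name) {
        Integer ordinal = ordinals.get(name);
        return ordinal == null ? OptionalInt.empty() : OptionalInt.of(ordinal);
    }

    public static void main(String[] args) {
        ColumnOrdinals columns =
                new ColumnOrdinals(List.of("Id", "FirstName", "LastName"));
        System.out.println(columns.contains("LastName"));
        System.out.println(columns.find("MiddleName").isPresent());
    }
}
```

The map is built once, so every later row pays only for a lookup, and a missing column becomes an ordinary "not found" result rather than an exception.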

Related

Azure Data Factory DataFlow md5 specific columns

First of all, I have my array of columns parameter called $array_merge_keys
$array_merge_keys = ['Column1', 'Column2', 'NoColumnInSomeCases']
I then want to hash them. If the third column, NoColumnInSomeCases, does not exist, I would like to treat it as null or substitute some other string for its value.
But when I use them with byNames(), it returns NULL because the last column does not exist, even though the first and second still have values. I would expect byNames($array_merge_keys) to always return a value so that I can hash them.
Since that problem cannot be solved, I fell back to filtering for only the columns that exist:
filter(columnNames('', true()), contains(['Column1', 'Column2', 'NoColumnInSomeCases'], #item_1 == #item)) => ['Column1', 'Column2']
But that leads to another problem: byNames() cannot be computed on the fly. It reports that 'byNames' does not accept column or argument parameters:
array(byNames(filter(columnNames('', true()), contains(['Column1', 'Column2', 'NoColumnInSomeCases'], #item_1 == #item))))
Spark job failed: { "text/plain":
"{"runId":"649f28bf-35af-4472-a170-1b6ece50c551","sessionId":"a26089f4-b0f4-4d24-8b79-d2a91a9c52af","status":"Failed","payload":{"statusCode":400,"shortMessage":"DF-EXPR-030
at Derive 'CreateTypeFromFile'(Line 35/Col 36): Column name function
'byNames' does not accept column or argument
parameters","detailedMessage":"Failure 2022-04-13 05:26:31.317
failed DebugManager.processJob,
run=649f28bf-35af-4472-a170-1b6ece50c551, errorMessage=DF-EXPR-030 at
Derive 'CreateTypeFromFile'(Line 35/Col 36): Column name function
'byNames' does not accept column or argument parameters"}}\n" } -
RunId: 649f28bf-35af-4472-a170-1b6ece50c551
I have tried lots of ways, even creating a new derived column (before that stream) to store ['Column1', 'Column2'], but it says that a column cannot be referenced within the byNames() function.
Is there an elegant solution?
It is true that byName() cannot evaluate with late binding. You need to either use a Select transformation to set the columns in the stream you wish to hash first or send in the column names via a parameter. Since that is "early column binding", byName() will work.
You can use a get metadata activity in the pipeline to inspect which columns are present in your source before calling the data flow, allowing you to send a pipeline parameter with just those columns you wish to hash.
Alternatively, you can create a new branch, use a select matching rule, then hash the row based on those columns.

Error in makeClassifTask - columns to join must specify "on="

I am getting an error here from makeClassifTask() in the mlr package.
task = makeClassifTask(data = data[,2:20441], target='Disease')
Entering this I get this error.
Provided data is not a pure data.frame but from class data.table, hence it will be converted.
Error in [.data.table(data, target) :
When i is a data.table (or character vector), the columns to join by must be specified using 'on=' argument (see ?data.table), by keying x (i.e. sorted, and, marked as sorted, see ?setkey), or by sharing column names between x and i (i.e., a natural join). Keyed joins might have further speed benefits on very large data due to x being sorted in RAM.
If someone could help me out it'd be great.
Given that you did not provide the data, I can only guess, and I suggest reading the documentation at https://mlr3book.mlr-org.com/tasks.html.
It looks like you left out the first column in your dataset which might be your target. Hence makeClassifTask() cannot find your target column.
As @Shreyash Gputa pointed out correctly, changing the data.table object to a data.frame object solves the issue:
task = makeClassifTask(data = as.data.frame(data[,2:20441]), target='Disease')
Given of course that data[,2:20441] contains the target variable Disease...

Power Query - Appending two tables but the other table might be empty depending on the situation - throws an error in that case

I am working on a solution that involves merging two queries in Power Query to retrieve a single data table back to Excel. The first query is always populated but the other query comes from an ERP and might be empty (empty table) from time to time.
Appending the two queries involves making the header names the same in the two queries before the appending takes place. As the second query sometimes results in an empty table, the error arises in the steps when Power Query is modifying the header names in the second table (it cannot modify the header names as there are no headers).
"Error message: Expression.Error: The column 'PartMtl_Company' of the table wasn't found.
Details: PartMtl_Company" where the PartMtl_Company is the leftmost column in my table.
I am kind of thinking that I would need to evaluate whether the second table is empty and skip the renaming steps if that is the case. I assume merging the populated first table with an empty table would cause no problem and would only result in the first table. I have tried to look around for a suitable M-code but have not come across such.
I'm thinking you might be able to use Table.RowCount to solve this. Something along the lines of:
= if Table.RowCount(Table2) > 0 then...
You would modify the headers only if there is data in the second table. Same goes for the appending of the tables: you would only append if there is data in the second table, since you won't have renamed any headers otherwise.
Thank you Marc! That did the trick.
In the end, I wrote something along the lines of
= if Table.RowCount(Table2) > 0 then... (code that works on a non-empty table) ...else Table2
This returns the empty table if it is empty to begin with. Appending the second table to the first did not throw an error and returned only the first table, as planned.

duplicate primary key in return table created by select union

I have the following query called searchit
SELECT 2 AS sourceID, BLOG_COMMENTS.bID, BLOG_TOPICS.Topic_Title,
BLOG_TOPICS.LFD, BLOG_TOPICS.LC,
BLOG_COMMENTS.Comment_Narrative
FROM BLOG_COMMENTS INNER JOIN BLOG_TOPICS
ON BLOG_COMMENTS.bID = BLOG_TOPICS.bID
WHERE (BLOG_COMMENTS.Comment_Narrative LIKE @Phrase)
This query executes AND returns the correct results in the query builder!
HOWEVER, the query needs to run in code-behind, so I have the following line:
DataTable blogcomments = btad.searchit(aphrase);
There are no null fields in any row of any column in EITHER of the tables. The tables are small enough I can easily detect null data. Note that bID is key for blog_topics and cID is key for blog comments.
In any case, when I run this I get the following error:
Failed to enable constraints. One or more rows contain values
violating non-null, unique, or foreign-key constraints.
Tables have a 1 x N relationship, many comments for each blog entry. IF I run the query with DISTINCT and remove the Comment_Narrative from the return fields, it returns data correctly (but I need the other rows!) However, when I return the other rows, I get the above error!
I think this tells me that there is a constraint on the return table that I did not put there, therefore it must somehow be inheriting that constraint from the call to the query itself because one of the tables happens to have a primary key defined (which it MUST have). But why does the query work fine in the querybuilder? The querybuilder does not care that bID is duped in the result (and it should not be), but the code-behind DOES care.
Addendum:
Just as tests,
I removed the bID from the return list and I still get the error.
I removed the primary key from blog_topics.bID and I get the same error.
This kinda tells me that it's not the fact that my bID is duped that is causing the problem.
Another test:
I went into the designer code (I know it's nasty, I'm just desperate).
I added the following:
// zzz
try
{
this.Adapter.Fill(dataTable);
}
catch ( global::System.Exception ex )
{
}
Oddly enough, when I run it, I get the same error as before AND it doesn't show the changes I've made in the error message:
Line 13909: }
Line 13910: BPLL_Dataset.BLOG_TOPICSDataTable dataTable = new BPLL_Dataset.BLOG_TOPICSDataTable();
Line 13911: this.Adapter.Fill(dataTable);
Line 13912: return dataTable;
Line 13913: }
I'm stumped.... Unless maybe it sees I'm not doing anything in the try catch and is optimizing for me.
Another addendum:
Suspecting that it was ignoring the test code I added to the designer, I added something to the catch. It produces the SAME error and acts like it does not see this code. (Well, okay, it DOES NOT see this code, because it prints out same as before into the browser.)
// zzz
try
{
this.Adapter.Fill(dataTable);
}
catch ( global::System.Exception ex )
{
System.Web.HttpContext.Current.Response.Redirect("errorpage.aspx");
}
The thing is, when I made the original post, I was ALREADY trying to do a work-around. I'm not sure how far I can afford to go down the rabbit hole. Maybe I read the whole mess into C# and do all the joins and crap myself. I really hate to do that, because I've only recently gotten out of the habit, but I perceive I'm making a good faith effort to use the tool the way God and Microsoft intended. From wit's end, tff.
You don't really show how you're running this query from C# ... but I'm assuming either as a straight text in a SqlCommand or it's being done by some ORM ... Have you attempted writing this query as a Stored Procedure and calling it that way? The stored Procedure would be easier to test and run by itself with sample data.
Given the fact that the error is mentioning null values I would presume that, if it is a problem with the query and not some other element of your code, then it'd have to be on one of the following fields:
BLOG_COMMENTS.bID
BLOG_TOPICS.bID
BLOG_COMMENTS.Comment_Narrative
If any of those fields are Nullable then you should be doing a COALESCE or an ISNULL on them before using them in any comparison or Join. It's situations like these which explain why most DBAs prefer to have as few nullable columns in tables as possible - they cause overhead and are prone to errors.
If that still doesn't fix your problem, then COALESCE/ISNULL all fields that are nullable and are being returned by this query. Take all null values out of the equation and just get the thing working and then, if you really need the null values to be null, go back through and remove the COALESCE/ISNULLs one at a time until you find the culprit.
My problem came from ignorance and a bit of dullness. I did not realize that just because a field is a key in the SQL table does not mean it has to be a key in the TableAdapter. If one has a key field defined in the SQL table and then creates a table adapter, the corresponding field in the adapter will also be a key by default. All I had to do was unset the key field in the TableAdapter and it worked.
Solution:
Select the key field in the adapter.
Right click
Select "Delete Key" (keeps the field, but removes the "key" icon)
That's it.

Is it correct that "ResultSet.getMetaData.getTableName(col)" of postgresql's jdbc driver is always returning an empty string?

When using PostgreSQL, I ran the following code:
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from t");
String tableName = rs.getMetaData().getTableName(1);
System.out.println(tableName);
It prints an empty string.
So I checked the source code, and found the method org.postgresql.jdbc2.AbstractJdbc2ResultSetMetaData#getTableName always returns an empty string.
The source code is:
public abstract class AbstractJdbc2ResultSetMetaData implements PGResultSetMetaData {
/*
* @param column the first column is 1, the second is 2...
* @return column name, or "" if not applicable
* @exception SQLException if a database access error occurs
*/
public String getTableName(int column) throws SQLException
{
return "";
}
}
You can see that it just returns "".
I found a discussion about this, please see: http://archives.postgresql.org/pgsql-jdbc/2009-12/msg00100.php
They think "rs.getMetaData.getTableName(col)" should return the alias name used in the query, not the underlying table name. But that is hard to implement, so they consider it better to leave it empty.
They also gave a way to get the underlying table name:
PGResultSetMetaData.getBaseTableName()
Sample:
ResultSet rs = stmt.executeQuery("select * from x");
// convert it to PGResultSetMetaData
PGResultSetMetaData meta = (PGResultSetMetaData)rs.getMetaData();
String tableName = meta.getBaseTableName(1);
Now it can print the correct table name.
I don't know whether PostgreSQL's implementation is correct, but returning the underlying table name is much more useful than an empty string, and most other databases provide the underlying table name instead of an empty string.
I have a problem using Play2's anorm framework with PostgreSQL: anorm doesn't work on PostgreSQL, but it works well on other databases.
What do you think is the correct implementation for PostgreSQL's JDBC driver: return an empty string, the underlying table name, or something else?
I would say that returning an empty string is obviously an incorrect implementation of the interface, since the table name could never be considered to be an empty string.
The problem that I believe they are struggling with is that while their current implementation is wrong, once they choose an implementation they will be stuck with it until they decide that breaking dependencies on the behaviour is acceptable. Therefore, they choose to add a method whose name is unambiguous and provide the data that most users were expecting to come from getTableName, and leave an obviously broken implementation of the getTableName method until some consensus is reached on what it should return or until a patch is submitted that implements the consensus.
My gut reaction is that the method getTableName should return the alias being used for that table. A table could be joined with itself, and using the alias would allow you to identify which was being referenced. A table might have been generated in the query (such as unnesting an array), and therefore not even have a table name in the database. If you make the decision “absolutely always, getTableName returns the alias”, then at least users know what to expect; otherwise, you end up with it not being obvious what the method should return.
However, even if I assume that my gut reaction is “the correct implementation”, it raises the issue of compatibility. It is desirable that it be possible to switch from another DBMS to PostgreSQL with as little investment as possible, if one of PostgreSQL’s goals is to grow in popularity. Therefore, things like “how do other JDBCs implement the java.sql interfaces?” become relevant. As you say, a framework exists that has expectations of how the ResultSetMetaData should be implemented, and it is likely not the only one with certain expectations of how java.sql interfaces will be implemented.
Whichever implementation they end up choosing is going to be a tradeoff, so I can see why “kick the can down the road” is their choice. Once they choose the tradeoff they want to make, they are locked in.
EDIT: I would suggest that throwing an exception regarding not implemented would be better than just silently failing. I expect that frameworks that rely on a specific implementation of getTableName will not have much use for empty string anyway, and either error or themselves fail silently.
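The fail-loudly behaviour suggested in the edit can also be approximated on the caller's side. Below is a minimal sketch, assuming only the standard java.sql API. The `TableNames` class and `requireTableName` method are hypothetical names, not part of any driver, and treating a null or empty result as an error is this sketch's own policy.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

// Hypothetical caller-side guard; not part of the PostgreSQL driver.
final class TableNames {
    private TableNames() {}

    // Returns the table name reported by the driver, or throws instead of
    // silently passing an empty string on to the rest of the application.
    public static String requireTableName(ResultSetMetaData meta, int column)
            throws SQLException {
        String name = meta.getTableName(column);
        if (name == null || name.isEmpty()) {
            throw new SQLFeatureNotSupportedException(
                    "Driver returned no table name for column " + column);
        }
        return name;
    }
}
```

A framework that depends on getTableName could call a guard like this once per column, so a driver that returns "" surfaces as an explicit error at the point of use.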