PostgreSQL query for this situation?

I am using the following statement:
select *
from table
where column1 in (groups)
where "groups" is a String array of size n.
If I use it as is, it won't execute. Can anyone suggest the exact query to do this?
EDIT 1
If I use the following code
try {
    System.out.println("before execute query");
    ps1.setArray(1, conn.createArrayOf("text", gs));
    ps1.setArray(2, conn.createArrayOf("text", gs));
    System.out.println("after execute query");
} catch (Exception e) {
    System.out.println("hrer----" + e);
}
First, it prints "before execute query", and then it throws the following exception:
javax.servlet.ServletException: servlet execution threw an exception
NOTE: It does not print "hrer----" from the catch (Exception e) block.

This should work:
PreparedStatement stmt = conn.prepareStatement(
"SELECT * FROM users WHERE username = any(?)");
String[] usernames = {"admin", "guest"};
stmt.setArray(1, conn.createArrayOf("varchar", usernames));
Credit goes to Boris's answer at https://stackoverflow.com/a/10240302

select * from table where column1 in (?, ?)
except that you have n question marks.
StringBuilder q = new StringBuilder("select * from table where column1 in (");
for (int i = 0; i < groups.length; i++) {
    q.append("?");
    if (i != groups.length - 1) {
        q.append(",");
    }
}
q.append(")");
PreparedStatement query = con.prepareStatement(q.toString());
for (int i = 1; i <= groups.length; i++) {
    query.setString(i, groups[i - 1]); // JDBC parameter indexes are 1-based
}
// executeQuery() actually runs the statement; getResultSet() alone would not execute it
ResultSet rs = query.executeQuery();
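As a variation on the loop above, the placeholder list can also be built with String.join and Collections.nCopies. This is only a sketch: the class name InClause, the helper names, and the table name my_table are made up for illustration. The table and column names are concatenated into the SQL text, so they must come from trusted code, never from user input. An empty array is rejected because "in ()" is not valid SQL.

```java
import java.util.Collections;

final class InClause {
    /** Returns n comma-separated placeholders, e.g. "?,?,?" for n = 3. */
    static String placeholders(int n) {
        if (n < 1) {
            // "in ()" is a syntax error, so refuse to build it
            throw new IllegalArgumentException("need at least one value");
        }
        return String.join(",", Collections.nCopies(n, "?"));
    }

    /** Builds the full statement text; table and column must be trusted identifiers. */
    static String selectIn(String table, String column, int n) {
        return "select * from " + table + " where " + column
                + " in (" + placeholders(n) + ")";
    }

    public static void main(String[] args) {
        String[] groups = {"a", "b", "c"}; // hypothetical values
        // prints: select * from my_table where column1 in (?,?,?)
        System.out.println(selectIn("my_table", "column1", groups.length));
    }
}
```

The resulting string is passed to prepareStatement, and the values are then bound one by one with setString, exactly as in the loop above.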

Related

How can I read 23 million records from postgres using JDBC? I have to read from a table in postgres and write to another table

When I write simple JPA code that calls findAll(), I run into memory issues. For writing, I can do batch updates. But how do I read 23 million records and save them in a list for storing into another table?

Java is a poor choice for processing "batch" stuff (and I love Java!).
Instead, do it using pure SQL:
insert into target_table (col1, col2, ...)
select col1, col2, ...
from ...
where ...
Or, if you must do some processing in Java that can't be done within the query, open a cursor for the query, read rows one at a time, and write each target row before reading the next one. This approach, however, will take a long time to finish.

I fully agree with Bohemian's answer.
If both the source and the destination tables are accessible from your program, you can read and write within the same loop,
in a try-catch block, something like:
PreparedStatement reader = null;
PreparedStatement writer = null;
ResultSet rs = null;
try {
    reader = sourceConnection.prepareStatement("select....");
    writer = destinationConnection.prepareStatement("insert into...");
    rs = reader.executeQuery();
    int chunksize = 10000; // this is your batch size; it depends on your system
    int counter = 0;
    while (rs.next()) {
        // writer.set....  for every field to insert, call the corresponding setter
        writer.addBatch();
        // pre-increment so a flush happens after each full chunk, not on the very first row
        if (++counter % chunksize == 0) {
            writer.executeBatch(); // returns an int[] of per-statement update counts
            System.out.println("wrote " + counter + " rows"); // a simple message to show progress
        }
    }
    // when finished, do not forget to flush the rest of the batch
    writer.executeBatch();
} catch (SQLException sqlex) {
    // report the error however you prefer
    System.out.println("SQLException: " + sqlex.getMessage());
} finally {
    try {
        if (rs != null) rs.close();
        if (reader != null) reader.close();
        if (writer != null) writer.close();
        // you probably want to close the connections as well
    } catch (SQLException e) {
        System.out.println("Exception while closing " + e.getMessage());
    }
}
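One detail worth checking in a loop like this is whether the counter is incremented before or after the modulo test. With a post-increment test (counter++ % chunksize == 0) the condition is already true for the very first row, so the first batch holds a single row; with a pre-increment test (++counter % chunksize == 0) a flush happens only after each full chunk. The sketch below simulates both variants without a database. The class and method names are made up for illustration, and the pending counter merely stands in for addBatch() and executeBatch().

```java
import java.util.ArrayList;
import java.util.List;

final class BatchFlushDemo {
    /**
     * Simulates the read/write loop and returns the size of every batch that
     * would be sent, including the final flush after the loop.
     */
    static List<Integer> batchSizes(int rows, int chunkSize, boolean preIncrement) {
        List<Integer> sizes = new ArrayList<>();
        int counter = 0;
        int pending = 0; // rows added since the last flush
        for (int row = 0; row < rows; row++) {
            pending++; // stands in for writer.addBatch()
            boolean flush = preIncrement
                    ? (++counter % chunkSize == 0)
                    : (counter++ % chunkSize == 0);
            if (flush) {
                sizes.add(pending); // stands in for writer.executeBatch()
                pending = 0;
            }
        }
        if (pending > 0) {
            sizes.add(pending); // the final flush after the loop
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(batchSizes(25, 10, false)); // prints [1, 10, 10, 4]
        System.out.println(batchSizes(25, 10, true));  // prints [10, 10, 5]
    }
}
```

Both variants write all 25 rows; they differ only in where the batch boundaries fall.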

Sometimes postgres does not return records when queried remotely from JDBC Client

public static ArrayList<String[]> getDailyRecords() throws ClassNotFoundException {
    ArrayList<String[]> list = new ArrayList<String[]>();
    String[] header = {"City", "Location", "Asset", "Number of Alerts", "Time spent in alerts", "Last seen temperature", "Limit"};
    list.add(header);
    String myDriver = "org.postgresql.Driver";
    Class.forName(myDriver);
    try (Connection conn = DriverManager.getConnection(ApplicationSettings.DATABASE_URL, ApplicationSettings.DATABASE_USER, ApplicationSettings.DATABASE_PASSWORD);
         Statement st = conn.createStatement()) {
        conn.setAutoCommit(true);
        ResultSet rs = null;
        st.setFetchSize(200);
        String dailyQuery = "select * from sch.reports();";
        rs = st.executeQuery(dailyQuery);
        while (rs.next()) {
            String[] ar = new String[7]; // note: ar[6] ("Limit") is never filled in
            ar[0] = rs.getString("location");
            ar[1] = rs.getString("sublocation");
            ar[2] = rs.getString("zone");
            ar[3] = rs.getString("total_count");
            ar[4] = rs.getString("period");
            String temp = rs.getString("temparature"); // spelled as in the function's column alias
            ar[5] = temp;
            list.add(ar);
        }
        conn.close();
        st.close();
        rs.close();
    } catch (Exception e) {
        e.printStackTrace();
    } catch (Error e) {
        e.printStackTrace();
    } finally {
        if (list.size() == 1) {
            System.out.println("NO records found");
        } else {
            System.out.println("Found some records");
        }
        return list;
    }
}
I have a SQL function which does return records when queried locally from Postgres clients. I invoke this function at scheduled times from a Java application. This worked fine for a few months. But all of a sudden, st.executeQuery() occasionally returns an empty result set: about 4 out of 10 calls to executeQuery() in a day return an empty result set.
Things I have tried:
Made sure the query always returns some records. It should return 60 records.
Captured Errors and Exceptions. No errors were seen.
Made sure the JDBC connection is closed.
Debugged the Postgres JDBC driver classes. Found that when the result set is empty, no data was received from the Postgres data stream.
Can the size of the data in the tables influence the query response?
Any help is kindly appreciated!
Update: SQL function definition added.
DECLARE
    exception_error_code text;
    exception_message text;
    exception_detail text;
    exception_hint text;
    exception_context text;
BEGIN
    RETURN QUERY
    select l.name as location,sl.name as sublocation,z.name as zone,al.lim as lim,count(al.id) total_count,sum(al.timestamp_to-al.timestamp_from) period,t2.temperature temparature
    from map.location l
    left join map.sub_location sl
        on l.id=sl.location_id
    left join map.zone z
        on sl.id=z.sub_location_id
    left join map.subzone sz
        on z.id=sz.zone_id
    inner join
        (select *,(info::json -> 'breach_type')::text as lim from alr.live where date(timestamp_from)= date(now() + INTERVAL '5 hours 30 minutes')) al
        on sz.id= al.sub_zone_id
    left join
        (select subzone_id,max(nd_timestamp) as nd_timestamp from tel.temperature_sh group by subzone_id) t1
        on al.sub_zone_id=t1.subzone_id
    left join tel.temperature_sh t2
        on t1.subzone_id=t2.subzone_id
        and t1.nd_timestamp=t2.nd_timestamp
    group by l.name ,sl.name,z.name,t2.temperature , al.lim
    order by l.name ,sl.name,z.name;
-- if something "breaks" do the following
EXCEPTION WHEN others THEN
    get stacked diagnostics
        exception_error_code = RETURNED_SQLSTATE
        ,exception_message = MESSAGE_TEXT
        ,exception_detail = PG_EXCEPTION_DETAIL
        ,exception_hint = PG_EXCEPTION_HINT
        ,exception_context = PG_EXCEPTION_CONTEXT
    ;
    -- log exception for debugging
    PERFORM public.insert_db_exception(
        exception_error_code
        ,exception_message
        ,exception_detail
        ,exception_hint
        ,exception_context
    );
END;

How do I replicate %Dictionary.ClassDefinitionQuery's Summary() in SQL?

There is a procedure in %Dictionary.ClassDefinitionQuery which lists a class summary; in Java I call it like this:
public void readClasses(final Path dir)
    throws SQLException
{
    final String call
        = "{ call %Dictionary.ClassDefinitionQuery_Summary() }";
    try (
        final CallableStatement statement = connection.prepareCall(call);
        final ResultSet rs = statement.executeQuery();
    ) {
        String className;
        int count = 0;
        while (rs.next()) {
            // Skip if System is not 0
            if (rs.getInt(5) != 0)
                continue;
            className = rs.getString(1);
            // Also skip if the name starts with a %
            if (className.charAt(0) == '%')
                continue;
            //System.out.println(className);
            count++;
        }
        System.out.println("read: " + count);
    }
}
In namespace SAMPLES this returns 491 rows.
I try and replicate it with a pure SQL query like this:
private void listClasses(final Path dir)
    throws SQLException
{
    final String query = "select id, super"
        + " from %Dictionary.ClassDefinition"
        + " where System = '0' and name not like '\\%%' escape '\\'";
    try (
        final PreparedStatement statement
            = connection.prepareStatement(query);
        final ResultSet rs = statement.executeQuery();
    ) {
        int count = 0;
        while (rs.next()) {
            //System.out.println(rs.getString(1) + ';' + rs.getString(2));
            count++;
        }
        System.out.println("list: " + count);
    }
}
Yet when running the program I get this:
list: 555
read: 491
Why are the results different?
I have looked at the code of %Dictionary.ClassDefinitionQuery but I don't understand why it gives different results. All I know, from storing the names in sets and comparing them, is that:
nothing that is in read is missing from list;
most, but not all, of the classes returned by list that are not in read are CSP pages.
But that's it.
How can I replicate the behaviour of the summary procedure in SQL?
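The set comparison mentioned above can be sketched like this. The class name ClassNameDiff and the sample class names are made up for illustration; in the real program the two sets would be filled from the two result sets.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class ClassNameDiff {
    /** Returns the names that are in first but not in second, sorted. */
    static Set<String> onlyInFirst(Set<String> first, Set<String> second) {
        Set<String> diff = new TreeSet<>(first); // copy, so the inputs stay untouched
        diff.removeAll(second);
        return diff;
    }

    public static void main(String[] args) {
        // Hypothetical names standing in for the results of list and read
        Set<String> list = new TreeSet<>(Arrays.asList("Sample.Person", "Sample.Company", "csp.menu"));
        Set<String> read = new TreeSet<>(Arrays.asList("Sample.Person", "Sample.Company"));
        System.out.println("only in list: " + onlyInFirst(list, read)); // prints only in list: [csp.menu]
        System.out.println("only in read: " + onlyInFirst(read, list)); // prints only in read: []
    }
}
```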

The difference is in one property: %Dictionary.ClassDefinitionQuery_Summary shows only classes with Deployed <> 2. So the SQL must be:
select id,super from %Dictionary.ClassDefinition where deployed <> 2
Another reason the count may differ is that such SQL requests may be compiled into temporary classes, for example "%sqlcq.SAMPLES.cls22".

Using the TSqlParser

I'm attempting to parse SQL using the TSql100Parser provided by Microsoft. Right now I'm having a little trouble using it the way it seems intended to be used. The lack of documentation doesn't help either. (Example: http://msdn.microsoft.com/en-us/library/microsoft.data.schema.scriptdom.sql.tsql100parser.aspx )
When I run a simple SELECT statement through the parser it returns a collection of TSqlStatements which contains a SELECT statement.
Trouble is, the TSqlSelect statement doesn't contain attributes such as a WHERE clause, even though the clause is implemented as a class. http://msdn.microsoft.com/en-us/library/microsoft.data.schema.scriptdom.sql.whereclause.aspx
The parser does recognise the WHERE clause as such, looking at the token stream.
So, my question is, am I using the parser correctly? Right now the token stream seems to be the most useful feature of the parser...
My Test project:
public static void Main(string[] args)
{
    var parser = new TSql100Parser(false);
    IList<ParseError> Errors;
    IScriptFragment result = parser.Parse(
        new StringReader("Select col from T1 where 1 = 1 group by 1;" +
            "select col2 from T2;" +
            "select col1 from tbl1 where id in (select id from tbl);"),
        out Errors);
    var Script = result as TSqlScript;
    foreach (var ts in Script.Batches)
    {
        Console.WriteLine("new batch");
        foreach (var st in ts.Statements)
        {
            IterateStatement(st);
        }
    }
}
static void IterateStatement(TSqlStatement statement)
{
    Console.WriteLine("New Statement");
    // 'as' yields null when the statement is not a SelectStatement
    var sstmnt = statement as SelectStatement;
    if (sstmnt != null)
    {
        PrintStatement(sstmnt);
    }
}

Yes, you are using the parser correctly.
As Damien_The_Unbeliever points out, within the SelectStatement there is a QueryExpression property which will be a QuerySpecification object for your third select statement (with the WHERE clause).
This represents the 'real' SELECT part of the query, whereas the outer SelectStatement object you are looking at holds only the WITH clause (for CTEs), the FOR clause (for XML), ORDER BY, and other bits.
The QuerySpecification object is the object with the FromClauses, WhereClause, GroupByClause etc.
So you can get to your WHERE Clause by using:
((QuerySpecification)((SelectStatement)statement).QueryExpression).WhereClause
which has a SearchCondition property etc. etc.

A quick glance around indicates that the SelectStatement contains a QueryExpression, which could be a QuerySpecification, which does have the WHERE clause attached to it.

If someone lands here and wants to know how to get all the elements of a select statement, the following code shows how:
QuerySpecification spec = (QuerySpecification)(((SelectStatement)st).QueryExpression);
StringBuilder sb = new StringBuilder();
sb.AppendLine("Select Elements");
foreach (var elm in spec.SelectElements)
    sb.Append(((Identifier)((Column)((SelectColumn)elm).Expression).Identifiers[0]).Value);
sb.AppendLine();
sb.AppendLine("From Elements");
foreach (var elm in spec.FromClauses)
    sb.Append(((SchemaObjectTableSource)elm).SchemaObject.BaseIdentifier.Value);
sb.AppendLine();
sb.AppendLine("Where Elements");
BinaryExpression binaryexp = (BinaryExpression)spec.WhereClause.SearchCondition;
sb.Append("operator is " + binaryexp.BinaryExpressionType);
if (binaryexp.FirstExpression is Column)
    sb.Append(" First exp is " + ((Identifier)((Column)binaryexp.FirstExpression).Identifiers[0]).Value);
if (binaryexp.SecondExpression is Literal)
    sb.Append(" Second exp is " + ((Literal)binaryexp.SecondExpression).Value);

I had to split a SELECT statement into pieces. My goal was to COUNT how many records a query would return. My first solution was to build a subquery such as
SELECT COUNT(*) FROM (select id, name from T where cat='A' order by id) as QUERY
The problem was that in this case the ORDER BY clause raises the error "The ORDER BY clause is not valid in views, inline functions, derived tables, sub-queries, and common table expressions, unless TOP or FOR XML is also specified".
So I built a parser that splits a SELECT statement into fragments using the TSql100Parser class.
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.Data.Schema.ScriptDom;
using Microsoft.Data.Schema.ScriptDom.Sql;
...
public class SelectParser
{
    public string Parse(string sqlSelect, out string fields, out string from, out string groupby, out string where, out string having, out string orderby)
    {
        TSql100Parser parser = new TSql100Parser(false);
        TextReader rd = new StringReader(sqlSelect);
        IList<ParseError> errors;
        var fragments = parser.Parse(rd, out errors);
        fields = string.Empty;
        from = string.Empty;
        groupby = string.Empty;
        where = string.Empty;
        orderby = string.Empty;
        having = string.Empty;
        if (errors.Count > 0)
        {
            var retMessage = string.Empty;
            foreach (var error in errors)
            {
                retMessage += error.Identifier + " - " + error.Message + " - position: " + error.Offset + "; ";
            }
            return retMessage;
        }
        try
        {
            // Extract the query assuming it is a SelectStatement
            var query = ((fragments as TSqlScript).Batches[0].Statements[0] as SelectStatement).QueryExpression;
            // Construct the FROM clause with the optional joins
            from = (query as QuerySpecification).FromClauses[0].GetString();
            // Extract the WHERE clause
            where = (query as QuerySpecification).WhereClause.GetString();
            // Get the field list
            var fieldList = new List<string>();
            foreach (var f in (query as QuerySpecification).SelectElements)
                fieldList.Add((f as SelectColumn).GetString());
            fields = string.Join(", ", fieldList.ToArray());
            // Get the GROUP BY clause
            groupby = (query as QuerySpecification).GroupByClause.GetString();
            // Get the HAVING clause of the query
            having = (query as QuerySpecification).HavingClause.GetString();
            // Get the ORDER BY clause
            orderby = ((fragments as TSqlScript).Batches[0].Statements[0] as SelectStatement).OrderByClause.GetString();
        }
        catch (Exception ex)
        {
            return ex.ToString();
        }
        return string.Empty;
    }
}
public static class Extension
{
    /// <summary>
    /// Get a string representing the SQL source fragment
    /// </summary>
    /// <param name="statement">The SQL statement to get the string from; can be any derived class</param>
    /// <returns>The SQL that represents the object, or an empty string for a missing (null) clause</returns>
    public static string GetString(this TSqlFragment statement)
    {
        string s = string.Empty;
        if (statement == null) return string.Empty;
        for (int i = statement.FirstTokenIndex; i <= statement.LastTokenIndex; i++)
        {
            s += statement.ScriptTokenStream[i].Text;
        }
        return s;
    }
}
To use this class, simply:
string fields, from, groupby, where, having, orderby;
SelectParser selectParser = new SelectParser();
var retMessage = selectParser.Parse("SELECT * FROM T where cat='A' Order by Id desc",
    out fields, out from, out groupby, out where, out having, out orderby);

MARS vs NextResult

I rehydrate my business objects by collecting data from multiple tables, e.g.,
SELECT * FROM CaDataTable;
SELECT * FROM NyDataTable;
SELECT * FROM WaDataTable;
and so on...
(C# 3.5, SQL Server 2005)
I have been using batches:
void BatchReader()
{
    // Each statement ends with ';' so the concatenated batch stays valid SQL
    string sql = "Select * From CaDataTable;" +
                 "Select * From NyDataTable;" +
                 "Select * From WaDataTable;";
    string connectionString = GetConnectionString();
    using (SqlConnection conn = new SqlConnection(connectionString)) {
        conn.Open();
        SqlCommand cmd = new SqlCommand(sql, conn);
        using (SqlDataReader reader = cmd.ExecuteReader()) {
            do {
                while (reader.Read()) {
                    ReadRecords(reader);
                }
            } while (reader.NextResult());
        }
    }
}
I've also used multiple commands against the same connection:
void MultipleCommandReader()
{
    string connectionString = GetConnectionString();
    string sql;
    SqlCommand cmd;
    using (SqlConnection conn = new SqlConnection(connectionString)) {
        conn.Open();
        sql = "Select * From CaDataTable";
        cmd = new SqlCommand(sql, conn);
        using (SqlDataReader reader = cmd.ExecuteReader()) {
            while (reader.Read()) {
                ReadRecords(reader);
            }
        }
        sql = "Select * From NyDataTable";
        cmd = new SqlCommand(sql, conn);
        using (SqlDataReader reader = cmd.ExecuteReader()) {
            while (reader.Read()) {
                ReadRecords(reader);
            }
        }
        sql = "Select * From WaDataTable";
        cmd = new SqlCommand(sql, conn);
        using (SqlDataReader reader = cmd.ExecuteReader()) {
            while (reader.Read()) {
                ReadRecords(reader);
            }
        }
    }
}
Is one of these techniques significantly better than the other?
Also, would there be a gain if I use MARS on the second method? In other words, is it as simple as setting MultipleActiveResultSets=True in the connection string and reaping a big benefit?

If the data structure is the same in each table, I would do:
Select *, 'Ca' Source From CaDataTable
union all
Select *, 'Ny' Source From NyDataTable
union all
Select *, 'Wa' Source From WaDataTable

Without actually timing the two versions against one another, you can only speculate.
I would bet that version 1 (BatchReader) will be faster, since it needs only one round-trip to the database. Version 2 requires three distinct round-trips, one for each query you execute.
But again: you can only really tell if you measure.
Marc
Oh, PS: of course in a real-life scenario it would also help to limit the columns returned, e.g. don't use SELECT * but instead use SELECT (list of fields) and keep that list of fields as short as possible.