How can I read 23 million records from postgres using JDBC? I have to read from a table in postgres and write to another table

When I write simple JPA code that calls findAll(), I run into memory issues. For writing, I can do batch updates. But how do I read 23 million records and keep them in a list for storing into another table?

Java is a poor choice for processing "batch" stuff (and I love java!).
Instead, do it using pure SQL:
insert into target_table (col1, col2, ...)
select col1, col2, ....
from ...
where ...
Or, if you must do some processing in Java that can't be done within the query, open a cursor for the query, read rows one at a time, and write each target row before reading the next one. This approach, however, will take a long time to finish.

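I fully agree with Bohemian's answer.
If the source and the destination tables are reached through separate connections (sourceConnection and destinationConnection in the code here), you can read and write within the same loop,
in a try-catch block, something like: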
PreparedStatement reader = null;
PreparedStatement writer = null;
ResultSet rs = null;
try {
    int chunksize = 10000; // this is your batch size, depends on your system
    // With the PostgreSQL JDBC driver, rows are only streamed through a cursor when
    // autocommit is off and a fetch size is set; otherwise the whole result is held in memory
    sourceConnection.setAutoCommit(false);
    reader = sourceConnection.prepareStatement("select....");
    reader.setFetchSize(chunksize);
    writer = destinationConnection.prepareStatement("insert into...");
    rs = reader.executeQuery();
    int counter = 0;
    while (rs.next()) {
        writer.set.... // do for every field to insert the corresponding set
        writer.addBatch();
        if (++counter % chunksize == 0) { // flush after every full chunk
            writer.executeBatch();
            System.out.println("wrote " + counter + " rows"); // a simple message to see progress
        }
    }
    // when finished, do not forget to flush the rest of the batch job
    writer.executeBatch();
} catch (SQLException sqlex) {
    // an error message to your taste
    System.out.println("SQLException: " + sqlex.getMessage());
} finally {
    try {
        if (rs != null) rs.close();
        if (reader != null) reader.close();
        if (writer != null) writer.close();
        // you probably want to close the connections as well
    } catch (SQLException e) {
        System.out.println("Exception while closing " + e.getMessage());
    }
}
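The flush-every-N logic of the loop above can be tried without a database. Here is a minimal sketch, assuming an in-memory iterator in place of the ResultSet and a callback in place of the batched PreparedStatement; ChunkedCopy and copyInChunks are hypothetical names made up for this illustration, not part of JDBC.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: stands in for the ResultSet / PreparedStatement pair.
final class ChunkedCopy {

    // Reads rows one at a time and hands them to the sink in chunks of at most
    // chunkSize rows. Returns the total number of rows copied.
    static <T> long copyInChunks(Iterator<T> source, int chunkSize, Consumer<List<T>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<T> chunk = new ArrayList<>(chunkSize);
        long total = 0;
        while (source.hasNext()) {
            chunk.add(source.next());                // plays the role of writer.addBatch()
            total++;
            if (chunk.size() == chunkSize) {
                sink.accept(new ArrayList<>(chunk)); // plays the role of writer.executeBatch()
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            sink.accept(new ArrayList<>(chunk));     // flush the incomplete last chunk
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6, 7);
        List<Integer> chunkSizes = new ArrayList<>();
        long copied = copyInChunks(rows.iterator(), 3, chunk -> chunkSizes.add(chunk.size()));
        System.out.println(copied + " rows in chunks of " + chunkSizes); // 7 rows in chunks of [3, 3, 1]
    }
}
```

Only one chunk is held in memory at a time, which is the point of the approach: the 23 million rows never sit in a single list.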

Related

Use raw SQL to insert a list in postgresql using Entity Framework

I want to insert a list of data into a table in Postgresql using Entity Framework.
Can anyone help by suggesting the syntax or an algo for that?
Like @Nick said, raw SQL isn't ideal, especially if you are inserting 100k values. There are plenty of options when it comes to bulk insert; I specifically use BulkInsert and find it much easier. Example below:
using (var dbContextTransaction = _Context.Database.BeginTransaction())
{
    try
    {
        // bulk insert list of user objects in users table
        _Context.BulkInsert(list_of_users, options =>
        {
            options.InsertIfNotExists = true;
            options.ColumnPrimaryKeyExpression = x => new { x.id };
        });
        dbContextTransaction.Commit();
    }
    catch (Exception)
    {
        dbContextTransaction.Rollback();
        throw;
    }
}
But if you are sticking to the hard route of building the raw SQL query and executing it, here is how I would do it. I would divide the list into batches of 10k to avoid memory issues and run the following for each batch:
// Note: values are interpolated into the SQL text as-is, so this is only safe for trusted numeric data;
// string values would need quoting/escaping or query parameters
var insert_query = "INSERT INTO TABLE (Column_1,Column_2,Column_3) VALUES ";
int count = 0;
foreach (var obj in list_data)
{
    if (count > 0)
        insert_query += ",";
    insert_query += $"({obj.param1},{obj.param2},{obj.param3})";
    count++;
}
insert_query += ";";
int rows_affected = _Context.Database.ExecuteSqlRaw(insert_query);

cursor count is 1 whereas the table has 3 rows

I'm trying to populate a SQL table and then retrieve data from it. Following is my code.
public void addQuestion(Question quest)
{
    int id = 1;
    ContentValues values = new ContentValues();
    SQLiteDatabase db = this.getWritableDatabase();
    db.execSQL("DROP TABLE IF EXISTS " + TABLE_QUEST1);
    onCreate(db);
    values.put(KEY_QUES, quest.getQuestion());
    values.put(KEY_ANSWER, quest.getAnswer());
    values.put(KEY_OPTA, quest.getOptA());
    values.put(KEY_OPTB, quest.getOptB());
    values.put(KEY_OPTC, quest.getOptC());
    db.insert(TABLE_QUEST1, null, values);
    System.out.println("Added in database: " + quest.getQuestion());
}
public ArrayList<Question> getAllQuestions() {
    System.out.println("getting rows 1");
    ArrayList<Question> quesList = new ArrayList<Question>();
    System.out.println("getting rows 2");
    Cursor cursor = null;
    SQLiteDatabase db = getReadableDatabase();
    System.out.println("getting rows ");
    cursor = db.rawQuery("SELECT * FROM " + TABLE_QUEST1, null);
    if (!cursor.moveToFirst()) {
        System.out.println("No data in the database ");
    } else {
        System.out.println("theres data in the database ");
        quesList = new ArrayList<Question>();
        do {
            System.out.print("total rows " + cursor.getCount());
            Question quest = new Question();
            quest.setID(cursor.getInt(0));
            quest.setQuestion(cursor.getString(1));
            quest.setAnswer(cursor.getString(2));
            quest.setOptA(cursor.getString(3));
            quest.setOptB(cursor.getString(4));
            quest.setOptC(cursor.getString(5));
            quesList.add(quest);
        } while (cursor.moveToNext());
        cursor.close();
    }
}
I have 4 rows of data in my table, and I can see that from the print statement "Added in database",
but when I actually read it, the cursor just reads row 1 and leaves the while loop. What could be wrong?
Thanks in advance.
Your code is fine except for the DROP command in addQuestion, which runs on every insert. As mentioned in the earlier comments, avoid calling the drop query each time and you'll get the expected result.
As Santosh has pointed out, dropping the table (db.execSQL("DROP TABLE IF EXISTS " + TABLE_QUEST1);) and then re-creating it (onCreate(db);) removes any rows that had previously been added to the table.
As such, it's simply a matter of removing those two lines of code. There also appears to be no need for the line int id = 1;, so perhaps remove this as well, as per:
public void addQuestion(Question quest)
{
    ContentValues values = new ContentValues();
    SQLiteDatabase db = this.getWritableDatabase();
    values.put(KEY_QUES, quest.getQuestion());
    values.put(KEY_ANSWER, quest.getAnswer());
    values.put(KEY_OPTA, quest.getOptA());
    values.put(KEY_OPTB, quest.getOptB());
    values.put(KEY_OPTC, quest.getOptC());
    db.insert(TABLE_QUEST1, null, values);
    System.out.println("Added in database: " + quest.getQuestion());
}
P.S. You may consider not using hard-coded column offsets but instead obtaining the offsets by column name with the Cursor method getColumnIndex(column_name), e.g.:
Question quest = new Question();
quest.setID(cursor.getInt(cursor.getColumnIndex("name_of_your_id_column")));
quest.setQuestion(cursor.getString(cursor.getColumnIndex(KEY_QUES)));
quest.setAnswer(cursor.getString(cursor.getColumnIndex(KEY_ANSWER)));
quest.setOptA(cursor.getString(cursor.getColumnIndex(KEY_OPTA)));
quest.setOptB(cursor.getString(cursor.getColumnIndex(KEY_OPTB)));
quest.setOptC(cursor.getString(cursor.getColumnIndex(KEY_OPTC)));
quesList.add(quest);
Note that instead of "name_of_your_id_column" you may have something like KEY_ID defined; if so, use that. Having a single definition reduces the chance of inadvertently misspelling column names or miscalculating the offsets.

How can I convert bytea to base64 in Postgres

I am facing a problem converting bytea to Base64. I saved the image with the query below;
user_profile_pic is defined as bytea in the table.
Update user_profile_pic
Set user_profile_pic = (profilepic::bytea)
Where userid = userid;
After that, I ran the queries below:
case 1:
SELECT user_profile_pic
FROM user_profile_pic;
It returns exactly what I stored, but after passing through the service it is displayed in a byte format.
case 2:
Select encode(user_profile_pic::bytea, 'base64')
FROM user_profile_pic;
It returns a totally different result.
I want the result of case 1 to come through the service as well. How can I do that?
This is working for me. The query did not work when written as a procedure/function, so I wrote it directly in the code-behind:
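conn.Open();
// Note: concatenating CustID into the SQL text is open to SQL injection; prefer a command parameter
NpgsqlCommand command = new NpgsqlCommand("SELECT profile_pic FROM userlog WHERE cust_id = '" + CustID + "'", conn);
// "as" yields null when no row is found or the column is NULL, instead of throwing
Byte[] result = command.ExecuteScalar() as Byte[];
if (result != null && result.Length > 0)
{
    ProfilePicture = Convert.ToBase64String(result);
    ErrorNumber = 0;
    ErrorMessage = "Successful operation";
}
else
{
    ErrorNumber = 1;
}
conn.Close();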
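As far as I can tell, the two query results in the question are the same bytes in two different text encodings: by default PostgreSQL prints bytea in its hex format (\x followed by two hex digits per byte), while encode(..., 'base64') prints Base64. A minimal sketch of that difference, written in Java purely for illustration; ByteaEncodings, toPgHex and toBase64 are hypothetical names, and the four sample bytes are just the start of a PNG file.

```java
import java.util.Base64;

// Hypothetical helper showing two text renderings of the same bytes.
final class ByteaEncodings {

    // Mimics PostgreSQL's default "hex" bytea output: \x plus two hex digits per byte.
    static String toPgHex(byte[] data) {
        StringBuilder sb = new StringBuilder("\\x");
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Plain Base64 without line breaks, which is what a client usually wants for images.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] pngMagic = {(byte) 0x89, 'P', 'N', 'G'};
        System.out.println(toPgHex(pngMagic));  // \x89504e47
        System.out.println(toBase64(pngMagic)); // iVBORw==
    }
}
```

So doing the Base64 conversion on the raw bytes in application code, as in the answer above, gives a single unbroken string. PostgreSQL's own base64 output additionally wraps long values with newlines, which can make the two look different even when they encode the same data.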

How do I replicate %Dictionary.ClassDefinitionQuery's Summary() in SQL?

There is a procedure in %Dictionary.ClassDefinitionQuery which lists a class summary; in Java I call it like this:
public void readClasses(final Path dir)
    throws SQLException
{
    final String call
        = "{ call %Dictionary.ClassDefinitionQuery_Summary() }";
    try (
        final CallableStatement statement = connection.prepareCall(call);
        final ResultSet rs = statement.executeQuery();
    ) {
        String className;
        int count = 0;
        while (rs.next()) {
            // Skip if System is not 0
            if (rs.getInt(5) != 0)
                continue;
            className = rs.getString(1);
            // Also skip if the name starts with a %
            if (className.charAt(0) == '%')
                continue;
            //System.out.println(className);
            count++;
        }
        System.out.println("read: " + count);
    }
}
In namespace SAMPLES this returns 491 rows.
I try and replicate it with a pure SQL query like this:
private void listClasses(final Path dir)
    throws SQLException
{
    final String query = "select id, super"
        + " from %Dictionary.ClassDefinition"
        + " where System = '0' and name not like '\\%%' escape '\\'";
    try (
        final PreparedStatement statement
            = connection.prepareStatement(query);
        final ResultSet rs = statement.executeQuery();
    ) {
        int count = 0;
        while (rs.next()) {
            //System.out.println(rs.getString(1) + ';' + rs.getString(2));
            count++;
        }
        System.out.println("list: " + count);
    }
}
Yet when running the program I get this:
list: 555
read: 491
Why are the results different?
I have looked at the code of %Dictionary.ClassDefinitionQuery but I don't understand why it gives different results... All I know, if I store the names in sets and compare, is that:
nothing is missing from list that is in read;
most, but not all, classes returned by list which are not in read are CSP pages.
But that's it.
How can I replicate the behaviour of the summary procedure in SQL?
The difference is in one property: %Dictionary.ClassDefinitionQuery_Summary shows only classes with Deployed <> 2. So the SQL must be:
select id,super from %Dictionary.ClassDefinition where deployed <> 2
One more reason why the count may differ: such SQL requests may be compiled into temporary classes, for example "%sqlcq.SAMPLES.cls22".

Postgresql query for this situation?

I am using the following statement
select *
from table
where column1 in(groups)
where "groups" is a String array of size n.
If I use it as it is, it won't execute. Can anyone suggest the exact query to perform this?
EDIT 1
If I use the following code
try{
System.out.println("before execute query");
ps1.setArray(1,conn.createArrayOf("text",gs));
ps1.setArray(2,conn.createArrayOf("text",gs));
System.out.println("after execute query");
}
catch(Exception e)
{
System.out.println("hrer----"+e);
}
First, it prints "before execute query" and then it throws the following exception:
javax.servlet.ServletException: servlet execution threw an exception
NOTE: It does not print "hrer----" from the catch(Exception e) block.
This should work:
PreparedStatement stmt = conn.prepareStatement(
"SELECT * FROM users WHERE username = any(?)");
String[] usernames = {"admin", "guest"};
stmt.setArray(1, conn.createArrayOf("varchar", usernames));
Credit goes to Boris's answer at https://stackoverflow.com/a/10240302
Use a prepared statement of the form
select * from table where column1 in (?, ?)
except that you have n question marks.
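// build "in (?,?,...)" with one placeholder per element of groups
StringBuilder q = new StringBuilder("select * from table where column1 in (");
for (int i = 0; i < groups.length; i++) {
    q.append("?");
    if (i != groups.length - 1) {
        q.append(",");
    }
}
q.append(")");
PreparedStatement query = con.prepareStatement(q.toString());
// JDBC parameters are 1-based, the array is 0-based
for (int i = 1; i <= groups.length; i++) {
    query.setString(i, groups[i - 1]);
}
// the statement has to be executed to obtain a result set
ResultSet rs = query.executeQuery();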
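The placeholder list can also be built without the explicit loop. A small sketch of that, assuming the same table and column names as above; InClause, placeholders and buildQuery are hypothetical names made up for this example.

```java
import java.util.Collections;

// Hypothetical helper for building "in (?,?,...)" queries.
final class InClause {

    // Returns n question marks separated by commas, e.g. "?,?,?" for n = 3.
    static String placeholders(int n) {
        if (n <= 0) {
            // "in ()" is not valid SQL in PostgreSQL, so an empty array is rejected here
            throw new IllegalArgumentException("need at least one value");
        }
        return String.join(",", Collections.nCopies(n, "?"));
    }

    // Builds the full query text with n placeholders.
    static String buildQuery(int n) {
        return "select * from table where column1 in (" + placeholders(n) + ")";
    }

    public static void main(String[] args) {
        String[] groups = {"a", "b", "c"};
        System.out.println(buildQuery(groups.length)); // select * from table where column1 in (?,?,?)
    }
}
```

The resulting string goes into con.prepareStatement(...) and the values are then bound with setString as in the loop above. The empty-array case needs separate handling by the caller, since there is nothing sensible to put between the parentheses.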