Last rows missing in MySQL table - Talend

I am using Talend to load a table (MySQL), but there are records missing from the table. I've looked into the Talend job with tLogRow and it says I have all 3650 records, while in my SQL table I find only 3600. Does anyone have an idea why?

Related

Cloudant/Db2 - How to determine if a database table row was read from?

I have two databases - Cloudant and IBM Db2. I have a table in each of these databases that holds static data which is only read from and never updated. These tables were created a long time ago and I'm not sure whether they are still used today, so I would like to do a clean-up.
I want to determine whether these tables, or rows from these tables, are still being read from.
Is there a way to record the read timestamp (or at least a flag that marks a row as accessed, like a dirty bit) on a row of the table when it is read?
OR
Record the read timestamp of the entire table (if any record from it is accessed)?
Db2 has the SYSCAT.TABLES.LASTUSED system catalog column, which is updated by DML statements against the whole table.
There is no way to track read access to individual table rows.
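For reference, the last-used date can be read from the catalog with a query along these lines (the schema and table names are placeholders):
-- LASTUSED is maintained per table, not per row
SELECT TABSCHEMA, TABNAME, LASTUSED
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'MYSCHEMA' AND TABNAME = 'MYTABLE';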

How to COPY CSV file into table resolving foreign key values into ids

I'm an expert at MSSQL but a beginner with PostgreSQL, so all my assumptions are likely wrong.
My boss is asking me to get a 300 MB CSV file into PostgreSQL 12 (a quarter million rows and 100+ columns). The file has usernames in 20 foreign key columns that would need to be looked up and converted to integer id values before being inserted into a table. The COPY command doesn't seem to handle joining a CSV to other tables before inserting. Am I going in the wrong direction? I want to test locally but ultimately am only allowed to hand the CSV to a DBA for importing into a Docker instance on a server. If only I could use pgAdmin and insert the rows directly!
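One common pattern (just a sketch; the staging, app_user and target_table names below are placeholder assumptions, not from the question) is to COPY into a staging table first and then resolve the usernames with an INSERT ... SELECT:
-- staging table mirroring the CSV layout (only a few of the 100+ columns shown)
CREATE TEMP TABLE staging (username1 text, username2 text, amount numeric);
-- bulk-load the raw CSV
COPY staging FROM '/path/to/file.csv' WITH (FORMAT csv, HEADER true);
-- resolve usernames to ids while inserting into the real table
INSERT INTO target_table (user1_id, user2_id, amount)
SELECT u1.id, u2.id, s.amount
FROM staging s
JOIN app_user u1 ON u1.username = s.username1
JOIN app_user u2 ON u2.username = s.username2;
If the file cannot be placed on the server itself, the same load can be run client-side with \copy from psql.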

Odoo 10 is not backing up DB in PostgreSQL 9.5. Shows "SQL state: 22008. Timestamp out of range on account_bank_statement_line."

At our company we had a DB crash a few days ago due to hardware problems. We recovered from that, but since then we've been getting the following error every time we try to back up our DB.
pg_dump: ERROR: timestamp out of range
pg_dump: SQL command to dump the contents of table "account_bank_statement_line"
The error is in the "account_bank_statement_line" table, where we have 5 rows in which only the 'create_date' column is populated, with a date in the year 4855(!); the rest of the columns, even the id (primary key), are null. We can't even delete or update those rows using pgAdmin 4 or the PostgreSQL terminal.
We're in a very risky position right now, with no backup of several days of retail sales. Any hints at all would be very highly appreciated.
First, if the data are important, hire a specialist.
Second, run your pg_dump with the option --exclude-table=account_bank_statement_line so that you at least have a backup of the rest of your database.
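For example (the database name and output file are placeholders):
# custom-format dump of everything except the damaged table
pg_dump --exclude-table=account_bank_statement_line -F c -f partial_backup.dump yourdb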
The next thing you should do is to stop the database and take a cold backup of all the files. That way you have something to go back to if you mess up.
The key to proceeding is to find the ctids (physical addresses) of the problematic rows. Then you can use those to delete the rows.
You can approach that by running queries like
SELECT create_date FROM account_bank_statement_line
WHERE ctid < '(42,0)';
and try to find the ctids where you get an error. Once you have found a row where the following falls over:
SELECT * FROM account_bank_statement_line
WHERE ctid = '(42,14)';
you can delete the row by its ctid.
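A minimal sketch of that final step, reusing the example ctid from above:
-- delete only the physically addressed row; double-check the ctid first
DELETE FROM account_bank_statement_line WHERE ctid = '(42,14)';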
Once you are done, take a pg_dumpall of the database cluster, create a new one and restore the dump. It is dangerous to continue working with a cluster that has experienced corruption, because corruption can remain unseen and spread.
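A sketch of that last step (the file name and connection details are placeholders):
# dump the whole cluster to a plain SQL file
pg_dumpall -f cluster_dump.sql
# restore it into a freshly initialized cluster
psql -d postgres -f cluster_dump.sql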
I know what we did might not be the most technically advanced approach, but it solved our issue. We consulted a few experts, and what we did was:
migrate all the data to a new table (account_bank_statement_line2), which transferred all the rows that had valid data;
drop the corrupt "account_bank_statement_line" table; and
rename the new table to "account_bank_statement_line".
After that the DB backup ran smoothly like always.
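A rough SQL sketch of those steps (the id IS NOT NULL filter is an assumption based on the corrupt rows having null ids, and whether they can be skipped this way depends on the corruption; note that CREATE TABLE ... AS does not carry over indexes, constraints or sequence defaults, which would need to be recreated):
-- copy the readable rows into a new table
CREATE TABLE account_bank_statement_line2 AS
SELECT * FROM account_bank_statement_line WHERE id IS NOT NULL;
-- drop the corrupt table and put the copy in its place
DROP TABLE account_bank_statement_line;
ALTER TABLE account_bank_statement_line2 RENAME TO account_bank_statement_line;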
Hope this helps anyone who's in deep trouble like us. Cheers!

Adding a table to HDB by using dbmaint function

I would like to backfill a table across all dates in an HDB, but the table has around 100 columns. What's the fastest way to backfill it using the existing table's schema?
I tried to get the schema from the current table and use that schema to backfill, but it doesn't work.
This is what I tried:
oldTable:0#newTable;
addtable[dbdir;`table;oldTable]
but this doesn't work. Is there a good way to do this?
Does the table exist within the latest date partition of the HDB?
If so, .Q.chk will add tables to the partitions in which they are missing.
https://code.kx.com/q/ref/dotq/#qchk-fill-hdb
And with regard to addtable, what specific error are you getting when you try the above?
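For reference, .Q.chk takes the HDB root directory as a file symbol and fills any partitions that are missing tables (the path below is a placeholder):
/ fill missing tables across all partitions of the HDB
.Q.chk[`:/path/to/hdb]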

Hive: Read timeout exception

The problem is that when I try to query something from Hive using my application, e.g.
analyze table table_entity compute statistics or, for example, select count(*) from table_entity, I sometimes get this exception: java.net.SocketTimeoutException: Read timed out.
But when I run, for example, show tables or show tblproperties table_entity, I don't get that exception.
Has anyone faced this problem?
Thank you in advance.
I know this is a little late, but I was stuck with the same sort of problem, where we were not able to drop a table. We raised a case with Cloudera, and the following was the suggestion they provided:
Upon investigating the logs and the stack of messages we observed during the execution of the DROP query for the table "tableName", the query failed due to a "Read time out" in HiveServer2 from the Hive Metastore. This means the drop operation of the table was not able to complete at the metadata level. (In general, a DROP operation deletes the metadata details of a table from the HMS.) We suspect that the metadata of this table is not up to date and might have more data, which could have led to the timeout. Could you please perform the steps below and let us know your feedback?
1. Log in to Beeline-Hive.
2. Update the partition-level metadata in the Hive Metastore:
MSCK REPAIR TABLE db.table_name DROP PARTITIONS;
3. Compute the statistics for the table:
ANALYZE TABLE db.tableName partition(col1, col2) compute statistics noscan;
4. Drop the partitions of the table first using the query:
ALTER TABLE db.tableName DROP [IF EXISTS] PARTITION partition_spec[, PARTITION partition_spec, ...]
Please replace the partition values in the above command according to the partitions of the table (a concrete example follows this list).
5. Increase the Hive Metastore client socket timeout:
set hive.metastore.client.socket.timeout=1500;
This gives an increased socket timeout only for this session, so run the next step (#6) in the same session.
6. Drop the table:
DROP TABLE db.tableName;
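As a concrete illustration of the drop-partitions step (step 4), assuming a hypothetical table partitioned by a date column dt:
-- drop a single partition by its spec; list several specs, comma-separated, to drop more at once
ALTER TABLE db.tableName DROP IF EXISTS PARTITION (dt='2020-01-01');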
Although the question here is about an ANALYZE TABLE issue, this answer is intended to cover the other issues related to the read-timeout error as well.
I have shown my scenario (a DROP statement), but these steps should help with all sorts of queries that lead to the read-timeout exception.
Cheers!