H2: Optimize SELECT statement / SHUTDOWN DEFRAG
Test Case:
drop table master;
create table master(id int primary key, fk1 int, fk2 int, fk3 int, dataS varchar(255), data1 int, data2 int, data3 int, data4 int,data5 int,data6 int,data7 int,data8 int,data9 int,b1 boolean,b2 boolean,b3 boolean,b4 boolean,b5 boolean,b6 boolean,b7 boolean,b8 boolean,b9 boolean,b10 boolean,b11 boolean,b12 boolean,b13 boolean,b14 boolean,b15 boolean,b16 boolean,b17 boolean,b18 boolean,b19 boolean,b20 boolean,b21 boolean,b22 boolean,b23 boolean,b24 boolean,b25 boolean,b26 boolean,b27 boolean,b28 boolean,b29 boolean,b30 boolean,b31 boolean,b32 boolean,b33 boolean,b34 boolean,b35 boolean,b36 boolean,b37 boolean,b38 boolean,b39 boolean,b40 boolean,b41 boolean,b42 boolean,b43 boolean,b44 boolean,b45 boolean,b46 boolean,b47 boolean,b48 boolean,b49 boolean,b50 boolean);
create index idx_comp on master(fk1,fk2,fk3);
#loop 5000000 insert into master values(?, mod(?,100), mod(?,5), ?,'Hello World Hello World Hello World',?, ?, ?,?, ?, ?, ?, ?, ?,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true);
1. The following SELECT statement takes up to 30 seconds. Is there a way to optimize the response time?
SELECT count(*), SUM(CONVERT(b1,INT)) ,SUM(CONVERT(b2,INT)),SUM(CONVERT(b3,INT)),SUM(CONVERT(b4,INT)),SUM(CONVERT(b5,INT)),SUM(CONVERT(b6,INT)),SUM(CONVERT(b7,INT)),SUM(CONVERT(b8,INT)),SUM(CONVERT(b9,INT)),SUM(CONVERT(b10,INT)),SUM(CONVERT(b11,INT)),SUM(CONVERT(b12,INT)),SUM(CONVERT(b13,INT)),SUM(CONVERT(b14,INT)),SUM(CONVERT(b15,INT)),SUM(CONVERT(b16,INT))
FROM master
WHERE fk1=53 AND fk2=3
2. I tried SHUTDOWN DEFRAG, but that statement took about 40 minutes for my test case. After SHUTDOWN DEFRAG the select takes up to 15 seconds. If I execute the statement again, it takes under 1 second. Even if I stop and restart the server, the statement takes about 1 second.
Does H2 have a persistent cache?
Infrastructure: web browser <-> H2 Console server <-> H2 DB (H2 1.3.158)
According to the profiler output, the main problem (93%) is reading from the disk. I ran this in the H2 Console:
#prof_start;
SELECT ... FROM master WHERE fk1=53 AND fk2=3;
#prof_stop;
and got:
Profiler: top 3 stack trace(s) of 48039 ms [build-158]:
4084/4376 (93%):
at java.io.RandomAccessFile.readBytes(Native Method)
at java.io.RandomAccessFile.read(RandomAccessFile.java:338)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
at org.h2.store.FileStore.readFully(FileStore.java:285)
at org.h2.store.PageStore.readPage(PageStore.java:1253)
at org.h2.store.PageStore.getPage(PageStore.java:707)
at org.h2.index.PageDataIndex.getPage(PageDataIndex.java:225)
at org.h2.index.PageDataNode.getRowWithKey(PageDataNode.java:269)
at org.h2.index.PageDataNode.getRowWithKey(PageDataNode.java:270)
According to EXPLAIN ANALYZE SELECT, it's reading over 55'000 pages from disk (2 KB per page; about 110 MB) for this query. I'm not sure how other databases perform on such a query, but if possible the query should be changed so that it reads less data.
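One way to read less data is a covering index that contains every column the query touches, so the engine can answer it from the index pages alone without fetching full table rows. This is a sketch, not from the thread; the index name is invented, and whether H2 1.3.x actually answers the query from this index should be verified with EXPLAIN:

```sql
-- Hypothetical covering index: filter columns first, then every b column
-- the aggregates read. If the optimizer uses it, the query only touches
-- the (much narrower) index pages instead of the wide table rows.
CREATE INDEX idx_cover ON master(fk1, fk2,
    b1, b2, b3, b4, b5, b6, b7, b8,
    b9, b10, b11, b12, b13, b14, b15, b16);

-- Verify the plan actually uses idx_cover:
EXPLAIN ANALYZE
SELECT COUNT(*), SUM(CONVERT(b1, INT))
FROM master
WHERE fk1 = 53 AND fk2 = 3;
```

The trade-off is extra disk space and slower inserts, since every row now maintains a second wide index entry.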
Is it possible to have a temporary table/view that already has the datatype conversions done? If it's feasible to have it refresh itself from the main table occasionally (once a night or so), then the processing power that goes into the conversion has already been spent by the time you query.
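A minimal sketch of that pre-aggregation idea (the table and column names are invented, and the refresh would be driven by an external nightly job):

```sql
-- Hypothetical summary table, grouped by the columns the query filters on.
CREATE TABLE master_summary AS
SELECT fk1, fk2,
       COUNT(*) AS cnt,
       SUM(CONVERT(b1, INT)) AS sum_b1,
       SUM(CONVERT(b2, INT)) AS sum_b2
       -- ... repeat for b3..b16
FROM master
GROUP BY fk1, fk2;

-- Nightly refresh (run from a scheduled job):
DELETE FROM master_summary;
INSERT INTO master_summary
SELECT fk1, fk2, COUNT(*), SUM(CONVERT(b1, INT)), SUM(CONVERT(b2, INT))
FROM master
GROUP BY fk1, fk2;

-- The original 30-second query then becomes a single-row lookup:
SELECT cnt, sum_b1, sum_b2
FROM master_summary
WHERE fk1 = 53 AND fk2 = 3;
```

The obvious cost is staleness: results are only as fresh as the last refresh.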
If that's not feasible, you may want to do multiple sub-selects, one for each "b" column, pulling only rows where that column is true. Then do a COUNT instead of a SUM, which should be faster as well. For instance:
SELECT count1 + count2 AS total
FROM (
    SELECT
        (SELECT COUNT(*) FROM master WHERE fk1=53 AND fk2=3 AND b1=TRUE) AS count1,
        (SELECT COUNT(*) FROM master WHERE fk1=53 AND fk2=3 AND b2=TRUE) AS count2
) t;
I'm not sure if that exact syntax works in your program, but hopefully as a generic SQL idea it gets you on the right track.
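Another generic rewrite worth trying (not from the thread, just a common SQL idiom): conditional aggregation counts the TRUE values of each boolean column in a single pass over the matching rows, without CONVERT and without one sub-select per column:

```sql
-- One scan of the rows matching fk1/fk2; each SUM counts one column's TRUEs.
SELECT COUNT(*),
       SUM(CASE WHEN b1 THEN 1 ELSE 0 END) AS sum_b1,
       SUM(CASE WHEN b2 THEN 1 ELSE 0 END) AS sum_b2
       -- ... repeat for b3..b16
FROM master
WHERE fk1 = 53 AND fk2 = 3;
```

Note this mainly saves CPU; since the profiler shows the query is disk-bound, it would need to be combined with something that reduces the pages read.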
Related
DB2 measure execution time of triggers
How can I best measure the execution time of triggers in DB2 for insert or update? It is needed for some performance issues; some triggers are behaving very slowly. They look roughly like:

CREATE OR REPLACE TRIGGER CHECK
    NO CASCADE BEFORE INSERT ON DAG
    REFERENCING NEW AS OBJ
    FOR EACH ROW MODE DB2SQL
    WHEN (xyz)
    SIGNAL SQLSTATE xxx
For compiled triggers (that is, with a BEGIN ... END body):

SELECT T.TRIGNAME
     , M.SECTION_NUMBER
     , M.STMT_EXEC_TIME
     , M.NUM_EXEC_WITH_METRICS
     -- Other M metrics
     , M.*
     , M.STMT_TEXT
FROM SYSCAT.TRIGGERS T
JOIN SYSCAT.TRIGDEP D
    ON (D.TRIGSCHEMA, D.TRIGNAME) = (T.TRIGSCHEMA, T.TRIGNAME)
JOIN TABLE (MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) M
    ON (M.PACKAGE_SCHEMA, M.PACKAGE_NAME) = (D.BSCHEMA, D.BNAME)
WHERE D.BTYPE = 'K' -- Or use whatever filter on your triggers
  AND (T.TABSCHEMA, T.TABNAME) = ('MYSCHEMA', 'MYTABLE')
ORDER BY 1, 2

For inlined triggers (that is, with a BEGIN ATOMIC ... END body): there is no way to get separate statistics for them. They are compiled and optimized together with the statement that fired them.
Get postgres query log statement and duration as one record
I have log_min_duration_statement=0 in the config. When I check the log file, the SQL statement and its duration are saved in different rows. (Not sure what I have wrong, but statement and duration are not saved together as this answer suggests.) As I understand it, session_line_num of the duration record always equals session_line_num + 1 of the relevant statement, for the same session of course. Is this correct? Is the query below reliable for getting statement and duration in one row? (The CSV log is imported into a postgres_log table.)

WITH sql_cte AS (
    SELECT session_id, session_line_num, message AS sql_statement
    FROM postgres_log
    WHERE message LIKE 'statement%'
),
durat_cte AS (
    SELECT session_id, session_line_num, message AS duration
    FROM postgres_log
    WHERE message LIKE 'duration%'
)
SELECT t1.session_id, t1.session_line_num, t1.sql_statement, t2.duration
FROM sql_cte t1
LEFT JOIN durat_cte t2
    ON t1.session_id = t2.session_id
   AND t1.session_line_num + 1 = t2.session_line_num;
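One alternative sketch (not from the question itself): instead of self-joining on session_line_num + 1, pair each statement row with the immediately following row of the same session using a window function, and keep the pair only when that next row really is a duration line. It relies on the same ordering assumption as the self-join:

```sql
-- LEAD() fetches the next log row within the same session;
-- the outer filter keeps only statement rows followed by a duration row.
SELECT session_id,
       session_line_num,
       message      AS sql_statement,
       next_message AS duration
FROM (
    SELECT session_id, session_line_num, message,
           LEAD(message) OVER (PARTITION BY session_id
                               ORDER BY session_line_num) AS next_message
    FROM postgres_log
) t
WHERE message LIKE 'statement%'
  AND next_message LIKE 'duration%';
```

This avoids building two CTE scans of the log table, though either form depends on the duration row directly following its statement.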
Is it possible to have hibernate generate update from values statements for postgresql?
Given a PostgreSQL table

Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 info   | text    |

and the following values:

# select * from test;
 id |  info
----+--------
  3 | value3
  4 | value4
  5 | value5

As you may know, with PostgreSQL you can use this kind of statement to update multiple rows with different values:

update test
set info = tmp.info
from (values (3,'newvalue3'), (4,'newvalue4'), (5,'newvalue5')) as tmp (id, info)
where test.id = tmp.id;

And it results in the table being updated in a single query to:

# select * from test;
 id |   info
----+-----------
  3 | newvalue3
  4 | newvalue4
  5 | newvalue5

I have been looking around everywhere for how to make Hibernate generate this kind of statement for update queries. I know how to make it work for insert queries (with the reWriteBatchedInserts JDBC option and Hibernate batch config options). But is it possible for update queries, or do I have to write the native query myself? No matter what I do, Hibernate always sends separate update queries to the database (I'm looking at the PostgreSQL server statement logs for this affirmation):

2020-06-18 08:19:48.895 UTC [1642] LOG: execute S_6: BEGIN
2020-06-18 08:19:48.895 UTC [1642] LOG: execute S_8: update test set info = $1 where id = $2
2020-06-18 08:19:48.895 UTC [1642] DETAIL: parameters: $1 = 'newvalue3', $2 = '3'
2020-06-18 08:19:48.896 UTC [1642] LOG: execute S_8: update test set info = $1 where id = $2
2020-06-18 08:19:48.896 UTC [1642] DETAIL: parameters: $1 = 'newvalue4', $2 = '4'
2020-06-18 08:19:48.896 UTC [1642] LOG: execute S_8: update test set info = $1 where id = $2
2020-06-18 08:19:48.896 UTC [1642] DETAIL: parameters: $1 = 'newvalue4', $2 = '5'
2020-06-18 08:19:48.896 UTC [1642] LOG: execute S_1: COMMIT

I always find it many times faster to issue a single massive update query than many separate updates targeting single rows.
With many separate update queries, even though they are sent in a batch by the JDBC driver, they still need to be processed sequentially by the server, so it is not as efficient as a single update query targeting multiple rows. So if anyone has a solution that wouldn't involve writing native queries for my entities, I would be very glad!

Update: to further refine my question, I want to add a clarification. I'm looking for a solution that wouldn't abandon Hibernate's dirty-checking feature for entity updates. I'm trying to avoid writing batch update queries by hand for the general case of updating a few basic fields with different values on a list of entities. I'm currently looking into the SPI of Hibernate to see if it's doable. org.hibernate.engine.jdbc.batch.spi.Batch seems to be the proper place, but I'm not quite sure yet because I've never done anything with the Hibernate SPI. Any insights would be welcome!
You can use Blaze-Persistence for this, which is a query builder on top of JPA that supports many advanced DBMS features on top of the JPA model. It does not yet support the FROM clause in DML, but that is about to land in the next release: https://github.com/Blazebit/blaze-persistence/issues/693

Meanwhile you could use CTEs for this. First you need to define a CTE entity (a concept of Blaze-Persistence):

@CTE
@Entity
public class InfoCte {
    @Id Integer id;
    String info;
}

I'm assuming your entity model looks roughly like this:

@Entity
public class Test {
    @Id Integer id;
    String info;
}

Then you can use Blaze-Persistence like this:

criteriaBuilderFactory.update(entityManager, Test.class, "test")
    .with(InfoCte.class, false)
        .fromValues(Test.class, "newInfos", newInfosCollection)
        .bind("id").select("newInfos.id")
        .bind("info").select("newInfos.info")
    .end()
    .set("info")
        .from(InfoCte.class, "cte")
        .select("cte.info")
        .where("cte.id").eqExpression("test.id")
    .end()
    .whereExists()
        .from(InfoCte.class, "cte")
        .where("cte.id").eqExpression("test.id")
    .end()
    .executeUpdate();

This will create an SQL query similar to the following:

WITH InfoCte(id, info) AS (
    SELECT t.id, t.info
    FROM (VALUES (1, 'newValue', ...)) t(id, info)
)
UPDATE test
SET info = (SELECT cte.info FROM InfoCte cte WHERE cte.id = test.id)
WHERE EXISTS (SELECT 1 FROM InfoCte cte WHERE cte.id = test.id)
Inserting many rows in treelike structure in SQL SERVER 2005
I have a few tables in SQL Server that are pretty much like this:

A        B        C
ID_A     ID_B     ID_C
Name     Name     Name
         ID_A     ID_B

As you can see, A is linked to B and B to C. These are basically tables that contain data models. Now, I need to be able to create data based on those tables. For example, if I have the following data:

A            B                 C
1 Name1      1 SubName1 1      1 SubSubName1 1
2 Name2      2 SubName2 1      2 SubSubName2 1
             3 SubName3 2      3 SubSubName3 2
                               4 SubSubName4 3
                               5 SubSubName5 3

I would like to copy the 'content' of those tables into other tables. Of course, the auto-numeric keys generated when inserting into the new tables are different from those, and I would like to keep track of them so that I can copy the entire thing. The recipient tables contain more information than these, but it's mainly dates and other stuff that is easy for me to get. I need to do this entirely in Transact-SQL (with built-in functions if needed). Is this possible, and can anyone give me a short example? I managed to do it for one level, but I get confused for the rest. Thanks.

EDIT: The info above is just an example; my actual diagram looks more like this. Model tables:

Processes -- (1-N) Steps -- (1-N) Task -- (0-N) TaskCheckList -- (0-N) StepsCheckLists

whereas the tables I need to fill look like this:

Client -- (0-N) Sequence -- (1-N) ClientProcesses -- (1-N) ClientSteps -- (1-N) ClientTasks -- (0-N) ClientTaskCheckList -- (0-N) ClientStepCheckLists

The Client already exists, and when I need to run the script, I create one sequence, which will contain all processes, which will contain their steps, tasks, etc.
OK, so I did a lot of trial and error, and here is what I got. It seems to work fine, although it sounds quite big for something that seemed easy at first. The whole thing is somehow in French and English, because our client is French and so are we anyway. It does insert every record into all the tables that I needed. The only thing left is the first lines, where I need to select the data to insert according to some parameters, but that is the easy part.

DECLARE @IdProcessusRenouvellement bigint
DECLARE @NomProcessus nvarchar(255)

SELECT @IdProcessusRenouvellement = ID FROM T_Ref_Processus WHERE Nom LIKE 'EXP%'
SELECT @NomProcessus = Nom FROM T_Ref_Processus WHERE Nom LIKE 'EXP%'

DECLARE @InsertedSequence table(ID bigint)
DECLARE @Contrats table(ID bigint, IdClient bigint, NumeroContrat nvarchar(255))
INSERT INTO @Contrats
SELECT ID, IdClient, NumeroContrat FROM T_ClientContrat

DECLARE @InsertedIdsSeq as Table(ID bigint)

-- Séquences de travail
INSERT INTO T_ClientContratSequenceTravail(IdClientContrat, Nom, DateDebut)
OUTPUT Inserted.ID INTO @InsertedIdsSeq
SELECT ID,
       @NomProcessus + ' - ' + CONVERT(VARCHAR(10), GETDATE(), 120) + ' : ' + NumeroContrat,
       GETDATE()
FROM @Contrats

-- Processus
DECLARE @InsertedIdsPro as Table(ID bigint, IdProcessus bigint)
INSERT INTO T_ClientContratProcessus
    (IdClientContratSequenceTravail, IdProcessus, Nom, DateDebut, DelaiRappel, DateRappel,
     LienAvecProcessusRenouvellement, IdStatutProcessus, IdResponsable, Sequence)
OUTPUT Inserted.ID, Inserted.IdProcessus INTO @InsertedIdsPro
SELECT I.ID, P.ID, P.Nom, GETDATE(), P.DelaiRappel, GETDATE(),
       P.LienAvecProcessusRenouvellement, 0, 0, 0
FROM @InsertedIdsSeq I, T_Ref_Processus P
WHERE P.ID = @IdProcessusRenouvellement

-- Étapes
DECLARE @InsertedIdsEt as table(ID bigint, IdProcessusEtape bigint)
INSERT INTO T_ClientContratProcessusEtape
    (IdClientContratProcessus, IdProcessusEtape, Nom, DateDebut, DelaiRappel, DateRappel,
     NomListeVerification, Sequence, IdStatutEtape, IdResponsable, IdTypeResponsable,
     ListeVerificationTermine)
OUTPUT Inserted.ID, Inserted.IdProcessusEtape INTO @InsertedIdsEt
SELECT I.ID, E.ID, E.Nom, GETDATE(), E.DelaiRappel, GETDATE(),
       COALESCE(L.Nom, ''), E.Sequence, 0, 0, E.IdTypeResponsable, 0
FROM @InsertedIdsPro I
INNER JOIN T_Ref_ProcessusEtape E ON I.IdProcessus = E.IdProcessus
LEFT JOIN T_Ref_ListeVerification L ON E.IdListeVerification = L.ID

-- Étapes : Items de la liste de vérification
INSERT INTO T_ClientContratProcessusEtapeListeVerificationItem
    (IdClientContratProcessusEtape, Nom, Requis, Verifie)
SELECT I.ID, IT.Nom, IT.Requis, 0
FROM @InsertedIdsEt I
INNER JOIN T_Ref_ProcessusEtape E ON I.IdProcessusEtape = E.ID
INNER JOIN T_Ref_ListeVerificationItem IT ON E.IdListeVerification = IT.IdListeVerification

-- Tâches
DECLARE @InsertedIdsTa as table(ID bigint, IdProcessusEtapeTache bigint)
INSERT INTO T_ClientContratProcessusEtapeTache
    (IdClientContratProcessusEtape, IdProcessusEtapeTache, Nom, DateDebut, DelaiRappel, DateRappel,
     NomListeVerification, Sequence, IdStatutTache, IdResponsable, IdTypeResponsable,
     ListeVerificationTermine)
OUTPUT Inserted.ID, Inserted.IdProcessusEtapeTache INTO @InsertedIdsTa
SELECT I.ID, T.ID, T.Nom, GETDATE(), T.DelaiRappel, GETDATE(),
       COALESCE(L.Nom, ''), T.Sequence, 0, 0, T.IdTypeResponsable, 0
FROM @InsertedIdsEt I
INNER JOIN T_Ref_ProcessusEtapeTache T ON I.IdProcessusEtape = T.IdProcessusEtape
LEFT JOIN T_Ref_ListeVerification L ON T.IdListeVerification = L.ID

-- Tâches : Items de la liste de vérification
INSERT INTO T_ClientContratProcessusEtapeTacheListeVerificationItem
    (IdClientContratProcessusEtapeTache, Nom, Requis, Verifie)
SELECT I.ID, IT.Nom, IT.Requis, 0
FROM @InsertedIdsTa I
INNER JOIN T_Ref_ProcessusEtapeTache T ON I.IdProcessusEtapeTache = T.ID
INNER JOIN T_Ref_ListeVerificationItem IT ON T.IdListeVerification = IT.IdListeVerification
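The core trick in the script above is capturing the newly generated identities with OUTPUT ... INTO and joining them back when inserting the next level of the hierarchy. Reduced to a minimal sketch (table and column names here are invented; note the copy table carries a SourceId column, which is what makes the old-to-new mapping capturable, since OUTPUT on an INSERT can only reference columns of the inserted row):

```sql
-- Hypothetical parent/child copy preserving the link structure.
DECLARE @Map table(NewId bigint, SourceId bigint);

-- Copy parents, remembering which new id corresponds to which old id.
INSERT INTO ParentCopy (SourceId, Name)
OUTPUT Inserted.ID, Inserted.SourceId INTO @Map
SELECT ID, Name FROM Parent;

-- Re-point children at the new parent ids through the map,
-- repeating the same OUTPUT pattern for each deeper level.
INSERT INTO ChildCopy (ParentId, Name)
SELECT M.NewId, C.Name
FROM Child C
INNER JOIN @Map M ON C.ParentId = M.SourceId;
```

Each additional level of the tree just repeats the pattern, joining against the map table produced one level up.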
Entity Framework object missing related data
I have an Entity Framework model that has two tables, Client and Postcode. A Postcode can have many Clients; a Client has one Postcode. They are joined on the postcode. The two tables are mapped to views. I have some clients that do not have a Postcode in the model, however in the DB they do! I ran some tests and found postcodes that were returning clients when I do Postcode.Clients, but not all of the clients. In the DB one postcode had 14 related clients, but EF was only returning the first 6. Basically, certain postcodes are not returning all the data. Lazy loading is turned on, and I have tried turning it off without any luck. Any ideas? I am using VS 2010, C#, .NET 4.0, EF4 and SQL Server 2008. Thanks.

UPDATE: I have been running through this in LINQPad. I try the following code:

Client c = Clients.Where(a => a.ClientId == 9063202).SingleOrDefault();
c.PostcodeView.Dump();

This returns null. I then take the generated SQL and run it as a separate SQL query, and it works correctly (after I add the @ to the start of the variable name):

SELECT TOP (2)
    [Extent1].[ClientId] AS [ClientId], [Extent1].[Surname] AS [Surname],
    [Extent1].[Forename] AS [Forename], [Extent1].[FlatNo] AS [FlatNo],
    [Extent1].[StNo] AS [StNo], [Extent1].[Street] AS [Street],
    [Extent1].[Town] AS [Town], [Extent1].[Postcode] AS [Postcode]
FROM (SELECT [ClientView].[ClientId] AS [ClientId], [ClientView].[Surname] AS [Surname],
             [ClientView].[Forename] AS [Forename], [ClientView].[FlatNo] AS [FlatNo],
             [ClientView].[StNo] AS [StNo], [ClientView].[Street] AS [Street],
             [ClientView].[Town] AS [Town], [ClientView].[Postcode] AS [Postcode]
      FROM [dbo].[ClientView] AS [ClientView]) AS [Extent1]
WHERE 9063202 = [Extent1].[ClientId]
GO

-- Region Parameters
DECLARE @EntityKeyValue1 VarChar(8) = 'G15 6NB'
-- EndRegion
SELECT
    [Extent1].[Postcode] AS [Postcode], [Extent1].[ltAstId] AS [ltAstId],
    [Extent1].[ltLhoId] AS [ltLhoId], [Extent1].[ltChcpId] AS [ltChcpId],
    [Extent1].[ltCppId] AS [ltCppId], [Extent1].[ltWardId] AS [ltWardId],
    [Extent1].[ltAst] AS [ltAst], [Extent1].[ltCpp] AS [ltCpp],
    [Extent1].[ltWard] AS [ltWard], [Extent1].[WardNo] AS [WardNo],
    [Extent1].[Councillor] AS [Councillor], [Extent1].[ltAdminCentre] AS [ltAdminCentre],
    [Extent1].[ltChcp] AS [ltChcp], [Extent1].[Forename] AS [Forename],
    [Extent1].[Surname] AS [Surname], [Extent1].[AreaNo] AS [AreaNo],
    [Extent1].[LtAomId] AS [LtAomId], [Extent1].[OOHltCoordinatorId] AS [OOHltCoordinatorId],
    [Extent1].[OvernightltCoordinatorId] AS [OvernightltCoordinatorId],
    [Extent1].[DayltCoordinatorId] AS [DayltCoordinatorId]
FROM (SELECT [PostcodeView].[Postcode] AS [Postcode], [PostcodeView].[ltAstId] AS [ltAstId],
             [PostcodeView].[ltLhoId] AS [ltLhoId], [PostcodeView].[ltChcpId] AS [ltChcpId],
             [PostcodeView].[ltCppId] AS [ltCppId], [PostcodeView].[ltWardId] AS [ltWardId],
             [PostcodeView].[ltAst] AS [ltAst], [PostcodeView].[ltCpp] AS [ltCpp],
             [PostcodeView].[ltWard] AS [ltWard], [PostcodeView].[WardNo] AS [WardNo],
             [PostcodeView].[Councillor] AS [Councillor], [PostcodeView].[ltAdminCentre] AS [ltAdminCentre],
             [PostcodeView].[ltChcp] AS [ltChcp], [PostcodeView].[Forename] AS [Forename],
             [PostcodeView].[Surname] AS [Surname], [PostcodeView].[AreaNo] AS [AreaNo],
             [PostcodeView].[LtAomId] AS [LtAomId], [PostcodeView].[DayltCoordinatorId] AS [DayltCoordinatorId],
             [PostcodeView].[OOHltCoordinatorId] AS [OOHltCoordinatorId],
             [PostcodeView].[OvernightltCoordinatorId] AS [OvernightltCoordinatorId]
      FROM [dbo].[PostcodeView] AS [PostcodeView]) AS [Extent1]
WHERE [Extent1].[Postcode] = @EntityKeyValue1
I ended up removing the relationship and manually getting the child data. Nasty, but I cannot find a reason why this is happening. Cheers for the comments.