Keep table data, after SELECT - tsql

I use the following query a lot:
SELECT [id]
,[date]
,[pcknum]
,[pcktype]
,[pckinfo]
,[pckuptime]
,[pckbyte]
,[pcktext]
FROM [dbo].[RawLog]
where pcktext like '%USER SC%' and pckinfo = 174 and not (pcktext like '%type=Dial%' or pcktext like '%type=VoicemailCollect%')
order by date desc
The problem is that the table contains more than 10 million rows, so the query takes about 20-40 minutes to run, which eats up a lot of time.
The table is only accessed by direct user input, as it is used to reverse engineer some network protocols.
I was wondering whether there is a way to save the state of a SQL query to reduce search times.
For instance, a saved state of the query
SELECT [id]
,[date]
,[pcknum]
,[pcktype]
,[pckinfo]
,[pckuptime]
,[pckbyte]
,[pcktext]
FROM [dbo].[RawLog]
where pcktext like '%USER SC%' and pckinfo = 174
that I could later run queries on?
For instance:
SELECT [id]
,[date]
,[pcknum]
,[pcktype]
,[pckinfo]
,[pckuptime]
,[pckbyte]
,[pcktext]
FROM table_savedstate_of_RawLog_USERSC
where not (pcktext like '%type=Dial%' or pcktext like '%type=VoicemailCollect%')
order by date desc

You can use a temp table for this:
SELECT [id]
,[date]
,[pcknum]
,[pcktype]
,[pckinfo]
,[pckuptime]
,[pckbyte]
,[pcktext]
into #temptable
FROM [dbo].[RawLog]
where pcktext like '%USER SC%' and pckinfo = 174
This way you can query the rows that are already stored in #temptable, for example:
Select * from #temptable where id = 1
You can also index your temp table to make later searches on it faster:
CREATE NONCLUSTERED INDEX IDX_ID ON #temptable(id)
Hope this helps :)
EDIT: To use the temp table with your query above:
SELECT [id]
,[date]
,[pcknum]
,[pcktype]
,[pckinfo]
,[pckuptime]
,[pckbyte]
,[pcktext]
FROM #temptable
where not (pcktext like '%type=Dial%' or pcktext like '%type=VoicemailCollect%')
order by date desc
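One caveat: a #temptable only exists for the session that created it, so it is gone once you close the connection. If the saved state should survive across sessions, a regular table can be filled the same way. A minimal T-SQL sketch, assuming a hypothetical table name dbo.RawLog_UserSC and a hypothetical index name (neither comes from the question):

```sql
-- Build the saved state once. dbo.RawLog_UserSC is a made-up name.
SELECT [id], [date], [pcknum], [pcktype], [pckinfo], [pckuptime], [pckbyte], [pcktext]
INTO [dbo].[RawLog_UserSC]
FROM [dbo].[RawLog]
WHERE pcktext LIKE '%USER SC%' AND pckinfo = 174;

-- Assumption: most follow-up queries sort by date, so cluster on it.
CREATE CLUSTERED INDEX IX_RawLog_UserSC_date
    ON [dbo].[RawLog_UserSC] ([date] DESC);

-- Later, possibly from another session, refine against the much smaller table.
SELECT [id], [date], [pcknum], [pcktype], [pckinfo], [pckuptime], [pckbyte], [pcktext]
FROM [dbo].[RawLog_UserSC]
WHERE NOT (pcktext LIKE '%type=Dial%' OR pcktext LIKE '%type=VoicemailCollect%')
ORDER BY [date] DESC;
```

The saved table is a snapshot, so it has to be dropped and rebuilt when RawLog receives new rows. A LIKE pattern with a leading wildcard cannot use an index seek, so the follow-up query still scans the saved table, but that table should be far smaller than RawLog.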

Related

Update column with more than one value

I have a table tableA which looks something like this:
issue_id start_date end_date
issue1 2019-11-07 2020-04-30
issue2 2019-11-07 2020-01-28
I have to update the end_date based on the results of the query.
UPDATE tableA SET end_date =
(
SELECT max_end_date from update_end_date
)
WHERE issue_id = (SELECT issue_id FROM update_end_date);
It works when the query returns one result. However, it fails when more than one result is returned, which makes sense. I cannot predetermine the results of the query, so it might return more than one row. Is there any way I can update the column with multiple values?
You could use a correlated subquery:
UPDATE tableA
SET end_date = (SELECT max_end_date
from update_end_date
WHERE update_end_date.issue_id = tableA.issue_id)
WHERE issue_id IN (SELECT issue_id FROM update_end_date);
An alternative to @Lukas's solution is PostgreSQL's proprietary UPDATE ... FROM syntax:
UPDATE tablea
SET end_date = max_end_date
FROM update_end_date
WHERE tablea.issue_id = update_end_date.issue_id
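Both statements above rely on update_end_date holding at most one row per issue_id. If it can hold several, the correlated subquery fails with a "more than one row returned" error, and UPDATE ... FROM uses only one of the matching rows without any guarantee which one. A hedged PostgreSQL sketch that collapses duplicates first, assuming the latest date is the one that should win:

```sql
-- Assumption: when update_end_date has several rows for one issue,
-- the greatest max_end_date is the value to store.
UPDATE tableA
SET end_date = u.latest_end_date
FROM (
    SELECT issue_id, MAX(max_end_date) AS latest_end_date
    FROM update_end_date
    GROUP BY issue_id
) AS u
WHERE tableA.issue_id = u.issue_id;
```

The derived table u has exactly one row per issue_id, so each row of tableA joins to at most one source row.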

Cassandra filter with ordering query modeling

I am new to Cassandra and I am trying to model a table in Cassandra. My queries look like the following
Query #1: select * from TableA where Id = "123"
Query #2: select * from TableA where name="test" order by startTime DESC
Query #3: select * from TableA where state="running" order by startTime DESC
I have been able to build the table for Query #1 which looks like
val tableAStatement = SchemaBuilder.createTable("tableA").ifNotExists.
addPartitionKey(Id, DataType.uuid).
addColumn(Name, DataType.text).
addColumn(StartTime, DataType.timestamp).
addColumn(EndTime, DataType.timestamp).
addColumn(State, DataType.text)
session.execute(tableAStatement)
For Query #2 and #3, however, I have tried many different things and failed. Every time I get stuck on a different error from Cassandra.
Considering the above queries, what would be the right table model? What is the right way to model such queries?
Query #2: select * from TableB where name = 'test'
CREATE TABLE TableB (
name text,
start_time timestamp,
PRIMARY KEY (name, start_time)
) WITH CLUSTERING ORDER BY (start_time DESC);
Query #3: select * from TableC where state = 'running'
CREATE TABLE TableC (
state text,
start_time timestamp,
PRIMARY KEY (state, start_time)
) WITH CLUSTERING ORDER BY (start_time DESC);
In Cassandra you model your tables around your queries, so denormalizing and duplicating data is expected. Notice the clustering order: with it, rows come back sorted by start_time descending and you can omit the ORDER BY in your query.
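The tables above only list the key columns. A fuller CQL sketch for Query #2 could look like the following; the table name tablea_by_name is hypothetical, the remaining columns are copied from the question's tableA, and adding id as a second clustering column is my assumption so that two rows with the same name and start_time do not overwrite each other:

```sql
-- CQL sketch; tablea_by_name is a made-up name.
CREATE TABLE IF NOT EXISTS tablea_by_name (
    name       text,
    start_time timestamp,
    id         uuid,        -- assumption: tie-breaker for identical name + start_time
    end_time   timestamp,
    state      text,
    PRIMARY KEY ((name), start_time, id)
) WITH CLUSTERING ORDER BY (start_time DESC, id ASC);

-- Rows within the partition come back newest first without an ORDER BY clause.
SELECT * FROM tablea_by_name WHERE name = 'test';
```

The same pattern applies to the state table. Every write then has to go to each of the tables, and because state would be part of the primary key there, changing a row's state means deleting the old row and inserting a new one.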

Difference between dates in different rows

Hi,
My problem is that I need the average time between a chargebegin and a chargeend row (timestampserver), grouped by stationname, connectornumber and day.
The main problem is that I cannot use a MAX or MIN function, because the same stationname/connectornumber combination appears several times in the table.
So I have to select the first chargebegin and find the next chargeend (the one with the same station/connectornumber combination and the smallest id greater than chargebegin.id) to get the difference.
I have tried a lot, but I have no idea how to do this.
Database is postgresql 9.2
Testdata:
create table datatable (
id int,
connectornumber int,
message varchar,
metercount int,
stationname varchar,
stationuser varchar,
timestampmessage varchar,
timestampserver timestamp,
authsource varchar
);
insert into datatable values (181,1,'chargebegin',4000,'100','FCSC','2012-10-10 16:39:10','2012-10-10 16:39:15.26');
insert into datatable values (182,1,'chargeend',4000,'100','FCSC','2012-10-10 16:39:17','2012-10-10 16:39:28.379');
insert into datatable values (184,1,'chargebegin',4000,'100','FCSC','2012-10-11 11:06:31','2012-10-11 11:06:44.981');
insert into datatable values (185,1,'chargeend',4000,'100','FCSC','2012-10-11 11:16:09','2012-10-11 11:16:10.669');
insert into datatable values (191,1,'chargebegin',4000,'100','MSISDN_100','2012-10-11 13:38:19','2012-10-11 13:38:26.583');
insert into datatable values (192,1,'chargeend',4000,'100','MSISDN_100','2012-10-11 13:38:53','2012-10-11 13:38:55.631');
insert into datatable values (219,1,'chargebegin',4000,'100','MSISDN_','2012-10-12 11:38:03','2012-10-12 11:38:29.029');
insert into datatable values (220,1,'chargeend',4000,'100','MSISDN_','2012-10-12 11:40:14','2012-10-12 11:40:18.635');
This might have some syntax errors as I can't test it right now, but it should give you an idea of how to solve it.
with
chargebegin as (
select
stationname,
connectornumber,
timestampserver,
row_number() over(partition by stationname, connectornumber order by timestampserver) as rn
from
datatable
where
message = 'chargebegin'
),
chargeend as (
select
stationname,
connectornumber,
timestampserver,
row_number() over(partition by stationname, connectornumber order by timestampserver) as rn
from
datatable
where
message = 'chargeend'
)
select
stationname,
connectornumber,
avg(b.timestampserver - a.timestampserver) as avg_diff
from
chargebegin a
join chargeend b using (stationname, connectornumber, rn)
group by
stationname,
connectornumber
This assumes that every begin event has a matching end event and that these events cannot overlap (that is, for a given stationname and connectornumber there can be only one connection at any time). Under those assumptions you can use row_number() to pair each begin event with its end event and then do whatever calculation is needed.
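The query above groups by station and connector only, while the question also asks for a per-day average. A possible PostgreSQL variant under the same pairing assumptions, with one extra assumption of mine: a charge is counted on the day it begins.

```sql
with
chargebegin as (
    select stationname, connectornumber, timestampserver,
           row_number() over (partition by stationname, connectornumber
                              order by timestampserver) as rn
    from datatable
    where message = 'chargebegin'
),
chargeend as (
    select stationname, connectornumber, timestampserver,
           row_number() over (partition by stationname, connectornumber
                              order by timestampserver) as rn
    from datatable
    where message = 'chargeend'
)
select
    stationname,
    connectornumber,
    cast(a.timestampserver as date) as charge_day,   -- day the charge started
    avg(b.timestampserver - a.timestampserver) as avg_diff
from chargebegin a
join chargeend b using (stationname, connectornumber, rn)
group by
    stationname,
    connectornumber,
    cast(a.timestampserver as date)
order by
    stationname,
    connectornumber,
    charge_day;
```

Subtracting two timestamps yields an interval, so avg_diff is an interval per station, connector and day.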

Speeding up TSQL

Hi all, I'm wondering if there's a more efficient way of executing this T-SQL script. It gets the very latest activity per account name and then joins this to the accounts table, so you get the latest activity for each account. The problem is that there are currently about 22,000 latest activities, so it has to go through a lot of data. Is there a more efficient way of doing this?
DECLARE @pastAppointments TABLE (objectid NVARCHAR(100), account NVARCHAR(500), startdate DATETIME, tasktype NVARCHAR(100), ownerid UNIQUEIDENTIFIER, owneridname NVARCHAR(100), RN NVARCHAR(100))
INSERT INTO @pastAppointments (objectid, account, startdate, tasktype, ownerid, owneridname, RN)
SELECT * FROM (
SELECT fap.regardingobjectid, fap.regardingobjectidname, fap.actualend, fap.activitytypecodename, fap.ownerid, fap.owneridname,
ROW_NUMBER() OVER (PARTITION BY fap.regardingobjectidname ORDER BY fap.actualend DESC) AS RN
FROM FilteredActivityPointer fap
WHERE fap.actualend < getdate()
AND fap.activitytypecode NOT LIKE 4201
) tmp WHERE RN = 1
ORDER BY regardingobjectidname
SELECT fa.name, fa.owneridname, fa.new_technicalaccountmanagername, fa.new_customerid, fa.new_riskstatusname, fa.new_numberofopencases,
fa.new_numberofurgentopencases, app.startdate, app.tasktype, app.ownerid, app.owneridname
FROM FilteredAccount fa LEFT JOIN @pastAppointments app on fa.accountid = app.objectid and fa.ownerid = app.ownerid
WHERE fa.statecodename = 'Active'
AND fa.ownerid LIKE @owner_search
ORDER BY fa.name
You can remove ORDER BY regardingobjectidname from the first INSERT query - the only (narrow) purpose such a sort would have on an INSERT query is if there was an identity column on the table being inserted into. And there isn't in this case, so if the optimizer isn't smart enough, it'll perform a pointless sort.
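Applied to the script above, the first statement would look roughly like this. It is a sketch only: listing the columns explicitly instead of SELECT * is my own tidy-up rather than part of the suggestion, and whether dropping the sort makes a measurable difference is best checked in the execution plan.

```sql
DECLARE @pastAppointments TABLE (
    objectid NVARCHAR(100), account NVARCHAR(500), startdate DATETIME,
    tasktype NVARCHAR(100), ownerid UNIQUEIDENTIFIER,
    owneridname NVARCHAR(100), RN NVARCHAR(100)
);

INSERT INTO @pastAppointments (objectid, account, startdate, tasktype, ownerid, owneridname, RN)
SELECT tmp.regardingobjectid, tmp.regardingobjectidname, tmp.actualend,
       tmp.activitytypecodename, tmp.ownerid, tmp.owneridname, tmp.RN
FROM (
    SELECT fap.regardingobjectid, fap.regardingobjectidname, fap.actualend,
           fap.activitytypecodename, fap.ownerid, fap.owneridname,
           ROW_NUMBER() OVER (PARTITION BY fap.regardingobjectidname
                              ORDER BY fap.actualend DESC) AS RN
    FROM FilteredActivityPointer fap
    WHERE fap.actualend < GETDATE()
      AND fap.activitytypecode NOT LIKE 4201
) tmp
WHERE tmp.RN = 1;  -- no ORDER BY: row order inside the table variable is not guaranteed anyway
```

The final SELECT keeps its own ORDER BY fa.name, which is the only place where ordering affects the result.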

Two different group by clauses in one query?

First time posting here and a newbie to SQL. I'm not exactly sure how to word this, but I'll try my best.
I have a query:
select report_month, employee_id, split_bonus,sum(salary) FROM empsal
where report_month IN('2010-12-01','2010-11-01','2010-07-01','2010-04-01','2010-09-01','2010-10-01','2010-08-01')
AND employee_id IN('100','101','102','103','104','105','106','107')
group by report_month, employee_id, split_bonus;
Now, to the result of this query, I want to add a new column split_bonus_cumulative that is essentially equivalent to adding sum(split_bonus) to the select clause, except that for this column the group by should contain only report_month and employee_id.
Can anyone show me how to do this with a single query? Thanks in advance.
Try:
SELECT
report_month,
employee_id,
SUM(split_bonus),
SUM(salary)
FROM
empsal
WHERE
report_month IN('2010-12-01','2010-11-01','2010-07-01','2010-04-01','2010-09-01','2010-10-01','2010-08-01')
AND
employee_id IN('100','101','102','103','104','105','106','107')
GROUP BY
report_month,
employee_id;
Assuming you're using Postgres, you might also find window functions useful:
http://www.postgresql.org/docs/9.0/static/tutorial-window.html
Unless I'm mistaken, you want something that resembles the following:
select report_month, employee_id, salary, split_bonus,
sum(salary) over w as sum_salary,
sum(split_bonus) over w as sum_bonus
from empsal
where ...
window w as (partition by employee_id);
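To keep the original GROUP BY and still get a total per report_month and employee_id in the same query, a window function can be applied on top of the aggregate. A hedged PostgreSQL sketch using the column name split_bonus_cumulative from the question:

```sql
SELECT report_month,
       employee_id,
       split_bonus,
       SUM(salary) AS sum_salary,
       -- The inner SUM runs per group (month, employee, split_bonus);
       -- the outer window SUM adds those group totals up per month and employee.
       SUM(SUM(split_bonus)) OVER (PARTITION BY report_month, employee_id)
           AS split_bonus_cumulative
FROM empsal
WHERE report_month IN ('2010-12-01','2010-11-01','2010-07-01','2010-04-01',
                       '2010-09-01','2010-10-01','2010-08-01')
  AND employee_id IN ('100','101','102','103','104','105','106','107')
GROUP BY report_month, employee_id, split_bonus;
```

Window functions are evaluated after GROUP BY, which is why the aggregate can be nested inside the window SUM. Every output row for the same month and employee carries the same split_bonus_cumulative value.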
CTEs are also convenient:
http://www.postgresql.org/docs/9.0/static/queries-with.html
WITH
rows as (
SELECT foo.*
FROM foo
WHERE ...
),
report1 as (
SELECT aggregates
FROM rows
WHERE ...
),
report2 as (
SELECT aggregates
FROM rows
WHERE ...
)
SELECT *
FROM report1, report2, ...