I have this search, but I want an Azure alert to fire when the bandwidth reaches 50%. I have tried the alert setup, but that only sets a threshold on how many results the search found, so I am not sure what needs to be added so the search only triggers on the bandwidth threshold.
AzureMetrics
| where ResourceId contains "ckt"
| where MetricName == "BitsINPerSecond"
| where TimeGenerated > (now() - 12h) and TimeGenerated <= now()
| project TimeGenerated, Resource, inBits = Maximum // the metric is bits/sec, not bytes
| join kind=inner
(
    AzureMetrics
    | where MetricName == "BitsOutPerSecond"
    | where TimeGenerated > (now() - 12h) and TimeGenerated <= now()
    | project TimeGenerated, Resource, outBits = Maximum
)
on TimeGenerated, Resource
| summarize data_in_Gbps = max(inBits) / 1000000000,
    data_out_Gbps = max(outBits) / 1000000000,
    data_total_Gbps = sum(inBits + outBits) / 1000000000
    by bin(TimeGenerated, 1h), Resource
| extend BW_percentage = data_out_Gbps * 100 // percent, assuming a 1 Gbps circuit
| order by TimeGenerated
Add at the end of the query: "| where BW_percentage > 50".
Check that you are happy with the results when you run the query yourself.
Then copy the query into the alert rule and set the threshold to greater than 0, so that it alerts you on any one resource where this is true.
(You can change the 1h to 30m if that is the time span that interests you.)
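Putting it together, the end of the query becomes (a sketch; remember that the * 100 scaling in BW_percentage assumes a 1 Gbps circuit, so adjust it for your actual capacity):

| extend BW_percentage = data_out_Gbps * 100 // percent of assumed 1 Gbps capacity
| where BW_percentage > 50 // keep only rows over half the capacity
| order by TimeGenerated

With this filter in place, the alert rule only ever sees rows that breach the threshold, so a "number of results greater than 0" condition is exactly the trigger you want.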
Long-time watcher, first-time poster, so please be kind to this poor noob....
We're marching forth into Azure and I'm working on the monitoring and alerting side (because no one else is so far). I have successfully created a number of alerts using KQL with Log Analytics, but I'm having issues with an ADF query.
I need something that will alert as Resolved ONLY when the original failed pipeline subsequently shows as Successful. Right now, we're getting a Resolved alert when any other pipeline is successful. Help me Obi Wan Kenobi - you're my only hope.....
Current query is:
let activities = ADFActivityRun
| where Status == 'Failed' and ActivityType !in ('IfCondition', 'ExecutePipeline', 'ForEach')
| project
ActivityName,
ActivityType,
Input,
Output,
ErrorMessage,
Error,
PipelineRunId,
ActivityRunId,
_ResourceId;
ADFPipelineRun
| project RunId, PipelineName, Status, Start, End
| summarize max(Start) by PipelineName
| join kind = inner ADFPipelineRun on $left.PipelineName == $right.PipelineName and $left.max_Start == $right.Start
| project RunId
, TimeGenerated
, ResourceName=split(_ResourceId, '/')[-1]
, PipelineName
, Status
, Start
, End
, Parameters
, Predecessors
| where Status == 'Failed'
| join kind = inner activities on $left.RunId == $right.PipelineRunId
| project TimeGenerated
, ResourceName=split(_ResourceId, '/')[-1]
, PipelineName
, ActivityName
, ActivityType
, Status
, Start
, End
, Parameters
, Error
, PipelineRunId
, ActivityRunId
, Predecessors
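One pattern that might give the behavior described (a sketch, not a tested alert rule): alert on pipelines whose latest completed run is still Failed, so the row disappears once that same pipeline succeeds and the alert can auto-resolve:

ADFPipelineRun
| where Status in ('Failed', 'Succeeded')
| summarize arg_max(Start, Status, RunId) by PipelineName // latest completed run per pipeline
| where Status == 'Failed'

arg_max keeps only the row with the latest Start per PipelineName, so a subsequent Successful run of the same pipeline replaces its Failed row and the result count drops back to zero for that pipeline only.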
I'm just trying to use the pre-existing "Slowest queries - top 5" query from Azure Log Analytics for PostgreSQL Flexible Server. The provided query is:
// Slowest queries
// Identify top 5 slowest queries.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
| where user_id_s != "10" //exclude azure system user
| summarize avg(todouble(mean_time_s)) by event_class_s , db_id_s ,query_id_s
| top 5 by avg_mean_time_s desc
This query results in the error:
'where' operator: Failed to resolve column or scalar expression named 'user_id_s'
If the issue persists, please open a support ticket. Request id: XXXX
I am guessing that something is not configured in order to utilize the user_id_s column. Any assistance is appreciated.
I expect you are checking that the integer value 10 is not equal to user_id_s; in your KQL query you have user_id_s != "10", which compares against a string.
Thanks @venkateshdodda-msft, I am adding your suggestion to help fix the issue.
If you are comparing against an integer in KQL, make sure to remove the double quotes:
// compare as an integer
| where user_id_s != 10
Or convert the value into a string by using:
// convert into a string
| extend user_id_s = tostring(Properties.user_id_s)
| where user_id_s !in ('10')
Modified KQL Query
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
// compare as an integer
| where user_id_s != 10 // exclude Azure system user
| summarize avg(todouble(mean_time_s)) by event_class_s , db_id_s ,query_id_s
| top 5 by avg_mean_time_s desc
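Note that "Failed to resolve column" usually means the column is not present in the table at all, often because the relevant diagnostic data is not being ingested yet. If you want the query to run either way, a defensive variant uses column_ifexists() (a sketch):

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
| where column_ifexists("user_id_s", "") != "10" // no error if the column is missing
| summarize avg(todouble(mean_time_s)) by event_class_s, db_id_s, query_id_s
| top 5 by avg_mean_time_s desc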
References:
Operator failed to resolve table or column expression
Converting integer to string
I have been trying a few different methods to see if I can get this to work, but I haven't had any luck.
Here is what I am trying to accomplish.
Every day, there are cases that get closed. We want to track cases that have been 're-opened' after already having been closed once, but nothing in the information provided tells us that a case is a re-opened case. The only way to do this is to take each Case ID and Report Date and check whether a duplicate Case ID exists that was closed prior to that report date. To complicate matters, here is some additional info:
1) A common situation is that a case is closed, re-opened, and then closed again within the same day (sometimes multiple times). This should count as a re-open each time it happens after the first instance, even if it's the same day (I assume we would group by Case ID?).
2) I run a 5-day reporting window, so a case should NOT count as a re-open until the report date on which its second closure has occurred. For instance, say a case was closed for the first time on 3/20/2019 and then re-opened at some point and closed again on 3/26/2019. On the 3/20, 3/21, 3/22, and 3/25 report days (reporting skips weekends and holidays; that is already built in, so nothing is needed for that), it should NOT be marked as a re-open, because the case still has only one closure on or before the report date we are looking at. On 3/26 it would be marked as a re-open, because it has then been closed a second time on or before the report date.
Here are some queries:
CREATE TABLE ResolvedCases(
Case_ID varchar(20),
Case_Closed_On datetime,
Report_Date date,
Is_ReOpened_Case VarChar(3) NULL
)
INSERT INTO ResolvedCases VALUES('US1236', '2019-02-16 12:30:45', '2/16/2019')
INSERT INTO ResolvedCases VALUES('US1238', '2019-02-28 15:30:45', '2/28/2019')
INSERT INTO ResolvedCases VALUES('US1234', '2019-03-19 12:30:45', '3/19/2019')
INSERT INTO ResolvedCases VALUES('US1234', '2019-03-19 15:30:45', '3/19/2019')
INSERT INTO ResolvedCases VALUES('US1235', '2019-03-20 9:30:45', '3/20/2019')
INSERT INTO ResolvedCases VALUES('US1235', '2019-03-23 12:40:45', '3/23/2019')
INSERT INTO ResolvedCases VALUES('US1236', '2019-03-20 12:30:45', '3/24/2019')
INSERT INTO ResolvedCases VALUES('US1237', '2019-03-25 12:30:45', '3/25/2019')
Expected results (only showing the cases with Report_Date between 3/19 and 3/26):
Case_ID Case_Closed_On Report_Date Is_ReOpened_Case
US1234 2019-03-19 12:30:45 3/19/2019 No (there is a duplicate Case ID on 3/19, but it didn't happen until 3:30 PM; at 12:30 PM it hadn't occurred yet, so this was not a re-open at that time)
US1234 2019-03-19 15:30:45 3/19/2019 Yes
US1235 2019-03-20 9:30:45 3/20/2019 No (there is a duplicate Case ID on 3/23, but on 3/20 it hadn't occurred yet, so this was not a re-open on that date)
US1235 2019-03-23 12:40:45 3/23/2019 Yes
US1236 2019-03-20 12:30:45 3/24/2019 Yes (because of the case closed on 2/16/2019, even though it doesn't show in this query)
US1237 2019-03-25 12:30:45 3/25/2019 No
Any help would be appreciated with this...
I have something that shows the count of each Case ID, which gives me all the duplicates for a given date range grouped by Case_ID, but I am not sure how to mark each individual row as a re-open or not based on the requirements above...
For your immediate problem, you can use LAG to update your table with the flag you're looking for. (It returns a NULL if there's no preceding value, hence the logic in the CASE statement.)
UPDATE rc
SET rc.Is_ReOpened_Case = sq.Is_ReOpened_Case
FROM
    ResolvedCases AS rc
    LEFT JOIN
    (
        SELECT
            Case_ID
            ,Case_Closed_On
            ,Report_Date
            ,Is_ReOpened_Case =
                CASE
                    WHEN LAG(Case_ID) OVER (PARTITION BY Case_ID ORDER BY Case_Closed_On) IS NOT NULL
                        THEN 'Yes'
                    ELSE 'No'
                END
        FROM ResolvedCases
    ) AS sq
        ON sq.Case_ID = rc.Case_ID
        AND sq.Case_Closed_On = rc.Case_Closed_On
WHERE
    COALESCE(rc.Is_ReOpened_Case, '') <> COALESCE(sq.Is_ReOpened_Case, '');

SELECT
    rc.*
FROM ResolvedCases AS rc
WHERE rc.Report_Date >= '20190319' AND rc.Report_Date < '20190326'
ORDER BY Case_ID, Case_Closed_On;
Results:
+---------+-------------------------+-------------+------------------+
| Case_ID | Case_Closed_On | Report_Date | Is_ReOpened_Case |
+---------+-------------------------+-------------+------------------+
| US1234 | 2019-03-19 12:30:45.000 | 2019-03-19 | No |
| US1234 | 2019-03-19 15:30:45.000 | 2019-03-19 | Yes |
| US1235 | 2019-03-20 09:30:45.000 | 2019-03-20 | No |
| US1235 | 2019-03-23 12:40:45.000 | 2019-03-23 | Yes |
| US1236 | 2019-03-20 12:30:45.000 | 2019-03-24 | Yes |
| US1237 | 2019-03-25 12:30:45.000 | 2019-03-25 | No |
+---------+-------------------------+-------------+------------------+
But thereafter you'll need to do something with the code that populates this table to maintain those values for future entries. That might require a two-step solution, but you'll have to decide after you review that code set. Maybe just run that UPDATE after the data load.
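If maintaining a persisted flag proves awkward, an alternative (a sketch against the same schema) is to compute the flag on the fly in a view, so there is no post-load maintenance step at all; ROW_NUMBER() > 1 is equivalent to the LAG IS NOT NULL check here, since any closure after the first for a given Case_ID is a re-open:

CREATE VIEW dbo.ResolvedCasesFlagged
AS
SELECT
    Case_ID
    ,Case_Closed_On
    ,Report_Date
    ,Is_ReOpened_Case =
        CASE
            WHEN ROW_NUMBER() OVER (PARTITION BY Case_ID ORDER BY Case_Closed_On) > 1
                THEN 'Yes'
            ELSE 'No'
        END
FROM dbo.ResolvedCases;

Reports can then select from the view and always see a flag consistent with the current data, at the cost of evaluating the window function on each query.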
Has anyone got any idea how I could optimize this query so that it runs faster? Right now it takes up to 30 seconds to retrieve around 3k "containers", and that's way too long. It is foreseen that it'll have to retrieve around 1 million records.
Query query = em().createNativeQuery("SELECT * FROM CONTAINER where TO_CHAR(CREATION_DATE, 'YYYY-MM-DD') >= TO_CHAR(:from, 'YYYY-MM-DD') " +
"AND TO_CHAR(CREATION_DATE, 'YYYY-MM-DD') <= TO_CHAR(:to, 'YYYY-MM-DD') ", Container.class);
query.setParameter("from", from);
query.setParameter("to", to);
return query.getResultList();
JPA 2.0, Oracle DB
EDIT: I've got an index on the CREATION_DATE column:
CREATE INDEX IDX_CONTAINER_CREATION_DATE
ON CONTAINER (CREATION_DATE);
it's not a named query because the TO_CHAR function doesn't seem to be supported by JPA 2.0, and I've read that an index should make the query faster.
My explain plan (still doing a full table scan for some reason instead of using the index):
---------------------------------------
| Id | Operation | Name |
---------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| CONTAINER |
---------------------------------------
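If keeping the TO_CHAR comparison were a hard requirement, one option (a sketch; the index name is made up) is an Oracle function-based index on the exact expression the query filters on, so the full scan can be avoided without touching the SQL, though comparing the dates directly (as the JPQL version below happens to do) is usually the cleaner fix:

CREATE INDEX IDX_CONTAINER_CREATION_DAY
ON CONTAINER (TO_CHAR(CREATION_DATE, 'YYYY-MM-DD'));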
One fix I don't like:
I've done the following..
TypedQuery<Container> query = em().createQuery(
"SELECT NEW Container(c.barcode, c.createdBy, c.creationDate, c.owner, c.sequence, c.containerSizeBarcode, c.a, c.b, c.c) " +
"FROM Container c where c.creationDate >= :from AND c.creationDate <= :to", Container.class);
and I've added an absurdly long constructor to Container, and this fixes the loading times. But this is really ugly and I don't want it, to be honest. Anyone have any other suggestions?
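A likely explanation (an observation, not a verified diagnosis): wrapping CREATION_DATE in TO_CHAR makes the predicate non-sargable, which is consistent with the TABLE ACCESS FULL in the explain plan, and the JPQL version above happens to compare the column directly, which may be the real reason it got faster. Here is a sketch that keeps a plain entity query but still compares the dates directly (assuming from and to are java.util.Date and TemporalType is javax.persistence.TemporalType):

// Compare the indexed column directly so Oracle can use
// IDX_CONTAINER_CREATION_DATE instead of a full table scan.
TypedQuery<Container> query = em().createQuery(
    "SELECT c FROM Container c " +
    "WHERE c.creationDate >= :from AND c.creationDate <= :to", Container.class);
query.setParameter("from", from, TemporalType.TIMESTAMP);
query.setParameter("to", to, TemporalType.TIMESTAMP);
return query.getResultList();

If the constructor-expression version is still noticeably faster than this, the remaining cost is probably entity materialization (for example, eagerly loaded associations) rather than the query itself.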
I used to write my EXISTS checks like this:
IF EXISTS (SELECT * FROM TABLE WHERE Columns=@Filters)
BEGIN
    UPDATE TABLE SET ColumnsX=ValuesX WHERE Columns=@Filters
END
One of the DBAs in a previous life told me that when I do an EXISTS clause, I should use SELECT 1 instead of SELECT *:
IF EXISTS (SELECT 1 FROM TABLE WHERE Columns=@Filters)
BEGIN
    UPDATE TABLE SET ColumnsX=ValuesX WHERE Columns=@Filters
END
Does this really make a difference?
No, SQL Server is smart and knows it is being used for an EXISTS, and returns NO DATA to the system.
Quoth Microsoft:
http://technet.microsoft.com/en-us/library/ms189259.aspx?ppud=4
The select list of a subquery introduced by EXISTS almost always consists of an asterisk (*). There is no reason to list column names because you are just testing whether rows that meet the conditions specified in the subquery exist.
To check yourself, try running the following:
SELECT whatever
FROM yourtable
WHERE EXISTS ( SELECT 1/0
               FROM someothertable
               WHERE a_valid_clause )
If it were actually doing something with the SELECT list, it would throw a divide-by-zero error. It doesn't.
EDIT: Note, the SQL Standard actually talks about this.
ANSI SQL 1992 Standard, pg 191 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
3) Case:
a) If the <select list> "*" is simply contained in a <subquery> that is immediately contained in an <exists predicate>, then the <select list> is equivalent to a <value expression> that is an arbitrary <literal>.
The reason for this misconception is presumably the belief that it will end up reading all columns. It is easy to see that this is not the case.
CREATE TABLE T
(
X INT PRIMARY KEY,
Y INT,
Z CHAR(8000)
)
CREATE NONCLUSTERED INDEX NarrowIndex ON T(Y)
IF EXISTS (SELECT * FROM T)
PRINT 'Y'
The resulting plan shows that SQL Server was able to use the narrowest index available (NarrowIndex) to check the result, despite the fact that the index does not include all columns. The index access is under a semi-join operator, which means it can stop scanning as soon as the first row is returned.
So it is clear the above belief is wrong.
However Conor Cunningham from the Query Optimiser team explains here that he typically uses SELECT 1 in this case as it can make a minor performance difference in the compilation of the query.
The QP will take and expand all *'s early in the pipeline and bind them to objects (in this case, the list of columns). It will then remove unneeded columns due to the nature of the query.
So for a simple EXISTS subquery like this:
SELECT col1 FROM MyTable WHERE EXISTS (SELECT * FROM Table2 WHERE MyTable.col1=Table2.col2)
The * will be expanded to some potentially big column list and then it will be determined that the semantics of the EXISTS does not require any of those columns, so basically all of them can be removed.
"SELECT 1" will avoid having to examine any unneeded metadata for that table during query compilation.
However, at runtime the two forms of the query will be identical and will have identical runtimes.
I tested four possible ways of expressing this query on an empty table with various numbers of columns: SELECT 1 vs SELECT * vs SELECT Primary_Key vs SELECT Other_Not_Null_Column.
I ran the queries in a loop using OPTION (RECOMPILE) and measured the average number of executions per second. Results below:
+-------------+----------+---------+---------+--------------+
| Num of Cols | * | 1 | PK | Not Null col |
+-------------+----------+---------+---------+--------------+
| 2 | 2043.5 | 2043.25 | 2073.5 | 2067.5 |
| 4 | 2038.75 | 2041.25 | 2067.5 | 2067.5 |
| 8 | 2015.75 | 2017 | 2059.75 | 2059 |
| 16 | 2005.75 | 2005.25 | 2025.25 | 2035.75 |
| 32 | 1963.25 | 1967.25 | 2001.25 | 1992.75 |
| 64 | 1903 | 1904 | 1936.25 | 1939.75 |
| 128 | 1778.75 | 1779.75 | 1799 | 1806.75 |
| 256 | 1530.75 | 1526.5 | 1542.75 | 1541.25 |
| 512 | 1195 | 1189.75 | 1203.75 | 1198.5 |
| 1024 | 694.75 | 697 | 699 | 699.25 |
+-------------+----------+---------+---------+--------------+
| Total | 17169.25 | 17171 | 17408 | 17408 |
+-------------+----------+---------+---------+--------------+
As can be seen, there is no consistent winner between SELECT 1 and SELECT *, and the difference between the two approaches is negligible. The SELECT Not Null col and SELECT PK variants do appear slightly faster, though.
All four of the queries degrade in performance as the number of columns in the table increases.
As the table is empty, this relationship seems explicable only by the amount of column metadata. For COUNT(1) it is easy to see that it gets rewritten to COUNT(*) at some point in the process, from the output below.
SET SHOWPLAN_TEXT ON;
GO
SELECT COUNT(1)
FROM master..spt_values
Which gives the following plan
|--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[Expr1004],0)))
|--Stream Aggregate(DEFINE:([Expr1004]=Count(*)))
|--Index Scan(OBJECT:([master].[dbo].[spt_values].[ix2_spt_values_nu_nc]))
Attaching a debugger to the SQL Server process and randomly breaking whilst executing the below:
DECLARE @V int
WHILE (1=1)
SELECT @V=1 WHERE EXISTS (SELECT 1 FROM ##T) OPTION(RECOMPILE)
I found that in the cases where the table has 1,024 columns, most of the time the call stack looks something like the below, indicating that a large proportion of the time is indeed being spent loading column metadata even when SELECT 1 is used (for the case where the table has a single column, randomly breaking didn't hit this part of the call stack in 10 attempts):
sqlservr.exe!CMEDAccess::GetProxyBaseIntnl() - 0x1e2c79 bytes
sqlservr.exe!CMEDProxyRelation::GetColumn() + 0x57 bytes
sqlservr.exe!CAlgTableMetadata::LoadColumns() + 0x256 bytes
sqlservr.exe!CAlgTableMetadata::Bind() + 0x15c bytes
sqlservr.exe!CRelOp_Get::BindTree() + 0x98 bytes
sqlservr.exe!COptExpr::BindTree() + 0x58 bytes
sqlservr.exe!CRelOp_FromList::BindTree() + 0x5c bytes
sqlservr.exe!COptExpr::BindTree() + 0x58 bytes
sqlservr.exe!CRelOp_QuerySpec::BindTree() + 0xbe bytes
sqlservr.exe!COptExpr::BindTree() + 0x58 bytes
sqlservr.exe!CScaOp_Exists::BindScalarTree() + 0x72 bytes
... Lines omitted ...
msvcr80.dll!_threadstartex(void * ptd=0x0031d888) Line 326 + 0x5 bytes C
kernel32.dll!_BaseThreadStart@8() + 0x37 bytes
This manual profiling attempt is backed up by the VS 2012 code profiler, which shows a very different selection of functions consuming the compilation time for the two cases (Top 15 Functions 1024 columns vs Top 15 Functions 1 column).
Both the SELECT 1 and SELECT * versions wind up checking column permissions and fail if the user is not granted access to all columns in the table.
An example I cribbed from a conversation on the heap
CREATE USER blat WITHOUT LOGIN;
GO
CREATE TABLE dbo.T
(
X INT PRIMARY KEY,
Y INT,
Z CHAR(8000)
)
GO
GRANT SELECT ON dbo.T TO blat;
DENY SELECT ON dbo.T(Z) TO blat;
GO
EXECUTE AS USER = 'blat';
GO
SELECT 1
WHERE EXISTS (SELECT 1
FROM T);
/* ↑↑↑↑
Fails unexpectedly with
The SELECT permission was denied on the column 'Z' of the
object 'T', database 'tempdb', schema 'dbo'.*/
GO
REVERT;
DROP USER blat
DROP TABLE T
So one might speculate that the minor apparent difference when using SELECT some_not_null_col is that it only winds up checking permissions on that specific column (though it still loads the metadata for all of them). However, this doesn't seem to fit the facts, as the percentage difference between the two approaches, if anything, gets smaller as the number of columns in the underlying table increases.
In any event, I won't be rushing out to change all my queries to this form, as the difference is very minor and only apparent during query compilation. Removing the OPTION (RECOMPILE) so that subsequent executions can use a cached plan gave the following:
+-------------+-----------+------------+-----------+--------------+
| Num of Cols | * | 1 | PK | Not Null col |
+-------------+-----------+------------+-----------+--------------+
| 2 | 144933.25 | 145292 | 146029.25 | 143973.5 |
| 4 | 146084 | 146633.5 | 146018.75 | 146581.25 |
| 8 | 143145.25 | 144393.25 | 145723.5 | 144790.25 |
| 16 | 145191.75 | 145174 | 144755.5 | 146666.75 |
| 32 | 144624 | 145483.75 | 143531 | 145366.25 |
| 64 | 145459.25 | 146175.75 | 147174.25 | 146622.5 |
| 128 | 145625.75 | 143823.25 | 144132 | 144739.25 |
| 256 | 145380.75 | 147224 | 146203.25 | 147078.75 |
| 512 | 146045 | 145609.25 | 145149.25 | 144335.5 |
| 1024 | 148280 | 148076 | 145593.25 | 146534.75 |
+-------------+-----------+------------+-----------+--------------+
| Total | 1454769 | 1457884.75 | 1454310 | 1456688.75 |
+-------------+-----------+------------+-----------+--------------+
The test script I used can be found here
The best way to know is to performance-test both versions and check out their execution plans. Pick a table with lots of columns.
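For example, mirroring the SET SHOWPLAN_TEXT technique used above (a sketch; dbo.SomeWideTable and SomeCol are placeholder names for your own table and filter):

SET SHOWPLAN_TEXT ON;
GO
IF EXISTS (SELECT * FROM dbo.SomeWideTable WHERE SomeCol = 1) PRINT 'found';
GO
IF EXISTS (SELECT 1 FROM dbo.SomeWideTable WHERE SomeCol = 1) PRINT 'found';
GO
SET SHOWPLAN_TEXT OFF;
GO

If the two forms really are equivalent for your table, the two plans printed will be identical.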
There is no difference in SQL Server and it has never been a problem in SQL Server. The optimizer knows that they are the same. If you look at the execution plans, you will see that they are identical.
Personally I find it very, very hard to believe that they don't optimize to the same query plan. But the only way to know in your particular situation is to test it. If you do, please report back!
No real difference, but there might be a very small performance hit. As a rule of thumb, you should not ask for more data than you need.