Rearrange lines of a file in Unix - perl

I have a file in the format below. This is a snippet; the original file runs over a million lines.
In fact, ABC/DEF/GH9j/etc. marks the start of a line, and ";" is supposed to mark the end of the line.
In this example, line01, line10 and line13 are complete, but the other lines are scattered across multiple lines:
line01: ABC abc_123 ( .Y ( B2b ) , .A ( sel ) );
line02: DEF def_456 ( .Z ( n_2 ) , .in ( 1b0 ) ,
line03: .tstin ( sel ) , .tstmb ( DD ) );
line04: GH9j 3_inst ( .Q0 ( CC3 ) , .Q1 ( Ee ) ,
line05: .Q2 ( p_2 ) , .Q3 ( cin ) ,
line06: .D0 ( AA ) , .D1 ( rdata[5] ) ,
line07: .D2 ( gg ) , .D3 ( hp ) ,
line08: .SE0 ( sel ) , .SE1 ( sel ) ,
line09: .SE2 ( pqr ) , .SE3 ( AA ) , .CK ( Bb ) );
line10: BUF 4PQR ( .Y ( eE ) , .A ( cC ) );
line11: MX2 MnOp ( .X ( DD ) , .A ( PQR_11 ) ,
line12: .B ( trstb ) , .S0 ( klm2 ) );
line13: BUFH 6th_inst ( .Zz ( AA ) , .A ( B2B ) );
.......
Q. I want to rearrange all lines like below:
All the ports of one instance should be on ONE LINE (the 13 lines should reduce to 6 lines):
line01: ABC abc_123 ( .Y ( B2b ) , .A ( sel ) );
line02: DEF def_456 ( .Z ( n_2 ) , .in ( 1b0 ) , .tstin ( sel ) , .tstmb ( DD ) );
line03: GH9j 3_inst ( .Q0 ( CC3 ) , .Q1 ( Ee ) , .Q2 ( p_2 ) , .Q3 ( cin ) , .D0 ( AA ) , .D1 ( rdata[5] ) , .D2 ( gg ) , .D3 ( hp ) , .SE0 ( sel ) , .SE1 ( sel ) , .SE2 ( pqr ) , .SE3 ( AA ) , .CK ( Bb ) );
line04: BUF 4PQR ( .Y ( eE ) , .A ( cC ) );
line05: MX2 MnOp ( .X ( DD ) , .A ( PQR_11 ) , .B ( trstb ) , .S0 ( klm2 ) );
line06: BUFH 6th_inst ( .Zz ( AA ) , .A ( B2B ) );

A one-liner might work: perl -pe 'chomp unless m/\;/' file.txt - though with millions of lines some tweaking might be needed.
Here is the one-liner as a script:
#!/usr/bin/env perl
while (<DATA>) {
    chomp unless m/\;/;
    print;
}
__DATA__
ABC abc_123 ( .Y ( B2b ) , .A ( sel ) );
DEF def_456 ( .Z ( n_2 ) , .in ( 1b0 ) ,
.tstin ( sel ) , .tstmb ( DD ) );
GH9j 3_inst ( .Q0 ( CC3 ) , .Q1 ( Ee ) ,
.Q2 ( p_2 ) , .Q3 ( cin ) ,
.D0 ( AA ) , .D1 ( rdata[5] ) ,
.D2 ( gg ) , .D3 ( hp ) ,
.SE0 ( sel ) , .SE1 ( sel ) ,
.SE2 ( pqr ) , .SE3 ( AA ) , .CK ( Bb ) );
BUF 4PQR ( .Y ( eE ) , .A ( cC ) );
MX2 MnOp ( .X ( DD ) , .A ( PQR_11 ) ,
.B ( trstb ) , .S0 ( klm2 ) );
BUFH 6th_inst ( .Zz ( AA ) , .A ( B2B ) );
Output:
ABC abc_123 ( .Y ( B2b ) , .A ( sel ) );
DEF def_456 ( .Z ( n_2 ) , .in ( 1b0 ) ,.tstin ( sel ) , .tstmb ( DD ) );
GH9j 3_inst ( .Q0 ( CC3 ) , .Q1 ( Ee ) ,.Q2 ( p_2 ) , .Q3 ( cin ) ,.D0 ( AA ) , .D1 ( rdata[5] ) ,.D2 ( gg ) , .D3 ( hp ) ,.SE0 ( sel ) , .SE1 ( sel ) ,.SE2 ( pqr ) , .SE3 ( AA ) , .CK ( Bb ) );
BUF 4PQR ( .Y ( eE ) , .A ( cC ) );
MX2 MnOp ( .X ( DD ) , .A ( PQR_11 ) ,.B ( trstb ) , .S0 ( klm2 ) );
BUFH 6th_inst ( .Zz ( AA ) , .A ( B2B ) );
If you need to preserve the line numbers (line01:), add a note to that effect in the question.

Here is a sed solution:
sed -n 'H;/;/{s/.*//;x;s/\n//g;p;}' filename

Without seeing your whole dataset, or at least large chunks of it, it's hard to know what pattern to match on. In the example you give, though, the lines you want to merge end in a comma followed by a newline, so simply changing ",\n" to "," might do the trick:
sed -e ':begin;$!N;s/,\n/,/;tbegin;P;D' in.txt
outputs:
ABC abc_123 ( .Y ( B2b ) , .A ( sel ) );
DEF def_456 ( .Z ( n_2 ) , .in ( 1b0 ) ,.tstin ( sel ) , .tstmb ( DD ) );
GH9j 3_inst ( .Q0 ( CC3 ) , .Q1 ( Ee ) ,.Q2 ( p_2 ) , .Q3 ( cin ) ,.D0 ( AA ) , .D1 ( rdata[5] ) ,.D2 ( gg ) , .D3 ( hp ) ,.SE0 ( sel ) , .SE1 ( sel ) ,.SE2 ( pqr ) , .SE3 ( AA ) , .CK ( Bb ) );
BUF 4PQR ( .Y ( eE ) , .A ( cC ) );
MX2 MnOp ( .X ( DD ) , .A ( PQR_11 ) ,.B ( trstb ) , .S0 ( klm2 ) );
BUFH 6th_inst ( .Zz ( AA ) , .A ( B2B ) );
See http://backreference.org/2009/12/23/how-to-match-newlines-in-sed/ for more details on matching newlines in sed.

A one-liner in AWK
awk '{printf "%s%s", $0, /;\s*$/ ? "\n" : " "}' filename
It outputs each line followed by a separator that is either a newline or a space, depending on whether ";" is found at the end of the line. (Note that \s in the regex is a GNU awk extension; with POSIX awk, use /;[[:space:]]*$/.)
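If you ever need more control than the one-liners give (for example normalizing whitespace, or handling a trailing fragment with no terminating ";"), the same join-until-semicolon logic is a short Python sketch; `join_statements` is a hypothetical helper name, not an existing tool:

```python
def join_statements(lines):
    """Accumulate physical lines until one ends with ';', then
    emit the logical line with the pieces joined by single spaces."""
    buf = []
    for line in lines:
        buf.append(line.strip())
        if buf[-1].endswith(';'):
            yield ' '.join(buf)
            buf = []
    if buf:  # trailing fragment that never saw a ';'
        yield ' '.join(buf)

# Two scattered lines from the question become one logical line:
merged = list(join_statements([
    'DEF def_456 ( .Z ( n_2 ) , .in ( 1b0 ) ,',
    '.tstin ( sel ) , .tstmb ( DD ) );',
]))
```

Because it streams line by line and buffers only one statement at a time, memory use stays flat even on million-line files.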

Related

Getting last 7 days' data on a particular entity and displaying it in an SSRS report

I have a case entity, and I need to get how many cases were created in the last 7 days and how many were closed in the last 7 days. I need to show this on a chart: for each of the last 7 days, how many were created and how many were closed.
Can someone help with this? I have written a SQL query to fetch the data, but it is not working; or should I go for an expression instead?
This will produce the last 7 days even if there is no data, by using a calendar table:
WITH
calendar
AS
(
SELECT
[date] = CAST(GETDATE() AS DATE)
, [day_count] = 1
UNION ALL
SELECT
[date] = DATEADD(DAY, -1, [date])
, [day_count] = [day_count] + 1
FROM
[calendar]
WHERE
[day_count] < 7
)
,
tablecase
AS
(
SELECT tbl.* FROM (VALUES
( 1, '01-Jan-2023', '11-Jan-2023')
, ( 2, '01-Jan-2023', '12-Jan-2023')
, ( 3, '03-Jan-2023', '13-Jan-2023')
, ( 4, '04-Jan-2023', '14-Jan-2023')
, ( 5, '06-Jan-2023', '15-Jan-2023')
, ( 6, '06-Jan-2023', '16-Jan-2023')
, ( 7, '06-Jan-2023', '17-Jan-2023')
, ( 8, '11-Jan-2023', '18-Jan-2023')
, ( 9, '11-Jan-2023', '19-Jan-2023')
, ( 10, '11-Jan-2023', '20-Jan-2023')
, ( 11, '11-Jan-2023', '21-Jan-2023')
, ( 12, '12-Jan-2023', '22-Jan-2023')
, ( 13, '13-Jan-2023', '23-Jan-2023')
, ( 14, '14-Jan-2023', '24-Jan-2023')
, ( 15, '15-Jan-2023', '25-Jan-2023')
, ( 16, '16-Jan-2023', '26-Jan-2023')
) tbl ([item_id], [orderdate], [duedate])
)
SELECT
cal.[date]
, [closed] = ISNULL([closed], 0)
, [opened] = ISNULL([opened], 0)
FROM
calendar AS cal
LEFT JOIN
(
SELECT
[date] = CAST([orderdate] AS DATE)
, [measure] = COUNT(1)
, [action] = 'closed'
FROM
[tablecase]
WHERE
[orderdate] IS NOT NULL
GROUP BY
[orderdate]
UNION ALL
SELECT
[date] = CAST([duedate] AS DATE)
, [measure] = COUNT(1)
, [action] = 'opened'
FROM
[tablecase]
WHERE
[duedate] IS NOT NULL
GROUP BY
[duedate]
) AS a
PIVOT
(
SUM([measure]) FOR [action] IN
(
[closed], [opened]
)
) AS pvt ON pvt.[date] = cal.[date];
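The shape of that result can be sanity-checked with a quick Python sketch of the same idea: generate all seven dates first, then left-join the counts onto them, so days with no cases still show up as zeros (the case dates below are invented for illustration):

```python
from datetime import date, timedelta
from collections import Counter

today = date(2023, 1, 16)  # pretend "today" so the example is stable
calendar = [today - timedelta(days=n) for n in range(7)]

# Invented case-creation and case-closure dates
created = Counter([date(2023, 1, 11)] * 4 + [date(2023, 1, 16)])
closed = Counter([date(2023, 1, 15), date(2023, 1, 16)])

# The "left join": every calendar day appears; missing counts become 0
report = [(d, created.get(d, 0), closed.get(d, 0)) for d in calendar]
```

Every one of the seven days is present in `report`, including days where both counts are zero, which is exactly what joining the fact data against a calendar table guarantees.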

How to `sum( DISTINCT <column> ) OVER ()` using window function?

I have the following data:
Here I have already calculated the total per conf_id, but I also want to calculate the total for the whole partition, e.g.:
Calculate the total suma per agreement across each of its orders (not per the goods in an order, which round with slight differences).
How do I sum 737.38 and 1238.3, i.e. take only one number from each group?
(I cannot sum( item_suma ), because that returns 1975.67. Notice the rounding of conf_suma as an intermediate step.)
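The arithmetic behind the question can be sketched in plain Python (the decimal figures below are invented so that they reproduce the same rounding mismatch; they are not from the real data):

```python
from decimal import Decimal, ROUND_HALF_UP

# item_suma values keyed by (agreement_id, order_id); invented figures
items = {
    (3385, 1): [Decimal('368.6875'), Decimal('368.6876')],
    (3385, 2): [Decimal('619.1481'), Decimal('619.1482')],
}

def r2(x):  # analogue of ::numeric(10,2)
    return x.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

# conf_suma: the per-group sum, rounded once per group
conf_suma = {k: r2(sum(v)) for k, v in items.items()}

# What the question wants: each group contributes its rounded sum once
total_suma = sum(conf_suma.values())

# What sum(item_suma) does: round only after summing the raw items
naive_total = r2(sum(x for v in items.values() for x in v))
```

Here `total_suma` is 1975.68 (737.38 + 1238.30) while `naive_total` is 1975.67, which is exactly the discrepancy described above: the total must be built from one rounded number per group, not from the raw items.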
UPD
The full query. Here I want to calculate the rounded suma for each group, then the total suma across those groups:
SELECT app_period( '2021-02-01', '2021-03-01' );
WITH
target_date AS ( SELECT '2021-02-01'::timestamptz ),
target_order as (
SELECT
tstzrange( '2021-01-01', '2021-02-01') as bill_range,
o.*
FROM ( SELECT * FROM "order_bt" WHERE sys_period #> sys_time() ) o
WHERE FALSE
OR o.agreement_id = 3385 and o.period_id = 10
),
USAGE AS ( SELECT
ocd.*,
o.agreement_id as agreement_id,
o.id AS order_id,
(dense_rank() over (PARTITION BY o.agreement_id ORDER BY o.id )) as zzzz_id,
(dense_rank() over (PARTITION BY o.agreement_id, o.id ORDER BY (ocd.ic).consumed_period )) as conf_id,
sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id ) AS agreement_suma2,
(sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_suma,
(sum( ocd.item_cost ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_cost,
(sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_suma,
(sum( ocd.item_cost ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_cost,
max((ocd.ic).consumed) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ) AS consumed,
(sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id )) AS order_suma2
FROM target_order o
LEFT JOIN order_cost_details( o.bill_range ) ocd
ON (ocd.o).id = o.id AND (ocd.ic).consumed_period && o.app_period
)
SELECT
*,
(conf_suma/6) ::numeric( 10, 2 ) as group_nds,
(SELECT sum(x) from (SELECT sum( DISTINCT conf_suma ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_suma,
(SELECT sum(x) from (SELECT (sum( DISTINCT conf_suma ) /6)::numeric( 10, 2 ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_nds
FROM USAGE
WINDOW w AS ( PARTITION BY usage.agreement_id ROWS CURRENT ROW EXCLUDE TIES)
ORDER BY
order_id,
conf_id
My old question
I found a solution; see dbfiddle.
To run a window function over distinct values, I take the first value from each peer group. To do this I:
aggregate the IDs of the rows in the peer group,
lag this aggregation by one row,
mark rows that are not yet aggregated (i.e. the first row of each peer group) as _distinct,
sum( ) FILTER ( WHERE _distinct ) OVER ( ... )
Voilà: you get a sum over DISTINCT values in the target PARTITION, something PostgreSQL does not yet implement natively.
with data as (
select * from (values
( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057),
( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
) t (id, agreement_id, order_id, suma)
),
intermediate as (select
*,
sum( suma ) over ( partition by agreement_id, order_id ) as fract_order_suma,
sum( suma ) over ( partition by agreement_id ) as fract_agreement_total,
(sum( suma::numeric(10,2) ) over ( partition by agreement_id, order_id )) as wrong_order_suma,
(sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma,
(sum( suma ) over ( partition by agreement_id ))::numeric( 10, 2) as wrong_agreement_total,
id as xid,
array_agg( id ) over ( partition by agreement_id, order_id ) as agg
from data),
distinc as (select *,
lag( agg ) over ( partition by agreement_id ) as prev,
id = any (lag( agg ) over ()) is not true as _distinct, -- allow to match first ID from next peer
order_suma as xorder_suma, -- repeat column to easily visually compare with _distinct
(SELECT sum(x) from (SELECT sum( DISTINCT order_suma ) AS x FROM intermediate sub_q WHERE sub_q.agreement_id = intermediate.agreement_id GROUP BY agreement_id, order_id) t) as correct_total_suma
from intermediate
)
select
*,
sum( order_suma ) filter ( where _distinct ) over ( partition by agreement_id ) as also_correct_total_suma
from distinc
A better approach (dbfiddle):
Assign a row number within each order: row_number() over (partition by agreement_id, order_id ) as nrow
Take only the first suma: filter ( where nrow = 1 )
with data as (
select * from (values
( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057),
( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
) t (id, agreement_id, order_id, suma)
),
intermediate as (select
*,
row_number() over (partition by agreement_id, order_id ) as nrow,
(sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma
from data)
select
*,
sum( order_suma ) filter (where nrow = 1) over (partition by agreement_id)
from intermediate

Looking for ways to optimize DB2 query

I have a DB2 query, below. I am looking for ways to improve its speed. I have tried Visual Explain, but no indexes were advised by the index advisor. Can somebody take a look at this and advise whether anything can be done?
Here is the view:
CREATE OR REPLACE VIEW PSCMPORDVW
AS
WITH INPROGRESS AS
(
SELECT
DIODR#
, DIDISP
, DIUNIT
, DISTST
, DIAPRV
, DIETAD
, DITRLR AS TRAILER_ID
, DIDR1
, DIETAT
FROM
LOAD
WHERE
DIETAD <> 0
AND DIETAT <> '0000'
ORDER BY
1
)
, STOPGROUP AS
(
SELECT
SOORD STOPORDER
, COUNT(*) STOPSREMAIN
, MIN(SOSTP#) NEXTSTOP
, MAX(SOAPPR) APPTREQ
FROM
STOPOFF
INNER JOIN
INPROGRESS
ON
DIODR# = SOORD
WHERE
SOARDT = 0
GROUP BY
SOORD
ORDER BY
1
)
, STOPAPPTS AS
(
SELECT
SOORD APPTORDER
, SOCUST STOPCUST
, SOEDA ETADATE
, SOETA ETATIME
, SOADT1 EARLYDATE
, SOATM1 EARLYTIME
, SOADT2 LATEDATE
, SOATM2 LATETIME
, SOCTYC NEXTCITY
, SOSTP# APPTSTOP
, SOST NEXTSTATE
FROM
STOPOFF
INNER JOIN
STOPGROUP
ON
STOPORDER = SOORD
AND NEXTSTOP = SOSTP#
)
SELECT
ORDER_NUMBER
, SHIPPER_ID
, SHIPPER_NAME
, SHIPPER_ADDRESS_1
, SHIPPER_ADDRESS_2
, SHIPPER_CITY
, SHIPPER_ST
, SHIPPER_ZIP
, SHIPPER_ZIP_EXT
, LOAD_AT_ID
, LOAD_AT_NAME
, LOAD_AT_ADDRESS_1
, LOAD_AT_ADDRESS_2
, LOAD_AT_CITY
, LOAD_AT_ST
, LOAD_AT_ZIP
, LOAD_AT_ZIP_EXT
, LOAD_AT_LATITUDE
, LOAD_AT_LONGITUDE
, EARLY_PU_DATE_TIME
, LATE_PU_DATE_TIME
, EARLY_DELV_DATE_TIME
, EST_REVENUE
, ORDER_DIV
, CONSIGNEE_ID
, CONSIGNEE_NAME
, CONSIGNEE_ADDRESS_1
, CONSIGNEE_ADDRESS_2
, CONSIGNEE_CITY
, CONSIGNEE_ST
, CONSIGNEE_ZIP
, CONSIGNEE_ZIP_EXT
, CONSIGNEE_LATITUDE
, CONSIGNEE_LONGITUDE
, TRAILER_TYPE
, ORDER_MESSAGE
, ADDITIONAL_STOPS
, CMDTY_CODE
, CMDTY_DESCRIPTION
, ORDER_MILES
, ORDER_WGT
, ORIGIN_CITY_CODE
, ORIGIN_CITY
, ORIGIN_ST
, DEST_CITY_CODE
, DEST_CITY_NAME
, DEST_ST
, PICK_UP_AREA
, PLAN_INFO
, NUMBER_LDS
, NUMBER_DISP
, SHIP_DATE_TIME
, NEW_PICKUP_AREA
, EQUIPMENT_NUMBER
, APPT_REQ
, APPT_MADE
, PRE_T_SEQ
, PRE_T_AREA
, LOAD_DISPATCHED
, CUST_SERV_REP
, NEGOTIATIONS
,
(
CASE
WHEN UNUNIT IS NOT NULL
THEN UNUNIT
ELSE ' '
END
)
UNIT_DISPATCHED
,
(
CASE
WHEN UNSUPR IS NOT NULL
THEN UNSUPR
ELSE ' '
END
)
DRIVER_MGR_CODE
, COALESCE(SUPNAM, ' ') DRIVER_MGR_NAME
,
(
CASE
WHEN UNFMGR IS NOT NULL
THEN UNFMGR
ELSE ' '
END
)
FLEET_MGR_CODE
, COALESCE(FLTNAM, ' ') FLEET_MGR_NAME
,
(
CASE
WHEN UNTRL1 IS NOT NULL
THEN UNTRL1
ELSE ' '
END
)
TRAILER_ID
, DIDISP DISPATCH_NUMBER
, (COALESCE(BCMCNEW, ' ')) FED_MC_ID
, DIUNIT DISPATCHED_UNIT
, CASE
WHEN UNETAD <> 0
AND UNETAT = ''
THEN CVTDATETIM(CHAR(UNETAD),'0000', (
SELECT
SUBSTR(DATA_AREA_VALUE, 1109, 2) AS TIMEZONE
FROM
TABLE(QSYS2.DATA_AREA_INFO('COMPAN', '*LIBL'))
)
)
WHEN UNETAD <> 0
THEN CVTDATETIM(CHAR(UNETAD),UNETAT, (
SELECT
SUBSTR(DATA_AREA_VALUE, 1109, 2) AS TIMEZONE
FROM
TABLE(QSYS2.DATA_AREA_INFO('COMPAN', '*LIBL'))
)
)
WHEN UNETAD = 0
THEN '0000-00-00T00:00:00-00:00'
END AS ETA_DATE_TIME
, NEXTSTOP
, CASE
WHEN SOARDT <> 0
AND SOARTM = ''
THEN CVTDATETIM(CHAR(SOARDT),'0000', (
SELECT
SUBSTR(DATA_AREA_VALUE, 1109, 2) AS TIMEZONE
FROM
TABLE(QSYS2.DATA_AREA_INFO('COMPAN', '*LIBL'))
)
)
WHEN SOARDT <> 0
THEN CVTDATETIM(CHAR(SOARDT),SOARTM, (
SELECT
SUBSTR(DATA_AREA_VALUE, 1109, 2) AS TIMEZONE
FROM
TABLE(QSYS2.DATA_AREA_INFO('COMPAN', '*LIBL'))
)
)
WHEN SOARDT = 0
THEN '0000-00-00T00:00:00-00:00'
END AS STOP_ARRIVAL_DATE_TIME
, CASE
WHEN SOLUDT <> 0
AND SOLUTM = ''
THEN CVTDATETIM(CHAR(SOLUDT),'0000', (
SELECT
SUBSTR(DATA_AREA_VALUE, 1109, 2) AS TIMEZONE
FROM
TABLE(QSYS2.DATA_AREA_INFO('COMPAN', '*LIBL'))
)
)
WHEN SOLUDT <> 0
THEN CVTDATETIM(CHAR(SOLUDT),SOLUTM, (
SELECT
SUBSTR(DATA_AREA_VALUE, 1109, 2) AS TIMEZONE
FROM
TABLE(QSYS2.DATA_AREA_INFO('COMPAN', '*LIBL'))
)
)
WHEN SOLUDT = 0
THEN '0000-00-00T00:00:00-00:00'
END AS STOP_DEPART_DATE_TIME
, ORBAMT ORDER_INV_AMT
, ORARST AR_STATUS_FLAG
, DISTST SETTLEMENT_FLAG
, DIAPRV APPROVED_FOR_PAY
, BCCARR CARRIER_CODE
, BCNAME CARRIER_NAME
, BCADDR CARRIER_ADDRESS_1
, BCADR2 CARRIER_ADDRESS_2
, BCCITY CARRIER_CITY
, BCST CARRIER_ST
, BCZIP CARRIER_ZIP
FROM
INPROGRESS
INNER JOIN
PSMAINORVW A
ON
DIODR# = ORDER_NUMBER
AND DIDISP = NUMBER_DISP
AND
(
SUBSTR(ORDER_NUMBER, 1, 2) <> 'DH'
AND SUBSTR(ORDER_NUMBER, 1, 1) <> 'M'
)
LEFT OUTER JOIN
STOPOFF
ON
DIODR# = SOORD
AND SOSTP# = 90
LEFT OUTER JOIN
LMCARR
ON
DIUNIT = BCCARR
LEFT OUTER JOIN
MMILES
ON
MMORD# = DIODR#
AND MMRECTYPE = 'D'
AND MMDSP# = DIDISP
EXCEPTION JOIN
ORDBILL B
ON
B.ORODR# = DIODR#
AND B.ORSEQ = ' '
AND ORARST = '1'
LEFT OUTER JOIN
STOPGROUP
ON
STOPORDER = DIODR#
LEFT OUTER JOIN
STOPAPPTS
ON
APPTORDER = STOPORDER
AND APPTSTOP = NEXTSTOP
LEFT OUTER JOIN
UNITS
ON
UNUNIT = DIUNIT
AND UNORD# = ORDER_NUMBER
LEFT OUTER JOIN
SUPMAST
ON
SUPCDE = UNSUPR
LEFT OUTER JOIN
FLTMAST
ON
UNFMGR = FLTCDE
WHERE
DIETAD <> 0
AND DIETAT <> '0000'
RCDFMT PSCMPORDVW ;
I suspect that the part below might be slowing it down. Can someone advise what can be done here?
STOPGROUP AS
(
SELECT
SOORD STOPORDER
, COUNT(*) STOPSREMAIN
, MIN(SOSTP#) NEXTSTOP
, MAX(SOAPPR) APPTREQ
FROM
STOPOFF
INNER JOIN
INPROGRESS
ON
DIODR# = SOORD
WHERE
SOARDT = 0
GROUP BY
SOORD
ORDER BY
1
)
Even though Visual Explain (VE) doesn't advise any indexes, it can still be used to see how long the various parts of your query are taking.
If the
SELECT
SOORD STOPORDER
, COUNT(*) STOPSREMAIN
, MIN(SOSTP#) NEXTSTOP
, MAX(SOAPPR) APPTREQ
turns out to really be an issue, I'd look at using an Encoded Vector Index (EVI) with aggregate values to speed that up.
I'd suggest breaking the query down and building it back up while using VE to see where the issues lie.
The only magic wand I might suggest is putting the code into a user-defined table function (UDTF), assuming you currently plan to use the view like so:
select *
from myview
where something = 'somevalue';
A UDTF would allow you to explicitly push the selection into the query:
select *
from table ( myudtf('somevalue'));

Convert Teradata-sql-query to tableau new custom query

I have a Teradata SQL query that was auto-created by the FinBI SAP tool. I am trying to use that query in Tableau as a New Custom SQL. Due to differences in syntax I am getting an error.
Below is the query that I pulled from the FinBI SAP tool:
SELECT
ABC.PRODUCT_ID,
sum(CASE WHEN DEF.SERVICE_FLG = 'N' THEN DEF.COMP_US_NET_PRICE_AMT ELSE 0 END),
Sum(CASE WHEN DEF.SERVICE_FLG = 'N' THEN DEF.COMP_US_LIST_PRICE_AMT ELSE 0 END),
Sum(CASE WHEN DEF.SERVICE_FLG = 'N' THEN DEF.COMP_US_COST_AMT ELSE 0 END),
Sum(CASE WHEN DEF.SERVICE_FLG = 'N' THEN DEF.EXTENDED_QTY ELSE 0 END)
,
GHI.FISCAL_YEAR_NUMBER_INT,
GHI.JKL,
MNO.GU_PRIMARY_NAME
FROM
ABC,
DEF,
GHI,
MNO
WHERE
( DEF.FISCAL_YEAR_QUARTER_NUMBER_INT=GHI.FISCAL_YEAR_QUARTER_NUMBER_INT )
AND ( ABC.ITEM_KEY=DEF.PRODUCT_KEY )
AND ( DEF.END_CUSTOMER_KEY=MNO.END_CUSTOMER_KEY )
AND ( DEF.PRODUCT_KEY IN ( SELECT ITEM_KEY FROM ABC H JOIN PQR S ON H.TECHNOLOGY_GROUP_ID = S.TECHNOLOGY_GROUP_ID WHERE user_id=@Variable('BOUSER') AND IAM_LEVEL_NUM=1 ) )
AND ( DEF.DV_ATTRIBUTION_CD IN ('ATTRIBUTED','STANDALONE') )
AND
(
ABC.BUSINESS_UNIT_ID IN ( 'xyz' )
AND
DEF.REVENUE_RECOGNITION_FLG IN ( 'Y' )
)
GROUP BY
1,
6,
7,
8

Need to go from hostname to base domain

I need a function:
f(fqdn,suffix) -> basedomain
with these example inputs and outputs:
f('foobar.quux.somedomain.com','com') -> 'somedomain.com'
f('somedomain.com','com') -> 'somedomain.com'
f('foobar.quux.somedomain.com.br','com.br') -> 'somedomain.com.br'
f('somedomain.com.br','com.br') -> 'somedomain.com.br'
In plain English, if the suffix has n segments, take the last n+1 segments. Find the base domain for the FQDN, allowing for the fact that some FQDNs have more than one suffix element.
The suffixes I need to match are here. I've already got them in my SQL database.
I could write this in C#; it might not be the most elegant, but it would work. However, I would like to have this function either in T-SQL, where it is closest to the data, or in PowerShell, which is where the rest of the utility that consumes this data is going to live. I suppose it would be OK to write it in C#, compile it to an assembly, and then access it from T-SQL or even from PowerShell, if that would be the fastest-executing option. But if there's a reasonably clever alternative in pure T-SQL or simple PowerShell, I'd prefer that.
EDIT: One thing I forgot to mention explicitly (but which is clear when reviewing the suffix list, at my link above) is that we must pick the longest matching suffix. Both "br" and "com.br" appear in the suffix list (with similar things happening for uk, pt, etc). So the SQL has to use a window function to make sure the longest matching suffix is found.
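For comparison, the whole rule (longest matching suffix wins, then keep n + 1 labels) fits in a few lines of Python; this sketch assumes the suffix list has already been loaded into a set:

```python
def base_domain(fqdn, suffixes):
    """Return the longest matching suffix plus one label to its left."""
    labels = fqdn.lower().split('.')
    # Scanning from the left tries the longest candidate suffix first,
    # so 'com.br' wins over 'br'.
    for i in range(len(labels)):
        candidate = '.'.join(labels[i:])
        if candidate in suffixes:
            return '.'.join(labels[max(i - 1, 0):])
    raise ValueError(f'no matching suffix for {fqdn!r}')

suffixes = {'com', 'br', 'com.br'}
```

base_domain('foobar.quux.somedomain.com.br', suffixes) returns 'somedomain.com.br', and a bare suffix such as 'com.br' comes back unchanged; the longest-match requirement falls out for free because longer candidate suffixes are tested before shorter ones.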
Here is how far I got when I was doing the SQL. I had gotten lost in all the substring/reverse functions.
SELECT Domain, suffix
FROM (
SELECT SD.Domain, SL.suffix,
RN=ROW_NUMBER() OVER (
PARTITION BY sd.Domain ORDER BY LEN(SL.suffix) DESC)
FROM SiteDomains SD
INNER JOIN suffixlist SL ON SD.Domain LIKE '%.'+SL.suffix
) AS X
WHERE RN=1
This works OK for finding the right suffix. I'm a little concerned about its performance, though.
The following demonstrates matching FQDNs with TLDs and extracting the desired n + 1 domain name segments:
-- Sample data.
declare @SampleTLDs as Table ( TLD VarChar(64) );
insert into @SampleTLDs ( TLD ) values
( 'com' ), ( 'somedomain.com' ), ( 'com.br' );
declare @SampleFQDNs as Table ( FQDN VarChar(64) );
insert into @SampleFQDNs ( FQDN ) values
( 'foobar.quux.somedomain.com' ), ( 'somedomain.com' ),
( 'foobar.quux.somedomain.com.br' ), ( 'somedomain.com.br' );
select * from @SampleTLDs;
select * from @SampleFQDNs;
-- Fiddle about.
select FQDN, TLD,
case
when DotPosition = 0 then FQDN
else Reverse( Left( ReversedPrefix, DotPosition - 1) ) + '.' + TLD
end as Result
from (
select FQDNs.FQDN, TLDs.TLD,
Substring( Reverse( FQDNs.FQDN ), Len( TLDs.TLD ) + 2, 100 ) as ReversedPrefix,
CharIndex( '.', Substring( Reverse( FQDNs.FQDN ), Len( TLDs.TLD ) + 2, 100 ) ) as DotPosition
from @SampleFQDNs as FQDNs inner join
@SampleTLDs as TLDs on FQDNs.FQDN like '%.' + TLDs.TLD or FQDNs.FQDN = TLDs.TLD ) as Edna;
-- To select only the longest matching TLD for each FQDN:
with
ExtendedFQDNs as (
select FQDNs.FQDN, TLDs.TLD, Row_Number() over ( partition by FQDN order by Len( TLDs.TLD ) desc ) as TLDLenRank,
Substring( Reverse( FQDNs.FQDN ), Len( TLDs.TLD ) + 2, 100 ) as ReversedPrefix,
CharIndex( '.', Substring( Reverse( FQDNs.FQDN ), Len( TLDs.TLD ) + 2, 100 ) ) as DotPosition
from @SampleFQDNs as FQDNs inner join
@SampleTLDs as TLDs on FQDNs.FQDN like '%.' + TLDs.TLD or FQDNs.FQDN = TLDs.TLD )
select FQDN, TLD,
case
when DotPosition = 0 then FQDN
else Reverse( Left( ReversedPrefix, DotPosition - 1) ) + '.' + TLD
end as Result
from ExtendedFQDNs
where TLDLenRank = 1;
Here's how I would do it in C#:
string getBaseDomain(string fqdn, string suffix)
{
string[] domainSegs = fqdn.Split('.');
return domainSegs[domainSegs.Length - suffix.Split('.').Length - 1] + "." + suffix;
}
So here it is in Powershell:
function getBaseDomain
{
Param(
[string]$fqdn,
[string]$suffix
)
$domainSegs = $fqdn.Split(".");
return $domainSegs[$domainSegs.Length - $suffix.Split(".").Length - 1] + "."+$suffix;
}
Seems rather silly now to have wasted stackoverflow.com's time with this. My apologies.
Here is a tsql variant...
declare @fqdn varchar(256) = 'somedomain.com'
declare @suffix varchar(128) = 'com'
select left(@fqdn,CHARINDEX(@suffix,@fqdn) - 2)
if(select CHARINDEX('.',reverse(left(@fqdn,CHARINDEX(@suffix,@fqdn) - 2)))) = 0
begin
select left(@fqdn,CHARINDEX(@suffix,@fqdn) - 2) + '.' + @suffix
end
else
begin
select right(left(@fqdn,CHARINDEX(@suffix,@fqdn) - 2),CHARINDEX('.',reverse(left(@fqdn,CHARINDEX(@suffix,@fqdn) - 2))) - 1) + '.' + @suffix
end
end