PL/SQL block to proceed execution only if the row has one not null value - select

I have a table like below:
Group_id  Sub_Group  Number  Values
--------------------------------------------------------
1111      5568       3A      e1*70/221*2000/8*1283*00
1111      16978      2B      e1*715/2007*692/2008A005100
1111      23547      1C      not applicable
1111      5567       2D      not applicable
1111      5567       3B      (null)
I want my block to proceed with the further steps only if exactly one Number has values other than "not applicable".
E.g. like below:
Group_id  Sub_Group  Number  Values
-----------------------------------------------------
1111      5568       3A      e1*70/221*2000/8*1283*00
1111      5568       3A      e1*70/221*2000/8*1283*01
1111      5568       3A      e1*70/221*2000/8*1283*02
1111      16978      2B      not applicable
1111      23547      1C      not applicable
1111      5567       2D      not applicable
1111      5567       3B      not applicable
In the example above, only Number 3A has values other than "not applicable", so in this case I can proceed with the further steps and print the values. If more than one Number in a group has values other than "not applicable", I need to print an error message instead. A single Number, however, can have multiple values other than "not applicable".
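The rule described above can be sketched as a small check. This is only an illustration in Python, not PL/SQL; the function names are hypothetical, and it assumes that NULL and "not applicable" both count as "no value", which the question does not state explicitly.

```python
from typing import List, Optional, Set, Tuple

# One row is (group_id, sub_group, number, value); None stands for NULL.
Row = Tuple[int, int, str, Optional[str]]


def numbers_with_real_values(rows: List[Row], group_id: int) -> Set[str]:
    """Return the distinct Numbers in a group that carry a real value."""
    return {
        number
        for gid, _sub_group, number, value in rows
        # Assumption: NULL and 'not applicable' are both treated as "no value".
        if gid == group_id and value is not None and value != "not applicable"
    }


def can_proceed(rows: List[Row], group_id: int) -> bool:
    """Proceed only if exactly one Number has real values (it may have several)."""
    return len(numbers_with_real_values(rows, group_id)) == 1


# Second table from the question: only 3A has real values.
ok_rows: List[Row] = [
    (1111, 5568, "3A", "e1*70/221*2000/8*1283*00"),
    (1111, 5568, "3A", "e1*70/221*2000/8*1283*01"),
    (1111, 5568, "3A", "e1*70/221*2000/8*1283*02"),
    (1111, 16978, "2B", "not applicable"),
    (1111, 23547, "1C", "not applicable"),
    (1111, 5567, "2D", "not applicable"),
    (1111, 5567, "3B", "not applicable"),
]

# First table from the question: 3A and 2B both have real values.
bad_rows: List[Row] = [
    (1111, 5568, "3A", "e1*70/221*2000/8*1283*00"),
    (1111, 16978, "2B", "e1*715/2007*692/2008A005100"),
    (1111, 23547, "1C", "not applicable"),
    (1111, 5567, "2D", "not applicable"),
    (1111, 5567, "3B", None),
]

print(can_proceed(ok_rows, 1111))   # proceed and print the values
print(can_proceed(bad_rows, 1111))  # print the error message instead
```

The key point is that the count is taken over distinct Numbers, not over rows, so several values under 3A still count as one.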


Extracting all rows containing a specific datetime value (MATLAB)

I have a table which looks like this:
Entry number  Timestamp         Value1  Value2  Value3  Value4
--------------------------------------------------------------
5758          28-06-2018 16:30  34      63      34.2    60.9
5759          28-06-2018 17:00  33.5    58      34.9    58.4
5758          28-06-2018 16:30  34      63      34.2    60.9
5759          28-06-2018 17:00  33.5    58      34.9    58.4
5760          28-06-2018 17:30  33      53      35.2    58.5
5761          28-06-2018 18:00  33      63      35      57.9
5762          28-06-2018 18:30  33      61      34.6    58.9
5763          28-06-2018 19:00  33      59      34.1    59.4
5764          28-06-2018 19:30  28      89      33.5    64.2
5765          28-06-2018 20:00  28      89      33      66.1
5766          28-06-2018 20:30  28      83      32.5    67
5767          28-06-2018 21:00  29      89      32.2    68.4
Here '28-06-2018 16:30' sits in a single column, so the table has 6 columns:
Entry number, Timestamp, Value1, Value2, Value3, Value4
I want to extract all rows that belong to '28-06-2018', i.e. all data for that day. The table is too large to show in full here; the timestamps span a couple of months.
t=table([5758;5759],["28-06-2018 16:30";"29-06-2018 16:30"],[34;33.5],'VariableNames',{'Entry number','Timestamp','Value1'})
t =
  2×3 table
    Entry number        Timestamp         Value1
    ____________    __________________    ______
        5758        "28-06-2018 16:30"      34
        5759        "29-06-2018 16:30"     33.5
t(contains(t.('Timestamp'),"28-06"),:)
ans =
  1×3 table
    Entry number        Timestamp         Value1
    ____________    __________________    ______
        5758        "28-06-2018 16:30"      34
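Note that a substring match on "28-06" would also pick up the 28th of June in any other year. An alternative is to parse the timestamps and compare calendar dates. The following is a minimal sketch of that idea in Python rather than MATLAB; the sample rows and the function name `rows_on_day` are made up for illustration.

```python
from datetime import date, datetime

# (entry number, timestamp string, Value1); the third row is a hypothetical
# entry on another day, added so the filter has something to exclude.
rows = [
    (5758, "28-06-2018 16:30", 34.0),
    (5759, "28-06-2018 17:00", 33.5),
    (5760, "29-06-2018 16:30", 33.0),
]


def rows_on_day(rows, day):
    """Keep the rows whose timestamp falls on the given calendar day."""
    return [
        row
        for row in rows
        if datetime.strptime(row[1], "%d-%m-%Y %H:%M").date() == day
    ]


selected = rows_on_day(rows, date(2018, 6, 28))
for row in selected:
    print(row)
```

Comparing parsed dates also makes it straightforward to select a range of days later on, which a string match cannot do.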

pyspark - converting DF Structure

I am new to Python and Spark programming.
I have data in Format-1 below, which holds values captured for different fields, keyed by timestamp and trigger.
I need to convert this data into Format-2: group all the fields from Format-1 by timestamp and trigger, and create one record per key as shown in Format-2. Some fields in Format-1 have no key values (timestamp and trigger are Null); these fields should be populated on every record in Format-2.
Can you please suggest the best approach to do this in PySpark?
Format-1:
Event time (key-1) trig (key-2) data field_Name
------------------------------------------------------
2021-05-01T13:57:29Z 30Sec 10 A
2021-05-01T13:57:59Z 30Sec 11 A
2021-05-01T13:58:29Z 30Sec 12 A
2021-05-01T13:58:59Z 30Sec 13 A
2021-05-01T13:59:29Z 30Sec 14 A
2021-05-01T13:59:59Z 30Sec 15 A
2021-05-01T14:00:29Z 30Sec 16 A
2021-05-01T14:00:48Z OFF 17 A
2021-05-01T13:57:29Z 30Sec 110 B
2021-05-01T13:57:59Z 30Sec 111 B
2021-05-01T13:58:29Z 30Sec 112 B
2021-05-01T13:58:59Z 30Sec 113 B
2021-05-01T13:59:29Z 30Sec 114 B
2021-05-01T13:59:59Z 30Sec 115 B
2021-05-01T14:00:29Z 30Sec 116 B
2021-05-01T14:00:48Z OFF 117 B
2021-05-01T14:00:48Z OFF 21 C
2021-05-01T14:00:48Z OFF 31 D
Null Null 41 E
Null Null 51 F
Format-2:
Event Time Trig A B C D E F
--------------------------------------------------------------
2021-05-01T13:57:29Z 30Sec 10 110 Null Null 41 51
2021-05-01T13:57:59Z 30Sec 11 111 Null Null 41 51
2021-05-01T13:58:29Z 30Sec 12 112 Null Null 41 51
2021-05-01T13:58:59Z 30Sec 13 113 Null Null 41 51
2021-05-01T13:59:29Z 30Sec 14 114 Null Null 41 51
2021-05-01T13:59:59Z 30Sec 15 115 Null Null 41 51
2021-05-01T14:00:29Z 30Sec 16 116 Null Null 41 51
2021-05-01T14:00:48Z OFF 17 117 21 31 41 51
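The reshaping asked for here is a pivot on field_Name, plus a broadcast of the key-less fields onto every row. The sketch below shows that logic in plain Python on a subset of the Format-1 rows; the function name `to_wide` is hypothetical. In PySpark the same shape could presumably be reached with `groupBy(...).pivot("field_Name").agg(first("data"))` on the keyed rows and a `crossJoin` with the pivoted key-less rows, but that variant is not tested here.

```python
from collections import defaultdict

# (event_time, trig, data, field_name); None stands for Null.
records = [
    ("2021-05-01T13:57:29Z", "30Sec", 10, "A"),
    ("2021-05-01T14:00:48Z", "OFF", 17, "A"),
    ("2021-05-01T13:57:29Z", "30Sec", 110, "B"),
    ("2021-05-01T14:00:48Z", "OFF", 117, "B"),
    ("2021-05-01T14:00:48Z", "OFF", 21, "C"),
    ("2021-05-01T14:00:48Z", "OFF", 31, "D"),
    (None, None, 41, "E"),
    (None, None, 51, "F"),
]


def to_wide(records):
    """Pivot field names into columns, one output row per (time, trig) key."""
    fields = sorted({name for _, _, _, name in records})
    constants = {}               # fields without a key, applied to every row
    grouped = defaultdict(dict)  # (time, trig) -> {field_name: data}
    for event_time, trig, data, name in records:
        if event_time is None and trig is None:
            constants[name] = data
        else:
            grouped[(event_time, trig)][name] = data

    wide = []
    for key in sorted(grouped):
        row = {"Event Time": key[0], "Trig": key[1]}
        for name in fields:
            # Missing fields stay None unless a key-less value exists.
            row[name] = grouped[key].get(name, constants.get(name))
        wide.append(row)
    return wide


wide = to_wide(records)
for row in wide:
    print(row)
```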

Add null to the columns which are empty

I am trying to put null into the columns that are empty, using Perl or awk; the header's column count can be used to determine the number of columns. I tried a solution using Perl and a regex. The output looks very close to the desired output, but on closer inspection the first row shows incorrect data.
Input data:
id name type foo-id zoo-id loo-id-1 moo-id-2
----- --------------- ----------- ------ ------ ------ ------
0 zoo123 soozoo 8 31 32
51 zoo213 soozoo 48 51
52 asz123 soozoo 47 52
53 asw122 soozoo 1003 53
54 fff123 soozoo 68 54
55 sss123 soozoo 75 55
56 ssd123 soozoo 76 56
Expected Output:
0 zoo123 soozoo 8 null 31 32
51 zoo213 soozoo 48 51 null null
52 asz123 soozoo 47 52 null null
53 asw122 soozoo 1003 53 null null
54 fff123 soozoo 68 54 null null
55 sss123 soozoo 75 55 null null
56 ssd123 soozoo 76 56 null null
Very close to solution but row-1 is showing incorrect data:
echo "$x"|grep -E '^[0-9]+' |perl -ne 'm/^([\d]+)(?:\s+([\w]+))?(?:\s+([-\w]+))?(?:\s+([\d]+))?(?:\s+([\d]+))?(?:\s+([\d]+))?(?:\s+([\d]+))?/;printf "%s %s %s %s %s %s %s\n", $1, $2//"null", $3//"null",$4//"null",$5//"null",$6//"null",$7//"null"' |column -t
0 zoo123 soozoo 8 31 32 null
51 zoo213 soozoo 48 51 null null
52 asz123 soozoo 47 52 null null
53 asw122 soozoo 1003 53 null null
54 fff123 soozoo 68 54 null null
55 sss123 soozoo 75 55 null null
56 ssd123 soozoo 76 56 null null
When you have a fixed-width string to parse, you'll find that unpack() is a better tool than regexes.
This should demonstrate how to do it. I'll leave it to you to convert it to a one-liner.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Data::Dumper;

while (<DATA>) {
    next if /^\D/; # Skip lines that don't start with a digit
    # I worked out the unpack() template by counting columns.
    # With no second argument, unpack() operates on $_.
    my @data = map { /\S/ ? $_ : 'null' } unpack('A7A14A16A8A8A8A8');
    say join ' ', @data;
}
__DATA__
id name type foo-id zoo-id loo-id-1 moo-id-2
----- --------------- ----------- ------ ------ ------ ------
0 zoo123 soozoo 8 31 32
51 zoo213 soozoo 48 51
52 asz123 soozoo 47 52
53 asw122 soozoo 1003 53
54 fff123 soozoo 68 54
55 sss123 soozoo 75 55
56 ssd123 soozoo 76 56
Output:
$ perl unpack | column -t
0 zoo123 soozoo 8 null 31 32
51 zoo213 soozoo 48 51 null null
52 asz123 soozoo 47 52 null null
53 asw122 soozoo 1003 53 null null
54 fff123 soozoo 68 54 null null
55 sss123 soozoo 75 55 null null
56 ssd123 soozoo 76 56 null null
With GNU awk:
awk 'NR>2{                  # ignore first and second row
  NF=7                      # fix number of columns
  for(i=1; i<=NF; i++)      # loop over all columns
    if($i ~ /^ *$/){        # if empty or only spaces
      $i="null"
    }
  print $0}' FIELDWIDTHS='7 14 16 8 8 10 8' OFS='|' file | column -s '|' -t
As one line:
awk 'NR>2{NF=7; for(i=1;i<=NF;i++) if($i ~ /^ *$/){$i="null"} print $0}' FIELDWIDTHS='7 14 16 8 8 10 8' OFS='|' file | column -s '|' -t
Output:
0 zoo123 soozoo 8 null 31 32
51 zoo213 soozoo 48 51 null null
52 asz123 soozoo 47 52 null null
53 asw122 soozoo 1003 53 null null
54 fff123 soozoo 68 54 null null
55 sss123 soozoo 75 55 null null
56 ssd123 soozoo 76 56 null null
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
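Both answers rely on the same idea: cut each line at fixed column offsets instead of splitting on whitespace, so an empty column keeps its position. A minimal sketch of that idea in Python follows. The widths are taken from the unpack() template above (the awk answer uses slightly different widths), and the sample lines are built by a hypothetical helper, `make_line`, because the exact spacing of the original input is not preserved here.

```python
from itertools import accumulate

# Assumption: column widths as in the unpack() template A7 A14 A16 A8 A8 A8 A8.
WIDTHS = [7, 14, 16, 8, 8, 8, 8]


def split_fixed(line, widths=WIDTHS):
    """Slice a line at fixed offsets; blank or missing cells become 'null'."""
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return [line[s:e].strip() or "null" for s, e in zip(starts, ends)]


def make_line(cells, widths=WIDTHS):
    """Hypothetical helper: lay cells out in fixed-width columns."""
    return "".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()


sample = [
    make_line(["0", "zoo123", "soozoo", "8", "", "31", "32"]),
    make_line(["51", "zoo213", "soozoo", "48", "51", "", ""]),
]

parsed = [split_fixed(line) for line in sample]
for fields in parsed:
    print(" ".join(fields))
```

Slicing past the end of a short line yields an empty string, which is why trailing empty columns also come out as null.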

Update Spark dataframe to populate data from another dataframe

I have 2 dataframes. I want to take the distinct values of one column and pair them with every row of the other dataframe. For example:
Dataframe 1 : df1 contains
scenarioId
---------------
101
102
103
Dataframe 2 : df2 contains columns
trades
-------------------------------------
isin price
ax11 111
re32 909
erre 445
Expected output
trades
----------------
isin price scenarioid
ax11 111 101
re32 909 101
erre 445 101
ax11 111 102
re32 909 102
erre 445 102
ax11 111 103
re32 909 103
erre 445 103
Note that I don't have the possibility of joining the 2 dataframes on a common column. Please suggest an approach.
What you need is a cross join, or Cartesian product:
val result = df1.crossJoin(df2)
I do not recommend it, though, as the amount of data grows very fast. You'll get all possible pairs, i.e. the elements of the Cartesian product (the row count will be the number of rows in df1 times the number of rows in df2).
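To make the row-count growth concrete, here is a small sketch of the same Cartesian product in plain Python, using the values from the question. It only illustrates what a cross join produces; it is not Spark code.

```python
from itertools import product

scenario_ids = [101, 102, 103]                          # df1
trades = [("ax11", 111), ("re32", 909), ("erre", 445)]  # df2: (isin, price)

# Every scenario id is paired with every trade row.
result = [
    (isin, price, scenario_id)
    for scenario_id, (isin, price) in product(scenario_ids, trades)
]

for row in result:
    print(row)
print(len(result))  # len(scenario_ids) * len(trades)
```

With 3 scenario ids and 3 trades this yields 9 rows; the count is always the product of the two input sizes, which is why the result grows so quickly on large dataframes.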

certutil.exe formatting the output in powershell

I am using certutil.exe to get a list of issued certificates and export them to a .txt file. The output comes back in rows even though I specify the Format-Table -AutoSize or -Wrap options. Here is the command I've used; where am I going wrong?
certutil.exe -view -restrict "Disposition=20" -out "Request.RequestID,Request.RequesterName,certificatehash,Request.SubmittedWhen" | Format-Table -autosize
Here is the output (only the first few rows are copied here):
PS C:\Users\administrator.JLR> certutil.exe -view -restrict "Disposition=20" -out "Requ
Schema:
  Column Name                   Localized Name                Type    MaxLength
  ----------------------------  ----------------------------  ------  ---------
  Request.RequestID             Request ID                    Long    4 -- Indexed
  Request.RequesterName         Requester Name                String  2048 -- Indexed
  CertificateHash               Certificate Hash              String  128 -- Indexed
  Request.SubmittedWhen         Request Submission Date       Date    8 -- Indexed

Row 1:
  Request ID: 0x12 (18)
  Requester Name: "JLR\QA-ADFS1$"
  Certificate Hash: "13 24 46 54 fe 0b 6d 30 ff b8 b8 cd 55 e6 55 eb da 7d 15 bd"
  Request Submission Date: 8/4/2015 10:24 AM

Row 2:
  Request ID: 0x15 (21)
  Requester Name: "JLR\svcSAM"
  Certificate Hash: "94 2c 41 e0 2a d6 16 fc 74 bd ba 08 16 e8 a6 1c d2 4e 7e 12"
  Request Submission Date: 8/4/2015 2:13 PM

Row 3:
  Request ID: 0x17 (23)
  Requester Name: "JLR\Administrator"
  Certificate Hash: "24 b3 78 7b 69 db dc 6c d6 65 88 1c 7f b3 c6 ef 06 db 25 9b"
  Request Submission Date: 8/4/2015 2:35 PM

Row 4:
  Request ID: 0x25 (37)
  Requester Name: "JLR\paul.charles"
  Certificate Hash: "c4 00 20 df 0e 0b 65 29 b6 b3 c4 29 fa b7 a7 c6 c2 6b 44 c7"
  Request Submission Date: 8/7/2015 3:31 PM
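An external program such as certutil.exe hands PowerShell plain lines of text, not objects with properties, so Format-Table has no columns to lay out. One way forward is to parse the "Row N:" blocks into records first and format those. The sketch below shows that parsing step in Python, only as an illustration of the idea; it assumes one "Name: value" pair per line, and the function name `parse_rows` is made up.

```python
import re

# Assumption: certutil prints one "Name: value" pair per line under each
# "Row N:" header. The sample repeats two rows from the output above.
sample = r'''
Row 1:
  Request ID: 0x12 (18)
  Requester Name: "JLR\QA-ADFS1$"
  Certificate Hash: "13 24 46 54 fe 0b 6d 30 ff b8 b8 cd 55 e6 55 eb da 7d 15 bd"
  Request Submission Date: 8/4/2015 10:24 AM

Row 2:
  Request ID: 0x15 (21)
  Requester Name: "JLR\svcSAM"
  Certificate Hash: "94 2c 41 e0 2a d6 16 fc 74 bd ba 08 16 e8 a6 1c d2 4e 7e 12"
  Request Submission Date: 8/4/2015 2:13 PM
'''


def parse_rows(text):
    """Turn 'Row N:' blocks into a list of dicts, one dict per certificate."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if re.fullmatch(r"Row \d+:", line):
            current = {}
            records.append(current)
        elif current is not None and ": " in line:
            # Split on the first ': ' only, so times like 10:24 stay intact.
            key, value = line.split(": ", 1)
            current[key] = value.strip('"')
    return records


records = parse_rows(sample)
for record in records:
    print(record["Request ID"], record["Requester Name"], sep="\t")
```

Once the text is in record form, each field is available by name and can be written out as a table or CSV.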