Apache Drill handling cp1252 character codes - encoding

The CSV data that we are querying contains cp1252 character codes, and Apache Drill gives the error below:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: MalformedInputException: Input length = 1 Fragment 0:0 [Error Id: 53bc07e3-a6e4-4301-a858-205be382275e on 172.16.243.116:31010] (java.lang.RuntimeException) java.nio.charset.MalformedInputException: Input length = 1 org.apache.drill.exec.expr.fn.impl.CharSequenceWrapper.decodeUT8():185 org.apache.drill.exec.expr.fn.impl.CharSequenceWrapper.setBuffer():119 org.apache.drill.exec.test.generated.FiltererGen174.doEval():50 org.apache.drill.exec.test.generated.FiltererGen174.filterBatchNoSV():100 org.apache.drill.exec.test.generated.FiltererGen174.filterBatch():73 org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.doWork():81 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.physical.impl.BaseRootExec.next():104 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 
org.apache.drill.exec.physical.impl.BaseRootExec.next():94 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1657 org.apache.drill.exec.work.fragment.FragmentExecutor.run():226 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1142 java.util.concurrent.ThreadPoolExecutor$Worker.run():617 java.lang.Thread.run():745 Caused By (java.nio.charset.MalformedInputException) Input length = 1 java.nio.charset.CoderResult.throwException():281 org.apache.drill.exec.expr.fn.impl.CharSequenceWrapper.decodeUT8():183 org.apache.drill.exec.expr.fn.impl.CharSequenceWrapper.setBuffer():119 org.apache.drill.exec.test.generated.FiltererGen174.doEval():50 org.apache.drill.exec.test.generated.FiltererGen174.filterBatchNoSV():100 org.apache.drill.exec.test.generated.FiltererGen174.filterBatch():73 org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.doWork():81 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.record.AbstractRecordBatch.next():119 
org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135 org.apache.drill.exec.record.AbstractRecordBatch.next():162 org.apache.drill.exec.physical.impl.BaseRootExec.next():104 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 org.apache.drill.exec.physical.impl.BaseRootExec.next():94 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1657 org.apache.drill.exec.work.fragment.FragmentExecutor.run():226 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1142 java.util.concurrent.ThreadPoolExecutor$Worker.run():617 java.lang.Thread.run():745
Is there a way to handle such data in Apache Drill?

@OP
I know this is an old post; however, I ran into this challenge last week with a new data feed.
Directly in Apache Drill (MapR version), I used STRING_BINARY() to render the cp1252 bytes as \xNN escapes, which regexp_replace() can then strip.
Not the most elegant or efficient solution, but it works.
apache drill 1.10.0
"drill baby drill"
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT> use sys;
+-------+----------------------------------+
| ok | summary |
+-------+----------------------------------+
| true | Default schema changed to [sys] |
+-------+----------------------------------+
1 row selected (0.975 seconds)
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT> select version from version;
+----------+
| version |
+----------+
| 1.10.0 |
+----------+
1 row selected (0.409 seconds)
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT>
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT> select * from users.`sbalas002c`.drill_spl_char;
+------------------------+--------------------------------------------------------------+
| ORIG_CAMPAIGN_LINE_ID | ORIG_CAMPAIGN_LINE_NAME |
+------------------------+--------------------------------------------------------------+
| 30092278 | 1573256-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_SSEA |
| 30092282 | 1573257-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_WORD |
| 30092286 | 1573254-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_BLIS |
| 30092290 | 1573255-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_NSEA |
+------------------------+--------------------------------------------------------------+
4 rows selected (0.445 seconds)
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT>
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT> select ORIG_CAMPAIGN_LINE_NAME,
. . . . . . . . . . . . . . . . . . . . . . .> substr(ORIG_CAMPAIGN_LINE_NAME,1,4) sub_CAMPAIGN_LINE_NAME
. . . . . . . . . . . . . . . . . . . . . . .> from users.`sbalas002c`.drill_spl_char;
Error: SYSTEM ERROR: DrillRuntimeException: Unexpected byte 0xa0 at position 36 encountered while decoding UTF8 string.
Fragment 0:0
[Error Id: 1889163a-f847-48ad-a7a9-bbe4284e112c on titand-ch2-p20.cable.comcast.com:31010] (state=,code=0)
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT>
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT> select ORIG_CAMPAIGN_LINE_NAME,
. . . . . . . . . . . . . . . . . . . . . . .> STRING_BINARY(ORIG_CAMPAIGN_LINE_NAME) SB_CAMPAIGN_LINE_NAME,
. . . . . . . . . . . . . . . . . . . . . . .> regexp_replace(STRING_BINARY(ORIG_CAMPAIGN_LINE_NAME),'\\xA0','') Good_CAMPAIGN_LINE_NAME
. . . . . . . . . . . . . . . . . . . . . . .> from users.`sbalas002c`.drill_spl_char;
+--------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------+
| ORIG_CAMPAIGN_LINE_NAME | SB_CAMPAIGN_LINE_NAME | Good_CAMPAIGN_LINE_NAME |
+--------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------+
| 1573256-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_SSEA | 1573256-1_306774_SeattleTheatreGroup\xA0_201901_ISV_SEA_Z_SSEA | 1573256-1_306774_SeattleTheatreGroup_201901_ISV_SEA_Z_SSEA |
| 1573257-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_WORD | 1573257-1_306774_SeattleTheatreGroup\xA0_201901_ISV_SEA_Z_WORD | 1573257-1_306774_SeattleTheatreGroup_201901_ISV_SEA_Z_WORD |
| 1573254-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_BLIS | 1573254-1_306774_SeattleTheatreGroup\xA0_201901_ISV_SEA_Z_BLIS | 1573254-1_306774_SeattleTheatreGroup_201901_ISV_SEA_Z_BLIS |
| 1573255-1_306774_SeattleTheatreGroup�_201901_ISV_SEA_Z_NSEA | 1573255-1_306774_SeattleTheatreGroup\xA0_201901_ISV_SEA_Z_NSEA | 1573255-1_306774_SeattleTheatreGroup_201901_ISV_SEA_Z_NSEA |
+--------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------+
4 rows selected (0.64 seconds)
0: jdbc:drill:zk=titana-ch2-p3:5181/drill/TIT>
Hope this helps others.
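If the file can be preprocessed before Drill reads it, a different route from the STRING_BINARY() workaround above is to re-encode the CSV from cp1252 to UTF-8. The following is a minimal Python sketch under that assumption. The function names and the sample bytes are illustrative only and are not part of Drill.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def cp1252_to_utf8(data: bytes) -> bytes:
    """Decode cp1252 bytes and re-encode them as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


# Hypothetical sample modelled on the campaign names above: 0xA0 is a
# no-break space in cp1252, but a lone 0xA0 byte is not valid UTF-8.
sample = b"SeattleTheatreGroup\xa0_201901"
converted = cp1252_to_utf8(sample)
print(is_valid_utf8(sample), is_valid_utf8(converted))
```

Note that Python's cp1252 codec rejects the few byte values that cp1252 leaves undefined, so a real conversion script may need an explicit error-handling policy.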


Print a line on demand without variable in it using format processor

I'm writing a protocol using the format processor of Perl.
So I have a format like
format err_spooler_line =
##### | Error: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_line, $error_spooler_text_short
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_short
# --> This line should only be displayed when $error_spooler_text_long is set.
| Details:
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_long
.
As workaround I use:
format err_spooler_line =
##### | Error: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_line, $error_spooler_text_short
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_short
# --> This line should only be displayed when $error_spooler_text_long is set.
# This works, but it writes some text of the description into the "Details:" line.
~ | Details: ^<
$error_spooler_text_long
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_long
.
If I start my description text with \n, it works:
$error_spooler_text_long = "\n" . $error_spooler_text_long. After "Details:" there is then a line break, and the next line starts with the next picture line.
But how can I do this automatically, so that there is no need to prefix my string with \n?
A complete example:
#!/usr/bin/perl
use strict;
# Variables used in the format
my $error_spooler_line;
my $error_spooler_text_short;
my $error_spooler_text_long;
format err_spooler_top =
Protocol: - DATA-Error Page: ####
$%
--------------------------------------------------------------
.
format err_spooler_line =
##### | Error: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_line, $error_spooler_text_short
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_short
| Details:
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_long
.
# The whole protocol is written into a string.
my $error_file_string = "";
open my $hnd_spooler, ">", \$error_file_string;
select((select($hnd_spooler),
$~ = "err_spooler_line",
$^ = "err_spooler_top"
)[0]);
while(<DATA>) {
chomp;
$error_spooler_line = $.;
($error_spooler_text_short, $error_spooler_text_long) = split (/;/);
write $hnd_spooler;
}
close ($hnd_spooler);
## Now, all the protocol is in $error_file_string !
print $error_file_string;
__DATA__
ERR_ID_0 OK;
ERR_ID_278 UPDATE Failed;Update failed cause of DB connection error\nDB error number: 22.
ERR_ID_0 OK;
ERR_ID_33 Invalid data format;Only numbers allowed.
The output is then:
Protocol: - DATA-Error Page: 1
--------------------------------------------------------------
1 | Error: ERR_ID_0 OK
| Details:
2 | Error: ERR_ID_278 UPDATE Failed
| Details:
| Update failed cause of DB connection error\nDB error
| number: 22.
3 | Error: ERR_ID_0 OK
| Details:
4 | Error: ERR_ID_33 Invalid data format
| Details:
| Only numbers allowed.
But I want the "Details:" line to appear only if there are details:
Protocol: - DATA-Error Page: 1
--------------------------------------------------------------
1 | Error: ERR_ID_0 OK
2 | Error: ERR_ID_278 UPDATE Failed
| Details:
| Update failed cause of DB connection error
| DB error number: 22.
3 | Error: ERR_ID_0 OK
4 | Error: ERR_ID_33 Invalid data format
| Details:
| Only numbers allowed.
According to perlform:
Using caret fields can produce lines where all fields are blank. You
can suppress such lines by putting a "~" (tilde) character anywhere in
the line. The tilde will be translated to a space upon output.
The following seems to work:
use strict;
use warnings;
my $error_spooler_line ;
my $error_spooler_text_short;
my $error_spooler_text_long;
format err_spooler =
##### | Error: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_line, $error_spooler_text_short
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_short
~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
@{[(length $error_spooler_text_long) ? "Details:" : ""]}
~~ | ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$error_spooler_text_long
.
select (STDOUT);
$~ = "err_spooler";
while(<DATA>) {
chomp;
$error_spooler_line = $.;
($error_spooler_text_short, $error_spooler_text_long) = split /;/;
write;
}
__DATA__
ERR_ID_0 OK;
ERR_ID_278 UPDATE Failed;Update failed cause of DB connection error, DB error number: 22.
ERR_ID_0 OK;
ERR_ID_33 Invalid data format;Only numbers allowed.
Output:
1 | Error: ERR_ID_0 OK
2 | Error: ERR_ID_278 UPDATE Failed
| Details:
| Update failed cause of DB connection error, DB error number: 22.
3 | Error: ERR_ID_0 OK
4 | Error: ERR_ID_33 Invalid data format
| Details:
| Only numbers allowed.
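Independent of Perl's format machinery, the rule being applied here is simply "emit the Details: header and body only when the long text is non-empty". A minimal Python sketch of that rule follows. The function name, column layout, and wrap width are assumptions made for illustration and do not reproduce the exact picture lines above.

```python
import textwrap


def render_entry(line_no, short, long_text="", width=60):
    """Render one record; the Details block appears only if long_text is set."""
    lines = [f"{line_no:>5} | Error: {short}"]
    if long_text:
        lines.append("      | Details:")
        for chunk in textwrap.wrap(long_text, width):
            lines.append(f"      |   {chunk}")
    return "\n".join(lines)


records = [
    "ERR_ID_0 OK;",
    "ERR_ID_33 Invalid data format;Only numbers allowed.",
]
for number, record in enumerate(records, start=1):
    # partition() splits on the first ';' only, like the two-field split above.
    short, _, long_text = record.partition(";")
    print(render_entry(number, short, long_text))
```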

Perl: Perl6::Form format

I have a file that looks something like this:
SR Name Rollno Class
1 Sanjay 01 B
2 Rahul_Kumar_Khanna 09 A
Now I need to add "|" between the columns, so it should look like this:
SR | Name |Rollno | Class|
1 | Sanjay |01 | B |
2 | Rahul_Kumar_Khanna|09 | A |
I am using Perl6::Form:
my $text;
foreach my $line (@arr) {
my ($SR, $Name, $Rollno, $Class) = split (" ", $line);
my $len = length $Name;
$text = form
'| {||||||||} | {||||||||} | {||||||||} | {||||||||}|',
$SR, $Name, $Rollno, $Class;
print $text;
}
This is what I have so far, but the name does not come out properly. I had to add extra "|" characters to the name field for that. Is there any way to add the "|" characters by calculating the length, as below? I tried, but I get an error.
'| {||||||||} | {||||||||}x$len | {||||||||} | {||||||||}|',
Problem #1
'| {||||||||} | {||||||||}x$len | {||||||||} | {||||||||}|'
produces
| {||||||||} | {||||||||}x20 | {||||||||} | {||||||||}|
but you're trying to get
| {||||||||} | {||||||||||||||||||||} | {||||||||} | {||||||||}|
For that, you'd want
'| {||||||||} | {'.( "|" x $len ).'} | {||||||||} | {||||||||}|'
Problem #2
$len is the length of the name field of the current row, so it is different for every row. This is wrong, because you want the output to be the same width for every row. $len needs to be the length of the longest name field.
You will need to find the correct value for $len before even starting the loop.
use strict;
use warnings;
use Perl6::Form;

# Read in the data as an array of rows.
# Each row is an array of values.
my @rows = map { [ split ] } <>;

# Find the maximum width of each column.
my @col_lens = (0) x @{ $rows[0] };
for my $row (@rows) {
    # Skip the blank line after the header.
    next if !@$row;
    for my $col_idx (0..$#$row) {
        my $col_len = length($row->[$col_idx]);
        if ($col_lens[$col_idx] < $col_len) {
            $col_lens[$col_idx] = $col_len;
        }
    }
}

# The braces count toward a field's width, hence the -2.
# (This assumes every column is at least 2 characters wide.)
my $form =
    join "",
        "| ",
        "{".( "|"x($col_lens[0]-2) )."}",
        " | ",
        "{".( "|"x($col_lens[1]-2) )."}",
        " | ",
        "{".( "|"x($col_lens[2]-2) )."}",
        " | ",
        "{".( "|"x($col_lens[3]-2) )."}",
        " |";

for my $row (@rows) {
    if (@$row) {
        print form($form, @$row);
    } else {
        print "\n";
    }
}
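The two-pass idea from Problem #2 (measure every column first, then format every row with the same widths) does not depend on Perl6::Form. As a rough illustration, here is a Python sketch of the same approach. The function name is hypothetical, and it left-justifies cells rather than centering them as the `{||||}` fields do.

```python
def format_table(rows):
    """Pad every cell to its column's maximum width and join with pipes."""
    # First pass: the widest cell in each column decides that column's width.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    # Second pass: every row is rendered with the same widths.
    return [
        "| " + " | ".join(cell.ljust(w) for cell, w in zip(row, widths)) + " |"
        for row in rows
    ]


rows = [line.split() for line in [
    "SR Name Rollno Class",
    "1 Sanjay 01 B",
    "2 Rahul_Kumar_Khanna 09 A",
]]
table = format_table(rows)
print("\n".join(table))
```

Because the widths are computed before any row is printed, a long name in the last row widens the column for every earlier row as well.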

powershell log - new file every day

I log network status (nslookup, ping, tracert) into a log file. The log file size grows and is quite difficult to work with after a while.
I am looking for a way to have a new log file created for every day. I am sure it's easy but I did not find a way.
When I set log file name to $log_file = ".\network_" + (Get-Date -f yyyy-MM-dd) + ".log" hoping it will create a new file when date changes, it does not work. Instead I get Non-authoritative answer:.
Sorry for newbie question but I really did not find any answer. Thx!
edit:
it's really simple script (shortened example below):
$log_file = ".\network_" + (Get-Date -f yyyy-MM-dd_HH-mm) + ".log"
$server = "server.com"
$gateway = (Get-wmiObject Win32_networkAdapterConfiguration | ?{$_.IPEnabled}).DefaultIPGateway
& ipconfig /all >> $log_file
while($true) {
$timestamp = "`r`n`r`n[" + (Get-Date -f yyyy-MM-dd) + " " +(Get-Date -f HH:mm:ss) + "]"
$timestamp >> $log_file
"`r`n`r`n" >> $log_file
"ping to $server)" >> $log_file
& ping $server >> $log_file
}
Here's what I usually do.
$Date = get-date -format yyyy-MM-dd
$log_file = "\\Share\folder\folder\FileName-$date.log"
I would suggest including the date/time (yyyy-MM-dd_HH-mm-ss) in the log file name (to prevent duplicate file names) and scheduling your script to run daily / hourly or whenever. Just tell the task scheduler service to end the task if it runs for X number of hours, where X is right before it is scheduled to start again. This should give you the intended result of a new file on the schedule you decide is best. This also ensures your script continues running even if the computer reboots.
If the size of the log file is your main issue, then I would suggest starting a new file based on the file size and not the date. This way you can choose exactly what size files you want to work with.
To do this, just change while($true) to while((Get-Item $log_file).Length -lt "100000") where the length is the size in bytes that you want the script to stop at.
To make your script create a new log file when the while statement is triggered, just wrap it in a function and then call the function using another while $true statement.
Here is the change:
function NetworkLogging {
$log_file = "$PSScriptRoot\network_" + (Get-Date -f yyyy-MM-dd_HH-mm-ss) + ".log"
$server = "server.com"
$gateway = (Get-wmiObject Win32_networkAdapterConfiguration | ?{$_.IPEnabled}).DefaultIPGateway
& ipconfig /all >> $log_file
while((Get-Item $log_file).Length -lt "100000") {
$timestamp = "`r`n`r`n[" + (Get-Date -f yyyy-MM-dd) + " " +(Get-Date -f HH:mm:ss) + "]"
$timestamp >> $log_file
"`r`n`r`n" >> $log_file
"ping to $server)" >> $log_file
& ping $server >> $log_file
}
}
while ($true){NetworkLogging}
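For the date-based variant the question originally asked about, note that the script in the question builds $log_file once, before the loop, so the name cannot change while the loop keeps running. One way around this is to derive the file name from the current time on every write. The sketch below shows that idea in Python rather than PowerShell. The function names are hypothetical, and the file name pattern is borrowed from the question.

```python
from datetime import datetime
from pathlib import Path


def daily_log_path(now: datetime, directory: Path = Path(".")) -> Path:
    """Build the log file path for the calendar day that `now` falls on."""
    return directory / f"network_{now:%Y-%m-%d}.log"


def append_entry(message: str, now: datetime, directory: Path = Path(".")) -> Path:
    """Append a timestamped line to the log file for the current day."""
    # Recomputing the path on every call is what makes the file roll over
    # at midnight, even inside a long-running loop.
    path = daily_log_path(now, directory)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"[{now:%Y-%m-%d %H:%M:%S}] {message}\n")
    return path


print(daily_log_path(datetime.now()))
```

In a monitoring loop, each iteration would call `append_entry(..., datetime.now())`, so the first write after midnight lands in a new file.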
You might consider the Log-Entry framework (also on GitHub), which I published a while ago.
It has basically all the features you asked for:
File is automatically truncated if it grows over ~100 KB (default)
Time stamps
Inline logging
Proper type casting (e.g. notice that the $gateway has also a $Null property in my case)
Function Main {
Log -File ".\Network.log"
$server = Log "Server:" "192.168.1.1" ?
$gateway = Log "Gateway:" (Get-wmiObject Win32_networkAdapterConfiguration | ?{$_.IPEnabled}).DefaultIPGateway ?
Log "IP Config:" ((& ipconfig /all) -Join "`r`n")
Log "Ping to $Server" ((& ping $server) -Join "`r`n")
}
The log will look like this:
2017-06-08 Test (version: 01.00.02, PowerShell version: 5.X.X5063.296)
09:13:14.01 C:\Users\User\Network.ps1
15:28:14.67 Server: 192.168.X.X
15:28:14.75 Gateway: @($Null, "192.168.X.X")
15:28:14.78 IP Config: Windows IP Configuration
Host Name . . . . . . . . . . . . : Computer
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : lan
Ethernet adapter Ethernet:
Connection-specific DNS Suffix . : lan
Description . . . . . . . . . . . : Intel(R) 82579LM Gigabit Network Connection
Physical Address. . . . . . . . . : XX-XX-XX-XX-XX-XX
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
Link-local IPv6 Address . . . . . : XXXX::XXXX:XXXX:XXXX:XXXX%8(Preferred)
IPv4 Address. . . . . . . . . . . : 192.168.X.X(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Lease Obtained. . . . . . . . . . : Thursday, June 8, 2017 9:08:58 AM
Lease Expires . . . . . . . . . . : Friday, June 9, 2017 9:08:57 AM
Default Gateway . . . . . . . . . : 192.168.X.X
DHCP Server . . . . . . . . . . . : 192.168.X.X
DHCPv6 IAID . . . . . . . . . . . : 9808.X.X
DHCPv6 Client DUID. . . . . . . . : XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX
DNS Servers . . . . . . . . . . . : 192.168.X.X
NetBIOS over Tcpip. . . . . . . . : Enabled
Ethernet adapter VirtualBox Host-Only Network:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : VirtualBox Host-Only Ethernet Adapter
Physical Address. . . . . . . . . : XX-XX-XX-XX-XX-XX
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Link-local IPv6 Address . . . . . : XXXX::XXXX:XXXX:XXXX:XXXX%5(Preferred)
IPv4 Address. . . . . . . . . . . : 192.168.X.X(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . :
DHCPv6 IAID . . . . . . . . . . . : 420085799
DHCPv6 Client DUID. . . . . . . . : XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX-XX
DNS Servers . . . . . . . . . . . : XXXX:0:0:XXXX:.X.X
XXXX:0:0:XXXX::2%1
XXXX:0:0:XXXX::3%1
NetBIOS over Tcpip. . . . . . . . : Enabled
Wireless LAN adapter Wi-Fi:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . : lan
Description . . . . . . . . . . . : Intel(R) Centrino(R) AdvancXX-N 6235
Physical Address. . . . . . . . . : XX-XX-XX-XX-XX-XX
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
Wireless LAN adapter Local Area Connection* 2:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter
Physical Address. . . . . . . . . : XX-XX-XX-XX-XX-XX
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
Ethernet adapter Bluetooth Network Connection:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Bluetooth Device (Personal Area Network)
Physical Address. . . . . . . . . : XX-XX-XX-XX-XX-XX
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
15:28:17.82 Ping to 192.168.X.X Pinging 192.168.X.X with 32 bytes of data:
Reply from 192.168.X.X: bytes=32 time<1ms TTL=64
Reply from 192.168.X.X: bytes=32 time=1ms TTL=64
Reply from 192.168.X.X: bytes=32 time<1ms TTL=64
Reply from 192.168.X.X: bytes=32 time<1ms TTL=64
Ping statistics for 192.168.X.X:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 1ms, Average = 0ms
15:28:17.84 End

Creating a forward looking -match

Playing around with PS and I have a simple script.
ipconfig /all | where-object {$_ -match "IPv4" -or $_ -match "Description"}
This is great and does what I would expect. What I would like to do is read ahead and only show the description preceding the IPv4 line. Or reverse search and get the IPv4 line and the next description, then look for the next IPv4 line, etc.
Is there a way to do this without spinning through creating an array and then spinning through the array extricating the meaningful parts?
This command on my laptop results in:
Description . . . . . . . . . . . : Microsoft Virtual WiFi Miniport Adapter
Description . . . . . . . . . . . : Killer Wireless-N 1103 Network Adapter
IPv4 Address. . . . . . . . . . . : 192.168.1.2(Preferred)
Description . . . . . . . . . . . : Atheros AR8151 PCI-E Gigabit Ethernet Controller (NDIS 6.20)
Description . . . . . . . . . . . : VMware Virtual Ethernet Adapter for VMnet1
IPv4 Address. . . . . . . . . . . : 192.168.122.1(Preferred)
Description . . . . . . . . . . . : VMware Virtual Ethernet Adapter for VMnet8
IPv4 Address. . . . . . . . . . . : 192.168.88.1(Preferred)
Description . . . . . . . . . . . : Microsoft ISATAP Adapter
Description . . . . . . . . . . . : Microsoft ISATAP Adapter #2
Description . . . . . . . . . . . : Microsoft ISATAP Adapter #3
Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
Description . . . . . . . . . . . : Microsoft ISATAP Adapter #4
Description . . . . . . . . . . . : Microsoft ISATAP Adapter #5
What I want is:
Description . . . . . . . . . . . : Killer Wireless-N 1103 Network Adapter
IPv4 Address. . . . . . . . . . . : 192.168.1.2(Preferred)
Description . . . . . . . . . . . : VMware Virtual Ethernet Adapter for VMnet1
IPv4 Address. . . . . . . . . . . : 192.168.122.1(Preferred)
Description . . . . . . . . . . . : VMware Virtual Ethernet Adapter for VMnet8
IPv4 Address. . . . . . . . . . . : 192.168.88.1(Preferred)
If you want to extract all Descriptions for IPv4 enabled adapters, you could try something like this:
ipconfig /all | Select-String "IPv4" -AllMatches -SimpleMatch -Context 5 | % {
$_.Context.Precontext -match "Description" -replace 'Description(?:[^:]+):(.*)$', '$1'
}
Intel(R) 82579V Gigabit Network Connection
To get it with your code, try this:
ipconfig /all | where-object {
$_ -match "IPv4" -or $_ -match "Description"
} | Select-String "IPv4" -SimpleMatch -AllMatches -Context 1 | % {
$_.context.precontext -replace 'Description(?:[^:]+):(.*)$', '$1'
}
EDIT Sorry, I misread your question earlier it seems. I thought you only wanted the description. This shows the description and IP lines for IPv4 active adapters
ipconfig /all | Select-String "IPv4" -AllMatches -SimpleMatch -Context 5 | % {
$_.Context.Precontext -match "Description"
$_.Line
}
Description . . . . . . . . . . . : Intel(R) 82579V Gigabit Network Connection
IPv4 Address. . . . . . . . . . . : xx.xx.xx.xx(Preferred)
Alternative solution:
[regex]$regex = '(?ms)^\s*(Description[^\r]+\r\n\s*IPv4[^\r]+)\r'
$regex.matches(((ipconfig /all) -match '^\s*Description|IPv4') -join "`r`n") |
foreach {$_.groups[1].value -replace '\. ',''}
Another option, which simply keeps track of the last description found in the output:
switch -regex ( ipconfig /all ) { 'IPv4' { $d + $_ } 'Description' { $d = @($_) } }
Also, the -match comparison operator can work on an array as well as a single string. So using (ipconfig /all) -match 'IPv4|Description' is equivalent to the original ipconfig /all | where { $_ -match 'IPv4' -or $_ -match 'Description' } that checks each line individually.
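The "keep track of the last description" idea is a single pass over the lines with one remembered value, so no second loop over an intermediate array is needed. As an illustration only, here is the same logic in Python. The function name and the shortened sample lines are assumptions made for the sketch.

```python
def pair_descriptions_with_ipv4(lines):
    """Return each IPv4 line preceded by the most recent Description line."""
    last_description = None
    result = []
    for line in lines:
        if "Description" in line:
            # Remember it; it is only emitted if an IPv4 line follows.
            last_description = line
        elif "IPv4" in line and last_description is not None:
            result.extend([last_description, line])
    return result


sample_lines = [
    "Description : Microsoft Virtual WiFi Miniport Adapter",
    "Description : Killer Wireless-N 1103 Network Adapter",
    "IPv4 Address : 192.168.1.2(Preferred)",
    "Description : Atheros AR8151 PCI-E Gigabit Ethernet Controller",
    "Description : VMware Virtual Ethernet Adapter for VMnet1",
    "IPv4 Address : 192.168.122.1(Preferred)",
]
for line in pair_descriptions_with_ipv4(sample_lines):
    print(line)
```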

How to preserve trailing spaces on perl variables?

Script (originally copied from here) takes a fixed-width text file as input, rearranges the order of columns, and should output a fixed-width text file. But trailing spaces are being truncated from the variables, which means the output isn't fixed-width.
open(INPUT, "</home/ecom/tmp/citiBIG/GROUP.txt");
open(OUTPUT, ">/home/ecom/tmp/citiBIG/GROUP2.txt");
my $LINEFORMAT = "A2 A7 A14 A4 A2 A2 A4 A12 A25 A30 A26 A40 A40 A40 A25 A4 A12 A14 A2 A8 A12 A70 A8"; # Adjust to your field widths
while(<INPUT>) {
    chomp;
    my($Null0, $EmpNum, $CcNumber, $Null1, $CcExpYy, $CcExpMm, $Null2, $Title, $LastName, $FirstName, $HolderName,
       $Address1, $Address2, $Address3, $Suburb, $State, $PostCode, $Null3, $AreaCode, $WorkPhone, $Null4, $Email, $GroupName) =
        unpack($LINEFORMAT, $_);
    print OUTPUT $EmpNum . " " . "~" . $LastName . "~" . $FirstName . "~" . $Title . " " . "~" .
        $Address1 . "~" . $Address2 . "~" . $Address3 . "~" . $Suburb . "~" . $PostCode . "~" . $State . "~" . $AreaCode . "~"
        . $WorkPhone . "~" . $CcNumber . "~" . $CcExpMm . "~" . $CcExpYy . "~" . $HolderName . "~" . $Email . "~" . $GroupName
        . " " . "~" . "\n";
}
close INPUT;
close OUTPUT;
perldoc -f pack suggests:
o The "a", "A", and "Z" types gobble just one value, but pack
it as a string of length count, padding with nulls or
spaces as needed. When unpacking, "A" strips trailing
whitespace and nulls, "Z" strips everything after the first
null, and "a" returns data without any sort of trimming.
Maybe you could try "a" instead of "A" in the format string? Alternatively you could use printf to pad the output fields to the desired widths.
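To make the two suggestions concrete (keep the trailing spaces when splitting, or pad each field back to its width on output), here is a small sketch of the same idea in Python rather than Perl. The function name, the three-field layout, and the sample record are made up for illustration.

```python
def split_fixed_width(line, widths):
    """Cut a line into fields of the given widths, keeping trailing spaces."""
    fields = []
    pos = 0
    for width in widths:
        # Slicing does no trimming; ljust pads a field cut short by a
        # truncated line back to its full width.
        fields.append(line[pos:pos + width].ljust(width))
        pos += width
    return fields


# Hypothetical record: 2-char employee code, 10-char surname, 4-char state.
record = "07" + "Smith".ljust(10) + "NSW".ljust(4)
emp, last, state = split_fixed_width(record, [2, 10, 4])
reordered = last + state + emp
print(repr(reordered))
```

Since every field keeps its full width, the reordered line has the same length as the input line.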