How to convert spaces into new lines using Perl?

I have the following input, which is stored in a single scalar variable named $var1.
Input (i.e. the content of $var1):
Gain Dead_coverage Export_control Functional_coverage Function_logic top dac_decoder Datapath System_Level Black_DV Sync_logic temp1 temp2 temp3 temp4 temp5 temp6 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
Expected output:
Gain
Dead_coverage
Export_control
Functional_coverage
Function_logic
top
dac_decoder
Datapath
System_Level
Black_DV
Sync_logic
temp1
temp2
temp3
temp4
temp5
temp6
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
My code:
I tried the following regular expression.
$var1=tr{\s}{\n};
This does not produce my expected output.
Note: the numbers may run up to any n, and the words may start or end with upper- or lower-case letters. In every case I need output like the above. Which regular expression can I use for this?
Requirements:
1. Split on spaces, putting each item on a new line.
2. A run of digits (i.e. 123456789101112.....) should be split as follows:
1
2
3
4
5
6
7
8
9
10
11
12
.
.
.
and so on.
After 9, the numbers should be read as two-digit numbers.

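tr is a transliteration. That only works with individual characters, not patterns. Your line also uses = rather than =~, so tr runs against $_ and its return value (a count of characters) is what gets assigned to $var1. You need to use s/// with the /g modifier.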
$var1 =~ s/\s/\n/g;
You can also do this with split and join.
$var1 = join "\n", split / /, $var1;
It shouldn't make a difference in terms of performance, even if there are a lot of strings.
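The substitution above handles the first requirement only. For the second one (breaking a run such as 123456789101112 into 1, 2, ..., 9, 10, 11, 12), here is a minimal sketch of the logic, written in Python rather than Perl. The function names are made up for illustration, and it assumes the digit run always counts upward from 1; a numeric token that does not fit that pattern is left unchanged.

```python
def split_counting_run(token):
    """Split '123...91011' into ['1', '2', ..., '9', '10', '11'].

    Assumption: the run counts upward from 1. If the token does not
    match that pattern exactly, it is returned unsplit.
    """
    parts = []
    pos, n = 0, 1
    while pos < len(token):
        expected = str(n)
        if not token.startswith(expected, pos):
            return [token]  # not a 1, 2, 3, ... run; leave as-is
        parts.append(expected)
        pos += len(expected)
        n += 1
    return parts


def to_lines(text):
    """Put every whitespace-separated item on its own line."""
    out = []
    for token in text.split():
        if token.isdigit():
            out.extend(split_counting_run(token))
        else:
            out.append(token)
    return "\n".join(out)


if __name__ == "__main__":
    print(to_lines("Gain Dead_coverage temp6 123456789101112"))
```

The same greedy loop carries over to Perl with substr or index in place of startswith. Note that a short token such as 12 is ambiguous (twelve, or 1 followed by 2); the sketch treats it as a counting run.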

Related

How to read a block of rows into a single record with PowerShell?

How would columns of data for a block of text:
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ cat multiple_lines.data
a 4
b 5
d 6
e 7
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ datamash transpose < multiple_lines.data > transposed.data
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ cat transposed.data
a 4 b 5 d 6 e 7
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ datamash transpose < transposed.data
a 4
b 5
d 6
e 7
nicholas@mordor:~/powershell$
be fed into a CSV-type file so that column a has the value 4, etc.? Here c has been omitted, but it can be assumed to be present, or at least that missing columns can be added.
No doubt awk would be fantastic at grabbing the above numbers, but I am looking to use PowerShell here. Output to JSON or XML would be just as good as CSV; almost any data interchange format would be fine.
Assume an array of such blocks.
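When reading from a file, use Import-Csv instead of ConvertFrom-Csv. PowerShell does not detect the space delimiter automatically (the default is a comma), so pass -Delimiter ' ', and since the data has no header row, supply column names with -Header.
$txt = @"
a 4
b 5
d 6
e 7
"@
# Name and Value are arbitrary column names chosen for this example
$table = $txt | ConvertFrom-Csv -Delimiter ' ' -Header Name, Value
$table
# Fold the rows into one record so that property a has the value 4, etc.
$record = [ordered]@{}
$table | ForEach-Object { $record[$_.Name] = $_.Value }
[pscustomobject]$record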
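The reshaping the question asks for (one record per block, with a as a column holding 4) can also be sketched outside PowerShell. This is a rough illustration in Python, not the answerer's method; the function names and the explicit column list are assumptions, and a missing column such as c is simply left empty.

```python
import csv
import io


def block_to_record(block):
    """Turn lines like 'a 4' into a dict such as {'a': '4'}."""
    record = {}
    for line in block.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            key, value = parts
            record[key] = value
    return record


def records_to_csv(records, columns):
    """Write one CSV row per record; absent columns become empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    blocks = ["a 4\nb 5\nd 6\ne 7"]  # an array of such blocks
    records = [block_to_record(b) for b in blocks]
    print(records_to_csv(records, ["a", "b", "c", "d", "e"]))
```

Passing the full column list up front is what lets the omitted c column appear in the output as an empty cell.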

Combining Multiple String Commands Into One Line

I'm using PowerShell and running a tool to extract Lenovo hardware RAID controller info to identify the controller number for use later on in another command line (this is part of a SCCM Server Build Task Sequence). The tool outputs a lot of data and I'm trying to isolate just what I need from the output.
I've been able to isolate what I need, but I'm thinking there has to be a more efficient way so looking for optimizations. I'm still learning when it comes to working with strings.
The line output from the tool that I'm looking for looks like this:
0 0 0 252:0 17 DRIVE Onln N 557.861 GB dsbl N N dflt -
I'm trying to get the 3 characters to the left of the :0 (the 252 but on other models this could be 65 or some other 2 or 3 digit number)
My existing code is:
$ControllerInfo = cmd /c '<path>\storcli64.exe /c0 show'
foreach ($line in $ControllerInfo) {
    if ($line -like '*:0 *') {
        $ControllerNum = $line.split(':')[0] # Get everything left of :
        $ControllerNum = $ControllerNum.Substring($ControllerNum.Length - 3) # Get last 3 chars of string
        $ControllerNum = $ControllerNum.Replace(' ', '') # Remove blanks
        Write-Host $ControllerNum
        break # Stop looping through output
    }
}
The above works, but I'm wondering if there's a way to combine the three lines that start with $ControllerNum = so I can have a single $ControllerNum = (commands) line to set the variable instead of doing it in 3 lines. Basically, I want to combine the Split, Substring and Replace commands into a single line.
Thanks!
Here's another option:
$ControllerNum = ([regex]'(\d{2,3}):0').Match($line).Groups[1].Value
Used on your sample 0 0 0 252:0 17 DRIVE Onln N 557.861 GB dsbl N N dflt -
the result in $ControllerNum will be 252.
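If you want just the digits immediately before the colon, without any whitespace, a single regex replacement does it. The sample line contains only one colon; if there were several, the greedy leading .* would pick the digits before the last one: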
$line -replace '^.*\b(\d+):.*$','$1'
Regex explanation:
^ # start of string
.* # any number of any characters
\b # word boundary
( # start capture group
\d+ # 1 or more digits
) # end capture group
: # a literal colon (:)
.* # any number of any characters
$ # end of string
replacement:
$1 # Value captured in the capture group above
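Both patterns can be checked against the sample line from the question. The sketch below uses Python's re module instead of PowerShell, so the syntax differs slightly (\1 in place of $1), and the function names are invented for the example.

```python
import re

SAMPLE = "0 0 0 252:0 17 DRIVE Onln N 557.861 GB dsbl N N dflt -"


def controller_num(line):
    """Capture a 2- or 3-digit number directly followed by ':0'."""
    match = re.search(r"(\d{2,3}):0", line)
    return match.group(1) if match else None


def digits_before_colon(line):
    """Replace the whole line with the digits left of the colon.

    As with -replace, a line that does not match comes back unchanged.
    """
    return re.sub(r"^.*\b(\d+):.*$", r"\1", line)


if __name__ == "__main__":
    print(controller_num(SAMPLE))
    print(digits_before_colon(SAMPLE))
```

The capture-based version returns nothing when there is no match, which is easier to test for than an unchanged line.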

split a line into its components

I need to split the lines of an input file into its columns.
ATOM 0 HB3 ALA C 999 28.811 -7.680 12.279 1.00 57.53 H
ATOM 7637 N PRO C1000 27.299 -5.667 10.647 1.00216.82 N
The code I have works fine as long as the 6th column is below 1000, i.e. shorter than 4 digits:
($ATOM, $atom_num, $atom_type, $res, $chain, $res_num) = split(" ", $pdb);
However, as soon as column 6 reaches 1000, split can no longer separate columns 5 and 6, because no whitespace remains between them (C1000). I am no expert in Perl, but the code I am dealing with is Perl, so I need to figure out how to split this, e.g. by the number of characters in each column.
Any suggestions?
I solved it by using unpack and defining the width of each column, so the fields are cut by position rather than by whitespace.
$format = 'A6 A6 A5 A4 A1 A5';
($ATOM, $atom_num, $atom_type, $res, $chain, $res_num) = unpack($format, $pdb);
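To make the position-based idea concrete, here is a small sketch in Python that slices a line using the same six widths as the unpack template. The sample line is constructed for the example, because the lines quoted in the question have lost their original spacing, and split_fixed is a made-up name. Unlike Perl's A format, which strips only trailing spaces, strip() here also removes leading ones.

```python
from itertools import accumulate

# Same field widths as the template 'A6 A6 A5 A4 A1 A5'
WIDTHS = (6, 6, 5, 4, 1, 5)


def split_fixed(line, widths=WIDTHS):
    """Cut a line into fields by character position, not whitespace."""
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return [line[s:e].strip() for s, e in zip(starts, ends)]


# Constructed sample: chain 'C' sits directly against residue 1000
SAMPLE = f"{'ATOM':<6}{7637:>5} {' N':<4} {'PRO':>3} {'C'}{1000:>4} "

if __name__ == "__main__":
    print(split_fixed(SAMPLE))
```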

Perl: print certain rows based on certain values of column

Hey guys, I'm a beginner in Perl programming. My list.txt has 5 rows and 7 columns, and I want to print certain rows based on the values in certain columns. For example:
NO. RES REF ERRORS WARNING PROB_E PROB_C
1 k C 0 0 0.240 0.713
2 l C 16 2 0.365 0.568
3 n C 7 4 0.365 0.568
4 f E 0 0 0.613 0.342
From columns 4 and 5 (ERRORS and WARNING), I want to print all the rows that have a value different from 0. In this case the output is rows 2 and 3. I hope I made myself clear :) Sorry for my poor English.
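Try this. The -a switch autosplits each line on whitespace into @F, so $F[3] and $F[4] are the ERRORS and WARNING columns:
perl -ane 'print if $. > 1 and ($F[3] or $F[4])' list.txt
The $. > 1 test skips the header line, which would otherwise be printed because the strings ERRORS and WARNING count as true.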
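For anyone who wants the filter spelled out step by step, this is a rough Python sketch of the same idea. It assumes the first line is a header and that the fourth and fifth fields are integers; rows_with_issues is a name invented for the example.

```python
def rows_with_issues(lines):
    """Keep data rows whose ERRORS or WARNING field is non-zero."""
    kept = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if number == 1 or len(fields) < 5:
            continue  # skip the header and any short or blank line
        if int(fields[3]) != 0 or int(fields[4]) != 0:
            kept.append(line)
    return kept


TABLE = [
    "NO. RES REF ERRORS WARNING PROB_E PROB_C",
    "1 k C 0 0 0.240 0.713",
    "2 l C 16 2 0.365 0.568",
    "3 n C 7 4 0.365 0.568",
    "4 f E 0 0 0.613 0.342",
]

if __name__ == "__main__":
    for row in rows_with_issues(TABLE):
        print(row)
```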

Unicode character transformation in SPSS

I have a string variable. I need to convert all non-digit characters to spaces (" "). I have a problem with Unicode characters: those outside the basic character set are converted to invalid characters. See the code below for an example.
Is there another way to achieve the same result with a procedure that does not choke on special Unicode characters?
new file.
set unicode = yes.
show unicode.
data list free
/T (a10).
begin data
1234
5678
absd
12as
12(a
12(vi
12(vī
12āčž
end data.
string Z (a10).
comp Z = T.
loop #k = 1 to char.len(Z).
if ~range(char.sub(Z, #k, 1), "0", "9") sub(Z, #k, 1) = " ".
end loop.
comp Z = normalize(Z).
comp len = char.len(Z).
list var = all.
exe.
The result:
T Z len
1234 1234 4
5678 5678 4
absd 0
12as 12 2
12(a 12 2
12(vi 12 2
12(vī 12 � 6
>Warning # 649
>The first argument to the CHAR.SUBSTR function contains invalid characters.
>Command line: 1939 Current case: 8 Current splitfile group: 1
12āčž 12 �ž 7
Number of cases read: 8 Number of cases listed: 8
The substr function should not be used on the left hand side of an expression in Unicode mode, because the replacement character may not be the same number of bytes as the character(s) being replaced. Instead, use the replace function on the right hand side.
The corrupted characters you are seeing are due to this size mismatch.
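The size mismatch is easy to see outside SPSS. The Python sketch below is only an illustration of the byte-versus-character point, not SPSS syntax: a character such as ī takes two bytes in UTF-8, so overwriting a single byte with a space leaves half a character behind, while a replacement that works on whole characters does not. blank_non_digits is a made-up name.

```python
import re


def blank_non_digits(text):
    """Replace every character that is not 0-9 with one space."""
    return re.sub(r"[^0-9]", " ", text)


# '\u012b' is the letter i with macron from the sample data
VALUE = "12(v\u012b"

CHAR_COUNT = len(VALUE)                  # counts characters
BYTE_COUNT = len(VALUE.encode("utf-8"))  # counts bytes

if __name__ == "__main__":
    print(CHAR_COUNT, BYTE_COUNT)
    print(repr(blank_non_digits(VALUE)))
```

Here the value has 5 characters but 6 bytes, which matches the length of 6 reported for that case in the listing above.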
How about, instead of replacing non-numeric characters, you cycle through the string, pull out the numeric characters and rebuild Z? (Note that my version here predates the CHAR. string functions.)
data list free
/T (a10).
begin data
1234
5678
absd
12as
12(a
12(vi
12(vī
12āčž
12as23
end data.
STRING Z (a10).
STRING #temp (A1).
COMPUTE #len = LENGTH(RTRIM(T)).
LOOP #i = 1 to #len.
COMPUTE #temp = SUBSTR(T,#i,1).
DO IF INDEX('0123456789',#temp) > 0.
COMPUTE Z = CONCAT(SUBSTR(Z,1,#i-1),#temp).
ELSE.
COMPUTE Z = CONCAT(SUBSTR(Z,1,#i-1)," ").
END IF.
END LOOP.
EXECUTE.