Forget string deduction - powershell

I am not sure I am phrasing this correctly so I'd rather show.
I am trimming a string in this way:
$input = '12345'
$string = $input.Substring(1,$string.Length-1)
The idea is to remove the first and the final character. It works fine on the first run. On the second run the length is already -1 so two characters are actually trimmed.
However, I want the script to always remove the final character (5), even after the first run. How do I reset it?
Thank you.

The second parameter of Substring is the length of the substring, not the ending index. Hence you want the string to be 2 characters shorter:
$inputstring = "12345"
$string = $inputstring
while ($string.Length -gt 2)
{
$string
$string = $string.Substring(1,$string.Length-2)
}
$string
This outputs:
12345
234
3

Powershell - Remove text and capitalise some letters

Been scratching my head on this one...
I'd like to remove .com and capitalize S and T from: "sometext.com"
So output would be Some Text
Thank you in advance
For most of this you can use the replace() member of the String object.
The syntax is:
$string = $string.replace('what you want replaced', 'what you will replace it with')
Replace can be used to erase things by using blank quotes '' for the second argument. That's how you can get rid of .com
$string = $string.replace('.com','')
It can also be used to insert things. You can insert a space between some and text like this:
$string = $string.replace('et', 'e t')
Note that using replace does NOT change the original variable. The command below will print "that" to your screen, but the value of $string will still be "this"
$string = 'this'
$string.replace('this', 'that')
You have to set the variable to the new value with =
$string = "this"
$string = $string.replace("this", "that")
This command will change the value of $string to that.
The tricky part here comes in changing the first t to capital T without changing the last t. With strings, replace() replaces every instance of the text.
$string = "text"
$string = $string.replace('t', 'T')
This will set $string to TexT. To get around this, you can use Regex. Regex is a complex topic. Here just know that Regex objects look like strings, but their replace method works a little differently. You can add a number as a third argument to specify how many items to replace
$string = "aaaaaa"
[Regex]$reggie = 'a'
$string = $reggie.replace($string,'A',3)
This code sets $string to AAAaaa.
So here's the final code to change sometext.com to Some Text.
$string = 'sometext.com'
#Use replace() to remove text.
$string = $string.Replace('.com','')
#Use replace() to change text
$string = $string.Replace('s','S')
#Use replace() to insert text.
$string = $string.Replace('et', 'e t')
#Use a Regex object to replace the first instance of a string.
[regex]$pattern = 't'
$string = $pattern.Replace($string, 'T', 1)
What you're trying to achieve isn't well-defined, but here's a concise PowerShell Core solution:
PsCore> 'sometext.com' -replace '\.com$' -replace '^s|t(?!$)', { $_.Value.ToUpper() }
SomeText
-replace '\.com$' removes a literal trailing .com from your input string.
-replace '^s|t(?!$)', { ... } matches an s char. at the start (^), and a t that is not (!) at the end ($); (?!...) is a so-called negative look-ahead assertion that looks ahead in the input string without including what it finds in the overall match.
Script block { $_.Value.ToUpper() } is called for each match, and converts the match to uppercase.
-replace (a.k.a -ireplace) is case-INsensitive by default; use -creplace for case-SENSITIVE replacements.
For more information about PowerShell's -replace operator see this answer.
Passing a script block ({ ... }) to dynamically determine the replacement string isn't supported in Windows PowerShell, so a Windows PowerShell solution requires direct use of the .NET [regex] class:
WinPs> [regex]::Replace('sometext.com' -replace '\.com$', '^s|t(?!$)', { param($m) $m.Value.ToUpper() })
SomeText

In Perl, substitution operator for removing space is removing value 0 when removing space

We have code where input could be either single value or comma separated value. We need to remove any spaces present before and after each value.
We are doing as below:
my @var_1 = split /,/,$var;
print "print_1 : @var_1 \n ";
@var_1 = grep {s/^\s+|\s+$//g; $_ } @var_1;
print "print_2 : @var_1 \n ";
$var contains the input value. If $var is 0, print_1 prints the value 0 but print_2 prints nothing. Our requirement is just to remove spaces before and after the value 0. If $var is 1, both print_1 and print_2 correctly print the value 1. If we give the input 1,0, the 0 is removed and print_2 prints only the value 1.
I am not sure why it is removing value 0. Is there any correction that can be done to substitution operator not to remove value 0 ?
Thanks in advance!!!
In Perl, only a few distinct values are false. These are primarily
undef
the integer 0
the unsigned integer 0
the floating point number 0
the string 0
the empty string ""
You've got the empty string variant and 0 here.
@var_1 = grep {s/^\s+|\s+$//g; $_ } @var_1;
This code can go in three ways:
$_ gets cleaned up and becomes foo. We want it to pass.
$_ gets cleaned up and becomes 0. We want it to pass.
$_ gets cleaned up and becomes the empty string "". We want it to fail.
But 0 is false, and grep only lets a value through if the last statement in its block is true, so 0 is filtered out along with the empty string. That's what we want for the empty string "", but not for 0. The fix is to test for the empty string explicitly:
@var_1 = grep {s/^\s+|\s+$//g; $_ ne "" } @var_1;
Now that we check explicitly that the cleaned up value is not the empty string "", zero 0 is allowed.
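To make the difference concrete, here is a minimal sketch that runs both grep blocks over the same data. The sample values in @raw are made up for illustration; each filter works on its own copy because grep aliases $_ to the array elements, so the substitution modifies them in place.

```perl
use strict;
use warnings;

# Hypothetical sample input: a padded "0" and a whitespace-only field.
my @raw = (' 0 ', ' 1', '   ', 'foo ');

# Filter on truthiness: the trimmed "0" is false, so it is dropped too.
my @copy_a   = @raw;
my @by_truth = grep { s/^\s+|\s+$//g; $_ } @copy_a;

# Filter on "not the empty string": the trimmed "0" survives.
my @copy_b    = @raw;
my @by_length = grep { s/^\s+|\s+$//g; $_ ne "" } @copy_b;

print "by truth : @by_truth\n";
print "by length: @by_length\n";
```

Only the second filter keeps the 0; both drop the field that becomes empty after trimming.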
Here's a complete version with cleaned up variable names (naming is important!).
my $input = q{foo, bar, 1 23 ,,0};
my @values = split /,/,$input;
print "print_1 : @values \n ";
@values = grep {s/^\s+|\s+$//g; $_ ne q{} } @values;
print "print_2 : @values \n ";
The output is:
print_1 : foo bar 1 23 0
print_2 : foo bar 1 23 0
Note that your grep is not the optimal solution. As always, there is more than one way to do it in Perl. The for loop that Сухой27 suggests in their answer is way more concise and I would go with that.
If you want to split on commas and remove leading and trailing whitespace from each of the resulting strings, that translates pretty literally into code:
my @var = map s/^\s+|\s+\z//gr, split /,/, $var, -1;
/r makes s/// return the result of the substitution (requires perl 5.14+). -1 on the split is required to keep it from ignoring trailing empty fields.
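The effect of the -1 limit is easy to miss, so here is a small sketch of it. The input string is a made-up example with two trailing empty fields, and the block needs perl 5.14+ because of /r.

```perl
use strict;
use warnings;

# Hypothetical input: two padded values followed by two empty fields.
my $var = ' a , 0 ,,';

my @default = split /,/, $var;       # trailing empty fields are dropped
my @all     = split /,/, $var, -1;   # a negative limit keeps them

# Trim each field; /r returns the trimmed copy instead of a count.
my @trimmed = map { s/^\s+|\s+\z//gr } @all;

printf "default: %d fields, with -1: %d fields\n", scalar @default, scalar @all;
print join('|', @trimmed), "\n";
```

Joining on | makes the empty fields visible in the output, which a plain space-separated print would hide.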
If there are no zero length entries (so not e.g. a,,b), you can just extract what you want (sequences of non-commas that don't start or end with whitespace) directly from the string instead of first splitting it:
@var = $var =~ /(?!\s)[^,]+(?<!\s)/g;
You want
@var_1 = map { my $v = s/^\s+|\s+$//gr; length($v) ? $v : () } @var_1;
instead of
@var_1 = grep {s/^\s+|\s+$//g; $_ } @var_1;
grep is used for filtering list elements, and all false values are filtered out (including '', 0, and undef).
I suggest cleaning up the array using map with a regex pattern that matches from the first to the last non-space character
There's also no need to do the split operation separately
Like this
my @var_1 = map { / ( \S (?: .* \S )? ) /x } split /,/, $var;
Note that this method removes empty fields. It's unclear whether that was required or not
You can also use
@values = map {$_ =~ s/^\s+|\s+$//gr } @values;
or, even more concisely,
@values = map {s/^\s+|\s+$//gr } @values;
to remove the spaces in your array.
Do not forget the r, as it is the non-destructive option; otherwise you will replace each string with the number of substitutions made.
That said, it will only work if you use Perl 5.14 or higher;
there is some documentation here ;)
https://www.perl.com/pub/2011/05/new-features-of-perl-514-non-destructive-substitution.html
I think this syntax is easier to understand since it is closer to the "usual" method of substitution.
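As a rough illustration of why the r matters, the sketch below runs the same map with and without it. The sample values are invented; the second map works on a copy because, without /r, the substitution also edits the array elements in place.

```perl
use strict;
use warnings;

# Hypothetical sample values with leading and trailing spaces.
my @values = (' a ', 'b  ');

# With /r: map returns the trimmed strings and @values is left untouched.
my @trimmed = map { s/^\s+|\s+$//gr } @values;

# Without /r: map returns how many substitutions were made per element,
# and the elements of @copy are modified in place.
my @copy   = @values;
my @counts = map { s/^\s+|\s+$//g } @copy;

print "trimmed: @trimmed\n";
print "counts : @counts\n";
```

Here ' a ' needs two substitutions (one per side) and 'b  ' needs one, so the second map yields counts rather than strings.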

Using PowerShell To Count Sentences In A File

I am having an issue with my PowerShell Program counting the number of sentences in a file I am using. I am using the following code:
foreach ($Sentence in (Get-Content file))
{
$i = $Sentence.Split("?")
$n = $Sentence.Split(".")
$Sentences += $i.Length
$Sentences += $n.Length
}
The total number of sentences I should get is 61 but I am getting 71, could someone please help me out with this? I have Sentences set to zero as well.
Thanks
foreach ($Sentence in (Get-Content file))
{
$i = $Sentence -split "[?\.]"
$Sentences += $i.Length
}
I edited your code a bit.
It now uses the -split operator, which treats its argument as a regular expression. An unescaped . there is the regex wildcard, which means "any character", so it has to be escaped or placed in a character class.
So you should split the string on "[?\.]" or similar.
When counting sentences, what you are looking for is where each sentence ends. Splitting, though, returns a collection of sentence fragments around those end characters, with the ends themselves represented by the gap between elements. Therefore, the number of sentences will equal the number of gaps, which is one less than the number of fragments in the split result.
Of course, as Keith Hill pointed out in a comment above, the actual splitting is unnecessary when you can count the ends directly.
foreach( $Sentence in (Get-Content test.txt) ) {
# Split at every occurrence of '.' and '?', and count the gaps.
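# The [char[]] cast selects the Split(char[]) overload explicitly; in PowerShell (Core), a bare '.?' may bind to a string overload instead.
$Split = $Sentence.Split( [char[]]'.?' )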
$SplitSentences += $Split.Count - 1
# Count every occurrence of '.' and '?'.
$Ends = [char[]]$Sentence -match '[.?]'
$CountedSentences += $Ends.Count
}
Contents of test.txt file:
Is this a sentence? This is a
sentence. Is this a sentence?
This is a sentence. Is this a
very long sentence that spans
multiple lines?
Also, to clarify on the remarks to Vasili's answer: the PowerShell -split operator interprets a string as a regular expression by default, while the .NET Split method only works with literal string values.
For example:
'Unclosed [bracket?' -split '[?]' will treat [?] as a regular expression character class and match the ? character, returning the two strings 'Unclosed [bracket' and ''
'Unclosed [bracket?'.Split( '[?]' ) will call the Split(char[]) overload and match each [, ?, and ] character, returning the three strings 'Unclosed ', 'bracket', and ''

How to get rid of control characters in perl.. specifically [gs]?

my code is as follows
my $string = $cells[71];
print $string;
this prints the string but where spaces should be there is a box with 01 10 in it. I opened it in Notepad++ and the box turned into a black GS (which i am assuming is group separator).
I looked online and it said to use:
s/[^[:print:]]+//g
but when i set the string to:
my $string =~s/[^[:print:]]+//g
and I run the program i get:
4294967295
How do i resolve this?
I did what HOBBS said and it worked... thanks :)
Is there anyway I could print an enter where each of these characters are ( the box with 1001)?
When doing a regex match, you need to be careful to write $var =~ /pattern/, not $var = ~ /pattern/. When you use the second one, you're doing /pattern/, which is a regex match against $_, returning a number in scalar context. Then you do ~, which takes the bitwise inverse of that number, then ($var =) you assign that result to $var. Not what you wanted at all.
You have to assign the variable first, then do the substitution:
my $string = $cells[71];
$string =~ s/[^[:print:]]+//g;
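For the follow-up about printing a line break where each of those characters is: a minimal sketch, assuming the character really is the ASCII group separator (code point 0x1D), as Notepad++ suggests. The sample string stands in for $cells[71], whose real contents are not shown.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $cells[71]: fields joined by GS (0x1D).
my $string = "field one\x1Dfield two\x1Dfield three";

# Copy first, then substitute on the copy: replace each GS with a newline.
(my $lines = $string) =~ s/\x1D/\n/g;

# For comparison: strip every non-printable character instead.
(my $clean = $string) =~ s/[^[:print:]]+//g;

print "$lines\n";
print "$clean\n";
```

The (my $copy = $original) =~ s/// form keeps the assignment and the substitution as two clearly separate steps, which avoids the = ~ pitfall described above.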

Perl - get first "word" from input string

I am trying to write a Perl program that reads in lines from a text file, and, for each line, extract the first "word" from the line, and perform a different action based on the string that gets returned.
The main loop looks like this:
while(<AXM60FILE>) {
$inputline = $_;
($start) = ($inputline =~ /\A(.*?) /);
# perform something, based on the value of the string in $start
}
The input file is actually a parameter file, with the parameter_name and parameter_value, separated by a colon (":"). There can be spaces or tabs before or after the colon.
So, the file looks (for example) like the following:
param1: xxxxxxxxxxxx
param2 :xxxxxxxxxxxxx
param3 : xxxxxxxxxxxxxxxxx
param4:xxxxxxxxxxxxx
That "($start) = ($inputline =~ /\A(.*?) /);" works ok for the "param2" example and the "param3" example where the 1st word is terminated by a blank/space, but how can I handle the "param1" and "param4" situations, where the parameter_name is followed immediately by the colon?
Also, what about if the "whitespace" is a tab or tabs, instead of blank/space character?
Thanks,
Jim
This will cover all of your cases and then some:
my ($key, $value) = split /\s*:\s*/, $inputline, 2;
(Or, in English, split $inputline into a maximum of two elements separated by any amount of whitespace, a colon and any amount of whitespace.)
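As a quick check, here is a sketch that applies this split to the four sample layouts from the question. The values, the tab-separated third line, and the %params hash are made up for illustration; the chomp is an addition so that the trailing newline does not end up in the value.

```perl
use strict;
use warnings;

# Hypothetical lines covering the layouts from the question (one uses tabs).
my @lines = (
    "param1: xxx\n",
    "param2 :yyy\n",
    "param3\t:\tzzz\n",
    "param4:www\n",
);

my %params;
for my $line (@lines) {
    chomp $line;                                      # drop the newline
    my ($key, $value) = split /\s*:\s*/, $line, 2;    # at most two fields
    $params{$key} = $value;
}

print "$_ => $params{$_}\n" for sort keys %params;
```

The limit of 2 matters when a value itself contains a colon: everything after the first separator stays in $value.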
($start) = $inputline =~ /\A([^:\s]+)/;
This will match anything except whitespace and : at the beginning of the line.
Or using split:
($start) = split /[:\s]+/, $inputline, 2;