Replace character in pig in HDInsight using powershell - powershell

My data is in the following format:
{"Foo":"ABC","Bar":"20090101100000","Quux":"{\"QuuxId\":1234,\"QuuxName\":\"Sam\"}"}
I need it to be in this format:
{"Foo":"ABC","Bar":"20090101100000","Quux":{"QuuxId":1234,"QuuxName":"Sam"}}
I'm trying to use Pig's REPLACE function to get it into the format I need, so I tried this:
#Specify the cluster name
$clusterName = "CLUSTERNAME"
#Where the output will be saved
$statusFolder = "/tutorial/pig/status"
#Store the Pig Latin into $QueryString
$QueryString = "LOGS = LOAD 'wasb:///example/data/sample.log'as unparsedString:chararray;" +
"REPL1 = foreach LOGS REPLACE($0, '"\\{', '\\{');"
...and so on..
I receive an error at the second line (REPL1 =...)
Unexpected token '\\' in expression or statement.
Now this code works perfectly well when I run it using remote desktop
Any help is sincerely appreciated..
Thanks

I assume you attempt to store the following string value in the variable:
REPL1 = foreach LOGS REPLACE($0, '"\\{', '\\{');
The first "interpretation" of your string is by the PowerShell parser. Since you use double-quotes ("), it's treated as an expandable string.
Since you don't escape the " inside the REPLACE() statement, the parser assumes that the string stops there.
What you're left with is:
"REPL1 = foreach LOGS REPLACE(, '"
# a valid string, $0 expanded to an empty string
\\
# two backslashes; PowerShell cannot resolve these to anything meaningful
{
# opening curly brace
', '
# a valid string literal
\\
# two backslashes; PowerShell still cannot resolve these to anything meaningful
{
# opening curly brace
');"
# non-terminated string
You need to escape the " inside REPLACE(), either by using two double quotes in succession ("") or by using the backtick escape sequence (`"):
$QueryString += "REPL1 = foreach LOGS REPLACE($0, '`"\\{', '\\{');"
or
$QueryString += "REPL1 = foreach LOGS REPLACE($0, '""\\{', '\\{');"
You might also want to escape $0, to avoid string expansion:
$QueryString += "REPL1 = foreach LOGS REPLACE(`$0, '""\\{', '\\{');"
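As a minimal sketch (not part of the original answer), the two escaped forms can be compared against a single-quoted here-string, which is literal and therefore needs no escaping at all. The variable names $viaBacktick, $viaDoubled and $viaHereString are illustrative only.

```powershell
# Three ways to build the same Pig statement; the variable names are illustrative.
$viaBacktick = "REPL1 = foreach LOGS REPLACE(`$0, '`"\\{', '\\{');"
$viaDoubled  = "REPL1 = foreach LOGS REPLACE(`$0, '""\\{', '\\{');"

# A single-quoted here-string is literal: neither $0 nor " needs escaping.
$viaHereString = @'
REPL1 = foreach LOGS REPLACE($0, '"\\{', '\\{');
'@

# Case-sensitive comparisons; both should yield True.
$viaBacktick -ceq $viaDoubled
$viaBacktick -ceq $viaHereString
```

The here-string form is usually the easiest to read when the embedded language (here Pig Latin) uses both quote characters and $ itself.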

Related

Powershell - Remove text and capitalise some letters

Been scratching my head on this one...
I'd like to remove .com and capitalize S and T from: "sometext.com"
So output would be Some Text
Thank you in advance
For most of this you can use the replace() member of the String object.
The syntax is:
$string = $string.replace('what you want replaced', 'what you will replace it with')
Replace can be used to erase things by using blank quotes '' for the second argument. That's how you can get rid of .com
$string = $string.replace('.com','')
It can also be used to insert things. You can insert a space between some and text like this:
$string = $string.replace('et', 'e t')
Note that using replace does NOT change the original variable. The command below will print "that" to your screen, but the value of $string will still be "this"
$string = 'this'
$string.replace('this', 'that')
You have to set the variable to the new value with =
$string = "this"
$string = $string.replace("this", "that")
This command will change the value of $string to that.
The tricky part here comes in changing the first t to capital T without changing the last t. With strings, replace() replaces every instance of the text.
$string = "text"
$string = $string.replace('t', 'T')
This will set $string to TexT. To get around this, you can use a regex. Regex is a complex topic; for now, just know that a [Regex] object looks like a string, but its Replace method works a little differently: you can pass a number as a third argument to limit how many matches are replaced.
$string = "aaaaaa"
[Regex]$reggie = 'a'
$string = $reggie.Replace($string,'A',3)
This code sets $string to AAAaaa.
So here's the final code to change sometext.com to Some Text.
$string = 'sometext.com'
#Use replace() to remove text.
$string = $string.Replace('.com','')
#Use replace() to change text
$string = $string.Replace('s','S')
#Use replace() to insert text.
$string = $string.Replace('et', 'e t')
#Use a Regex object to replace the first instance of a string.
[regex]$pattern = 't'
$string = $pattern.Replace($string, 'T', 1)
What you're trying to achieve isn't well-defined, but here's a concise PowerShell Core solution:
PsCore> 'sometext.com' -replace '\.com$' -replace '^s|t(?!$)', { $_.Value.ToUpper() }
SomeText
-replace '\.com$' removes a literal trailing .com from your input string.
-replace '^s|t(?!$)', { ... } matches an s char. at the start (^), and a t that is not (!) at the end ($); (?!...) is a so-called negative look-ahead assertion that looks ahead in the input string without including what it finds in the overall match.
Script block { $_.Value.ToUpper() } is called for each match, and converts the match to uppercase.
-replace (a.k.a -ireplace) is case-INsensitive by default; use -creplace for case-SENSITIVE replacements.
For more information about PowerShell's -replace operator see this answer.
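To illustrate the case-sensitivity note, here is a small sketch comparing the two operators with the .NET String.Replace() method. The sample word and variable names are arbitrary.

```powershell
$word = 'Test'

$insensitive = $word -replace 't', 'x'    # matches both 'T' and 't'
$sensitive   = $word -creplace 't', 'x'   # matches only the lowercase 't'
$viaMethod   = $word.Replace('t', 'x')    # String.Replace(string, string) is case-sensitive

$insensitive   # xesx
$sensitive     # Tesx
$viaMethod     # Tesx
```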
Passing a script block ({ ... }) to dynamically determine the replacement string isn't supported in Windows PowerShell, so a Windows PowerShell solution requires direct use of the .NET [regex] class:
WinPs> [regex]::Replace('sometext.com' -replace '\.com$', '^s|t(?!$)', { param($m) $m.Value.ToUpper() })
SomeText

Powershell Store Variables into Variable does not work

I need some help with PowerShell; I hope somebody can help.
I want to store multiple variables in one single variable.
Here is my code:
$Content = @"
$Var1 = "1"
$Var2 = "2"
$Var3 = "3"
"@
echo $Content
And that's my output:
echo $Content
= "1"
= "2"
= "3"
It should look like this:
$Var1 = "1", etc...
But every variable gets removed. I don't know why. Could somebody please explain this? Do I need to store them in something like an object?
Thanks in advance!
The quickest fix for your situation is to switch to a single-quoted here-string.
$Content = @'
$Var1 = "1"
$Var2 = "2"
$Var3 = "3"
'@
In general, double quotes around a string instruct PowerShell to interpolate, while single quotes tell PowerShell to treat the content as a literal string. This also applies to here-strings (@"..."@ and @'...'@).
# Double Quotes
$string = "my string"
"$string"
my string
@"
$string
"@
my string
# Single Quotes
'$string'
$string
@'
$string
'@
$string
When interpolation happens, variables are expanded and evaluated. A variable name starts at the $ character and runs until a character that is illegal in a variable name is read. Notice in the example below how the variable interpolation stops at the period (.):
"$string.length"
my string.length
You can use escape characters or other operators to control interpolation within double quoted strings. The subexpression operator $() allows an expression to be evaluated before it is converted into a string.
"$($string.length)"
9
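A related case is text that should follow a variable name directly. The sketch below assumes a second, hypothetical variable $strings exists to show what PowerShell picks up without delimiters; ${...} marks where the variable name ends.

```powershell
$string  = 'my string'
$strings = 'a different variable'   # hypothetical, only here to show the ambiguity

$greedy     = "$strings"             # the whole name 'strings' is read
$delimited  = "${string}s"           # braces end the name after 'string'
$expression = "$($string.Length)s"   # subexpression evaluated first

$greedy       # a different variable
$delimited    # my strings
$expression   # 9s
```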
The backtick character is the PowerShell escape character. It can be used to escape $ when you don't want it to be treated as a variable.
"`$string"
$string
Mixing quotes can create certain gotchas. If you surround your string with single quotes, everything inside will be a literal string regardless of using escape characters or subexpressions.
'"$($string.length)"'
"$($string.length)"
'"`$string"'
"`$string"
Surrounding your string with double quotes with inside single quotes will just treat the inside single quotes literally because the outer quotes determine how the string will be interpreted.
"'$string'"
'my string'
Using multiple single-quote pairs or double-quote pairs requires special treatment. Inside a double-quoted string, you can use either the backtick escape or two consecutive " to print a single ".
"""$string"""
"my string"
"`"$string`""
"my string"
For multiple single quotes, you must use two ' because a backtick will be treated literally and not tell PowerShell to escape anything.
'''$string'''
'$string'
Please reference About_Quoting_Rules for official documentation.
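On the question of storing the values "into something like an object": if the goal is to keep several values together rather than to build script text, a hashtable is one option. This is only a sketch under that assumption; the key names mirror the question, and $lines is an illustrative name.

```powershell
# Keep the values together in a hashtable instead of a block of text.
$Content = @{ Var1 = '1'; Var2 = '2'; Var3 = '3' }

$Content.Var2      # 2
$Content['Var3']   # 3

# Rebuild the assignment text on demand; the single-quoted format string
# keeps the leading $ literal, and {0}/{1} are -f placeholders.
$lines = foreach ($name in ($Content.Keys | Sort-Object)) {
    '${0} = "{1}"' -f $name, $Content[$name]
}
$lines
```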

How can I write/create a powershell script verbatim, from another powershell script?

The following code (at the bottom) produces one of the following outputs in the file
4/12/2019 = (get-date).AddDays(2).ToShortDateString();
4/13/2019 = (get-date).AddDays(2 + 1).ToShortDateString();
or if I haven't initialized the variable
= (get-date).AddDays(2).ToShortDateString();
= (get-date).AddDays(2 + 1).ToShortDateString();
This is the code block, I would like the parent ps1 file to write the child ps1 file verbatim.
$multiLineScript2 = @"
$startDate2 = (get-date).AddDays($resultOfSubtraction).ToShortDateString();
$endDate2 = (get-date).AddDays($resultOfSubtraction + 1).ToShortDateString();
"@
$multiLineScript2 | Out-File "c:\file2.ps1";
tl;dr:
To create a verbatim multi-line string (i.e., a string with literal contents), use a single-quoted here-string:
$multiLineScript2 = @'
$startDate2 = (get-date).AddDays($resultOfSubtraction).ToShortDateString();
$endDate2 = (get-date).AddDays($resultOfSubtraction + 1).ToShortDateString();
'@
Note the use of @' and '@ as the delimiters.
Use a double-quoted here-string only if string expansion (interpolation) is needed; to selectively suppress expansion, escape $ chars. to be included verbatim as `$, as shown in your own answer.
String Literals in PowerShell
Get-Help about_Quoting_Rules discusses the types of string literals supported by PowerShell:
To get a string with literal content (no interpolation, what C# would call a verbatim string), use single quotes: '...'
To embed ' chars. inside a '...' string, double them (''); all other chars. can be used as-is.
To get an expandable string (string interpolation), i.e., a string in which variable references (e.g., $var or ${var}) and expressions (e.g., $($var.Name)) can be embedded that are replaced with their values, use double quotes: "..."
To selectively suppress expansion, backtick-escape $ chars.; e.g., to prevent $var from being interpolated (expanded to its value) inside a "..." string, use `$var; to embed a literal backtick, use ``
For an overview of the rules of string expansion, see this answer.
Both fundamental types are also available as here-strings - in the forms @'<newline>...<newline>'@ and @"<newline>...<newline>"@ respectively (<newline> stands for an actual newline (line break)) - which make defining multi-line strings easier.
Important:
Nothing (except whitespace) must follow the opening delimiter - @' or @" - on the same line - the string's content must be defined on the following lines.
The closing delimiter - '@ or "@ (matching the opening delimiter) - must be at the very start of a line.
Here-strings defined in files invariably use the newline format of their enclosing file (CRLF vs. LF), whereas interactively defined ones always use LF only.
Examples:
# Single-quoted: literal:
PS> 'I am $HOME'
I am $HOME
# Double-quoted: expandable
PS> "I am $HOME"
I am C:\Users\jdoe
# Here-strings:
# Literal
PS> @'
I am
$HOME
'@
I am
$HOME
# Expandable
PS> @"
I am
$HOME
"@
I am
C:\Users\jdoe
I couldn't find this anywhere, but it appears that every variable in the script text that should be written out verbatim has to be escaped with a backtick, like so. Instead of deleting the question I'll leave it up for a search hit.
$multiLineScript2 = @"
`$startDate2 = (get-date).AddDays($resultOfSubtraction).ToShortDateString();
`$endDate2 = (get-date).AddDays($resultOfSubtraction + 1).ToShortDateString();
"@
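To make the effect of selective escaping concrete, here is a minimal sketch, assuming $resultOfSubtraction holds 2. The escaped variables survive verbatim, while the unescaped $resultOfSubtraction is expanded when the string is built. The name $child is illustrative.

```powershell
$resultOfSubtraction = 2   # assumed value for the illustration

# Expandable here-string: `$ stays literal, $resultOfSubtraction is expanded.
$child = @"
`$startDate2 = (get-date).AddDays($resultOfSubtraction).ToShortDateString();
`$endDate2 = (get-date).AddDays($resultOfSubtraction + 1).ToShortDateString();
"@

$child
# $startDate2 = (get-date).AddDays(2).ToShortDateString();
# $endDate2 = (get-date).AddDays(2 + 1).ToShortDateString();
```

Whether to use this mixed form or a fully literal single-quoted here-string depends on whether the child script should receive the parent's current value or look the variable up itself at run time.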

how do I replace a string with a dollar sign in it in powershell

In Powershell given the following string
$string = "this is a sample of 'my' text $PSP.what do you think"
how do I use the -replace function to convert the string to
this is a sample of 'my' text Hello.what do you think
I obviously need to escape the string somehow, Also $PSP is not a declared variable in my script
I need to change all mentions of $PSP for some other string
Use the backtick character (above the tab key):
$string = "this is a sample of 'my' text `$PSP.what do you think"
To replace the dollar sign using the -replace operator, escape it with a backslash:
"this is a sample of 'my' text `$PSP.what do you think" -replace '\$PSP', 'Hello'
Or use the string.replace method:
$string = "this is a sample of 'my' text `$PSP.what do you think"
$string.Replace('$PSP','Hello')
this is a sample of 'my' text Hello.what do you think
Unless you modify your original string (e.g. by escaping the $), this isn't (really) possible.
Your $string doesn't really contain a $PSP, as it is replaced by nothing in the assignment statement.
$string = "this is a sample of 'my' text $PSP.what do you think"
$string -eq "this is a sample of 'my' text .what do you think"
evaluates to:
True
This comes up as the first answer on Google even though it is really old, so I will add my slight variation.
In my case I was reading in a file and replacing a string with $s in it.
The short version of my file is:
<version>$version$<version>
When you are acting on a (file) stream, variables are not automatically replaced, so there is no need to escape the $ in the file.
In the replacement pattern you can avoid the interpretation of the variable using ' instead of ".
My final command looked like:
(gc $fileName) | % { $_.replace('$version$', "$BuildNumber") } | sc $fileName
This reads the file (Get-Content), pipes each line through the replace, and writes the result back to the file with Set-Content.
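You should try the following (single quotes keep PowerShell from expanding $PSP, and String.Replace takes its arguments literally, so no backslash is needed):
$string = $string.Replace('$PSP', 'Hello')
or
$string = $string.Replace('$PSP', $the_new_value)
or, to be more generic, use a regex (here the $ must be escaped with a backslash):
$string = [regex]::Replace($string, '\$\w+', 'Hello')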
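When the text to find is held in a variable, escaping it by hand is easy to get wrong. The sketch below, which assumes the input really contains a literal $PSP (hence the single-quoted assignment), uses [regex]::Escape() to build the pattern for -replace and compares the result with the literal String.Replace() method. The names $token, $viaOperator and $viaMethod are illustrative.

```powershell
# Single quotes keep $PSP literal; the embedded ' characters are doubled.
$string = 'this is a sample of ''my'' text $PSP.what do you think'
$token  = '$PSP'

# -replace takes a regex, so escape the token; String.Replace() is literal.
$viaOperator = $string -replace [regex]::Escape($token), 'Hello'
$viaMethod   = $string.Replace($token, 'Hello')

$viaOperator   # this is a sample of 'my' text Hello.what do you think
$viaMethod     # this is a sample of 'my' text Hello.what do you think
```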

$macro substitution - ExpandString limitations

I am trying to implement macro replacement based on this discussion. Basically it works, but ExpandString seems to have some limitations:
main.ps1:
$foo = 'foo'
$text = [IO.File]::ReadAllText('in.config')
$ExecutionContext.InvokeCommand.ExpandString($text) | out-file 'out.config'
in.config (OK):
$foo
in.config (Error: "Encountered end of line while processing a string token."):
"
in.config (Error: "Missing ' at end of string."):
'
The documentation states:
Return Value: The expanded string
with all the variable and expression
substitutions done.
What is 'expression substitution' (maybe this is my case)?
Is there some workaround?
The error is occurring because quotes (single and double) are special characters to the PowerShell runtime. They indicate a string and if they are to be used as just that character, they need to be escaped.
A possible workaround would be to escape quotes with a backtick, depending on your desired result.
For example if my text file had
'$foo'
The resulting expansion of that string would be
PS>$text = [io.file]::ReadAllText('test.config')
PS>$ExecutionContext.InvokeCommand.ExpandString($text)
$foo
If you wanted to have that variable expanded, you would need to escape those quotes.
`'$foo`'
PS>$text = [io.file]::ReadAllText('test.config')
PS>$ExecutionContext.InvokeCommand.ExpandString($text)
'foo'
or if you were going to have an unpaired single or double quote, you would need to escape it.
You could do a -replace on the string to escape those characters, but you'll have to make sure that is the desired effect across the board.
PS>$single, $double = "'", '"'
PS>$text = [io.file]::ReadAllText('test.config') -replace "($single|$double)", '`$1'
PS>$ExecutionContext.InvokeCommand.ExpandString($text)
NOTE: After you do the ExpandString call, you won't have the backticks hanging around anymore.