I'm having issues with the rex command on splunk.
My Query outputs the below.
{"(001) NULL.COUNT(1).NUMBER": "12345"}
I am looking to extract just the value 12345, but at the moment, I have below rex command which returns "{"(001) NULL.COUNT(1).NUMBER": "12345"}".
| rex field=_transfers "(001) NULL.COUNT(1).NUMBER": "(?<value>.*)"
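Quotation marks have to be escaped in the rex command. Also, parentheses in regex strings must be escaped if they're not part of a capture group, and that applies to the closing parenthesis as well as the opening one. To capture only the digits, stop the capture at the closing quotation mark instead of using a greedy .*. Try | rex field=_transfers "\(001\) NULL\.COUNT\(1\)\.NUMBER\": \"(?<value>[^\"]*)\"".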
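The escaping rules can be checked outside Splunk. Below is a minimal sketch in Python rather than SPL, so the named-group syntax is (?P<value>...) and the quotation marks need no backslashes because the pattern sits in a single-quoted raw string. The sample text is the output shown in the question; the variable names are only illustrative.

```python
import re

# Sample event text taken from the question.
raw = '{"(001) NULL.COUNT(1).NUMBER": "12345"}'

# Literal parentheses and dots are escaped so they are not read as a group
# or as "any character". [^"]* stops at the closing quotation mark, whereas
# a greedy .* would run on to the end of the text.
pattern = re.compile(r'"\(001\) NULL\.COUNT\(1\)\.NUMBER": "(?P<value>[^"]*)"')

match = pattern.search(raw)
value = match.group("value") if match else None
print(value)
```

With unescaped parentheses, (001) would be parsed as a capture group matching the text 001, so the literal parentheses in the data would never match.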
I've found plenty of posts explaining how to literally escape both single and double quotation marks using either """" for one double quotation mark, or '''' for a single quotation mark (or just doing `"). I find myself in a situation where I need to search through a list of names that is input in a different query:
Foreach($Username in $AllMyUsers.Username){
$Filter = "User = '$Username'"
# do some more code here using $Filter
}
The problem occurs when I reach a username like O'toole or O'brian, which contains a quotation mark. If the string were a literal, I could escape it with
O`'toole
O''brian
etc.
But, since it's in a loop I need to escape the quotation mark for each user.
I tried to use [regex]::Escape() but that doesn't escape quotation marks.
I could probably do something like $Username.Replace("'","''") but it feels like there should be a more generic solution than having to manually escape the quotation marks. In other circumstances I might need to escape both single and double, and just tacking on .Replace like so $VariableName.Replace("'","''").Replace('"','""') doesn't feel like it's the most efficient way to code.
Any help is appreciated!
EDIT: This feels exactly like a "how can I avoid SQL injection?" question but for handling strings in Powershell. I guess I'm looking for something like mysqli_real_escape_string but for Powershell.
I could probably do something like $Username.Replace("'","''") but it feels like there should be a more generic solution than having to manually escape the quotation marks
As implied by Mathias R. Jessen's comment, there is no generic solution, given that you're dealing with an embedded '...' string, and the escaping requirements entirely depend on the ultimate target API or utility - which is unlikely to be PowerShell itself (where ' inside a '...' string must be doubled, i.e. represented as '').
In the case at hand, where you're passing the filter string to a System.Data.DataTable's .DataView's .RowFilter property, '' is required as well.
The conceptually cleanest way to handle the required escaping is to use -f, the format operator, combined with a separate string-replacement operation:
$Filter = "User = '{0}'" -f ($UserName -replace "'", "''")
Note how PowerShell's -replace operator - rather than the .NET [string] type's .Replace() method - is used to perform the replacement.
Aside from being more PowerShell-idiomatic (with respect to: syntax, being case-insensitive by default, accepting arrays as input, converting to strings on demand), -replace is regex-based, which also makes performing multiple replacements easier.
To demonstrate with your hypothetical .Replace("'","''").Replace('"','""') example:
@'
I'm "fine".
'@ -replace '[''"]', '$0$0'
Output:
I''m ""fine"".
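For comparison, the same two ideas (a single regex pass that doubles each quotation mark, and a format string that receives the already-escaped value) can be sketched in Python. This is an illustration only, not PowerShell: double_quotes and build_filter are hypothetical helper names, and which characters need doubling still depends on the target API.

```python
import re

def double_quotes(text: str) -> str:
    """Double every single or double quotation mark in text."""
    # \g<0> is the whole match, like $0 in the -replace example.
    return re.sub(r"""['"]""", r"\g<0>\g<0>", text)

def build_filter(username: str) -> str:
    """Hypothetical helper mirroring "User = '{0}'" -f (...)."""
    # Only the single quote is doubled, because the value is embedded
    # in a '...' string in the filter expression.
    return "User = '{}'".format(username.replace("'", "''"))

print(double_quotes('I\'m "fine".'))
print(build_filter("O'toole"))
```

Keeping the escaping in one small function means the loop body only ever calls build_filter, and the rule can be changed in one place if the target API turns out to need something different.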
I've got an application that has no useful api implemented, and the only way to get certain information is to parse string output. This is proving to be very painful...
I'm trying to achieve this in bash on SLES12.
Given I have the following strings:
QMNAME(QMTKGW01) STATUS(Running)
QMNAME(QMTKGW01) STATUS(Ended normally)
I want to extract the STATUS value, ie "Ended normally" or "Running".
Note that the line structure can move around, so I can't count on the "STATUS" being the second field.
The closest I have managed to get so far is to extract a single word from inside STATUS like so
echo "QMNAME(QMTKGW01) STATUS(Running)" | sed "s/^.*STATUS(\(\S*\)).*/\1/"
This works for "Running" but not for "Ended normally"
I've tried switching the \S* for [\S\s]* in both "grep -o" and "sed" but it seems to corrupt the entire regex.
This is purely a regex issue: \S matches only non-whitespace characters inside (..), but the failing case contains a space, so the pattern cannot match and sed prints the line unchanged. Keep it simple by explicitly listing the characters to match inside (..) as [a-zA-Z ]*, i.e. zero or more upper- and lower-case letters and spaces.
sed 's/^.*STATUS(\([a-zA-Z ]*\)).*/\1/'
Or use character classes [:alnum:] if you want numbers too
sed 's/^.*STATUS(\([[:alnum:] ]*\)).*/\1/'
sed 's/.*STATUS(\([^)]*\)).*/\1/' file
Output:
Running
Ended normally
Extracting a substring matching a given pattern is a job for grep, not sed. We should use sed when we must edit the input string. (A lot of people use sed and even awk just to extract substrings, but that's wasteful in my opinion.)
So, here is a grep solution. We need to make some assumptions (in any solution) about your input - some are easy to relax, others are not. In your example the word STATUS is always capitalized, and it is immediately followed by the opening parenthesis (no space, no colon etc.). These assumptions can be relaxed easily. More importantly, and not easy to work around: there are no nested parentheses. You will want the longest substring of non-closing-parenthesis characters following the opening parenthesis, no matter what they are.
With these assumptions:
$ grep -oP '\bSTATUS\(\K[^)]*(?=\))' << EOF
> QMNAME(QMTKGW01) STATUS(Running)
> QMNAME(QMTKGW01) STATUS(Ended normally)
> EOF
Running
Ended normally
Explanation:
Command options: o to return only the matched substring; P to use Perl extensions (the \K marker and the lookahead). The regexp: we look for a word boundary (\b) - so the word STATUS is a complete word, not part of a longer word like SUBSTATUS; then the word STATUS and opening parenthesis. This is required for a match, but \K instructs that this part of the matched string will not be returned in the output. Then we seek zero or more non-closing-parenthesis characters ([^)]*) and we require that this be followed by a closing parenthesis - but the closing parenthesis is also not included in the returned string. That's a "lookahead" (the (?= ... ) construct).
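The same pattern can be tried outside the shell. The sketch below uses Python's re module and makes the same assumptions as above (capitalized STATUS, no nested parentheses). A capture group takes the place of \K and the lookahead, and extract_status is a hypothetical helper name.

```python
import re

# \b keeps SUBSTATUS(...) from matching; [^)]* takes everything up to
# the first closing parenthesis, spaces included.
STATUS_RE = re.compile(r"\bSTATUS\(([^)]*)\)")

def extract_status(line):
    """Return the text inside STATUS(...), or None if it is absent."""
    m = STATUS_RE.search(line)
    return m.group(1) if m else None

for line in ("QMNAME(QMTKGW01) STATUS(Running)",
             "QMNAME(QMTKGW01) STATUS(Ended normally)"):
    print(extract_status(line))
```

Because the pattern is searched for anywhere in the line, the position of STATUS among the other fields does not matter.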
I can't quite understand how PowerShell parses commands and need your help.
I read the following explanation in Microsoft's about_Parsing documentation:
When processing a command, the PowerShell parser operates in expression mode or in argument mode:
In expression mode, character string values must be contained in quotation marks. Numbers not enclosed in quotation marks are treated as numerical values (rather than as a series of characters).
In argument mode, each value is treated as an expandable string unless it begins with one of the following special characters: dollar sign ($), at sign (@), single quotation mark ('), double quotation mark ("), or an opening parenthesis (().
If preceded by one of these characters, the value is treated as a value expression.
I can understand when parsing a command, PowerShell uses either expression mode or argument mode, but I can't quite understand the following examples.
$a = 2+2
Write-Output $a #4(int), expression mode
Write-Output $a/H #4/H(str), argument mode
I wonder whether PowerShell expands the variable first and then decides which mode to use when parsing. Is that right?
If so, I have another question, about data types.
It seems reasonable to me that the former command produces an integer, but not that the latter one works. Why can the integer 4 be placed next to the string /H?
I tried this example and it worked. It seems variables are converted to strings when expanded, whatever their data type. Is that right?
$b = 100
Add-Content C:\Users\Owner\Desktop\$b\test.txt 'test'
I appreciate your help.
Edited to clarify the point after receiving a comment
I got a comment saying that both Write-Output examples are in argument mode, so can the examples be interpreted like this?
Write-Output "$a"
Write-Output "$a/H"
I'm terribly sorry for the ambiguous question, but I want to know:
In argument mode, are the double quotation marks simply omitted?
The Write-Output examples are quoted from the Microsoft documentation I linked, which says the first example produces an integer. Is that wrong?
I have a question about using sed to modify a file. My file content:
<data-value name="WLS_INSTALL_DIR" value="/home/Oracle/wlserver_10.3">
I want to replace the content of field value="/home/Oracle/wlserver_10.3"
to get this result:
<data-value name="WLS_INSTALL_DIR" value="/u03/Middle_home/Oracle/wlserver_10.3">
I use sed:
sed "6 i/^value=/>/s/value= />\(.*\)/value=\"\/u03\/Oracle/Middleware/wlserver_10.3"\" \/\ /u03/silent.xml
Your sed script has a number of issues.
First off, anything that looks like 6istuff will simply write everything after i ("insert") verbatim as a new line before the sixth line. (Some dialects require a newline after the i and will basically do nothing.)
Secondly, ^value= does not match your input; it would only select a line starting with the string value= (the ^ metacharacter means beginning of line).
Thirdly, the /> in your substitution regex terminates the substitution and so everything from > onwards is parsed as invalid flags for the substitution. I cannot see the purpose of this part, anyway; it doesn't match your data, and so the regex fails.
What remains after removing all these superfluous and erroneous details is a more or less useful sed script. (I assume the 6 to address only the sixth line of input is intentional, although you don't mention this in the question at all.) I have made some additional minor improvements, such as using % as the substitution delimiter and tightening the regex so that it only ever substitutes a double-quoted value.
sed '6s%value="[^"]*"%value="/u03/Oracle/Middleware/wlserver_10.3"%' /u03/silent.xml
Better than 6 would perhaps be to identify the line with /name="WLS_INSTALL_DIR"/.
Still, as alluded to in a comment, the proper way to manipulate XML is with a dedicated tool such as xsltproc.
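The idea of selecting the line by name="WLS_INSTALL_DIR" rather than by line number can be sketched as follows. This is Python rather than sed, and it is still a line-based regex edit, not real XML processing. set_install_dir is a hypothetical helper, and the replacement path is the one used in the sed command above.

```python
import re

# Matches one double-quoted value attribute, like value="[^"]*" in the sed script.
VALUE_RE = re.compile(r'value="[^"]*"')

def set_install_dir(lines, new_dir):
    """Replace the value attribute on lines naming WLS_INSTALL_DIR."""
    out = []
    for line in lines:
        if 'name="WLS_INSTALL_DIR"' in line:
            # A function replacement avoids backslash processing in new_dir.
            line = VALUE_RE.sub(lambda _m: f'value="{new_dir}"', line, count=1)
        out.append(line)
    return out

sample = [
    '<data-value name="OTHER" value="/tmp/x">',
    '<data-value name="WLS_INSTALL_DIR" value="/home/Oracle/wlserver_10.3">',
]
for line in set_install_dir(sample, "/u03/Oracle/Middleware/wlserver_10.3"):
    print(line)
```

Lines that do not carry the name attribute pass through untouched, so the edit no longer depends on the element staying on line 6.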
Try:
sed 's|/home|/u03/Middle_home|'
I am using the PostgreSQL regexp_replace function to escape square brackets, parentheses and backslashes in a string so that I can use that string as a regex pattern itself (there are other manipulations done on this string as well before using it, but they are outside the scope of this question). The idea is to replace:
[ with \[
] with \]
( with \(
) with \)
\ with \\
Postgres documentation page on regular expressions states the following:
The replacement string can contain \n, where n is 1 through 9, to
indicate that the source substring matching the n'th parenthesized
subexpression of the pattern should be inserted, and it can contain \&
to indicate that the substring matching the entire pattern should be
inserted. Write \\ if you need to put a literal backslash in the
replacement text.
However regexp_replace('abc [def]', '([\[\]\(\)\\])', E'\\\1', 'g'); produces abc \ def\.
Further down on that same page, an example is given, which uses \\1 notation - so I tried that.
Yet, regexp_replace('abc [def]', '([\[\]\(\)\\])', E'\\\\1', 'g'); produces abc \1def\1.
I would guess this is expected, but regexp_replace('abc [def]', '([\[\]\(\)\\])', E'.\\1', 'g'); produces abc .[def.]. That is, escaping works with characters other than the standard backslash.
At this point I don't know how to proceed. What can I do to actually give me the replacement I want?
OK, found the answer. Apparently, I need to double-escape the backslash in the replacement. Also, I need to E-prefix and double-escape backslashes in the search pattern on older versions of postgres (8.3 in my case). The final code looks like this:
regexp_replace('abc [def]', E'([\\[\\]\\(\\)\\\\\?\\|_%])', E'\\\\\\1', 'g')
Yes, it looks horrible, but it works :)
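The layering is easier to see with the string-literal escaping taken out of the picture. The sketch below uses Python raw strings, not PostgreSQL, so only the regex engine's own rule remains: in the replacement, \\ produces one literal backslash and \1 inserts the captured character. escape_brackets is a hypothetical name, and the character class covers only the five characters listed in the question.

```python
import re

def escape_brackets(text: str) -> str:
    """Prefix [, ], (, ) and backslash with a backslash."""
    # Pattern: a character class holding the five characters to escape.
    # Replacement: \\ is a literal backslash, \1 is the matched character.
    return re.sub(r"([\[\]()\\])", r"\\\1", text)

print(escape_brackets("abc [def]"))
print(escape_brackets(r"a\b (c)"))
```

Each extra layer that interprets backslashes (such as an E'...' string) doubles the count again, which is why three backslashes here turn into six in the E-prefixed version.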
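The simplest way, with standard_conforming_strings on (the default since PostgreSQL 9.1) so that no E prefix or extra doubling is needed: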
select regexp_replace('abc [def]', '([\[\]\(\)\\])', '\\\1', 'g')