I want to convert as below via preg_replace - preg-replace

I want to convert a string as shown below using preg_replace.
What pattern should I use?
preg_replace($pattern, "$2/$1", "One001Two111Three");
result> Three/Two111/One001

You would be better off using preg_split; it is much simpler than preg_replace and works with any number of elements:
$str = "One001Two111Three";
$res = implode('/', array_reverse(preg_split('/(?<=\d)(?=[A-Z])/', $str)));
echo $res,"\n";
output:
Three/Two111/One001
The regex /(?<=\d)(?=[A-Z])/ splits at each boundary between a digit and a capital letter; array_reverse reverses the order of the array returned by preg_split, and implode then joins the elements of the reversed array with a /.
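For readers who want to try the same split/reverse/join idea outside PHP, here is a minimal Python sketch. The function name reverse_parts is hypothetical, and it assumes Python 3.7 or later, where re.split accepts patterns that match the empty string.

```python
import re

def reverse_parts(text: str) -> str:
    """Split between a digit and a capital letter, then join the parts in reverse order."""
    # The lookbehind/lookahead pair matches a zero-width position, so no characters are consumed.
    parts = re.split(r'(?<=\d)(?=[A-Z])', text)
    return '/'.join(reversed(parts))

print(reverse_parts("One001Two111Three"))
```

As in the PHP version, nothing in the pattern depends on how many elements the string contains.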

$string = "One001Two111Three";
$result = preg_replace('/^(.*?\d+)(.*?\d+)(.*?)$/im', '$3/$2/$1', $string );
echo $result;
RESULT: Three/Two111/One001
EXPLANATION:
^(.*?\d+)(.*?\d+)(.*?)$
-----------------------
Options: Case insensitive; Exact spacing; Dot doesn't match line breaks; ^$ match at line breaks; Greedy quantifiers; Regex syntax only
Assert position at the beginning of a line (at beginning of the string or after a line break character) (line feed) «^»
Match the regex below and capture its match into backreference number 1 «(.*?\d+)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match a single character that is a “digit” (any decimal number in any Unicode script) «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 2 «(.*?\d+)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match a single character that is a “digit” (any decimal number in any Unicode script) «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 3 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line (at the end of the string or before a line break character) (line feed) «$»
$3/$2/$1
Insert the text that was last matched by capturing group number 3 «$3»
Insert the character “/” literally «/»
Insert the text that was last matched by capturing group number 2 «$2»
Insert the character “/” literally «/»
Insert the text that was last matched by capturing group number 1 «$1»
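The three-group replacement explained above can be sketched in Python as well. This is only an illustration under the assumption that the input always has exactly two digit runs followed by a tail; swap_three is a hypothetical name, and Python writes the backreferences as \3, \2, \1 instead of $3, $2, $1.

```python
import re

def swap_three(text: str) -> str:
    """Reorder 'part1 part2 part3' as 'part3/part2/part1' using three capture groups."""
    return re.sub(
        r'^(.*?\d+)(.*?\d+)(.*?)$',   # same pattern as the PHP answer
        r'\3/\2/\1',                  # groups emitted in reverse order
        text,
        flags=re.IGNORECASE | re.MULTILINE,
    )

print(swap_three("One001Two111Three"))
```

Unlike the preg_split approach, this only works for exactly three parts, because each part needs its own capture group.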

Related

How to match exact string in perl

I am trying to parse all the files and verify if any of the file content has strings TESTDIR or TEST_DIR
Files contents might look something like:-
TESTDIR = foo
include $(TESTDIR)/chop.mk
...
TEST_DIR := goldimage
MAKE_TESTDIR = var_make
NEW_TEST_DIR = tesing_var
Actually I am only interested in TESTDIR, $(TESTDIR) and TEST_DIR; in my case the last two lines should be ignored. I am new to Perl. Can anyone help me out with the regex?
/\bTEST_?DIR\b/
\b means a "word boundary", i.e. the place between a word character and a non-word character. "Word" here has the Perl meaning: word characters are letters, digits, and underscores.
_? means "nothing or an underscore"
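A quick way to check the word-boundary pattern against the sample lines from the question is a small Python sketch. Python's \b follows the same letters/digits/underscore definition for this ASCII input; the names PATTERN, lines and matching are only illustrative.

```python
import re

PATTERN = re.compile(r'\bTEST_?DIR\b')

lines = [
    "TESTDIR = foo",
    "include $(TESTDIR)/chop.mk",
    "TEST_DIR := goldimage",
    "MAKE_TESTDIR = var_make",
    "NEW_TEST_DIR = tesing_var",
]

# The last two lines are skipped because "_" is a word character,
# so there is no word boundary between "_" and "T".
matching = [line for line in lines if PATTERN.search(line)]
for line in matching:
    print(line)
```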
Look at "characterset".
Only (space) surrounding allowed:
/^(.* )?TEST_?DIR /
^ beginning of the line
(.* )? there may be some leading content (.*), but if there is, it must be followed by a space
The space at the end of the pattern says that whitespace must follow the name. Otherwise use ( .*)?$ at the end.
One of a given characterset is allowed:
If characters other than a space should also be allowed, you can use a character class []:
/^(.*[ \t(])?TEST_?DIR[) :=]/
(.*[ \t(])? in front of TEST_?DIR there may be a (space), a \t (tab) or a (, or nothing at all if the line starts with the name itself.
[) :=] afterwards there must be one of (space), :, = or ), followed by anything (the "=" of ":=" belongs to that "anything").
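To see how the character-class variant differs from a plain \b test, here is a small Python sketch. The helper name is_reference and the extra input "x-TESTDIR = 1" are made up for illustration: a hyphen counts as a word boundary, but it is not in the allowed set of preceding characters.

```python
import re

# Allowed before the name: start of line, space, tab or "(".
# Allowed after the name: ")", space, ":" or "=".
STRICT = re.compile(r'^(.*[ \t(])?TEST_?DIR[) :=]')
LOOSE = re.compile(r'\bTEST_?DIR\b')

def is_reference(line: str) -> bool:
    """Return True if the line mentions TESTDIR/TEST_DIR with allowed neighbours."""
    return STRICT.search(line) is not None

for line in ["TESTDIR = foo", "include $(TESTDIR)/chop.mk", "MAKE_TESTDIR = var_make", "x-TESTDIR = 1"]:
    print(line, is_reference(line), LOOSE.search(line) is not None)
```

Note that this variant requires some character after the name, so a line consisting of only TESTDIR would not match.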
One of a given group is allowed:
So you need groups within (), with each possible alternative in there separated by a |:
/^(.*( |\t))?TEST_?DIR( | := | = )/
In this case the group at the beginning is equivalent to [ \t], because each alternative holds only a single character (a space or \t).
At the end, there must be a single space, or := surrounded by spaces, or = surrounded by spaces, followed by anything...
You can use any combination...
/^(.*[ \t(])?TEST_?DIR([) =:]| :=| =|)/
Test it on Debuggex.com. (Use 'PCRE')

How to check column that contain letter and number in Talend

My column must contain 2 letters and 4 digits, like this: AV1234
How can I check this?
You can use sql templates as mentioned in talend documentation here and you can check your column that contain letter and number using regular expressions.
Use this: [a-zA-Z]{2}[0-9]{4}
Use this if you want only uppercase letters: [A-Z]{2}[0-9]{4}
[a-zA-Z] # Match a single character present in the list below
# A character in the range between “a” and “z”
# A character in the range between “A” and “Z”
{2} # Exactly 2 times
[0-9] # Match a single character in the range between “0” and “9”
{4} # Exactly 4 times
Thank you for your answer! It works.
My routine code:
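// Requires: import java.util.regex.Pattern;
public static Boolean MyPattern(String str) {
    // Exactly two uppercase letters followed by exactly four digits, e.g. AV1234
    String stringPattern = "[A-Z]{2}[0-9]{4}";
    // Pattern.matches succeeds only if the whole string matches the pattern
    boolean match = Pattern.matches(stringPattern, str);
    return match;
}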
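For comparison, the same whole-string check can be sketched in Python. The function name is_valid_code is hypothetical; re.fullmatch is used because, like Pattern.matches in Java, it succeeds only when the entire string matches.

```python
import re

def is_valid_code(value: str) -> bool:
    """True for exactly two uppercase letters followed by exactly four digits."""
    return re.fullmatch(r'[A-Z]{2}[0-9]{4}', value) is not None

for candidate in ["AV1234", "av1234", "AV12345", "A1234"]:
    print(candidate, is_valid_code(candidate))
```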

Changing a numerical value in a text file using PowerShell

I'm looking for a solution that will allow me to modify numerical values (xyz values) in a line of text using the Get-Content cmdlet and write those values back to the text file. I have a text file "MyFile.txt" with lines of text as follows:
COMPONENT-IDENTIFIER 1
ATTRIBUTE9 C
ATTRIBUTE22 0
END-POINT 518700.500 555700.500 33234.800 1 SL
END-POINT 518500.500 555700.500 33234.800 1 SL
WEIGHT 2.177
UBV {111256-254885-000-1515-BGL518FS7D}
END-POINT 518700.500 555700.500 33234.800 1 PL
END-POINT 518500.500 555700.500 33234.800 1 PL
ATTRIBUTE15 D
ATTRIBUTE08 3
Basically I need to find the -Pattern "END-POINT", parse the line, change the first three double values, and write the line back to the text file (the text file delimiters are odd: four spaces). I just need to perform basic math such as adding, subtracting and/or dividing. The file is large and has multiple attributes and other values that I do not need to modify, only the "END-POINT" values.
Here is what I have. I can't quite figure out how to replace the values, and I'm not sure if it's even the right direction:
$MyFile = Get-Content "MyFile.txt"
ForEach ($line in $MyFile){
if ($line | select-String -Pattern 'END-POINT'){
$Array = @($line)
$NewArray = $Array -split " "
$Arrayx = $NewArray.split(" ")[2]/12.0
$Arrayy = $NewArray.split(" ")[2]/12.0
$Arrayx = $NewArray.split(" ")[2]/12.0
}
}
Appreciate any insight in advance.
In order to update these numeric END-POINT values, I'd suggest you use switch with option -Regex and loop through the file line-by-line:
$endPointRegex = '(?<endpoint>\s*END-POINT\s{4})(?<num1>\d+(?:\.\d+))?\s+(?<num2>\d+(?:\.\d+))?\s+(?<num3>\d+(?:\.\d+))?(?<rest>.*)'
$result = switch -Regex -File 'D:\Test\MyFile.txt' {
$endPointRegex {
'{0}{1} {2} {3}{4}' -f $matches['endpoint'],
([double]$matches['num1'] / 12.0),
([double]$matches['num2'] / 12.0),
([double]$matches['num3'] / 12.0),
$matches['rest']
}
default { $_ }
}
# output on screen
$result
# output to new file
$result | Set-Content -Path 'D:\Test\MyNewFile.txt' -Force
Output:
COMPONENT-IDENTIFIER 1
ATTRIBUTE9 C
ATTRIBUTE22 0
END-POINT 43225.0416666667 46308.375 2769.56666666667 1 SL
END-POINT 43208.375 46308.375 2769.56666666667 1 SL
WEIGHT 2.177
UBV {111256-254885-000-1515-BGL518FS7D}
END-POINT 43225.0416666667 46308.375 2769.56666666667 1 PL
END-POINT 43208.375 46308.375 2769.56666666667 1 PL
ATTRIBUTE15 D
ATTRIBUTE08 3
If you want the numbers divided by 12.0 to also have 3 decimals like the original values, change to:
$endPointRegex = '(?<endpoint>\s*END-POINT\s{4})(?<num1>\d+(?:\.\d+))?\s+(?<num2>\d+(?:\.\d+))?\s+(?<num3>\d+(?:\.\d+))?(?<rest>.*)'
$result = switch -Regex -File 'D:\Test\MyFile.txt' {
$endPointRegex {
'{0}{1:F3} {2:F3} {3:F3}{4}' -f $matches['endpoint'],
([double]$matches['num1'] / 12.0),
([double]$matches['num2'] / 12.0),
([double]$matches['num3'] / 12.0),
$matches['rest']
}
default { $_ }
}
# output on screen
$result
# output to new file
$result | Set-Content -Path 'D:\Test\MyNewFile.txt' -Force
Output:
COMPONENT-IDENTIFIER 1
ATTRIBUTE9 C
ATTRIBUTE22 0
END-POINT 43225.042 46308.375 2769.567 1 SL
END-POINT 43208.375 46308.375 2769.567 1 SL
WEIGHT 2.177
UBV {111256-254885-000-1515-BGL518FS7D}
END-POINT 43225.042 46308.375 2769.567 1 PL
END-POINT 43208.375 46308.375 2769.567 1 PL
ATTRIBUTE15 D
ATTRIBUTE08 3
Regex details:
(?<endpoint> Match the regular expression below and capture its match into backreference with name “endpoint”
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
END-POINT Match the characters “END-POINT” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
{4} Exactly 4 times
)
(?<num1> Match the regular expression below and capture its match into backreference with name “num1”
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?: Match the regular expression below
\. Match the character “.” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
)? Between zero and one times, as many times as possible, giving back as needed (greedy)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?<num2> Match the regular expression below and capture its match into backreference with name “num2”
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?: Match the regular expression below
\. Match the character “.” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
)? Between zero and one times, as many times as possible, giving back as needed (greedy)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?<num3> Match the regular expression below and capture its match into backreference with name “num3”
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?: Match the regular expression below
\. Match the character “.” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
)? Between zero and one times, as many times as possible, giving back as needed (greedy)
(?<rest> Match the regular expression below and capture its match into backreference with name “rest”
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
Mind you, these numbers will be formatted with decimal commas if you are in a country that uses the decimal comma instead of a decimal point. Since you're in Houston, you won't have to deal with that.
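The same line-by-line idea can be sketched in Python for anyone who wants to cross-check the results. This is an illustration rather than a port of the PowerShell answer: scale_line and END_POINT are hypothetical names, the separators are captured so the original spacing is kept, and the f-string formatting always uses a decimal point regardless of locale.

```python
import re

# Prefix, three numbers with their separators, then the rest of the line.
END_POINT = re.compile(
    r'^(\s*END-POINT\s+)(\d+(?:\.\d+)?)(\s+)(\d+(?:\.\d+)?)(\s+)(\d+(?:\.\d+)?)(.*)$'
)

def scale_line(line: str, divisor: float = 12.0) -> str:
    """Divide the first three numbers of an END-POINT line; return other lines unchanged."""
    m = END_POINT.match(line)
    if m is None:
        return line
    prefix, x, sep1, y, sep2, z, rest = m.groups()
    return (f"{prefix}{float(x) / divisor:.3f}{sep1}"
            f"{float(y) / divisor:.3f}{sep2}"
            f"{float(z) / divisor:.3f}{rest}")

sample = "END-POINT    518700.500    555700.500    33234.800    1    SL"
print(scale_line(sample))
print(scale_line("WEIGHT    2.177"))
```

To process a whole file, the function could be applied to each line from str.splitlines() and the results written back with '\n'.join.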
Convert to an XML file, search for END-POINT using a regular expression, edit, and convert back:
https://en.wikipedia.org/wiki/Regular_expression
https://en.wikipedia.org/wiki/XML

Avoiding duplicate items in a comma-separated list of two-letter words

I need to write a regex which allows a group of 2 chars only once. This is my current regex :
^([A-Z]{2},)*([A-Z]{2}){1}$
This allows me to validate something like this :
AL,RA,IS,GD
AL
AL,RA
The problem is that it validates also AL,AL and AL,RA,AL.
EDIT
Here there are more details.
What is allowed:
AL,RA,GD
AL
AL,RA
AL,IS,GD
What shouldn't be allowed:
AL,RA,AL
AL,AL
AL,RA,RA
AL,IS,AL
IS,IS,AL
IS,GD,GD
IS,GD,IS
I need that every group of two characters appears only once in the sequence.
Try something like this expression:
/^(?:,?(\b\w{2}\b)(?!.*\1))+$/gm
I have no knowledge of Swift, so take it with a grain of salt. The idea is basically to only match a whole line while making sure that no single matched group occurs at a later point in the line.
First of all, let's shorten your pattern. This is easy since the length of each comma-separated item is fixed and the list items are made up only of uppercase ASCII letters. So, your pattern can be written as ^(?:[A-Z]{2}(?:,\b)?)+$.
Now, you need to add a negative lookahead that checks the string for any repeated two-letter sequence, at any distance from the start of the string and with any distance between the two occurrences. Use
^(?!.*\b([A-Z]{2})\b.*\b\1\b)(?:[A-Z]{2}(?:,\b)?)+$
Possible implementation in Swift:
func isValidInput(Input:String) -> Bool {
return Input.range(of: #"^(?!.*\b([A-Z]{2})\b.*\b\1\b)(?:[A-Z]{2}(?:,\b)?)+$"#, options: .regularExpression) != nil
}
print(isValidInput(Input:"AL,RA,GD")) // true
print(isValidInput(Input:"AL,RA,AL")) // false
Details
^ - start of string
(?!.*\b([A-Z]{2})\b.*\b\1\b) - a negative lookahead that fails the match if, immediately to the right of the current location, there is:
.* - any 0+ chars other than line break chars, as many as possible
\b([A-Z]{2})\b - a two-letter word as a whole word
.* - any 0+ chars other than line break chars, as many as possible
\b\1\b - the same whole word as in Group 1. NOTE: The word boundaries here are not necessary in the current scenario where the word length is fixed, it is two, but if you do not know the word length, and you have [A-Z]+, you will need the word boundaries, or other boundaries depending on the situation
(?:[A-Z]{2}(?:,\b)?)+ - 1 or more sequences of:
[A-Z]{2} - two uppercase ASCII letters
(?:,\b)? - an optional sequence: , only if followed with a word char: letter, digit or _. This guarantees that , won't be allowed at the end of the string
$ - end of string.
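The lookahead pattern broken down above also works unchanged in Python's re module, which makes it easy to test against the cases from the question. In this sketch the names has_unique_pairs and has_unique_pairs_plain are hypothetical; the second function is a non-regex cross-check added for comparison and is not part of the answer.

```python
import re

UNIQUE_PAIRS = re.compile(r'^(?!.*\b([A-Z]{2})\b.*\b\1\b)(?:[A-Z]{2}(?:,\b)?)+$')

def has_unique_pairs(text: str) -> bool:
    """Regex check: comma-separated two-letter items, none repeated."""
    return UNIQUE_PAIRS.match(text) is not None

def has_unique_pairs_plain(text: str) -> bool:
    """Same rule without lookaheads: validate each item, then compare against a set."""
    items = text.split(',')
    well_formed = all(re.fullmatch(r'[A-Z]{2}', item) for item in items)
    return well_formed and len(items) == len(set(items))

for case in ["AL,RA,GD", "AL", "AL,RA,AL", "AL,AL", "AL,"]:
    print(case, has_unique_pairs(case), has_unique_pairs_plain(case))
```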
You can use a negative lookahead with a back-reference:
^(?!.*([A-Z]{2}).*\1).*
if, as in all the examples in the question, it is known that the string contains only comma-separated pairs of capital letters. I will relax that assumption later in my answer.
The regex performs the following operations:
^ # match beginning of line
(?! # begin negative lookahead
.* # match 0+ characters (1+ OK)
([A-Z]{2}) # match 2 uppercase letters in capture group 1
.* # match 0+ characters (1+ OK)
\1 # match the contents of capture group 1
) # end negative lookahead
.* # match 0+ characters (the entire string)
Suppose now that one or more capital letters may appear between each pair of commas, or before the first comma or after the last comma, but it is only strings of two letters that cannot be repeated. Moreover, I assume the regex must confirm that the string has the desired form. Then the following regex could be used:
^(?=[A-Z]+(?:,[A-Z]+)*$)(?!.*(?:^|,)([A-Z]{2}),(?:.*,)?\1(?:,|$)).*
The regex performs the following operations:
^ # match beginning of line
(?= # begin pos lookahead
[A-Z]+ # match 1+ uc letters
(?:,[A-Z]+) # match ',' then by 1+ uc letters in a non-cap grp
* # execute the non-cap grp 0+ times
$ # match the end of the line
) # end pos lookahead
(?! # begin neg lookahead
.* # match 0+ chars
(?:^|,) # match beginning of line or ','
([A-Z]{2}) # match 2 uc letters in cap grp 1
, # match ','
(?:.*,) # match 0+ chars, then ',' in non-cap group
? # optionally match non-cap grp
\1 # match the contents of cap grp 1
(?:,|$) # match ',' or end of line
) # end neg lookahead
.* # match 0+ chars (entire string)
If there is no need to check that the string contains only comma-separated strings of one or more uppercase letters, the positive lookahead at the beginning can be removed.
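As a rough check of the relaxed rule (items of any length, but only two-letter items must be unique), the second pattern can be run in Python as written. The name no_repeated_pairs and the test inputs are made up for illustration.

```python
import re

RELAXED = re.compile(
    r'^(?=[A-Z]+(?:,[A-Z]+)*$)'                        # string is a comma-separated list of uppercase words
    r'(?!.*(?:^|,)([A-Z]{2}),(?:.*,)?\1(?:,|$))'       # no two-letter item appears again later
    r'.*'
)

def no_repeated_pairs(text: str) -> bool:
    """True if the list is well formed and no two-letter item is repeated."""
    return RELAXED.match(text) is not None

for case in ["AL,ABC,RA", "ABC,ABC", "AL,ABC,AL", "AL,ALX", "al,ra"]:
    print(case, no_repeated_pairs(case))
```

Here "ABC,ABC" is accepted because only two-letter items are constrained, while "AL,ALX" is accepted because ALX is a different item from AL.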

Get Event Log Message content in a Variable

I want to get the first "WDS.Device.ID" (00-15-5D-8A-44-25) (without the [] brackets) into a variable.
I tried some RegEx things but without success as I lack the knowledge for it.
PS C:\Windows\system32> $result | fl
Message : A device query was successfully processed (status 0x0):
Input:
WDS.Request.Type='Deployment'
WDS.Client.Property.Architecture.Process='X64'
WDS.Client.Property.Architecture.Native='X64'
WDS.Client.Property.Firmware.Type='BIOS'
WDS.Client.Property.SMBIOS.Manufacturer='Microsoft Corporation'
WDS.Client.Property.SMBIOS.Model='Virtual Machine'
WDS.Client.Property.SMBIOS.Vendor='American Megatrends Inc.'
WDS.Client.Property.SMBIOS.Version='090008 '
WDS.Client.Property.SMBIOS.ChassisType='Desktop'
WDS.Client.Property.SMBIOS.UUID={CCD695BE-20AB-48CC-8F01-319B498F7A69}
WDS.Client.Request.Version=1.0.0.0
WDS.Client.Version=10.0.18362.1
WDS.Client.Host.Version=10.0.18362.1
WDS.Client.DDP.Default.Match=FALSE
WDS.Device.ID=[00-15-5D-8A-44-25]
WDS.Device.ID=[BE-95-D6-CC-AB-20-CC-48-8F-01-31-9B-49-8F-7A-69]
Output:
WDS.Client.Property.Architecture.Process='X64'
WDS.Client.Property.Architecture.Native='X64'
WDS.Client.Property.Firmware.Type='BIOS'
WDS.Client.Property.SMBIOS.Manufacturer='Microsoft Corporation'
WDS.Client.Property.SMBIOS.Model='Virtual Machine'
WDS.Client.Property.SMBIOS.Vendor='American Megatrends Inc.'
WDS.Client.Property.SMBIOS.Version='090008 '
WDS.Client.Property.SMBIOS.ChassisType='Desktop'
WDS.Client.Property.SMBIOS.UUID={CCD695BE-20AB-48CC-8F01-319B498F7A69}
WDS.Client.Request.Version=1.0.0.0
WDS.Client.Version=10.0.18362.1
WDS.Client.Host.Version=10.0.18362.1
WDS.Client.DDP.Default.Match=FALSE
WDS.Client.Request.ResendAuthenticated=TRUE
Turning my comment into an answer.
If the message you show is inside a string variable (let's call it $message), then you can use regex to get the value for the WDS.Device.ID without the brackets like this:
$deviceID = ([regex]'(?i)WDS\.Device\.ID=\[((?:[0-9a-f]{2}-){5}[0-9a-f]{2})\]').Match($message).Groups[1].Value
Result:
00-15-5D-8A-44-25
Regex details:
WDS Match the characters “WDS” literally
\. Match the character “.” literally
Device Match the characters “Device” literally
\. Match the character “.” literally
ID= Match the characters “ID=” literally
\[ Match the character “[” literally
( Match the regular expression below and capture its match into backreference number 1
(?: Match the regular expression below
[0-9a-f] Match a single character present in the list below
A character in the range between “0” and “9”
A character in the range between “a” and “f”
{2} Exactly 2 times
- Match the character “-” literally
){5} Exactly 5 times
[0-9a-f] Match a single character present in the list below
A character in the range between “0” and “9”
A character in the range between “a” and “f”
{2} Exactly 2 times
)
\] Match the character “]” literally
The (?i) in the regex makes it case-insensitive
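The same pattern can be tried outside PowerShell with a short Python sketch. The function name first_device_id and the shortened sample message are assumptions made for illustration; re.IGNORECASE plays the role of (?i).

```python
import re
from typing import Optional

DEVICE_ID = re.compile(
    r'WDS\.Device\.ID=\[((?:[0-9a-f]{2}-){5}[0-9a-f]{2})\]',
    re.IGNORECASE,
)

def first_device_id(message: str) -> Optional[str]:
    """Return the first six-byte device ID without brackets, or None if absent."""
    m = DEVICE_ID.search(message)
    return m.group(1) if m else None

message = """WDS.Client.DDP.Default.Match=FALSE
WDS.Device.ID=[00-15-5D-8A-44-25]
WDS.Device.ID=[BE-95-D6-CC-AB-20-CC-48-8F-01-31-9B-49-8F-7A-69]"""

print(first_device_id(message))
```

Because the pattern requires the closing bracket right after the sixth pair, the longer 16-byte ID on the following line is not matched by it.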
here's another way to go about it. this presumes the $Result variable holds one multiline string AND that the 1st [ & the 1st ] are "bracketing" your target data. [grin]
$Result.Split('[')[1].Split(']')[0]
output = 00-15-5D-8A-44-25