How to replace words after first two words - preg-replace

Let's say I have a full name like: Wan Ahmad Wan Dollah Karmat.
And I want to display it like: Wan Ahmad W.D.K
I tried this code:
preg_replace('/(.)[^\s]+\s?/', '${1}.', strtoupper($_GET['fullname']), 2)
But the output is: W.A.Wan Dollah Karmat
I want to keep the first two words and abbreviate the remaining words. Please help.
Problem solved, thanks to Casimir et Hippolyte. The final code is:
preg_replace('~^(?:\s*\S+){1,2}(*SKIP)(*FAIL)|(\S)\S+~', '${1}.', strtoupper($_GET['fullname']))
It was a matter of getting the pattern right.

You can use the backtracking control verbs (*SKIP) and (*FAIL) to skip the first two words.
$pattern = '~^(?:\s*\S+){1,2}(*SKIP)(*FAIL)|(\S)\S+~';
$result = preg_replace_callback($pattern,
function ($m) { return strtoupper($m[1]) . '.'; },
$_GET['fullname'] );
In short:
(*SKIP) forces the regex engine not to retry the substring matched by the preceding subpattern if the pattern fails later.
(*FAIL) forces the pattern to fail.
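For comparison, here is a minimal Python sketch of the same result. Python's built-in re module does not support (*SKIP) or (*FAIL), so this swaps in a plain split-and-rebuild approach rather than the pattern above. The names abbreviate_tail and keep are made up for illustration, and the initials are joined without spaces, as in the desired output from the question.

```python
def abbreviate_tail(full_name, keep=2):
    """Keep the first `keep` words; shorten each later word to 'X.'."""
    words = full_name.split()
    head = words[:keep]
    # First letter of every remaining word, upper-cased and followed by a dot.
    tail = "".join(w[0].upper() + "." for w in words[keep:])
    if not tail:
        return " ".join(head)
    return " ".join(head + [tail])


print(abbreviate_tail("Wan Ahmad Wan Dollah Karmat"))  # Wan Ahmad W.D.K.
```

Unlike the PCRE pattern, this version also shortens one-letter words, since it does not require at least two non-space characters per word.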

Related

Modify vscode snippet by regex: TitleCase and SNAKE_CASE

I have two questions for vscode snippets+regex;
I have a pathname like some-component and I need to generate an output like SomeComponent using vscode snippet.
I need to input sendData and return an string like const sendData = createMessage(SEND_DATA);
How can I do this using regex on vscode snippet?
"${TM_DIRECTORY/(.*)/${1:/pascalcase}/g}"
You didn't really say how you are getting your path name, so this is just one possibility; perhaps RELATIVE_FILEPATH works for you.
"$1 = createMessage(${1/(([^A-Z]+)(\\w*))/${2:/upcase}_${3:/upcase}/});"
This splits the input sendData into two capture groups, $2 and $3, and upcases both in the transform.
"sendData": {
"prefix": "cm",
"body": [
"${TM_DIRECTORY/(.*)/${1:/pascalcase}/}",
// simpler form if ONLY two "words" like "sendData"
"$1 = createMessage(${1/(([^A-Z]+)(\\w*))/${2:/upcase}_${3:/upcase}/});",
// for any number of words, like "sendDataTwoThreeFour" use this:
"$1 = createMessage(${1/([a-z]*)([A-Z][a-z]*)/${1:/upcase}${2:+_}${2:/upcase}/g});"
]
}
${1/([a-z]*)([A-Z][a-z]*)/${1:/upcase}${2:+_}${2:/upcase}/g} gets the first word "send" into capture group 1 and the other words, like "Data" or "Two", into capture group 2 of the subsequent matches. [So the g flag at the end is very important.]
Upcase group 1. Then, if there is a group 2, ${2:+_} adds _. Then upcase group 2.
The only case this will not work on is send with nothing else. It still prints out all the text; it just doesn't upcase send if it is by itself. There is probably a way to include that...
Edit: And here it is:
"$1 = createMessage(${1/([a-z]*)([A-Z][a-z]*)|([a-z]+)/${1:/upcase}${3:/upcase}${2:+_}${2:/upcase}/g});"
Now a bare send will be put into group 3 and upcased. For the rest of the matches there will not be a group 3, so ${3:/upcase} returns nothing.
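To check what the two transforms are expected to produce, the same conversions can be sketched in Python. This is only an illustration of the intended input/output, not how VS Code implements its snippet transforms; the function names to_screaming_snake and to_pascal are hypothetical.

```python
import re


def to_screaming_snake(name):
    """sendData -> SEND_DATA; a bare 'send' -> SEND."""
    # Insert "_" at each lowercase-to-uppercase boundary, then upcase everything.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", "_", name).upper()


def to_pascal(name):
    """some-component -> SomeComponent."""
    return "".join(part.capitalize() for part in name.split("-") if part)


print(to_pascal("some-component"))
print(f"const sendData = createMessage({to_screaming_snake('sendData')});")
```

The second line printed is const sendData = createMessage(SEND_DATA); which is the target output from the question.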

ignore spaces and cases MATLAB

diary_file = tempname();
diary(diary_file);
myFun();
diary('off');
output = fileread(diary_file);
I would like to search for a string in output while ignoring spaces and upper/lower case. Here is an example of what's in output:
the test : passed
number : 4
found = 'thetest:passed'
a = strfind(output,found )
How can I ignore spaces and case in output?
Assuming you are not too worried about accidentally matching something like: 'thetEst:passed' here is what you can do:
Remove all spaces and only compare lower case
found = 'With spaces'
found = lower(found(found ~= ' '))
This will return
found =
withspaces
Of course you would also need to do this with each line of output.
Another way:
regexpi(output(~isspace(output)), found, 'match')
if output is a single string, or
regexpi(regexprep(output,'\s',''), found, 'match')
for the more general case (either class(output) == 'cell' or 'char').
Advantages:
Fast.
robust (ALL whitespace (not just spaces) is removed)
more flexible (you can return starting/ending indices of the match, tokenize, etc.)
will return original case of the match in output
Disadvantages:
more typing
less obvious (more documentation required)
will return original case of the match in output (yes, there's two sides to that coin)
That last point in both lists is easily forced to lower or uppercase using lower() or upper(), but if you want same-case, it's a bit more involved:
C = regexpi(output(~isspace(output)), found, 'match');
if ~isempty(C)
C = found;
end
for single string, or
C = regexpi(regexprep(output, '\s', ''), found, 'match')
C(~cellfun('isempty', C)) = {found}
for the more general case.
You can use lower to convert everything to lowercase to solve your case problem. However ignoring whitespace like you want is a little trickier. It looks like you want to keep some spaces but not all, in which case you should split the string by whitespace and compare substrings piecemeal.
I'd advise using regex, e.g. like this:
a = regexpi(output, 'the\s*test\s*:\s*passed');
If you don't care about the position where the match occurs but only whether there is a match at all, removing all whitespace would be a brute-force, and somewhat nasty, possibility:
a = strfind(strrep(output, ' ',''), found);
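The two strategies discussed in these answers (strip whitespace and compare in one case, or use a case-insensitive regex that tolerates whitespace) can be sketched in Python for comparison. This is an illustration of the idea only, not MATLAB code; the sample output string is taken from the question, and the variable names are assumptions.

```python
import re

output = "the test : passed\nnumber : 4\n"
found = "thetest:passed"

# Approach 1: remove all whitespace, lowercase both sides, plain substring test.
squeezed = re.sub(r"\s+", "", output).lower()
print(found.lower() in squeezed)  # True

# Approach 2: allow optional whitespace between every character of the needle,
# and ignore case. This keeps the position and original text of the match.
pattern = r"\s*".join(re.escape(ch) for ch in found)
match = re.search(pattern, output, flags=re.IGNORECASE)
print(match.group(0) if match else None)  # the test : passed
```

The first approach loses the position of the match in the original text; the second keeps it, at the cost of a less readable pattern.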

Separating file name in parts by identifier

This may be a very simple task for many but I could not find anything appropriate for me.
I have a file name: filenm_A006.2011.269.10.47.G25_2010
I want to separate all its parts (separated by . and _) to use them separately. How can I do it with simple matlab commands?
Kind Regards,
Mushi
I recommend regexp:
fname = 'filenm_A006.2011.269.10.47.G25_2010';
parts = regexp(fname, '[^_.]+', 'match');
parts =
'filenm' 'A006' '2011' '269' '10' '47' 'G25' '2010'
You can now refer to parts{1} through parts{8} for the pieces. Explanation: the regexp pattern [^_.] means all characters not equal to _ or ., and the + means you want groups of at least 1 character. Then 'match' asks the regexp function to return a cell array of the strings of all the matches of that pattern. There are other regexp modes; for example, the indices of each piece of the file.
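The same character-class pattern works unchanged in other regex engines. As a rough sketch in Python (not MATLAB), re.findall returns the matched pieces and re.finditer gives their start/end indices, which roughly corresponds to the other regexp modes mentioned above:

```python
import re

fname = "filenm_A006.2011.269.10.47.G25_2010"

# Every run of one or more characters that are neither "_" nor ".".
parts = re.findall(r"[^_.]+", fname)
print(parts)
# ['filenm', 'A006', '2011', '269', '10', '47', 'G25', '2010']

# (start, end) index pairs of each piece, 0-based with an exclusive end.
spans = [m.span() for m in re.finditer(r"[^_.]+", fname)]
print(spans[0])  # (0, 6)
```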
Use the strsplit command:
cellArrayOfParts = strsplit(fileName,{'.' '_'});
You can use strsplit to split it:
strsplit('filenm_A006.2011.269.10.47.G25_2010',{'_','.'})
ans =
'filenm' 'A006' '2011' '269' '10' '47' 'G25' '2010'
Another option is to use regexp, like Peter suggested.

Saving specific areas of a filename

I have a list of PDF files named in this format: "123 - Test - English.pdf". I want to store "123", "Test" and "English.pdf" in their own individual variables. I tried running the code below, but I don't think it accounts for multiple dashes ("-"). How can I do this? Thanks in advance.
Loop,C:\My Documents\Notes\*.pdf, 0, 0
{
NewVariable = Trim(Substr(A_LoopFileName,1, Instr(A_LoopFileName, "-")-1))
I would recommend using a parse loop to get your variables. The following loops through values between the dashes and removes the whitespace.
FileName = Test - file - name.pdf
Loop, parse, FileName, `-
MyVar%A_Index% := RegExReplace(A_LoopField, A_Space, "")
msgbox % Myvar1 "`n" Myvar2 "`n" MyVar3
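The split-on-dash idea behind the parse loop can be sketched in Python for comparison. This is not AutoHotkey code, and the function name split_name is made up. One difference is worth noting: the sketch trims whitespace only at the ends of each piece, whereas the RegExReplace call above removes every space, including spaces inside a piece.

```python
def split_name(file_name):
    """Split on dashes and trim the whitespace around each piece."""
    return [piece.strip() for piece in file_name.split("-")]


parts = split_name("123 - Test - English.pdf")
print(parts)  # ['123', 'Test', 'English.pdf']
```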
First, I don't know if it was a typo, but if you use a { under your loop statement, you also need to close it. If your next statement is just one line, you don't need any brackets at all.
Second, if you just use =, the right-hand side is stored as literal text instead of being evaluated. To evaluate an expression you need to use :=
Third, your present code, if coded correctly would result in this:
somepdffile.pd
if it found any pdf files without a dash. Instr() returns the position of a dash, or 0 if there is no dash. In that case the length argument of your substr() call becomes 0 - 1 = -1. A negative length tells substr() to leave that many characters off the end of the string, which is why the last character gets cut off.
Loop, C:\My Documents\Notes\*.pdf, 0, 0
{
;look at the docs (http://www.autohotkey.com/docs/) for `substr`
}
So that explains why your code doesn't work. To get it to do what you want, can you explain a bit more about what you want NewVariable to look like?
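A similar pitfall exists in other languages, which makes it easy to demonstrate. In the Python sketch below (an analogy, not AutoHotkey), str.find returns -1 when there is no dash, and slicing up to -1 silently drops the last character. The guarded helper before_dash is a hypothetical name; it checks for the missing dash first.

```python
def before_dash(file_name):
    """Return the trimmed text before the first dash, or None if there is no dash."""
    pos = file_name.find("-")  # -1 when no dash is present
    if pos == -1:
        return None
    return file_name[:pos].strip()


name = "somepdffile.pdf"
print(name[:name.find("-")])  # somepdffile.pd  (unguarded: last character lost)
print(before_dash(name))      # None
print(before_dash("123 - Test - English.pdf"))  # 123
```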
; here is another way (via RegExMatch)
src:="123 - Test - English.pdf", pat:="[^\s|-]+"
While, mPos:=RegExMatch(src, pat, match, mPos ? mPos+StrLen(match):1)
match%A_Index%:=match
MsgBox, 262144, % "result", % match1 ", "match2 ", "match3

Powershell - Capture text in a var from a specific character

I want to grab the first character of a string variable and the first character that follows a given character.
Example:
$var1 = "Jean-Martin"
I want a way to grab the first letter "J" then I want to take the first char following the "-" (dash) which is "M".
Something like this?
$initial1 = $var1[0]
$initial2 = $var1.Split('-')[1][0]
Strings in PowerShell use the System.String class from the .NET framework. As such, they are indexable to retrieve individual characters and have many methods available, such as the Split method used above.
See the documentation here.
$var1 = "Jean-Martin"
To get the first character:
$var1[0]
To get the first character after the dash:
$characterToSeek = '-'
$var1[$var1.IndexOf($characterToSeek)+1]
Another option using regex:
PS> $var1 -replace '^(.)[^-]+-(.).+$','$1$2'
JM
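For readers more familiar with Python, the indexing/split approach and the regex approach translate almost one to one. This is a sketch for comparison only, with made-up variable names, and it assumes the input always contains a dash followed by at least one character.

```python
import re

var1 = "Jean-Martin"

# Indexing and splitting, as in the first answer.
initials_split = var1[0] + var1.split("-")[1][0]

# Regex: capture the first character, then the first character after the dash.
m = re.match(r"(.)[^-]*-(.)", var1)
initials_regex = m.group(1) + m.group(2) if m else None

print(initials_split, initials_regex)  # JM JM
```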