replace string1 with string2 in many java files, only in comments - perl

I have around 3000 replacements to make across hundreds of files. Replacing all occurrences of string1 with string2 would be easy; IntelliJ even lets me replace all occurrences in "comments and strings".
The problem is that the same string appears in both comments and real code. I would like to restrict the replacement to comments only (we use a mix of /* */ and //).
Is there any library, IDE, or script that can do this?

use Regexp::Common 'comment';
...
s/($RE{comment}{'C++'})/(my $x = $1) =~ s#string1#string2#g; $x/ge;

Try using the following regex to find all comments, and then replace what you want afterwards:
/(?>\/\*[^\*\/]*\*\/|\/\/([^\n])*\n)/
The first part, \/\*[^\*\/]*\*\/, looks for /* */ pairs: something that starts with /*, then contains any characters other than * or /, and then ends with the closing tag */. Note that this means a block comment with a * or / in its body (a /** Javadoc */ comment, for example) is not matched.
The other part looks for something that starts with //, contains anything that is not a newline ([^\n]*), and runs to the end of the line (\n).
Thus it should find most comments.
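For comparison, here is a minimal sketch of the same "find the comments, then replace inside them" idea in Python rather than Perl. The function name replace_in_comments is made up for illustration. It assumes ordinary Java string and char literals and does not handle text blocks ("""); string literals are matched first so that a // inside a string is not mistaken for a comment.

```python
import re

# String and char literals are matched first and left alone;
# only text matched by the "comment" group is rewritten.
LITERAL = r'"(?:\\.|[^"\\\n])*"' + "|" + r"'(?:\\.|[^'\\\n])*'"
COMMENT = r"/\*.*?\*/|//[^\n]*"
TOKEN = re.compile(f"(?P<literal>{LITERAL})|(?P<comment>{COMMENT})", re.DOTALL)


def replace_in_comments(source, old, new):
    """Replace `old` with `new` only inside /* */ and // comments."""
    def fix(match):
        if match.group("comment") is None:
            return match.group(0)  # a string/char literal: keep as-is
        return match.group(0).replace(old, new)

    return TOKEN.sub(fix, source)
```

Running it over each file's contents and writing the result back would complete the job; try it on a copy of the source tree first.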

Replace every non letter or number character in a string with another

Context
I am designing a code that runs a bunch of calculations, and outputs figures. At the end of the code, I want to save everything in a nice way, so my take on this is to go to a user specified Output directory, create a new folder and then run the save process.
Question(s)
My question is twofold:
I want my folder name to be unique. I was thinking about getting the current date and time and creating a unique name from this and the input filename. This works but it generates folder names that are a bit cryptic. Is there some good practice / convention I have not heard of to do that?
When I get the datetime string (tn = datestr(now);), it looks like this:
tn =
'07-Jul-2022 09:28:54'
To convert it to a nice filename, I replace the '-', ' ' and ':' characters with underscores and append it to a shorter version of the input filename chosen by the user. I do that using strrep:
tn = strrep(tn,'-','_');
tn = strrep(tn,' ','_');
tn = strrep(tn,':','_');
This is fine but it bugs me to have to use 3 lines of code to do so. Is there a nice one liner to do that? More generally, is there a way to look for every non letter or number character in a string and replace it with a given character? I bet that's what regexp is there for but frankly I can't quite get a hold on how regexps work.
Your point (1) is opinion based so you might get a variety of answers, but I think a common convention is to at least start the name with a reverse-order date string so that sorting alphabetically is the same as sorting chronologically (i.e. yymmddHHMMSS).
To answer your main question directly, you can use the built-in makeValidName utility which is designed for making valid variable names, but works for making similarly "plain" file names.
str = '07-Jul-2022 09:28:54';
str = matlab.lang.makeValidName(str)
% str = 'x07_Jul_202209_28_54'
Because a valid variable can't start with a number, it prefixes an x - you could avoid this by manually prefixing something more descriptive first.
This option is a bit simpler than working out the regex, although regexprep is another option that isn't too nasty here, replacing every non-word character with an underscore:
str = regexprep( str, '\W', '_' ); % \W (capital W) matches any char other than a letter, digit or underscore
% str = '07_Jul_2022_09_28_54'
To answer indirectly with a different approach, a nice trick with datestr which gets around this issue and addresses point (1) in one hit is to use the following syntax:
str = datestr( now(), 30 );
% str = '20220707T094214'
The 30 input (from the docs) gives you an ISO standardised string to the nearest second in reverse-order:
'yyyymmddTHHMMSS' (ISO 8601)
(note the T in the middle isn't a placeholder for some time measurement, it remains a literal letter T to split the date and time parts).
I normally use your folder naming approach with a meaningful prefix, replacing ':' by something else:
folder_name = ['results_' strrep(datestr(now), ':', '.')];
As for your second question, you can use isstrprop:
folder_name(~isstrprop(folder_name, 'alphanum')) = '_';
Or if you want more control on the allowed characters you can use good old ismember:
folder_name(~ismember(folder_name, ['0':'9' 'a':'z' 'A':'Z'])) = '_';
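For readers outside MATLAB, the two ideas above (a sortable timestamp, and replacing every non-alphanumeric character) can be sketched in Python as follows. The helper names safe_name and run_folder_name are invented for this example, and the '%Y%m%dT%H%M%S' format is chosen to mirror the datestr(now, 30) output.

```python
import re
from datetime import datetime


def safe_name(text, fill="_"):
    """Replace every character that is not an ASCII letter or digit."""
    return re.sub(r"[^0-9A-Za-z]", fill, text)


def run_folder_name(prefix, when):
    """Build '<prefix>_<yyyymmddTHHMMSS>' so names sort chronologically."""
    return f"{safe_name(prefix)}_{when.strftime('%Y%m%dT%H%M%S')}"


# Usage: a folder name for the current run
folder = run_folder_name("results", datetime.now())
```

The character class is spelled out rather than using \W because \W would leave underscores and non-ASCII letters untouched.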

What is the right regex to match a relative path to an image file?

I have this path ../../Capture.jpg. So far I've figured out this incomplete regex: '[../]+'. I want to check if user puts in the right path like ../../image file name. The file extensions can be jpg, png, ..
Your [../]+ is neither sufficient nor correct for the job at hand, if you REALLY want to match a bunch of ../ at the start of a filename. It is a character class, so it matches any run of . and / characters in any order.
It's not completely clear what you want to do exactly, but the following will match one or more ../ at the start of a string:
/^((?:\.\.\/)+)/
basically:
^ to anchor to the start of the string being tested - will not match any ../ INSIDE the string
( and the balancing ) at the end: capture the contents within. All your ../../ will be available in a variable called $1
then I'm using (?: ) to wrap the next content. This groups the bit inside, but does NOT save the value inside a $1, $2, etc. More information soon...
The REAL pattern of interest is
\.\.\/
Since . is a regex metacharacter and / is the pattern delimiter here, both need 'escaping' with a backslash. This tells Perl that the . and / do NOT have a special meaning at this point.
I've used the (?: ) wrapper to group them together, so that the + operates on all 3 characters of interest. The + operator means "one or more repetitions".
So, my pattern will match one or more repetitions of ../ which are anchored to the start of the string. Furthermore, the exact contents matched will be available in $1 if you are interested in doing something with that (eg count how many ../ you have)
Please ask if you have further questions, or I have misunderstood your goals.
EDIT: to suit your new requirements, and add a bit of bonus:
m!^\.\./\.\./(([^/]+)\.([^./]+))$!
Note first that I've used m!pattern! instead of /pattern/. Firstly, if Perl sees /pattern/ it assumes it's m/pattern/ but you can use an alternative character to wrap the patterns. This is useful if you actually want to use / in your pattern without having to go nuts with backslashes.
so:
^ exactly match only from the start
followed by exactly ../../
next I've used ( ) wrappers to capture the bits following. Explanation after...
ignoring the ( and ) now:
[^/]+ one or more repetitions (+) of any character that isn't /
\. literally a dot - the one before the extension
[^./]+ one or more repetitions of any character that isn't . or /
Notice how the [^/]+ allows for any character including . but prevents another directory part from sneaking in. Thus, the filename could be foo.bar.jpg and it will be collected properly.
Notice how [^./]+ allows for any character in the extension except a dot - and also excluding / to prevent another directory segment from sneaking in.
Finally, $ is used to ensure we've reached the end of the string.
as for the captures:
$1 will contain all of foo.bar.jpg
$2 will contain foo.bar
$3 will contain jpg (not .jpg) but I'll leave it up to you to figure out what to change if you wish to capture the dot as well.
FINALLY - in a typical script, you might do something like:
if ($filename =~ m!^\.\./\.\./(([^/]+)\.([^./]+))$!) {
    print "You correctly entered ../../$1 giving basename=$2 and extension=$3 - Bravo!\n";
}
else {
    print "you've failed to read the instructions properly\n";
}
As a bonus, I even tested that, and found 2 spolling mistaiks you'll never have to see
cheers.
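If it helps to see the same pattern outside Perl, here is a small Python sketch of that check. The function name parse_image_path is hypothetical; the regex is the one from the script above, and the three returned values correspond to $1, $2 and $3.

```python
import re

# Same pattern as the Perl script: exactly ../../ then name.extension
PATH_RE = re.compile(r"^\.\./\.\./(([^/]+)\.([^./]+))$")


def parse_image_path(path):
    """Return (filename, basename, extension), or None if the path is rejected."""
    match = PATH_RE.match(path)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)
```

To accept only certain extensions you could check the third value against a set such as {"jpg", "png"} after matching.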
# convert relative file paths to md links ...
# file paths and names with letters , nums - and _ s supported
$str =~ s! (\.\.\/([a-zA-Z0-9_\-\/\\]*)[\/\\]([a-zA-Z0-9_\-]*)\.([a-zA-Z0-9]*)) ! [$3]($1) !gm
If you don't care about the path prefix, use:
$path =~ /\.(jpg|png)$/
or
substr($path, -4) ~~ ['.jpg', '.png']
With exactly '../../', use:
$path =~ m!^\.\./\.\./[^/]*\.(jpg|png)$!
With any number of '../'s, use:
$path =~ m!^(\.\./)*[^/]*\.(jpg|png)$!

matlab regexprep

How do I use MATLAB's regexprep for multiple expressions and replacements?
file='http:xxx/sys/tags/Rel/total';
I want to replace 'sys' with 'sys1' and 'total' with 'total1'. For a single expression and replacement it works like this:
strrep(file,'sys', 'sys1')
and I would like to have something like
strrep(file,'sys','sys1','total','total1')
I know this doesn't work for strrep.
Why not just issue the command twice?
file = 'http:xxx/sys/tags/Rel/total';
file = strrep(file,'sys','sys1')
file = strrep(file,'total','total1')
To solve it you need regex substitution. Look in MATLAB's regex functions for something similar to this in PHP:
$string = 'http:xxx/sys/tags/Rel/total';
preg_replace('/http:(.*?)\//', 'http:${1}1/', $string);
${1} refers to the first capture group, that is, whatever the parentheses (.*?) matched.
http:(.*?)\/ - match pattern
http:${1}1/ - replacement pattern; the second 1 is the character you wish to append (the first 1 is the group number)
http:xxx/sys/tags/Rel/total - input string
The secret is that whatever (.*?) matches (whether xxx or yyyy or 1234) is inserted in place of ${1} in the replacement pattern, and the result then replaces the matched text in the input string. It is worth looking at more examples of substitution in PHP.
As documented in the help page for regexprep, you can specify pairs of patterns and replacements like this:
file='http:xxx/sys/tags/Rel/total';
regexprep(file, {'sys' 'total'}, {'sys1' 'total1'})
ans =
http:xxx/sys1/tags/Rel/total1
It is even possible to use tokens, should you be able to define a match pattern for everything you want to replace:
regexprep(file, '/([st][yo][^/$]*)', '/$11')
ans =
http:xxx/sys1/tags/Rel/total1
However, care must be taken with the first approach in some circumstances, because MATLAB applies the pairs one after another. If the first pattern matches and its replacement contains something that a later pattern matches, that text will be replaced again by the later replacement, even though it did not match the later pattern in the original string.
Example:
regexprep('This\is{not}LaTeX.', {'\\' '([{}])'}, {'\\textbackslash{}' '\\$1'})
ans =
This\textbackslash\{\}is\{not\}LaTeX.
=> This\{}is{not}LaTeX.
and
regexprep('This\is{not}LaTeX.', {'([{}])' '\\'}, {'\\$1' '\\textbackslash{}'})
ans =
This\textbackslash{}is\textbackslash{}{not\textbackslash{}}LaTeX.
=> This\is\not\LaTeX.
Both results are unintended, and there seems to be no way around this in regexprep, since the replacements are applied consecutively rather than simultaneously.
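To illustrate what a simultaneous replacement looks like, here is a sketch in Python (not MATLAB), with the made-up name replace_simultaneously. It handles literal strings only, not patterns with tokens: all keys are joined into one alternation and the text is scanned once, so a replacement is never re-examined by a later rule.

```python
import re


def replace_simultaneously(text, table):
    """Replace each literal key of `table` with its value in a single pass."""
    # Longer keys first, so a key that is a prefix of another cannot shadow it.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: table[match.group(0)], text)


# The LaTeX escaping example from above, done in one pass
LATEX = {"\\": r"\textbackslash{}", "{": r"\{", "}": r"\}"}
escaped = replace_simultaneously(r"This\is{not}LaTeX.", LATEX)
```

Because the braces inserted by \textbackslash{} are never rescanned, the order of the entries in the table no longer matters.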

Splitting a variable and putting into an array

I have a string like this: <name>sekar</name>. I want to split this string (I am using Perl), take out only sekar, and push it into an array while leaving out the other stuff.
I know how to push into an array, but I am stuck on the splitting part.
Does anyone have an idea how to do this?
push @output, $1 if m|<name>(\w*)</name>|;
Try this:
my($name) = $string =~ m|<name>(.*)</name>|;
From perldoc perlop:
If the "/g" option is not used, "m//" in list context returns a
list consisting of the subexpressions matched by the
parentheses in the pattern, i.e., ($1, $2, $3...).
Try <(("[^"]*"|'[^']*'|[^'">])*)>(\w+)<\/\1>. It should work; I'll test it when I get home. The idea is that the first capture group finds the contents within a <>, and its nested capture group prevents a situation like <blah=">"> matching as <blah=">. The third capture group, (\w+), matches the inner word. This may have to be changed depending on the format of the content you can have within <tag>content</tag>. Lastly, the \1 refers back to the content of the first capture group, so that you find the proper closing tag.
Edit: I've tested this with perl and it works.
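The capture-group idea carries over directly to other languages. As a rough Python sketch (extract_names is a name invented here), findall returns just the captured text for every match. Note one deliberate difference from the answers above: a non-greedy .*? is used so that two tags on the same line are not merged into one match.

```python
import re

NAME_RE = re.compile(r"<name>(.*?)</name>")


def extract_names(text):
    """Return the text between each <name>...</name> pair, in order."""
    return NAME_RE.findall(text)


output = []
output.extend(extract_names("<name>sekar</name>"))
```

For anything beyond simple, flat tags like this one, a real XML parser is the safer choice.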

Incrementing an integer at the end of a string in perl

I have a string in the following format:
\main\stream\foo.h\3
it may have more or less "sections", but will always end with a slash followed by an integer. Other examples include:
\main\stream2309\stream222\foo.c\45
\main\foo.c\9
I need to, in Perl, increment the number at the end of the string and leave the rest alone. I found an example on this site that does exactly what I want to do (see Increment a number in a string in with regex) only the language is Javascript. The solution given was:
.replace(/\d+$/,function(n) { return ++n })
I need to do the same thing in Perl.
You can use the /e regex modifier to put executable code in your replacement string.
Something like:
$string =~ s/(\d+)$/$1 + 1/e;
should work.
Try $var =~ s/(\d+$)/($1 + 1)/e
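For comparison with the JavaScript version quoted in the question, the same "compute the replacement from the match" technique looks like this in Python, where a function passed to re.sub plays the role of Perl's /e. The name bump_trailing_number is made up for this sketch.

```python
import re


def bump_trailing_number(path):
    """Increment the integer at the very end of the string, leaving the rest alone."""
    return re.sub(r"\d+$", lambda match: str(int(match.group(0)) + 1), path)


bumped = bump_trailing_number(r"\main\stream\foo.h\3")
```

Since the digits are converted to an integer before adding, a carry such as 9 to 10 is handled correctly.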