Search for a string only inside a function definition - perl

Is there any way to search for a string only inside a function definition?
Suppose there is a C source file a.c that defines several functions, but I want search results only when the string appears inside the definition of one specific function (let's say do_something()). Is there any way to do such a search from the command prompt?
For example, for the following code:
#include <stdio.h>
void f(int n,
int j,
int k)
{
printf("name is is pankaj ");
printf("name is is kumar ");
printf("name is is mayank ");
}
int main()
{
printf("name is is pankaj ");
return 0;
}
For the above program, I want only the one occurrence of pankaj that is inside function f(); I don't want the pankaj in main to appear in the search output.
Please ignore any semantic or syntax errors in the program; my question is only about searching for a string in it.

Of course, try this:
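# Enter the function: its signature matches the regex in "fun".
$0 ~ fun {
    count = 1
    # Skip ahead to the line holding the opening brace; give up at end of input.
    while ($0 !~ /[{]/)
        if ((getline) <= 0)
            exit
    next
}
# Inside the function body: track brace depth and report matching lines.
count > 0 {
    if ($0 ~ /[{]/)
        count++
    if ($0 ~ /[}]/)
        count--
    if ($0 ~ query)
        print FILENAME ": l" FNR ". " $0
}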
And invoke the script like this:
awk -v query="pankaj" -v fun="void f[(]" -f script.awk a.c
Where query is the string to search and fun the regex for the function name.
This script counts { and } to see when we leave the function and should print the line if a match is found.
Edit: you may want to extend the regex for counting brackets, perhaps an extra check to see if they aren't placed in comments is required (although you'd never do that).
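For comparison, the same brace-counting idea can be sketched in Python. This is only an illustration under the same assumptions as the awk script: the name search_in_function and its parameters are made up for this example, braces inside comments or string literals are counted like any others, and a body written entirely on the line that opens it is not searched.

```python
import re

def search_in_function(lines, fun, query):
    """Return (line_number, text) pairs for lines matching the regex `query`
    inside the first function whose signature matches the regex `fun`.

    Hypothetical helper: it counts braces line by line and does not parse C.
    """
    started = False   # True once the signature regex has matched
    depth = 0         # brace nesting depth; 0 means the body is not open yet
    hits = []
    for number, line in enumerate(lines, start=1):
        if not started:
            if not re.search(fun, line):
                continue
            started = True
        inside = depth > 0   # was the body already open before this line?
        depth += line.count("{") - line.count("}")
        if inside and re.search(query, line):
            hits.append((number, line.strip()))
        # Stop at the closing brace (or after a body that fits on one line).
        if depth <= 0 and (inside or "{" in line):
            break
    return hits
```

It could be called with the lines of a.c, for example search_in_function(open("a.c").readlines(), r"void f\(", "pankaj").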

Related

Method returning a regex in Perl 6?

I'm only beginning to study classes, so I don't understand the basics yet.
I want a method that constructs a regex using attributes of the object:
class TEST {
has Str $.str;
method reg {
return
rx/
<<
<[abc]> *
$!str
<!before foo>
/;
}
}
my $var = TEST.new(str => 'baz');
say $var.reg;
When trying to run this program, I get the following error message:
===SORRY!=== Error while compiling /home/evb/Desktop/p6/e.p6
Attribute $!str not available inside of a regex, since regexes are methods on Cursor.
Consider storing the attribute in a lexical, and using that in the regex.
at /home/evb/Desktop/p6/e.p6:11
------> <!before foo>⏏<EOL>
expecting any of:
infix stopper
So, what's the right way to do that?
Looks like this would work:
class TEST {
has Str $.str;
method reg {
my $str = $.str;
return
regex {
<<
<[abc]> *
$str
<!before foo>
}
}
}
my $var = TEST.new(str => 'baz');
say $var.reg;
say "foo" ~~ $var.reg;
say "<<abaz" ~~ $var.reg
You are returning an anonymous regex, which can be used as an actual regex, as is done in the last two statements.
Using EVAL solved my problem, so I wonder whether there are any drawbacks to this method.
class TEST {
has Str $.str;
method reg {
return
"rx/
<<
<[abc]> *
$!str
<!before foo>
/".EVAL;
}
}
my $var = TEST.new(str => 'baz');
say "abaz" ~~ $var.reg; # abaz
say "cbazfoo" ~~ $var.reg; # Nil

Lex program rules not working

%{
#include <stdio.h>
int sline=0,mline=0;
%}
%%
"/*"[a-zA-Z0-9 \t\n]*"*/" { mline++; }
"//".* { sline++; }
.|\n { fprintf(yyout,"%s",yytext); }
%%
int main(int argc,char *argv[])
{
if(argc!=3)
{
printf("Invalid number of arguments!\n");
return 1;
}
yyin=fopen(argv[1],"r");
yyout=fopen(argv[2],"w");
yylex();
printf("Single line comments = %d\nMultiline comments=%d\nTotal comments = %d\n",sline,mline,sline+mline);
return 0;
}
I am trying to make a Lex program which would count the number of comment lines (single-line comments and multi-line comments separately).
Using this code, I gave a .c file and a blank text file as input and output arguments.
When a multi-line comment contains any special characters, the rule does not match it, and mline is not incremented for that comment.
How do I fix this problem?
Below is a nudge in the right direction. The main difference between what you did and what I have done is that I use only two regexes: one for whitespace and one for ident (identifiers). By identifiers I mean anything that you might want to comment out. This regex can obviously be expanded to include other characters and symbols. I also defined the three patterns that begin and end comments and associated them with tokens that could be passed to a syntax analyzer (but that's a whole other topic).
I also changed the way that you feed input to the program. I find it cleaner to redirect input to a program from a file and redirect output to another file - if you need this.
Here is an example of how you might use this program:
flex filename.l
g++ lex.yy.c -o lexer
./lexer < input.txt
You can redirect the output to another file if you need to by using:
./lexer < input.txt > output.txt
Instead of the last command above.
Note: the '.'(dot) character at the end of the pattern matching is used as a catch-all for characters, sequences of characters, symbols, etc. that do not have a match.
There are many nuances to pattern matching using regex to match comment lines. For example, this would still match even if the comment line was part of a string.
Ex. " //This is a comment in a string! "
You will need to do a little more work to get past these nuances - like I said, this is a nudge in the right direction.
You can do something similar to this to accomplish your goal:
%{
#include <stdio.h>
int sline = 0;
int mline = 0;
#define T_SLINE 0001
#define T_BEGIN_MLINE 0002
#define T_END_MLINE 0003
#define T_UNKNOWN 0004
%}
WSPACE [ \t\r]+
IDENT [a-zA-Z0-9]
%%
"//" {
printf("TOKEN: T_SLINE LEXEME: %s\n", yytext);
sline++;
return T_SLINE;
}
"/*" {
printf("TOKEN: T_BEGIN_MLINE LEXEME: %s\n", yytext);
return T_BEGIN_MLINE;
}
"*/" {
printf("TOKEN: T_END_MLINE LEXEME: %s\n", yytext);
mline++;
return T_END_MLINE;
}
{IDENT} {/*Do nothing*/}
{WSPACE} { /*Do Nothing*/}
. {
printf("TOKEN: UNKNOWN LEXEME: %s\n", yytext);
return T_UNKNOWN;
}
%%
int yywrap(void) { return 1; }
int main(void) {
while ( yylex() );
printf("Single-line comments = %d\n Multi-line comments = %d\n Total comments = %d\n", sline, mline, (sline + mline));
return 0;
}
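To illustrate the string-literal nuance mentioned above, here is a small hand-written scanner in Python. It is a sketch, not a lexer: the function name count_comments is made up for this example, and it assumes C-style // and /* */ comments plus quoted literals with backslash escapes.

```python
def count_comments(source):
    """Return (single_line, multi_line) comment counts for C-like source,
    ignoring comment markers that appear inside string or char literals."""
    single = multi = 0
    i, n = 0, len(source)
    while i < n:
        two = source[i:i + 2]
        ch = source[i]
        if two == "//":
            single += 1
            end = source.find("\n", i)        # comment runs to end of line
            i = n if end == -1 else end + 1
        elif two == "/*":
            multi += 1
            end = source.find("*/", i + 2)    # comment runs to the first */
            i = n if end == -1 else end + 2
        elif ch in "\"'":
            # Skip the whole literal, stepping over backslash escapes.
            i += 1
            while i < n and source[i] != ch:
                i += 2 if source[i] == "\\" else 1
            i += 1
        else:
            i += 1
    return single, multi
```

Comments are skipped as a whole, so a quote character inside a comment does not start a literal.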
The problem is your regex for multiline comments:
"/*"[a-zA-Z0-9 \t\n]*"*/"
This only matches multiline comments that ONLY contain letters, digits, spaces, tabs, and newlines. If the comment contains anything else it won't match. You want something like:
"/*"([^*]|"*"+[^*/])*"*"+"/"
This will match anything except a */ between the /* and */.
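The comment pattern from the answer above can be tried out quickly outside lex. The sketch below is a Python translation of it (the names BLOCK_COMMENT and count_block_comments are made up for this example), so the quoting differs from lex but the structure is the same: an opening /*, then any run of non-star characters or stars not followed by a slash, then one or more stars and the closing slash.

```python
import re

# Python spelling of the comment pattern: "/*", then ([^*] | "*"+ [^*/])*,
# then "*"+ and "/".
BLOCK_COMMENT = re.compile(r"/\*(?:[^*]|\*+[^*/])*\*+/")

def count_block_comments(source):
    """Return the number of /* ... */ comments found in `source`."""
    return len(BLOCK_COMMENT.findall(source))
```

Unlike the original character-class rule, this matches comments containing punctuation, extra stars, and slashes, and it stops at the first closing */.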
Below is the full lex code to count the number of comment lines and executable lines.
%{
int cc=0,cl=0,el=0,flag=0;
%}
%x cmnt
%%
^[ \t]*"//".*\n {cc++;cl++;}
.+"//".*\n {cc++;cl++;el++;}
^[ \t]*"/*" {BEGIN cmnt;}
<cmnt>\n {cl++;}
<cmnt>.\n {cl++;}
<cmnt>"*/"\n {cl++;cc++;BEGIN 0;}
<cmnt>"*/" {cl++;cc++;BEGIN 0;}
.*"/*".*"*/".+\n {cc++;cl++;}
.+"/*".*"*/".*\n {cc++;cl++;el++;}
.+"/*" {BEGIN cmnt;}
.\n {el++;}
%%
int main(void)
{
yyin=fopen("abc.cpp","r");
yyout=fopen("abc.txt","w");
yylex();
fprintf(yyout,"Comment Count: %d \nCommented Lines: %d \nExecutable Lines: %d",cc,cl,el);
return 0;
}
int yywrap()
{
return 1;
}
The program takes a C++ source file, abc.cpp, as input and writes the output to the file abc.txt (opening it with "w" overwrites any existing content).

Compare two parameters

I have two parameters:
security.server.port=8443
security.authorization.enabled=true
Every time, I want to check for those parameters in a file.
If one of them is missing, add it next to the one that is present (the port line goes before the authorization line),
or if both are missing, add them after "security.server.ip=".
For example:
If the file contains only this parameter:
security.authorization.enabled=true
Expected result:
security.server.port=8443
security.authorization.enabled=true
I am not able to put the whole process together: grep for the lines, then add what is missing after the matching line.
Something like this might be what you're looking for:
awk '
BEGIN {
    params["security.server.port=8443"]
    params["security.authorization.enabled=true"]
}
{
    for (param in params)
        if ($0 == param)
            params[param] = 1
    print
}
END {
    for (param in params)
        if (params[param] != 1)
            print param
}
' file
It's hard to say without more representative input and expected output.
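If the position of the added lines matters, the placement rules from the question can be sketched in Python. This is a guess at the intended behaviour, not a tested solution for the real file: the helper name ensure_params is made up, lines are compared exactly (so a port line with a different value counts as missing, as in the awk script), and when both parameters and the anchor line are absent they are simply appended at the end.

```python
PORT = "security.server.port=8443"
AUTH = "security.authorization.enabled=true"
ANCHOR = "security.server.ip="   # prefix of the line to insert after

def ensure_params(lines):
    """Return a copy of `lines` (without newlines) containing PORT and AUTH."""
    has_port, has_auth = PORT in lines, AUTH in lines
    if has_port and has_auth:
        return list(lines)
    out, added = [], False
    for line in lines:
        if has_auth and not has_port and line == AUTH:
            out.append(PORT)            # port goes just before authorization
            added = True
        out.append(line)
        if has_port and not has_auth and line == PORT:
            out.append(AUTH)            # authorization goes just after port
            added = True
        if not has_port and not has_auth and line.startswith(ANCHOR):
            out.extend([PORT, AUTH])    # both missing: add after the anchor
            added = True
    if not added:
        out.extend([PORT, AUTH])        # assumption: no anchor, so append
    return out
```

The input would be the file's lines with trailing newlines stripped, and the result can be written back joined with "\n".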

Put all methods into .h file automatically

In my implementation file (.m) I have about 30 methods. How can I put the declarations of all of them into the .h file automatically?
Seems hard to do properly with a regex, but you can do it with awk:
https://gist.github.com/1771131
#!/usr/bin/env awk -f
# print class and instance methods declarations from implementation
# Usage: ./printmethods.awk class.m or awk -f printmethods.awk class.m
/^[[:space:]]*@implementation/ {
    implementation = 1;
}
/^[[:space:]]*@end/ {
    implementation = 0;
}
/^[[:space:]]*[\-\+]/ {
    if(implementation) {
        method = 1;
        collect = "";
    }
}
/[^[:space:]]/ {
    if(implementation && method) {
        p = index($0, "{");
        if(p == 0) {
            if(collect == "")
                collect = $0
            else
                collect = collect $0 "\n";
        } else {
            method = 0;
            # trim white space and "{" from line end
            gsub("[\{[:space:]]*$", "", $0);
            collect = collect $0;
            # trim white space from start
            gsub("^[[:space:]]*", "", collect);
            print collect ";"
        }
    }
}
Write a piece of code that extracts all the method definitions (use regular expressions to detect them), then add each one to the .h file followed by ";\n".
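A rough sketch of that regex approach in Python follows. The names METHOD and method_declarations are made up for this example, and it assumes every method definition starts with - or + at the beginning of a line and that its signature runs up to the first opening brace; it does not check for @implementation ... @end or for commented-out code.

```python
import re

# A definition starts with "-" or "+" at the start of a line (after optional
# indentation) and its signature ends at the first "{".
METHOD = re.compile(r"^[ \t]*([-+][^{;]*?)\s*\{", re.MULTILINE)

def method_declarations(implementation):
    """Return one 'signature;' string per method definition found."""
    decls = []
    for match in METHOD.finditer(implementation):
        # Collapse line breaks and runs of spaces in multi-line signatures.
        signature = " ".join(match.group(1).split())
        decls.append(signature + ";")
    return decls
```

The returned strings can then be joined with newlines and pasted into the @interface section of the .h file.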
The program Accessorizer (on the Mac App Store for $5) is specifically intended for these obnoxious grunt work issues in Xcode. It can generate prototypes as well as property synthesizes, accessors, inits, etc.
Caveat: it's been a bit touchy and rough around the edges in my experience. It might, for example, not realize that a function is inside a multi-line comment, and thus provide an unwanted prototype for it. But even with those quirks in mind, it's saved me way more than $5 worth of time.
Their website: http://www.kevincallahan.org/software/accessorizer.html

Getting a string which ends with a string "lngt" in Lex

I am writing a lex script to tokenize C ASTs. I want to write a regex in lex to get a string that ends with a specific string "lngt" but does not include "lngt" in the final string returned by lex. So basically the string form would be (.*lngt), but I haven't been able to figure out how to do this in lex. Any advice/direction would be really helpful
Example: I have this line in my file:
#65 string_cst type: #71 strg: Reverse order of the given number is : %d lngt: 42
I want to retrieve the string after strg: and before lngt:, i.e. "Reverse order of the given number is : %d" (NOTE: this string could be composed of any characters possible).
Thanks.
This question needs an answer similar to the one I wrote here. It can be done by writing your own state machine in lex. It could also be done by writing some C code as shown in the cited answer or in the other texts cited below.
If we assume that the string you want is always between "strg" and "lngt" then this is the same as any other non-symmetric string delimiters.
%x STRG LETTERL LN LNG LNGT
ws [ \t\r\n]+
%%
<INITIAL>"strg: " {
BEGIN(STRG);
}
<STRG>[^l]*l {
yymore();
BEGIN(LETTERL);
}
<LETTERL>n {
yymore();
BEGIN(LN);
}
<LN>g {
yymore();
BEGIN(LNG);
}
<LNG>t {
yymore();
BEGIN(LNGT);
}
<LNGT>":" {
printf("String is '%s'\n", yytext);
BEGIN(INITIAL);
}
<LETTERL>[^n] {
BEGIN(STRG);
yymore();
}
<LN>[^g] {
BEGIN(STRG);
yymore();
}
<LNG>[^t] {
BEGIN(STRG);
yymore();
}
<LNGT>[^:] {
BEGIN(STRG);
yymore();
}
<INITIAL>{ws} /* skip */ ;
<INITIAL>. /* skip anything not in the string */
%%
To quote my other answer:
There are suggested solutions on several university compiler courses. The one that explains it well is here (at Manchester). Which cites a couple of good books which also cover the problems:
J.Levine, T.Mason & D.Brown: Lex and Yacc (2nd ed.)
M.E.Lesk & E.Schmidt: Lex - A Lexical Analyzer Generator
The two techniques described are to use Start Conditions to explicitly specify the state machine, or manual input to read characters directly.
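Outside lex, the "read the characters yourself" idea reduces to a plain delimiter search. The Python sketch below shows the intended result for the sample line; the function name extract_strg and its default delimiters are assumptions for this example, and it takes the last " lngt:" on the line as the end marker so that the string itself may contain the letters lngt.

```python
def extract_strg(line, start="strg: ", end=" lngt:"):
    """Return the text between `start` and the last `end` in `line`,
    or None if either delimiter is missing."""
    begin = line.find(start)
    if begin == -1:
        return None
    begin += len(start)
    stop = line.rfind(end, begin)   # last occurrence, searched after `start`
    if stop == -1:
        return None
    return line[begin:stop]
```

Unlike the lex rules above, the returned text does not include the trailing lngt: marker.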