Notepad++ conditional Macro


I'd like to find every line that starts with doi = { or url = { and remove it from the file. For example, in the following data I'd like to remove the url and doi fields.
I don't know how to do this with the Replace command, since I don't know the complete string in advance, and I don't see how a macro would help when these lines are not at regular distances from each other.
@article{Carrion2006,
author = {Carrion, M. and Arroyo, J.M.},
doi = {10.1109/TPWRS.2006.876672},
journal = {IEEE Trans. Power Syst.},
title = {{Bla Bla Bla 1}},
pages = {1371--1378},
url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1664974},
year = {2006}
}
@article{Chandrasekaran2012,
author = {Chandrasekaran, K. and Hemamalini, S. and Simon, Sishaj P. and Padhy, Narayana Prasad},
issn = {03787796},
journal = {Electr. Power Syst. Res.},
pages = {109--119},
publisher = {Elsevier B.V.},
title = {{Bla Bla Bla 2}},
url = {http://linkinghub.elsevier.com/retrieve/pii/S0378779611002471},
volume = {84},
year = {2012}
}

How about a regular-expression replace (Search Mode: Regular expression):
Find what: ^(?:doi|url)\s*=\s*[^}]+\},\R
Replace with: (leave empty)
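For anyone who prefers to script the cleanup, here is a minimal Python sketch of the same idea. It is an adaptation, not the Notepad++ behaviour itself: Python's re module has no \R, so \r?\n stands in for it, and the function name strip_doi_url is made up for illustration.

```python
import re

# Same pattern as the Notepad++ search, with \R replaced by \r?\n
# because Python's re module does not support \R.
FIELD_LINE = re.compile(r"^(?:doi|url)\s*=\s*[^}]+\},\r?\n", re.MULTILINE)


def strip_doi_url(bibtex: str) -> str:
    """Remove whole doi/url field lines from a BibTeX string."""
    return FIELD_LINE.sub("", bibtex)


sample = """@article{Carrion2006,
author = {Carrion, M. and Arroyo, J.M.},
doi = {10.1109/TPWRS.2006.876672},
journal = {IEEE Trans. Power Syst.},
url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1664974},
year = {2006}
}
"""

cleaned = strip_doi_url(sample)
print(cleaned)
```

Like the original pattern, this only matches a field that ends with "}," so a doi or url that is the last field of an entry (no trailing comma) would be left in place.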


Dart language, How to extract specific part from a string

I want to extract this part from the full string:
Will
Full string:
"Will posted an update in the group Testing"
Another example:
longerName
Full string:
"longerName posted an update in the group Testing"
Any help, please?

This could work:
String str = "Will posted an update in the group Testing";
String result = str.split("</a>")[0] + "</a>";

Off the top of my head, this could work.
Regex
<a\b[^>]*>(.*?)</a>
Dart
void main() {
  final myString = "Will bla bla bla";
  final regexp = RegExp(r'<a\b[^>]*>(.*?)</a>');
  // For a single match, use: regexp.firstMatch(myString)?.group(0)
  // Print every anchor element found in the string.
  for (final match in regexp.allMatches(myString)) {
    print(match.group(0));
  }
}
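The same regex can be tried quickly in Python. This is only a sketch: the input string below is hypothetical (the href values are invented for illustration), and a regex like this is fine for short, predictable snippets but not a general HTML parser.

```python
import re

# Same pattern as above: group(0) is the whole anchor element,
# group(1) is the text between the tags.
ANCHOR = re.compile(r"<a\b[^>]*>(.*?)</a>")

# Hypothetical input; the href values are made up for illustration.
text = (
    '<a href="https://example.com/members/will/">Will</a> posted an update '
    'in the group <a href="https://example.com/groups/testing/">Testing</a>'
)

first = ANCHOR.search(text)
first_element = first.group(0)   # the full first anchor element
first_label = first.group(1)     # just its link text
all_labels = ANCHOR.findall(text)

print(first_element)
print(first_label)
print(all_labels)
```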

Question mark in reserved keyword

I am trying to write a parser for LOLCODE. GOD, WHAT AM I DOING???
(just in case, to explain those strange words =) )
So, I need to have tokens for O RLY? and YA RLY.
I am trying to do it like this:
reserved = { ...,
'O': 'IF_O',
'RLY?': 'IF_RLY',
'YA': 'THEN_YA',
'RLY': 'THEN_RLY', ...}
tokens = reserved.values() + (...)
t_IF_O = r'O'
t_IF_RLY = r'RLY\?'
t_THEN_YA = r'YA'
t_THEN_RLY = r'RLY'
But when I write O RLY?, it is parsed as IF_O THEN_RLY followed by an undefined symbol ?.
If I replace RLY? with, for example, RLYY (changing the dictionary entry 'RLY?': 'IF_RLY' to 'RLYY': 'IF_RLY' and setting t_IF_RLY = r'RLYY'), then it works for O RLYY.
So I think this is a problem with question marks in reserved words, and I do not know a workaround for it.

Sorry, but I can't reproduce this problem. Here is a working sample (ply=3.10, python=3.6):
import ply.lex as lex
tokens = (
    'IF_O',
    'IF_RLY',
    'THEN_YA',
    'THEN_RLY',
)
t_IF_O = r'O'
t_IF_RLY = r'RLY\?'
t_THEN_YA = r'YA'
t_THEN_RLY = r'RLY'
t_ignore = ' \t'
def t_error(t):
    # Report the offending character, then skip it so lexing can continue.
    print(t)
    t.lexer.skip(1)
lexer = lex.lex()
lexer.input('O RLY?')
while True:
    token = lexer.token()
    if token is None:
        break
    print(token)
And it prints:
LexToken(IF_O,'O',1,0)
LexToken(IF_RLY,'RLY?',1,2)
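One possible explanation for the symptom in the question (not confirmed by the asker) is rule ordering: as far as I know, PLY sorts string-defined token rules by decreasing regex length, so RLY\? is tried before RLY in the sample above. The sketch below does not use PLY at all. It is a hand-rolled tokenizer, with a hypothetical tokenize helper, that only illustrates how the order of alternatives changes the result.

```python
import re


def tokenize(source, specs):
    """Tokenize with one combined regex; earlier specs take priority."""
    master = re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in specs)
    )
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos] in " \t":      # skip whitespace, like t_ignore
            pos += 1
            continue
        match = master.match(source, pos)
        if match is None:             # no rule matches this character
            tokens.append(("ILLEGAL", source[pos]))
            pos += 1
            continue
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Longer pattern first: RLY\? wins over RLY.
longest_first = [
    ("IF_O", r"O"), ("IF_RLY", r"RLY\?"), ("THEN_YA", r"YA"), ("THEN_RLY", r"RLY"),
]
# Shorter pattern first: RLY matches and the '?' is left over.
shortest_first = [
    ("IF_O", r"O"), ("THEN_RLY", r"RLY"), ("THEN_YA", r"YA"), ("IF_RLY", r"RLY\?"),
]

good = tokenize("O RLY?", longest_first)
bad = tokenize("O RLY?", shortest_first)
print(good)
print(bad)
```

With the shorter alternative first, the output mirrors what the question describes: IF_O, THEN_RLY, and a stray ?.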

Blank results while using Tokens Regex rules to identify Named Entities

I am struggling to write a correct rule, involving macros, that identifies organizations in a text.
To identify Matrix Inc. in:
With it's rising share prices Matrix Inc. has come out a winner this quarter.
I am trying to check for words like Inc within the entity, so I defined a macro and rules as below:
$ORGANIZATION_TITLES = "/pharmaceuticals?|group|corp|corporation|international|co.?|inc.?|incorporated|holdings|motors|ventures|parters|llc|limited liability corporation|pvt.? ltd.?/"
ENV.defaults["stage"] = 1
{
ruleType: "tokens",
pattern: ([$ORGANIZATION_TITLES]),
action: ( Annotate($0, ner, "ORGANIZATION") )
}
ENV.defaults["stage"] = 2
{ ( [{tag:NNP}]+? ($ORGANIZATION_TITLES)) => ORGANIZATION }
I also tried using bindings and then applying the rule:
env.bind("$ORGANIZATION_TITLES", TokenSequencePattern.compile(env,"/pharmaceuticals?|group|corp|corporation|international|co.?|inc.?|incorporated|holdings|motors|ventures|parters|llc|limited liability corporation|pvt.? ltd.?/"));
Nothing seems to be working. I need to define more complex pattern rules involving macros like:
pattern: ( [ { ner:PERSON } ]+ /,/*? ($TITLES_CORPORATE_PREFIXES)*? $TITLES_CORPORATE+? /,/*? /of|for/? /,/*? [ { ner:ORGANIZATION } ]+ )
where $TITLES_CORPORATE_PREFIXES and $TITLES_CORPORATE are macros similar to $ORGANIZATION_TITLES.
What am I doing wrong?
EDIT
Here's my code:
public static void main(String[] args)
{
    String rulesFile = "D:\\Workspace\\resource\\NERRulesFile.txt";
    String dataFile = "D:\\Workspace\\resource\\GoldSetSentences.txt";
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    // pipeline.addAnnotator(new TokensRegexAnnotator(rulesFile));
    String inputText = "Bill Edelman , CEO and Chairman , for Paragonix commented on the Supply Agreement with Essential Pharmaceuticals .";
    Annotation document = new Annotation(inputText.toLowerCase());
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);
    CoreMapExpressionExtractor extractor = CoreMapExpressionExtractor.createExtractorFromFiles(TokenSequencePattern.getNewEnv(), rulesFile);
    /* Next we can go over the annotated sentences and extract the annotated words,
       Using the CoreLabel Object */
    for (CoreMap sentence : sentences)
    {
        List<MatchedExpression> matched = extractor.extractExpressions(sentence);
        for(MatchedExpression phrase : matched){
            // Print out matched text and value
            System.out.println("matched: " + phrase.getText() + " with value " + phrase.getValue());
            // Print out token information
            CoreMap cm = phrase.getAnnotation();
            for (CoreLabel token : cm.get(TokensAnnotation.class))
            {
                String word = token.get(TextAnnotation.class);
                String lemma = token.get(LemmaAnnotation.class);
                String pos = token.get(PartOfSpeechAnnotation.class);
                String ne = token.get(NamedEntityTagAnnotation.class);
                System.out.println("matched token: " + "word="+word + ", lemma="+lemma + ", pos=" + pos + "ne=" + ne);
            }
        }
    }
}

Here is a rules file that should work:
ner = { type: "CLASS", value: "edu.stanford.nlp.ling.CoreAnnotations$NamedEntityTagAnnotation" }
$ORGANIZATION_TITLES = "/inc\.|corp\./"
{ pattern: ([{pos: NNP}]+ $ORGANIZATION_TITLES), action: ( Annotate($0, ner, "RULE_FOUND_ORG") ) }
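One difference between the macro in the question and the one in this rules file is the escaped dot. The Python sketch below illustrates that difference with plain regexes. It is not TokensRegex: it assumes, as I understand it, that a token-level regex has to match the whole token, which re.fullmatch imitates here, and the token list is made up for illustration.

```python
import re

# Made-up tokens for illustration.
tokens = ["inc", "inc.", "inch", "corp."]

# Unescaped dot, as in the question: '.' matches any character.
loose = re.compile(r"inc.?")
# Escaped dot, as in the rules file above: only a literal '.' matches.
strict = re.compile(r"inc\.|corp\.")

loose_hits = [t for t in tokens if loose.fullmatch(t)]
strict_hits = [t for t in tokens if strict.fullmatch(t)]

print(loose_hits)
print(strict_hits)
```

The unescaped form also accepts tokens such as inch, while the escaped form requires the trailing period to be present in the token.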
I have made some changes to our code base to make the TokensRegexAnnotator more easily accessible. You will need to get the latest version from GitHub: https://github.com/stanfordnlp/CoreNLP
If you run this command or the equivalent Java API call, it should work:
java -Xmx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,tokensregex -tokensregex.rules organization.rules -file samples.txt -outputFormat text -tokensregex.caseInsensitive

TYPO3 : editing page title

I am trying to get the page title and transform it as in the following example.
Page name: My tools
Final string: my_tools (so that I can use it through markers in my CSS classes)
I know how to get page title using:
HEADERTITLE = TEXT
HEADERTITLE.data = page : title
But how can I transform this string?
Thank you for your help!

To convert the case and replace some characters, you may use these TypoScript settings:
HEADERTITLE = TEXT
HEADERTITLE {
data = page:title
### replace whitespace
replacement.10 {
search = #\s#i
replace = _
useRegExp = 1
}
/*
### replace all special characters
replacement.10 {
search = #\W#i
replace =
useRegExp = 1
}
*/
### transform string to lowercase
case = lower
}
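To check what the two active steps should produce, here is a small Python sketch of the same transformation (replace each whitespace character with an underscore, then lowercase). It only mirrors the logic for testing purposes; the function name title_to_class is hypothetical and nothing here runs inside TYPO3.

```python
import re


def title_to_class(title: str) -> str:
    """Replace each whitespace character with '_' and lowercase the result."""
    return re.sub(r"\s", "_", title).lower()


result = title_to_class("My tools")
print(result)  # my_tools
```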

How to ignore the last unknown characters at the end of the JSON string

{
education = (
{
school = {
id = 108102169223234;
name = psss;
};
type = College;
year = {
id = 142833822398097;
name = 2010;
};
}
);
}
<!-- 1.2398s -->
The above gives me the error "NSLocalizedDescription=Unrecognised leading character".

That is not even close to valid JSON; check it at http://www.jsonlint.com/
Are you in charge of generating the feed? If so, I think it would be much better to fix the problem at the source than to refactor your code to accommodate whatever is being returned.
Are you using a JSON framework in Xcode to parse that string?
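If the feed cannot be fixed and the payload is valid JSON followed by stray trailing text, one option is to parse only the leading JSON document and keep the rest separately. The Python sketch below shows that idea with the standard library's JSONDecoder.raw_decode. It assumes a valid JSON prefix, which the dump in the question is not, and both the payload and the helper name parse_json_prefix are hypothetical.

```python
import json


def parse_json_prefix(text: str):
    """Parse the JSON document at the start of text; return (value, leftover)."""
    stripped = text.lstrip()  # raw_decode does not skip leading whitespace
    value, end = json.JSONDecoder().raw_decode(stripped)
    return value, stripped[end:]


# Hypothetical payload: valid JSON followed by a timing comment.
payload = '{"education": [{"type": "College"}]}\n<!-- 1.2398s -->'

data, trailing = parse_json_prefix(payload)
print(data)
print(repr(trailing))
```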

We Keep Coding
