Extract Multiple values from a content using regular expression in perl - perl

I have content like this:
"emailAddress":"akashu87#gmail.com","UserName":"Akash Udupa","active":true,"emailAddress":"coolrohit#rediffmail.com","UserName":"Rohit Hegde","active":true,"emailAddress":"manohar_k#rediffmail.com","UserName":"Manohar Karnam","active":true,"emailAddress":"satishgk#hotmail.com","UserName":"Satish GK","active":true
I want to write only the values of UserName to a CSV file using Perl, like the following:
Akash Udupa
Rohit Hegde
Manohar Karnam
Satish GK
I am sure you will ask me what I have tried. The problem is that I am very new to Perl. Can anyone help me with the Perl code, please?
Thanks in advance.

There are two ways to do this: the right way and the fragile way. Since your JSON has had its braces and brackets stripped away, you've already started down the path to the fragile way:
my $string = q{"emailAddress":"akashu87#gmail.com","UserName":"Akash Udupa","active":true,"emailAddress":"coolrohit#rediffmail.com","UserName":"Rohit Hegde","active":true,"emailAddress":"manohar_k#rediffmail.com","UserName":"Manohar Karnam","active":true,"emailAddress":"satishgk#hotmail.com","UserName":"Satish GK","active":true};
while ( $string =~ m/"UserName"\s*:\s*"([^"]+)"/g ) {
    print "$1\n";
}
This anchors to the "UserName" tag, and allows whitespace (but does not require it) between the tag and its value. It then looks for a double-quote, and captures everything until the next quote into $1.
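For readers who want to experiment with the same pattern outside Perl, here is a minimal Python sketch of the regex approach described above. The pattern is the one from the answer; the shortened input string and the helper name extract_usernames are only illustrative assumptions.

```python
import re

# Shortened copy of the flattened input from the question
# (braces and brackets already stripped).
text = (
    '"emailAddress":"akashu87#gmail.com","UserName":"Akash Udupa","active":true,'
    '"emailAddress":"coolrohit#rediffmail.com","UserName":"Rohit Hegde","active":true'
)

# Same pattern as the Perl answer: anchor on the "UserName" tag, allow
# optional whitespace around the colon, capture up to the next double quote.
USERNAME_RE = re.compile(r'"UserName"\s*:\s*"([^"]+)"')


def extract_usernames(s):
    """Return every captured UserName value, in order of appearance."""
    return USERNAME_RE.findall(s)


usernames = extract_usernames(text)
for name in usernames:
    print(name)
```

Like the Perl version, this is still the fragile way: a value containing an escaped double quote would be cut short.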
A brief introduction to Perl's regexes is contained in perlrequick, which comes with Perl. My regex solution doesn't use any constructs not explained in that document. As a matter of fact, perlintro, which should be considered required reading for Perl users, provides information sufficient to this task.
Since it's possible that the logic that stripped away the JSON might have broken something, and since the JSON might possibly throw something at you that our one-off regular expression isn't equipped to handle, it would be wise to revert to the original un-adulterated JSON, and parse it with a proper parser, like this:
use JSON;
my $json = <<'EOJSON';
[
{
"emailAddress": "akashu87#gmail.com",
"UserName": "AkashUdupa",
"active": true
},
{
"emailAddress": "coolrohit#rediffmail.com",
"UserName": "RohitHegde",
"active": true
},
{
"emailAddress": "manohar_k#rediffmail.com",
"UserName": "ManoharKarnam",
"active": true
},
{
"emailAddress": "satishgk#hotmail.com",
"UserName": "SatishGK",
"active": true
}
]
EOJSON
print "$_->{UserName}\n" for @{ decode_json($json) };
If the JSON module is too heavy-weight for you, look at JSON::Tiny, which is minimal, well tested, and free of dependencies.
Both the regex and the parser approach will work with the original JSON, so you may find that your code can be simplified by just eliminating the section that strips brackets and braces from the original JSON. Once you've done that, the JSON parser solution can be one line of code. It's a lucky day when removing code can make the code more robust without removing features.
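As a rough cross-check of the claim that both approaches work on the original JSON, here is a small Python sketch (not Perl) that runs a parser and the same regex over an unstripped array and then writes one name per row as CSV. The two-record sample and the variable names are assumptions for illustration only.

```python
import csv
import io
import json
import re

# Assumed sample of the original, unstripped JSON (two records only).
original_json = """
[
  {"emailAddress": "akashu87#gmail.com", "UserName": "Akash Udupa", "active": true},
  {"emailAddress": "coolrohit#rediffmail.com", "UserName": "Rohit Hegde", "active": true}
]
"""

# Parser approach: decode the array and pick the field from each record.
names_from_parser = [record["UserName"] for record in json.loads(original_json)]

# Regex approach: the same one-off pattern also works on the intact JSON.
names_from_regex = re.findall(r'"UserName"\s*:\s*"([^"]+)"', original_json)

# Write one name per row; an in-memory buffer stands in for a real CSV file.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
for name in names_from_parser:
    writer.writerow([name])
csv_text = buffer.getvalue()
print(csv_text, end="")
```

Swapping the buffer for open("names.csv", "w", newline="") would write a real file; the file name is hypothetical.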

Related

Array in QueryParams wiremock

I need to pass through queryParams in wireMock
Person=[{age=6,name=AAA}]
I've been getting a 404 when I test with a curl request. Any leads would be appreciated.
The only thing I know is
"Person": {
"matches": "^[a-zA-Z0-9_-]*$"
}
I'm not sure on how to validate the array.
Your regex matching is incorrect. The value of Person is [{age=6,name=AAA}], so your matcher will need to capture all of that to be considered a match. You've used some special characters that need escaping, as well as an invalid capture group (which I don't think we really need here.) Without any more guidance from your post on what you precisely need to match, here are two rough ideas. Note that the backslashes are doubled because the pattern sits inside a JSON string:
"Person": {
"matches": "^\\[.*\\]$"
}
The above just matches any character, any number of times, between brackets.
"Person": {
"matches": "^\\[\\{age=\\d*,name=\\w*\\}\\]$"
}
The above matches any digit character, any number of times, as well as any word character, any number of times, so long as the rest of the value looks like [{age=,name=}].
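If you want a quick sanity check of these patterns before putting them into a mapping, something like the following Python sketch may help. It assumes Python's re module behaves like WireMock's Java regex engine for these simple constructs, which is true for the syntax used here but is not a WireMock test; the helper name matches is made up.

```python
import re

# The query parameter value from the question.
value = "[{age=6,name=AAA}]"

# Patterns as plain regexes (single backslashes; JSON needs them doubled).
original = r"^[a-zA-Z0-9_-]*$"            # the asker's pattern
loose = r"^\[.*\]$"                       # anything between brackets
strict = r"^\[\{age=\d*,name=\w*\}\]$"    # digits for age, word chars for name


def matches(pattern, text):
    """Return True if the anchored pattern matches the text."""
    return re.search(pattern, text) is not None


for label, pattern in [("original", original), ("loose", loose), ("strict", strict)]:
    print(label, matches(pattern, value))
```

The original pattern fails because brackets, braces, "=" and "," are not in its character class, which would be consistent with the 404.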
I would advise you to take a look at a regex tool to work on forming these. My personal favorite is regex101.

Where to find Pokedex Entry Description for Pokemon (PokeAPI)

Is there a way to get a Pokemon's Pokedex entry description? I've looked through the API documentation to no avail. The closest thing I could find was a Pokedex JSON, but even that didn't contain the Pokemon's Pokedex entry description alongside the Pokemon species. Unless I didn't look hard enough, do you know where I can fetch a Pokemon's dex entry description?
Thanks
I guess you missed flavor_text_entries in API v2.
This is for Pikachu:
GET https://pokeapi.co/api/v2/pokemon-species/25
Part of the output:
{
"flavor_text": "It has small electric sacs on both its\ncheeks. If threatened, it looses electric\ncharges from the sacs.",
"language": {
"name": "en",
"url": "https://pokeapi.co/api/v2/language/9/"
},
"version": {
"name": "firered",
"url": "https://pokeapi.co/api/v2/version/10/"
}
}
Similarly it's available for other game versions as well.
To add to @Rinkesh P's answer, you can replace the '\n's with spaces. '\n' is the escape sequence for a newline (line feed) character, used in programming languages such as C, Perl, and Java. In most cases you're probably just going to load the description and display it in HTML, so just write a function that replaces it with a space (' ').
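A minimal Python sketch of such a function, assuming the response has already been decoded into a dict shaped like the excerpt above. The first entry is the firered text from this thread; the second entry is a made-up placeholder, and the function name english_flavor_text is hypothetical.

```python
# Trimmed stand-in for a decoded pokemon-species response.
species = {
    "flavor_text_entries": [
        {
            # Hypothetical non-English entry, present only to show the filter.
            "flavor_text": "(placeholder text)",
            "language": {"name": "fr"},
            "version": {"name": "firered"},
        },
        {
            "flavor_text": "It has small electric sacs on both its\ncheeks. "
                           "If threatened, it looses electric\ncharges from the sacs.",
            "language": {"name": "en"},
            "version": {"name": "firered"},
        },
    ]
}


def english_flavor_text(species_data, version=None):
    """Return the first English entry (optionally for one game version)
    with newlines and other whitespace runs collapsed to single spaces."""
    for entry in species_data["flavor_text_entries"]:
        if entry["language"]["name"] != "en":
            continue
        if version is not None and entry["version"]["name"] != version:
            continue
        return " ".join(entry["flavor_text"].split())
    return None


description = english_flavor_text(species, version="firered")
print(description)
```

Splitting on whitespace and re-joining also handles other control characters that may appear in the text, not just '\n'.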

TOML multi-level table syntax inside array -- illegal or not

I'm seeing an error during parsing with the Python and JS parsers. I can't seem to find an example in the TOML repo about whether this should be accepted or not:
[[somearray]]
one.two = false # fails, '.' not allowed in key
I know I can express this as:
[[somearray]]
[somearray.one]
two = false # OK
The TOML readme offers this example (among others):
[fruit]
apple.color = "red"
apple.taste.sweet = true
which does not involve arrays but seems to legitimize this syntax.
I also observe that a plain file with this content:
apple.color = "red"
is rejected. In conclusion, until you enter "table mode" (so to say) with an actual bracketed table, keys can only be singles.
Now again, in the TOML readme it says (although in the tables section):
"Dotted keys define everything to the left of each dot as a table."
Obviously, if mainstream parsers choke on it, it's not a good idea to use it, but I'd like to understand if/why it's a known no-no. Is there some ambiguity I'm not seeing?
These cases are confirmed valid TOML, per the 1.0 spec, which adds many more examples: https://github.com/toml-lang/toml/blob/1.0.0/toml.md
The following is definitely valid TOML syntax:
[[somearray]]
one.two = false
It should result in a model that is equivalent to this JSON:
{
"somearray": [
{
"one": {
"two": false
}
}
]
}
It can be validated here.
Dotted keys should indeed define everything to the left of each dot as a table.

Mongo query with regex fails when backslash\newline is there in a field

Hi, I have a field called "Address" in a user collection. Users enter their address in a textarea in my application, and MongoDB stores the line breaks as \r\n, like the following:
{
"_id": ObjectId("56a9ba0ffbe0856d4f8b456d"),
"address": "HOUSE NO. 3157,\r\nSECTOR 50-D",
"pincode": "",
},
{
"_id": ObjectId("56a9ba0ffbe0856d4f8b456d"),
"address": "HOUSE NO. 3257,\r\nSECTOR 50-C",
"pincode": "",
}
So now, when I run a search query on "address" like the following:
$guardianAdd = $dm->getRepository('EduStudentBundle:GuardianAddress')->findBy(array(
    'address' => new \MongoRegex('/.*' . $data['address'] . '.*/i'),
    'isDelete' => false
));
echo count($guardianAdd); die;
it does not return any results. My search keyword is: "HOUSE NO.3157 SECTOR 50-D".
However, if I search for HOUSE NO. 3157, it gives the correct result.
Please advise how to fix this. Thanks in advance.
First of all, the leading and trailing .* are redundant. The unanchored regexps /.*aaa.*/ and /aaa/ match exactly the same strings.
Second, you probably need to use the multiline modifier: /pattern/im
Finally, it is not quite clear what you want to fix. The best thing you can do is to provide some basic explanation of regex syntax in the search form, so users can search properly, e.g. HOUSE NO.*3157.*SECTOR 50-D to get the best results.
You can make some bold assumptions and build the pattern with something like
$pattern = implode('\W+', preg_split('/\W+/', $data['address']));
which will give you a regexp HOUSE\W+NO\W+3157\W+SECTOR\W+50\W+D for different kind of HOUSE NO.3157 SECTOR 50-D requests, but it will cut all the regex flexibility available with bare input, and eventually will result with unexpected response anyway. You can follow this slippery slope and end up with your own query DSL to compile to regex, but I doubt it can be any better or more convenient than pure regex. It will be more error prone for sure.
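To see what that "bold assumption" buys you, here is a small Python sketch of the same idea (not PHP, and not tied to MongoDB): split the search text on runs of non-word characters, join the words with \W+, and test the result against the two stored addresses from the question. The function name loose_pattern is hypothetical.

```python
import re

# The two stored addresses from the question, including the \r\n line breaks.
stored = [
    "HOUSE NO. 3157,\r\nSECTOR 50-D",
    "HOUSE NO. 3257,\r\nSECTOR 50-C",
]


def loose_pattern(user_input):
    """Join the words of the input with \\W+ so that any punctuation,
    spacing or line breaks between them is tolerated."""
    words = [w for w in re.split(r"\W+", user_input) if w]  # drop empty edge pieces
    return r"\W+".join(words)


pattern = loose_pattern("HOUSE NO.3157 SECTOR 50-D")
print(pattern)

hits = [addr for addr in stored if re.search(pattern, addr, re.IGNORECASE)]
print(hits)
```

As the answer warns, this throws away any regex syntax the user typed, so it trades flexibility for tolerance of punctuation and line breaks.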
Asking the right question to get the right answer holds true not only on SO, but also in your application. Unfortunately there is no general solution for searching for something that people have in mind but fail to ask for. I believe that in your particular case the best code is no code.

How to escape some characters in postgresql

I have this data in one column in postgresql
{
"geometry":{
"status":"Point",
"coordinates":[
-122.421583,
37.795027
]
},
and I am using this query:
select * from students where data_json LIKE '%status%' ;
The above query returns results, but this one does not:
select * from students where data_json LIKE '%status:%' ;
How can I fix that?
Of course the 2nd one doesn't find a match, there's no status: text in the value. I think you wanted:
select * from students where data_json LIKE '%"status":%'
... however, like most cases where you attempt text pattern matching on structured data this is in general a terrible idea that will bite you. Just a couple of problem examples:
{
"somekey": "the value is \"status\": true"
}
... where "status": appears as part of the text value and will match even though it shouldn't, and:
{
status : "blah"
}
where status has no quotes and there is a space between the key and the colon. As far as JavaScript is concerned this is the same as "status": but it won't match.
If you're trying to find fields within json or extract fields from json, do it with a json parser. PL/V8 may be of interest, or the json libraries available for tools like pl/perl, pl/pythonu, etc. Future PostgreSQL versions will have functions to get a json key by path, test if a json value exists, etc, but 9.2 does not.
At this point you might be thinking "why don't I use regular expressions". Don't go there, you do not want to try to write a full JSON parser in regex. this blog entry is somewhat relevant.
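To make the difference concrete, here is a minimal Python sketch that compares a substring test (roughly what LIKE '%"status":%' does) with a real JSON parse. Both sample documents and both function names are assumptions made up for illustration; this runs outside the database and is not a PostgreSQL feature.

```python
import json

# Valid JSON with a space before the colon: the substring test misses it.
spaced = '{"geometry": {"status" : "Point", "coordinates": [-122.421583, 37.795027]}}'

# "status" exists, but under a different object: the substring test
# reports a hit even though geometry has no status key.
misplaced = '{"geometry": {"coordinates": [1, 2]}, "meta": {"status": "archived"}}'


def naive_match(raw):
    """Substring test, comparable to LIKE '%"status":%'."""
    return '"status":' in raw


def has_geometry_status(raw):
    """Parse the document and look for the key where it is expected."""
    doc = json.loads(raw)
    geometry = doc.get("geometry")
    return isinstance(geometry, dict) and "status" in geometry


for label, raw in [("spaced", spaced), ("misplaced", misplaced)]:
    print(label, naive_match(raw), has_geometry_status(raw))
```

The substring test is wrong in both cases, once as a false negative and once as a false positive, while the parsed check answers the question that was actually being asked.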