how to replace particular field in file - perl

I have a file with these lines:
"name": "test","Address": "UK" "currency": "£" no:121212 ,
"name": "test1","Address": "UK" "currency": "£" no:12123212 ,
"name": "test2","Address": "UK" "currency": "£" no:121223212 ,
"name": "test3","Address": "UK" "currency": "£" no:121223212 ,
"name": "test4","Address": "UK" "currency": "£" no:121223212 ,
I want to replace every no value with asterisks:
"name": "test","Address": "UK" "currency": "£" no:***** ,
"name": "test1","Address": "UK" "currency": "£" no:***** ,
"name": "test2","Address": "UK" "currency": "£" no:***** ,
"name": "test3","Address": "UK" "currency": "£" no:***** ,
"name": "test4","Address": "UK" "currency": "£" no:***** ,
and write the result back to the file.

This one-liner should do it:
perl -i.bak -pe 's/(?<=no:)\d+/*****/' filename

$ perl -pe 's/no:\d+/no:*****/' < input_file > output_file

cat input | perl -lne 's/^(.+)no:(\d+)(.*)/print"$1no:","*" x length($2),"$3"/e' > output

I might be tempted to use File::Map to change the file in place. If I do that, however, I have to replace the digits one-for-one since this won't move the other characters in the file:
use File::Map qw(map_file);
map_file my $map, 'test.txt', '+<';
$map =~ s/(?<=no:)(\d+)(?=\s*,$)/ '*' x length $1 /meg;
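The two masking styles used in the answers above (a fixed run of stars versus one star per digit) can be compared side by side. This is a minimal sketch in Python rather than Perl; the function name mask_numbers and its fixed_width parameter are made up for illustration, and the sample line is trimmed from the question.

```python
import re

# Matches the run of digits that directly follows "no:".
NO_FIELD = re.compile(r"(?<=no:)\d+")


def mask_numbers(line, fixed_width=None):
    """Replace the digits after 'no:' with asterisks.

    With fixed_width=None each digit becomes one star (length preserved);
    otherwise exactly fixed_width stars are written.
    """
    def stars(match):
        width = len(match.group()) if fixed_width is None else fixed_width
        return "*" * width

    return NO_FIELD.sub(stars, line)


sample = '"name": "test","Address": "UK" no:121212 ,'
print(mask_numbers(sample))                 # one star per digit
print(mask_numbers(sample, fixed_width=5))  # always five stars
```

Preserving the length matters only for the in-place File::Map approach; for the stream-based one-liners either style works.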

Related

Getting extra variable in XML to JSON conversion using perl

My program produces an extra $t key in the output. Can anyone help me with this?
use XML::XML2JSON;
# XML content
my $XML = '<file><sno>1</sno><process>VALID</process><validation_type>C</validation_type><file_type>HTML</file_type><line>2</line><column>78</column><status>0</status><type>Warning</type><code>001</code><rule>aligning content.</rule><desc>Check that non-breaking space.</desc></file>';
my $XML2JSON = XML::XML2JSON->new();
my $JSON = $XML2JSON->convert($XML);
print $JSON;
Output, with the extra $t key:
{
"#encoding": "UTF-8",
"#version": "1.0",
"file": {
"status": {
"$t": "0"
},
"rule": {
"$t": "aligning content."
},
"validation_type": {
"$t": "C"
},
"process": {
"$t": "VALID"
},
"sno": {
"$t": "1"
},
"file_type": {
"$t": "HTML"
},
"desc": {
"$t": "Check that non-breaking space."
},
"type": {
"$t": "Warning"
},
"code": {
"$t": "001"
},
"line": {
"$t": "2"
},
"column": {
"$t": "78"
}
}
}
Expected output is:
{
"sno": "1",
"process": "VALID",
"validation_type": "C",
"file_type": "HTML",
"line": "2",
"column": "78",
"status": "0",
"type": "Warning",
"code": "001",
"rule": "aligning content.",
"desc": "Check that non-breaking space."
}
The $t is the content key, as mentioned in the XML::XML2JSON documentation.
If your intention is simply to convert XML to JSON, I would recommend parsing with XML::Simple and then encoding the result with JSON.pm.
Code below:
#!/usr/bin/perl
use strict;
use warnings;
use JSON;
use XML::Simple;
# Create the parser object
my $xmlSimple = XML::Simple->new;
my $XML = '<file><sno>1</sno><process>VALID</process><validation_type>C</validation_type><file_type>HTML</file_type><line>2</line><column>78</column><status>0</status><type>Warning</type><code>001</code><rule>aligning content.</rule><desc>Check that non-breaking space.</desc></file>';
# XMLin accepts either a string of XML or the name of an XML file
my $dataXML = $xmlSimple->XMLin($XML);
my $jsonString = encode_json($dataXML);
print $jsonString;
Output:
{
"process":"VALID",
"line":"2",
"column":"78",
"type":"Warning",
"file_type":"HTML",
"sno":"1",
"status":"0",
"rule":"aligning content.",
"code":"001",
"desc":"Check that non-breaking space.",
"validation_type":"C"
}
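The flattening that the answer relies on (each child element becomes a plain "tag": "text" pair) can also be sketched with a standard library only. This is an illustrative Python version, not the Perl modules' behaviour in general; it assumes every child of the root is a leaf with text and no attributes, and the name xml_to_flat_json is made up.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_flat_json(xml_text):
    """Map each child element of the root to a "tag": "text" pair."""
    root = ET.fromstring(xml_text)
    # Assumption: children are leaves with text only (no attributes, no nesting).
    flat = {child.tag: (child.text or "") for child in root}
    return json.dumps(flat, indent=2)


# Shortened version of the XML from the question.
xml = "<file><sno>1</sno><process>VALID</process><line>2</line></file>"
print(xml_to_flat_json(xml))
```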

MongoDB - MongoImport of JSON (jsonl) - Rename, change types and add fields

I'm new to MongoDB and have four problems importing a large (16 GB) JSONL file into my MongoDB (a simple PSA cluster).
A sample entry from the JSON dump is attached below.
The file comes from an external provider, and these are the problems:
"hotel_id" is the key and should be renamed to "_id".
"hotel_id" should be treated as a Number rather than a string.
"location" is not properly formatted as GeoJSON (if I understood the MongoDB manual correctly). It should look like
"location": {
"type": "Point",
"coordinates": [-93.26838,37.15845]
}
instead of
"location": {
"coordinates": {
"latitude": 37.15845,
"longitude": -93.26838
}
}
"dates": can this be used to efficiently update only the records that need updating?
So my challenge is to transform the data to fit my needs, either before the import or during it, and in both cases as quickly as possible.
I searched a lot for hints and best practices but have not found a solution yet, perhaps because I'm a beginner with MongoDB.
I experimented with "jq" to adjust the data, for example to add the type that the location seems to need (point 3), but without real success:
cat dump.jsonl | ./bin/jq --arg typeOfField Point '.location + {type: $typeOfField}'
Besides that, I imported a sample dump of roughly 500 MB, which took 1.5 minutes the first time (empty database). In "upsert" mode the same import takes roughly 12 hours. So I am also wondering what the best practice is for importing such a big JSON dump.
Any help is appreciated!! :-)
Kind regards,
Lumpy
{
"hotel_id": "12345",
"name": "Test Hotel",
"address": {
"line_1": "123 Test St",
"line_2": "Apt A",
"city": "Test City",
},
"ratings": {
"property": {
"rating": "3.5",
"type": "Star"
},
"guest": {
"count": 48382,
"average": "3.1"
}
},
"location": {
"coordinates": {
"latitude": 22.54845,
"longitude": -90.11838
}
},
"phone": "555-0153",
"fax": "555-7249",
"category": {
"id": 1,
"name": "Hotel"
},
"rank": 42,
"dates": {
"added": "1998-07-19T05:00:00.000Z",
"updated": "2018-03-22T07:23:14.000Z"
},
"statistics": {
"11": {
"id": 11,
"name": "Total number of rooms - 220",
"value": "220"
},
"12": {
"id": 12,
"name": "Number of floors - 7",
"value": "7"
}
},
"chain": {
"id": -2,
"name": "Test Hotels"
},
"brand": {
"id": 2,
"name": "Test Brand"
}
}
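Points 1 to 3 of the question describe a per-record transformation, which can be done line by line before the import. The following is a minimal Python sketch under the assumption that every line of the dump is one JSON object shaped like the sample; the function names transform and transform_lines are made up for illustration.

```python
import json


def transform(record):
    """Return a copy of one hotel record reshaped for import."""
    out = dict(record)
    # Points 1 and 2: rename hotel_id to _id and store it as a number.
    out["_id"] = int(out.pop("hotel_id"))
    # Point 3: GeoJSON wants a typed Point with [longitude, latitude].
    coords = record["location"]["coordinates"]
    out["location"] = {
        "type": "Point",
        "coordinates": [coords["longitude"], coords["latitude"]],
    }
    return out


def transform_lines(lines):
    """Yield one transformed JSON line per non-empty input line."""
    for line in lines:
        if line.strip():
            yield json.dumps(transform(json.loads(line)))


# One trimmed record from the sample, as it would appear on a JSONL line.
line = ('{"hotel_id": "12345", "name": "Test Hotel", "location": '
        '{"coordinates": {"latitude": 22.54845, "longitude": -90.11838}}}')
print(next(transform_lines([line])))
```

Because the generator handles one line at a time, memory use stays flat even for a 16 GB dump. The transformed lines could be written to a new file and imported from there. Point 4 (the dates field) is not covered by this sketch.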

Sed for parsing

I have a file:
"data_personnel": [
{
"id": "1",
"name": "Mathieu"
}
],
"struct_hospital": [
{
"id": "9",
"geo": "chamb",
"nb": ""
},
{
"id": "",
"geo": "jsj",
"nb": "SMITH"
},
{
"id": "10",
"geo": "",
"nb": "12"
},
{
"id": "2",
"geo": "marqui",
"nb": "20"
},
{
"id": "4",
"geo": "oliwo",
"nb": "1"
},
{
"id": "1",
"geo": "par",
"nb": "5"
}
]
How can I use sed to get all the values of geo in struct_hospital? (chamb, jsj, , marqui, oliwo, etc.)
The file can be in any layout: with tabs, everything on one line, and so on.
As pointed out by Sundeep, it makes more sense to use a proper JSON parser.
But if you are looking for a one-time quick and dirty solution, then this might do (GNU sed syntax):
sed -n '/^"struct_hospital"/,/^]/s/^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$/\1/p' input.txt
Sample output (the blank third line is the entry whose geo is empty):
chamb
jsj

marqui
oliwo
par
Explanation:
/^"struct_hospital"/,/^]/ - only consider lines between struct_hospital and the closing bracket.
s/.../\1/p - search and replace; only print the first capturing subpattern of every matching line.
^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$ - matches the geo lines; captures the value between the quotes after the colon.
In case the input spans a single line, you can use another sed invocation as a preprocessor to insert line breaks:
sed 's/]\|,/\n&/g'
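After this split the "struct_hospital" line starts with a comma, so the range start must not be anchored with ^. The full command becomes:
sed 's/]\|,/\n&/g' input.txt | sed -n '/"struct_hospital"/,/^]/s/^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$/\1/p'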
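For the JSON-parser route recommended at the start of this answer, here is a minimal Python sketch. It assumes the file is the inside of a JSON object, so that wrapping it in braces yields valid JSON; the function name geo_values is made up, and the fragment is a shortened copy of the question's data.

```python
import json


def geo_values(fragment):
    """Return every "geo" value under "struct_hospital", in order."""
    # Assumption: the file lacks only the outer braces of a JSON object.
    data = json.loads("{" + fragment + "}")
    return [entry["geo"] for entry in data["struct_hospital"]]


fragment = '''
"data_personnel": [ { "id": "1", "name": "Mathieu" } ],
"struct_hospital": [
  { "id": "9", "geo": "chamb", "nb": "" },
  { "id": "", "geo": "jsj", "nb": "SMITH" },
  { "id": "10", "geo": "", "nb": "12" }
]
'''
print(geo_values(fragment))
```

A parser does not care whether the input is on one line or many, so no preprocessing step is needed.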

How do I get user's checked-in places' categories?

I want to get the logged-in user's check-ins along with the category of each place they checked in to.
I tried fetching the user's likes and their categories, and this worked for me: "/me/likes?fields=category"
This is the response returned:
{
"data": [{
"name": "11 East Street Cafe",
"category": "Restaurant/Cafe",
"id": "94871278677"
},{
"name": "ABC DEF",
"category": "Politician",
"id": "177526890164"
}],
"paging": {
"cursors": {
"before": "NDkyMzg4OTc3NTYyMjk3",
"after": "MTc3NTI2ODkwMTY0"
}
}
}
For check-ins, I tried "/me/feed?with=location&fields=place,story", which returns the list of checked-in places:
{
"data": [{
"place": {
"id": "117586118328311",
"name": "Carraba's",
"location": {
"city": "Ellicott City",
"country": "United States",
"latitude": 39.27153726574,
"longitude": -76.800542073324,
"state": "MD",
"zip": "21043"
}
},
"story": "ABC was at Carraba's.",
"id": "1533836476934935_1546360939015822"
}, {
"place": {
"id": "876737405718973",
"name": "Chipotle Mexican Grill",
"location": {
"city": "Baltimore",
"country": "United States",
"latitude": 39.2866707,
"longitude": -76.6196671,
"state": "MD",
"street": "300 W Pratt St",
"zip": "21201"
}
},
"story": "Aadya Extentia was at Chipotle Mexican Grill.",
"id": "1533836476934935_1546360502349199"
}],
"paging": {
"next": "https://graph.facebook.com/v2.5/1533836476934935/feed?fields=place,story&wi…WzlgXDpe8ZCV8kHWcO2aNxlARSpwIZByS9OWkU8VC3nj2E3VoiLkJNpy2ESOI1R7fHIOJQZDZD"
}
}
But it does not return the category of the checked-in place, i.e. I want to know whether it is a restaurant, museum, stadium, etc.
How do I do it?
Thanks.
You can do this using Field Expansion:
/me/feed?with=location&fields=place{name,category},story
Edit: Looks like this is working for API version <= 2.4 only, but not any more with v2.5.
Edit #2: I filed a bug report to ask if this was deliberately removed, or if it is a bug. https://developers.facebook.com/bugs/1550467468600884/
Even though a Place is a Page, I can't successfully retrieve the category field through Field Expansion...
This can be done with a batch request, by using the output of the first query as input for the second query:
curl \
-F 'access_token={USER_ACCESS_TOKEN}' \
-F 'include_headers=false' \
-F 'batch=[{ "method":"GET","name":"get-places","relative_url":"me/feed?with=location&fields=place{id,name,location},story&limit=100", "omit_response_on_success": false },{"method":"GET","relative_url":"?ids={result=get-places:$.data.*.place.id}&fields=id,name,category"}]' \
https://graph.facebook.com
See
https://developers.facebook.com/docs/graph-api/reference/place/
https://developers.facebook.com/docs/graph-api/making-multiple-requests/
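Once both responses are available, the remaining step is to join them on the place ID. The Python sketch below shows only that join, under the assumption that the body of the second (?ids=...) request decodes to an object keyed by place ID. The function name add_categories is made up, and the second response here is a placeholder, not real API output.

```python
def add_categories(feed, places_by_id):
    """Pair each check-in story with its place name and category."""
    rows = []
    for post in feed.get("data", []):
        place = post.get("place")
        if not place:
            continue  # post without a location
        details = places_by_id.get(place["id"], {})
        rows.append({
            "story": post.get("story"),
            "place": place.get("name"),
            "category": details.get("category"),  # None if the lookup missed
        })
    return rows


# Trimmed first response from the question, plus a placeholder second response.
feed = {"data": [{"place": {"id": "117586118328311", "name": "Carraba's"},
                  "story": "ABC was at Carraba's."}]}
places = {"117586118328311": {"id": "117586118328311", "name": "Carraba's",
                              "category": "EXAMPLE_CATEGORY"}}
print(add_categories(feed, places))
```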

what is regular expression for matching single line with multi match?

$str = "Data = [ {"name": "test","Address": "UK" "currency": "£" },{"name": "test2","Address": "US" "currency": "$" },{"name": "test","Address": "eur" "currency": "E" }
I want to display all the addresses.
It is not a multi-line string; it is all one single string.
Please help with this.
Thanks ,
TREE J
Your string is JSON! Treat it as such!
edit: I'm an idiot and can't tell when a question is tagged as perl instead of PHP :-) Link amended.
This should work:
while ($str =~ /"Address":\s*"(.*?)"/g) {
print "Address = $1\n";
}
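You do it by using the right tool for the job. In this case you repair the corrupt JSON (the sample below has the missing commas and closing bracket restored), strip the "DATA = " prefix with a regex, and then use JSON to get the data: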
#!/usr/bin/perl
use strict;
use warnings;
use JSON;
my $input = <DATA>;
my ($json) = $input =~ /DATA = (.*)/;
my $data = decode_json $json;
for my $record (@$data) {
print "$record->{name} has address $record->{Address}\n";
}
__DATA__
DATA = [ {"name": "test", "Address": "UK", "currency": "£" }, {"name": "test2", "Address": "US", "currency": "$" }, {"name": "test", "Address": "eur", "currency": "E" } ]
something like:
my $str = q(Data = [ {"name": "test","Address": "UK" "currency": "£" },{"name": "test2","Address": "US" "currency": "$" },{"name": "test","Address": "eur" "currency": "E" });
my @addresses = $str =~ /"Address":\s*"([^"]*)"/g;
print "@addresses\n";
HTH,
Paul
(PS: post real code, not pseudo code...)
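The global-match idea from the answers above (collect every capture in one pass over the single-line string) looks like this in Python. It is a minimal sketch that leaves the malformed input as it is and only pulls out the values; the function name addresses is made up for illustration.

```python
import re


def addresses(text):
    """Return every value that follows an "Address": key, in order."""
    # \s* tolerates optional whitespace between the colon and the value.
    return re.findall(r'"Address":\s*"([^"]*)"', text)


text = ('Data = [ {"name": "test","Address": "UK" "currency": "£" },'
        '{"name": "test2","Address": "US" "currency": "$" },'
        '{"name": "test","Address": "eur" "currency": "E" }')
print(addresses(text))
```

Like the Perl list-context match, re.findall returns all captures at once, so no explicit loop is needed.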