PHP: Using preg_replace to replace an unknown string between two known strings

I have $stringF. Contained within $stringF is the following (the string is all one line, not word-wrapped as below):
http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=
AFQjCNHWQk0M4bZi9xYO4OY4ZiDqYVt2SA&clid=
c3a7d30bb8a4878e06b80cf16b898331&cid=52779892300270&ei=
H4IAW6CbK5WGhQH7s5SQAg&url=https://abcnews.
go.com/Lifestyle/wireStory/latest-royal-wedding-thousands-streets-windsor-55280649
I want to locate that string and make it look like this:
https://abcnews.go.com/Lifestyle/wireStory/latest-royal-
wedding-thousands-streets-windsor-55280649
Basically I need to use preg_replace to find the following string:
http://news.google.com/news/url?sa= ***SOME UNKNOWN CONTENT*** &url=http
and replace it with the following string:
http
I'm a little rusty with my PHP, and even rustier with regular expressions, so I'm struggling to figure this one out. My code looks like this:
$stringG = preg_replace('http://news.google.com/news/url?sa=*&url=http','http',$stringF);
except I know I can't use wildcards like that, and I know I need to handle the special characters (colon, forward slash, question mark, ampersand, etc.). Hoping someone can help me out here.
Also of note is that my $stringF contains multiple instances of such strings, so I need the preg_replace to be non-greedy; otherwise it will replace a huge chunk of my string unnecessarily.

PHP has tools for that, so there is no need for a regex: parse_url returns the components of a URL (scheme, host, path, fragment, query, ...) and parse_str extracts the keys and values from the query part.
$url = 'http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNHWQk0M4bZi9xYO4OY4ZiDqYVt2SA&clid=c3a7d30bb8a4878e06b80cf16b898331&cid=52779892300270&ei=H4IAW6CbK5WGhQH7s5SQAg&url=https://abcnews.go.com/Lifestyle/wireStory/latest-royal-wedding-thousands-streets-windsor-55280649';
parse_str(parse_url($url, PHP_URL_QUERY), $arr);
echo $arr['url'];
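The parse_url approach above handles one URL at a time. Since the question mentions a longer string holding several such links, a non-greedy pattern is one way to strip every redirect prefix in place. The sketch below is in Python rather than PHP (the same pattern should carry over to preg_replace once delimiters are added). The sample text and the helper names strip_google_redirects and target_of are made up for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Escape the regex metacharacters in the known prefix ('.' and '?'),
# then match lazily (\S*?) up to the first '&url=' so that each link
# is handled separately. The lookahead keeps the 'http' of the target.
REDIRECT_PREFIX = re.compile(
    r'http://news\.google\.com/news/url\?sa=\S*?&url=(?=http)'
)

def strip_google_redirects(text):
    """Remove the Google News redirect prefix from every link in text."""
    return REDIRECT_PREFIX.sub('', text)

def target_of(url):
    """Single-URL alternative: read the 'url' query parameter directly."""
    return parse_qs(urlsplit(url).query)['url'][0]

# Hypothetical sample with two redirect links in one string.
sample = (
    'first http://news.google.com/news/url?sa=t&fd=R&url=https://abcnews.go.com/a '
    'second http://news.google.com/news/url?sa=t&ei=X&url=http://example.com/b end'
)
print(strip_google_redirects(sample))
```

Using \S*? instead of .*? is a design choice here: it stops a match from running across whitespace into the next link if one of the prefixes happens to lack an &url= part.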

Related

How to replace content within quotes via a file

Why can't I use str_replace() to replace content between double quotes? When I replace links within a file, they get skipped because they are within quotes.
Example.
href="/path/to/file/is/here"
should be
href="/New/Path/To/File/Goes/Here"
If the paths/urls were not in quotes, str_replace() would work.
I'm assuming this is PHP. So, from the examples here:
http://php.net/manual/en/function.str-replace.php
You can see that you should not nest the same type of quotes inside each other.
So try changing the quotes in your code to single quotes, or change the double quotes in your HTML to single quotes.
If that's not it, I hope at least that doc reference helps you.
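To illustrate the quoting point: a plain string replacement does not treat double quotes specially, so it works as long as the search string is delimited with a different quote style (or uses escaped quotes) and matches the file content exactly. A minimal Python sketch follows; the paths come from the question, while the surrounding anchor tag is an assumption added for illustration.

```python
# Single quotes delimit the Python string literals, so the double
# quotes inside them are ordinary characters to be matched.
html = '<a href="/path/to/file/is/here">link</a>'  # hypothetical file content
old = 'href="/path/to/file/is/here"'
new = 'href="/New/Path/To/File/Goes/Here"'

updated = html.replace(old, new)
print(updated)
```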
This might help. I usually code in Java, but PHP is pretty similar. Next time, include the relevant part of your code so that the community can see your logic.
In your if statement on line 67, the third argument, $stringToSearch, should be the regex-processed string rather than the string you are assigning the result to. The purpose of the regex, as you know, is to replace characters you don't want in your code.
What you had that was not working:
// replacing string from files
//$stringToSearch = str_replace('"', "!!", $stringToSearch);
$stringToSearch = str_replace($toBeReplaced, $toBeReplacedWith, $stringToSearch);
//$stringToSearch = str_replace("!!", '"', $stringToSearch);
What I am thinking it should be:
$stringToRegex = str_replace('"', "!!", $stringToSearch);
$stringToSearch = str_replace($toBeReplaced, $toBeReplacedWith, $stringToRegex );
If anyone else has any suggestions, they would be appreciated, as I don't code in PHP.

Regex to remove data between 2 semicolons in Perl

I have a string containing semicolons, and I want to remove all the data between each pair of semicolons and leave the rest as it is. I am using a Perl regex to remove the unwanted data from the string:
String :
$val="Data;test is here ;&data=1dffvdviofv;&dt&;&data=343";
Now we want to remove all the data between each pair of semicolons, throughout the string:
$val=~s/(.*)(\;.*\;)(.*)$/$1$3/g;
But this is not working for me. The final output should look like this:
Data &data=1dffvdviofv&data=343
One of the problems is that .* is greedy, that is, it will consume as much as it can. You can make it non-greedy by writing .*?, but that alone won't fix your regex since you've anchored it to the end of the string with $. Personally I don't think there is a need for the capture groups, you can just write
$val =~ s/;.*?;//g;
I'm assuming that the extra space in your expected output (Data &data...) is a typo.
You might also want to consider using a proper parser for whatever data format this is.
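For comparison, the same non-greedy substitution can be sketched in Python (an analogue of the Perl one-liner above, not a replacement for it), using the string from the question:

```python
import re

val = "Data;test is here ;&data=1dffvdviofv;&dt&;&data=343"

# ';.*?;' matches from one semicolon to the nearest following one,
# and re.sub replaces every such match, like the /g flag in Perl.
cleaned = re.sub(r';.*?;', '', val)
print(cleaned)  # Data&data=1dffvdviofv&data=343
```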

Warning Control Character '\S' is not valid when concatenating two strings

I have two variables such as:
path='data\voc11\SegmentationClassExt\%s.png'
name='123'
I want to concatenate two strings into one like so:
data\voc11\SegmentationClassExt\123.png
I used the code below:
sprintf(path, name)
However I receive the following error:
Warning: Control Character '\S' is not valid. See 'doc sprintf' for control characters valid in the format string.
ans =
dataoc11
I am using MATLAB on Windows. Could you suggest a solution? I tried changing path to 'data\\voc11\\SegmentationClassExt\\%s.png', and with that the above code works. However, the current data is
path='data\voc11\SegmentationClassExt\%s.png';
Use the MATLAB function fullfile, with the folder stored without the %s.png suffix (for example folder = 'data\voc11\SegmentationClassExt'):
filename = fullfile(folder, [name '.png']);
or
filename = fullfile(folder, sprintf('%s.png', name));
Note: you should avoid using path as a variable name, as it is already a MATLAB function.
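The idea behind fullfile, joining path components rather than pushing backslashes through a format string, can be sketched in Python as a rough analogue. PureWindowsPath is used so that the Windows separator is produced on any OS; the variable names are illustrative only.

```python
from pathlib import PureWindowsPath

name = '123'

# Each component is passed separately, so no backslash ever passes
# through a format string where it could be read as an escape.
filename = str(PureWindowsPath('data', 'voc11', 'SegmentationClassExt', name + '.png'))
print(filename)  # data\voc11\SegmentationClassExt\123.png
```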
Before we start, it's highly advised that you do not use path as a variable name. path is a built-in MATLAB function for viewing and changing the search path that MATLAB uses to locate functions, including those from toolboxes. Shadowing it with your own string can stop code that calls path from working properly. Use a different variable name.
Now, to resolve your problem, you can either use fullfile as @matlabgui has suggested or, if you don't care about OS compatibility and are only working on Windows, double the backslashes in the format string. You can do that manually, as you already tried, or use a string replacement function so that every backslash is followed by an additional backslash.
Either one of these two methods will work:
Method 1 - Using regular expressions
pat = 'data\voc11\SegmentationClassExt\%s.png';
pat_new = regexprep(pat, '\\', '\\\\');
The function regexprep performs a string replacement using regular expressions. We search for all single backslashes and replace them with double backslashes. Note that the backslash \ is a special character in regular expressions, so if you explicitly want to look for backslashes, you must place an additional backslash before it.
Method 2 - Using strrep
pat = 'data\voc11\SegmentationClassExt\%s.png';
pat_new = strrep(pat, '\', '\\');
strrep stands for String Replace. It works much like the regexprep call discussed above, but it treats the search text literally, so you don't have to add an extra backslash when looking for the actual character.
Once you do this, you can use sprintf as normal:
pat_new = sprintf(pat_new, name);

Substitute only one part of a string using perl

I have an array whose items contain some symbols that I want to remove. Even though I found a solution, I would like to know if this is the right way, because I'm afraid that if I use it on other arrays it will remove characters that I might need.
Here is an example item on my array:
$string1='22 | logging monitor informational';
So I tried the following:
$string1=~ s/\s{6}\|(?=\s{6})//;
So my output is:
22 logging monitor informational
Is there another way that better matches "|"? I just want to remove the pipe character.
Thanks in advance
"I want to remove just the pipe character."
OK, then do this:
$string1 =~ s/\|//;
This will remove the first pipe character in the string. (You said in another comment that you don't want to remove any additional pipe characters.) If that's not what you want, then I'd suggest telling us exactly what you do want. We can't read minds, you know.
In the meantime, I'd also strongly recommend reading the Perl regular expressions tutorial.
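As a rough Python analogue of the substitution above, the sketch below removes only the first pipe. It also shows a variant that collapses the whitespace around that pipe to a single space, which is an assumption about the desired output based on the question's example.

```python
import re

string1 = '22 | logging monitor informational'

# count=1 removes only the first pipe, like s/\|// without the /g flag.
without_pipe = re.sub(r'\|', '', string1, count=1)

# Variant: also collapse the whitespace around that pipe to one space.
tidy = re.sub(r'\s*\|\s*', ' ', string1, count=1)

print(repr(without_pipe))
print(repr(tidy))
```

The first form leaves whatever spaces surrounded the pipe untouched; the second is closer to the output shown in the question.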

Perl JSON pound sign escaping

I am trying to use a web API of a service written in Perl (OTRS).
The data is sent in JSON format.
One of the string values inside the JSON structure contains a pound sign, which apparently is used as a comment character in JSON.
This results in a parsing error:
unexpected end of string while parsing JSON string
I couldn't find how to escape the character in order to get the string parsed successfully.
The obvious slash escaping results in:
illegal backslash escape sequence in string
Any ideas how to escape it?
Update:
The URL I am trying to use looks something like that (simplified but still causes the error):
http://otrs.server.url/otrs/json.pl?User=username&Password=password&Object=TicketObject&Method=ArticleSend&Data={"Subject":"[Ticket#100000] Test Ticket from OTRS"}
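Use URI::Escape to encode each parameter value before adding it to the URL:
use URI::Escape;
my $safe = uri_escape($data);   # $data holds the raw value of the Data parameter
See RFC 1738 for the list of characters which can be unsafe.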
The hash symbol, #, has a special meaning in URLs, not in JSON. Your URL is probably getting truncated at the hash before the remote server even sees it:
http://otrs.server.url/otrs/json.pl?User=username&Password=password&Object=TicketObject&Method=ArticleSend&Data={"Subject":"[Ticket
And that means that the remote server gets mangled JSON in Data. The solution is to URL encode your parameters before pasting them together to form your URL; eugene y tells you how to do this.
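The truncation and the fix can both be seen in a short Python sketch (an analogue of the Perl approach, not OTRS-specific code). The URL and parameter names are taken from the question, trimmed to two parameters for brevity.

```python
import json
from urllib.parse import urlencode, urlsplit

base = 'http://otrs.server.url/otrs/json.pl'
data = {"Subject": "[Ticket#100000] Test Ticket from OTRS"}

# Unencoded: everything after '#' is treated as the fragment,
# so the query string stops in the middle of the JSON.
raw = base + '?Method=ArticleSend&Data=' + json.dumps(data)
print(urlsplit(raw).query)

# Encoded: urlencode percent-encodes each value, turning '#' into %23.
params = {'Method': 'ArticleSend', 'Data': json.dumps(data)}
safe = base + '?' + urlencode(params)
print(safe)
```

Encoding each value separately, rather than the finished URL, keeps the ?, & and = separators intact while still escaping the characters inside the JSON.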