How do I replace spaces with %20 in PowerShell? - powershell

I'm creating a PowerShell script that will assemble an HTTP path from user input. The output has to convert any spaces in the user input to the product specific codes, "%2F".
Here's a sample of the source and the output:
The site URL can be a constant, though a variable would be a better approach for reuse, as used in the program is: /http:%2F%2SPServer/Projects/"
$Company="Company"
$Product="Product"
$Project="The new project"
$SitePath="$SiteUrl/$Company/$Product/$Project"
As output I need:
'/http:%2F%2FSPServer%2FProjects%2FCompany%2FProductF2FThe%2Fnew%2Fproject'

To replace " " with %20 and / with %2F and so on, do the following:
[uri]::EscapeDataString($SitePath)

The solution of #manojlds converts all odd characters in the supplied string.
If you want to do escaping for URLs only, use
[uri]::EscapeUriString($SitePath)
This will leave, e.g., slashes (/) or equal signs (=) as they are.
Example:
# Returns http%3A%2F%2Ftest.com%3Ftest%3Dmy%20value
[uri]::EscapeDataString("http://test.com?test=my value")
# Returns http://test.com?test=my%20value
[uri]::EscapeUriString("http://test.com?test=my value")

For newer operating systems, the command is changed. I had problems with this in Server 2012 R2 and Windows 10.
[System.Net.WebUtility] is what you should use if you get errors that [System.Web.HttpUtility] is not there.
$Escaped = [System.Net.WebUtility]::UrlEncode($SitePath)

The output transformation you need (spaces to %20, forward slashes to %2F) is called URL encoding. It replaces (escapes) characters that have a special meaning when part of a URL with their hex equivalent preceded by a % sign.
You can use .NET framework classes from within Powershell.
[System.Web.HttpUtility]::UrlEncode($SitePath)
Encodes a URL string. These method overloads can be used to encode the entire URL, including query-string values.
http://msdn.microsoft.com/en-us/library/system.web.httputility.urlencode.aspx

Related

PHP: Using preg_replace to replace an unknown string between two known strings

I have $stringF. Contained within $stringF is the following (the string is all one line, not word-wrapped as below):
http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=
AFQjCNHWQk0M4bZi9xYO4OY4ZiDqYVt2SA&clid=
c3a7d30bb8a4878e06b80cf16b898331&cid=52779892300270&ei=
H4IAW6CbK5WGhQH7s5SQAg&url=https://abcnews.
go.com/Lifestyle/wireStory/latest-royal-wedding-thousands-streets-windsor-55280649
I want to locate that string and make it look like this:
https://abcnews.go.com/Lifestyle/wireStory/latest-royal-
wedding-thousands-streets-windsor-55280649
Basically I need to use preg_replace to find the following string:
http://news.google.com/news/url?sa= ***SOME UNKNOWN CONTENT*** &url=http
and replace it with the following string:
http
I'm a little rusty with my php, and even rustier with regular expressions, so I'm struggling to figure this one out. My code looks like this:
$stringG = preg_replace('http://news.google.com/news/url?sa=*&url=http','http',$stringH);
except I know I can't use wildcards and I know I need to specially deal with the special characters (colon, forward slash, question mark, and sign, etc). Hoping someone can help me out here.
Also of note is that my $stringF contains multiple instances of such strings, so I need the preg_replace to be not greedy - otherwise it will replace a huge chunk of my string unnecessarily.
PHP has tools for that, no need to use a regex. parse_url to get the components of an url (scheme, host, path, anchor, query, ...) and parse_str to get the keys/values of the query part.
$url = 'http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNHWQk0M4bZi9xYO4OY4ZiDqYVt2SA&clid=c3a7d30bb8a4878e06b80cf16b898331&ci=52779892300270&ei=H4IAW6CbK5WGhQH7s5SQAg&url=https://abcnews.go.com/Lifestyle/wireStory/latest-royal-wedding-thousands-streets-windsor-55280649';
parse_str(parse_url($url, PHP_URL_QUERY), $arr);
echo $arr['url'];

Trouble understanding C# URL decode with Unicode character(s) in PowerShell

I'm currently working on something that requires me to pass a Base64 string to a PowerShell script. But while decoding the string back to the original I'm getting some unexpected results as I need to use UTF-7 during decoding and I don't understand why. Would someone know why?
The Mozilla documentation would suggest that it's insufficient to use Base64 if you have Unicode characters in your string. Thus you need to use a workaround that consists of using encodeURIComponent and a replace. I don't really get why the replace is needed and shortened it to btoa(escape('✓ à la mode')) to encode the string. The result of that operation would be JXUyNzEzJTIwJUUwJTIwbGElMjBtb2Rl.
Using PowerShell to decode the string back to the original, I need to first undo the Base64 encoding. In order to do System.Convert can be used (which results in a byte array) and its output can be converted to a UTF-8 string using System.Text.Encoding. Together this would look like the following:
$bytes = [System.Convert]::FromBase64String($inputstring)
$utf8string = [System.Text.Encoding]::UTF8.GetString($bytes)
What's left to do is URL decode the whole thing. As it is a UTF-8 string I'd expect only to need to run the URL decode without any further parameters. But if you do that you end up with a accented a that looks like � in a file or ? on the console. To get the actual original string it's necessary to tell the URL decode to use UTF-7 as the character set. It's nice that this works but I don't really get why it's necessary since the string should be UTF-8 and UTF-8 certainly supports an accented a. See the last two lines of the entire script for what I mean. With those two lines you will end up with one line that has the garbled text and one which has the original text in the same file encoded as UTF-8
Entire PowerShell script:
Add-Type -AssemblyName System.Web
$inputstring = "JXUyNzEzJTIwJUUwJTIwbGElMjBtb2Rl"
$bytes = [System.Convert]::FromBase64String($inputstring)
$utf8string = [System.Text.Encoding]::UTF8.GetString($bytes)
[System.Web.HttpUtility]::UrlDecode($utf8string) | Out-File -Encoding utf8 C:\temp\output.txt
[System.Web.HttpUtility]::UrlDecode($utf8string, [System.Text.UnicodeEncoding]::UTF7) | Out-File -Append -Encoding utf8 C:\temp\output.txt
Clarification:
The problem isn't the conversion of the Base64 to UTF-8. The problem is some inconsistent behavior of the UrlDecode of C#. If you run escape('✓ à la mode') in your browser you will end up with the following string %u2713%20%E0%20la%20mode. So we have a Unicode representation of the check mark and a HTML entity for the á. If we use this directly in UrlDecode we end up with the same error. My current assumption would be that it's an issue with the encoding of the PowerShell window and pasting characters into it.
Turns out it actually isn't all that strange. It's just for what I want to do it's advantages to use a newer function. I'm still not sure why it works if you use the UTF-7 encoding. But anyways, as an explanation:
... The hexadecimal form for characters, whose code unit value is 0xFF or less, is a two-digit escape sequence: %xx. For characters with a greater code unit, the four-digit format %uxxxx is used.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/escape
As TesselatingHecksler pointed out What is the proper way to URL encode Unicode characters? would indicate that the %u format wasn't formerly standardized. A newer version to escape characters exists though, which is encodeURIComponent.
The encodeURIComponent() function encodes a Uniform Resource Identifier (URI) component by replacing each instance of certain characters by one, two, three, or four escape sequences representing the UTF-8 encoding of the character (will only be four escape sequences for characters composed of two "surrogate" characters).
The output of this function actually works with the C# implementation of UrlDecode without supplying an additional encoding of UTF-7.
The original linked Mozilla article about a Base64 encode for an UTF-8 strings modifies the whole process in a way to allows you to just call the Base64 decode function in order to get the whole string. This is realized by converting the URL encode version of the string to bytes.

Warning Control Character '\S' is not valid when concatinating two strings

I have two variables such as:
path='data\voc11\SegmentationClassExt\%s.png'
name='123'
I want to concatenate two strings into one like so:
data\voc11\SegmentationClassExt\123.png
I used the code below:
sprintf(path, name)
However I receive the following error:
Warning: Control Character '\S' is not valid. See 'doc sprintf' for control characters valid in the format string.
ans =
dataoc11
I am using MATLAB on Windows. Could you give me any solution for that. I tried to change path='data\\voc11\\SegmentationClassExt\\%s.png' and when I did that, the above code will work. However, the current data is
path='data\voc11\SegmentationClassExt\%s.png';
use the matlab function fullfile
filename = fullfile ( path, [name '.png'] );
or
filename = fullfile ( path, sprintf ( '%s.png', name ) );
Note: you should avoid using path as a variable as it is already a Matlab function
Before we start, it's highly advised that you do not use path as a local variable. path is a global variable that MATLAB uses to resolve function scope, especially if you are going to use any functions from toolboxes. Overwriting path with your own string will actually make MATLAB not function properly. Use a different variable name.
Now to resolve your problem, you can use either fullfile as what #matlabgui has suggested, or if you don't care about OS compatibility and are only working in Windows, you can either manually change the path as you have placed so that you can introduce two back slashes and it will indeed work on Windows OS, or you can perhaps use a string replace function so that all back slashes will be accompanied with an additional back slash.
Either one of these two methods will work:
Method 1 - Using regular expressions
pat = 'data\voc11\SegmentationClassExt\%s.png';
pat_new = regexprep(pat, '\\', '\\\\');
The function regexprep performs a string replacement by regular expressions. We search for all single backslashes and replace them with double backslashes. Note that the single back slash \ is a special character in regular expressions so if you explicitly what to look for back slashes, you must place an additional back slash beside it.
Method 2 - Using strrep
pat = 'data\voc11\SegmentationClassExt\%s.png';
pat_new = strrep(pat, '\', '\\');
strrep stands for String Replace. It works very similar to regular expressions as we have discussed above. However, what's nice is that you don't have to append an additional back slash when looking for the actual character.
Once you do this, you can use sprintf as normal:
pat_new = sprintf(pat_new, name);

Decoding URL containing unicode characters

I have the following code in a Mako template:
<a href="#" onclick='getCompanyHTML("${fund.investments[inv_name].name | u}"); return false;'>${inv_name}</a>
This applies url escaping to the name string of an object representing a company. The resulting escaped string is then used in a url. The Mako documentation states that url encoding is provided using urllib.quote_plus(string.encode('utf-8')).
On the server I receive the company name part into the argument investment_name:
def Investment(client, fund_name, investment_name, **kwargs):
client = urllib.unquote_plus(client)
fund_name = urllib.unquote_plus(fund_name)
investment_name = urllib.unquote_plus(investment_name)
I then use investment_name as a key back in to the same dictionary from which it was extracted in the template.
This works fine for all the standard cases, such as spaces, slashes, and single quotes in the company name. However, it fails if the company name contains unicode characters outside of the ascii character set.
For instance, the url for company name "Eptisa Servicios de Ingeniería S.L." is rendered as "Eptisa+Servicios+de+Ingenier%C3%ADa+S.L." When this value arrives back at the server, I'm reversing the url escaping but clearly failing to decode the unicode properly because my attempt to use the result as a dictionary key generates a key error.
I've tried adding unicode decoding in these two forms, without luck:
investment_name = urllib.unquote_plus(investment_name.decode('utf-8'))
investment_name = urllib.unquote_plus(investment_name.encode('raw_unicode_escape').decode('utf-8'))
Can anyone suggest what I must do to "Eptisa+Servicios+de+Ingenier%C3%ADa+S.L." to turn it back into "Eptisa Servicios de Ingeniería S.L."?
Do it in the reverse order: first unquote then .decode('utf-8')
Do not mix bytes and Unicode strings.
Example
import urllib
q = "Eptisa+Servicios+de+Ingenier%C3%ADa+S.L."
b = urllib.unquote_plus(q)
u = b.decode("utf-8")
print u
Note: print u might produce UnicodeEncodeError. To fix it:
print u.encode(character_encoding_your_console_understands)
Or set PYTHONIOENCODING environment variable.
On Unix you could try locale.getpreferredencoding() as character encoding, on Windows see output of chcp

Perl JSON pound sign escaping

I am trying to use a web API of a service written in Perl (OTRS).
The data is sent in JSON format.
One of the string values inside the JSON structure contains a pound sign, which in apparently is used as a comment character in JSON.
This results in a parsing error:
unexpected end of string while parsing
JSON string
I couldn't find how to escape the character in order to get the string parsed successfully.
The obvious slash escaping results in:
illegal backslash escape sequence in
string
Any ideas how to escape it?
Update:
The URL I am trying to use looks something like that (simplified but still causes the error):
http://otrs.server.url/otrs/json.pl?User=username&Password=password&Object=TicketObject&Method=ArticleSend&Data={"Subject":"[Ticket#100000] Test Ticket from OTRS"}
Use Uri::escape:
use URI::Escape;
my $safe = uri_escape($url);
See rfc1738 for the list of characters which can be unsafe.
The hash symbol, #, has a special meaning in URLs, not in JSON. Your URL is probably getting truncated at the hash before the remove server even sees it:
http://otrs.server.url/otrs/json.pl?User=username&Password=password&Object=TicketObject&Method=ArticleSend&Data={"Subject":"[Ticket
And that means that the remote server gets mangled JSON in Data. The solution is to URL encode your parameters before pasting them together to form your URL; eugene y tells you how to do this.