Convert route with dynamic parts to string - scala

I have a dynamic route in Play 2.2.2 that spans several /
GET /file/*fileName Controllers.Application.download(fileName)
What's the best way to convert fileName into a regular, non-URI-encoded string, that is, without %20 for spaces and so on?

import java.net.URLDecoder.decode
// Decode into a new value; the fileName parameter itself cannot be reassigned.
val decodedFileName = decode(fileName, "UTF-8")
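One detail worth knowing: java.net.URLDecoder.decode performs form-style decoding, so a literal + in the path is also turned into a space. The following is only a sketch of that distinction using Python's standard library (the sample path is made up for illustration, and this is not Play or Scala code):

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical encoded path segment, used only for illustration.
encoded = "reports/my%20file+v2.txt"

# Plain percent-decoding: %20 becomes a space, '+' is left untouched.
path_style = unquote(encoded)

# Form-style decoding (the behaviour of URLDecoder.decode): '+' also becomes a space.
form_style = unquote_plus(encoded)

print(path_style)
print(form_style)
```

If file names may legitimately contain +, the form-style result is probably not what you want.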

Related

How can I change how Ktor encodes spaces in query parameters?

By default, Ktor encodes spaces as + in query parameters, which I understand is the general convention. However, the API I have to work with is not particularly well-designed, and so it only accepts %20 as a space, even in query parameters. Is there any way I can tell Ktor to always encode spaces as %20 no matter the context?
Unfortunately, you can't change the logic of encoding space in every context. The only thing you can do is encode query parameters manually:
val builder = URLBuilder().apply {
    encodedParameters = ParametersBuilder().apply {
        append("name", "some value".encodeURLParameter(spaceToPlus = false))
    }
}
println(builder.buildString())
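To make the two encodings concrete, here is a small sketch of the same idea with Python's standard library rather than Ktor; the parameter name and value are taken from the snippet above, everything else is illustrative:

```python
from urllib.parse import quote, urlencode

params = {"name": "some value"}

# Default form encoding: spaces become '+'.
form_encoded = urlencode(params)

# Passing quote as quote_via percent-encodes spaces as %20 instead.
percent_encoded = urlencode(params, quote_via=quote)

print(form_encoded)
print(percent_encoded)
```

Both strings decode to the same value on a server that follows the form-encoding convention; only the second is acceptable to an API that insists on %20.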

get Thumbnail image from wikimedia commons

I do have a filename from wikimedia commons and I want to access the thumbnail-image directly.
Example: Tour_Eiffel_Wikimedia_Commons.jpg
I found a way to get json-data containing the url to the thumbnail I want:
https://en.wikipedia.org/w/api.php?action=query&titles=Image:Tour_Eiffel_Wikimedia_Commons.jpg&prop=imageinfo&iiprop=url&iiurlwidth=200
but I don't want another request. Is there a way to access the thumbnail directly?
If you're willing to rely on the current way of building the URL not changing in the future (which is not guaranteed), then you can do it.
The URL looks like this:
https://upload.wikimedia.org/wikipedia/commons/thumb/a/a8/Tour_Eiffel_Wikimedia_Commons.jpg/200px-Tour_Eiffel_Wikimedia_Commons.jpg
The first part is always the same: https://upload.wikimedia.org/wikipedia/commons/thumb
The second part is the first character of the MD5 hash of the file name. In this case, the MD5 hash of Tour_Eiffel_Wikimedia_Commons.jpg is a85d416ee427dfaee44b9248229a9cdd, so we get /a.
The third part is the first two characters of the MD5 hash from above: /a8.
The fourth part is the file name: /Tour_Eiffel_Wikimedia_Commons.jpg
The last part is the desired thumbnail width, and the file name again: /200px-Tour_Eiffel_Wikimedia_Commons.jpg
Solution in Python based on the solution of @svick:
import hashlib

def get_wc_thumb(image, width=300):  # image = e.g. from Wikidata, width in pixels
    image = image.replace(' ', '_')  # spaces must be replaced with underscores
    m = hashlib.md5()
    m.update(image.encode('utf-8'))
    d = m.hexdigest()
    return "https://upload.wikimedia.org/wikipedia/commons/thumb/"+d[0]+'/'+d[0:2]+'/'+image+'/'+str(width)+'px-'+image
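The function above puts the file name into the URL unescaped, which may matter for names with non-ASCII or reserved characters. A possible variant, as a sketch only: it assumes the hash is computed over the underscore form of the name and that percent-encoding is applied only when building the URL. The name commons_thumb_url and the sample file name are hypothetical.

```python
import hashlib
from urllib.parse import quote

THUMB_BASE = "https://upload.wikimedia.org/wikipedia/commons/thumb"

def commons_thumb_url(file_name, width=300):
    """Build a thumbnail URL from a Commons file name (hypothetical helper)."""
    # Spaces are replaced with underscores, as in the function above.
    name = file_name.replace(" ", "_")
    # Assumption: the MD5 hash is taken over the UTF-8 bytes of the unescaped name.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Percent-encode the name only for use inside the URL.
    escaped = quote(name)
    return "{}/{}/{}/{}/{}px-{}".format(
        THUMB_BASE, digest[0], digest[:2], escaped, width, escaped
    )

print(commons_thumb_url("Café de Flore.jpg", 120))
```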
In case anyone is doing this query in SPARQL instead of Python:
There exists an MD5 function in SPARQL and the whole string manipulation can be implemented in SPARQL too!
BIND(REPLACE(wikibase:decodeUri(STR(?image)), "http://commons.wikimedia.org/wiki/Special:FilePath/", "") as ?fileName) .
BIND(REPLACE(?fileName, " ", "_") as ?safeFileName)
BIND(MD5(?safeFileName) as ?fileNameMD5) .
BIND(CONCAT("https://upload.wikimedia.org/wikipedia/commons/thumb/", SUBSTR(?fileNameMD5, 1, 1), "/", SUBSTR(?fileNameMD5, 1, 2), "/", ?safeFileName, "/650px-", ?safeFileName) as ?thumb)
You can run this query in Wikidata's query service, as discussed here: https://discourse-mediawiki.wmflabs.org/t/accessing-a-commons-thumbnail-via-wikidata/499

How to add a string to an URL in golang?

I am new to golang and trying this for the first time. I have to call yahoo finance api(YQL) to get the stock price of the symbol in json format.
Here is the api:
http://query.yahooapis.com/v1/public/yql?q=select%20LastTradePriceOnly%20from%20yahoo.finance.quote%20where%20symbol%20in%20(%22AAPL%22,%22FB%22,%22GOOG%22)&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
For now i have hard coded the symbol(AAPL,FB, GOOG) in the above select query but these symbols will be coming from the user, it should be dynamic. How should i add the input(symbol) to the above query in golang?
Thanks for the help in advance.
If you have the symbols (provided by the users) as a string slice (for example, symbols := []string{"\"AAPL\"", "\"FB\"", "\"GOOG\""}), you can use strings.Join to produce the right string:
s := strings.Join(symbols, ",")
Then (as detailed in "Encode / decode URLs"), use url.QueryEscape from net/url to get the final URL string for the symbols:
// Named queryURL so that it does not shadow the net/url package.
queryURL := "http://query.yahooapis.com/v1/public/yql?q=select%20LastTradePriceOnly%20from%20yahoo.finance.quote%20where%20symbol%20in%20(" +
url.QueryEscape(s) +
")&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys"

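For the Yahoo query URL above, an alternative to splicing an escaped fragment into a pre-encoded URL is to write the query in plain text and encode all parameters in one step. The sketch below shows that idea in Python rather than Go; build_quote_url and the symbol pattern are assumptions made for illustration, not part of any Yahoo API.

```python
import re
from urllib.parse import quote, urlencode

BASE_URL = "http://query.yahooapis.com/v1/public/yql"
# Assumed whitelist for ticker symbols; adjust to whatever the API really accepts.
SYMBOL_PATTERN = re.compile(r"^[A-Za-z0-9.^=-]+$")

def build_quote_url(symbols):
    """Build the YQL request URL for a list of user-supplied symbols."""
    for symbol in symbols:
        # Reject anything that could break out of the quoted YQL literal.
        if not SYMBOL_PATTERN.match(symbol):
            raise ValueError("unexpected symbol: {!r}".format(symbol))
    quoted = ",".join('"{}"'.format(s) for s in symbols)
    yql = ("select LastTradePriceOnly from yahoo.finance.quote "
           "where symbol in ({})".format(quoted))
    params = {
        "q": yql,
        "format": "json",
        "env": "store://datatables.org/alltableswithkeys",
    }
    # quote_via=quote encodes spaces as %20, matching the URL in the question.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)

print(build_quote_url(["AAPL", "FB", "GOOG"]))
```

Parentheses and commas come out percent-encoded here (%28, %29, %2C), which is equivalent to the literal characters in the original URL.
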
Encoding in Pig

When I load data containing certain characters (for example À, ° and others) with Pig Latin and store it in a .txt file, those symbols show up in the file as � and ï characters. This happens because of the UTF-8 substitution character.
Is it possible to avoid this somehow, maybe with some Pig commands, so that the resulting txt file contains, for example, À instead of �?
Pig has built-in dynamic invokers that allow a Pig programmer to refer to Java functions without having to wrap them in custom Pig UDFs. So you can load the data as UTF-8 encoded strings, decode it, perform all your operations on it, and then store it back as UTF-8. I guess this should work for the first part:
DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');
The Java code responsible for doing this is:
import java.io.IOException;
import java.net.URLDecoder;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class UrlDecode extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        String encoded = (String) input.get(0);
        String encoding = (String) input.get(1);
        return URLDecoder.decode(encoded, encoding);
    }
}
Now modify this code to return UTF-8 encoded strings from normal strings and store it to your text file. Hope it works.
You are correct: this happens because Text (http://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/io/Text.html) automatically converts incoming data (bytes) to UTF-8. To avoid this, you should not work with Text.
Instead, use the bytearray type rather than chararray (bytearray does not use Text, so no conversion is done). Since you didn't provide any code, here is an example for illustration:
This is what you (likely) did:
converted_to_utf = LOAD 'strangeEncodingdata' using TextLoader AS (line:chararray);
This is what you wanted to do:
no_conversion = LOAD 'strangeEncodingdata' using TextLoader AS (line:bytearray);
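The effect can be reproduced outside Pig. The sketch below is Python, not Pig Latin, and it assumes the input file is Latin-1 encoded, which the question does not state: it shows how bytes that are not valid UTF-8 turn into the substitution character when they are forced through a UTF-8 decode, and how they survive when kept as raw bytes.

```python
# Assumption for illustration: the source data is Latin-1 encoded.
raw = "À°".encode("latin-1")  # the two bytes 0xC0 0xB0

# Forcing the bytes through a UTF-8 decode replaces invalid sequences
# with U+FFFD, the substitution character seen in the output file.
as_utf8 = raw.decode("utf-8", errors="replace")

# Keeping the raw bytes and decoding with the real charset preserves the text.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8))
print(as_latin1)
```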

Split url code in c#

I want to split a URL using the following code:
string url="http://images/newyork/1550/t_2911340.JPG";
file_name=server.MapPath("~/storedImages/")+"t_2911340.gif";
save_file_from_url(file_name,url);
But I want my code like this:
file_name=server.MapPath("~/storedImages/") +
( values after last / from url and before ) +
gif // by adding gif i want to rename it
Can you help me to split the code and append it?
Thanks in advance.
See the System.Uri class. Construct an instance of System.Uri, passing your URL string to the constructor. Then access the various properties of the Uri object as your "split" URL. To further split the path portion of the URL into segments, use the Segments Property.
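The same steps (parse the URL, take the last path segment, swap the extension) look roughly like this in Python; it is a sketch of the idea rather than C#, and renamed_file_name is a hypothetical helper:

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

def renamed_file_name(url, new_extension):
    """Return the last path segment of url with its extension replaced."""
    # urlsplit separates scheme, host, path, query and fragment.
    path = urlsplit(url).path
    # stem is the last segment without its extension.
    stem = PurePosixPath(path).stem
    return stem + "." + new_extension.lstrip(".")

print(renamed_file_name("http://images/newyork/1550/t_2911340.JPG", "gif"))
```

In C#, the last entry of the Segments property plays the role of the last path segment here.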