%3d instead of = in file path when I try to open a file from resources - scala

I am writing some tests, and to get an absolute path from a relative path I use this function:
private def getAbsolutePath(filePath: String): String = {
  getClass.getResource(filePath).getFile
}
Then I call:
println(getAbsolutePath("/parquetIncrementalProcessor/withPartitioning/"))
println(getAbsolutePath("/parquetIncrementalProcessor/withPartitioning/own_loading_id=1/partition_column=test/"))
and I get:
/Users/19658296/csp-fp-snaphot/library/target/scala-2.11/test-classes/parquetIncrementalProcessor/withPartitioning/
/Users/19658296/csp-fp-snaphot/library/target/scala-2.11/test-classes/parquetIncrementalProcessor/withPartitioning/own_loading_id%3d1/partition_column%3dtest/
As you can see, instead of = I get %3d. When I try to read these files with Spark, it can read the path without %3d, but with %3d it fails with the error "Path does not exist".
How can I fix this?

It looks like the path is URL-encoded, probably because getResource returns a java.net.URL (Uniform Resource Locator) and getFile hands back the path part of that URL without decoding it. You can URL-decode it like so:
import java.net.URLDecoder
def getAbsolutePath(filePath: String): String = {
  val path = getClass.getResource(filePath).getFile
  URLDecoder.decode(path, "UTF-8")
}
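One caveat with URLDecoder: it implements form decoding, so besides resolving %3d it also turns a literal + in a path into a space. A minimal sketch of an alternative, assuming the resource resolves to a plain file: URL (not an entry inside a jar), is to go through java.net.URI, whose getPath resolves percent-escapes only. The object and method names below are hypothetical, and the file:/tmp/... input in the usage note is made up for illustration.

```scala
import java.net.{URI, URLDecoder}

object PathDecoding {
  // URI.getPath returns the decoded path: %3d becomes '=', while '+' is left untouched.
  def decodeWithUri(rawUri: String): String = new URI(rawUri).getPath

  // URLDecoder performs form decoding: %3d becomes '=', but '+' also becomes a space.
  def decodeWithUrlDecoder(rawPath: String): String = URLDecoder.decode(rawPath, "UTF-8")

  // Hypothetical variant of getAbsolutePath built on the URI route.
  // Assumes the resource exists and lives on the file system.
  def getAbsolutePath(filePath: String): String =
    getClass.getResource(filePath).toURI.getPath
}
```

For example, PathDecoding.decodeWithUri("file:/tmp/own_loading_id%3d1/a+b/") keeps the + and yields /tmp/own_loading_id=1/a+b/, whereas decodeWithUrlDecoder on the same path replaces the + with a space.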

Related

Scala changing parquet path in config (typesafe)

Currently I have a configuration file like this:
project {
inputs {
baseFile {
paths = ["project/src/test/resources/inputs/parquet1/date=2020-11-01/"]
type = parquet
applyConversions = false
}
}
}
And I want to change the date "2020-11-01" to another one at run time. I read that I need a new config object since Config is immutable. I'm trying the code below, but I'm not sure how to edit paths, since it's a list and not a String. It definitely needs to stay a list, or else it will complain that I haven't configured a path for the parquet.
val newConfig = config.withValue("project.inputs.baseFile.paths"(0),
ConfigValueFactory.fromAnyRef("project/src/test/resources/inputs/parquet1/date=2020-10-01/"))
But I'm getting a:
Error com.typesafe.config.ConfigException$BadPath: path parameter: Invalid path 'project.inputs.baseFile.': path has a leading, trailing, or two adjacent period '.' (use quoted "" empty string if you want an empty element)
What's the correct way to set the new path?
One option is to override the entire array:
import scala.collection.JavaConverters._
val mergedConfig = config.withValue("project.inputs.baseFile.paths",
  ConfigValueFactory.fromAnyRef(Seq("project/src/test/resources/inputs/parquet1/date=2020-10-01/").asJava))
A more elegant way to do this (IMHO) is to create a new config and use the existing one as a fallback.
For example, we can create a new config:
val newJsonString = """project {
|inputs {
|baseFile {
| paths = ["project/src/test/resources/inputs/parquet1/date=2020-10-01/"]
|}}}""".stripMargin
val newConfig = ConfigFactory.parseString(newJsonString)
And now to merge them:
val mergedConfig = newConfig.withFallback(config)
The output of:
println(mergedConfig.getList("project.inputs.baseFile.paths"))
println(mergedConfig.getString("project.inputs.baseFile.type"))
is:
SimpleConfigList(["project/src/test/resources/inputs/parquet1/date=2020-10-01/"])
parquet
As expected.
You can read more about Merging config trees. Code run at Scastie.
I didn't find any way to replace one element of the array with withValue.

Scala how to use regex on endsWith?

I'm trying to figure out how to isolate all file extensions from a list of file names using regex and endsWith.
So as an example
input:
file.txt, notepad.exe
output:
txt, exe
My idea is to use filter to get the file names that match endsWith("."_), but endsWith("."_) doesn't work.
Any suggestions?
You really do not want to filter, you want to map each filename into its extension.
(and maybe then collect only the ones that had an extension and probably you only want each unique extension)
You can use a regex for that.
object ExtExtractor {
  // Capture the word characters after the last dot; names without an extension do not match.
  val ExtRegex = """.*\.(\w+)""".r
  def apply(data: List[String]): Set[String] =
    data.iterator.collect {
      case ExtRegex(ext) => ext.toLowerCase
    }.toSet
}
You can see it running here.
How about splitting on the dot, which returns a String[] of the parts? The last element is the extension:
String[] parts = fileName.split("\\.");
String extension = parts[parts.length-1];
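A hedged Scala take on the same idea, using lastIndexOf instead of split so that a name without a dot yields no extension rather than the whole name. Extensions and extensionOf are hypothetical names, and treating a leading-dot name such as .bashrc as having no extension is an assumption of this sketch.

```scala
object Extensions {
  // Returns the text after the last '.', or None when the name has no usable extension
  // (no dot at all, a dot only at position 0, or a trailing dot).
  def extensionOf(fileName: String): Option[String] = {
    val dot = fileName.lastIndexOf('.')
    if (dot > 0 && dot < fileName.length - 1) Some(fileName.substring(dot + 1))
    else None
  }
}
```

Mapping a list through it with flatMap, for example List("file.txt", "notepad.exe", "README").flatMap(Extensions.extensionOf), keeps only the names that actually have an extension.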

How to make the platform independent filepath slash "/" with GLib? (GJS)

I have a folderPath which has a directory string:
/home/bastian/Pictures
and I have a variable fileName which contains the name.
I can concatenate the two strings together like this, but it only works on UNIX systems:
let filePath = folderPath + '/' + fileName;
Is there a way with GLib to concatenate the two without making assumptions about the slash or backslash (to stay, for example, Windows-compatible)?
With help from guadec, I found out I could use GLib's g_build_filenamev () function.
let filePath = GLib.build_filenamev([folderPath, fileName]);
This builds a path to the file and respects the platform at the same time.
Note: it requires that you import GLib first at the top of your GJS file, like this:
const { GLib } = imports.gi;
If you happen to be using a Gio.File object to manipulate the path, you can also do something like this:
const folder = Gio.File.new_for_path(folderPath);
const file = folder.get_child(fileName);

AKAudioFile exportAsynchronously path errors

I'm trying to use AKAudioFile.exportAsynchronously to convert wav to m4a (based on the sample code here: https://audiokit.io/playgrounds/Playback/Exporting%20Files/). I've chosen .documents as my BaseDirectory, but I just keep getting directory <my_dir> isn't valid errors — e.g.:
AKAudioFile+ProcessingAsynchronously.swift:exportAsynchronously(name:baseDir:exportFormat:fromSample:toSample:callback:):379:ERROR AKAudioFile export: directory "/var/mobile/Containers/Data/Application/20C913AD-B2F4-4F26-AAD2-0DFA0C65A886/Documents/All Of Me.mp4" isn't valid
That URL looks completely reasonable, to me, so what's up?
Okay, following @jake's tip, the solution was to handle the spaces explicitly before passing the name into AKAudioFile's exportAsynchronously(name:baseDir:exportFormat:callback:). I just did:
var name = String(cafURL.lastPathComponent.split(separator: ".")[0])
name = name.replacingOccurrences(of: " ", with: "%20")
let exportFile = try AKAudioFile(readFileName: "\(name).wav", baseDir: .documents)
exportFile.exportAsynchronously(name: name, baseDir: .documents, exportFormat: .m4a, callback: self.callback)

Compute file content hash with Scala

In our app, we need to compute a file hash so that we can later check whether the file was updated.
The way I am doing it right now is with this little method:
protected[services] def computeMigrationHash(toVersion: Int): String = {
  val migrationClassName = MigrationClassNameFormat.format(toVersion, toVersion)
  val migrationClass = Class.forName(migrationClassName)
  val fileName = migrationClass.getName.replace('.', '/') + ".class"
  val resource = getClass.getClassLoader.getResource(fileName)
  logger.debug("Migration file - " + resource.getFile)
  val file = new File(resource.getFile)
  val hc = Files.hash(file, Hashing.md5())
  logger.debug("Calculated migration file hash - " + hc.toString)
  hc.toString
}
It all works perfectly until the code gets deployed into a different environment and the file is located at a different absolute path. I guess the hashing takes the path into account as well.
What is the best way to calculate a reliable hash of a file's content that will produce the same result for as long as the content of the file stays the same?
Thanks,
Having perused the source code (https://github.com/google/guava/blob/master/guava/src/com/google/common/io/Files.java), only the file contents are hashed; the path does not come into play.
public static HashCode hash(File file, HashFunction hashFunction) throws IOException {
  return asByteSource(file).hash(hashFunction);
}
Therefore you need not worry about the location of the file. As for why you end up with a different hash on a different file system: compare the size and contents of the two files to make sure that, for example, no compound EOLs (such as CRLF) were introduced.
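To check this outside Guava, here is a minimal sketch that hashes only the bytes of a file with the JDK's MessageDigest. It is not the Guava implementation; ContentHash and md5Hex are hypothetical names, and Files here is java.nio.file.Files rather than Guava's com.google.common.io.Files. It reads the whole file into memory, which is assumed to be acceptable for a single .class file.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

object ContentHash {
  // Hashes the file's bytes only; the file's name and location never enter the digest.
  def md5Hex(file: Path): String = {
    val digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file))
    // Render each byte as two lowercase hex digits.
    digest.map(b => f"${b & 0xff}%02x").mkString
  }
}
```

Two files with identical bytes in different directories give the same result, while the same text saved with CRLF instead of LF line endings gives a different one, which is one way a deployment step can change the hash without any visible change to the content.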