I have the following code where I am reading a file and replacing any occurrence of a "*.tar.gz" file name with the new file name provided. Everything works fine and I can see the replaced changes in the console; however, I am not able to write a new file with all the changes.
import scala.io.Source

def modifyFile(newFileName: String, filename: String) = {
  Source.fromFile(filename).getLines.foreach { line =>
    println(line.replaceAll(".+\\.tar\\.gz", newFileName.concat(".tar.gz")))
  }
}
You forgot to write your modified lines into the new file:
def modifyFile(newFileName: String, sourceFilePath: String, targetFilePath: String): Unit = {
  scala.tools.nsc.io.File(targetFilePath).printlnAll(
    Source.fromFile(sourceFilePath).getLines().map {
      _.replaceAll(".+\\.tar\\.gz", newFileName.concat(".tar.gz"))
    }.toSeq: _*)
}
Please note that this approach is not the most efficient in terms of performance, as the content of the source file is read fully into memory, processed, and then written back. A more efficient approach would be more verbose and would involve Java's FileReader/FileWriter.
Upd
As rightfully pointed out in the comments, you have to choose a suitable way to write the result to a file depending on what tools and dependencies you have.
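For reference, a minimal sketch of the more verbose streaming approach mentioned above, using plain java.io so that only one line at a time is held in memory (the method name and the try/finally handling are my own, not from the question):

import java.io.{BufferedWriter, FileWriter, PrintWriter}
import scala.io.Source

def modifyFileStreaming(newFileName: String, sourceFilePath: String, targetFilePath: String): Unit = {
  val source = Source.fromFile(sourceFilePath)
  val writer = new PrintWriter(new BufferedWriter(new FileWriter(targetFilePath)))
  try {
    // Rewrite each line as it is read, so the whole file never sits in memory.
    source.getLines().foreach { line =>
      writer.println(line.replaceAll(".+\\.tar\\.gz", newFileName.concat(".tar.gz")))
    }
  } finally {
    writer.close()
    source.close()
  }
}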
I am working on an NSDocument-based Mac app which imports .xml files. It's working fine for some xml files, but for a few I am having issues.
The issue is that read() is modifying the data when we import a file; I need to keep the original data as it is.
What do I need to do to make sure I get the original xml data in read()?
I am using the function below to read the file:
override func read(from data: Data, ofType typeName: String) throws {
    var error: NSError? = nil
    var xmlDocument1: XMLDocument? = XMLDocument()
    do {
        xmlDocument1 = try XMLDocument(data: data, options: XMLNode.Options(rawValue: XMLNode.Options.RawValue(Int(XMLNode.Options.nodePreserveWhitespace.rawValue))))
    } catch let err as NSError {
        error = err
    }
    if error != nil {
        throw error!
    }
}
and I parse xmlDocument1 to read and get all the xml information.
Issue: Done this way, Swift is modifying the document, as shown below.
Example 1:
Original:
<iws:attr-option name="1 - Poor" />
<iws:attr-option name="2 - Needs Improvement" />
Data I get from read(); notice the closing tags added automatically:
<iws:attr-option name="1 - Poor"></iws:attr-option>
<iws:attr-option name="2 - Needs Improvement"></iws:attr-option>
Example 2:
Original:
<source>
<ph id="12" x="</span>">{12}</ph>
</source>
Data I get from read(); notice the ">" symbol is replaced with "&gt;":
<source>
<ph id="12" x="</span&gt;">{12}</ph>
</source>
Example 3:
I am not able to paste the code here as the special character does not even display here, so I am adding an image.
The left side is the original and the right side is what I am getting in read(); the special character is missing.
Code sample (I am not sure if we can post code directly here):
https://drive.google.com/drive/folders/1WWGE7fFJPKvs5KU5f_PlwWtoqCVxTcS0?usp=sharing
The drive above has the sample xml file and the code.
"DevelopingADocumentBasedApp" is the code, just open the "DocumentBasedApp.xcodeproj", run it.
3 .Once it runs, click on Menu->File->Open and open the provided xml file.
In content.swift, Keep a break point at "print(xmlDocument!)"
Here we can see the document is modified by NSDocument, and it is different from the original
Edit:
@matt Thank you for making me understand the real problem. Initially I thought I had an issue with NSDocument's read(), but the issue is that XMLDocument() is not returning the exact data. I need to find a solution for that.
Reading is not changing your document.
You make an xml document, with XMLDocument(data:...). You are asking for a new valid XML document based on your original, and that is exactly what you get. The resulting structure is not a big string, like your original data; it is an elaborate node tree reflecting the structure of your XML. That node tree is identical to the structure described by your original. That fact does not affect in any way your ability to parse the document; indeed, it is why you are able to parse the document. If you think it does cause an inability to parse the document, your parsing code is wrong (but you didn't show that, so no more can be said).
Also note that your evidence for what is "in" the XML document is indirect; the XML document is a node tree, but the strings you display are the output of a secondary rendering into a string. That rendering representation is arbitrary and malleable; it obeys its own rules of formatting. (And again, you didn't show anything about how you obtain that rendering. Perhaps we are talking about your print statement?)
The point is, you seem to have some sort of expectation about how passing into an XMLDocument and then back out of it will "round trip" your original string in such a way that the output looks just like the original. That expectation is incorrect. That's not what XMLDocument does.
And merely reading the original data into an XMLDocument did not change the data, I can promise you that.
So don't worry, be happy; as far as the validity of your XML is concerned, everything is fine, and the data you started with has not been altered in any way.
Here's a demonstration:
let xmlstring = """
<testing>
<fun whatever="thingy" />
</testing>
"""
print(xmlstring)
let xmldata = xmlstring.data(using: .utf8)!
let xml = try? XMLDocument(data: xmldata, options: [])
print("=======")
print(xml!)
The output is:
<testing>
<fun whatever="thingy" />
</testing>
=======
<?xml version="1.0"?><testing><fun whatever="thingy"></fun></testing>
As you can see, the output from the print is not the same as the input string. But it is a valid XML representation of the original string, and that's all that matters. And the original xmlstring and xmldata that I started with are, I assure you, completely untouched.
I have a file inside my Flutter project, a simple .dart file which looks like this:
class EnLanguage implements BaseLanguage {
  @override
  Map<String, String> get language => {'test': 'test'};
}
Now my goal is to write a script which, when executed, goes through all my project files, searches for specific strings (the ones with a .tr ending) and adds them to the map in the class above (key and value are the same).
I couldn't find any way to achieve this. What does a simple script look like that can write inside my project files? I'm not asking for the whole logic, I just need a start.
Have a look at the package dcli and specifically the pack command. It does a chunk of what you need.
Not quite certain what you mean by strings ending with a .tr.
But to process each script:
var project = DartProject.self.pathToPackage;
find('*.dart', workingDirectory: project).forEach((script) {
  read(script).forEach((line) {
    if (line.contains('.tr')) {
      // Extract line...
      // Write to generated file...
    }
  });
});
http://spark.apache.org/docs/latest/sql-programming-guide.html#interoperating-with-rdds
The link shows how to turn a txt file into an RDD and then into a DataFrame.
So how do I deal with a binary file? I am asking for an example, thank you very much.
There is a similar question without an answer here: reading binary data into (py) spark DataFrame
To give more detail, I don't know how to parse the binary file. For example, I can parse a txt file into lines or words like this:
JavaRDD<Person> people = sc.textFile("examples/src/main/resources/people.txt").map(
    new Function<String, Person>() {
        public Person call(String line) throws Exception {
            String[] parts = line.split(",");
            Person person = new Person();
            person.setName(parts[0]);
            person.setAge(Integer.parseInt(parts[1].trim()));
            return person;
        }
    });
It seems that I just need an API that can parse the binary file or binary stream in a similar way:
JavaRDD<Person> people = sc.textFile("examples/src/main/resources/people.bin").map(
    new Function<String, Person>() {
        public Person call(/*stream or binary file*/) throws Exception {
            /*code to construct every row*/
            return person;
        }
    });
EDIT:
The binary file contains structured data (a relational database table; the database is a self-made one) and I know the meta information of that structured data. I plan to turn the structured data into RDD[Row].
I also have full control over the binary file, since I use the FileSystem API (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html) to write the binary stream into HDFS, and the binary file is splittable. I don't have any idea how to parse the binary file like the example code above, so I can't try anything so far.
There is a binary record reader already available for Spark (I believe it is available in 1.3.1, at least in the Scala API):
sc.binaryRecords(path: String, recordLength: Int, conf: Configuration)
It's on you, though, to convert those binary records into a format acceptable for processing.
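As a rough illustration of that conversion step, here is a minimal Scala sketch that reads fixed-length records with binaryRecords and decodes each one into a Person. The record layout (a 32-byte UTF-8 padded name followed by a 4-byte big-endian age) and the file path are made-up assumptions; substitute the layout of your own table:

import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import org.apache.spark.{SparkConf, SparkContext}

case class Person(name: String, age: Int)

object BinaryRecordsExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("binary-records"))

    // Assumed fixed record layout: 32 bytes of UTF-8 padded name + 4-byte int age.
    val recordLength = 36

    val people = sc.binaryRecords("hdfs:///examples/people.bin", recordLength).map { bytes =>
      val buf = ByteBuffer.wrap(bytes)
      val nameBytes = new Array[Byte](32)
      buf.get(nameBytes)
      val name = new String(nameBytes, StandardCharsets.UTF_8).trim
      val age = buf.getInt()
      Person(name, age)
    }

    people.take(5).foreach(println)
    sc.stop()
  }
}

From there, converting the RDD of case classes to a DataFrame works the same way as in the linked guide (for example via sqlContext.createDataFrame(people) or, with the SQL implicits in scope, people.toDF()).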
I am a Scala/PlayFramework noob here, so please be easy on me :).
I am trying to create an action (serving a GET request) so that when I enter the url in the browser, the browser should download the file. So far I have this:
def sepaCreditXml() = Action {
  val data: SepaCreditTransfer = invoiceService.sepaCredit()
  val content: HtmlFormat.Appendable = views.html.sepacredittransfer(data)
  Ok(content)
}
What it does is basically show the XML in the browser (whereas I actually want it to download the file). Also, I have two problems with it:
I am not sure if using Play's templating "views.html..." is the best idea to create an XML template. Is it good/simple enough or should I use a different solution for this?
I have found Ok.sendFile in the Play's documentation. But it needs a java.io.File. I don't know how to create a File from HtmlFormat.Appendable. I would prefer to create a file in-memory, i.e. no new File("/tmp/temporary.xml").
EDIT: Here SepaCreditTransfer is a case class holding some data. Nothing special.
I think it's quite normal for browsers to visualize XML instead of downloading it. Have you tried using the application/force-download content type header, like this?
def sepaCreditXml() = Action {
  val data: SepaCreditTransfer = invoiceService.sepaCredit()
  val content: HtmlFormat.Appendable = views.html.sepacredittransfer(data)
  Ok(content).withHeaders("Content-Type" -> "application/force-download")
}
I am using akka-camel to process files. My initial tests were working great; however, when I started passing in actual xml files it started puking with type conversions.
Here is my consumer (very simple, but puking at msg.bodyAs[String]):
class FileConsumer extends Consumer {
  def endpointUri = "file:/data/input/actor"
  val processor = context.actorOf(Props[Processor], "processor")

  def receive = {
    case msg: CamelMessage => {
      println("Parent...received %s" format msg)
      processor ! msg.bodyAs[String]
    }
  }
}
Error:
[ERROR] [04/27/2015 12:10:48.617] [ArdisSystem-akka.actor.default-dispatcher-5] [akka://ArdisSystem/user/$a] Error during type conversion from type: org.apache.camel.converter.stream.FileInputStreamCache to the required type: java.lang.String with value org.apache.camel.converter.stream.FileInputStreamCache#4611b35a due java.io.FileNotFoundException: /var/folders/dh/zfqvn9gn7cl6h63d3400y4zxp3xtzf/T/camel-tmp-807558/cos2920459202139947606.tmp (No such file or directory)
org.apache.camel.TypeConversionException: Error during type conversion from type: org.apache.camel.converter.stream.FileInputStreamCache to the required type: java.lang.String with value org.apache.camel.converter.stream.FileInputStreamCache#4611b35a due java.io.FileNotFoundException: /var/folders/dh/zfqvn9gn7cl6h63d3400y4zxp3xtzf/T/camel-tmp-807558/cos2920459202139947606.tmp (No such file or directory)
I am wondering if it has something to do with the actual contents of the xml. They are not big at all (roughly 70kb). I doubt I will be able to provide an actual example of the XML itself. I am just baffled as to why something so small is having issues being converted to a string. Other dummy example xml files have worked fine.
EDIT:
One of the suggestions I had was to enable StreamCache, which I did. However, it still doesn't seem to be working. As Ankush commented, the error is confusing. I am not sure if it actually is a Stream issue or if it really is a conversion problem.
http://camel.apache.org/stream-caching.html
Added the below
camel.context.setStreamCaching(true)
I was finally able to figure out the problem. The issue was not bad data, but the size of the files. To account for this, you need to add additional settings to the Camel context.
http://camel.apache.org/stream-caching.html
The settings I used are below. I will need to research further whether I should just turn off the stream cache, but this is a start.
camel.context.getProperties.put(CachedOutputStream.THRESHOLD, "750000");
or turn off stream caching:
camel.context.setStreamCaching(false)
Hope this helps someone else.
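For context, a minimal sketch of where these settings live when using akka-camel: the CamelContext is reached through CamelExtension, and the actor system name and threshold value below are purely illustrative:

import akka.actor.ActorSystem
import akka.camel.CamelExtension
import org.apache.camel.converter.stream.CachedOutputStream

val system = ActorSystem("ArdisSystem")
val camel = CamelExtension(system)

// Option 1: keep stream caching on, but raise the threshold (in bytes) at which
// Camel spills cached message bodies to temporary files.
camel.context.getProperties.put(CachedOutputStream.THRESHOLD, "750000")

// Option 2: turn stream caching off entirely.
camel.context.setStreamCaching(false)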
We were having the same issue; commenting out streamCaching() helped:
from(IEricssonConstant.ROUTE_USAGE_DATA_INDIVIDUAL_PROCSESS)
    //.streamCaching()
    .split(new ZipSplitter())
    .stopOnException()
    .streaming()
    .unmarshal().csv()
    .process(new UsageDataCSVRequestProcessor())