Here is an example:
/*
* Copyright 2013 ScalaFX Project
* All right reserved.
*/
package scalafx.ensemble.example.charts
import scalafx.application.JFXApp
import scalafx.scene.Scene
import scalafx.collections.ObservableBuffer
import scalafx.scene.chart.LineChart
import scalafx.scene.chart.NumberAxis
import scalafx.scene.chart.XYChart
/** A chart in which lines connect a series of data points. Useful for viewing
* data trends over time.
*
* @see scalafx.scene.chart.LineChart
* @see scalafx.scene.chart.Chart
* @see scalafx.scene.chart.Axis
* @see scalafx.scene.chart.NumberAxis
* @related charts/AreaChart
* @related charts/ScatterChart
*/
object BasicLineChart extends JFXApp {
stage = new JFXApp.PrimaryStage {
title = "Line Chart Example"
scene = new Scene {
root = {
val xAxis = NumberAxis("Values for X-Axis", 0, 3, 1)
val yAxis = NumberAxis("Values for Y-Axis", 0, 3, 1)
// Helper function to convert a tuple to `XYChart.Data`
val toChartData = (xy: (Double, Double)) => XYChart.Data[Number, Number](xy._1, xy._2)
val series1 = new XYChart.Series[Number, Number] {
name = "Series 1"
data = Seq(
(0.0, 1.0),
(1.2, 1.4),
(2.2, 1.9),
(2.7, 2.3),
(2.9, 0.5)).map(toChartData)
}
val series2 = new XYChart.Series[Number, Number] {
name = "Series 2"
data = Seq(
(0.0, 1.6),
(0.8, 0.4),
(1.4, 2.9),
(2.1, 1.3),
(2.6, 0.9)).map(toChartData)
}
new LineChart[Number, Number](xAxis, yAxis, ObservableBuffer(series1, series2))
}
}
}
}
object Main {
BasicLineChart.main(Array(""))
}
When I send the line BasicLineChart.main(Array("")) to the console, a JavaFX window shows up with a line chart in it, and the console is blocked. When I close the chart window, I regain access to the Scala console. When I try to open the same window again, I get an error:
scala> BasicLineChart.main(Array(""))
java.lang.IllegalStateException: Application launch must not be called more than once
at com.sun.javafx.application.LauncherImpl.launchApplication(LauncherImpl.java:162)
at com.sun.javafx.application.LauncherImpl.launchApplication(LauncherImpl.java:143)
at javafx.application.Application.launch(Application.java:191)
at scalafx.application.JFXApp$class.main(JFXApp.scala:242)
at BasicLineChart$.main(<console>:23)
... 35 elided
So I have two questions:
1) How can I launch a JavaFX app from the console without blocking it?
2) How can I avoid the above error?
Update 1
Following some advice from freenode, I changed the BasicLineChart into a class and did this:
object Main {
val x = new BasicLineChart()
x.main(Array(""))
val y = new BasicLineChart()
y.main(Array(""))
}
Still got the same error.
On question 2: from a quick look, JFXApp calls through to javafx.application.Application.launch. The documentation for Application describes the life cycle and states that launch must not be called more than once. Basically, JFXApp expects to be the entry point for a whole application, so its main shouldn't be called multiple times.
If you want to be able to quickly relaunch your app, I would consider just running it from SBT using run or runMain rather than using the console.
On question 1: if you do decide to run from SBT, you should be able to fork the run into a separate JVM. The SBT docs have the details; specifically, try adding fork in run := true to build.sbt.
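As a rough sketch of that setting, a build.sbt fragment could look like the following. The comments are explanatory only, and the slash form is mentioned on the assumption that the build uses a recent sbt 1.x release.

```scala
// build.sbt (fragment)

// Fork a new JVM for `run` / `runMain`, so that every run starts with a fresh
// JavaFX runtime and Application.launch is called only once per process.
fork in run := true

// Newer sbt 1.x builds usually spell the same setting with the slash syntax:
// run / fork := true
```

With this in place, `sbt run` can be repeated as often as needed, because each invocation gets its own process.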
I want to move all files under a directory in my s3 bucket to another directory within the same bucket, using scala.
Here is what I have:
def copyFromInputFilesToArchive(spark: SparkSession) : Unit = {
val sourcePath = new Path("s3a://path-to-source-directory/")
val destPath = new Path("s3a://path-to-destination-directory/")
val fs = sourcePath.getFileSystem(spark.sparkContext.hadoopConfiguration)
fs.moveFromLocalFile(sourcePath,destPath)
}
I get this error:
fs.copyFromLocalFile returns Wrong FS: s3a:// expected file:///
Error explained
The error you are seeing occurs because moveFromLocalFile (like copyFromLocalFile) is meant for moving files from the local filesystem to S3, so it expects a file:/// source. You are trying to "move" files that are already in S3.
It is important to note that directories don't really exist in Amazon S3 buckets. The folder/file hierarchy you see is really just part of each object's key: all objects sit in one flat, single-level container, and the "/" separators in the key give the illusion of files and folders.
To "move" a file in a bucket, what you really need to do is give the object a new key. S3 has no rename operation, so this means copying the object to the new key and then deleting the original.
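Since both paths in the question are already on S3, a move can also stay within the Hadoop FileSystem API that the question uses. The sketch below is only an illustration under assumptions: moveFiles is a hypothetical helper name, the bucket and prefixes in the usage comment are placeholders, and it relies on FileSystem.rename, which the s3a connector carries out as a copy followed by a delete.

```scala
import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper: moves every file directly under sourceDir into destDir.
// Subdirectories are skipped; extend with recursion if they matter.
def moveFiles(fs: FileSystem, sourceDir: Path, destDir: Path): Unit = {
  fs.mkdirs(destDir) // harmless if the destination already exists
  fs.listStatus(sourceDir).filter(_.isFile).foreach { status =>
    val target = new Path(destDir, status.getPath.getName)
    // rename returns false instead of throwing for some failures, so check it.
    if (!fs.rename(status.getPath, target))
      throw new IOException(s"Could not move ${status.getPath} to $target")
  }
}

// Usage from the question's Spark job (placeholder bucket and prefixes):
// val src = new Path("s3a://my-bucket/input/")
// val fs  = src.getFileSystem(spark.sparkContext.hadoopConfiguration)
// moveFiles(fs, src, new Path("s3a://my-bucket/archive/"))
```

Taking the FileSystem as a parameter keeps the helper independent of Spark and lets it be tried against a local directory first.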
How to do a "move" within a bucket with Scala
To accomplish this, copy the original object to its new key and, once the copy has succeeded, delete the source object. Large objects are copied in parts with a multipart copy.
Try something like this (from datahackr):
/**
* Copy object from a key to another in multiparts
*
* @param sourceS3Path S3 object key
* @param targetS3Path S3 object key
* @param fromS3BucketName bucket name
* @param toS3BucketName bucket name
*/
@throws(classOf[Exception])
@throws(classOf[AmazonServiceException])
def copyMultipart(sourceS3Path: String, targetS3Path: String, fromS3BucketName: String, toS3BucketName: String) {
// Create a list of ETag objects. You retrieve ETags for each object part uploaded,
// then, after each individual part has been uploaded, pass the list of ETags to
// the request to complete the upload.
var partETags = new ArrayList[PartETag]();
// Initiate the multipart upload.
val initRequest = new InitiateMultipartUploadRequest(toS3BucketName, targetS3Path);
val initResponse = s3client.initiateMultipartUpload(initRequest);
// Get the object size to track the end of the copy operation.
var metadataResult = getS3ObjectMetadata(sourceS3Path, fromS3BucketName);
var objectSize = metadataResult.getContentLength();
// Copy the object using 50 MB parts.
val partSize = (50 * 1024 * 1024) * 1L;
var bytePosition = 0L;
var partNum = 1;
var copyResponses = new ArrayList[CopyPartResult]();
while (bytePosition < objectSize) {
// The last part might be smaller than partSize, so check to make sure
// that lastByte isn't beyond the end of the object.
val lastByte = Math.min(bytePosition + partSize - 1, objectSize - 1);
// Copy this part.
val copyRequest = new CopyPartRequest()
.withSourceBucketName(fromS3BucketName)
.withSourceKey(sourceS3Path)
.withDestinationBucketName(toS3BucketName)
.withDestinationKey(targetS3Path)
.withUploadId(initResponse.getUploadId())
.withFirstByte(bytePosition)
.withLastByte(lastByte)
.withPartNumber(partNum);
partNum += 1;
copyResponses.add(s3client.copyPart(copyRequest));
bytePosition += partSize;
}
// Complete the upload request to concatenate all uploaded parts and make the copied object available.
val completeRequest = new CompleteMultipartUploadRequest(
toS3BucketName,
targetS3Path,
initResponse.getUploadId(),
getETags(copyResponses));
s3client.completeMultipartUpload(completeRequest);
logger.info("Multipart upload complete.");
}
// This is a helper function to construct a list of ETags.
def getETags(responses: java.util.List[CopyPartResult]): ArrayList[PartETag] = {
var etags = new ArrayList[PartETag]();
val it = responses.iterator();
while (it.hasNext()) {
val response = it.next();
etags.add(new PartETag(response.getPartNumber(), response.getETag()));
}
return etags;
}
def moveObject(sourceS3Path: String, targetS3Path: String, fromBucketName: String, toBucketName: String) {
logger.info(s"Moving S3 file from $sourceS3Path ==> $targetS3Path")
// Get the object size to track the end of the copy operation.
var metadataResult = getS3ObjectMetadata(sourceS3Path, fromBucketName);
var objectSize = metadataResult.getContentLength();
if (objectSize > ALLOWED_OBJECT_SIZE) {
logger.info("Object size is greater than 1GB. Initiating multipart upload.");
copyMultipart(sourceS3Path, targetS3Path, fromBucketName, toBucketName);
} else {
s3client.copyObject(fromBucketName, sourceS3Path, toBucketName, targetS3Path);
}
// Delete source object after successful copy
s3client.deleteObject(fromBucketName, sourceS3Path);
}
You will need the AWS SDK for this.
If you are using AWS SDK version 1:
libraryDependencies ++= Seq(
"com.amazonaws" % "aws-java-sdk-s3" % "1.12.248"
)
import com.amazonaws.services.s3.transfer.{ Copy, TransferManager, TransferManagerBuilder }
val transferManager: TransferManager =
TransferManagerBuilder.standard().build()
def copyFile(): Unit = {
val copy: Copy =
transferManager.copy(
"source-bucket-name", "source-file-key",
"destination-bucket-name", "destination-file-key"
)
copy.waitForCompletion()
}
If you are using AWS SDK version 2:
libraryDependencies ++= Seq(
"software.amazon.awssdk" % "s3" % "2.17.219",
"software.amazon.awssdk" % "s3-transfer-manager" % "2.17.219-PREVIEW"
)
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.s3.model.CopyObjectRequest
import software.amazon.awssdk.transfer.s3.{Copy, CopyRequest, S3ClientConfiguration, S3TransferManager}
// change Region.US_WEST_2 to your required region
// or it might even work without the whole `.region(Region.US_WEST_2)` thing
val s3ClientConfig: S3ClientConfiguration =
S3ClientConfiguration
.builder()
.region(Region.US_WEST_2)
.build()
val s3TransferManager: S3TransferManager =
S3TransferManager.builder().s3ClientConfiguration(s3ClientConfig).build()
def copyFile(): Unit = {
val copyObjectRequest: CopyObjectRequest =
CopyObjectRequest
.builder()
.sourceBucket("source-bucket-name")
.sourceKey("source-file-key")
.destinationBucket("destination-bucket-name")
.destinationKey("destination-file-key")
.build()
val copyRequest: CopyRequest =
CopyRequest
.builder()
.copyObjectRequest(copyObjectRequest)
.build()
val copy: Copy =
s3TransferManager.copy(copyRequest)
copy.completionFuture().get()
}
Keep in mind that you will need AWS credentials with appropriate permissions for both the source and destination objects. You just need to get the credentials and make them available as the following environment variables (AWS_SESSION_TOKEN is only needed for temporary credentials).
export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
export AWS_SESSION_TOKEN=your_session_token
Also, "source-file-key" and "destination-file-key" should be the full path of the file in the bucket.
I am trying to execute the Data Generator function provided by Microsoft to test streaming data to Event Hubs.
Unfortunately, I keep on getting the error
Processing failure: No such file or directory
When I try and execute the function:
%scala
DummyDataGenerator.start(15)
Can someone take a look at the code and help decipher why I'm getting the error:
class DummyDataGenerator:
    streamDirectory = "/FileStore/tables/flight"

None # suppress output
I'm not sure how the above cell gets called into the function DummyDataGenerator
%scala
import scala.util.Random
import java.io._
import java.time._
// Notebook #2 has to set this to 8, we are setting
// it to 200 to "restore" the default behavior.
spark.conf.set("spark.sql.shuffle.partitions", 200)
// Make the username available to all other languages.
// "WARNING: use of the "current" username is unpredictable
// when multiple users are collaborating and should be replaced
// with the notebook ID instead.
val username = com.databricks.logging.AttributionContext.current.tags(com.databricks.logging.BaseTagDefinitions.TAG_USER);
spark.conf.set("com.databricks.training.username", username)
object DummyDataGenerator extends Runnable {
var runner : Thread = null;
val className = getClass().getName()
val streamDirectory = s"dbfs:/tmp/$username/new-flights"
val airlines = Array( ("American", 0.17), ("Delta", 0.12), ("Frontier", 0.14), ("Hawaiian", 0.13), ("JetBlue", 0.15), ("United", 0.11), ("Southwest", 0.18) )
val reasons = Array("Air Carrier", "Extreme Weather", "National Aviation System", "Security", "Late Aircraft")
val rand = new Random(System.currentTimeMillis())
var maxDuration = 3 * 60 * 1000 // default to three minutes
def clean() {
System.out.println("Removing old files for dummy data generator.")
dbutils.fs.rm(streamDirectory, true)
if (dbutils.fs.mkdirs(streamDirectory) == false) {
throw new RuntimeException("Unable to create temp directory.")
}
}
def run() {
val date = LocalDate.now()
val start = System.currentTimeMillis()
while (System.currentTimeMillis() - start < maxDuration) {
try {
val dir = s"/dbfs/tmp/$username/new-flights"
val tempFile = File.createTempFile("flights-", "", new File(dir)).getAbsolutePath()+".csv"
val writer = new PrintWriter(tempFile)
for (airline <- airlines) {
val flightNumber = rand.nextInt(1000)+1000
val deptTime = rand.nextInt(10)+10
val departureTime = LocalDateTime.now().plusHours(-deptTime)
val (name, odds) = airline
val reason = Random.shuffle(reasons.toList).head
val test = rand.nextDouble()
val delay = if (test < odds)
rand.nextInt(60)+(30*odds)
else rand.nextInt(10)-5
println(s"- Flight #$flightNumber by $name at $departureTime delayed $delay minutes due to $reason")
writer.println(s""" "$flightNumber","$departureTime","$delay","$reason","$name" """.trim)
}
writer.close()
// wait a couple of seconds
//Thread.sleep(rand.nextInt(5000))
} catch {
case e: Exception => {
printf("* Processing failure: %s%n", e.getMessage())
return;
}
}
}
println("No more flights!")
}
def start(minutes:Int = 5) {
maxDuration = minutes * 60 * 1000
if (runner != null) {
println("Stopping dummy data generator.")
runner.interrupt();
runner.join();
}
println(s"Running dummy data generator for $minutes minutes.")
runner = new Thread(this);
runner.run();
}
def stop() {
start(0)
}
}
DummyDataGenerator.clean()
displayHTML("Imported streaming logic...") // suppress output
You should be able to use the Databricks Labs Data Generator on the Databricks community edition. I'm providing the instructions below:
Running Databricks Labs Data Generator on the community edition
The Databricks Labs Data Generator is a PySpark library, so the code to generate the data needs to be Python. But you should be able to create a view on the generated data and consume it from Scala if that's your preferred language.
You can install the framework on the Databricks community edition by creating a notebook with the cell
%pip install git+https://github.com/databrickslabs/dbldatagen
Once it's installed, you can use the library to define a data generation spec and then call build on it to generate a Spark dataframe.
The following example shows generation of batch data similar to the data set you are trying to generate. This should be placed in a separate notebook cell
Note: here we generate 10 million records to illustrate the ability to create larger data sets. The library can generate datasets much larger than that.
%python
import dbldatagen as dg
num_rows = 10 * 1000000 # number of rows to generate
num_partitions = 8 # number of Spark dataframe partitions
delay_reasons = ["Air Carrier", "Extreme Weather", "National Aviation System", "Security", "Late Aircraft"]
# will have implied column `id` for ordinal of row
flightdata_defn = (dg.DataGenerator(spark, name="flight_delay_data", rows=num_rows, partitions=num_partitions)
.withColumn("flightNumber", "int", minValue=1000, uniqueValues=10000, random=True)
.withColumn("airline", "string", minValue=1, maxValue=500, prefix="airline", random=True, distribution="normal")
.withColumn("original_departure", "timestamp", begin="2020-01-01 01:00:00", end="2020-12-31 23:59:00", interval="1 minute", random=True)
.withColumn("delay_minutes", "int", minValue=20, maxValue=600, distribution=dg.distributions.Gamma(1.0, 2.0))
.withColumn("delayed_departure", "timestamp", expr="cast(original_departure as bigint) + (delay_minutes * 60) ", baseColumn=["original_departure", "delay_minutes"])
.withColumn("reason", "string", values=delay_reasons, random=True)
)
df_flight_data = flightdata_defn.build()
display(df_flight_data)
You can find information on how to generate streaming data in the online documentation at https://databrickslabs.github.io/dbldatagen/public_docs/using_streaming_data.html
You can create a named temporary view over the data so that you can access it from SQL or Scala using one of two methods:
1: use createOrReplaceTempView
df_flight_data.createOrReplaceTempView("delays")
2: use options for build. In this case the name passed to the DataGenerator initializer will be the name of the view
i.e.
df_flight_data = flightdata_defn.build(withTempView=True)
This code will not work on the community edition because of this line:
val dir = s"/dbfs/tmp/$username/new-flights"
as there is no DBFS FUSE mount on the Databricks community edition (it's supported only on full Databricks). It should be possible to make it work by:
1: changing that directory to a local directory, such as /tmp
2: adding code (after writer.close()) to list the flights-* files in that local directory, and using dbutils.fs.mv to move them into streamDirectory
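The two steps above could be sketched roughly as follows. This is an illustration under assumptions, not a tested fix: publishCsvFiles is a hypothetical helper, /tmp/new-flights is an arbitrary driver-local directory, and the move function is passed in so that the notebook can supply dbutils.fs.mv. Only the .csv files are moved, because the generator's File.createTempFile call also leaves behind an empty file without the extension.

```scala
import java.io.File

// Hypothetical helper: finds the generated flights-*.csv files in a local
// directory and hands each one to `move` as (source URI, target path).
// Returns the target paths that were requested.
def publishCsvFiles(localDir: File, targetDir: String, move: (String, String) => Unit): Seq[String] = {
  val finished = Option(localDir.listFiles()).getOrElse(Array.empty[File])
    .filter(f => f.isFile && f.getName.startsWith("flights-") && f.getName.endsWith(".csv"))
    .toSeq
  finished.map { f =>
    val target = s"$targetDir/${f.getName}"
    move(s"file:${f.getAbsolutePath}", target) // "file:" marks a driver-local path
    target
  }
}

// In the notebook, with `val dir = "/tmp/new-flights"` in run(), after writer.close():
// publishCsvFiles(new File("/tmp/new-flights"), streamDirectory,
//                 (from, to) => dbutils.fs.mv(from, to))
```

The local directory has to exist before File.createTempFile is called, so a new File(dir).mkdirs() in clean() or at the top of run() would be needed as well.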
I have an easy task to accomplish: read a password from a command line prompt without exposing it. I know that there is java.io.Console.readPassword; however, there are times when you cannot access the console, for example when you are running your app from an IDE (such as IntelliJ).
I stumbled upon this Password Masking in the Java Programming Language tutorial, which looks nice, but I fail to implement it in Scala. So far my solution is:
import scala.io.StdIn
class EraserThread() extends Runnable {
private var stop = false
override def run(): Unit = {
stop = true
while ( stop ) {
System.out.print("\010*")
try
Thread.sleep(1)
catch {
case ie: InterruptedException =>
ie.printStackTrace()
}
}
}
def stopMasking(): Unit = {
this.stop = false
}
}
val et = new EraserThread()
val mask = new Thread(et)
mask.start()
val password = StdIn.readLine("Password: ")
et.stopMasking()
When I start this snippet, I get a continuous printing of asterisks on new lines. E.g.:
*
*
*
*
Is there anything specific to Scala that explains why this is not working? Or is there a better way to do this in Scala in general?
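One possible direction, sketched here under assumptions rather than offered as a verified answer, is to use java.io.Console.readPassword whenever a console is attached and to fall back to a plain prompt otherwise. readPassword is a hypothetical helper name, and the fallback branch does echo the input, so it only suits development runs inside an IDE.

```scala
import scala.io.StdIn

// Hypothetical helper: hides the input when a real console is attached and
// falls back to a visible prompt when System.console() is null (typical in IDEs).
def readPassword(prompt: String,
                 console: Option[java.io.Console] = Option(System.console())): String =
  console match {
    case Some(c) => new String(c.readPassword("%s", prompt)) // input is not echoed
    case None    => StdIn.readLine("%s", prompt)             // input IS echoed
  }

// val password = readPassword("Password: ")
```

Passing the console as a parameter keeps the helper easy to exercise without a terminal.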
I'm relatively new to wxPython and would really appreciate any help you can offer. Basically, I'm having trouble closing the loop between
1) filling a list called ListOfFiles in my OnDropFiles method below and
2) refreshing the FileList so that it displays the items in ListOfFiles.
I know that if you call
FileWindow(None, -1, 'List of Files and Actions')
right at the end of OnDropFiles, it inits a new frame and draws from ListOfFiles when populating the FileList listctrl... but I was hoping there would be a way to update in the same window. I've tried noodling around with Layout() and calling various methods on my FileWindowObject... but there's been no success.
Thanks so much for your help. I think the answer you give me might lead to a real breakthrough in my understanding of wxpython.
#!/usr/bin/env python
import wx
import sys
import traceback
import time

APP_EXIT = 1
ListOfFiles = []

class FileDrop(wx.FileDropTarget): #This is the file drop target
    def __init__(self, window):
        wx.FileDropTarget.__init__(self) #File Drop targets are subsets of windows
        self.window = window

    def OnDropFiles(self, x, y, filenames): #FileDropTarget now fills in the ListOfFiles
        for DragAndDropFile in filenames:
            ListOfFiles.append(DragAndDropFile) #We simply append to the bottom of our list of files.

class FileWindow(wx.Frame):
    def __init__(self, parent, id, title): #This will initiate with an id and a title
        wx.Frame.__init__(self, parent, id, title, size=(300, 300))
        hbox = wx.BoxSizer(wx.HORIZONTAL) #These are layout items
        panel = wx.Panel(self, -1) #These are layout items
        self.FileList = wx.ListCtrl(panel, -1, style=wx.LC_REPORT) #This builds the list control box
        DropTarget = FileDrop(self.FileList) #Establish the listctrl as a drop target
        self.FileList.SetDropTarget(DropTarget) #Make drop target.
        self.FileList.InsertColumn(0,'Filename',width=140) #Here we build the columns
        for i in ListOfFiles: #Fill up listctrl starting with list of working files
            InsertedItem = self.FileList.InsertStringItem(sys.maxint, i) #Here we insert an item at the bottom of the list
        hbox.Add(self.FileList, 1, wx.EXPAND)
        panel.SetSizer(hbox)
        self.Show(True)

def main():
    ex = wx.App(redirect = True, filename = time.strftime("%Y%m%d%H%M%S.txt"))
    FileWindowObject = FileWindow(None, -1, 'List of Files and Actions')
    ex.MainLoop()

if __name__ == '__main__':
    main() #Execute function
The problem is that all you're doing is adding items to a list, not to the ListCtrl itself. You need to subclass wx.ListCtrl and add an update method of some sort. Then you would call that update method instead of appending to a list you don't use anywhere. Here's one way to do it:
import wx
import time

########################################################################
class MyListCtrl(wx.ListCtrl):
    """"""

    #----------------------------------------------------------------------
    def __init__(self, parent):
        """Constructor"""
        wx.ListCtrl.__init__(self, parent, style=wx.LC_REPORT)
        self.index = 0

    #----------------------------------------------------------------------
    def dropUpdate(self, path):
        """"""
        self.InsertStringItem(self.index, path)
        self.index += 1

class FileDrop(wx.FileDropTarget): #This is the file drop target
    def __init__(self, window):
        wx.FileDropTarget.__init__(self) #File Drop targets are subsets of windows
        self.window = window

    def OnDropFiles(self, x, y, filenames): #FileDropTarget now fills in the ListOfFiles
        for DragAndDropFile in filenames:
            self.window.dropUpdate(DragAndDropFile) # update list control

class FileWindow(wx.Frame):
    def __init__(self, parent, id, title): #This will initiate with an id and a title
        wx.Frame.__init__(self, parent, id, title, size=(300, 300))
        hbox = wx.BoxSizer(wx.HORIZONTAL) #These are layout items
        panel = wx.Panel(self, -1) #These are layout items
        self.FileList = MyListCtrl(panel) #This builds the list control box
        DropTarget = FileDrop(self.FileList) #Establish the listctrl as a drop target
        self.FileList.SetDropTarget(DropTarget) #Make drop target.
        self.FileList.InsertColumn(0,'Filename',width=140) #Here we build the columns
        hbox.Add(self.FileList, 1, wx.EXPAND)
        panel.SetSizer(hbox)
        self.Show(True)

def main():
    ex = wx.App(redirect = True, filename = time.strftime("%Y%m%d%H%M%S.txt"))
    FileWindowObject = FileWindow(None, -1, 'List of Files and Actions')
    ex.MainLoop()

if __name__ == '__main__':
    main()
I have a GUI I've created in Scala. It's very simple, but I would like to modify the DSLOutput object from outside of DSLGUI. Does anyone know how I can call DSLOutput.append() from outside of DSLGUI? I've tried importing DSLGUI, but I can't seem to figure out how to access DSLOutput.
package api
import swing._
import event._
object DSLGUI extends SimpleSwingApplication{
def top = new MainFrame{
title = "Computer Repair Advisory System"
object Commands extends TextField(columns = 50)
object DSLOutput extends TextArea(rows = 15, columns = 50)
object SendCommand extends Button("Send")
val CommandPanel = new FlowPanel{
contents += Commands
contents += SendCommand
}
contents = new BoxPanel(Orientation.Vertical){
contents +=CommandPanel
contents += DSLOutput
}
listenTo(SendCommand)
reactions += {
case ButtonClicked(SendCommand) =>
DSLOutput append "Test "
}
}
}
You would have to declare it in the scope of DSLGUI, rather than as a local object within your top method. Then you can access it with DSLGUI.DSLOutput.
i.e.
object DSLGUI extends SimpleSwingApplication {
object DSLOutput extends TextArea(rows = 15, columns = 50)
def top = new MainFrame {
...
}
}
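To illustrate the call site, here is a trimmed-down sketch of how another object might append to the text area once it lives in the scope of DSLGUI. Advisor and report are hypothetical names, the frame is reduced to the text area for brevity, and wrapping the call in Swing.onEDT is a precaution assumed here for calls coming from non-GUI threads.

```scala
import scala.swing._

object DSLGUI extends SimpleSwingApplication {
  // Declared on the application object, so it is reachable as DSLGUI.DSLOutput.
  object DSLOutput extends TextArea(rows = 15, columns = 50)

  def top = new MainFrame {
    title = "Computer Repair Advisory System"
    contents = DSLOutput
  }
}

// Hypothetical caller living outside DSLGUI.
object Advisor {
  def report(msg: String): Unit =
    Swing.onEDT { DSLGUI.DSLOutput.append(msg + "\n") } // update on the event dispatch thread
}
```

From any other code, Advisor.report("Check the power cable") would then add a line to the output area.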