Drop table using Pyspark

The SparkSession.catalog object has a bunch of methods to interact with the metastore, namely:
['cacheTable',
'clearCache',
'createExternalTable',
'createTable',
'currentDatabase',
'dropGlobalTempView',
'dropTempView',
'isCached',
'listColumns',
'listDatabases',
'listFunctions',
'listTables',
'recoverPartitions',
'refreshByPath',
'refreshTable',
'registerFunction',
'setCurrentDatabase',
'uncacheTable']
Unfortunately, there seems to be no programmatic way to drop a table.
There are several ways to achieve this, such as
spark.sql("drop table my_table")
or
spark._jsparkSession.sharedState().externalCatalog().dropTable(db, table, True, True)
but they look a bit hackish compared to a simple, yet missing, dropTable method.
Is there a better way?

AFAIK, the approaches mentioned above are the most commonly used ones; I am not aware of any other.
One alternative I can see from the docs:
you might try org.apache.spark.sql.hive.HiveUtils, which has goodies (such as dropping tables) for you.
I am not so good at Python, so see the Scala example below and follow the same approach in Python.
// HiveUtils is not part of the public API, so this helper is placed inside Spark's own package
package org.apache.spark.sql.hive {
  import org.apache.spark.sql.hive.HiveUtils
  import org.apache.spark.SparkContext

  object utils {
    def dropTable(sc: SparkContext, dbName: String, tableName: String,
                  ignoreIfNotExists: Boolean, purge: Boolean): Unit = {
      HiveUtils
        .newClientForMetadata(sc.getConf, sc.hadoopConfiguration)
        // pass the caller's purge flag through instead of a hard-coded false
        .dropTable(dbName, tableName, ignoreIfNotExists, purge)
    }
  }
}
The caller would look like this:
import org.apache.spark.sql.hive.utils
utils.dropTable(sc, "default", "my_table", true, true)
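For the plain SQL route mentioned in the question, a small Python wrapper can hide the string building. This is only a sketch: the helper names (quote_identifier, drop_table_sql, drop_table) are made up for illustration, and it assumes `spark` is an existing SparkSession and that the Spark SQL `DROP TABLE [IF EXISTS] name [PURGE]` form fits your metastore.

```python
def quote_identifier(name):
    """Wrap a name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def drop_table_sql(db, table, if_exists=True, purge=False):
    """Build a Spark SQL DROP TABLE statement for db.table."""
    parts = ["DROP TABLE"]
    if if_exists:
        parts.append("IF EXISTS")
    parts.append(quote_identifier(db) + "." + quote_identifier(table))
    if purge:
        parts.append("PURGE")
    return " ".join(parts)


def drop_table(spark, db, table, if_exists=True, purge=False):
    """Run the statement through a SparkSession supplied by the caller."""
    spark.sql(drop_table_sql(db, table, if_exists, purge))
```

With a live session this would be called as drop_table(spark, "default", "my_table"). It is still spark.sql underneath, just wrapped so that the call site reads like the missing catalog method.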

Related

LibreOffice Tree with columns

I am writing an extension for LibreOffice.
A tree with columns on my sidebar is needed. (example - https://doc.qt.io/qt-5/qtwidgets-itemviews-simpletreemodel-example.html)
I found information about Tree Control and module "tree", e.g. here
https://wiki.openoffice.org/wiki/Treecontrol
https://www.openoffice.org/api/docs/common/ref/com/sun/star/awt/tree/module-ix.html
But I couldn't find anything about writing a tree with columns.
The article "Tree control" says "You can provide your own model which must at least support the interface com.sun.star.awt.XTreeModel.", but I also couldn't find any information about providing my own models.
Please help me find information and examples on whether it is possible to provide a tree with columns in a LibreOffice extension.

Here is some Python-UNO code (as tagged in your question) that shows how to implement the XTreeDataModel UNO interface. You'll have to write a lot more code in order to render the nodes in multiple columns and do everything else you want. It may be required to create another class that implements XTreeNode.
import uno
import unohelper
from com.sun.star.awt.tree import XTreeDataModel

def myTree():
    document = XSCRIPTCONTEXT.getDocument()
    ctx = XSCRIPTCONTEXT.getComponentContext()
    smgr = ctx.getServiceManager()
    dlgprov = smgr.createInstanceWithArgumentsAndContext(
        "com.sun.star.awt.DialogProvider", (document,), ctx)
    dlg = dlgprov.createDialog(
        "vnd.sun.star.script:Standard.Dialog1?location=application")
    treeCtrl = dlg.getControl("TreeControl1")
    treeModel = treeCtrl.getModel()
    mutableTreeDataModel = smgr.createInstanceWithContext(
        "com.sun.star.awt.tree.MutableTreeDataModel", ctx)
    rootNode = mutableTreeDataModel.createNode("Root", True)
    mutableTreeDataModel.setRoot(rootNode)
    myTree = MyTreeDataModel(rootNode)
    model = mutableTreeDataModel
    childNode1 = model.createNode("Parent 1", True)
    rootNode.appendChild(childNode1)
    subChildNode = model.createNode("Child 1", True)
    childNode1.appendChild(subChildNode)
    treeModel.setPropertyValue("DataModel", myTree)
    dlg.execute()
    dlg.dispose()

class MyTreeDataModel(unohelper.Base, XTreeDataModel):
    def __init__(self, root):
        self.rootNode = root
    def getRoot(self):
        return self.rootNode
    def addTreeDataModelListener(self, listener):
        pass
    def removeTreeDataModelListener(self, listener):
        pass
More information for working with trees is at https://wiki.openoffice.org/wiki/Going_further_with_Dialog_and_Component#The_New_Tree_Control.
If it turns out that there is no convenient way to do this directly with UNO, I once did this with a JTreeTable in Java. LibreOffice extensions can be written in Java, so perhaps that would solve your needs instead.

Wicket NumberTextField in Kotlin throws ClassCastException when submitted

I'm having some issues with a Wicket (8.0.0-M4) NumberTextField in Kotlin (1.1.0).
My stripped-down form looks like this:
class Test : AbstractWebPage() {
    val housenumberModel: Model<Int> = Model<Int>()
    val housenumber = NumberTextField<Int>("housenumberModel", housenumberModel)
    val form: Form<Unit> = object : Form<Unit>("adressForm") {}
    override fun onInitialize() {
        super.onInitialize()
        form.add(housenumber.setRequired(false))
        form.add(object : SubmitLink("submit") {
            override fun onSubmit() {
                super.onSubmit()
                println(housenumberModel.`object`) // this is line 28
            }
        })
        add(form)
    }
}
After submitting the form I get the following stacktrace:
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
at com.mycompany.test.pages.Test$onInitialize$1.onSubmit(Test.kt:28)
at org.apache.wicket.markup.html.form.Form.delegateSubmit(Form.java:1312)
at org.apache.wicket.markup.html.form.Form.process(Form.java:979)
at org.apache.wicket.markup.html.form.Form.onFormSubmitted(Form.java:802)
at org.apache.wicket.markup.html.form.Form.onRequest(Form.java:715)
at org.apache.wicket.core.request.handler.ListenerRequestHandler.internalInvoke(ListenerRequestHandler.java:301)
at org.apache.wicket.core.request.handler.ListenerRequestHandler.invoke(ListenerRequestHandler.java:250)
at org.apache.wicket.core.request.handler.ListenerRequestHandler.invokeListener(ListenerRequestHandler.java:210)
at org.apache.wicket.core.request.handler.ListenerRequestHandler.respond(ListenerRequestHandler.java:203)
at org.apache.wicket.request.cycle.RequestCycle$HandlerExecutor.respond(RequestCycle.java:912)
at org.apache.wicket.request.RequestHandlerExecutor.execute(RequestHandlerExecutor.java:65)
at org.apache.wicket.request.cycle.RequestCycle.execute(RequestCycle.java:283)
at org.apache.wicket.request.cycle.RequestCycle.processRequest(RequestCycle.java:253)
at org.apache.wicket.request.cycle.RequestCycle.processRequestAndDetach(RequestCycle.java:221)
at org.apache.wicket.protocol.http.WicketFilter.processRequestCycle(WicketFilter.java:262)
at org.apache.wicket.protocol.http.WicketFilter.processRequest(WicketFilter.java:204)
at org.apache.wicket.protocol.http.WicketFilter.doFilter(WicketFilter.java:286)
[...]
If I use
val housenumberModel: Model<Int> = Model.of(0)
instead of
val housenumberModel: Model<Int> = Model<Int>()
everything works fine. But since my NumberTextField is optional I don't want to have it pre-initialized with 0.
My colleagues and I tried changing the type signature of the Model in every way we could imagine but found no solution. A co-worker suggested writing a custom Wicket converter, since Kotlin's Int is represented as a primitive type (from the docs: "On the JVM, non-nullable values of this type are represented as values of the primitive type int."). Even though I don't know yet whether this would work, it seems like overkill to me.
Another hack I could think of: writing some JavaScript to delete the zero from the input field. Also not really something I would want to do.
Question: Is there a simple solution to my problem?
(And as a bonus-question: has already anyone written a larger Wicket application in Kotlin and could tell me if this combination is ready for prime time to develop a critical project with this stack or is my problem just the tip of the iceberg?)
[edit]
Solution as pointed out by svenmeier:
Using
val housenumber = NumberTextField<Int>("housenumberModel", housenumberModel, Int::class.java)
works.
Or as an alternative:
val housenumbervalue: Int? = null
val housenumberModel: IModel<Int> = PropertyModel<Int>(this, "housenumbervalue")
val housenumber = NumberTextField<Int>("housenumberModel", housenumberModel)

Because of type erasure your NumberTextField cannot detect the generic type parameter of your model. Since your model object is null, it cannot be used to derive the type either.
In this case Wicket assumes a String model object type :/.
Either provide the type to the NumberTextField explicitly, or use a model that keeps its generic information, e.g. a PropertyModel.

There is a way to tell Wicket which type you want: pass the type to the constructor. More here.
In Java it looks like this:
new NumberTextField<Integer>("housenumberModel", housenumberModel, Integer.class);

Compress Output Scalding / Cascading TsvCompressed

People, myself included, have been having problems compressing the output of Scalding jobs. After googling I get the odd whiff of an answer in some obscure forum somewhere, but nothing suitable for people's copy-and-paste needs.
I would like an output that works like Tsv but writes compressed output.

Anyway after much faffification I managed to write a TsvCompressed output which seems to do the job (you still need to set the hadoop job system configuration properties, i.e. set compress to true, and set the codec to something sensible or it defaults to crappy deflate)
import com.twitter.scalding._
import cascading.tuple.Fields
import cascading.scheme.local
import cascading.scheme.hadoop.{TextLine, TextDelimited}
import cascading.scheme.Scheme
import org.apache.hadoop.mapred.{OutputCollector, RecordReader, JobConf}

case class TsvCompressed(p: String) extends FixedPathSource(p) with DelimitedSchemeCompressed

trait DelimitedSchemeCompressed extends Source {
  val types: Array[Class[_]] = null
  override def localScheme = new local.TextDelimited(Fields.ALL, false, false, "\t", types)
  override def hdfsScheme = {
    val temp = new TextDelimited(Fields.ALL, false, false, "\t", types)
    temp.setSinkCompression(TextLine.Compress.ENABLE)
    temp.asInstanceOf[Scheme[JobConf,RecordReader[_,_],OutputCollector[_,_],_,_]]
  }
}

I also have a small project showing how to achieve compressed output from Tsv: WordCount-Compressed.
Scalding was passing null for the compression parameter of Cascading's TextDelimited, which disables compression.

How to generate an unique ID for an class instance in Scala?

I have a class that needs to write to a file to interface with some legacy C++ application.
Since it will be instantiated several times in a concurrent manner,
it is a good idea to give the file a unique name.
I could use System.currentTimeMillis or hashCode, but there exists the possibility of collisions.
Another solution is to put a var field inside a companion object.
As an example, the code below shows one such class with the last solution, but I am not sure it is the best way to do it (at least it seems thread-safe):
case class Test(id:Int, data: Seq[Double]) {
  //several methods writing files...
}
object Test {
  var counter = 0
  def new_Test(data: Seq[Double]) = {
    counter += 1
    new Test(counter, data)
  }
}

Did you try this:
def uuid = java.util.UUID.randomUUID.toString
See UUID javadoc, and also How unique is UUID? for a discussion of uniqueness guarantee.

"it is a good idea to give the file an unique name"
Since all you want is a file, not an id, the best solution is to create a file with a unique name, not a class with a unique id.
You could use File.createTempFile, which takes the target directory as a java.io.File:
val uniqFile = File.createTempFile("myFile", ".txt", new File("/home/user/my_dir"))
Vladimir Matveev mentioned that there is a better solution in Java 7 and later, Files.createTempFile:
val uniqPath = Files.createTempFile(Paths.get("/home/user/my_dir"), "myFile", ".txt")

Scala drivers for couchdb and partial schemas

One question I have about current Scala couchdb drivers is whether they can work with "partial" schemas. I'll try to explain what I mean: the libraries I've seen all seem to want to do a complete conversion from JSON docs in the database to a Scala object, handle the Scala object, and convert it back to JSON. This is fine if your application knows everything about that type of object, especially if it is the sole piece of software interacting with that database. However, what if I want to write a little application that only knows about part of the JSON object: for example, what if I'm only interested in a 'mybook' component embedded like this:
{
  _id: "0ea56a7ec317138700743cdb740f555a",
  _rev: "2-3e15c3acfc3936abf10ea4f84a0aeced",
  type: "user",
  profiles: {
    mybook: {
      key: "AGW45HWH",
      secret: "g4juh43ui9hg929gk4"
    },
    .. 6 or 7 other profiles
  },
  .. lots of other stuff
}
I really don't want to convert the whole JSON AST to a Scala object. On the other hand, in couchdb, you must save back the entire JSON doc, so this needs to be preserved somehow. I think what I really want is something like this:
class MyBook {
  private val userJson: JObject = ... // full JSON retrieved from the database
  lazy val _id: String = ... // parsed from the JSON
  lazy val _rev: String = ... // parsed from the JSON
  lazy val key: String = ... // parsed from the JSON
  lazy val secret: String = ... // (ditto)
  def withSecret(secret: String): MyBook = ... // new object with altered userJson
  def save(db: CouchDB) = ... // save userJson back to couchdb
}
Advantages:
computationally cheaper to extract only needed fields
don't have to sync with database evolution except for 'mybook' part
more suitable for development with partial schemas
safer, because there is less chance of inadvertently deleting fields if we didn't keep up with the database schema
Disadvantages:
domain objects in Scala are not pristinely independent of couch/JSON
more memory use per object
Is this possible with any of the current Scala drivers? With either of scouchdb or the new Sohva library, it seems not.

As long as you have a good JSON library and a good HTTP client library, implementing a schemaless CouchDB client library is really easy.
Here is an example in Java: code, tests.

My couchDB library uses spray-json for (de)serialization, which is very flexible and would enable you to ignore parts of a document but still save it. Let's look at a simplified example:
Say we have a document like this
{
  dontcare: {
    ...
  },
  important: "foo"
}
Then you could declare a class to hold information from this document and define how the conversion is done:
case class Dummy(js:JsValue)
case class PartialDoc(dontcare: Dummy, important: String)
implicit object DummyFormat extends JsonFormat[Dummy] {
  override def read(js:JsValue):Dummy = Dummy(js)
  override def write(d:Dummy):JsValue = d.js
}
implicit val productFormat = jsonFormat2(PartialDoc)
This will ignore anything in dontcare but still save it as a raw JSON AST. Of course this example is not as complex as the one in your question, but it should give you an idea of how to solve your problem.
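The underlying idea (keep the raw document, expose only the fields you care about, write the whole thing back) does not depend on a particular driver. Below is a minimal sketch in Python using only the standard library. The class name MyBookView and the sample values are hypothetical, and actually sending the result to CouchDB is left to whatever HTTP client you use.

```python
import copy
import json


class MyBookView:
    """View over the 'mybook' profile of a full user document."""

    def __init__(self, user_doc):
        # The whole document is retained so unknown fields survive a save.
        self._doc = user_doc

    @property
    def key(self):
        return self._doc["profiles"]["mybook"]["key"]

    @property
    def secret(self):
        return self._doc["profiles"]["mybook"]["secret"]

    def with_secret(self, secret):
        """Return a new view with an altered secret; the original is untouched."""
        doc = copy.deepcopy(self._doc)
        doc["profiles"]["mybook"]["secret"] = secret
        return MyBookView(doc)

    def to_json(self):
        """Serialize the full document, ready to be sent back to the database."""
        return json.dumps(self._doc)


# Hypothetical sample document; "other" stands in for fields this code knows nothing about.
raw = ('{"_id": "abc", "_rev": "1-x", "type": "user", '
       '"profiles": {"mybook": {"key": "K", "secret": "old"}, "other": {"x": 1}}}')
book = MyBookView(json.loads(raw))
updated = book.with_secret("new")
```

Only the mybook fields are ever read, yet to_json() still carries every other field of the document, which is the property the question asks for.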
