I cannot load a ContextSpellCheckerModel from a path, although I can load other models the same way. Is there a known issue with this model? Thank you.
spellModel = ContextSpellCheckerModel\
.load(parameters["paths"]["model_check_spelling"])\
.setInputCols("token")\
.setOutputCol("checked")
From the spark-nlp GitHub page I downloaded a .zip file containing a pre-trained NerCrfModel. The zip contains three folders: embeddings, fields, and metadata.
How do I load that into a Scala NerCrfModel so that I can use it? Do I have to drop it into HDFS or the host where I launch my Spark Shell? How do I reference it?
You just need to provide the path to the directory that contains the folders you mentioned:
import com.johnsnowlabs.nlp.annotators.ner.crf.NerCrfModel
val path = "path/to/unzipped/file/folder"
val model = NerCrfModel.read.load(path)
// Use your model
model.setInputCols(someCol)
model.transform(yourData) // yourData must contain 'someCol'
As far as I remember, you can place the folder on either the local file system or a distributed file system. Hope this helps other users as well!
best,
Alberto.
In Spark MLlib, BisectingKMeansModel in PySpark has no save/load function.
Why?
How can I save the BisectingKMeans model to HDFS, or load it from there, with Python?
It may be your Spark version. For bisecting k-means, Spark 2.1.0 or later is recommended.
You can find a complete example in the documentation for the class pyspark.ml.clustering.BisectingKMeans. Hope it helps:
https://spark.apache.org/docs/2.1.0/api/python/pyspark.ml.html#pyspark.ml.clustering.BisectingKMeans%20featuresCol=%22features%22,%20predictionCol=%22prediction%22
The last part of the example code includes a model save/load:
model_path = temp_path + "/bkm_model"
model.save(model_path)
model2 = BisectingKMeansModel.load(model_path)
It works for HDFS as well, but make sure that the temp_path/bkm_model folder does not exist before you save the model, or it will raise an error:
(java.io.IOException: Path <temp_path>/bkm_model already exists)
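If replacing an existing model directory is acceptable, a minimal sketch along these lines may avoid the error. It assumes a pyspark.ml model (such as the BisectingKMeansModel above), whose write() method returns a writer that supports overwrite(). The helper name save_overwriting is hypothetical, not part of Spark.

```python
def save_overwriting(model, model_path):
    """Save a pyspark.ml model, replacing any model already stored at model_path.

    Assumption: `model` is a pyspark.ml model (for example BisectingKMeansModel),
    so model.write() returns a writer that provides overwrite() and save().
    The helper name is hypothetical, not part of Spark.
    """
    # overwrite() tells the writer to replace an existing output path
    # instead of failing with "Path ... already exists".
    model.write().overwrite().save(model_path)
    return model_path
```

With the variables from the example above, this would be called as save_overwriting(model, temp_path + "/bkm_model"). Note that it silently discards whatever was stored at that path before.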
Hi Everyone,
While reading data from a file in Spark I'm getting a "path does not exist" error. Please find the screenshot for the same.
Could you please tell me what I missed regarding processing data?
Many thanks for your help in advance.
Regards,
Sunitha.
The path you pass should include the file extension, which is missing here.
Add the extension to us-500.
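To see which extension is missing, one option is to list the files that share the name you passed. The following is a minimal sketch for a path on the local file system only (it does not look inside HDFS). The helper name candidates_for is hypothetical.

```python
from pathlib import Path


def candidates_for(path_str):
    """Return the given path if it exists, otherwise the files in the same
    folder whose name without extension matches (local file system only)."""
    path = Path(path_str)
    if path.exists():
        return [str(path)]
    if not path.parent.is_dir():
        return []
    # A file such as us-500.csv has the stem "us-500".
    return sorted(
        str(p) for p in path.parent.iterdir()
        if p.is_file() and p.stem == path.name
    )
```

Calling candidates_for with the path that failed returns the full names of any matching files, and one of those can then be passed to Spark instead.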
I am new to Apache Spark. I ran the sample ALS algorithm code from the examples folder, with a CSV file as input. When I use model.save(path) to save the model, it is stored as gz.parquet files.
When I tried to open this file, I get these errors
Now I want to store the generated recommendation model in a text or CSV file so that I can use it outside Spark.
I tried the following call to store the generated model in a file, but it did not work:
model.saveAsTextFile("path")
Please suggest a way to overcome this issue.
Let's say you have trained your model with something like this:
val model = ALS.train(ratings, rank, numIterations, 0.01)
All that you have to do is:
import org.apache.spark.mllib.recommendation.ALS
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel
import org.apache.spark.mllib.recommendation.Rating
// Save
model.save(sc, "yourpath/yourmodel")
// Load Model
val sameModel = MatrixFactorizationModel.load(sc, "yourpath/yourmodel")
As it turns out, saveAsTextFile() writes its output from the slaves, one part file per partition. Use collect() to bring the data from the slaves to the master so it can be saved locally there. A solution can be found here
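As an illustration of the collect-then-write approach, here is a minimal Python sketch that writes already-collected latent factors to a CSV file on the master. It assumes the rows are (id, features) pairs, which in PySpark's MLlib could come from model.userFeatures().collect() or model.productFeatures().collect(). The function name and the file layout are hypothetical, and collecting is only sensible when the factors fit in the master's memory.

```python
import csv


def write_factors_csv(rows, path):
    """Write (id, features) pairs to a CSV file, one row per id.

    Assumption: `rows` is an in-memory iterable of (id, sequence of floats),
    for example the result of model.userFeatures().collect().
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for entity_id, features in rows:
            # First column is the id, the remaining columns are the factors.
            writer.writerow([entity_id, *features])
    return path
```

The resulting file can be read by any tool that understands CSV. Reconstructing a predicted rating outside Spark then comes down to the dot product of one user row and one product row.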
I have two inputs: image and music.
With the image I have no problem; I handle it as
val imageFile = request.body.file("imageFile").get.ref.file
However, there can be multiple musicFile uploads, and I couldn't find a way to get them all with request.body.file("musicFile").
I can get them with request.body.files, but this also returns the image file, so the problem is how to tell them apart.
I am using Play Framework 2.1.1 with Scala.
Cheers,
I found a way: you can get all the files with request.body.files
and then keep only the files whose key matches:
request.body.files.filter(file => file.key.equals("musicFile"))