How to save design document / view in CouchDB database from Scala program? - scala

I have written a Scala program that creates a new database and adds documents/views to it.
object CouchDBTest extends App {
  val dbSession = new Session("localhost", 5984)
  val db = dbSession.createDatabase("couchschooltest")

  val newC1 = new Document
  newC1.put("Type", "Class")
  newC1.put("ClassId", "C1")
  newC1.put("ClassName", "C-2A")
  newC1.put("ClassTeacher", "T1")
  newC1.accumulate("Students", "S1")
  newC1.accumulate("Students", "S2")
  newC1.accumulate("Students", "S3")
  db.saveDocument(newC1)

  val viewDocClass = new Document
  viewDocClass.addView("Class", "function(doc) {if(doc.Type == 'Class') { emit([doc.ClassId, doc.ClassName, doc.ClassTeacher, doc.Students], doc);}}")
  db.saveDocument(viewDocClass)
}
When I run this code, it creates the new database in CouchDB and adds the class document to it. However, it does not add the view: saving viewDocClass fails at runtime with
Error adding document - null null
For this I used the couchdb4j API.
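For later readers, a hedged sketch of the usual fix: CouchDB only treats a document as a design document (and therefore as a container for views) when its _id starts with "_design/". Assuming couchdb4j's Document exposes a setId helper for that field (if not, set the "_id" key directly), the view document would look roughly like this:
// Minimal sketch, not verified against couchdb4j: give the document a "_design/..." id
// before adding the view, otherwise CouchDB rejects it as a plain document.
val viewDocClass = new Document
viewDocClass.setId("_design/classes") // assumed helper; the underlying JSON key is "_id"
viewDocClass.addView("Class",
  "function(doc) { if (doc.Type == 'Class') { emit([doc.ClassId, doc.ClassName, doc.ClassTeacher, doc.Students], doc); } }")
db.saveDocument(viewDocClass)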

Related

H2 database content is not persisting on insert and update

I am using an H2 database to test my Postgres Slick functionality.
I created the H2DBComponent below:
import org.slf4j.LoggerFactory

trait H2DBComponent extends DbComponent {
  val driver = slick.jdbc.H2Profile
  import driver.api._

  val h2Url = "jdbc:h2:mem:test;MODE=PostgreSQL;DB_CLOSE_DELAY=-1;DATABASE_TO_UPPER=false;INIT=runscript from './test/resources/schema.sql'\\;runscript from './test/resources/schemadata.sql'"
  val logger = LoggerFactory.getLogger(this.getClass)

  val db: Database = {
    logger.info("Creating test connection ..................................")
    Database.forURL(url = h2Url, driver = "org.h2.Driver")
  }
}
In the above snippet I create my tables using schema.sql and insert a single row (record) with schemadata.sql.
Then I try to insert a record into the table from my test case, as below:
class RequestRepoTest extends FunSuite with RequestRepo with H2DBComponent {
  test("Add new Request") {
    val response = insertRequest(Request("XYZ", "tk", "DM", "RUNNING", "0.1", "l1", "file1",
      Timestamp.valueOf("2016-06-22 19:10:25"), Some(Timestamp.valueOf("2016-06-22 19:10:25")), Some("scienceType")))
    val actualResult = Await.result(response, 10 seconds)
    assert(actualResult === 1)

    val response2 = getAllRequest()
    assert(Await.result(response2, 5 seconds).size === 2)
  }
}
The assert on the insert passes, which suggests the record was inserted. But the getAllRequest() assert fails: the output still contains only the single row inserted by schemadata.sql, which means the insertRequest change is not persisted. However, the statements below report that the record was inserted, since the insert returned 1 (one record inserted).
val response = insertRequest(Request("CMP_XYZ","tesco_uk", "DM", "RUNNING", "0.1", "l1", "file1",
Timestamp.valueOf("2016-06-22 19:10:25"), Some(Timestamp.valueOf("2016-06-22 19:10:25")),
Some("scienceType")))
val actualResult=Await.result(response,10 seconds)
Below is my definition of insertRequest:
def insertRequest(request: Request): Future[Int] = {
  db.run { requestTableQuery += request }
}
I am unable to figure out how I can see the inserted record. Is there any property/config I need to add?
But the getAllRequest() assert fails as the output still contains the single row(as inserted by schemadata.sql) => which means the insertRequest change is not persisted
I would double-check that the assert(Await.result(response2, 5 seconds).size === 2) line is failing because of a size difference. Could it be failing for some other reason?
For example, as INIT is run on each connection, it could be that you are re-creating the database for each connection. Unless you're careful with the SQL, that could produce an error such as "table already exists". Adding TRACE_LEVEL_SYSTEM_OUT=2; to your H2 URL can be helpful in tracking what H2 is doing.
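For example (a hedged sketch; your existing URL options stay as they are, only the trace flag is added), H2 will then log what it executes per connection to stdout:
val h2Url = "jdbc:h2:mem:test;MODE=PostgreSQL;DB_CLOSE_DELAY=-1;TRACE_LEVEL_SYSTEM_OUT=2;DATABASE_TO_UPPER=false;INIT=runscript from './test/resources/schema.sql'\\;runscript from './test/resources/schemadata.sql'"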
A couple of suggestions.
First, you could ensure your SQL only runs as needed. For example, your schema.sql could add checks to avoid trying to create the table twice:
CREATE TABLE IF NOT EXISTS my_table( my_column VARCHAR NULL );
And likewise for your schemadata.sql:
MERGE INTO my_table KEY(my_column) VALUES ('a') ;
Alternatively, you could establish the schema and test data around your tests (e.g., in Scala code, using Slick). Your test framework probably has a way to ensure something is run before and after a test or test suite; a sketch of that approach follows.
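A minimal sketch of that idea with ScalaTest's BeforeAndAfterAll, assuming the requestTableQuery from the question and a hypothetical seedRequest value standing in for the row from schemadata.sql:
import org.scalatest.{BeforeAndAfterAll, FunSuite}
import scala.concurrent.Await
import scala.concurrent.duration._

class RequestRepoTest extends FunSuite with RequestRepo with H2DBComponent with BeforeAndAfterAll {
  import driver.api._

  override def beforeAll(): Unit = {
    // create the schema and insert the seed row once, instead of via INIT on every connection
    val setup = DBIO.seq(
      requestTableQuery.schema.create,
      requestTableQuery += seedRequest // hypothetical stand-in for the schemadata.sql row
    )
    Await.result(db.run(setup), 10.seconds)
  }

  override def afterAll(): Unit =
    Await.result(db.run(requestTableQuery.schema.drop), 10.seconds)

  // tests as before ...
}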

Need help in understanding whether the way I am testing a function is correct

I have written this function, which is called when a user clicks a link. The function creates a copy of the user data with one field altered (thus keeping the original value unchanged, i.e. immutable) and then updates the database with the new value:
def confirmSignupforUser(user: User): Future[Option[User]] = {
  println("confirming user: " + user)
  val newInternalProfile = user.profile.internalProfileDetails.get.copy(confirmed = true) // new data which should be added in the database
  println("old internal profile: " + user.profile.internalProfileDetails.get)
  println("new internal profile: " + newInternalProfile)

  val newProfile = UserProfile(Some(newInternalProfile), user.profile.externalProfileDetails)
  println("old profile: " + user.profile)
  println("new profile: " + newProfile)

  val confirmedUser = user.copy(profile = newProfile)
  for (userOption <- userRepo.update(confirmedUser)) yield { // database operation
    println("returning modified user:" + userOption)
    userOption
  }
}
To test the code, I have written the following spec
"confirmSignupforUser" should {
"change confirmed status to True" in {
val testEnv = new TestEnv(components.configuration)
val externalProfile = testEnv.externalUserProfile
val internalUnconfirmedProfile = InternalUserProfile(testEnv.loginInfo,1,false,None)
val internalConfirmedProfile = internalUnconfirmedProfile.copy(confirmed=true)
val unconfirmedProfile = UserProfile(Some(internalUnconfirmedProfile),externalProfile)
val confirmedProfile = UserProfile(Some(internalConfirmedProfile),externalProfile)
val origUser = User(testEnv.mockHelperMethods.getUniqueID(),unconfirmedProfile)
val confirmedUser = origUser.copy(profile = confirmedProfile)
//the argument passed to update is part of test. The function confirmSignupforUser should pass a confirmed profile
when(testEnv.mockUserRepository.update(confirmedUser)).thenReturn(Future{Some(confirmedUser)})
//// await is from play.api.test.FutureAwaits
val updatedUserOption:Option[User] = await[Option[User]](testEnv.controller.confirmSignupforUser(origUser))
println(s"received updated user option ${updatedUserOption}")
updatedUserOption mustBe Some(confirmedUser)
}
}
I am not confident that I am testing the method correctly. The only way I can check that the confirmed field got changed is by looking at the return value of confirmSignupforUser. But I am actually mocking that value, and I have already set the confirmed field to true in the mocked value (when(testEnv.mockUserRepository.update(confirmedUser)).thenReturn(Future{Some(confirmedUser)})).
I know the code works because the above mock expects update to be called with confirmedUser, in other words a user whose confirmed field is set to true. So if my code weren't working, update would have been called with a user whose confirmed field was false, and Mockito would have failed.
Is this the right way to test the method or is there a better way?
You don't need to initialize internalConfirmedProfile in your test. The whole point is to start with confirmed=false, run the confirmSignupforUser method, and make sure that the output is confirmed=true.
You should check two things:
1. Check that the return value has confirmed=true (which you do).
2. Check that the repository has that user saved with confirmed=true (which you don't check). To check that, you would need to load the user back from the repository at the end; a mock-based equivalent is sketched below.
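Since userRepo is a Mockito mock in this test, "loading the user back" is not really possible; a hedged alternative is to verify the interaction itself, i.e. that update was called with the confirmed user (names taken from the question's spec):
import org.mockito.Mockito.verify

// fails the test if update was never called with a user whose confirmed field is true
verify(testEnv.mockUserRepository).update(confirmedUser)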

Mixed Content XML parsing using DataFrame

I have an XML document that has mixed content, and I am using a custom schema with a DataFrame to parse it. I am having an issue where the schema only picks up the text for "Measure".
The XML looks like this:
<QData>
<Measure> some text here
<Answer>Answer1</Answer>
<Question>Question1</Question>
</Measure>
<Measure> some text here
<Answer>Answer1</Answer>
<Question>Question1</Question>
</Meaure>
</QData>
My schema is as follows:
def getCustomSchema(): StructType = StructType(Array(
  StructField("QData",
    StructType(Array(
      StructField("Measure",
        StructType(Array(
          StructField("Answer", StringType, true),
          StructField("Question", StringType, true)
        )), true)
    )), true)
))
When I try to access the data in Measure I only get "some text here", and it fails when I try to get info from Answer. I am also only getting one Measure.
EDIT: This is how I am trying to access the data:
val result = sc.read.format("com.databricks.spark.xml").option("attributePrefix", "attr_").schema(getCustomSchema)
  .load(filename.toString)

val qDfTemp = result.mapPartitions(partition => {
  val mapper = new QDMapper()
  partition.map(row => mapper(row)).flatMap(list => list)
}).toDF()

case class QDMapper() {
  def apply(row: Row): List[QData] = {
    val qDList = new ListBuffer[QData]()
    val qualData = row.getAs[Row]("QData") // When I print as list I get the first Measure text and that is it
    val measure = qualData.getAs[Row]("Measure") // This fails
  }
}
You can use a row tag as the root tag and access the other elements:
df_schema = sqlContext.read.format('com.databricks.spark.xml').options(rowTag='<xml_tag_name>').load(schema_path)
Please see https://github.com/harshaltaware/Pyspark/blob/main/Spark-data-parsing/xmlparsing.py for brief example code.
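A hedged Scala sketch of the same idea, assuming a SparkSession named spark (the question uses sc): reading with rowTag = "Measure" makes each Measure its own row, and spark-xml keeps the mixed free text under its value column (by default _VALUE):
val measures = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "Measure")   // each <Measure> becomes one DataFrame row
  .option("valueTag", "_VALUE")  // column that receives the "some text here" mixed content
  .load(filename.toString)

// Answer, Question and the free text are now plain columns
measures.select("_VALUE", "Answer", "Question").show()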

mapReduce inline results with java mongodb driver 3.2

How can I get inline results from a mapReduce with the MongoDB Java driver 3.2?
With driver version 2.x I was doing:
DBCollection coll = client.getDB(dbName).getCollection(collName);
coll.mapReduce(map, reduce, null, OutputType.INLINE, query);
The new 3.x driver has two mapReduce() methods returning MapReduceIterable, which lacks a method to specify the INLINE output mode.
MongoCollection<Documetn> coll = client.getDatabase(dbName).getCollection(collName)
coll
.mapReduce(map, reduce).
.filter(query);
You can create the map-reduce command manually:
String mapFunction = ...
String reduceFunction = ...
BsonDocument command = new BsonDocument();
BsonJavaScript map = new BsonJavaScript(mapFunction);
BsonJavaScript red = new BsonJavaScript(reduceFunction);
BsonDocument query = new BsonDocument("someidentifier", new BsonString("somevalue"));
command.append("mapreduce", new BsonString("mySourceCollection"));
command.append("query", query);
command.append("map", map);
command.append("reduce", red);
command.append("out", new BsonDocument("inline", new BsonBoolean(true)));
Document result = mongoClient.getDatabase(database).runCommand(command);
I think this is extremely ugly, but it is the only working solution I found so far using 3.2. (... and would be very interested in a better variant, too... ;-))
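As a hedged follow-up to this manual-command variant: with out: {inline: true} the command reply places the emitted documents in a "results" array, so they can be read back from the returned Document roughly like this (a cast-based sketch, written from Scala):
import scala.collection.JavaConverters._

// "results" holds the inline map-reduce output as a list of documents
val inlineResults = result
  .get("results")
  .asInstanceOf[java.util.List[org.bson.Document]]
  .asScala

inlineResults.foreach(println)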
I think I found it...
I had a deeper look into MongoDB's Java driver source and it seems that the INLINE output feature is implicitly accessible:
The class MapReduceIterableImpl<TDocument, TResult> (MapReduceIterableImpl.java), which is the default implementation of the interface return type of mapReduce(), holds a private boolean inline with initial value true.
The only place where this can ever be switched to false is the method collectionName(final String collectionName) for which the description is as follows:
Sets the collectionName for the output of the MapReduce
The default action is replace the collection if it exists, to change this use action(com.mongodb.client.model.MapReduceAction).
If you never call this method on the object instance after mapReduce(), it will remain true as initialized... meaning: if there is no output collection, it must be inline.
Later on, when you access your result with iterator(), first(), forEach(...) etc., the execute() method gets called internally, which has the magic if condition:
if (inline) {
MapReduceWithInlineResultsOperation<TResult> operation =
new MapReduceWithInlineResultsOperation<TResult>(namespace,
new BsonJavaScript(mapFunction),
new BsonJavaScript(reduceFunction),
codecRegistry.get(resultClass))
.filter(toBsonDocument(filter))
.limit(limit)
.maxTime(maxTimeMS, MILLISECONDS)
.jsMode(jsMode)
.scope(toBsonDocument(scope))
.sort(toBsonDocument(sort))
.verbose(verbose)
.readConcern(readConcern);
....
} else {
MapReduceToCollectionOperation operation =
new MapReduceToCollectionOperation(namespace, new BsonJavaScript(mapFunction), new BsonJavaScript(reduceFunction),
collectionName)
.filter(toBsonDocument(filter))
.limit(limit)
.maxTime(maxTimeMS, MILLISECONDS)
.jsMode(jsMode)
.scope(toBsonDocument(scope))
.sort(toBsonDocument(sort))
.verbose(verbose)
.action(action.getValue())
.nonAtomic(nonAtomic)
.sharded(sharded)
.databaseName(databaseName)
.bypassDocumentValidation(bypassDocumentValidation);
...so it instantiates MapReduceWithInlineResultsOperation when collectionName() has not been called.
I had no chance to test it because my NetBeans hates me at the moment, but I think it is pretty clear.
What do you think, did I miss something?
Would be glad if I could help you shift the code to API 3.x, great project!
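So, if that reading of the source is right, a hedged usage sketch (written from Scala, but these are the plain 3.x driver calls) would simply avoid collectionName() and iterate the MapReduceIterable to get inline results:
import com.mongodb.MongoClient
import scala.collection.JavaConverters._

val client = new MongoClient("localhost", 27017)
val coll = client.getDatabase(dbName).getCollection(collName)

val results = coll
  .mapReduce(map, reduce) // map/reduce are the JavaScript function strings
  .filter(query)          // query is a Bson filter, e.g. built with Filters.eq(...)
  .asScala                // collectionName() is never called, so the inline operation runs

results.foreach(println)
client.close()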

Cassandra-Hector-Scala: How can I get all row composite keys in a column family?

My data storage format is:
Family name: Test
Rowkey: comkey1:comkey2
=>(name=name,value='xyz',timestamp=1554515485)
-------------------------------------------------------
Rowkey: comkey1:comkey3
=>(name=name,value='abc',timestamp=1554515485)
-------------------------------------------------------
Rowkey: comkey1:comkey4
=>(name=name,value='pqr',timestamp=1554515485)
-------------------------------------------------------
Now I want to fetch all composite keys from the "test" family, and I am trying:
def test = Action {
  val cluster = HFactory.getOrCreateCluster("Test Cluster", "127.0.0.1:9160")
  val keyspace = HFactory.createKeyspace("winoriatest", cluster)

  var startKey = new Composite()
  var endKey = new Composite()
  startKey.addComponent("comkey1", StringSerializer.get())
  startKey.addComponent("comkey2", StringSerializer.get())
  endKey.addComponent("comkey1", StringSerializer.get())
  endKey.addComponent("comkey4", StringSerializer.get())

  val rangeSlicesQuery = HFactory.createRangeSlicesQuery(keyspace, CompositeSerializer.get(), StringSerializer.get(), StringSerializer.get())
  rangeSlicesQuery.setColumnFamily("test")
  // CompositeSerializer.get() is not working.
  rangeSlicesQuery.setKeys(startKey, endKey)
  rangeSlicesQuery.setRange(null, null, false, Integer.MAX_VALUE)
  rangeSlicesQuery.setReturnKeysOnly()

  val result = rangeSlicesQuery.execute()
  val orderedRows = result.get()

  import scala.collection.JavaConversions._
  for (sc <- orderedRows) {
    println(sc.getKey())
  }

  Ok(views.html.index("Your new application is ready."))
}
Error: [NullPointerException: null] on the line
val result = rangeSlicesQuery.execute()
Cassandra 2.0, Scala 2.10.2.
Thank you in advance for your help in resolving this.
It gives me a NullPointerException, and the same code works in Java. My Java code is:
Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "127.0.0.1:9160");
Keyspace keyspace = HFactory.createKeyspace("winoriatest", cluster);

Serializer<String> se = StringSerializer.get();
Serializer<Long> le = LongSerializer.get();
Serializer<Integer> ie = IntegerSerializer.get();
CompositeSerializer ce = new CompositeSerializer();

RangeSlicesQuery<Composite, String, byte[]> rangeSliceQuery = HFactory.createRangeSlicesQuery(keyspace, ce, se, BytesArraySerializer.get());
rangeSliceQuery.setColumnFamily("test");
rangeSliceQuery.setRange(null, null, false, Integer.MAX_VALUE);

QueryResult<OrderedRows<Composite, String, byte[]>> result = rangeSliceQuery.execute();
OrderedRows<Composite, String, byte[]> orderedRows = result.get();

for (Row<Composite, String, byte[]> r : orderedRows) {
    System.out.println("Compositekey=" + r.getKey().get(0, se) + ":" + r.getKey().get(1, se));
}
I'm not quite sure what "i want to fetch all composite key in test family" means. If you mean you want to get just the partition [row] key components, then you can do this in CQL as simply as:
SELECT DISTINCT a, b FROM test
(Assigning a and b to be the column names.)
This is a good example of how much simpler CQL makes Cassandra development, which is why we're pushing people to use the native CQL driver over legacy clients like Hector.
For more on how CQL makes sense of a Thrift data model like this, see http://www.datastax.com/dev/blog/cql3-for-cassandra-experts.
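For reference, a hedged sketch of running that query from Scala with the DataStax Java driver (the keyspace name is taken from the question; a and b stand in for the actual partition-key column names, as in the CQL above):
import com.datastax.driver.core.Cluster
import scala.collection.JavaConverters._

val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
val session = cluster.connect("winoriatest")

// DISTINCT over the partition-key columns returns each composite row key exactly once
val rows = session.execute("SELECT DISTINCT a, b FROM test").asScala
rows.foreach(r => println(s"${r.getString("a")}:${r.getString("b")}"))

cluster.close()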