Dynamic value class - schema not known until runtime - Geode

All the examples for storing multi-field data require specifying a value class. However, I do not know the fields or their types until run-time. I would like to be able to create a region with a dynamic set of field values. For example,
put --key=101 --value=('firstname':'James','lastname':'Gosling') --region=/region1 --value-class=data.Person
However, the data.Person class does not exist.
Furthermore, I would like to be able to query on the firstname field (or any other field of the value).
How can I do this with Geode?

You don't need a domain class to store data in Geode. You can store JSON natively in Geode. OQL queries make no distinction between PDX-serialized objects and JSON values. In fact, when you store a JSON value in Geode, beneath the covers it is converted into a PdxInstance. You can read more about PDX serialization in the documentation.
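For example, a minimal sketch using Geode's JSONFormatter (assuming a cache and a Region named region already exist):

import org.apache.geode.pdx.JSONFormatter;
import org.apache.geode.pdx.PdxInstance;

// Store a JSON document without any domain class; Geode holds it
// as a PdxInstance beneath the covers.
PdxInstance person = JSONFormatter.fromJSON(
    "{\"firstname\": \"James\", \"lastname\": \"Gosling\"}");
region.put(101, person);

// Read it back out as JSON.
String json = JSONFormatter.toJSON((PdxInstance) region.get(101));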

You can use PdxInstance.
Example using Java:
region.put(101, cache.createPdxInstanceFactory("data.Person")
    .writeString("firstname", "James")
    .writeString("lastname", "Gosling")
    .create());
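Since you also want to query on the firstname field, here is a sketch of the matching OQL query (assuming the same cache; $1 is an OQL bind parameter):

import org.apache.geode.cache.query.Query;
import org.apache.geode.cache.query.SelectResults;

// OQL sees PDX and JSON values alike, so fields are queryable by name.
Query query = cache.getQueryService()
    .newQuery("SELECT * FROM /region1 p WHERE p.firstname = $1");
SelectResults<?> results = (SelectResults<?>) query.execute("James");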


OData REST API where table has columns unique to customer

We would like to create an OData REST API. Our data model is such that each customer has their own database. All database objects have the same definition across all customer databases, with the exception of a single table.
We will call the customer-specific table Contact. When a customer adds a column, the system creates a column with a standardised name and a definition translated from options the user selects in the UI. The user only refers to the column data by a field name they have specified, so that they can write friendly queries.
It seems to me that the following approaches could be used to enable OData for the model described:
1) Create an OData open type to cater for the dynamic properties. This has the disadvantage that a user's requests for a customer give no indication of which dynamic properties can be queried against, even though they will be known for the user (via token authentication). Also, because dynamic properties are a dictionary, some data pivoting and inefficient query writing would be required. I'm not sure how to implement the IQueryable handling of query options for the dynamic properties to enable our own custom field querying.
2) Create a POCO class with e.g. 50 properties; CustomField1, CustomField2... Then somehow control which fields are exposed for use in OData calls. We would then include a separate API call to expose the custom field mapping. E.g. custom field friendly name of MobileNumber = CustomField12.
3) At runtime, check whether the column definitions of the table have changed since the last check. If they have, generate a class specific to the customer using CodeDom and register it with OData, aiming for a unique URL for each customer. E.g. http://domain.name/{customer guid}/odata
I think the ideal for us is option 2. However, the fact that CustomField1 could have an underlying SQL data type of nvarchar, int, decimal, datetime, etc. adds complications.
Does anyone have a working example of how to achieve what has been described, satisfactorily?
Thanks in advance for any help.
Rik
We have run into a similar situation but with our entire dataset being unknown until runtime. Using the ODataConventionModelBuilder and EdmModel classes, you can add properties dynamically to the model at runtime.
I'm not sure whether you will have to manually add all of the properties for this object type even though only some of them are unknown or whether you can add your main object and then add your dynamic ones afterwards, but I guess either would be workable.
If you can work out which type of user it is on the server, you could then add only the properties that you are interested in (like option 3, but without having to use CodeDom).
There is an example of this kind of untyped OData server in the OData samples here that should get you started: https://github.com/OData/ODataSamples/tree/master/WebApi/v4/ODataUntypedSample
The research we carried out actually posed Option 1 as the most suitable approach for some operations, i.e. create an SQL view that unpivots the data in a table to a key/value pair of column name/column value for each column in the table. This was suitable for queries returning small datasets, and was far less effort than Option 3 and less confusing for the user than Option 2. The unpivot query converted the field values to nvarchar (string) values, which meant that filtering in the UI by column-value data types was not simple to achieve. (If we decide to implement this ability, I believe it can be achieved by creating a custom attribute that derives from EnableQueryAttribute, marking the controller action with it, and manipulating the IQueryable before execution.)
However, we wanted to expose a /Contacts/Export endpoint that, when called, would output the columns from a table with a fixed schema joined to a table with a client-specific schema, written to a CSV file, all the while utilising the OData-supported filter syntax. One of our customer databases has more than 12 million rows of data and is made up of approximately 30 columns.
To achieve this, it looks like our best bet would have been to work with the Microsoft.OData.Core.UriParser.UriQueryExpressionParser class; unfortunately, Microsoft in their wisdom have declared it internal, as well as many of its dependants.
Walking an abstract syntax tree built from OData supported query options and applying our own visitor to each node to build some dynamic Linq query/SQL seems like a possible solution.
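As a rough illustration of that visitor idea, here is a sketch in Java with hypothetical node types (these are not the Microsoft.OData.Core classes):

// Hypothetical AST for a $filter such as: name eq 'Rik' and age gt 30
interface FilterNode { String accept(FilterVisitor v); }

record Comparison(String field, String op, String value) implements FilterNode {
    public String accept(FilterVisitor v) { return v.visitComparison(this); }
}

record And(FilterNode left, FilterNode right) implements FilterNode {
    public String accept(FilterVisitor v) { return v.visitAnd(this); }
}

interface FilterVisitor {
    String visitComparison(Comparison c);
    String visitAnd(And a);
}

// Renders the tree to a SQL WHERE fragment; real code would bind the
// values as parameters rather than inline them.
class SqlWhereVisitor implements FilterVisitor {
    public String visitComparison(Comparison c) {
        return c.field() + " " + c.op() + " ?";
    }
    public String visitAnd(And a) {
        return "(" + a.left().accept(this) + " AND " + a.right().accept(this) + ")";
    }
}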
For the time being we will simply implement a cut-down set of supported $filter criteria, without support for grouping parentheses.

MongoDB: Use string instead of BsonObjectId

I have a REST service for my game and decided to give MongoDB a try - seems pretty straight forward except for the "_id" field requirement.
The problem here is that my client app is using the REST services and has no knowledge of "mongodb", its drivers or dependencies - nor should it. To decouple the client from the server side (REST service provider) I need to get around the fact that MongoDB appears to require a "_id" field of type BsonObjectId.
Currently I'm using a lightweight DAO layer, so instead of having:
using System;
using MongoDB.Bson;

public class Item {
    private BsonObjectId _id;
    private string name;
}
I am using a DAO to translate this to something "mongodb agnostic":
using System;

public class ItemDAO {
    private string _id;
    private string name;
}
Ideally it would be nice to be rid of BsonObjectId entirely - is there some annotation/custom serialization handler that can be used or some way that I'm able to use a string instead of BsonObjectId?
Failing that, any way to get objects wrapped by MongoDB so they are decorated with the _id which I can inject back into the row as a string?
The ideal result would be to have no DAO class at all just "Item" and have Item using a string for _id (or something that does not require mongodb dependencies to bleed into client implementation).
Your documents must have an _id field, but it doesn't have to be an ObjectId. The only requirement is that it is unique within the collection.
MongoDB will generate an ObjectId for you if you don't supply an _id field when saving a new document, but that is just a helper function.
If you don't want to "pollute" your model classes, you could register an appropriate Id generator in your data access code.
BsonSerializer.RegisterIdGenerator(typeof(string), new StringObjectIdGenerator());
This way you will have a string field in your model, but underneath it will be an ObjectId, which is kind of nice, i.e. you can see (approximately) when the records were added.
If you decide however that in your REST service you will accept Ids from clients (via PUT) then ObjectId is obviously not the way to go.
Have a look at this article since it describes how to setup serialization options etc.
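(For comparison, the MongoDB Java driver offers a similar facility: assuming driver 4.2+ and the POJO codec, a plain String id can be stored as an ObjectId via annotations. A sketch, not the C# API above.)

import org.bson.BsonType;
import org.bson.codecs.pojo.annotations.BsonId;
import org.bson.codecs.pojo.annotations.BsonRepresentation;

// The model keeps a plain String id, but the driver stores it as an ObjectId.
public class Item {
    @BsonId
    @BsonRepresentation(BsonType.OBJECT_ID)
    private String id;
    private String name;
    // Getters and setters omitted; the POJO codec needs them in real code.
}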
How do you query for the objects? If you don't want / need the _id field in the client, you can use a projection to exclude the _id field from the result.
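For instance, with the MongoDB Java driver (shown for illustration; the C# driver has equivalent projection builders, and db and the collection name here are assumptions):

import static com.mongodb.client.model.Projections.excludeId;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

// Exclude _id from the results so the client never sees it.
MongoCollection<Document> items = db.getCollection("items");
for (Document doc : items.find().projection(excludeId())) {
    System.out.println(doc.toJson());
}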
Also, be aware that generating your own string-based _id's can have severe impact on database size. The ObjectId seems to be a pretty efficient structure. I have experimented with using strings for _id, to avoid having an index on a field I would never use - but in my case the cost in database size made it unfeasible. Depends on your database size and the rest of the contents, of course.
It's commonplace to represent BSON ObjectIds as strings; by default, Mongo drivers will generate 12-byte (96-bit) IDs, which you can then represent as 24 hex characters. Most client libraries have facilities for creating ObjectIds out of strings and casting them to strings.
You would have your external interfaces treat _id as a string, and then when your Mongo-aware DB layer receives an _id as a string, it would just internally convert it with ObjectId.from_string(_id) or whatnot. When writing results, you would just cast the OID to a string.
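For example, with the MongoDB Java driver (the C# driver's ObjectId.Parse and ToString work the same way; the hex string below is illustrative):

import org.bson.types.ObjectId;

// Convert the string received from the REST layer at the DB boundary...
ObjectId oid = new ObjectId("507f1f77bcf86cd799439011");
// ...and back to a string when writing results out.
String idForClient = oid.toHexString();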
Using ObjectIds as data type for your primary keys makes a lot of sense for various reasons. Generating good, non-sequential, monotonic IDs with low collision probability isn't trivial, and re-inventing the wheel or essentially rewriting the feature isn't worth the trouble.
Data mapping should be done in controllers, and they should interact with the outside world using DTOs. In other words, your REST endpoints (controllers/modules) should know only DTOs, while your database uses your database models.
The DTOs will look very similar to your models, but they might have a few fewer fields (neat if there's internal data you don't want exposed via the API), and they use strings where the models use ObjectIds.
To avoid tedious copying code, use something like AutoMapper.
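A minimal sketch of that split (names like ItemDto are hypothetical; AutoMapper is .NET, so in Java you might reach for MapStruct or ModelMapper instead):

import org.bson.types.ObjectId;

// Database model: uses the driver's ObjectId type.
class Item {
    ObjectId id;
    String name;
}

// DTO exposed by the REST endpoints: plain strings only.
class ItemDto {
    String id;
    String name;
}

// Hand-written mapping at the controller boundary.
class Mappers {
    static ItemDto toDto(Item item) {
        ItemDto dto = new ItemDto();
        dto.id = item.id.toHexString();
        dto.name = item.name;
        return dto;
    }
}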

Data Mapper pattern implementation with Zend

I am implementing a data mapper in my Zend Framework 1.12 project and it is working as expected. Now, to enhance it further, I want to optimize it in the following way.
When fetching data, what if I want to fetch only 3 of the 10 fields in my model table? The current issue is that if I fetch only the required values, the other values in the domain object class remain blank, and when saving that data I save the whole model object, not a single field value.
Can anyone suggest an efficient way of doing this, so that I can fetch/update only the required values and don't need to fetch all the field data to update the record?
If a property is NULL, ignore it when crafting the update? If NULLs are valid values, then I think you would need to track loaded/dirty state per property.
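A rough sketch of that dirty tracking (shown in Java for illustration, since the idea is language-agnostic; class and method names are hypothetical):

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Records which properties were explicitly set, so the mapper can build
// an UPDATE touching only those columns; NULL remains a valid value.
class PartialEntity {
    private final Map<String, Object> values = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    void set(String field, Object value) {
        values.put(field, value);
        dirty.add(field);
    }

    Map<String, Object> dirtyValues() {
        Map<String, Object> out = new HashMap<>();
        for (String field : dirty) {
            out.put(field, values.get(field));
        }
        return out;
    }
}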
How do you go about white-listing the fields to retrieve when making the call to the mapper? If you can persist that information I think it would make sense to leverage that knowledge when going to craft the update.
I don't typically go down this path. I will lazy load certain fields on a model when it makes sense, but I don't allow loading parts of the object like this; rather, I create an alternate object for use in rendering a list when loading the full object is too resource-intensive: a generic dummy list object I just use with tabular data, populated from SQL or stored-procedure result sets, usually with my generic table mapper.

Casbah MongoDB: more typesafe way to access object parameters

In Casbah, there are two methods called .getAs and .getAsOrElse in MongoDBObject, which return the relevant field's value in the type given as the type parameter.
val dbo: MongoDBObject = ...
dbo.getAs[String](param)
This must be using type casting, because we can get a Long as a String by giving it as the type parameter, which might cause a type cast exception at runtime. Is there any other typesafe way to retrieve the original type in the result?
This must be possible, because the type information of the element should be there in getAs's output.
Check out this excellent presentation on Salat by its author. What you're looking for is Salat's grater, which can convert to and from DBObject.
Disclaimer: I am biased, as I'm the author of Subset
I built this small library "Subset" exactly for the reason to be able to work effectively with DBObject's fields (both scalar and sub-documents) in a type-safe manner. Look through Examples and see if it fits your needs.
The problem is that MongoDB can store multiple types for a single field, so I'm not sure what you mean by making this typesafe. There's no way to enforce it on the database side, so were you hoping that there is a way to enforce it on the Casbah side? You could just do get("fieldName"), and get an Object, to be safest--but that's hardly an improvement, in my opinion.
I've been happy using Salat + Casbah, and when my database record doesn't match my Salat case class, I get a runtime exception. I just know that I have to run migration scripts when I change the types in my model, or create a new model for the new types (multiple models can be stored in the same collection). At least the Salat grater/DAO methods make it less of a hassle (you don't have to specify types every time you access a variable).

Why would I want to have a non-standard attribute?

The documentation on Core Data entities says:
You might implement a custom class, for example, to provide custom accessor or validation methods, to use non-standard attributes, to specify dependent keys, to calculate derived values, or to implement any other custom logic.
I stumbled over the non-standard attributes claim. It's just a guess: if my attribute is anything other than NSString, NSNumber or NSDate, will I want a non-standard attribute with special setter and getter methods? So, for example, if I wanted to store an image, this would be a non-standard attribute with type NSData and a special method, say -(void)setImageWithFileURL:(NSURL*)url, which then pulls the image data from the file, puts it in an NSData and assigns it to Core Data?
Or did I get that wrong?
A non-standard attribute can be anything. Some common examples are:
an image
a binary key
encrypted data
audio
Just about anything that cannot be represented as a number or string falls into this category.
Update: Transformable is not a data type of its own. It is a way to say that a non-standard value is going to be stored here. Under the covers it is binary. The Transformable tag is a hint to Core Data to go look at the subclass's property setting.