Caching and computing map - Guava

correlation use case:
read input
if (correlation-id is already generated for this input)
{
    lookup the correlation-id from the cache;
    return correlation-id;
}
else
{
    generate the correlation-id;
    cache it;
    return correlation-id;
}
Constraints:
- The number of input records can go up to 500K, so I don't want to use
strong references.
- I don't want to generate one-way hashes for now (I know that if we
use a one-way hash then there is no need to cache).
Can someone tell me how to use a ComputingMap for this? I am asking
because there is a note in the Javadoc that says it "uses identity
equality for weak/soft keys".

With the Google Guava/Collections classes, your keys need to be strong references for the map to use equals() rather than == when looking up cached values. If you have weak or soft keys, the lookup is done by identity, so you'll essentially always get a cache miss. So if you want the garbage collector to reclaim entries from your cache, you'll need to make the values soft or weak instead.
I understand Google will add an Equivalence feature in the future, so you could state whether you want equals() or == rather than have this choice made for you by picking strong, weak or soft references.
Since your Tuple object implements equals() and hashCode(), you can just do
ConcurrentMap<Tuple, String> correlationIds = new MapMaker()
    .softValues()
    .makeComputingMap(new Function<Tuple, String>() {
        public String apply(Tuple t) {
            // generate the correlation-id for this input
            // (for example, a random UUID; swap in your own generator)
            return UUID.randomUUID().toString();
        }
    });
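For illustration only, a lookup then follows the pseudocode above; in this sketch, readInput() is a hypothetical stand-in for however you obtain a Tuple:
Tuple input = readInput();                         // hypothetical source of an input record
String correlationId = correlationIds.get(input);  // computes, caches and returns on a miss; returns the cached value otherwise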

Guava: Is Cache.asMap().remove() better?

I want to get and remove an item from a Cache.
final Cache<String, PendingRequest> pendingRequest = CacheBuilder.newBuilder().build();
// get first
PendingRequest pendingCall = pendingRequest.getIfPresent(key);
pendingRequest.invalidate(key); // then remove
I also found another way
pendingCall = pendingRequest.asMap().remove(key);
Does the asMap() method clone all the items? Is it a heavy call? Which approach is better, considering performance?
There's no real difference between those calls because Cache#asMap() is defined as:
Returns a view of the entries stored in this cache as a thread-safe map. Modifications made to the map directly affect the cache.
Calling asMap() may be slightly less performant (because a view may have to be created), but the cost is constant (and negligible) and an implementation detail (see Guava's internal LocalCache and LocalManualCache classes for more details).
More importantly, Cache#invalidate(K) is more idiomatic, and I'd recommend using it instead of the map-view methods (edit after @BenManes' comment below) if you don't need the value associated with the key; otherwise, use the map view.
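As a quick sketch (reusing the pendingRequest cache and key from the question), the choice comes down to whether you need the removed value:
// If you need the removed value, the map view's remove(key) returns it (or null if the key is absent).
PendingRequest removed = pendingRequest.asMap().remove(key);

// If you don't need the value, invalidate(key) is the idiomatic cache call.
pendingRequest.invalidate(key);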

GraphQL mutations on nested resources

Mutations are queries that manipulate data. If so, then my root query and root mutation trees should look similar, right? They should both allow nested fields (nested mutations). I was playing with this (using express-graphql) and it works.
Example:
# PUT /projects/:project_id/products/:id
mutation {
  findProject(id: 1) {                     # make sure that the project exists and we can access it before mutating data
    updateProduct(id: 1, name: "Foo") {    # the resolve function receives a valid `project` as the first argument
      id
    }
  }
}
Is this a valid example? Should mutations be nested like this? If not, how should I handle nested resources? I cannot find any real-life example that mutates nested resources. All the examples define mutations only on the first level (fields on the root mutation).
The product has a unique ID, so that's all you need to identify it.
mutation {
  updateProduct(id: 1, name: "Foo") {
    id
  }
}
To verify that the user is authorized to modify the product, you should check the product's project. You'd probably have some centralized authorization anyway:
resolve({ id }, { user }) {
  authorize(user, 'project', Product.find(id).project) // or whatever
  ... // update
}
Old answer:
This is certainly a valid example.
I'd guess the lack of nested object mutation examples may be due to the fact that even if a product is linked to a project, it would in most cases still have a unique ID -- so you can find the product even without the project id.
Another alternative would be to include the project id as an argument to updateProduct:
mutation {
  updateProduct(projectId: 1, id: 1, name: "Foo") {
    id
  }
}
Your solution seems nicer to me, though.
As a note, mutations are, in fact, exactly the same as queries. The only difference is that the resolve function typically makes some permanent change, like modifying some data. Even then, though, the mutation behaves just like a query would: validate the args, call the resolve function, return data of the declared type.
We declare such a method as a mutation (and not a query) to make it explicit that some data is going to be changed, but also for a more important reason: the order in which you modify data matters. If you declare multiple mutations in one request, the executor will run them sequentially to maintain consistency (this doesn't attempt to solve distributed writes, though; that's a whole other problem).

Lua light userdata

I have a problem with Lua and I don't know if I'm going in the right direction. In C++ I have a dictionary that I use to pass parameters to a resource manager. This dictionary is essentially a map from hash to string.
In Lua I want to access these resources, so I need a representation of the hashes. The hashes must also be unique, because they are used as indexes into a table. Our hash function is 64-bit and I'm working in a 32-bit environment (PS3).
In C++ I have something like this:
paramMap.insert(std::make_pair(hash64("vehicleId"), std::string("004")));
resourceManager.createResource(ResourceType("Car"), paramMap);
In Lua want use these resources to create a factory for other userdata.
I do stuff like:
function findBike(carId)
  bikeParam = { vehicleId = carId }
  return ResourceManager.findResource('car', bikeParam)
end
So sometimes the parameters are created by Lua and sometimes they are created by C++.
Because my hash key ('vehicleId') is an index of a table, it needs to be unique.
I have used light userdata to implement uint64_t, but because I'm in a 32-bit environment I can't simply store an int64 in the pointer. :(
I have to create a table that stores all the int64 values used by the program and save a reference to the slot in the userdata.
void pushUInt64(lua_State *L, GEM::GUInt64 v)
{
    Int64Ref::Handle handle = Int64Ref::getInstance().allocateSlot(v);
    lua_pushlightuserdata(L, reinterpret_cast<void*>(handle));
    luaL_setmetatable(L, s_UInt64LuaName);
}
But light userdata are never garbage collected, so my int64 slots are never released and the table grows forever.
Also, light userdata don't carry their own metatable, so they interfere with other light userdata: checking the implementation, the metatable is stored in L->G_->mt_[2].
Doing this:
a = createLightUserDataType1()
b = createLightUserDataType2()
a:someFunction()
the call to a:someFunction() will use the metatable of b.
I thought that metatables were bound to the type.
I'm pretty confused; with the current implementation, light userdata seem to have a really limited use case.
In Python you have a __hash__ method that is called any time the object is used as a dictionary key. Is it possible to do something similar here?
Sorry for my English, I'm from Italy. :-/

How to specify different cache keys on the same key object in simple-spring-memcached

I am trying to implement a distributed cache with simple-spring-memcached. The docs suggest that to use an object as the key I need to have a method in my domain class with the @CacheKeyMethod annotation on it.
But the problem is that I am using the same domain class in different scenarios, and the key to be generated in each case follows different logic. For example, for a User class one scenario requires the key to be unique in terms of city and gender, while another requires it to be unique in terms of the user's email; the key is essentially whatever your lookup is based on.
Admittedly a user's email determines the city and gender, so I could use email as the key in the first case as well, but that would mean a separate cache entry for each user even though the cached data is the same whenever the gender and city are the same; keying on city and gender is expected to increase the hit ratio by a huge margin (just think how many users you can expect to be male and from Bangalore).
Is there a way I could define different keys? It would also be great if the logic for generating the key could be externalised from the domain class itself.
I read the docs and figured out that something called CacheKeyBuilder and/or CacheKeyBuilderImpl could do the trick, but I couldn't understand how to proceed.
Edit:
OK, I got one clue! What CacheKeyBuilderImpl does is call the generateKey method on a DefaultKeyProvider instance, which looks for the @CacheKeyMethod annotation on the provided domain class's methods and executes the method found to obtain the key.
So replacing either CacheKeyBuilderImpl with a custom implementation, or replacing the KeyProvider's default implementation within CacheKeyBuilderImpl with your own, might do the trick... but the KeyProvider reference is hardwired to DefaultKeyProvider.
Can anybody help me implement CacheKeyBuilder (with respect to what the different methods do; the documentation doesn't clarify it), and how do I inject it so it is used instead of the usual CacheKeyBuilderImpl?
Simple Spring Memcached (SSM) hasn't been designed to allow such low-level customization. As you wrote, one way is to replace CacheKeyBuilderImpl with your own implementation. The default implementation is hardwired, but it can easily be replaced using a custom simplesm-context.xml configuration.
As I understand your question, you want to cache your User objects under different keys depending on the use case. That's supported out of the box, because by default SSM uses the method arguments to generate the cache key, not the result.
Example:
@ReadThroughMultiCache(namespace = "userslist.cityandgenre", expiration = 3600)
public List<User> getByCityAndGenre(@ParameterValueKeyProvider(order = 0) String city, @ParameterValueKeyProvider(order = 1) String genre) {
    // implementation
}

@ReadThroughSingleCache(namespace = "users", expiration = 3600)
public User getByEmail(@ParameterValueKeyProvider String email) {
    // implementation
}
In general, @CacheKeyMethod is only used to generate the cache key when the object that contains the method is passed as a parameter to the cached method and that parameter is annotated with @ParameterValueKeyProvider.
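For illustration, a minimal sketch of that case (User, Profile, getProfile and cacheKey() are hypothetical names, not from the question):
// Domain class: SSM calls the @CacheKeyMethod-annotated method to build the key
// when a User instance is the annotated parameter.
public class User {
    private String email;

    @CacheKeyMethod
    public String cacheKey() {
        return email;
    }
}

// Cached method taking the domain object itself as the key source.
@ReadThroughSingleCache(namespace = "users.profile", expiration = 3600)
public Profile getProfile(@ParameterValueKeyProvider User user) {
    // implementation
}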

Are singletons automatically persisted between requests in ASP.NET MVC?

I have a lookup table (LUT) of thousands of integers that I use on a fair number of requests to compute things based on what was fetched from the database.
If I simply create a standard singleton to hold the LUT, is it automatically persisted between requests or do I specifically need to push it to the Application state?
If they are automatically persisted, then what is the difference storing them with the Application state?
What would a correct singleton implementation look like? It doesn't need to be lazily initialized, but it needs to be thread-safe (thousands of theoretical users per server instance) and have good performance.
EDIT: Jon Skeet's 4th version looks promising http://csharpindepth.com/Articles/General/Singleton.aspx
public sealed class Singleton
{
    static readonly Singleton instance = new Singleton();

    // Explicit static constructor to tell the C# compiler
    // not to mark the type as beforefieldinit
    static Singleton()
    {
    }

    Singleton()
    {
    }

    public static Singleton Instance
    {
        get
        {
            return instance;
        }
    }

    // randomguy's specific stuff. Does this look good to you?
    private int[] lut = new int[5000];

    public int Compute(Product p) {
        return lut[p.Goo];
    }
}
Yes, static members persist (not the same thing as persisted - it's not "saved", it never goes away), and that includes implementations of a singleton. You get a degree of lazy initialisation for free: if it's created in a static assignment or static constructor, it won't be created until the relevant class is first used. That creation locks by default, but all other uses would have to be thread-safe, as you say. Given the degree of concurrency involved, unless the singleton is immutable (your look-up table doesn't change for the application lifetime) you would have to be very careful about how you update it. One way is a "fake singleton": on update you create a new object and then lock around assigning it to replace the current value (not strictly a singleton, though it looks like one "from the outside").
The big danger is that anything introducing global state is suspect, and especially when dealing with a stateless protocol like the web. It can be used well though, especially as an in-memory cache of permanent or near-permanent data, particularly if it involves an object graph that cannot be easily obtained quickly from a database.
The pitfalls are considerable though, so be careful. In particular, the risk of locking issues should not be underestimated.
Edit, to match the edit in the question:
My big concern would be how the array gets initialised. Clearly this example is incomplete, as it'll only ever have 0 for each item. If it gets set at initialisation and is read-only after that, then fine. If it's mutable, then be very, very careful about your threading.
Also be aware of the negative effect of too many such look-ups on scaling. While you save on most requests by having pre-calculated results, the effect is a period of very heavy work whenever the singleton is updated. A long-ish start-up will likely be tolerable (as it won't happen very often), but arbitrary slow-downs happening afterwards can be tricky to trace back to their source.
I wouldn't rely on a static being persisted between requests. [There is always the, albeit unlikely, chance that the process would be reset between requests.] I'd recommend HttpContext's Cache object for persisting shared resources between requests.
Edit: See Jon's comments about read-only locking.
It's been a while since I've dealt with singletons (I prefer letting my IoC container deal with lifetimes), but here's how you can handle the thread-safety issues. You'll need to lock around anything that mutates the state of the singleton. Read-only operations, like your Compute(Product), won't need locking.
// I typically create one lock per collection, but you really need one per set of
// atomic operations; if you ever modify two collections together, use one lock.
private object lutLock = new object();
private int[] lut = new int[5000];

public int Compute(Product p) {
    return lut[p.Goo];
}

public void SetValue(int index, int value)
{
    // Lock as little code as possible; since this check is read-only we don't lock it.
    if (index < 0 || index >= lut.Length)
    {
        throw new ArgumentException("Index not in range", "index");
    }

    // Going to mutate state, so we need a lock now.
    lock (lutLock)
    {
        lut[index] = value;
    }
}