OrientDB Lightweight Edges with Ridbags?

Recently I've come across a use case where it made a lot of sense to use lightweight edges. They made for much faster queries when checking, as part of a SELECT condition, whether two vertices are related.
That said, because I operate in a highly concurrent environment, I've run into some conflicts (OConcurrentModificationException). I got past this by setting the conflict strategy to auto-merge for that particular class.
In investigating further, I came across this article on concurrency when adding edges: http://orientdb.com/docs/2.1/Concurrency.html#concurrency-when-adding-edges
It recommends using RID Bags for situations where edges change very frequently; they have the neat advantage of not incrementing the version each time an edge is added or removed. That sounds great, but I can't get it to work.
I've tried adding -DridBag.embeddedToSbtreeBonsaiThreshold=-1 to my client, with no effect. I then went into my code and added:
OGlobalConfiguration.RID_BAG_EMBEDDED_TO_SBTREEBONSAI_THRESHOLD.setValue(-1);
I finally also tried adding -DridBag.embeddedToSbtreeBonsaiThreshold=-1 to my OrientDB server (in server.sh). Still no effect. Each time the edges get updated, the version gets incremented (which is how I assume I can tell that it's not working properly).
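For completeness, here's a minimal sketch of how I'm wiring the programmatic setting (the database URL and credentials are placeholders, and this assumes the 2.1 TinkerPop graph API):

    import com.orientechnologies.orient.core.config.OGlobalConfiguration;
    import com.tinkerpop.blueprints.impls.orient.OrientGraph;
    import com.tinkerpop.blueprints.impls.orient.OrientGraphFactory;

    public class RidBagThresholdDemo {
        public static void main(String[] args) {
            // -1 should mean "never convert embedded ridbags to SB-tree bonsai";
            // set it before any database connection is opened.
            OGlobalConfiguration.RID_BAG_EMBEDDED_TO_SBTREEBONSAI_THRESHOLD.setValue(-1);

            // Placeholder URL and credentials.
            OrientGraphFactory factory =
                    new OrientGraphFactory("remote:localhost/mydb", "admin", "admin");
            OrientGraph graph = factory.getTx();
            try {
                // ... add/remove edges and watch whether vertex versions increment ...
            } finally {
                graph.shutdown();
                factory.close();
            }
        }
    }

Since I'm connecting in remote mode, I suspect the server-side value may be the one that actually matters, which is why I also tried the server.sh flag.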
Does anybody have thoughts about how lightweight edges might work with ridbags (or not work for that matter)?
Thanks!

Related

Get count(null) as zero in Grafana - InfluxDB data source

Is it possible to set the value of count to zero when the result to which count is applied is null?
SELECT count(status) FROM ( SELECT last("P-status") AS "status" FROM "autogen"."Pl" WHERE ("Platform" = 'Database-plat' AND "P-status" = 'ERROR') AND time >= now() - 1m GROUP BY time(500ms), "Node" fill(0) )
In this case, if the inner query returns null (for all the Nodes), count doesn't give any value, since fill will be ignored. I need the value to be zero, so that if I have to perform any other operation on the returned result, it can be done.
If possible, how can it be done?
I know this is an old question, but as it is still very relevant and I have been struggling with this problem for over a year now, I'd like to answer this unanswered question with the current status of the issue according to my research:
The GitHub issues here and here imply that this problem has been known since 2016 and is not really understood by the contributors as a problem: the rationales given for the implementation are questionable (along the lines of "it's not a bug, it's a feature, because of ambiguities with multiple series") and could easily be addressed with special rules for unique series identification, but there has not been much activity despite heavy interest from the user community. Another point is that they have published version 2.x, which relies more on their new query language (Flux), so it is very likely they have more or less abandoned the 1.x branch with InfluxQL (except perhaps for QL backwards compatibility in 2.x and some minor updates, but I'm not sure).
Meanwhile I have updated Grafana several times, but I had to stick with InfluxDB 1.x for a couple of reasons, and the Flux support changed at some point (the Flux plugin was deprecated and Flux was folded into the standard InfluxDB plugin, but the latter doesn't really work), so Flux in Grafana has been basically broken for a while now. I had hoped for better handling of the counting problem there, but now I'm out of luck regarding counting anything in InfluxDB reliably. I even tried tricks with the sum() function, fancy grouping, and dummy values that I then had to subtract again, and whatnot, but it always boiled down to the same conclusion: InfluxDB can do a lot, but counting just doesn't work.
It's very unsatisfying, but there doesn't seem to be a way to achieve the "eager" goal of counting data points without a system of bloated queries, excessive use of strange rules, dummy values, and the uncertainty that any query might break at any time, or break when you need to query only a specific time frame (where a dummy-value workaround might not apply). And given the priority assigned to it, this might not be fixed in the near future.

Set start-time for histogram sample

The use case
We have multiple changelogs stored in the database, and want to create a histogram monitoring the duration between changes.
The problem
There doesn't seem to be a way to set the start time of a Histogram.Timer; e.g., we want to set it to lastUpdated for the current changelog.
Avenues of approach
1. Subclassing Histogram
Should work. However, the Java library uses protected/package-private extensively, making this hard without copying large portions of the library.
2. Using reflection
After a Histogram.Timer is created, it should be possible to use reflection to set the start field. However, the field is marked private final, so a SecurityManager could stop us in some environments.
Ideas?
Neither of the solutions seems like the correct way to go, and I suspect that I'm overlooking a simpler one (but couldn't find anything on SO or Google). We're using Grafana to visualize our metrics, if that's at all helpful in this scenario.
You don't need to subclass Histogram, because you don't need to use Histogram.Timer at all just because your histogram measures times.
Simply call myHistogram.observe((System.currentTimeMillis() - lastUpdated) / 1000.0) every time you record a new change in the database (dividing by 1000 because Prometheus histograms conventionally observe seconds).
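For example, here's a minimal sketch using the Prometheus Java simpleclient (the metric name and the recordChange helper are mine, and I'm assuming lastUpdated is an epoch-millisecond timestamp taken from the changelog):

    import io.prometheus.client.Histogram;

    public class ChangeDurationMetrics {
        // Prometheus histograms conventionally observe seconds.
        static final Histogram changeDuration = Histogram.build()
                .name("changelog_change_duration_seconds")
                .help("Time between consecutive changes, in seconds.")
                .register();

        // Call this whenever a new change is recorded in the database.
        static void recordChange(long lastUpdatedMillis) {
            double seconds = (System.currentTimeMillis() - lastUpdatedMillis) / 1000.0;
            changeDuration.observe(seconds);
        }
    }

No subclassing or reflection needed: observe() takes the already-elapsed duration directly.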

What is the difference between reg_defaults and reg_defaults_raw in regmap facility?

When configuring regmap, it is possible to include a list of power-on defaults for the registers. As I understand it, the purpose is to pre-populate the cache in order to avoid an initial read after power-on or after waking up. I'm confused by the fact that there are both a reg_defaults field and a reg_defaults_raw field; only one or the other is ever used. The vast majority of drivers use reg_defaults, but a small handful use reg_defaults_raw. I did look through the git history and found the commit that introduced reg_defaults and the later commit that introduced reg_defaults_raw, but unfortunately I wasn't able to divine the reason for the new field.
Does anyone know the difference between those fields?

How to implement deterministic single threaded network simulation

I read about how FoundationDB does its network testing/simulation here: http://www.slideshare.net/FoundationDB/deterministic-simulation-testing
I would like to implement something very similar, but cannot figure out how they actually implemented it. How would one go about writing, for example, a C++ class that does what they do? Is it possible to do the kind of simulation they do without any code generation (which they presumably use)?
Also: how can a simulation be repeated if it contains random events? Each run, the simulation would choose new random values and thus not be the same as the run before. Maybe I am missing something here... I hope somebody can shed a bit of light on the matter.
You can find a little bit more detail in the talk that went along with those slides here: https://www.youtube.com/watch?v=4fFDFbi3toc
As for the determinism question, you're right that a simulation cannot be repeated exactly unless all possible sources of randomness and other non-determinism are carefully controlled. To that end:
(1) Generate all random numbers from a PRNG that you seed with a known value (a sketch of this follows at the end of this answer).
(2) Avoid any sort of branching or conditionals based on facts about the world which you don't control (e.g. the time of day, the load on the machine, etc.), or if you can't help that, then pseudo-randomly simulate those things too.
(3) Ensure that whatever mechanism you pick for concurrency has a mode in which it can guarantee a deterministic execution order.
Since it's easy to mess all those things up, you'll also want to have a way of checking whether determinism has been violated.
All of this is covered in greater detail in the talk that I linked above.
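To illustrate point (1), here is a minimal Java sketch of a simulation whose only source of randomness is a single seeded PRNG, so rerunning with the same seed replays the identical sequence of events (the latency and packet-loss events are made up for illustration):

    import java.util.Random;

    public class SeededSimulation {
        private final Random rng;

        public SeededSimulation(long seed) {
            this.rng = new Random(seed); // every random decision flows from this one source
        }

        public void run(int steps) {
            for (int i = 0; i < steps; i++) {
                int latencyMs = 1 + rng.nextInt(100);      // simulated network latency
                boolean dropped = rng.nextDouble() < 0.01; // simulated packet loss
                System.out.printf("step=%d latency=%dms dropped=%b%n", i, latencyMs, dropped);
            }
        }

        public static void main(String[] args) {
            new SeededSimulation(42L).run(5); // same seed => identical output every run
        }
    }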
In the sims I've built, the biggest issue with repeatability ends up being proper seed management (as per the previous answer). You want your simulations to give different results only when you supply a different seed to your random number generators than you did before.
After that, the biggest issue I've seen tends to be making sure you don't iterate over collections with nondeterministic ordering. For instance, in Java, you'd use a LinkedHashMap instead of a HashMap.
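For example (a small illustrative sketch; the Node class is hypothetical and deliberately has no hashCode override, so it falls back to the identity hash, which varies between JVM runs):

    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class IterationOrderDemo {
        // No hashCode/equals override: the identity hash differs per run,
        // so HashMap iteration order can change from run to run.
        static class Node {
            final String name;
            Node(String name) { this.name = name; }
        }

        public static void main(String[] args) {
            Map<Node, Integer> hashed = new HashMap<>();       // order may vary run to run
            Map<Node, Integer> linked = new LinkedHashMap<>(); // always insertion order
            for (int i = 0; i < 5; i++) {
                Node n = new Node("node" + i);
                hashed.put(n, i);
                linked.put(n, i);
            }
            hashed.keySet().forEach(n -> System.out.print(n.name + " "));
            System.out.println();
            linked.keySet().forEach(n -> System.out.print(n.name + " ")); // node0 node1 ...
        }
    }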

Should ErrorCollector (which is part of JSR-303 functionality) in GWT 2.4 be redesigned?

I know this question is not about a particular problem I have; it's rather a question for the GWT SDK team. As far as I remember, Stack Overflow is now their official communication channel with the engineering community :)
Problem:
For aliased editors such as ValueBoxEditorDecorator, you'll receive duplicate errors in your HasEditorErrors.showErrors(): one for the ValueBoxEditorDecorator itself and another for the nested ValueBoxEditor. The current implementation of ValueBoxEditorDecorator iterates through the list of errors and rejects the ones that don't belong to the nested editor. It looks like a hacky workaround to me :)
Question:
I think duplicates should be discarded somewhere earlier, for example in SimpleViolation.pushViolations, DelegateMap.of, or ErrorCollector.endVisit.
Initially I thought: why not just keep one delegate per unique EditorContext.getAbsolutePath and drop the rest? Then I realized that perhaps there is a use case in which ValueBoxEditorDecorator and its inner ValueBoxEditor would get different errors, although I can't come up with a scenario where that would happen, given my limited knowledge of GWT's sources.
So here is what I think: why don't we use a map for ErrorCollector.errorStack instead of a List, where the key is the combination of EditorError.getAbsolutePath() and EditorError.getUserData() (sketched below)? It would solve two issues, IMO:
We wouldn't need to filter out duplicate errors in our editors.
ErrorCollector.visit() wouldn't assume that editors like this one are traversed in hierarchical order. I don't see anywhere in the documentation that visitors are guaranteed to work that way.
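Here's a rough sketch of the deduplication I have in mind (this is not the actual GWT implementation; building the key via String.valueOf of the user data is just one possible choice):

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    import com.google.gwt.editor.client.EditorError;

    final class ErrorDeduplicator {
        // Keep only the first error per (absolute path, user data) combination,
        // preserving the order in which errors were encountered.
        static List<EditorError> dedupe(List<EditorError> errors) {
            Map<String, EditorError> unique = new LinkedHashMap<String, EditorError>();
            for (EditorError error : errors) {
                String key = error.getAbsolutePath() + "|" + String.valueOf(error.getUserData());
                if (!unique.containsKey(key)) {
                    unique.put(key, error);
                }
            }
            return new ArrayList<EditorError>(unique.values());
        }
    }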
What do you think?