Maximum number of tags allowed in OpenTSDB

What is the maximum number of tags allowed in OpenTSDB? I read some of the documentation, and it seems that the default value is 8, but I am confused about whether 8 is a hard maximum or whether it is configurable to a higher value.
I know the number of tags should be kept small, but this is a requirement.
So can I have more than 8 tags?

It is now configurable, at least it is in version 2.3.1.
A snippet from Const.java:
static void setMaxNumTags(final short tags) {
    if (tags < 1) {
        throw new IllegalArgumentException("tsd.storage.max_tags must be greater than 0");
    }
    MAX_NUM_TAGS = tags;
}
And then a snippet from TSDB.java:
public TSDB(final HBaseClient client, final Config config) {
    // ...
    if (config.hasProperty("tsd.storage.max_tags")) {
        Const.setMaxNumTags(config.getShort("tsd.storage.max_tags"));
    }
    // ...
}
Setting the "tsd.storage.max_tags" value in opentsdb.conf and restarting is enough.
The file can be found at /usr/share/opentsdb/etc/opentsdb/ in my case.
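For reference, a minimal opentsdb.conf fragment (the property name is taken from the snippets above; the value 12 is just an illustration):

```properties
# Allow up to 12 tags per data point (default is 8; requires OpenTSDB >= 2.3.1)
tsd.storage.max_tags = 12
```

Restart the TSD after changing it.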

Don't use too many tags; keep the count fairly small, usually up to 4 or 5 tags. (By default, OpenTSDB supports a maximum of 8 tags.)
OpenTSDB does not support more than 8 tags, and it is not configurable.

It is not configurable, but it is still easy to change: edit src/core/Const.java in the source and set MAX_NUM_TAGS to the number you want.
https://github.com/StumbleUponArchive/opentsdb/blob/e68948bc/src/core/Const.java#L22
But if you really need a lot of tags, you should try another product, such as Elasticsearch.

Related

Is it possible to create an RHQ plugin that collects historic measurements from files?

I'm trying to create an RHQ plugin to gather some measurements. It seems relatively easy to create a plugin that returns a value for the present moment. However, I need to collect these measurements from files. These files are created on a schedule, for example one per hour, but they contain much finer measurements, for example a measurement for every minute. A file may look something like this:
18:00 20
18:01 42
18:02 39
...
18:58 12
18:59 15
Is it possible to create an RHQ plugin that can return many values with timestamps for a measurement?
I think that within org.rhq.core.pluginapi.measurement.MeasurementFacet#getValues you can return as many values as you want in the MeasurementReport.
So basically open the file, seek to the last known position (if the file is always appended to), read from there, and for each line do:
MeasurementData data = new MeasurementDataNumeric(timeInFile, request, valueFromFile);
report.add(data);
Of course, alerting on this (historical) data is somewhat questionable: if you only read the file an hour later, the alert cannot be fired retroactively at the time the bad value happened. :->
Yes, it is certainly possible.
@Override
public void getValues(MeasurementReport report, Set<MeasurementScheduleRequest> metrics) throws Exception {
    for (MeasurementScheduleRequest request : metrics) {
        Double result = SomeReadUtilClass.readValueFromFile();
        MeasurementData data = new MeasurementDataNumeric(request, result);
        report.addData(data);
    }
}
SomeReadUtilClass is a utility class that reads the file, and readValueFromFile is the function where you can write your logic to read the value from the file.
The Double result is what matters most: you can compute it from a database or read it from a file, and then pass it to the MeasurementDataNumeric constructor as MeasurementDataNumeric(request, result).
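The "HH:MM value" file format above is easy to parse line by line. Here is a hedged sketch (in JavaScript rather than the plugin's Java, and with a function name and baseDate argument that are my own inventions — real files may encode the date differently):

```javascript
// Parse "HH:MM value" lines into { timestamp, value } pairs.
// baseDate is the day the file covers (an assumption for this sketch).
function parseMeasurements(text, baseDate) {
  return text.split('\n')
    .filter(function (line) { return /^\d{2}:\d{2}\s+\S+/.test(line); })
    .map(function (line) {
      var parts = line.trim().split(/\s+/);
      var hm = parts[0].split(':');
      var ts = new Date(baseDate.getTime());
      ts.setHours(parseInt(hm[0], 10), parseInt(hm[1], 10), 0, 0);
      return { timestamp: ts.getTime(), value: parseFloat(parts[1]) };
    });
}
```

Each resulting pair could then be fed into a MeasurementDataNumeric with the parsed timestamp, as described above.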

Autocomplete with Firebase

How does one use Firebase to do basic auto-completion/text preview?
For example, imagine a blog backed by Firebase where the blogger can tag posts with tags. As the blogger is tagging a new post, it would be helpful if they could see all currently-existing tags that match the first few keystrokes they've entered. So if "blog," "black," "blazing saddles," and "bulldogs" were tags and the user types "bl", they should get the first three but not "bulldogs."
My initial thought was that we could set the tag with the priority of the tag, and use startAt, such that our query would look something like:
fb.child('tags').startAt('bl').limitToFirst(5).once('value', function(snap) {
console.log(snap.val())
});
But this would also return "bulldog" as one of the results (not the end of the world, but not the best either). Using startAt('bl').endAt('bl') returns no results. Is there another way to accomplish this?
(I know that one option is that this is something we could use a search server, like ElasticSearch, for -- see https://www.firebase.com/blog/2014-01-02-queries-part-two.html -- but I'd love to keep as much in Firebase as possible.)
Edit
As Kato suggested, here's a concrete example. We have 20,000 users, with their names stored as such:
/users/$userId/name
Oftentimes, users will be looking up another user by name. As a user is looking up their buddy, we'd like a drop-down to populate a list of users whose names start with the letters that the searcher has inputted. So if I typed in "Ja" I would expect to see "Jake Heller," "jake gyllenhaal," "Jack Donaghy," etc. in the drop-down.
I know this is an old topic, but it's still relevant. Based on Neil's answer above, you can search more easily by doing the following:
fb.child('tags').startAt(queryString).endAt(queryString + '\uf8ff').limit(5)
See Firebase Retrieving Data.
The \uf8ff character used in the query above is a very high code point
in the Unicode range. Because it is after most regular characters in
Unicode, the query matches all values that start with queryString.
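The effect is easy to see outside Firebase: lexicographically, every string beginning with "bl" sorts between "bl" and "bl\uf8ff", while "bulldogs" does not. A quick sketch simulating the range check on plain strings:

```javascript
// Simulate the startAt/endAt range on plain string comparison.
var tags = ['black', 'blazing saddles', 'blog', 'bulldogs'];

function inRange(s, q) {
  return s >= q && s <= q + '\uf8ff';
}

var matches = tags.filter(function (t) { return inRange(t, 'bl'); });
console.log(matches); // ['black', 'blazing saddles', 'blog']
```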
As inspired by Kato's comments -- one way to approach this problem is to set the priority to the field you want to search on for your autocomplete and use startAt(), limit(), and client-side filtering to return only the results that you want. You'll want to make sure that the priority and the search term are lower-cased, since Firebase is case-sensitive.
This is a crude example to demonstrate this using the Users example I laid out in the question:
For a search for "ja", assuming all users have their priority set to the lowercased version of the user's name:
fb.child('users').
  startAt('ja'). // the user-inputted search
  limitToFirst(20).
  once('value', function (snap) {
    var users = snap.val();
    for (var key in users) {
      if (users[key].name.toLowerCase().indexOf('ja') === 0) {
        console.log(users[key].name);
      }
    }
  });
This should only return the names that actually begin with "ja" (even if Firebase actually returns names alphabetically after "ja").
I chose limitToFirst(20) to keep the response size small and because, realistically, you'll never need more than 20 results for an autocomplete drop-down. There are probably better ways to do the filtering, but this should at least demonstrate the concept.
Hope this helps someone! And it's quite possible the Firebase guys have a better answer.
(Note that this is very limited -- if someone searches for the last name, it won't return what they're looking for. Hence the "best" answer is probably to use a search backend with something like Kato's Flashlight.)
It strikes me that there's a much simpler and more elegant way of achieving this than client side filtering or hacking Elastic.
By converting the search key into its Unicode char codes and storing that as the priority, you can search with startAt() and endAt() by incrementing the value by one.
var start = "ABA";
var pad = "AAAAAAAAAA";
start += pad.substring(0, pad.length - start.length);

var blob = new Blob([start]);
var reader = new FileReader();
reader.onload = function (e) {
  var typedArray = new Uint8Array(e.target.result);
  var array = Array.prototype.slice.call(typedArray);
  var priority = parseInt(array.join(""));
  console.log("Priority of", start, "is:", priority);
};
reader.readAsArrayBuffer(blob);
You can then limit your search priority to the key "ABB" by incrementing the last charCode by one and doing the same conversion:
var limit = String.fromCharCode(start.charCodeAt(start.length - 1) + 1);
limit = start.substring(0, start.length - 1) + limit;
"ABA..." to "ABB..." ends up with priorities of:
Start: 65666565656565650000
End: 65666665656565650000
Simples!
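The Blob/FileReader round-trip above can be collapsed into a small helper. This sketch (the function names are mine) pads the key, concatenates the char codes, and derives the exclusive upper bound the same way; note that parseInt on 20 digits exceeds JavaScript's 53-bit integer precision, so the priority is approximate but still ordered for short keys:

```javascript
// Pad key to a fixed width, then join the char codes into one number.
function keyPriority(key, width) {
  var padded = (key + 'AAAAAAAAAA').substring(0, width || 10);
  var codes = '';
  for (var i = 0; i < padded.length; i++) {
    codes += padded.charCodeAt(i);
  }
  // Beyond Number.MAX_SAFE_INTEGER this loses precision, but ordering survives.
  return parseInt(codes, 10);
}

function upperBoundKey(key) {
  // Increment the last character: "ABA" -> "ABB"
  return key.substring(0, key.length - 1) +
         String.fromCharCode(key.charCodeAt(key.length - 1) + 1);
}
```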
Based on Jake and Matt's answers, here is an updated version for SDK 3.1 ('.limit' no longer works):
firebaseDb.ref('users')
  .orderByChild('name')
  .startAt(query)
  .endAt(`${query}\uf8ff`)
  .limitToFirst(5)
  .on('child_added', (child) => {
    console.log({
      id: child.key,
      name: child.val().name
    });
  });

Why does Couchbase round numeric values in a JSON document?

Has anyone noticed that Couchbase changes the numerical value of a property, above a certain size, when saving a JSON document?
Here is an example. For this test, I used live input via the Couchbase web interface.
The property "inputValue" records the value entered into the property "valueAfterSave" before clicking the save button.
The property "valueAfterSave" is the value after the save.
With a number of 16 digits, it's fine:
{
"inputValue": "1234567890123456",
"valueAfterSave": 1234567890123456
}
But starting at 17 digits, the system begins to change the value:
{
"inputValue": "12345678901234567",
"valueAfterSave": 12345678901234568
}
or
{
"inputValue": "12345678901234599",
"valueAfterSave": 12345678901234600
}
or
{
"inputValue": "12345678901234567890",
"valueAfterSave": 12345678901234567000
}
Just out of curiosity, with 40 digits:
{
"inputValue": "1234567890123456789012345678901234567890",
"valueAfterSave": 1.234567890123457e+39
}
Is this behavior specified somewhere? Is there a way to change it?
There is the workaround of storing these as String values, but I admit that I'm curious.
I use Couchbase Server 2.1.0 on Windows 7 Pro 32-bit platform.
Tugdual Grall, technical evangelist at Couchbase, gave me the answer.
This is due to the behavior of JavaScript when displaying such values, as evidenced by the following test with Node.js:
$ node
> console.log(12345678901234567890)
12345678901234567000
On the other hand, the value returned by the Java API is correct (12345678901234567890 in our example). It is just the console which shows this difference.
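The cutoff follows from IEEE-754 doubles: JavaScript integers are only exact up to Number.MAX_SAFE_INTEGER (2^53 - 1, i.e. 9007199254740991, a 16-digit number), which matches the behavior observed above. A quick check in Node.js:

```javascript
// Integers beyond 2^53 - 1 cannot all be represented exactly as doubles.
console.log(Number.MAX_SAFE_INTEGER);                 // 9007199254740991
console.log(Number.isSafeInteger(1234567890123456));  // true  (16 digits)
console.log(Number.isSafeInteger(12345678901234567)); // false (17 digits)
console.log(String(12345678901234567890));            // "12345678901234567000"
```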
If you modify the document, through the web administration console, it is the modified value which will be saved.
So be careful when using the administration console to handle this type of data.
Tug thank you.

Maximum number of characters in a page name in CQ5

When we create a page using scaffolding, the page name is limited to a maximum of 20 characters. Is there a way to override that validation? Thanks in advance.
In the default scaffolding (/libs/wcm/scaffolding/components/scaffolding/body.jsp), on line 242 you see the following code:
var title = frm.findField("./jcr:content/jcr:title");
if (title) {
  var hint = title.getValue();
  if (hint) {
    params[":nameHint"] = hint;
  }
}
The main thing to take away from this code is params[":nameHint"]. This param is submitted when you create a page. The nameHint parameter is what causes the node name to be limited to a certain number of characters. When nameHint is submitted, it runs through a filter which formats the name for the JCR. This is done to ensure a valid JCR name; it is for your protection. You can read more about this in the "Algorithm for Node Name Creation" section on this page: http://sling.apache.org/site/manipulating-content-the-slingpostservlet-servletspost.html
To work around this, you would need to change params[":nameHint"] to params[":name"]. Just remember that this won't ensure a valid JCR name. If this is a concern, you can always write some code to convert the title to a valid JCR name and then set it as the :name param.
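If you go the :name route, a sanitizer along these lines keeps the name safe. This is only an illustration of the idea (it mimics, but does not reproduce, the SlingPostServlet's actual filtering):

```javascript
// Lower-case the title and collapse anything outside [a-z0-9] into hyphens,
// then trim leading/trailing hyphens, to get a JCR-friendly node name.
function toJcrName(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

console.log(toJcrName('My Fancy Page Title!')); // "my-fancy-page-title"
```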
One other thing: I did read that "when :nameHint is filtered it cuts the name to a configurable maximum length (default is 20 characters)". I cannot find where this is configured, though.
Configure the CQ POST servlet and you should be good.

Make Lucene index a value and store another

I want Lucene.NET to store a value while indexing a modified, stripped-down version of the stored value. e.g. Consider the value:
this_example-has some/weird (chars) 100%
I want it stored exactly like that (so that I can retrieve exactly that for showing in the results list), but I want Lucene to index it as:
this example has some weird chars 100
(you see, like a "sanitized" version of the original value) for a simplified search.
I figure this would be the job of an analyzer, but I don't want to mess with rolling my own. Ideally, the solution should remove everything that is not a letter, a number, or quotes, replacing the removed chars with a space before indexing.
Any suggestions on how to implement that?
This is because I am indexing products for an e-commerce search, and some have really creepy names. I think this would improve search relevance.
Thanks in advance.
If you don't want a custom analyzer, try storing the value as a separate non-indexed field, and use a simple regex to generate the sanitized version:
var input = "this_example-has some/weird (chars) 100%";
var output = Regex.Replace(input, @"[\W_]+", " ");
// output: "this example has some weird chars 100 " (note the trailing space from '%')
You mention that you need another analyzer for some searching functionality. Don't forget the PerFieldAnalyzerWrapper, which allows you to use different analyzers within the same document.
public static void Main() {
    var wrapper = new PerFieldAnalyzerWrapper(defaultAnalyzer: new StandardAnalyzer(Version.LUCENE_29));
    wrapper.AddAnalyzer(fieldName: "id", analyzer: new KeywordAnalyzer());

    IndexWriter writer = null; // TODO: Retrieve these.
    Document document = null;

    writer.AddDocument(document, analyzer: wrapper);
}
You are correct that this is the work of the analyzer. I'd start by using a tool like Luke to see what the StandardAnalyzer does with your term before deciding what to use; it tends to do a good job of stripping noise characters and words.