I've tried searching for documentation on the "new" neo4j-admin import tool but have not found anything regarding its usage compared to the soon-to-be-deprecated neo4j-import tool.
I receive no fatal errors when using neo4j-import and am able to query the database. The command I use for import is as follows:
./neo4j-import \
--into /Users/rlinchan/Applications/neo4j-community-3.0.8/data/databases/graph.db/ \
--stacktrace 'true' \
--delimiter "|" \
--array-delimiter ";" \
--quote "\"" \
--bad-tolerance 10000 \
--nodes "/Users/rlinchan/Desktop/v4/nodes/disease_ontology.do.nodes.csv" \
--nodes "/Users/rlinchan/Desktop/v4/nodes/ebi.chebi.nodes.csv" \
--relationships "/Users/rlinchan/Desktop/v4/edges/disease_ontology.do.edges.csv" \
--relationships "/Users/rlinchan/Desktop/v4/edges/ebi.chebi.edges.csv"
There are many more node and relationship files being loaded (~6 GB of data), but I've excluded them here for brevity. The issue I'm having: upon a cursory inspection of the data in the Neo4j browser, I'm unable to see the relationship types in the "Database Information" section.
I am able to query the database and receive results in the browser, showing that the relationships do exist. However, I am not able to modify the color, size, or label of nodes and edges in the Neo4j browser visualizations, which I need for publication figures. All nodes and edges are gray, and my selections of color, size, and label are not applied to the query results.
Is this an issue with my data import? I've used this command for the import tool on various other Neo4j versions without fault.
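For reference, my best guess at the equivalent neo4j-admin import invocation on 3.1+ is below; I haven't verified it, so treat the flag names as assumptions to check against neo4j-admin help import:

# unverified sketch: --mode=csv and --database replace --into,
# with the store written under data/databases/<database>
./bin/neo4j-admin import --mode=csv \
  --database=graph.db \
  --delimiter="|" \
  --array-delimiter=";" \
  --quote='"' \
  --nodes="/Users/rlinchan/Desktop/v4/nodes/disease_ontology.do.nodes.csv" \
  --relationships="/Users/rlinchan/Desktop/v4/edges/disease_ontology.do.edges.csv"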
Here are examples of the node files and edge files, if that helps at all:
Node Header
source_id:ID|name:string|synonyms:string[]|definition:string|:LABEL
Edge Header
:START_ID|:TYPE|source:string|:END_ID
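For illustration, hypothetical rows matching those headers (values invented) would look like:

DOID:0001816|angiosarcoma|hemangiosarcoma;haemangiosarcoma|A malignant vascular tumor.|disease_ontology.do
DOID:0001816|is_a|disease_ontology.do|DOID:0002816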
The labels assigned to node types will throw an error if the label contains special characters, a period in particular.
Previous graph databases I had created worked just fine regardless of the Neo4j version, so I realized it had to be the dataset itself.
Periods in a label presumably collide with naming conventions in JavaScript or Java itself, where periods are used for accessing classes, properties, and methods.
I've simply renamed the labels in my dataset by replacing periods with underscores, and coloring, naming, and size modifications in the Neo4j browser are no longer an issue (a rough sketch of the rename is below).
Neo4j Browser Modifications (since I can't post images)
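For anyone hitting the same thing, a rough sketch of the rename, assuming (as in my node header above) that the label is the last |-delimited field:

# hypothetical one-liner: swap periods for underscores in the :LABEL column only
awk -F'|' 'BEGIN{OFS="|"} {gsub(/\./, "_", $NF); print}' \
  disease_ontology.do.nodes.csv > disease_ontology_do.nodes.csv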
It could just be that some of the metadata in the browser is temporarily out of sync with the server. Try disconnecting from the server using the browser command :server disconnect, and then logging back in.
There's an article, "Tracking Every Release", which describes displaying a vertical line on graphs for every code deployment. They are using Graphite. I would like to do something similar with Prometheus 2.2 and Grafana 5.1; more specifically, I want to get an "application start" event displayed on a graph.
Grafana annotations seem to be the appropriate mechanism for this, but I can't figure out what type of Prometheus metric to use and how to query it.
The simplest way to do this is via the same basic approach as in the article, by having your deployment tool tell Grafana when it performs a deployment.
Grafana has a built-in system for storing annotations, which are displayed on graphs as vertical lines and can have text associated with them. It would be as simple as creating an API key in your Grafana instance and adding a curl call to your deploy script:
curl -H "Authorization: Bearer <apikey>" http://grafana:3000/api/annotations -H "Content-Type: application/json" -d '{"text":"version 1.2.3 deployed","tags":["deploy","production"]}'
For more info on the available options check the documentation:
http://docs.grafana.org/http_api/annotations/
Once you have your deployments being added as annotations, you can display those on your dashboard by going to the Annotations tab in the dashboard settings and adding a new annotation source.
Then the annotations will be shown on the panels in your dashboard.
You can get the same result purely from Prometheus metrics, no need to push anything into Grafana:
If you wanted to track all restarts your search expression could be something like:
changes(start_time_seconds{job="foo",env="prod"}[5m]) > 0
Or something like this if you only wanted to track version changes (and you had some sort of info metric that provided the version):
alertmanager_build_info unless max_over_time(alertmanager_build_info[1d] offset 5m)
The latter expression should only produce an output for 5 minutes whenever a new alertmanager_build_info metric appears (i.e. one with different labels such as version). You can further tweak it to only produce an output when version changes, e.g. by aggregating away all other labels.
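If you want to sanity-check either expression outside Grafana, you can run it against the Prometheus HTTP API directly; a sketch, with a placeholder hostname:

# sketch: evaluate the restart-tracking expression via the query API
curl -G 'http://prometheus:9090/api/v1/query' \
  --data-urlencode 'query=changes(start_time_seconds{job="foo",env="prod"}[5m]) > 0'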
A note here, as technology has evolved: we get deployment job state information in Prometheus metrics format, scraped directly from the community edition of HashiCorp's Nomad, and we view this information in Grafana.
In your case, you would just add an additional query to an existing panel to overlay job start events, which for us are equivalent to new deployments. There are a lot of related metrics available "out of the box," such as one for a change in job version, that can be considered as well. The main point is that no additional work is required besides adding a query in Grafana.
I'm currently writing code to use Amazon's S3 REST API and I notice different behavior where the only difference seems to be the Amazon endpoint URI that I use, e.g., https://s3.amazonaws.com vs. https://s3-us-west-2.amazonaws.com.
Examples of different behavior for the GET Bucket (List Objects) call:
Using one endpoint, it includes the "folder" in the results, e.g.:
/path/subfolder/
/path/subfolder/file1.txt
/path/subfolder/file2.txt
and, using the other endpoint, it does not include the "folder" in the results:
/path/subfolder/file1.txt
/path/subfolder/file2.txt
Using one endpoint, it represents "folders" using a trailing / as shown above and, using the other endpoint, it uses a trailing _$folder$:
/path/subfolder_$folder$
/path/subfolder/file1.txt
/path/subfolder/file2.txt
Why the differences? How can I make it return results in a consistent manner regardless of endpoint?
Note that I get these same odd results even if I use Amazon's own command-line AWS S3 client, so it's not my code.
And the contents of the buckets should be irrelevant anyway.
Your assertion notwithstanding, your issue is exactly about the content of the buckets, and not something S3 is doing -- the S3 API has no concept of folders. None. The S3 console can display folders, but this is for convenience -- the folders are not really there -- or if there are folder-like entities, they're irrelevant and not needed.
In Amazon S3, buckets and objects are the primary resources, where objects are stored in buckets. Amazon S3 has a flat structure with no hierarchy like you would see in a typical file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. Amazon S3 does this by using key name prefixes for objects.
http://docs.aws.amazon.com/AmazonS3/latest/UG/FolderOperations.html
So why are you seeing this?
Either you've been using EMR/Hadoop, or some other code written by someone who took a bad example and ran with it... or whose code has been doing things differently than it should have for quite some time.
Amazon EMR is a web service that uses a managed Hadoop framework to process, distribute, and interact with data in AWS data stores, including Amazon S3. Because S3 uses a key-value pair storage system, the Hadoop file system implements directory support in S3 by creating empty files with the <directoryname>_$folder$ suffix.
https://aws.amazon.com/premiumsupport/knowledge-center/emr-s3-empty-files/
This may have been something the S3 console did many years ago, and apparently (since you don't report seeing them in the console) it still supports displaying such objects as folders in the console... but the S3 console no longer creates them this way, if it ever did.
I've mirrored the bucket "folder" layout exactly
If you create a folder in the console, an empty object with the key "foldername/" is created. This in turn is used to display a folder that you can navigate into, and upload objects with keys beginning with that folder name as a prefix.
The Amazon S3 console treats all objects that have a forward slash "/" character as the last (trailing) character in the key name as a folder
http://docs.aws.amazon.com/AmazonS3/latest/UG/FolderOperations.html
If you just create objects using the API, then "my/object.txt" appears in the console as "object.txt" inside folder "my" even though there is no "my/" object created... so if the objects are created with the API, you'd see neither style of "folder" in the object listing.
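You can reproduce both kinds of "folder" with the AWS CLI; a sketch, with a placeholder bucket:

# a console-style "folder": a zero-byte object whose key ends in "/"
aws s3api put-object --bucket my-bucket --key path/subfolder/
# a plain object created via the API: no folder object is created at all
aws s3api put-object --bucket my-bucket --key path/subfolder/file1.txt --body file1.txt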
That is probably a bug in the API endpoint which includes the "folder" - S3 internally doesn't actually have a folder structure, but instead is just a set of keys associated with files, where keys (for convenience) can contain slash-separated paths which then show up as "folders" in the web interface. There is the option in the API to specify a prefix, which I believe can be any part of the key up to and including part of the filename.
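For example, a prefix-plus-delimiter listing (placeholder bucket) rolls keys up to the next "/" into CommonPrefixes, which is how the web interface derives its "folders":

# sketch: "folders" under path/ come back as CommonPrefixes, not Contents
aws s3api list-objects-v2 --bucket my-bucket --prefix path/ --delimiter /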
EMR's s3 client is not the apache one, so I can't speak accurately about it.
In ASF Hadoop releases (and HDP, CDH):
The older s3n:// client uses $folder$ as its folder delimiter.
The newer s3a:// client uses / as its folder marker, but will handle $folder$ if there. At least it used to; I can't see where in the code it does now.
The S3A clients strip out all folder markers when you list things; S3A uses them to simulate empty dirs and deletes all parent markers when you create child file/dir entries.
Whatever you have which processes the GET results should just ignore entries ending with "/" or _$folder$.
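A sketch of such a filter over a CLI listing (placeholder bucket; patterns per the markers above):

# drop folder-marker keys (trailing "/" or "_$folder$") from a listing
aws s3api list-objects-v2 --bucket my-bucket --query 'Contents[].Key' --output text \
  | tr '\t' '\n' \
  | grep -v -e '/$' -e '_\$folder\$$'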
As to why they are different: the local EMRFS is a different codepath, using DynamoDB for implementing consistency. At a guess, it doesn't need to mock empty dirs, as the DDB tables will host all the directory entries.
I'm currently building a web frontend for a Matlab program. I'm using webread/webwrite to interface with the Firebase realtime database (though I'll be shifting to urlread2 soon for compatibility reasons). The Matlab end has to delete nodes from the database on a regular basis. I do this by using webwrite to send a POST request with "X-HTTP-Method-Override: DELETE" in the header. This works, but after a few deletes it stops working until data is either added to or removed from the database. It seems completely random; my teammate and I have been trying to find a pattern for a few days and we've found nothing.
Here is the relevant Matlab code:
modurl = strcat(url, modkey, '.json');   % build the node URL: <url><modkey>.json
modurlstr = char(modurl);                % webwrite expects a character vector
webop = weboptions('KeyName', 'X-HTTP-Method-Override', 'KeyValue', 'DELETE');
webwrite(modurlstr, webop);              % sends a POST carrying the DELETE override header
Where url is our database url and modkey is the key of the node we're trying to delete. There's no authentication because the database is set to public (Security is not an issue for us).
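For reference, the same request expressed as curl (placeholder URL and key), which can help rule out the Matlab side when debugging:

# sketch: delete via POST with the method-override header, as the Matlab code does
curl -X POST 'https://your-db.firebaseio.com/somekey.json' \
  -H 'X-HTTP-Method-Override: DELETE'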
The database is organized pretty simply. The root node just has a bunch of children. We only delete a whole child (i.e. we don't ever try to delete the individual components of a child).
Are we doing something wrong?
Thanks in advance!
We found out some of the keys had hyphens in them, which were getting translated to their ASCII representation. The reason it seemed random was that the delete was only bugging out on the nodes that had a hyphen in their keys. When we switched them back, everything worked fine.
Using the New Relic REST API v2, is there a way to fetch all labels associated with a server? I.e., I have the server ID, but I want the labels attached to it, just like you would with AWS EC2 tags or Chef tag attributes. I've checked their API docs, but I haven't found anything relevant yet... Thanks!
Probably not without a little coding. This command:
curl -X GET 'https://api.newrelic.com/v2/labels.json' \
-H 'X-Api-Key:{api_key}' -i
will return all the labels on the account. Part of the output returned contains a "links" section for each label that will list the applications and servers the label is applied to. It will appear like this (in part) showing a single server, in this case:
"links": {
"applications": [],
"servers": [
5980960
]
You would have to 'walk' through the list of labels returned by that request, looking for the server ID of the one you are interested in. Remember that the output is paginated, so you may have more than one page to investigate.
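A sketch of that walk with curl and jq; the page count, API key, and exact response shape are assumptions to verify against the docs:

# scan paginated labels output for labels attached to server 5980960
for page in 1 2 3; do
  curl -s "https://api.newrelic.com/v2/labels.json?page=${page}" \
    -H "X-Api-Key:${API_KEY}" \
    | jq '.labels[] | select(.links.servers | index(5980960)) | .key'
done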
It would be worth contacting New Relic support and suggesting a feature request.
I know that it is possible to use REST to display the value of a key that is part of a Map stored in Hazelcast.
Using curl this would be done this way:
curl -v -X GET http://127.0.0.1:5701/hazelcast/rest/maps/myMap/nameOfKeyInMap
But what if I don't know which keys were used? Following REST principles, I skipped the nameOfKeyInMap part of the URL and expected Hazelcast to return a list of keys. But it didn't.
Do you know a way to get at least a part of the keys used? I use Hazelcast 3.4 and checked the manual, but didn't find a solution.
I want to use REST because the people who need to check the maps don't have much more than a browser on their computers.