My Hyperledger Sawtooth validators ran out of storage

Is there any way to move data out of the ledger while keeping the data consistent?

With Sawtooth version 1.1 (Bumper), "pruning" was introduced, which lets you remove generational state history through configurable settings. Check the online documentation or the implementation on GitHub to see if it addresses your needs specifically.
The integrity of the chain itself is preserved, of course.

Related

Enabling enhanced measurement programmatically

We are using the Admin API to programmatically create Data Streams for given properties: https://developers.google.com/analytics/devguides/config/admin/v1/rest/v1beta/properties.dataStreams/create.
However, all of the created Data Streams have enhanced measurement turned off by default. Is there, or will there be, a way to enable it as part of the create request?
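For context, the create call itself looks roughly like this (a minimal sketch assuming the Node.js client @google-analytics/admin; the property ID and site URI are placeholders):

```typescript
// Minimal sketch: create a web Data Stream via the Admin API (v1beta).
// Assumes Application Default Credentials; property ID and URI are placeholders.
import { v1beta } from '@google-analytics/admin';

const client = new v1beta.AnalyticsAdminServiceClient();

async function createWebStream(): Promise<void> {
  const [stream] = await client.createDataStream({
    parent: 'properties/123456789', // placeholder GA4 property
    dataStream: {
      type: 'WEB_DATA_STREAM',
      displayName: 'My web stream',
      webStreamData: { defaultUri: 'https://example.com' },
    },
  });
  // The returned stream comes back with enhanced measurement off; the v1beta
  // DataStream resource has no field to enable it in the same request.
  console.log(stream.name);
}

createWebStream().catch(console.error);
```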
Thank you
Okay, after a bit of digging.
Checking the documentation for EnhancedMeasurementSettings, we can find the following:
EnhancedMeasurementSettings appears to apply only to web DataStreams:
Singleton resource under a WebDataStream, configuring measurement of additional site interactions and content.
The issue is that this is only listed under v1alpha, not the beta version of the API.
Then, looking a bit further, we find:
2021-10-12 EnhancedMeasurementSettings methods removed from the API.
Which seems to say it was removed but would be added back; however, it's a year later and nothing has happened.
I am going to ping the team and see when it will be added back.

Distributed database which allows custom CRDT merging

I'm rather new to distributed databases, though I have already studied the related literature (e.g. the CAP theorem, CRDTs) and implemented a POC to allow scaling my application horizontally.
Now, however, I face a challenging problem. In order to scale the app horizontally, communication between services is done via a distributed queue. As background: I require a custom CRDT merge function to keep the data eventually consistent, and I require my application to work like a cache (loosely comparable to Redis).
The challenge is that I also need to persist the data, which requires keeping the data in the application cache and the database eventually consistent. I've looked at Cassandra and saw a ticket [1] where somebody tried to add support for custom CRDT merge functions (which, as mentioned, I require for a reason). That never made it into Cassandra and seems to have a few unresolved issues.
What are my options, either in the form of a concrete distributed database engine that allows custom merging, or an algorithm that could help solve the problem (e.g. in the form of a DB trigger or something similar)?
[1] https://issues.apache.org/jira/browse/CASSANDRA-6412
As far as I know, there are very few databases that allow you to specify your own custom conflict resolution algorithm. To be honest, the only one I really found (disclaimer: I'm not a Microsoft advocate) is Azure Cosmos DB. It has a MongoDB-compatible API and can be configured with a multi-master replication strategy, where you specify your own conflict resolution procedure (written in JavaScript). You can use it to define your own merge operation.
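For illustration, the custom resolver is a JavaScript stored procedure you register on the container. This sketch follows the documented resolver signature but simplifies the body; the version comparison is a placeholder where your own CRDT-style merge would go:

```javascript
// Simplified sketch of a Cosmos DB custom conflict-resolution stored procedure.
// The (incomingItem, existingItem, isTombstone, conflictingItems) signature is
// the documented shape; the "version" winner-picking below is a placeholder
// for your own merge logic (e.g. merge both states and upsert the result).
function resolver(incomingItem, existingItem, isTombstone, conflictingItems) {
  var collection = getContext().getCollection();

  if (isTombstone) {
    // The conflicting operation was a delete; let the delete win.
    return;
  }

  var winner = (!existingItem || incomingItem.version > existingItem.version)
    ? incomingItem
    : existingItem;

  var accepted = collection.upsertDocument(
    collection.getSelfLink(),
    winner,
    function (err) { if (err) { throw err; } }
  );
  if (!accepted) {
    throw new Error('upsertDocument was not accepted; resolution will retry');
  }
}
```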
If you look beyond database-native solutions to application-level ones, there are several tools, e.g. Akka (available in both JVM and .NET versions), which lets you write custom CRDTs inside its distributed-data module. The JVM version additionally supports multi-datacenter persistence, which is conceptually closer to how commutative CRDTs work and can be integrated with a Cassandra backend.
I've implemented a MerkleClock CRDT at my merkle-crdt repository.
You could use an approach where, when you update a database record's column, you first fetch the column's current value, merge it with the CRDT of your current state, and then, on save, serialise the merged CRDT as JSON and store it back in the database.
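A minimal sketch of that read-merge-write cycle, assuming a grow-only counter as the CRDT; loadColumn and saveColumn are hypothetical stand-ins for your database access layer:

```typescript
// Sketch: read-merge-write a G-Counter CRDT stored as JSON in one column.
// loadColumn/saveColumn are hypothetical stand-ins for your DB access layer.
type GCounter = Record<string, number>; // replica id -> count

// Pointwise max is the CRDT merge for a grow-only counter.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const [replica, n] of Object.entries(b)) {
    out[replica] = Math.max(out[replica] ?? 0, n);
  }
  return out;
}

declare function loadColumn(key: string): Promise<string | null>;
declare function saveColumn(key: string, value: string): Promise<void>;

async function persist(key: string, local: GCounter): Promise<void> {
  const stored = await loadColumn(key);             // 1. fetch current value
  const remote: GCounter = stored ? JSON.parse(stored) : {};
  const merged = merge(local, remote);              // 2. CRDT merge
  await saveColumn(key, JSON.stringify(merged));    // 3. write back as JSON
}
```

Note the read-modify-write window: without a compare-and-set or retry loop, two concurrent writers can still interleave, though with a commutative, idempotent merge the next read-merge-write converges again.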

Exporting data from Bluemix Presence Insights

I'm trying to export data from Presence Insights on Bluemix. I followed this documentation:
https://presenceinsights.ng.bluemix.net/pidocs/analytics/export/
However, I can't seem to find the export button mentioned in the document.
Data can be exported from the IBM Presence Insights Dashboard if you have data available. There are also REST APIs for exporting data. They are documented in the Floors, Sites, and Zones sections of the API Reference.
There were REST APIs in the product some time ago, but they were found to have limitations that made them less useful in production. In particular, the amount of data that builds up pushed the API's response time beyond what the Bluemix infrastructure allowed, so API requests would time out. As a result, the APIs were backed out, but it appears the documentation was left behind; that will be removed shortly.
Presence Insights still understands the value of exporting the data, so a new scheme is under investigation. For example, it would be ideal if the data could be exported under the covers to a production data storage facility on a regular schedule.
In the interim, an alternative is to use a Subscription to receive the backend enter/exit/dwell/timeout events directly and roll your own solution to store only what you need, in whatever facility works for your application.
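Purely as an illustration of that roll-your-own approach (the event shape and the saveEvent helper here are hypothetical, not the actual Presence Insights payload or API):

```typescript
// Hypothetical sketch: filter subscription events and persist only what's needed.
// The PresenceEvent shape and saveEvent() are assumptions, not the real API.
interface PresenceEvent {
  type: 'enter' | 'exit' | 'dwell' | 'timeout';
  deviceId: string;
  zoneId: string;
  timestamp: string;
}

declare function saveEvent(e: PresenceEvent): Promise<void>; // your own store

async function handleEvent(e: PresenceEvent): Promise<void> {
  // Keep only the event types your application actually reports on.
  if (e.type === 'enter' || e.type === 'exit') {
    await saveEvent(e);
  }
}
```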

Where is the MongoDB integration with Splunk's Hunk?

I've been looking into NoSQL integration for Splunk using Hunk, but there seems to be virtually no documentation on this process.
Is it possible? Are there docs?
I'm using Hunk 6.1.
Specifically, I am trying to use MongoDB as a backend datastore for Splunk.
The short answer is that there are likely no other NoSQL integrations out of the box, and you'll have to roll your own for now.
Hunk has been marketed widely as "Splunk for Hadoop", and that is its first use case (this can still be seen in the Developer PDF). General user documentation can be found at http://docs.splunk.com/Documentation/Hunk/latest.
But the design of Hunk is that you define a "Virtual Index" which, instead of connecting to and querying data from Splunk's built-in indexes, connects externally to another NoSQL data store. This requires a library that makes requests to the NoSQL store and streams the results back to Splunk (as can be seen in the diagram on that page).
I suspect they may expand the virtual index concept and ship libraries for other NoSQL datastores as demand grows, but for now they have provided an example app that reportedly includes documentation inside the app itself to guide you in rolling your own connector library: http://apps.splunk.com/app/1802/
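Real connector libraries are built against Splunk's external result provider interface; purely to illustrate the "query the store, stream results back" shape of the job, here is a sketch using the official MongoDB Node.js driver (the connection URI, database, and collection names are placeholders):

```typescript
// Illustration only: query MongoDB and stream results as JSON lines, the
// general shape of what a virtual-index connector has to do. Real Hunk
// connectors implement Splunk's external result provider protocol.
import { MongoClient, Document } from 'mongodb';

async function streamResults(query: Document): Promise<void> {
  const client = new MongoClient('mongodb://localhost:27017'); // placeholder URI
  await client.connect();
  try {
    const events = client.db('mydb').collection('events'); // placeholder names
    for await (const doc of events.find(query)) {
      process.stdout.write(JSON.stringify(doc) + '\n'); // one result per line
    }
  } finally {
    await client.close();
  }
}

streamResults({}).catch(console.error);
```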
This functionality now exists with Hunk and the Hunk App for MongoDB. Here is the announcement: http://www.mongodb.com/presentations/splunks-hunk-powerful-way-visualize-your-data-stored-mongodb
The app is available here: https://apps.splunk.com/app/1810/
I hope that helps!

Free CORS proxy

I have a Chrome extension that, for now, uses storage.sync as a data store.
I am trying to upgrade to a more robust data store (Cloudant, Iris Couch) so I can expand what the extension can do.
The problem I am facing is that none of the free cloud store options I have seen (Cloudant, Iris Couch, MongoLab) supports CORS (or, in the case of MongoLab, the free tier is not enough).
So I have a few questions:
Is there a free NoSQL cloud store option that fully supports CORS?
If not, is there a free proxy I can use? I have tried http://www.corsproxy.com/, but it only seems to support GET, so it's not enough for me.
How would you solve this problem for free, without hosting anything on my own hardware?
Thank you!
Cloudant will have CORS support from January 2014. Iris Couch also supports CORS, just not via a UI - you have to modify the configuration via the CouchDB API.
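For example, enabling CORS on a CouchDB 1.x instance comes down to a few PUTs against its _config endpoint (a sketch using Node's built-in fetch; the host, credentials, and origin are placeholders):

```typescript
// Sketch: enable CORS on CouchDB 1.x via its _config API.
// Host, credentials, and origin below are placeholders.
const base = 'https://myhost.iriscouch.com/_config';
const auth = 'Basic ' + Buffer.from('admin:password').toString('base64');

async function putConfig(path: string, value: string): Promise<void> {
  // CouchDB expects the new value as a JSON-encoded string in the body.
  const res = await fetch(`${base}/${path}`, {
    method: 'PUT',
    headers: { Authorization: auth, 'Content-Type': 'application/json' },
    body: JSON.stringify(value),
  });
  if (!res.ok) throw new Error(`PUT ${path} failed: ${res.status}`);
}

async function enableCors(): Promise<void> {
  await putConfig('httpd/enable_cors', 'true');
  await putConfig('cors/origins', 'chrome-extension://<your-extension-id>');
  await putConfig('cors/credentials', 'true');
  await putConfig('cors/methods', 'GET, PUT, POST, HEAD, DELETE');
  await putConfig('cors/headers', 'accept, authorization, content-type, origin');
}

enableCors().catch(console.error);
```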
Couchappy supports CORS and PouchDB, and introduces a selective domain security feature, which allows fine-grained security (HTTPS only, or both HTTPS and HTTP) on a per-domain basis.