Can I restore a soft-deleted entity in Apache Atlas? - apache-atlas

I want to restore a soft-deleted entity in Apache Atlas, because I want to delete the classification tagged on that entity.
I'm using Atlas for data governance in an HDP cluster, and I deleted an entity without first removing the classification tagged on it.
When I then tried to delete that classification, it turned out I couldn't: because the tagged entity was only soft-deleted, Atlas thinks the classification still has references.
So I want to restore the soft-deleted entity. I found there is no API supporting restore in Atlas 0.7.0, so I tried to change the data in the related Solr index and HBase tables.
I changed the status in the Solr index "vertex_index" from "DELETED" to "ACTIVE", and the data in the HBase table "ATLAS_ENTITY_AUDIT_EVENTS" shows the entity status as "ACTIVE".
But when I search for the entity in the Atlas UI, it still shows as "DELETED".
So I'm wondering: did I miss something? Does anyone know where exactly Atlas stores its entity data? And if I can't restore that data, can I delete it in the database or somewhere else?

You can directly delete the classification associated with the deleted entity. Atlas provides a REST API for this; issue an HTTP DELETE against:
"/v2/entity/guid/{guid}/classification/{classificationName}"

Related

How to delete edges, and how to delete a document with relationships, with ArangoDB and Spring Data integration?

I need to remove a document from ArangoDB through the Spring Data integration.
This happens with relatedRepo.delete(document);.
It seems that if a document is deleted, the edges from/to it still remain.
Is that correct?
In that case, how can I remove the edges?
How can I find an edge?
Spring Data ArangoDB does not perform cascade deletes of related edges. To achieve such behavior you could use the driver directly: https://www.arangodb.com/docs/3.6/drivers/java-reference-graph-vertices.html#arangovertexcollectiondeletevertex
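A minimal sketch with the ArangoDB Java driver (the 6.x API documented at the link above); the connection settings, database, graph, collection, and key names are placeholders. Deleting a vertex through the graph module also removes its incident edges:

```java
import com.arangodb.ArangoDB;
import com.arangodb.ArangoDatabase;
import com.arangodb.ArangoGraph;

public class CascadeDeleteVertex {
    public static void main(String[] args) {
        // Placeholder connection settings.
        ArangoDB arango = new ArangoDB.Builder()
                .host("localhost", 8529)
                .user("root")
                .password("")
                .build();

        ArangoDatabase db = arango.db("myDatabase");
        ArangoGraph graph = db.graph("myGraph");

        // Unlike a plain collection-level delete, removing the vertex via the
        // graph module drops every edge pointing to or from it as well.
        graph.vertexCollection("documents").deleteVertex("documentKey");

        arango.shutdown();
    }
}
```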

How to physically delete temporal documents from a MarkLogic database?

I came across the temporal function temporal.documentDelete, which "logically deletes" temporal documents in a MarkLogic database, removing them from the latest collection. But the document is still not physically deleted from the database; you can still retrieve a deleted document by its URI.
Is there any way I can physically delete temporal documents ingested into my MarkLogic database?
You can use temporal.documentWipe, but bear in mind that it will wipe all versions of that document. You would basically be rewriting history, which is against the nature of temporal.
Also note that you can only wipe documents whose protection has expired. You protect temporal documents using temporal.documentProtect.
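As a sketch, the wipe can be run from the MarkLogic Java Client API by evaluating server-side JavaScript; the connection settings, temporal collection name, and URI below are placeholders, and this assumes an app server where your user has eval privileges and that any protection on the document has expired:

```java
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;

public class WipeTemporalDocument {
    public static void main(String[] args) {
        // Placeholder connection settings.
        DatabaseClient client = DatabaseClientFactory.newClient(
                "localhost", 8000,
                new DatabaseClientFactory.DigestAuthContext("admin", "admin"));

        // temporal.documentWipe physically removes every version of the
        // document from the named temporal collection.
        client.newServerEval()
              .javascript("declareUpdate(); "
                      + "temporal.documentWipe('myTemporalCollection', '/docs/order.json');")
              .evalAs(String.class);

        client.release();
    }
}
```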
More notes on deleting and wiping temporal documents can be found in the Temporal Guide:
http://docs.marklogic.com/guide/temporal/managing#id_10558
HTH!

Is it possible to populate without a schema?

I have an application that uses Mongo's oplog to follow insert, update, and delete operations. I would like to get the full document through Mongoose (populated) every time one of these operations occurs (findById with the id I get from the oplog), but I do not have the schemas, as they are defined in another application.
Do you think it is possible to get the full document without cloning the other application and registering each schema for each model?
Thanks in advance!

Apache OpenNLP Persist Model to DB

I am exploring the Apache OpenNLP product in my project, and one of the requirements is to persist the trained model in a DB - MongoDB / Couchbase in my case.
Right now I am primarily looking to store the document categorizer model output in the DB so that I do not have to retrain unless the model is modified.
I see that the library classes, e.g. DocumentCategorizerME, are not serializable, and I get a JSON deserialization exception if I try to retrieve the persisted records, so I want to know if someone is already doing this.
In general, what would be the approach to persisting models, even if I were using other open-source NLP products?
One approach is to use DoccatModel.serialize to serialize the model and store it in MongoDB GridFS.
Couchbase, by contrast, has a hard limit of 20 MB on stored binary data.
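A minimal sketch of that approach with the MongoDB Java driver's GridFS API; the bucket and file names are placeholders, and model is assumed to be an already-trained DoccatModel:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.bson.types.ObjectId;

import com.mongodb.client.MongoDatabase;
import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;

import opennlp.tools.doccat.DoccatModel;

public class DoccatModelStore {

    // Serialize the trained model and upload the bytes to GridFS;
    // the returned ObjectId identifies the stored file.
    static ObjectId save(DoccatModel model, MongoDatabase db) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        model.serialize(out);
        GridFSBucket bucket = GridFSBuckets.create(db, "nlpModels");
        return bucket.uploadFromStream("doccat-model",
                new ByteArrayInputStream(out.toByteArray()));
    }

    // Download the stored bytes and rebuild the model from the stream.
    static DoccatModel load(ObjectId id, MongoDatabase db) throws Exception {
        GridFSBucket bucket = GridFSBuckets.create(db, "nlpModels");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        bucket.downloadToStream(id, out);
        return new DoccatModel(new ByteArrayInputStream(out.toByteArray()));
    }
}
```

The DocumentCategorizerME itself never needs to be persisted: it can be reconstructed from the loaded model with new DocumentCategorizerME(model).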

Inserting data with relations into a StrongLoop application

I have a StrongLoop application using MongoDB. I have a model with several sub-models, using relations. When I try to insert an object into the database, only the top-level data is added; all the related data is ignored. How do I get an entire object with sub-objects into the database?
See the embedded models documentation: http://docs.strongloop.com/display/public/LB/Embedded+models+and+relations