How to get the complete metadata of a dataset in Palantir Foundry through an API call? - pyspark

I want to fetch the complete metadata of a given dataset through an API call. Can anyone suggest how to fetch this metadata?

You actually already manipulate and interact with various forms of metadata inside your Transforms Python builds today, but in a way that is structured to be safe for reading and writing.
Not every form of metadata is accessible today; this is generally to ensure product stability and sound version control of your builds.
That said, if there's a certain interaction with metadata you'd like to see in the product, I'd recommend reaching out to your support engineers with a feature request so they can understand your request more specifically and discuss it with our product teams.
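To make that concrete, here is a minimal sketch of reading schema metadata from inside a Transforms Python build. The dataset paths are placeholders; the decorator and context usage follow the standard transforms.api entry points.

```python
from transforms.api import transform, Input, Output

@transform(
    source=Input("/examples/source_dataset"),   # placeholder dataset path
    out=Output("/examples/metadata_report"),    # placeholder dataset path
)
def compute(source, out, ctx):
    df = source.dataframe()

    # Schema metadata is exposed through the underlying PySpark DataFrame.
    rows = [
        (field.name, field.dataType.simpleString(), field.nullable)
        for field in df.schema.fields
    ]
    report = ctx.spark_session.createDataFrame(
        rows, ["column_name", "data_type", "nullable"]
    )
    out.write_dataframe(report)
```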

Related

Fetch all metadata of Salesforce

I've been trying to implement a way to download all the changes made by a particular user in Salesforce using a PowerShell script and create a package. The changes could be anything, added or modified: Apex classes, profiles, Accounts, etc., filtered by last-modified-by user, component ID, timestamp, and so on. Below is the URL that exposes the API; it does not explain any way to do this with a script.
https://developer.salesforce.com/docs/atlas.en-us.api_meta.meta/api_meta/meta_listmetadata.htm
Does anyone know how I can implement this?
Regards,
Kramer
Salesforce orgs other than scratch orgs do not currently provide source tracking, the feature that makes it possible to pinpoint user changes in metadata and extract only those changes. Source tracking is consumed by an SFDX/Metadata API client, like Salesforce DX or CumulusCI (disclaimer: I'm on the CumulusCI team).
I would not try to implement a Metadata API client in PowerShell; instead, harness one of the existing tools to do so.
Salesforce orgs other than scratch orgs don't provide source tracking at present. To identify user changes, you can either:
Attempt to extract all metadata and diff it against your version control. This is considerably harder than it sounds and is implemented by a variety of commercial DevOps tools for Salesforce (Gearset, Copado, etc.).
Have the user manually add components to a Change Set or Unmanaged Package, and use a Metadata API client as above to retrieve the contents of that package. (Little-known fact: a Change Set can be retrieved as a package!)
To emphasize: DevOps on Salesforce does not work like other platforms. Working with the Metadata API requires a fair amount of time investment and specialization. Harness the existing work of the Salesforce community where you can, but be aware that the task you are laying out may be rather more involved than you think; it's not necessarily something you can throw together from off-the-shelf components.
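As a rough illustration of the second option, here is a sketch that drives Salesforce DX from a script instead of reimplementing the Metadata API. The package name, org alias, and paths are placeholders, and it assumes the classic sfdx force:mdapi:retrieve command is installed and on PATH:

```python
import subprocess

def retrieve_change_set(package_name: str, target_dir: str, org_alias: str) -> None:
    # A Change Set can be retrieved by name as if it were an unmanaged package.
    subprocess.run(
        [
            "sfdx", "force:mdapi:retrieve",
            "--packagenames", package_name,       # the Change Set's name
            "--retrievetargetdir", target_dir,    # where the retrieved zip is written
            "--targetusername", org_alias,        # an already-authenticated org alias
        ],
        check=True,  # raise if the CLI exits non-zero
    )

retrieve_change_set("My Change Set", "./retrieved", "my-org")
```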

Does the Recommendation service allow enriching an existing model with new data?

We are able to provide an initial training model and ask for recommendations. When asking for recommendations we can provide new usage events. Are these persisted at all into the model? Do they manipulate the model at all?
Is there another way the data is supposed to be updated or do we need to retrain a new model every time we want to enrich the model?
https://azure.microsoft.com/en-us/services/cognitive-services/recommendations/
EDIT:
We are trying to use the "Recommendations Solution Template" which deploys a solution to Azure and provides a swagger endpoint for working with the model (https://gallery.cortanaintelligence.com/Tutorial/Recommendations-Solution)
It appears the Cognitive Services API is much richer than this. Can the swagger version's models be updated?
After more experience with this I discovered a few things as of August 21st, 2017:
While not intuitive for the uninitiated, new data is only persisted into the model when you train a new model; passing it at request time does not update the model.
This allows a form of model versioning: when you make new builds, you can switch recommendations back to an earlier build if the new one doesn't work as well.
The recommended method appears to be to batch usage data and create new builds of the model on an interval.
The APIs do allow passing in recent usage data so that it can be accounted for at scoring time; it's just not persisted.
The "upload usage events" call in the Cognitive Services API does not seem to work. Uploading the new usage data via a file does appear to work.
The Recommendations Solution Template vs. the Cognitive Services API
It appears the Recommendations Solution Template is a packaged version of the SAR (Smart Adaptive Recommendations) model inside the Cognitive Services API, optimized for ease of use.
I'm presuming that for other popular recommendation models like FBT, the Cognitive Services API should be used, as the deployable template only allows one model type.
Additional note on the Preview status of the API
It seems Microsoft is deprecating the datamart as of February and directing people to this preview API instead. It therefore seems reasonable to presume this preview is likely to graduate past preview rather than be killed.

Exporting data from Bluemix Presence Insights

I'm trying to export data from Presence Insights on Bluemix. I followed this documentation:
https://presenceinsights.ng.bluemix.net/pidocs/analytics/export/
However, I can't seem to find the export button mentioned in the document.
Data can be exported from the IBM Presence Insights Dashboard if you have data available. There are also REST APIs for exporting data. They are documented in the Floors, Sites, and Zones sections of the API Reference.
There were REST APIs in the product some time ago, but they were found to have limitations that made them less useful in production. In particular, the amount of data that built up pushed the API's response time beyond what the Bluemix infrastructure allowed, and requests would time out. To that end, the APIs were backed out, but it appears the documentation was left behind; it will be removed shortly.
Presence Insights still understands the value of exporting the data, so a new scheme is under investigation. For example, it would be ideal if the data could be exported under the covers to a production data storage facility on a regular schedule.
In the interim, an alternative solution would be to use a Subscription to gather the backend enter/exit/dwell/timeout events directly and roll your own solution, storing only what you need in whatever facility works for your application.
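As a rough sketch of that interim approach, assuming the Subscription delivers JSON events over MQTT (the broker host, topic, credentials, and event field names below are all placeholders):

```python
import json
import sqlite3

import paho.mqtt.client as mqtt

db = sqlite3.connect("pi_events.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS events (type TEXT, device TEXT, ts TEXT, raw TEXT)"
)

def on_message(client, userdata, msg):
    event = json.loads(msg.payload)
    # Store only the fields the downstream reporting system needs,
    # plus the raw payload for safety.
    db.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        (
            event.get("type"),
            event.get("deviceId"),
            event.get("timestamp"),
            msg.payload.decode(),
        ),
    )
    db.commit()

client = mqtt.Client()
client.username_pw_set("<subscription-user>", "<subscription-password>")
client.on_message = on_message
client.connect("<broker-host>", 1883)
client.subscribe("<subscription-topic>")
client.loop_forever()
```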

Document versioning with MarkLogic REST API

We're currently using MarkLogic's dls functions to handle document versioning, and are trying to switch over to use the REST API. The document endpoint doesn't use versioning by default, and I can't figure out a way to get it to. I'm referring to the dls functions for keeping multiple document versions, btw, not the new "content versioning" the REST API documentation mentions. In fact, the only reference to document versions in the REST API docs seems to be a line saying that content versioning isn't the same thing.
The only solution we've been able to come up with is to write a custom endpoint that duplicates everything the existing document endpoint's PUT does, plus document management. I'd rather avoid that if possible, especially when looking at MarkLogic 7's partial document updates. We're using MarkLogic 6 now, if it matters, but it doesn't look like 7 has any new features related to this.
Is there a way to do this using MarkLogic's existing endpoints?
You can write a REST API extension that automates the DLS operations. See http://docs.marklogic.com/guide/rest-dev/extensions. You will largely end up duplicating a lot of the same things, but this will plug into the existing endpoints.
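For example, once such an extension is installed, calling it from a client is a single request against /v1/resources/{name}. The extension name (dls-insert) and its parameter are hypothetical; only the /v1/resources route, the rs: parameter prefix, and the default digest auth come from the MarkLogic docs:

```python
import requests
from requests.auth import HTTPDigestAuth

with open("example.xml", "rb") as f:
    resp = requests.put(
        "http://localhost:8000/v1/resources/dls-insert",  # hypothetical extension
        params={"rs:uri": "/docs/example.xml"},  # extension params take the rs: prefix
        data=f,
        headers={"Content-Type": "application/xml"},
        auth=HTTPDigestAuth("rest-writer", "password"),
    )
resp.raise_for_status()
```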
Yes, MarkLogic 7 added content versioning to make refreshing of caches easier. Unfortunately, the DLS library hasn't been integrated into the REST API so far. You can file a feature request with support if you like.
In the meantime, the best suggestion I can give is to use a separate route for document updates using DLS (your current route, or a limited custom endpoint that only supports the DLS functions you need for doc updates), and do everything else (as far as possible) using the existing REST API. You can look at this other Stack Overflow question to see how to limit searches to the latest doc versions:
Marklogic REST API search for latest document version
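For reference, a minimal sketch of that approach: DLS keeps the current version of each managed document in the "latest" collection, so a plain collection filter on /v1/search usually suffices (host and credentials are placeholders):

```python
import requests
from requests.auth import HTTPDigestAuth

resp = requests.get(
    "http://localhost:8000/v1/search",
    params={"q": "invoice", "collection": "latest", "format": "json"},
    auth=HTTPDigestAuth("rest-reader", "password"),
)
resp.raise_for_status()
print(resp.json()["total"])  # hit count across latest versions only
```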
HTH!
A member of the MarkLogic team has put together a REST extension that provides better DLS support in the REST API. Hopefully that makes working with DLS over the MarkLogic REST API a lot easier:
https://github.com/sanjuthomas/marklogic-dls-rest-extension
HTH!

Programmatic export/dump/mass data retrieval (BaaS)

Does anyone have experiences with programmatic exports of data in conjunction with BaaS providers like e.g. parse.com or StackMob?
I am aware that both providers (as far as I can tell from the marketing talk) offer a REST API which will allow for queries against the database, not only to be used by mobile clients but also by e.g. custom web apps.
I am also aware that both providers offer a manual export of data (parse.com via their web interface, StackMob via support).
But let's say I would like to dump all data nightly so that I can import it into a reporting system, or maybe simply to have an up-to-date backup.
In this case, I would need a programmatic way to export/replicate the data stored in the backend. Manual exports are not an option for obvious reasons.
The REST APIs offered, however, seem to be designed for specific queries, not for mass reads (performance?). Let alone the pricing - I assume none of the providers would be happy about a nightly X Gigabyte data export via their REST API, so there will probably be a price tag.
I just couldn't find any specific information on this topic so far, so I was wondering if anyone else has already gone through this. Also, any suggestions on StackMob/parse alternatives are welcome, especially if related to the data export topic.
Cheers, Alex
Did you see the section of the Parse REST API on Batch operations? Batch operations reduce the number of API calls needed to grab data so that you are not using a call for every row you retrieve. Keep in mind that there is still a limit (the default is 100, but you can set it to a maximum of 1000). That means you are still limited to pulling down 1000 rows per API call.
I can't comment on StackMob because I haven't used it. At my present job, we are using Parse and we wrote a C# app which compares the data in a Parse class with a SQL table and pulls down any changes.
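For what it's worth, here is a rough Python sketch of that kind of paging loop against the Parse REST API, pulling the documented maximum of 1000 rows per call. The class name and keys are placeholders; api.parse.com is the hosted endpoint as of this writing:

```python
import requests

HEADERS = {
    "X-Parse-Application-Id": "<app-id>",
    "X-Parse-REST-API-Key": "<rest-api-key>",
}

def dump_class(class_name: str) -> list:
    rows, skip = [], 0
    while True:
        resp = requests.get(
            f"https://api.parse.com/1/classes/{class_name}",
            headers=HEADERS,
            params={"limit": 1000, "skip": skip},  # 1000 is the documented max
        )
        resp.raise_for_status()
        batch = resp.json()["results"]
        if not batch:
            break
        rows.extend(batch)
        skip += len(batch)
    return rows

all_rows = dump_class("GameScore")  # hypothetical class name
```

For incremental syncs like the comparison app described above, a where clause on updatedAt is likely safer than dumping everything, since very large classes may hit limits on the skip parameter.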