Custom export/import for Alfresco - import

The obvious question is: is there any solution to export Alfresco content that matches a custom condition, for example files whose creation date falls within a given date range?
The goals of this solution are:
1- to keep the amount of data transferred in the export/import to a minimum
2- to avoid duplicate records on import during my weekly or monthly export/import to the backup Alfresco server
Thanks a lot for any kind of help.

One idea is to use a library like OpenCMIS (Java) or cmislib (Python), both available from the Apache Chemistry project. Then use a CMIS query to restrict the data you want to export to a certain date range. If you want examples of CMIS queries, including ones that use date ranges, take a look at this Java example.
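For instance, the date-range restriction is plain CMIS SQL. A rough sketch with cmislib (the AtomPub URL, credentials, and dates are placeholders to adapt; property access may differ slightly between cmislib versions):

```python
# Sketch: query an Alfresco repository over CMIS for documents created in a
# date range, using cmislib. URL, credentials and dates are placeholders.
from cmislib import CmisClient

client = CmisClient(
    'http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom',
    'admin', 'admin')
repo = client.defaultRepository

query = ("SELECT * FROM cmis:document "
         "WHERE cmis:creationDate >= TIMESTAMP '2023-01-01T00:00:00.000Z' "
         "AND cmis:creationDate <  TIMESTAMP '2023-02-01T00:00:00.000Z'")

for result in repo.query(query):
    props = result.properties
    print(props.get('cmis:name'), props.get('cmis:creationDate'))
```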
Another idea would be to use CMIS change tokens. Using this approach, you ask Alfresco what has changed since the last time your code ran. Alfresco responds back with a set of changes. You can then iterate over those changes and process them accordingly. The CMIS & Apache Chemistry in Action book has a change token example that uses Python to run a polling sync server between two CMIS repositories. The source code lives here.
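A very rough sketch of that polling loop with cmislib, assuming the repository has its change log enabled and that your cmislib version exposes getContentChanges (the binding URL, credentials, and token file are placeholders; the repository-info key for the latest token may vary):

```python
# Sketch: poll the CMIS change log and remember the last change token between
# runs. Assumes the change log is enabled on the repository; URL, credentials
# and the token file location are placeholders.
import os
from cmislib import CmisClient

TOKEN_FILE = 'last_change_token.txt'

client = CmisClient(
    'http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom',
    'admin', 'admin')
repo = client.defaultRepository

last_token = None
if os.path.exists(TOKEN_FILE):
    last_token = open(TOKEN_FILE).read().strip() or None

# Each entry describes one change: created, updated, deleted or security
for change in repo.getContentChanges(changeLogToken=last_token):
    print(change.changeType, change.objectId)

# Persist the latest token for the next run (key name may differ by version)
with open(TOKEN_FILE, 'w') as f:
    f.write(repo.getRepositoryInfo().get('latestChangeLogToken', ''))
```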
Both of these options use CMIS. If you would rather have a native Alfresco option, you could write a custom action that runs on a schedule to call the export. Or, you could use the File Transfer Service to write files to a file system on a schedule.
If what you are really trying to do is back up your repository, don't use any of these options. Instead, you should follow standard practice for backing up the repo, which is to dump the database and back up the content store.

Maybe you can use Alfresco Replication Jobs to export your content into a different repository.
In addition, you can export the content to a file system using the FSTR (File System Transfer Receiver) feature.
Replication jobs use the Alfresco Transfer Services, which can be customised to transfer only certain kinds of content.

Related

Tableau - Auto Archiving Historic Data

I have a live data source connected to Tableau Desktop. This source overwrites as it is updated. Is there a way to automatically save off monthly data from Tableau? Preferably, save the charts I have built.
There are a few options to consider if you are using Tableau Server:
REST API Export View: If you don't need interactivity, then you can automate exporting the image using the REST API. This is analogous to using the Worksheet -> Export -> Image feature in the front-end. There are other export options such as PDF and CSV. I would recommend using the Python client library over the raw REST API, but both are valid methods to export content; a rough sketch with the Python client appears at the end of this answer.
If you need interactivity, then you will need to use an extract instead of a live data source. With this, you can automate exporting the workbook using the REST API. If you want a live query for normal use, then you can duplicate the workbook, convert the data source to an extract, schedule the extract refresh, and download the workbook.
Scheduled Subscription: If you don't want to code, then you can schedule emailing an image of the view and manually save the images as needed. You could set up a dedicated shared resource mailbox and subscribe a user, typically a service account, to the schedule. This would allow you to consolidate all the subscriptions to a dedicated mailbox for future use.
TabCmd: If you are comfortable with tabcmd, you can automate exporting to CSV, Image, and PDF.
If you are only using Tableau Desktop, then the best option may be to convert the data source to an extract, automate refreshing the extract locally with tabcmd, and save a copy of the workbook to a folder while renaming the file to include a YYYY_MM_DD in the name. This will give you a fully functional copy of the workbook.
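As a rough sketch of the REST API image export from the first option, using the tableauserverclient Python package (server URL, credentials, site, and view name are placeholders for your environment):

```python
# Sketch: sign in to Tableau Server, export one view as a PNG, and stamp the
# file name with the date. Server URL, credentials, site and view name are
# placeholders.
from datetime import date
import tableauserverclient as TSC

tableau_auth = TSC.TableauAuth('my_user', 'my_password', site_id='my_site')
server = TSC.Server('https://tableau.example.com', use_server_version=True)

with server.auth.sign_in(tableau_auth):
    all_views, _ = server.views.get()
    view = next(v for v in all_views if v.name == 'My Dashboard')  # placeholder name
    server.views.populate_image(view)

    out_name = 'my_dashboard_{}.png'.format(date.today().strftime('%Y_%m_%d'))
    with open(out_name, 'wb') as f:
        f.write(view.image)
```

The same client can also download the workbook itself (server.workbooks.download) if you need the archived copy to stay interactive.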

Fetch all metadata of Salesforce

I've been trying to implement a way to download all the changes made by a particular user in Salesforce using a PowerShell script and create a package. The changes could be anything, added or modified: Apex classes, profiles, Accounts, etc., based on who modified them, the component ID, timestamp, and so on. Below is the URL that exposes the API, but it does not explain any way to do this from a script.
https://developer.salesforce.com/docs/atlas.en-us.api_meta.meta/api_meta/meta_listmetadata.htm
Does anyone know how I can implement this?
Regards,
Kramer
Salesforce orgs other than scratch orgs do not currently provide source tracking, the feature that makes it possible to pinpoint user changes in metadata and extract only those changes with an SFDX/Metadata API client, like Salesforce DX or CumulusCI (disclaimer: I'm on the CumulusCI team).
I would not try to implement a Metadata API client in PowerShell; instead, harness one of the existing tools to do so.
Since source tracking isn't available here, to identify user changes you can either
Attempt to extract all metadata and diff it against your version control, which is considerably harder than it sounds and is implemented by a variety of commercial DevOps tools for Salesforce (GearSet, Copado, etc).
Have the user manually add components to a Change Set or Unmanaged Package, and use a Metadata API client as above to retrieve the contents of that package. (Little-known fact: a Change Set can be retrieved as a package!)
To emphasize: DevOps on Salesforce does not work like other platforms. Working with the Metadata API requires a fair amount of time investment and specialization. Harness the existing work of the Salesforce community where you can, but be aware that the task you are laying out may be rather more involved than you think, and it's not necessarily something you can just throw together from off-the-shelf components.

Which kind of Google Cloud Platform mobile backend client is appropriate?

THE PROBLEM
I'm writing a mobile app which will allow a user to log in, save some preferences that must be stored in a database, and display congressional bills to the user.
I've only written simple RESTful services with PHP and MySQL in the past. I'd like to take advantage of newer technologies, and am a little lost on general direction.
The bill data (formatted as JSON) can be gathered by running the scrapers found here. Using docker, I managed to set a working directory and download the files on my local machine.
I've designed a MySQL database for holding the relevant bill and user data.
I started to mess around in Google Cloud Platform, and read the doc that describes the different models. I'm thinking of a few different ideas, but I'm not familiar with GCP or what I can actually accomplish.
QUESTIONS
1) What are App Engine, Compute Engine, and Container Engine each for? I get the gist that Container Engine holds different instances of stuff you load up with docker, and that Compute Engine sets up a VM, but I don't really understand the relationships. How should I think of them?
2) When I run those scrapers from the shell, where are the files being stored, and how can I check on them? On my computer, I set a working directory, but how do directories work in GCP? Is it just a directory in the currently selected VM, or is this what Buckets are for?
IDEAS
1) Since my bill data already comes as JSON, should I skip the entire process of building a database for the bills and insert them into Firebase somehow? Is this even possible? If so, am I stuck using Firebase's NoSQL, or can I still set up a relational database?
2) I could schedule the scrapers to run periodically, detect new files, and run a script to parse the JSON and insert new bill data into a database (PostgreSQL? MySQL?). Then I would write an API.
3) Download the JSON files to a bucket, and write an API that reads from them. Not sure how the performance would compare to using a DB.
I'm open to other suggestions as well.
For your use case (a stateless web application), App Engine is probably your best choice. The Google documentation has several comparisons of your computing options.
You can use App Engine with PHP and cloud-hosted MySQL if you want, which could be a good way to get your toes wet without going in over your head.
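For instance, on the first-generation App Engine standard environment, a PHP app is described by a small app.yaml; a rough sketch (the runtime name and handler form depend on which App Engine generation you target, so treat this as illustrative only):

```yaml
# Sketch of an app.yaml for a PHP app on first-generation App Engine standard.
# Newer runtimes use a different runtime name (e.g. php72 and later) together
# with "script: auto"; adjust to whatever generation you deploy to.
runtime: php55
api_version: 1

handlers:
- url: /.*
  script: index.php
```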

How to integrate Oracle APEX and Alfresco via CMIS

A question regarding the integration of the document management system Alfresco into Oracle Application Express (APEX) based on a CMIS repository:
The aim is to use APEX as the portal page, with Alfresco showing its results (document lists) based on search parameters coming from APEX.
A search result from a CMIS-query should be displayed in an APEX page-region.
Unfortunately I have no experience in this sector (REST, CMIS) - so any advice would be welcome!
A related question regarding user authentication and authorization via CMIS also arises.
Has anyone out there implemented something like this or used these components together, yet?
The first thing that pops into my mind is making the choice where you want your communication with the repository to take place: client side or server side?
Alfresco supports Web Scripts, so it would be possible to create a JavaScript-heavy thick client which connects to your repository, gets information about your files, and redirects to their download links.
The alternative would be to design some way to connect to the repository from the database server. Again there are many ways to do this. You can connect to the repository during your page load and use PL/SQL regions to fire scripts that connect to your repository, get the data you want, and render your region with that information.
Another way would be to periodically check the repository for changes, and maintain a 'shadow copy' of the repository within your Oracle database tables.
Of course all of these solutions have their own drawbacks.
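Whichever side you end up calling from, the underlying CMIS request is the same, so it can help to prototype the query outside APEX first. A rough sketch against Alfresco's CMIS 1.1 browser binding (JSON over HTTP) using Python's requests; the host, credentials, and binding path are placeholders and vary by Alfresco version:

```python
# Sketch: run a CMIS query against the CMIS 1.1 browser binding and print a
# couple of properties per hit. Host, credentials and the binding URL are
# placeholders; the exact query endpoint is advertised in the repository info
# returned by a plain GET on the binding URL.
import requests

BROWSER_BINDING = ('http://alfresco.example.com:8080/alfresco/api/'
                   '-default-/public/cmis/versions/1.1/browser')

params = {
    'cmisselector': 'query',
    'q': ("SELECT cmis:objectId, cmis:name FROM cmis:document "
          "WHERE cmis:name LIKE 'report%'"),
    'succinct': 'true',
}

resp = requests.get(BROWSER_BINDING, params=params,
                    auth=('admin', 'admin'), timeout=30)
resp.raise_for_status()

for hit in resp.json().get('results', []):
    props = hit.get('succinctProperties', {})
    print(props.get('cmis:objectId'), props.get('cmis:name'))
```

A PL/SQL region could issue the same HTTP call (for example via APEX_WEB_SERVICE) and render the returned JSON as a report.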

Automating downloads of Facebook Insights data

I'm looking for a tool or process for exporting Facebook Insights data for a Facebook page and a Facebook app. Currently I am just manually downloading CSV files from the Insights interface, but ideally I want to automate this process and load the data into Pentaho Kettle so I can perform some operations on the data.
Is there some way to automate the downloading and input of CSV files? Or will I have to use the Facebook Graph API Explorer? I am currently looking at a set-up where I use NetBeans and RestFB to pull the data I want, and then access that data using Pentaho Kettle. I am not sure if this will work, or if it is the best approach.
As Codek says, a Kettle plugin is a very good idea, and would be very helpful to the Kettle project. However, it's also a serious effort.
If you don't want to put in that kind of effort, you can certainly download files with a Kettle Job as long as the files are available through a standard transfer method (FTP, SFTP, SSH, etc). I've never used RestFB, so I don't know what's available. You might be able to get directly from a web service with the REST Client transform step.
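If you do end up pulling the data yourself rather than through a Kettle step, here is a rough sketch with Python against the Graph API Insights edge (the Page ID, access token, API version, and metric name are placeholders; check the Graph API docs for the metrics you actually need):

```python
# Sketch: pull one Page Insights metric from the Facebook Graph API and write
# it out as CSV for Kettle to load. Page ID, access token, API version and
# metric name are placeholders; the token needs the read_insights permission.
import csv
import requests

PAGE_ID = '1234567890'            # placeholder
ACCESS_TOKEN = 'YOUR_PAGE_TOKEN'  # placeholder
URL = 'https://graph.facebook.com/v2.8/{}/insights'.format(PAGE_ID)

resp = requests.get(URL, params={
    'metric': 'page_impressions',  # placeholder metric name
    'period': 'day',
    'access_token': ACCESS_TOKEN,
}, timeout=30)
resp.raise_for_status()

with open('page_impressions.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['end_time', 'value'])
    for series in resp.json().get('data', []):
        for point in series.get('values', []):
            writer.writerow([point.get('end_time'), point.get('value')])
```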
After downloading the files, you can send them to a transform to be loaded. You can do this either with the Execute for every input row? option on the Transformation job step, or by getting the filenames from the job's result set in the transform with Get files from result.
Then you can archive the files after loading with Copy or Move result filenames. In one job, I find only the files that are not yet in my archive using Get File Names and Merge Join steps, and then a Set files in result step in a transform, so that can be done if needed too.
To automate it, you can run your job from a scheduler using Kitchen.bat/Kitchen.sh. Since I use PostgreSQL a lot, I use PGAgent as my scheduler, but the Windows scheduler or cron work too.
Hope that helps.