Firestore created document Eventarc audit log methods are inconsistent - google-cloud-firestore

I am trying to call a service when a new Firestore document is created. With Cloud Functions v1 this was simple and worked great (https://firebase.google.com/docs/functions/firestore-events). With Cloud Functions v2/Cloud Run being triggered via Eventarc, I'm struggling to get similar or even consistent behavior and feel like I'm missing something.
My expectation was that, to get the same behavior as v1 functions, I would need an audit log recorded with a service name of firestore.googleapis.com and a method name of google.firestore.v1.Firestore.CreateDocument (per https://cloud.google.com/eventarc/docs/reference/supported-events#cloud-firestore). I am getting audit logs written for document creation, but the method names are not consistent.
If I create a new document in the GCP or Firebase Console, an audit log record with a service name of firestore.googleapis.com and method name of google.firestore.v1.Firestore.Write is generated.
If I create a new document using the Firestore Client SDK (tested with Android), an audit log record with a service name of firestore.googleapis.com and method name of google.firestore.v1.Firestore.Write is generated.
If I create a new document using the Firestore Admin SDK (tested with both @google-cloud/firestore and firebase-admin for Node, and cloud.google.com/go/firestore for Go, all with the same behavior), an audit log record with a service name of firestore.googleapis.com and method name of google.firestore.v1.Firestore.Commit is generated.
// JS implementation...similar implementation for Go
const {Firestore} = require('@google-cloud/firestore');
...
const db = new Firestore({...});
const collection = db.collection('users');
const res = await collection.add({}); // add() creates the document with an auto-generated ID
If I create a new document using the Firestore REST API, an audit log record with a service name of firestore.googleapis.com and method name of google.firestore.v1.Firestore.CreateDocument is generated.
curl --request POST \
  'https://firestore.googleapis.com/v1/projects/MY_PROJECT/databases/(default)/documents/users' \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer MY_TOKEN' \
  --data '{"fields":{}}'
My goal is to have a behavior similar to v1 functions where I can reliably respond to document creation.
None of the above is necessarily wrong, but the lack of consistency is not great. I don't feel like I'm doing something wrong, but I'm clearly not doing something right either. What am I missing, or should I correct my Eventarc mental model?

From https://cloud.google.com/functions/docs/calling/cloud-firestore (as recent as Sept 7, 2022):
Cloud Functions (2nd gen) does not currently support Cloud Firestore triggers.
From https://cloud.google.com/functions/docs/calling (as recent as Sept 7, 2022):
Note: Eventarc does not currently support direct events from Firestore, Google Analytics for Firebase, or Firebase Authentication. Use Cloud Functions (1st gen) to use these events.
If you want the v1 behavior with Eventarc, it's not currently supported. Continue to use v1 functions.
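For reference, this is the kind of 1st gen trigger the answer points back to; a minimal sketch in Python (the function name, runtime, project, and users collection are placeholder assumptions):

# main.py - Cloud Functions 1st gen Firestore trigger (Python runtime).
# Deploy with, e.g.:
#   gcloud functions deploy on_user_created \
#     --runtime=python310 \
#     --trigger-event=providers/cloud.firestore/eventTypes/document.create \
#     --trigger-resource="projects/MY_PROJECT/databases/(default)/documents/users/{userId}"
def on_user_created(data, context):
    """Fires once per created document, regardless of which API performed the write."""
    print(f"Created: {context.resource}")
    print(f"Fields: {data.get('value', {}).get('fields', {})}")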

Related

Google cloud platform data fusion instance triggering

I want to trigger a pipeline on my Google Data Fusion instance with the following command:
curl -X POST -H "Authorization: Bearer ${AUTH_TOKEN}" "${CDAP_ENDPOINT}/v3/namespaces/namespace-id/apps/pipeline-name/workflows/DataPipelineWorkflow/start"
but I can't figure out what the CDAP_ENDPOINT should be. Could you help me with where I can find the CDAP endpoint?
Thanks
This is explained in the GCP documentation: you can get the Data Fusion API endpoint with the following commands, run in Cloud Shell:
export INSTANCE_ID=your-data-fusion-instance-id
export CDAP_ENDPOINT=$(gcloud beta data-fusion instances describe \
  --location=us-central1 \
  --format="value(apiEndpoint)" \
  ${INSTANCE_ID})
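If it helps, here is a minimal Python sketch of the full flow under the same assumptions (namespace-id, pipeline-name, the region, and the instance id are placeholders taken from the question):

# Look up the CDAP endpoint, obtain a token with google-auth, and start the pipeline.
import subprocess

import google.auth
import google.auth.transport.requests
import requests

cdap_endpoint = subprocess.check_output(
    ["gcloud", "beta", "data-fusion", "instances", "describe",
     "--location=us-central1", "--format=value(apiEndpoint)",
     "your-data-fusion-instance-id"],
    text=True).strip()

credentials, _ = google.auth.default()
credentials.refresh(google.auth.transport.requests.Request())

url = (f"{cdap_endpoint}/v3/namespaces/namespace-id/apps/pipeline-name"
       "/workflows/DataPipelineWorkflow/start")
resp = requests.post(url, headers={"Authorization": f"Bearer {credentials.token}"})
resp.raise_for_status()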

"Access Denied. Provided scope(s) are not authorized" error when trying to make objects public using the REST API

I am attempting to set permissions on individual objects in a Google Cloud Storage bucket to make them publicly viewable, following the steps indicated in Google's documentation. When I try to make these requests using our application service account, it fails with HTTP status 403 and the following message:
Access denied. Provided scope(s) are not authorized.
Other requests work fine. When I try to do the same thing but by providing a token for my personal account, the PUT request to the object's ACL works... about 50% of the time (the rest of the time it is a 503 error, which may or may not be related).
Changing the IAM policy for the service account to match mine - it normally has Storage Admin and some other incidental roles - doesn't help, even if I give it the overall Owner IAM role, which is what I have.
It makes no difference whether I use the XML API or the JSON API. That the request sometimes works with my personal credentials indicates to me that the request is not malformed, but there must be something else I've overlooked so far. Any ideas?
Check the scopes of the service account, in case you are using the default Compute Engine service account. By default the scopes are restricted, and for GCS access is read-only. If needed, run rm -r ~/.gsutil to clear the gsutil cache.
When trying to access GCS from a GCE instance and getting this error message ...
the default scope is devstorage.read_only, which prevents all write operations.
It is not clear whether the scope https://www.googleapis.com/auth/cloud-platform is required when the scope https://www.googleapis.com/auth/devstorage.read_only is given by default (e.g. to read startup scripts). The scope should instead be https://www.googleapis.com/auth/devstorage.read_write.
And one can use gcloud beta compute instances set-scopes to edit the scopes of an instance:
gcloud beta compute instances set-scopes $INSTANCE_NAME \
--project=$PROJECT_ID \
--zone=$COMPUTE_ZONE \
--scopes=https://www.googleapis.com/auth/devstorage.read_write \
--service-account=$SERVICE_ACCOUNT
One can also pass any of the known scope aliases, e.g. --scopes=cloud-platform. The command must be run outside of the instance (because of permissions), and the instance must be shut down in order to change the service account.
Follow the documentation you provided, taking into account these points:
Access Control system for the bucket has to be Fine-grained (not uniform).
In order to make objects publicly available, make sure the bucket does not have the public access prevention enabled. Check this link for further information.
Grant the service account with the appropriate permissions in the bucket. The Storage Legacy Object Owner role (roles/storage.legacyObjectOwner) is needed to edit objects ACLs as indicated here. This role can be granted for individual buckets, not for projects.
Create the json file as indicated in the documentation.
Use gcloud auth application-default print-access-token to get an access token and use it in the API call. The API call should look like:
curl -X POST --data-binary @JSON_FILE_NAME.json \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  "https://storage.googleapis.com/storage/v1/b/BUCKET_NAME/o/OBJECT_NAME/acl"
You need to add OAuth scope: cloud-platform when you create the instance. Look: https://cloud.google.com/sdk/gcloud/reference/compute/instances/create#--scopes
Either select "Allow full access to all Cloud APIs" or select the fine-grained approach
So, years later, it turns out the problem is that "scope" is used by the Google Cloud API to refer to two subtly different things. One is the set of access scopes available to the service account, which is what I (and most of the other people who answered) kept focusing on, but the problem turned out to be something else. The Python class google.auth.credentials.Credentials, used by various Google Cloud client classes to authenticate, also has OAuth scopes of its own. You see where this is going: the client I was using was being created with a default OAuth scope of 'https://www.googleapis.com/auth/devstorage.read_write', but making something public requires the scope 'https://www.googleapis.com/auth/devstorage.full_control'. Adding this scope to the OAuth credential request means that setting public permissions on objects works.
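To illustrate, a minimal sketch of that fix in Python (the key file, project, bucket, and object names are placeholder assumptions): request credentials with the devstorage.full_control scope, then update the object's ACL.

# Credentials explicitly scoped for full control of storage objects;
# the IAM role alone was not the blocker here.
from google.cloud import storage
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "key.json",
    scopes=["https://www.googleapis.com/auth/devstorage.full_control"],
)
client = storage.Client(project="my-project", credentials=credentials)
blob = client.bucket("my-bucket").blob("path/to/object")
blob.make_public()  # adds an allUsers READER entry to the object's ACL
print(blob.public_url)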

How to find the user id and password for IBM API?

I am a beginner with the IBM API. I just launched an IBM Natural Language Understanding service. However, what I got is an API key instead of a user id and password, like this:
{
"apikey": "••••••••••••••••••••••••••••••••••••••••••••",
"iam_apikey_description": "Auto generated apikey during resource-key operation for Instance - crn:v1:bluemix:public:natural-language-understanding:us-east:a/6514bcdaafbc465498a244edb484cbe5:53e5f23b-f255-4d6c-b48d-cfce09c975b1::",
"iam_apikey_name": "auto-generated-apikey-51f2d016-d3ec-46bc-8be7-496ae621983d",
"iam_role_crn": "crn:v1:bluemix:public:iam::::serviceRole:Manager",
"iam_serviceid_crn": "crn:v1:bluemix:public:iam-identity::a/6514bcdaafbc465498a244edb484cbe5::serviceid:ServiceId-d83a34de-5860-443a-817a-b3cb3fb44e2a",
"url": "https://gateway-wdc.watsonplatform.net/natural-language-understanding/api"
}
The example below shows that I need a user id and a password. Where can I find them? Thanks!
import json
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 \
    import Features, EntitiesOptions, KeywordsOptions

natural_language_understanding = NaturalLanguageUnderstandingV1(
    username='username',
    password='password',
    version='2018-03-16')

response = natural_language_understanding.analyze(
    text='IBM is an American multinational technology company '
         'headquartered in Armonk, New York, United States, '
         'with operations in over 170 countries.',
    features=Features(
        entities=EntitiesOptions(
            emotion=True,
            sentiment=True,
            limit=2),
        keywords=KeywordsOptions(
            emotion=True,
            sentiment=True,
            limit=2)))

print(json.dumps(response, indent=2))
This is all explained in the getting started tutorial from the instance.
Click Show to view your credentials.
Copy the username, password, and url values.
Important: The tutorial uses service instance credentials to
authenticate to the Natural Language Understanding service. In some
regions, new service instances instead use IBM® Cloud Identity and
Access Management (IAM) tokens for authentication. Authenticate by
using the approach that is right for your region and service instance.
They mention differing authentication types by region - but they don't really specify which regions use which type.
It is pointed out in the release notes
29 May 2018
The service now supports a new API authentication process for service
instances created in Sydney (au-syd). IBM® Cloud is in the process of
migrating to token-based Identity and Access Management (IAM)
authentication. IAM uses access tokens rather than service credentials
for authentication with a service.
As of 29 May, only newly created instances in Sydney (au-syd) use the different authentication method. I'm not sure if there is a better way to find this information besides crawling through the release notes chronologically.
So if your instance was created in the Sydney (au-syd) region after 28 May 2018, or other regions have since been moved over to this system, you'll have to generate a token and pass it in instead.
Use basic auth to initially get the token:
curl -k -X POST \
--header "Authorization: Basic Yng6Yng=" \
--header "Content-Type: application/x-www-form-urlencoded" \
--header "Accept: application/json" \
--data-urlencode "grant_type=urn:ibm:params:oauth:grant-type:apikey" \
--data-urlencode "apikey={api_key}" \
"https://iam.bluemix.net/identity/token"
Then using the token from response for further API calls.
curl -X GET \
--header "Authorization: Bearer {token}" \
"https://gateway.watsonplatform.net/discovery/api/v1/environments?version=2017-11-07"
Just keep in mind that you will need to refresh the token periodically.
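The same exchange in Python, for reference (a direct translation of the curl commands above using the requests library; replace your-apikey with the apikey value from your credentials):

import requests

# Exchange the API key for a short-lived IAM access token.
token_resp = requests.post(
    "https://iam.bluemix.net/identity/token",
    headers={
        "Authorization": "Basic Yng6Yng=",
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    },
    data={
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": "your-apikey",
    },
)
access_token = token_resp.json()["access_token"]

# Use the token in subsequent API calls until it expires, then refresh it.
api_resp = requests.get(
    "https://gateway.watsonplatform.net/discovery/api/v1/environments",
    params={"version": "2017-11-07"},
    headers={"Authorization": f"Bearer {access_token}"},
)
print(api_resp.json())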
We added support for IAM in version 1.3.3. Always make sure you are using the latest version.
With IAM you will replace username and password with an iam_apikey parameter from the apikey credential field.
import json
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 \
    import Features, EntitiesOptions, KeywordsOptions

natural_language_understanding = NaturalLanguageUnderstandingV1(
    iam_apikey='the apikey value from your question',
    url='https://gateway.watsonplatform.net/natural-language-understanding/api',
    version='2018-03-16')

response = natural_language_understanding.analyze(
    text='IBM is an American multinational technology company '
         'headquartered in Armonk, New York, United States, '
         'with operations in over 170 countries.',
    features=Features(
        entities=EntitiesOptions(
            emotion=True,
            sentiment=True,
            limit=2),
        keywords=KeywordsOptions(
            emotion=True,
            sentiment=True,
            limit=2)))

print(json.dumps(response, indent=2))
Looks like your app needs to use the API key there to request a bearer token from the Identity and Access Manager, according to the instructions at https://github.com/watson-developer-cloud/node-sdk/blob/master/README.md#authentication and https://console.bluemix.net/docs/services/watson/getting-started-iam.html#iam .

How to manage Presto query session variables using REST API?

I am using the Presto REST API to query the database, but all my sessions appear to be ephemeral. For example, if I do something like
query_presto('set session columnar_processing = true')
query_presto('show session')
Despite setting the columnar_processing variable in the first query, the second query shows that columnar_processing still has its default value of false.
I read somewhere that
Presto maintains sessions for each authenticated user. This session
expires if no new requests are received within the timeout period
configured for Presto.
However, I can't find this configuration anywhere in the code nor documentation.
My question is how do I maintain a database session using the RESTful API?
After too much time digging around, I found that there is a header, X-Presto-Session, in which you can set comma-separated variables, i.e.
curl --data "show session" http://127.0.0.1:8889/v1/statement/ \
  --header "X-Presto-User: myuser" \
  --header "X-Presto-Schema: myschema" \
  --header "X-Presto-Time-Zone: UTC" \
  --header "X-Presto-Catalog: mycatalog" \
  --header "User-Agent: myagent" \
  --header "X-Presto-Session: session_variable_1=900,session_variable_2=true"
Despite what the docs say, I don't think there is a way for Presto to remember session variables set in previous executions. I have to cache them locally in my program and pass them all on every execution.
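For example, a minimal Python sketch of that approach (the host, user, catalog, schema, and session settings are placeholder assumptions): keep the session properties in a local dict and send them with every statement via the X-Presto-Session header.

import requests

PRESTO_URL = "http://127.0.0.1:8889/v1/statement/"
session_props = {"columnar_processing": "true"}  # cached by the client, not by Presto

def query_presto(sql):
    headers = {
        "X-Presto-User": "myuser",
        "X-Presto-Catalog": "mycatalog",
        "X-Presto-Schema": "myschema",
        "X-Presto-Time-Zone": "UTC",
        "X-Presto-Session": ",".join(f"{k}={v}" for k, v in session_props.items()),
    }
    resp = requests.post(PRESTO_URL, data=sql, headers=headers).json()
    rows = []
    # Follow nextUri until the query finishes, collecting any returned rows.
    while "nextUri" in resp:
        resp = requests.get(resp["nextUri"]).json()
        rows.extend(resp.get("data", []))
    return rows

print(query_presto("show session"))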

Instance environment variables

I have several Google Compute Engine instances and have set instance metadata on each, under the assumption that it would be available on the instance itself as environment variables, but nothing shows up. I then read here that I need to query the metadata server for this data, but that just returns a 403 Unauthorized when run from the instance itself. Is there a way to access metadata as environment variables?
It may be worth studying metadata querying a bit more, but my guess is that you are attempting to get custom metadata, which results in it not being found. Make sure you are using the attributes directory to access any custom metadata.
For example, this will get the built-in tags metadata:
curl "http://metadata.google.internal/computeMetadata/v1/instance/tags" \
-H "Metadata-Flavor: Google"
while this will get your custom metadata for attribute foo:
curl "http://metadata.google.internal/computeMetadata/v1/<instance|project>/attributes/foo" \
-H "Metadata-Flavor: Google"