Google Cloud Storage object finalize event triggered multiple times

Scenario
I have a couple of Google Cloud Functions, each triggered by a Google Cloud Storage object.finalize event. For each one I use two buckets and a transfer job with "Synchronization options: Overwrite objects at destination", which copies a single file from the source bucket to the destination bucket every day. Both functions share the same source bucket, but their destination buckets are different.
Problem
Most of the time it works as expected, but sometimes I see multiple events at almost the same time. Usually there are 2 duplicates, but once there were 3. I log the event payload, and it is always the same.
More details
Here is an example of multiple log entries
Question
Could this be a known issue with Google Cloud Storage?
If not, then most probably something is wrong in my code.
I'm using the following project structure:
/functions
  |--/foo-code
      |--executor.js
      |--foo.sql
  |--/bar-code
      |--executor.js
      |--bar.sql
  |--/shared-code
      |--utils.js
  |--index.js
  |--package.json
index.js
let foo;
let bar;
exports.foo = (event, callback) => {
  console.log(`event ${JSON.stringify(event)}`);
  foo = foo || require(`./foo-code/executor`);
  foo.execute(event, callback);
};
exports.bar = (event, callback) => {
  console.log(`event ${JSON.stringify(event)}`);
  bar = bar || require(`./bar-code/executor`);
  bar.execute(event, callback);
};
./foo-code/executor.js
const utils = require('../shared-code/utils.js');
exports.execute = (event, callback) => {
  // run Big Query foo.sql statement
};
./bar-code/executor.js
const utils = require('../shared-code/utils.js');
exports.execute = (event, callback) => {
  // run Big Query bar.sql statement
};
And finally deployment:
foo background function with specific bucket trigger:
gcloud beta functions deploy foo \
--source=https://<path_to_repo>/functions \
--trigger-bucket=foo-destination-bucket \
--timeout=540 \
--memory=128MB
bar background function with specific bucket trigger:
gcloud beta functions deploy bar \
--source=https://<path_to_repo>/functions \
--trigger-bucket=bar-destination-bucket \
--timeout=540 \
--memory=128MB
To me, the most likely cause is the multiple deployments of the same source (only the trigger-bucket flag differs). But the weird thing is that the above setup works most of the time.

This is normal behavior for Cloud Functions: events are delivered, and background functions invoked, at least once, which means that, rarely, spurious duplicates may occur.
To make sure your function behaves correctly on repeated executions, make it idempotent: implement it so that an event produces the desired results (and side effects) even if it is delivered multiple times.
Check the documentation for some guidelines for making a background function idempotent.
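As a rough illustration of the idea (not the approach from the documentation), here is a minimal sketch that skips an event whose ID has already been seen. The helper name makeIdempotent, the eventId field, and the in-memory Set are all assumptions made for the example; in a real deployment the set of seen IDs would have to live in a durable, shared store, because function instances do not share memory.

```javascript
// Sketch only: remembers which event IDs were already handled.
// Assumption: a Set in memory is enough for illustration; real functions run
// on many instances, so a durable shared store would be needed instead.
function makeIdempotent(handler, seen = new Set()) {
  return function (event) {
    // Assumption: the event exposes a unique, stable ID as `eventId`.
    const id = event.eventId;
    if (seen.has(id)) {
      return { skipped: true };
    }
    seen.add(id);
    return { skipped: false, result: handler(event) };
  };
}

// Usage: the second delivery of the same event is ignored.
const runQuery = makeIdempotent((event) => `ran query for ${event.data.name}`);
const first = runQuery({ eventId: 'abc', data: { name: 'file.csv' } });
const second = runQuery({ eventId: 'abc', data: { name: 'file.csv' } });
console.log(first.skipped, second.skipped); // false true
```

Note that this sketch marks the ID as seen before the handler finishes, so a handler that fails would not be retried; whether that is acceptable depends on your use case.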


Stop huge error output from testing-library

I love testing-library, have used it a lot in a React project, and I'm trying to use it in an Angular project now - but I've always struggled with the enormous error output, including the HTML text of the render. Not only is this not usually helpful (I couldn't find an element, here's the HTML where it isn't); but it gets truncated, often before the interesting line if you're running in debug mode.
I simply added it as a library alongside the standard Angular Karma+Jasmine setup.
I'm sure you could say the components I'm testing are too large if the HTML output causes my console window to spool for ages, but I have a lot of integration tests in Protractor, and they are SO SLOW :(.
I would say the best solution would be to use the configure method and pass a custom function for getElementError which does what you want.
You can read about configuration here: https://testing-library.com/docs/dom-testing-library/api-configuration
An example of this might look like:
// configure is exported by DOM Testing Library
import { configure } from '@testing-library/dom';

configure({
  getElementError: (message: string, container) => {
    const error = new Error(message);
    error.name = 'TestingLibraryElementError';
    error.stack = null;
    return error;
  },
});
You can then put this in any single test file or use Jest's setupFiles or setupFilesAfterEnv config options to have it run globally.
I am assuming you are running Jest with RTL in your project.
I personally wouldn't turn it off, as it's there to help us, but everyone has their own way, so if you have your reasons, fair enough.
1. If you want to disable errors for a specific test, you can mock the console.error.
it('disable error example', () => {
  const errorObject = console.error; // store the original function
  console.error = jest.fn(); // mock it
  // code
  // assertion (expect)
  console.error = errorObject; // assign it back so you can use it in the next test
});
2. If you want to silence it for all tests, you could use the jest --silent CLI option. Check the docs.
The above might even disable the DOM printing that is done by rtl, I am not sure as I haven't tried this, but if you look at the docs I linked, it says
"Prevent tests from printing messages through the console."
If the above doesn't work, you now almost certainly have everything disabled except the DOM output. In that case, you might look into react-testing-library's source code and find out what is used for those print statements. Is it a console.log? Is it a console.warn? Once you know, just mock it out as in option 1 above.
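One caveat with option 1: if an assertion throws, the line that restores console.error never runs. A minimal sketch of a safer variant, using a hypothetical helper named withSilencedConsoleError (my own name, not part of Jest or testing-library), could look like this:

```javascript
// Runs `fn` with console.error replaced by a no-op and always restores it.
function withSilencedConsoleError(fn) {
  const original = console.error; // keep the real implementation
  console.error = () => {};       // swallow anything logged as an error
  try {
    return fn();
  } finally {
    console.error = original;     // restored even if `fn` throws
  }
}
```

This only covers synchronous code; for an async test body the restore would have to wait for the returned promise. In Jest, jest.spyOn(console, 'error') together with mockRestore() should give a similar effect.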
UPDATE
After some digging, I found out that all testing-library DOM printing is built on prettyDOM();
While prettyDOM() can't be disabled you can limit the number of lines to 0, and that would just give you the error message and three dots ... below the message.
Here is an example printout I got while experimenting:
TestingLibraryElementError: Unable to find an element with the text: Hello ther. This could be because the text is broken up by multiple elements. In this case, you can provide a function for your text matcher to make your matcher more flexible.
...
All you need to do is to pass in an environment variable before executing your test suite, so for example with an npm script it would look like:
DEBUG_PRINT_LIMIT=0 npm run test
Here is the doc
UPDATE 2:
As per the OP's feature request on GitHub, this can also be achieved without injecting a global variable to limit the prettyDOM line output (in case it's used elsewhere). The getElementError config option needs to be changed:
dom-testing-library/src/config.js
// called when getBy* queries fail. (message, container) => Error
getElementError(message, container) {
  const error = new Error(
    [message, prettyDOM(container)].filter(Boolean).join('\n\n'),
  )
  error.name = 'TestingLibraryElementError'
  return error
},
The callstack can also be removed
You can change how the message is built by setting the DOM testing library message building function with config. In my Angular project I added this to test.js:
configure({
  getElementError: (message: string, container) => {
    const error = new Error(message);
    error.name = 'TestingLibraryElementError';
    error.stack = null;
    return error;
  },
});
This was answered here: https://github.com/testing-library/dom-testing-library/issues/773 by https://github.com/wyze.

Can I define a simple trigger from a standalone script?

Specifically, I want to use the onSelectionChange(e) event to show a sidebar depending on what's in the selected cell. The problem is: the project I am working on is a standalone script. So I want to know if there is a way to use the onOpen event (for example) and check if the script is being run from a spreadsheet and somehow 'inject' the trigger.
I was trying something pretty much like in the documentation, but I never got it to fire, I guess because it was a standalone script.
const onSelectionChange = (e) => {
  Logger.log(`onSelectionChange triggered: ${e.toString()}`);
  const { range } = e;
  if (range.getNumRows() === 1 && range.getNumColumns() === 1) {
    range.setBackground('green');
  }
};
So the actual solution for me was to create a new document with clasp. Using the command:
npx clasp create --type sheets --title "foo" --rootDir ./dist
and then uploading the script to this new project.

How to manually trigger a cloudwatch rule with ScheduleExpression(10 days)

I have to set up an "AWS::Events::Rule" in CloudWatch with a ScheduleExpression of 10 days and write some code to test it, but I cannot change the "10 days" to 1 minute or call the Lambda function directly. I know that we can call put-events to trigger a rule that has an EventPattern,
but I don't know how to do that for a ScheduleExpression.
Any comment is welcome. Thanks.
To my knowledge, there is no way to manually trigger the rule and make it execute the Lambda function. What you can do is change the frequency from 10 days to 1 minute, let it execute, and then switch it back to 10 days.
I also ran into this problem. I checked the AWS documentation, and it says that a rule can only contain either an EventPattern or a ScheduleExpression. But in order to call aws events put-events, we must provide a Source for the EventPattern to match. So I think we cannot manually trigger a scheduled event.
Not sure what your use case is, but I decided to switch to the Invoke API of the AWS Lambda client.
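To illustrate the Invoke approach, here is a minimal sketch that calls the function with a payload shaped like a scheduled event. The helper names buildScheduledEvent and invokeAsScheduled are hypothetical, the account, region, and ID values are placeholders, and lambdaClient is assumed to be an AWS SDK v2 AWS.Lambda instance (or anything with a compatible invoke(params).promise() method).

```javascript
// Builds a payload shaped like the event a scheduled rule delivers.
// All values here are placeholders, not taken from a real account.
function buildScheduledEvent(ruleArn, now = new Date()) {
  return {
    version: '0',
    id: 'manual-test-invocation',
    'detail-type': 'Scheduled Event',
    source: 'aws.events',
    account: '123456789012',
    time: now.toISOString(),
    region: 'us-east-1',
    resources: [ruleArn],
    detail: {},
  };
}

// `lambdaClient` is assumed to be an AWS SDK v2 `AWS.Lambda` instance
// (or any object with a compatible `invoke(params).promise()` method).
function invokeAsScheduled(lambdaClient, functionName, ruleArn) {
  return lambdaClient
    .invoke({
      FunctionName: functionName,
      InvocationType: 'RequestResponse',
      Payload: JSON.stringify(buildScheduledEvent(ruleArn)),
    })
    .promise();
}
```

Keep in mind that this exercises the function only; it does not go through the rule, so it will not catch a misconfigured target or missing permission on the rule itself.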
SDK Approach:
Yes, you can use the putRule SDK function to update the ScheduleExpression of the CloudWatch rule, as shown in the snippet below:
// CloudWatchEvents is assumed to be an AWS.CloudWatchEvents client created elsewhere;
// timezoneCronName and cronExpression are assumed to be defined by the caller.
const cloudWatchEventsParams = {
  Name: timezoneCronName, /* required */
  ScheduleExpression: cronExpression
};
return CloudWatchEvents.putRule(cloudWatchEventsParams).promise().then((response) => {
  console.debug(`CloudWatch Events response`, response);
  return response;
}).catch((error) => {
  console.error(`Error occurred while updating CloudWatch Event: ${error.message}`);
  throw error;
});
See this Official AWS SDK DOC.
CLI Approach:
Run the following command through the CLI:
aws events put-rule --name "Your rule name (not full ARN)" --schedule-expression "cron(0/1 * * * ? *)"

Error using CLI for cloud functions with IAM namespaces

I'm trying to create an IBM Cloud Function web action from some python code. This code has a dependency which isn't in the runtime, so I've followed the steps here to package the dependency with my code. I now need to create the action on the cloud for this package, using the steps described here. I've got several issues.
The first is that I want to check that this will go into the right namespace. However, although I have several namespaces, none show up when I run ibmcloud fn namespace list; I just get an empty table with headers. I checked that I was targeting the right region using ibmcloud target -r eu-gb.
The second is that when I try to bypass the problem above by creating a namespace from the command line using ibmcloud fn namespace create nyNamespaceName, it works, but I then check on the web UI, and this new namespace has been created in the Dallas region instead of the London one… I can’t seem to get it to create a namespace in the region that I am currently targeting for some reason, it’s always Dallas.
The third problem is that when I try to follow the steps 2 and 3 from here regardless, accepting that it will end up in the unwanted Dallas namespace, by running the equivalent of ibmcloud fn action create demo/hello <filepath>/hello.js --web true, it keeps telling me I need to target an org and a space. But my namespace is an IAM namespace, it doesn’t have an org and a space, so there are none to give?
Please let me know if I’m missing something obvious or have misunderstood something, because to me it feels like the CLI is not respecting the targeting of a region and not handling IAM stuff correctly.
Edit: adding code as suggested, but this code runs fine locally, it's the CLI part that I'm struggling with?
import sys
import requests
import pandas as pd
import json
from ibm_ai_openscale import APIClient


def main(dict):
    # Get AI Openscale GUID
    AIOS_GUID = None
    token_data = {
        'grant_type': 'urn:ibm:params:oauth:grant-type:apikey',
        'response_type': 'cloud_iam',
        'apikey': 'SOMEAPIKEYHERE'
    }
    response = requests.post('https://iam.bluemix.net/identity/token', data=token_data)
    iam_token = response.json()['access_token']
    iam_headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer %s' % iam_token
    }
    resources = json.loads(requests.get('https://resource-controller.cloud.ibm.com/v2/resource_instances', headers=iam_headers).text)['resources']
    for resource in resources:
        if "aiopenscale" in resource['id'].lower():
            AIOS_GUID = resource['guid']
    AIOS_CREDENTIALS = {
        "instance_guid": AIOS_GUID,
        "apikey": 'SOMEAPIKEYHERE',
        "url": "https://api.aiopenscale.cloud.ibm.com"
    }
    if AIOS_GUID is None:
        print('AI OpenScale GUID NOT FOUND')
    else:
        print('AI OpenScale FOUND')
    # Get OpenScale subscription
    ai_client = APIClient(aios_credentials=AIOS_CREDENTIALS)
    subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
    for sub in subscriptions_uids:
        if ai_client.data_mart.subscriptions.get_details(sub)['entity']['asset']['name'] == "MYMODELNAME":
            subscription = ai_client.data_mart.subscriptions.get(sub)
    # Explainability test
    sample_transaction_id = "SAMPLEID"
    run_details = subscription.explainability.run(transaction_id=sample_transaction_id, cem=False)
    # Formatting results
    run_details_json = json.dumps(run_details)
    return run_details_json
I know the OP said they were 'targeting the right region'. But I want to make it clear that the 'right region' is the exact region in which the namespaces you want to list or target are located.
Unless you target this region, you won't be able to list or target any of those namespaces.
This is counterintuitive because:
You are able to list Service IDs of namespaces in regions other than the one you are targeting.
The web portal allows you to see namespaces in all regions, so why shouldn't the CLI?
I was having an issue very similar to the OP's first problem, but once I targeted the correct region it worked fine.

Add single record to mongo collection with meteor

I am a new user to JavaScript and the meteor framework trying to understand the basic concepts. First of all I want to add a single document to a collection without duplicate entries.
this.addRole = function(roleName){
  console.log(MongoRoles.find({name: roleName}).count());
  if(!MongoRoles.find({name: roleName}).count())
    MongoRoles.insert({name: roleName});
}
This code is called on the server as well as on the client. The log message on the client tells me there are no entries in the collection, even if I refresh the page several times.
On the server duplicate entries get entered into the collection. I don't know why. Probably I did not understand the key concept. Could someone point it out to me, please?
Edit-1:
No, autopublish and insecure are not installed anymore. But I already published the MongoRoles collection (server side) and subscribed to it (client side). Furthermore, I created an allow rule for inserts (client side).
Edit-2:
Thanks a lot for showing me the meteor method way but I want to get the point doing it without server side only methods involved. Let us say for academic purposes. ;-)
Just wrote a small example:
Client:
Posts = new Mongo.Collection("posts");
Posts.insert({title: "title-1"});
console.log(Posts.find().count());
Server:
Posts = new Mongo.Collection("posts");
Meteor.publish(null, function () {
return Posts.find()
})
Posts.allow({
insert: function(){return true}
})
If I check the server database via 'meteor mongo' it tells me every insert of my client code is saved there.
The log on the client tells me '1 count' every time I refresh the page. But I expected both the same. What am I doing wrong?
Edit-3:
I am back on my original role example (sorry for that). I thought I had got the point, but I am still clueless. If I check the variable 'roleCount', 0 is returned all the time. How can I load the correct value into my variable? What is the best way to check whether a document exists before inserting it into a collection? I guess .find() is asynchronous as well? If so, how can I do it synchronously? If I understand correctly, I have to wait for the value (synchronously) because I really rely on it.
Shared environment (client and server):
Roles = new Mongo.Collection("jaqua_roles");
Roles.allow({
  insert: function(){return true}
})
var Role = function(){
  this.addRole = function(roleName){
    var roleCount = Roles.find({name: roleName}).count();
    console.log(roleCount);
    if(roleCount === 0){
      Roles.insert({name: roleName}, function(error, result){
        try{
          console.log("Success: " + result);
          var roleCount = Roles.find({name: roleName}).count();
          console.log(roleCount);
        } catch(error){
        }
      });
    }
  };
  this.deleteRole = function(){
  };
}
role = new Role();
role.addRole('test-role');
Server only:
Meteor.publish(null, function () {
return Roles.find()
})
Meteor's insert/update/remove methods (client-side) are not a great idea to use. Too many potential security pitfalls, and it takes a lot of thought and time to really patch up any holes. Further reading here.
I'm also wondering where you're calling addRole from. Assuming it's being triggered from client-side only, I would do this:
Client-side Code:
this.addRole = function(roleName){
  var roleCount = MongoRoles.find({name: roleName}).count();
  console.log(roleCount);
  if (roleCount === 0) {
    Meteor.call('insertRole', roleName, function (error, result) {
      if (error) {
        // check error.error and error.reason (if I'm remembering right)
      } else {
        // Success!
      }
    });
  }
}
How I've modified this code and why:
I made a roleCount variable so that you can avoid calling MongoRoles.find() twice like that, which is inefficient and consumes unneeded resources (CPU, disk I/O, etc). Store it once, then reference the variable instead, much better.
When checking numbers, try to avoid doing things like if (!count). Using if (count === 0) is clearer, and shows that you're referencing a number. Statements like if (!xyz) would make one think this is a boolean (true/false) value.
Always use === in JavaScript, unless you want to intentionally do a loose equality operation. Read more on this.
Always use opening and closing curly braces for if and other blocks, even if they contain just a single line of code. This is good practice: if you decide to add another line later, you don't have to wrap it in braces then.
Changed your database insert into a Meteor method (see below).
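To make the points above about === and if (!count) a bit more concrete, here is a small standalone sketch (plain JavaScript, nothing Meteor-specific; the variable names are just for illustration):

```javascript
const roleCount = 0;

// Loose equality coerces types, so unrelated values can compare as equal.
console.log(roleCount == '');     // true
console.log(roleCount == false);  // true

// Strict equality compares type and value.
console.log(roleCount === '');    // false
console.log(roleCount === 0);     // true

// `!count` is also true for undefined, which hides a missing value.
let missing;
console.log(!missing, missing === 0); // true false
```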
Side note: I've used JavaScript (ES5), but since you're new to JavaScript, I think you should jump right into ES6. ES is short for ECMAScript (which is what JS is based on). ES6 (or ECMAScript 2015) is, at the time of writing, the most recent stable version, and it includes all kinds of new awesomeness that JavaScript didn't previously have.
Server-side Code:
// Meteor.methods takes an object that maps method names to functions
Meteor.methods({
  insertRole: function (roleName) {
    check(roleName, String);
    try {
      // Any security checks, such as logged-in user, validating roleName, etc
      MongoRoles.insert({name: roleName});
    } catch (error) {
      // error handling. just throw an error from here and handle it on client
      throw new Meteor.Error('bad-thing', 'A bad thing happened.');
    }
  }
});
Hope this helps. This is all off the top of my head with no testing at all. But it should give you a better idea of an improved structure when it comes to database operations.
Addressing your edits
Your code looks good, except for a couple of issues:
You're defining Posts twice, don't do that. Make a file, for example, /lib/collections/posts.js and put the declaration and instantiation of Mongo.Collection in there. Then it will be executed on both client and server.
Your console.log would probably return an error, or zero, because Posts.insert is asynchronous on the client side. Try the below instead:
Posts.insert({title: "title-1"}, function (error, result) {
  console.log(Posts.find().count());
});