When does Google Nearline delete files via lifecycle management?

TL;DR I want to know if the lifecycle rules I created for Google Nearline are correct, but Google Cloud Storage has not deleted the files I think it should in my test, despite waiting a couple of days.
The Longer Version
I'm setting up Google Nearline storage for backups using versioning, and I'm trying to manage the retention of old versions. I've read through the documentation on Object Lifecycle Management and I think I understand it, but it's not behaving as I expect.
Here's the situation.
Following the examples in the documentation, I set up lifecycle management to keep 6 versions of each file, deleting any that are older than that. Here's the JSON document I used to set that up:
{
    "rule": [
        {
            "action": {
                "type": "Delete"
            },
            "condition": {
                "numNewerVersions": 6
            }
        }
    ]
}
I implemented that rule (saved in a file called nearline.json) with
gsutil lifecycle set nearline.json gs://bucket_name
I checked to ensure that the rule was successfully applied with
gsutil lifecycle get gs://bucket_name
and got back {"rule": [{"action": {"type": "Delete"}, "condition": {"numNewerVersions": 6}}]} as a reply - so, it appears the rule was successfully applied.
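(For reference, the same setup can be done with the google-cloud-storage Python client instead of gsutil; a hedged sketch, assuming a reasonably recent client release and the placeholder bucket name from above:)
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("bucket_name")

# Equivalent of the numNewerVersions rule in nearline.json; this appends a
# rule to the bucket's lifecycle configuration.
bucket.add_lifecycle_delete_rule(number_of_newer_versions=6)
bucket.patch()  # persist the change

# Equivalent of `gsutil lifecycle get`:
print(list(bucket.lifecycle_rules))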
Next, I set out to test it, by executing the following commands to create a simple test file with multiple revisions:
# append the current unix timestamp to my test file
echo "Date = $(date +%s)" >> test.txt
# send the new revision to nearline
gsutil cp test.txt gs://bucket_name
I did this a total of 10 times.
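For repeatability, here is that same ten-revision test as a small Python sketch that shells out to gsutil (the bucket name is the placeholder used above):
import subprocess
import time

for _ in range(10):
    # append the current unix timestamp to the test file
    with open("test.txt", "a") as f:
        f.write(f"Date = {int(time.time())}\n")
    # send the new revision to nearline
    subprocess.run(["gsutil", "cp", "test.txt", "gs://bucket_name"], check=True)
    time.sleep(10)  # keep the object generations clearly spaced apart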
Next, I checked to see what Google shows in the bucket. Running gsutil ls -la gs://bucket_name gives:
23 2016-10-08T15:59:59Z gs://bucket_name/test.txt#1475942400031000 metageneration=1
46 2016-10-08T16:00:09Z gs://bucket_name/test.txt#1475942410008000 metageneration=1
69 2016-10-08T16:00:18Z gs://bucket_name/test.txt#1475942418466000 metageneration=1
92 2016-10-08T16:00:26Z gs://bucket_name/test.txt#1475942426563000 metageneration=1
115 2016-10-08T16:00:38Z gs://bucket_name/test.txt#1475942438484000 metageneration=1
138 2016-10-08T16:00:44Z gs://bucket_name/test.txt#1475942444562000 metageneration=1
161 2016-10-08T16:00:54Z gs://bucket_name/test.txt#1475942454455000 metageneration=1
184 2016-10-08T16:01:06Z gs://bucket_name/test.txt#1475942466301000 metageneration=1
207 2016-10-08T16:01:16Z gs://bucket_name/test.txt#1475942476052000 metageneration=1
230 2016-10-08T16:01:50Z gs://bucket_name/test.txt#1475942510806000 metageneration=1
So, again, it looks like everything is successful. Except that, instead of seeing only six entries, I see all ten.
I should see six entries because the rule I set up says to delete items that have six or more newer versions. That should include the first four versions in the list above, because they all have six or more newer versions.
Now the documentation does say "if an object meets the conditions for deletion, the object might not be deleted right away", but it has been a couple of days & it hasn't happened. I did find this answer in which it is stated that "there is no guarantee that it will be deleted immediately, but it will usually happen in less than a day".
So, it appears that one of three things is going on:
I just haven't waited long enough
There's something wrong with my lifecycle rule
There's something wrong with the way I'm testing it
Can anyone tell me which of these three it is?

You just need to wait a little longer.
Since Cloud Storage Nearline is intended for data that you do not expect to access frequently, it probably takes longer to apply lifecycle rules than the other Google Cloud Storage classes.
"Data you do not expect to access frequently (i.e., no more than once per month). Typically this is backup data for disaster recovery, or so called "cold" storage that is archived and may or may not be needed at some future time."
https://cloud.google.com/storage/docs/storage-classes
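In the meantime, a low-effort way to watch for the lifecycle sweep is to count the versions periodically; a sketch using the google-cloud-storage Python client (the bucket name is the asker's placeholder):
from google.cloud import storage

client = storage.Client()
# versions=True includes archived generations, not just the live object
versions = list(client.bucket("bucket_name").list_blobs(versions=True))
print(f"{len(versions)} versions still present")  # expect 6 once the rule has run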

Related

Memcached touch fails on AWS Elasticache

I'm trying to use the Memcached "touch" command to reset expiration times, but I'm consistently getting a generic error response. I've simplified things to using telnet, and I've got a pretty simple use case that demonstrates the issue.
set TestKey 0 60 9
TestValue
STORED
get TestKey
VALUE TestKey 0 9
TestValue
END
touch TestKey 300
ERROR
get TestKey
VALUE TestKey 0 9
TestValue
END
In the above snippet, I believe I am caching a value "TestValue" with the key "TestKey", and a timeout of 60 seconds. I'm then reading the value (using the key), which demonstrates it is stored correctly. I try to use touch to set the expiration to 300 seconds, but I get a response: ERROR. Finally, I get the value again (mostly to demonstrate that the entire test happened before the original value times out).
Additional details:
I also get ERROR if I try to touch a key that doesn't exist, or use the `gat` or `gats` commands (to Get And Touch)
I've tried to make sure all my commands are formatted according to https://github.com/memcached/memcached/blob/master/doc/protocol.txt
Am I using these commands incorrectly? Does AWS ElastiCache for Memcached lack support for touch? (I can't find any documentation asserting that it does or does not)
Figured it out - We were running version 1.4.5 of the memcached engine, but apparently touch isn't fully supported (at least outside binary mode) until 1.4.24.
Upgrading to the latest engine version (1.6.12) resolved the issue.
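For anyone who wants to check this without telnet, here is a rough Python sketch over a raw socket that prints the engine version and attempts a touch (the endpoint is a placeholder, and the helper reads only a single reply chunk):
import socket

def cmd(sock, line):
    """Send one text-protocol command and return the raw reply."""
    sock.sendall(line.encode() + b"\r\n")
    return sock.recv(4096).decode()

# Placeholder endpoint; substitute your ElastiCache node endpoint.
sock = socket.create_connection(("my-cluster.cache.amazonaws.com", 11211))
print(cmd(sock, "version"))                           # touch needs engine >= 1.4.24
print(cmd(sock, "set TestKey 0 60 9\r\nTestValue"))   # expect "STORED"
print(cmd(sock, "touch TestKey 300"))                 # "TOUCHED", or "ERROR" on old engines
sock.close()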

FleetApi - How to use legal rest times?

I'd like to know the position at which the driver would need to rest along a route with given waypoints.
I am calling GET https://fleet.ls.hereapi.com/2/calculateroute.json with the following params:
{
    mode: "fastest;car;traffic:enabled",
    waypoint0: "19.286769, -99.652773",
    waypoint1: "19.419185, -99.17755430000001",
    waypoint2: "19.41530,-99.17844",
    waypoint3: "31.29778, -110.93690",
    restTimes: "MEX",
    maxSpeed: 110,
    departure: "2021-07-20T15:00:00.000Z"
}
This returns warnings with the info of rest times like this:
{
    "message": "Taking the short driver rest time after 18036 sec for 1800 sec at routeLinkSeqNum 1485",
    "code": 14,
    "routeLinkSeqNum": 1485
}
I would like to know how to use/read this info. I don't know what routeLinkSeqNum is and how to utilize it.
Governments impose rules on how long a truck driver can drive before he needs to rest. Routing can consider these regulations w.r.t. short rests during a day and long (overnight) rests.
For example, in EU countries drivers have to rest after 4.5 hours of driving for at least 45 minutes, and must not exceed a total of 9 working hours per day before they have to rest for 11 hours.
Activate this feature in the router with the "&restTimes=local" request parameter; in this case it is the "MEX" value in the request above. Routing will then consider each country's local regulations.
In the same parameter you can also specify whether the driver starts the route fresh, or how long he has already been driving / on duty since his last short or long rest period.
routeLinkSeqNum is an index into the link array of a leg. If you check the response, you will find the link under response > route[0] > leg > link, e.g. link[1485].
So one route can have n legs, and one leg can have m links.
This will help you to plot the rest times.
Here is an example shown in the tool:
https://tcs.ext.here.com/examples/v3/fleet_telematics_api
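To actually plot the rest positions, one can flatten the links of all legs and index with routeLinkSeqNum; a hedged Python sketch (field names follow the response layout described above, whether links appear may depend on the requested attributes, and where the warnings live in the response is an assumption):
import requests

resp = requests.get(
    "https://fleet.ls.hereapi.com/2/calculateroute.json",
    params={
        "mode": "fastest;car;traffic:enabled",
        "waypoint0": "19.286769,-99.652773",
        "waypoint1": "31.29778,-110.93690",
        "restTimes": "MEX",
        "apiKey": "YOUR_API_KEY",  # placeholder
    },
).json()

route = resp["response"]["route"][0]
links = [link for leg in route["leg"] for link in leg["link"]]  # m links per leg, flattened

for warning in route.get("warnings", []):  # assumption: warnings attached to the route
    if "routeLinkSeqNum" in warning:
        link = links[warning["routeLinkSeqNum"]]
        print(warning["message"], "near", link["shape"][0])  # first "lat,lon" point of the link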

How do I make `mmm` omit the time-consuming doc build steps?

Working with the source code from AOSP, after I make a trivial change to a
source file under frameworks/base/core/java/android/,
mmm frameworks/base -j9 takes about 4 minutes.
A large portion of that time seems to be waiting for steps with names containing "Droiddoc" or "Docs droiddoc" to complete:
...
[ 14% 4/28] Docs droiddoc: out/target/common/docs/api-stubs
[ 21% 6/28] //frameworks/base:test-api-stubs-docs Droiddoc [common]
DroidDoc took 102 sec. to write docs to out/soong/.intermediates/frameworks/base/test-api-stubs-docs/android_common/docs/out
[ 28% 7/25] Docs droiddoc: out/target/common/docs/api-stubs
DroidDoc took 113 sec. to write docs to out/target/common/docs/api-stubs
[ 32% 8/25] //frameworks/base:api-stubs-docs Droiddoc [common]
DroidDoc took 115 sec. to write docs to out/soong/.intermediates/frameworks/base/api-stubs-docs/android_common/docs/out
[ 40% 9/22] //frameworks/base:system-api-stubs-docs Droiddoc [common]
DroidDoc took 117 sec. to write docs to out/soong/.intermediates/frameworks/base/system-api-stubs-docs/android_common/docs/out
...
I really don't need or want any documentation to be built on every little incremental recompile.
Is there a way to omit all these doc-related steps?
I'd be interested in either a command line flag if there is one,
or a hopefully simple hack to one or more Makefiles and/or .mk files.
I've looked through the .mk files; in particular build/make/core/droiddoc.mk
seems relevant. I tried cutting some wires in it without really understanding what I was doing, without success.
I'm hoping someone who understands how these .mk files are put together
will know how to do this easily.
I expect this will be of interest to anyone who runs mmm frequently.
During make or mmm invocations, there are apparently two different kinds of build steps that build docs.
Each must be dealt with in its own way.
The build steps that have "Docs droiddoc" in their progress messages. That string comes from build/make/core/droiddoc.mk.
I was able to suppress these build steps as follows: delete all lines from build/make/core/droiddoc.mk, so it becomes an empty file.
The build steps that have "Droiddoc" in their progress messages. That string comes from build/soong/java/droiddoc.go.
I was able to suppress these build steps as follows: delete or comment out the last two blocks in the calling file build/soong/java/androidmk.go:
func (jd *Javadoc) AndroidMk() android.AndroidMkData {
...
}
func (ddoc *Droiddoc) AndroidMk() android.AndroidMkData {
...
}
I confirmed that it's no longer spending time building docs, on Darwin, by keeping an eye on the Activity Monitor during the build,
and verifying that javadoc processes no longer appear.
With docs omitted, mmm frameworks/base -j9 after a small code change now takes 45 to 55 seconds, instead of 4 minutes.

How do I disable Celery's default timeout for a task, and/or prevent it from retrying?

I'm having some troubles with celery. Unfortunately the person who set it up isn't working here any more, and until now we've never had problems and thought we understood how it works well enough. Now it has become clear that we don't, and after hours of searching through documentation and other posts on here, I have to admit defeat. Hopefully, someone here can shed some light on what I am missing.
We're using several tasks, all of which are defined in a CELERYBEAT_SCHEDULE like this:
CELERYBEAT_SCHEDULE = {
    'runs-every-5-minutes': {
        'task': 'tasks.webhook',
        'schedule': crontab(minute='*/5'),
        'args': (WEBHOOK_BASE + '/task/refillordernumberbuffer', {'refill_count': 1000})
    },
    'send-sameday-delivery-confirmation': {
        'task': 'tasks.webhook',
        'schedule': crontab(minute='*/2'),
        'args': (WEBHOOK_BASE + '/task/sendsamedaydeliveryconfirmation', {})
    },
    'send-customer-hotspot-notifications': {
        'task': 'tasks.webhook',
        'schedule': crontab(hour=9, minute=0),
        'args': (WEBHOOK_BASE + '/task/sendcustomerhotspotnotifications', {})
    },
}
That's not all of them, but they all work like this. All of those are actually PHP scripts that have no knowledge of the whole celery concept. They are just scripts that execute certain things, and send notifications if necessary. When they are done, they just spit out a JSON response that says success=true.
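For context, the shared task presumably looks something like this (a guessed sketch; the real tasks.py is not shown, and the requests usage and missing explicit timeout are assumptions):
import requests
from celery import Celery

app = Celery('tasks')

@app.task
def webhook(url, params):
    # The PHP script does the real work and replies with success=true.
    response = requests.get(url, params=params)
    response.raise_for_status()
    return response.json()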
As far as I know, celery is only used to execute them periodically. We don't have problems with any of them except the last one in my code snippet. That task/script sends out emails, usually 5 to 10, but sometimes a lot more. And that's where the problems start, because (as far as I could tell by watching celery events; I could honestly not find any confirmation for this in the docs anywhere) when the successful JSON response from the PHP script doesn't arrive within 3 minutes, celery retries the task, and the script sends a lot of emails again. And again, because only a small number of emails was marked as "done" from the task's initial run. This often leads to 4 or 5 retries until enough emails were marked as "successfully sent" by the prior retries that the last retry finally finishes under this mystical 3-minute limit.
My questions:
Is there a default time limit? Where is it set? How do I override it? I've read about time_limit and soft_time_limit, but nothing I tried in the config seemed to help. If this is the solution, I would be in need of assistance as to how the settings are properly applied.
Can't I "just" disable the whole retry concept (for one task or for all, doesn't really matter) altogether? It seems to me that we don't need it, as we're running our tasks periodically and missing one due to a temporary error would not matter. I guess that means we shouldn't have used celery in the first place as we're misusing it, but for now I'd just like to understand it better.
Thanks for any help, and sorry if I left anything unclear – happy to answer any follow-up questions and provide more details if necessary.
The rest of the config file goes like this:
## Broker settings.
databases = parse_databases_xml()
settings = parse_custom_settings_xml()
BROKER_URL = 'redis://' + databases['taskqueue']['host'] + '/' + databases['taskqueue']['dbname']
# List of modules to import when celery starts.
CELERY_IMPORTS = ("tasks", )
## Using the database to store task state and results.
CELERY_RESULT_BACKEND = BROKER_URL
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ANNOTATIONS = {
    "*": {"rate_limit": "100/m"},
    "ping": {"rate_limit": "100/m"},
}
There is no time_limit to be found anywhere, so I don't think we're setting it ourselves. I left out the python imports and the functions that read from our config xml files, as that stuff is all working fine and just concerns some database auth data.
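For completeness, a hedged sketch of the Celery 3.x-era settings that govern time limits and broker redelivery (none of these appear in our config; the values are illustrative, not tested):
# Hard and soft execution time limits for tasks; the default is no limit.
CELERYD_TASK_TIME_LIMIT = 600        # seconds; worker is killed past this
CELERYD_TASK_SOFT_TIME_LIMIT = 540   # seconds; raises SoftTimeLimitExceeded

# With a Redis broker, an unacknowledged task is redelivered (and re-run)
# once the visibility timeout expires, which can look like a retry.
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 3600}  # seconds

# Acknowledge tasks when they are received rather than when they finish,
# so a long-running task is not redelivered mid-run (this is the default).
CELERY_ACKS_LATE = False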

EF6/Code First: Super slow during the 1st query, but only in Debug

I'm using EF6 RC1 with the Code First strategy, without precompiled views, and the problem is:
If I compile and run the exe application it takes like 15 seconds to run the first query (that's okay, since I'm still working on the pre-generated views). But if I use Visual Studio 2013 Preview to Debug the exact same application it takes almost 2 minutes BEFORE running the first query:
Dim Context = New MyEntities()
Dim Query = From I in Context.Itens '' <--- The debug takes 2 minutes in here
Dim Item = Query.FirstOrDefault()
Is there a way to remove this extra time? Am I doing something wrong here?
P.S.: The context itself is not complicated; it's just big, with 200+ tables.
Edit: Found out that the problem is that at debug time EF appears to be generating the views, ignoring the pre-generated ones.
Using the source code from EF I discovered that the property:
IQueryProvider IQueryable.Provider
{
    get
    {
        return _provider ?? (_provider = new DbQueryProvider(
            GetInternalQueryWithCheck("IQueryable.Provider").InternalContext,
            GetInternalQueryWithCheck("IQueryable.Provider").ObjectQueryProvider));
    }
}
is where the time is being consumed. But this is strange since it only takes time in debug. Am I missing something here?
Edit: Found more info related to the question:
Using Process Monitor (by Sysinternals) I found out that it is the 'devenv.exe' process that is consuming tons of time. To be more specific, it's consuming time with a 'Thread Exit'. It repeats the Thread Exit stack 36 times. I don't know if this info is very useful, but I saved a '.csv' with the stack; here is its body: [...] (edit: removed the '.csv' body, I can post it again in the comments if someone really thinks it's going to be useful, but it was confusing and too big.)
Edit: Installed VS2013 Ultimate and Entity Framework 6 RTM. Installed the Entity Framework Power Tools Beta 4 and used it to generate the Views. Nothing changed... If I run the exe it takes 20 seconds, if I 'Start' debugging it takes 120 seconds.
Edit: Created a small project to simulate the error: http://sdrv.ms/16pH9Vm
Just run the project inside the environment and directly through the .exe, click the button and compare the loading time.
This is a known performance issue in Lazy (which EF is using) when the debugger is attached. We are currently working on a fix (the current approach we are looking at is removing the use of Lazy). We hope to ship this fix in a patch release soon. You can track progress of this issue on our CodePlex site - http://entityframework.codeplex.com/workitem/1778.
More details on the coming 6.0.2 patch release that will include a fix are here - http://blogs.msdn.com/b/adonet/archive/2013/10/31/ef6-performance-issues.aspx
I don't know if you have found the solution, but in my case I had a similar issue which wasted close to a week of trying different suggestions. Finally, I found a solution by setting optimizeCompilations="true" in my web.config, and performance improved dramatically, from 15-30 seconds to about 2 seconds.