What's memcached's maximum key expiration time?
If I don't provide an expiration time and the cache gets full, what happens?
You can set key expiration to a date by supplying a Unix timestamp instead of a number of seconds. This date can be more than 30 days in the future:
Expiration times are specified in unsigned integer seconds. They can be set from 0, meaning "never expire", to 30 days (60*60*24*30). Any time higher than 30 days is interpreted as a unix timestamp date. If you want to expire an object on january 1st of next year, this is how you do that.
https://github.com/memcached/memcached/wiki/Programming#expiration
But, as you say, if you’re setting key expiration to an amount of time rather than a date, the maximum is 2,592,000 seconds, or 30 days.
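To make the rule quoted above concrete, here is a minimal sketch of how an expiration value is read. It only models the documented behavior and is not memcached's actual code; the function name `resolve_expiry` and the example value of `now` are assumptions made for illustration.

```python
THIRTY_DAYS = 60 * 60 * 24 * 30  # 2,592,000 seconds


def resolve_expiry(exptime, now):
    """Return the absolute Unix time at which an item expires, or None for 'never'.

    Simplified model of the documented rule (hypothetical helper, not memcached code).
    """
    if exptime == 0:
        return None           # 0 means "never expire"
    if exptime <= THIRTY_DAYS:
        return now + exptime  # up to 30 days: relative offset in seconds
    return exptime            # anything larger is read as a Unix timestamp


now = 1_700_000_000  # arbitrary example "current time"
print(resolve_expiry(0, now))                # None
print(resolve_expiry(THIRTY_DAYS, now))      # 1702592000
print(resolve_expiry(THIRTY_DAYS + 1, now))  # 2592001
```

Note the last case: 2592001 read as a Unix timestamp is a date in January 1970, so a value just over the 30-day limit yields an item that is already expired rather than one that lives slightly longer than 30 days.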
If you don't provide an expiration and the cache gets full, then the least recently used items are evicted first:
Memory is also reclaimed when it's time to store a new item. If there are no free chunks, and no free pages in the appropriate slab class, memcached will look at the end of the LRU for an item to "reclaim". It will search the last few items in the tail for one which has already been expired, and is thus free for reuse. If it cannot find an expired item however, it will "evict" one which has not yet expired. This is then noted in several statistical counters
https://github.com/memcached/memcached/wiki/UserInternals#when-are-items-evicted
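The reclaim-then-evict behavior described in that quote can be sketched with a toy cache. This is only an illustration under simplifying assumptions: a single LRU list, no slab classes or pages, and a hypothetical `TinyLRU` class with made-up counters named `reclaimed` and `evictions`.

```python
from collections import OrderedDict


class TinyLRU:
    """Toy cache: the front of the OrderedDict plays the role of the LRU tail."""

    def __init__(self, capacity, tail_search=5):
        self.capacity = capacity
        self.tail_search = tail_search  # how many tail items to inspect
        self.items = OrderedDict()      # key -> (value, expires_at or None)
        self.reclaimed = 0              # expired items whose space was reused
        self.evictions = 0              # unexpired items that were thrown out

    def _make_room(self, now):
        # First look at the last few items for one that has already expired.
        for key in list(self.items)[: self.tail_search]:
            expires_at = self.items[key][1]
            if expires_at is not None and expires_at <= now:
                del self.items[key]
                self.reclaimed += 1
                return
        # Nothing expired: evict the least recently used item.
        self.items.popitem(last=False)
        self.evictions += 1

    def set(self, key, value, now, expires_at=None):
        if key in self.items:
            del self.items[key]
        elif len(self.items) >= self.capacity:
            self._make_room(now)
        self.items[key] = (value, expires_at)

    def get(self, key, now):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and expires_at <= now:
            del self.items[key]
            return None
        self.items.move_to_end(key)     # a read makes the item most recently used
        return value


cache = TinyLRU(capacity=2)
cache.set("a", 1, now=0)                 # no expiration
cache.set("b", 2, now=0, expires_at=10)
cache.set("c", 3, now=20)                # "b" has expired, so it is reclaimed
cache.set("d", 4, now=20)                # nothing expired, so "a" is evicted
print(list(cache.items), cache.reclaimed, cache.evictions)  # ['c', 'd'] 1 1
```

The item stored without an expiration ("a") survives the first insert only because an expired item was available; on the next insert it is evicted even though it never expires.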
No, there is no limit. The 30-day limit applies only if you give the number of seconds the key should stay there; if you give a timestamp, the only limit is the maximum long or int value on the machine.
For example, ->set('key', 'value', time() + 24*60*60*365) will make the key stay there for a year, although the value can still be deleted if the cache gets full or is restarted in between.
An expiration time, in seconds. Can be up to 30 days. After 30 days,
is treated as a unix timestamp of an exact date.
https://code.google.com/p/memcached/wiki/NewCommands#Standard_Protocol
OK, I found out that the number of seconds may not exceed 2592000 (30 days). So the maximum expiration time is 30 days.
Looks like some answers are not valid anymore.
I found out that a key does not get set at all when the TTL is too high, for example 2992553564.
Tested with the following PHP code:
var_dump($memcached->set($id, "hello", 2992553564)); // true
var_dump($memcached->get($id)); // empty!
var_dump($memcached->set($id, "hello", 500)); // true
var_dump($memcached->get($id)); // "hello"
Version is memcached 1.4.14-0ubuntu9.
In Laravel, if the config.session.lifetime setting is equivalent to 30 days or more, it will be treated as a timestamp (assuming memcached is used, this gives a token mismatch error every time).
To answer the question: a memcached expiration can be set to any time. (Laravel's default setting on v5.0 will give you an already expired timestamp.) If you do not set it, the default will be used.
If I don't provide an expiration time and the cache gets full, what happens?
If the expiration is not provided (or TTL is set to 0) and the cache gets full then your item may or may not get evicted based on the LRU algorithm.
Memcached provides no guarantee that any item will persist forever. It may be deleted when the overall cache gets full and space has to be allocated for newer items. Also in case of a hard reboot all the items will be lost.
From user internals doc
Items are evicted if they have not expired (an expiration time of 0 or
some time in the future), the slab class is completely out of free
chunks, and there are no free pages to assign to a slab class.
Below is how you can reduce the chances of your item getting cleaned up by the LRU job.
Create an item that you want to expire in a
week? Don't always fetch the item but want it to remain near the top
of the LRU for some reason? add will actually bump a value to the
front of memcached's LRU if it already exists. If the add call
succeeds, it means it's time to recache the value anyway.
source on "touch"
It is also good to monitor the overall memory usage of memcached for resource planning, and to track the eviction statistics counter to know how often items are being evicted due to lack of memory.
I've recently upgraded my Kafka Streams application from 2.0.1 to 2.5.0. As a result I'm seeing a lot of warnings like the following:
org.apache.kafka.streams.kstream.internals.KStreamWindowAggregate$KStreamWindowAggregateProcessor Skipping record for expired window. key=[325233] topic=[MY_TOPIC] partition=[20] offset=[661798621] timestamp=[1600041596350] window=[1600041570000,1600041600000) expiration=[1600059629913] streamTime=[1600145999913]
There seems to be new logic in the KStreamWindowAggregate class that checks whether a window has closed. If it has closed, the messages are skipped. Compared to 2.0.1 these messages were still processed.
Question
Is there a way to get the same behavior as before? I'm seeing lots of gaps in my data with this upgrade and I'm not sure how to solve this, as these gaps were not seen previously.
The aggregate function that I'm using already deals with windowing and as a result with expired windows. How does this new logic relate to these expiring windows?
Update
While exploring further, I see that it is indeed related to the grace period in ms. In my custom TimestampExtractor (which uses the timestamp from the payload instead of the normal record timestamp), I can see that for the expired-window warnings the incoming timestamp is indeed more than 24 hours later than the event time from the payload.
I assume this is caused by consumer lags of over 24 hours.
The timestamp extractor's extract method has a partitionTime parameter, which according to the docs is:
partitionTime the highest extracted valid timestamp of the current record's partition (could be -1 if unknown)
So is this the create time of the record on the topic? And is there a way to influence this so that my records are no longer skipped?
Compared to 2.0.1 these messages were still processed.
That is a little bit surprising (although I would need to double-check the code), at least for the default config. By default, store retention time is set to 24h, and thus in 2.0.1 messages older than 24h should also not be processed, as the corresponding state has already been purged. If you did change the store retention time (via Materialized#withRetention) to a larger value, you would also need to increase the window grace period accordingly via the TimeWindows#grace() method.
The aggregate function that I'm using already deals with windowing and as a result with expired windows. How does this new logic relate to these expiring windows?
Not sure what you mean by this or how you actually do this. The old and new logic are similar with regard to how long a window is stored (retention time config). The new part is the grace period, which you can increase to the same value as the retention time if you wish.
About "partition time": it is computed based on whatever the TimestampExtractor returns. In your case, it's the max of whatever you extracted from the message payload.
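The numbers in the warning from the question can be checked by hand. The sketch below is a simplified model of the window-closed check, not the Kafka Streams code itself; the function name `is_window_expired` is hypothetical, and the grace value is inferred from the log line as streamTime minus expiration.

```python
def is_window_expired(window_end_ms, stream_time_ms, grace_ms):
    """Simplified model: a window counts as closed once stream time has moved
    past its end by at least the grace period."""
    expiration = stream_time_ms - grace_ms
    return window_end_ms <= expiration


DAY_MS = 24 * 60 * 60 * 1000

# Values copied from the warning in the question.
window_end = 1600041600000
stream_time = 1600145999913
logged_expiration = 1600059629913

grace = stream_time - logged_expiration
print(grace)                                                    # 86370000
print(is_window_expired(window_end, stream_time, grace))        # True
print(is_window_expired(window_end, stream_time, 2 * DAY_MS))   # False
```

The inferred grace of 86,370,000 ms is 24 hours minus the 30-second window size, and the window ended about 29 hours before the observed stream time, which is why the record is skipped. With a grace of 48 hours in this model the same record would still be accepted, which is the effect of raising the grace period together with the retention time.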
I'm trying to graph the total number of POST requests per minute, but there's a "ramp up" pattern that leads me to believe that I'm not getting the actual total of requests per minute, but a cumulative value.
Here is my query:
sum_over_time(django_http_responses_total_by_status_view_method_total{job="django-prod-app", method="POST", view="twitch_webhooks"}[1m])
Here are the "ramp up" patterns over 7 days (drop-offs indicating a reboot):
What leads me to believe my understanding of sum_over_time() is incorrect is that the existing webhooks should always exist. At the time of the most recent reboot we had 72k webhook subscriptions, so it doesn't make sense for the value to climb over time; it would make more sense to see a large spike at the start, catching up on webhooks that were not captured during downtime.
Is this query correct for what I'm trying to achieve?
I am using django-prometheus for exporting.
You want increase rather than sum_over_time, as this is a counter.
If the django_http_responses_total_by_status_view_method_total metric is a counter, then the increase() function must be used for returning the number of requests during the last minute:
increase(django_http_responses_total_by_status_view_method_total[1m])
Note that increase() function in Prometheus can return fractional results even if django_http_responses_total_by_status_view_method_total metric contains only integer values. This is due to implementation details - see this comment and this article for details.
If the django_http_responses_total_by_status_view_method_total metric is a gauge, which shows the number of requests since the previous sample, then sum_over_time() function must be used for returning requests per last minute:
sum_over_time(django_http_responses_total_by_status_view_method_total[1m])
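To see why sum_over_time() over a counter produces the "ramp up" from the question, here is a small sketch with made-up samples. `simple_increase` is a hypothetical simplification: it ignores the extrapolation that Prometheus applies (the reason real results can be fractional) and only handles counter resets.

```python
def sum_over_time(samples):
    """Adds up the raw sample values in the window."""
    return sum(samples)


def simple_increase(samples):
    """Growth of a counter across the window, treating any drop as a reset to zero."""
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


# Made-up counter samples from two one-minute windows with identical traffic.
early = [100, 102, 105, 107]      # shortly after a restart
late = [5000, 5002, 5005, 5007]   # days later

print(sum_over_time(early), sum_over_time(late))       # 414 20014
print(simple_increase(early), simple_increase(late))   # 7 7
print(simple_increase([100, 105, 2, 4]))               # 9 (counter reset in the middle)
```

Both windows saw 7 requests, but the sum of raw counter values keeps growing with the counter itself and only drops when the process restarts.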
How should we choose the right value for the TTL? We need a push message delivered reliably, without being dropped, but at the same time we would like it delivered quickly, because it is used to initiate live calls. I understand that 0 is not an option for us, since the message has a good chance of being dropped? But then should it be 60*60 (an hour) or 60 (a minute), or what is the right way of thinking here?
Remember that the value of the TTL parameter must be a duration from 0 to 2,419,200 seconds; it is the maximum period of time the push message may live on the push service before it is delivered.
If you set a TTL of zero, the push service will attempt to deliver the
message immediately, but if the device can't be reached, your message
will be immediately dropped from the push service queue.
You can also consider the following general guidance on TTLs (note that it is written about DNS record caching rather than push messages):
The higher the TTL, the less frequently caching name servers need to query authoritative name servers.
A higher TTL reduces the perceived latency of a site and decreases the dependency on the authoritative name servers.
The lower the TTL, the sooner the cached record expires. This allows queries for the records to occur more frequently.
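As a small illustration of the 0 to 2,419,200 second range mentioned in this answer, the sketch below validates a TTL before it is sent as the `TTL` request header of a web push message. The helper name `ttl_header` is hypothetical, and the value 60 is used purely as an example, not as a recommendation.

```python
MAX_TTL_SECONDS = 2_419_200  # upper bound quoted above (28 days)


def ttl_header(seconds):
    """Build the TTL header for a push request, rejecting out-of-range values."""
    if not 0 <= seconds <= MAX_TTL_SECONDS:
        raise ValueError(f"TTL must be between 0 and {MAX_TTL_SECONDS} seconds")
    return {"TTL": str(seconds)}


print(ttl_header(60))  # {'TTL': '60'}
```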
I have an application that consumes work to do from an AWS topic. Work is added several times a day and my application quickly consumes it and the queue length goes back to 0. I am able to produce a metric for the length of the queue.
I would like a metric for the time since the length of queue was last zero. Any ideas how to get started?
Assuming a queue_size gauge that records the size of the queue, you can define a recording rule like this:
# Timestamp of the most recent `queue_size` == 0 sample; else propagate the previous value
- record: last_empty_queue_timestamp
  expr: timestamp(queue_size == 0) or last_empty_queue_timestamp
Then you can compute the time since the last time the queue was empty as simply as:
timestamp(queue_size) - last_empty_queue_timestamp
Note however that because this is a gauge (and because of the limitations of sampling), you may end up with weird results. E.g. if one work item is added every minute, your sampling interval is one minute and you sample exactly after the work items have been added, your queue may never (or very rarely) appear empty from the point of view of Prometheus. If that turns out to be an issue (or simply a concern), you may be better off having your application export a metric that is the last timestamp when something was added to an empty queue (basically what the recording rule attempts to compute).
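The logic of the rule and the subtraction above can be simulated on a list of samples. This is only a sketch of the idea with made-up data; the function name `time_since_empty` is hypothetical, and real rule evaluation intervals and staleness handling in Prometheus are not modeled.

```python
def time_since_empty(samples):
    """samples: list of (timestamp, queue_size) pairs in time order.

    Returns (timestamp, seconds since the queue was last seen empty) pairs,
    with None until the first empty sample has been observed.
    """
    last_empty = None
    result = []
    for ts, size in samples:
        if size == 0:
            last_empty = ts  # corresponds to timestamp(queue_size == 0)
        # otherwise the previous value is kept ("or last_empty_queue_timestamp")
        result.append((ts, None if last_empty is None else ts - last_empty))
    return result


samples = [(0, 0), (60, 3), (120, 1), (180, 0), (240, 2)]
print(time_since_empty(samples))
# [(0, 0), (60, 60), (120, 120), (180, 0), (240, 60)]
```

If the queue happened to be empty only between two samples, the simulation would never notice it, which is the sampling limitation described above.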
Similar to Alin's answer; upon revisiting this problem I found this from the Prometheus documentation:
https://prometheus.io/docs/practices/instrumentation/#timestamps,-not-time-since
If you want to track the amount of time since something happened, export the
Unix timestamp at which it happened - not the time since it happened.
With the timestamp exported, you can use the expression time() -
my_timestamp_metric to calculate the time since the event, removing the need for
update logic and protecting you against the update logic getting stuck.
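On the application side, the advice in that quote could look roughly like the following. It is a sketch only: `QueueEmptyTracker` is a hypothetical class, the injectable clock exists just to make the example deterministic, and wiring `last_empty_timestamp` into an actual metrics library is left out.

```python
import time


class QueueEmptyTracker:
    """Remembers the Unix timestamp at which the queue was last observed empty."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self.last_empty_timestamp = None  # the value to export as a gauge

    def observe(self, queue_length):
        if queue_length == 0:
            self.last_empty_timestamp = self._clock()

    def seconds_since_empty(self):
        """What a query like time() - my_timestamp_metric would compute."""
        if self.last_empty_timestamp is None:
            return None
        return self._clock() - self.last_empty_timestamp


fake_now = [1000.0]                       # stand-in clock for the example
tracker = QueueEmptyTracker(clock=lambda: fake_now[0])
tracker.observe(0)                        # queue is empty at t=1000
fake_now[0] = 1090.0
tracker.observe(4)                        # work arrived; timestamp is unchanged
print(tracker.seconds_since_empty())      # 90.0
```

The application only ever stores a timestamp; the subtraction happens at query time, so nothing needs to keep updating a "time since" value.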
The following discussion is in the context of Apache Flink:
Imagine that we have a KeyedStream whose key is the event's id and whose event time is the event's timestamp, and that for each event we want to calculate how many events arrived within 10 minutes.
The problems that need to be solved are:
How to design the window ?
We can create a window of 10 minutes after each event arrives, but this means that for each event there will be a delay of 10 minutes, because we have to wait for the 10-minute window to complete.
We can create a window of 10 minutes which takes the timestamp of each event as the maximum timestamp in this window, which means that we don't need to wait for 10 minutes, because we take the last 10 minutes of elements before the element arrives. But this kind of window is not easy to define, as far as I know.
How to deal with memory or other resource issues ? Even if we succeed in creating such a window, the ids of the events may be very diverse, so there will be many windows like this. How does the system keep their state in memory ? There is a real possibility of running out of memory.
Maybe there are some problems that I haven't mentioned here, or maybe there are good solutions other than windows (e.g. Patterns). If you have a good solution, please give me a clue, thank you.
You could do this with a GlobalWindow, a Trigger that fires on every event, and an Evictor that removes events that are more than 10 minutes old before counting the remaining events. (A naive implementation could easily perform very poorly, however.)
Yes, this may require keeping a lot of state: you'll be keeping every event from the past 10 minutes (well, you only need to store the timestamp from each event). If you set up the RocksDB state backend then Flink will spill to disk if need be, but with some obvious performance penalty. It is probably better to use a cluster large enough to hold 10 minutes of traffic in memory. Even at one million events per second, each with a 32-bit timestamp, that's only 2.4GB in 10 minutes (1 million events per second x 600 seconds x 4 bytes per event), which doesn't seem like a problem at all.
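The idea behind this answer (keep only timestamps per key, drop those older than 10 minutes, count the rest on every event) can be sketched outside of Flink. This is not Flink API code; `TrailingCounter` is a hypothetical class, and it assumes events for a key arrive in timestamp order.

```python
from collections import defaultdict, deque

WINDOW_MS = 10 * 60 * 1000  # 10 minutes


class TrailingCounter:
    """Per key, counts events in the 10 minutes up to and including the current one."""

    def __init__(self, window_ms=WINDOW_MS):
        self.window_ms = window_ms
        self.timestamps = defaultdict(deque)  # key -> timestamps still in the window

    def on_event(self, key, ts_ms):
        q = self.timestamps[key]
        q.append(ts_ms)
        # Evict timestamps that are more than 10 minutes older than this event.
        while q and q[0] < ts_ms - self.window_ms:
            q.popleft()
        return len(q)  # the count includes the current event


counter = TrailingCounter()
counts = [counter.on_event("a", ts) for ts in (0, 300_000, 600_000, 1_500_000)]
print(counts)                   # [1, 2, 3, 1]
print(1_000_000 * 600 * 4)      # 2400000000 bytes, the 2.4GB estimate above
```

Each event produces a result immediately, with no 10-minute wait, and the state per key is bounded by the number of events in the last 10 minutes.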