logrotate: rotate by file size and keep old files until maxage

I wish to rotate logs by file size and retain old logs by age. The maxage parameter limits how old rotated files can be. But I am looking for a setup where files are rotated when they reach a specific size (say 100k) and however many old files accumulate are kept for a specific number of days (say 7 days).
There may be relatively few writes to the log file, so there could be only one rotation in 7 days, or there could be multiple rotations within a single day. In either case I want to keep old files for 7 days (in this example) from the day they are created.
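For illustration, a minimal sketch of a configuration along these lines (the path and the rotate count are assumptions, and logrotate has to run often enough, e.g. hourly from cron, for the size check to trigger):

/var/log/myapp/app.log {
    # rotate as soon as the file grows past 100k (checked each time logrotate runs)
    size 100k
    # keep more rotations than can plausibly occur in 7 days,
    # so that maxage, not rotate, is the effective limit
    rotate 1000
    # delete rotated files older than 7 days
    maxage 7
    missingok
    notifempty
    compress
}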

Related

TYPO3 var/log - how to auto remove entries after x days

In TYPO3 10, logs are being stored, among other places, in var/log. Files stored there grow over time. Is there a way to keep it clean and automatically keep only entries from the last x days?
That would eventually be quite an undertaking. Normally, log rotation is a job for the OS, so you can use something like logrotate or similar to rotate the logs.
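A minimal logrotate sketch for that directory (the path is an assumption for a typical installation, and 14 days is just an example retention):

/var/www/mysite/var/log/*.log {
    daily
    # keep 14 rotated files, i.e. roughly the last two weeks
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
}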

Apache Druid Appending Segment without dropping or summing it

I have three JSON files with the same timestamp but different values to upload to Druid. I want to upload them separately with the same segment granularity. However, it drops the existing segment and uploads the new one.
I don't want to use appendToExisting: true because it sums the values of identical rows. That is a situation I want to avoid (I may be adding the same file again in the future).
Is there a way to add new data to a specific segment without dropping or summing it?

GRIB files with incremental updates

Folks,
I am new to dealing with the GRIB format and seek your advice on the following question:
We have an application where we plan to receive data at 6-hour intervals. The forecast will be for the next 10 to 15 days.
There is a requirement that, to reduce the download size, the system should only download incremental changes, meaning the new GRIB files will only contain data that has changed.
So all the previously downloaded GRIB files should still display their data, and for the parts where there was a change (assuming the clients will know), the client will download and display the GRIB file that carries this incremental update.
Is this kind of incremental change to GRIB supported by the standard?
I suspect this option is not supported by GRIB files. As the data in GRIB files is packed, you cannot know which variables have changed and which have not.
In addition, most parameters will most likely show only slight, insignificant changes between forecasts (the forecast for, say, 07:00 produced at 00:00 and the one produced at 06:00 will differ for most parameters, but the differences can be on the order of 10^-x, i.e. insignificant). Some parameters or regions might of course show larger differences that you would like to highlight.
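If you want to quantify how much two successive runs actually differ, the grib_compare tool from ecCodes reports keys and data values that differ between two files; a minimal sketch (the file names are assumptions):

# compare the 00:00 and 06:00 runs and report differing keys and data values
grib_compare forecast_run_00z.grib forecast_run_06z.grib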

Google nearline pricing on overwrites

I have Google Nearline storage set up and working fine via gcloud/gsutil.
So far I have been using rsync to back up some databases, e.g.:
gsutil rsync -d -R /sourcedir/db_dir gs://backup_bucket/
Currently the files are datestamped in the filename, so we get a different filename every day.
I've just spotted the mention of early deletion charges (currently on trial).
I'm assuming that whenever I delete a file with -d, I will get charged for that file for up to 30 days? If so, there's no point deleting it before then (I'll get charged either way).
But if I keep the filename the same and overwrite the file with the latest day's backup, the documentation says:
"if you create an object in a bucket configured for Nearline, and 10 days later you overwrite it, the object is considered an early deletion and you will be charged for the remaining 20 days of storage."
So I'm a bit unclear: if I have a file and overwrite it with a new version, am I then charged again for each file/day every time it's updated, as well as for the new file?
e.g. for one file, backed up daily via rsync (assuming the same filename this time), over 30 days:
day1 myfile is created
day2 myfile is updated
day3 myfile is updated
... and so on
Am I now being charged (filespace_day1 * 30 days) + (filespace_day2 * 29 days) + (filespace_day3 * 28 days) and so on... just for the one file (rather than filespace * 30 days)?
Or does it just mean that if I create a 10 GB file and overwrite it with a 2 MB file, I will be charged for the 10 GB for the full 30 days (ignoring the 2 MB file's costs)?
If so, are there any best practices for rsync for keeping charges down?
Overwriting an object in GCS is equivalent to deleting the old object and inserting a new object in its place. You are correct that overwriting an object does incur the early delete charge, and so if you were to overwrite the same file every day, you would be charged for 30 days of storage every day.
Nearline storage is primarily meant for objects that will be retained for a long time and infrequently read or modified, and it's priced accordingly. If you want to modify an object on a daily basis, standard or durable reduced availability would likely be a cheaper option.
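If the dated filenames are kept (so nothing is overwritten), one way to automate the "no point deleting before 30 days" conclusion from the question is a bucket lifecycle rule; a minimal sketch (the bucket name is taken from the question, and the rule itself is only an illustration, not something the answer above prescribes):

# delete objects only once they are at least 30 days old, i.e. after the
# Nearline minimum storage duration has already been paid for
cat > lifecycle.json <<'EOF'
{
  "rule": [
    { "action": { "type": "Delete" }, "condition": { "age": 30 } }
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://backup_bucket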

mongodb: old data aggregation

I have a script that collects data from somewhere and writes it into a MongoDB collection every 10 minutes. Then we have a frontend that displays the historical trends of the data in the form of graphs/charts etc. I noticed that we now have data for around 2 years, all of it at a 10-minute resolution. Generally, we like to see data at a 10-minute resolution for the past 6 months only. Data that is 6 months to 1 year old is checked at an hourly resolution, while data older than a year is checked at a daily resolution.
This means that we should somehow aggregate the 10-minute data to a coarser resolution once it is older than a certain age, e.g. average out (or maybe take the max/min, depending on the parameter) data older than a year on an hourly basis, remove the 10-minute entries and make a single new entry.
Are there any frameworks available out there that could support such policy-based data management?
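As an illustration of the aggregation step itself (not a full policy framework), here is a mongo shell sketch that averages 10-minute documents older than a year into hourly documents; the collection and field names are assumptions, and the $merge stage needs MongoDB 4.2+:

// group 10-minute samples older than one year into hourly buckets
var oneYearAgo = new Date(Date.now() - 365 * 24 * 3600 * 1000);
db.metrics.aggregate([
  { $match: { ts: { $lt: oneYearAgo } } },
  // bucket by year/month/day/hour
  { $group: {
      _id: {
        y: { $year: "$ts" }, m: { $month: "$ts" },
        d: { $dayOfMonth: "$ts" }, h: { $hour: "$ts" }
      },
      avgValue: { $avg: "$value" },
      maxValue: { $max: "$value" },
      minValue: { $min: "$value" }
  } },
  // write the hourly rollups into a separate collection (MongoDB 4.2+)
  { $merge: { into: "metrics_hourly" } }
]);
// the original 10-minute documents can then be removed, e.g.:
// db.metrics.deleteMany({ ts: { $lt: oneYearAgo } });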