How do you configure rolling logs with WildFly?

We are using WildFly, and our wildflyhome/standalone/log directory is filling up with logs, so we eventually run out of disk space. I would like to set up rolling logs and know that it's possible, but I just don't know how to do it. Any help would be appreciated.

If you're on Linux, just set up a cron job to purge the files.
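A minimal sketch of such a cron entry, assuming a 14-day retention and an installation under /opt/wildfly (both are assumptions, adjust to your layout):

# run daily at 02:00 and delete rotated server logs older than 14 days
0 2 * * * find /opt/wildfly/standalone/log -name "server.log.*" -mtime +14 -delete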
If you want size-based rotation that keeps only as many rotated files as you tell it to, you can use a size-rotating-file-handler.
The following CLI batch replaces the default periodic-rotating-file-handler with a size-rotating-file-handler that rotates when the log file reaches 50 MB and keeps only 10 rotated files.
batch
/subsystem=logging/root-logger=ROOT:remove-handler(name=FILE)
/subsystem=logging/periodic-rotating-file-handler=FILE:remove
/subsystem=logging/size-rotating-file-handler=FILE:add(append=true, autoflush=true, named-formatter=PATTERN, max-backup-index=10, rotate-size=50m, file={relative-to=jboss.server.log.dir, path=server.log})
/subsystem=logging/root-logger=ROOT:add-handler(name=FILE)
run-batch
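Assuming a standalone server, you can save the batch above to a file and run it non-interactively with the CLI (the script name is illustrative):

$WILDFLY_HOME/bin/jboss-cli.sh --connect --file=size-rotation.cli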

Related

MDT step by step deployment capture not generating wim

New to MDT.
So I am following through the MS step by step guides:
https://learn.microsoft.com/en-us/windows/deployment/windows-10-poc
https://learn.microsoft.com/en-us/windows/deployment/windows-10-poc-mdt
I am at step 28 (in the second guide):
Deploy Windows 10 in a test lab using Microsoft Deployment Toolkit
This is where the deployment wizard has been launched in a VM on the host system, and I have watched the process continue for an hour. It finally finishes, but it does not create the .wim on the server share as expected and as referred to in the Bootstrap.ini:
Bootstrap.ini
[Settings]
Priority=Default
[Default]
DeployRoot=\\SRV1\MDTBuildLab$
UserDomain=CONTOSO
UserID=MDT_BA
UserPassword=pass#word1
SkipBDDWelcome=YES
I have verified that the share "DeployRoot" exists and can be connected to using the provided credentials and that the share has the correct permissions to create/delete files.
Not sure what I'm missing, but my expectation was that a .wim should have been created in \\SRV1\MDTBuildLab$\Captures, yet there is nothing in that folder.
Just before stopping, the deployment wizard reboots several times in quick succession, which doesn't appear correct to me, but as I have never witnessed a successful capture I can't say for sure that this isn't what's supposed to happen.
I'm not even sure where I can view any log files to figure out why it fails.
Any assistance appreciated!
Further Info:
Activated monitoring. It gets to step 86 of 93. The last thing I see is "Applying WinPE (BD)" or something similar, and then it restarts. Then several quick reboots occur (the loading bar appears for a second or two and then it reboots), which I think are failing, and finally it gives up. The process never completes!
When I attempt to mount the client REFW10X64-001.vhdx to check the logs, I am greeted with this message:
The disk image isn't initialized, contains partitions that aren't recognizable, or contains volumes that haven't been assigned drive letters. Please use the Disk Management snap-in to make sure that the disk, partitions, and volumes are in a usable state.
So it looks like the last step totally screwed the disk, which would explain the last several boots failing to load anything.
So: no errors, no warnings, no logs, no finish, and no .wim generated.
How do I troubleshoot this?
I know this post is old, but the normal behavior would be as follows:
Using the boot image, you boot into WinPE
The task sequence is started and the OS gets applied to the disk
Reboot
Boot into full Windows where the task sequence also continues
Under full Windows, one of the last steps is that WinPE gets applied again
Reboot
Computer boots automatically into WinPE
The .wim file gets created (WinPE runs from a RAM disk, and the regular C: drive, plus any additional drives, is mirrored into the .wim file)
Computer performs the FINISHACTION.
We would need at least BDD.log and smsts.log to further troubleshoot. My guess is that WinPE was not applied correctly.
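For reference, the LiteTouch logs (including BDD.log and smsts.log) usually end up under MININT\SMSOSD\OSDLOGS on either the WinPE RAM disk or the client's C: drive, depending on the phase. A rough sketch of pulling them off from a WinPE command prompt on the client, reusing the share and credentials from the Bootstrap.ini above (the Z: drive letter is just an example):

dir X:\MININT\SMSOSD\OSDLOGS
dir C:\MININT\SMSOSD\OSDLOGS
net use Z: \\SRV1\MDTBuildLab$ /user:CONTOSO\MDT_BA
copy C:\MININT\SMSOSD\OSDLOGS\*.log Z:\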

FabricDCA and MaxDiskQuotaInMB Configuration

There are two parts to this question. First, what falls under the purview of the Diagnostics MaxDiskQuotaInMB configuration? Is it everything under SvcFab/Log? Just SvcFab/Log/AppInstanceData/? Having more info on this would be nice.
Second, what is the proper course of action if the FabricDCA.exe is running but the SvcFab/Log and SvcFab/Log/AppInstanceData/ folders exceed the limits we've set on their size? My team set them to 10,000 MB, but SvcFab/Log regularly takes up 12-16 GB.
The cluster configuration on Azure recognizes the change to the MaxDiskQuotaInMB configuration but there seems to be no impact on the node itself. I've tried resetting FabricDCA.exe as well and so far it has not helped either (after several hours).
One node in our cluster had so much space taken up by logs (over our limit) that remaining storage space was reduced to 1 MB.
Posting a more complete answer since it may be helpful to other people.
Most of the things under SvcFab/Log folder should fall under the quota set by MaxDiskQuotaInMB. There are a few things that may not, but the majority of things that usually take disk space are included. Keep in mind also that the task cleaning the disk usually runs every 5 minutes so you may see usage go over the quota within this timeframe.
If FabricDCA.exe is not properly cleaning files from this folder, it is possible that you are hitting a bug in the .NET runtime where all System.Threading.Timer instances stop firing, so the disk never gets cleaned, because FabricDCA relies on these timers to do so.
This is the bug on the .NET core side tracking the issue: (https://github.com/dotnet/coreclr/issues/26771). It seems to happen when the machine is running out of memory intermittently.
There is an auto-mitigation added in FabricDCA in Service Fabric 7.0.
The manual mitigation is usually to kill the FabricDCA.exe process.
The process should start again, and after a few minutes it will start cleaning again.
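A sketch of doing that from an elevated PowerShell session on the affected node (as noted above, the process comes back on its own):

Stop-Process -Name FabricDCA -Force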
You mentioned that you already tried killing FabricDCA.exe, so maybe the solution above does not work for you. In that case, try taking a look at the Service Fabric cluster manifest directly: it might be that your new configuration seems to be accepted by the ARM template deployment but never reaches the cluster manifest, which is the source of truth in this case.
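A sketch of dumping the manifest from a PowerShell session with the Service Fabric SDK module available (the endpoint is an assumption; from a node, the local endpoint works):

Connect-ServiceFabricCluster -ConnectionEndpoint "localhost:19000"
Get-ServiceFabricClusterManifest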
Update:
There was a regression introduced as part of the auto-mitigation above which caused the AppInstanceData folder to fill up the disk. This is fixed in SF version 7.0.466.

How can I get log rotation working inside a kubernetes container/pod?

Our setup:
We are using kubernetes in GCP.
We have pods that write logs to a shared volume, with a sidecar container that sucks up our logs for our logging system.
We cannot just use stdout instead for this process.
Some of these pods are long-lived and are filling up disk space because there is no log rotation.
Question:
What is the easiest way to prevent the disk space from filling up here (without scheduling pod restarts)?
I have been attempting to install logrotate using RUN apt-get install -y logrotate in our Dockerfile and placing a logrotate config file in /etc/logrotate.d/dynamicproxy, but it doesn't seem to get run; /var/lib/logrotate/status never gets generated.
I feel like I am barking up the wrong tree or missing something integral to getting this working. Any help would be appreciated.
We ended up writing our own daemonset to properly collect the logs from the nodes instead of at the container level. We then stopped writing to shared volumes from the containers and logged to stdout only.
We used fluentd to ship the logs around.
https://github.com/splunk/splunk-connect-for-kubernetes/tree/master/helm-chart/splunk-kubernetes-logging
In general, you should write logs to stdout and configure a log collection tool like the ELK stack. This is the best practice.
However, if you want to run logrotate as a separate process in your container, you may use Supervisor, which serves as a very simple init system and allows you to run as many parallel processes in the container as you want.
A simple example of using Supervisor to rotate Nginx logs can be found here: https://github.com/misho-kr/docker-appliances/tree/master/nginx-nodejs
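If you go that route, a minimal supervisord.conf sketch might look roughly like this (program names, paths, and the hourly interval are assumptions, not taken from the linked example):

[supervisord]
nodaemon=true

[program:app]
command=/usr/local/bin/my-app        ; the container's main process (hypothetical)

[program:logrotate]
; invoke logrotate once an hour against the standard config
command=/bin/sh -c "while true; do logrotate /etc/logrotate.conf; sleep 3600; done"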
If you write to the filesystem, the application creating the logs should be responsible for rotation. If you are running a Java application with Logback or Log4j, it is a simple configuration change (see the sketch below). For other languages/frameworks it is usually similar.
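A minimal Logback sketch of such a change, assuming size-and-time-based rolling (file names, sizes, and history counts are assumptions):

<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>/var/log/app/app.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
    <!-- roll daily or when the active file reaches 50 MB; keep 10 days, cap total size -->
    <fileNamePattern>/var/log/app/app.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
    <maxFileSize>50MB</maxFileSize>
    <maxHistory>10</maxHistory>
    <totalSizeCap>1GB</totalSizeCap>
  </rollingPolicy>
  <encoder>
    <pattern>%d %-5level [%thread] %logger - %msg%n</pattern>
  </encoder>
</appender>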
If that is not an option, you could use a specialized tool to handle the rotation and pipe the output to it. One example would be multilog from daemontools: http://cr.yp.to/daemontools/multilog.html
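multilog reads the log stream on stdin, so you pipe the application into it; a sketch (the file count and directory are assumptions, and my-app is a placeholder):

# keep at most 10 rotated files of ~16 MB each (multilog's maximum per-file size) in /var/log/myapp
my-app 2>&1 | multilog t s16777215 n10 /var/log/myapp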
As a method of last resort, you could investigate logging into a named pipe (FIFO) instead of a real file and have some other process handle the retrieval and writing of the data, including the rotation.
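A rough sketch of that idea, reusing multilog as the rotating reader (paths are illustrative, and the reader would need to be supervised so it is restarted if it exits):

# create the FIFO the application will treat as its log file
mkfifo /var/log/myapp/app.pipe
# a separate process drains the FIFO and handles the rotation
multilog t s16777215 n10 /var/log/myapp/rotated < /var/log/myapp/app.pipe &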

Run out of storage on Service Fabric scale set

I've run out of storage on my Azure Service Fabric scale sets, so I can no longer deploy any updates. I'm guessing this is because SF is keeping track of all the deployments and using up space.
Can anyone tell me if there is:
1) A way to tell Service Fabric to delete old deployments (say, older than 10 days).
2) A way to increase the storage available on the scale sets (Service Fabric is currently using the OS disk for deployments).
Regarding your first question:
There is no way to tell SF to auto-delete old packages based on age; you can either:
Do upgrades using the flag -UnregisterUnusedApplicationVersionsAfterUpgrade = $true when running the Deploy-FabricApplication.ps1 script
Update the Deploy-FabricApplication.ps1 script, or create a scheduled script that checks for unused packages older than a specific version and unregisters them (see the sketch below), something like what is described in this SO answer
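A sketch of the manual cleanup such a script would perform, using the Service Fabric PowerShell cmdlets (the application type name and version are placeholders):

Connect-ServiceFabricCluster   # connects to the local/dev cluster; add -ConnectionEndpoint for a remote one
Get-ServiceFabricApplicationType   # list registered application type versions
Unregister-ServiceFabricApplicationType -ApplicationTypeName "MyAppType" -ApplicationTypeVersion "1.0.0" -Force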
Regarding the second question:
Yes, you can change the disk size via an ARM template update (see the sketch below).
But the issue might also be the size of the logs; taking a look at this question might help you solve the problem without bigger disks.
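A minimal sketch of the relevant piece of a VMSS ARM template; the 128 GB value is an assumption:

"virtualMachineProfile": {
  "storageProfile": {
    "osDisk": {
      "createOption": "FromImage",
      "caching": "ReadWrite",
      "diskSizeGB": 128
    }
  }
}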

How to scale MongoDB?

I know that MongoDB can scale vertically, but what about when I am running out of disk?
I am currently using EC2 with EBS. As you know, I have to provision EBS at a fixed size.
What if MongoDB grows bigger than the EBS size? Do I have to create a larger EBS volume and copy the files over?
Or should we start more MongoDB instances, each connected to a different EBS disk? In that case, I could connect to a different instance for different databases.
If you're running out of disk, you obviously need to get a bigger disk.
There are several ways to migrate your data; it really depends on the kind of up-time you need. The first steps, of course, involve bundling the machine and creating the new volume.
These tips go from easiest to hardest.
Can you take the database completely off-line for several minutes?
If so, do this (migration by copy):
Mount new EBS on the server.
Stop your app from connecting to Mongo.
Shut down mongod and wait for everything to write (check the logs)
Copy all of the data files (and probably the logs) to the new EBS volume.
While the copy is happening, update your mongod start script (or config file) to point to the new volume.
Start mongod and check connection
Restart your app.
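A condensed shell sketch of those steps; the device name, mount points, and service name are assumptions:

# 1. attach and mount the new EBS volume
sudo mkfs.ext4 /dev/xvdf && sudo mount /dev/xvdf /data/mongo-new
# 2./3. stop the app, then shut down mongod cleanly and watch the log for the final flush
sudo service mongod stop
# 4. copy the data files (and logs) to the new volume
sudo rsync -a /data/mongo/ /data/mongo-new/
# 5. point dbpath in the config file or start script at /data/mongo-new
# 6./7. start mongod again, verify you can connect, then restart the app
sudo service mongod start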
Can you take the database off-line for just a few minutes?
If so, do this (slaving and switch):
Start up a new instance and mount the new EBS on that server.
Install / start mongod as a --slave pointing at the current database. (you may need to re-start the current as --master)
The slave will do a fresh synchronization. Once the slave is up-to-date, you'll do a "switch" (next steps).
Turn off writes from the system.
Shut down the original mongod process.
Re-start the "new" mongod as a master instead of the slave.
Re-activate system writes pointing at the new master.
Done correctly those last three steps can happen in minutes or even seconds.
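For reference, the legacy options this answer refers to look roughly like this (they were later removed from MongoDB in favor of replica sets; hosts and paths are illustrative):

# on the new instance: do a fresh sync from the current server
mongod --dbpath /data/mongo-new --slave --source current-db-host:27017
# after the switch, re-start the new instance as the master
mongod --dbpath /data/mongo-new --master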
Can you not afford any down-time?
If so, do this (master-master):
Start up a new instance and mount the new EBS on that server.
Install / start mongod as a master and a slave against the current database. (may need to re-start current as master, minimal down-time?)
The new computer should do a fresh synchronization.
Once the new computer is up-to-date, switch the system to point at the new server.
I know it seems like this last version is actually the best, but it can be a little dicey (as of this writing). The reason is simply that I've honestly had a lot of issues with "Master-Master" replication, especially if you don't start with both active.
If you plan on using this method, I highly suggest a smaller practice run first. If something bombs here, Mongo might simply wipe all of your data files which will have the effect of taking more stuff down.
If you get a good version of this please post the commands, I'd like to see it in action.
Doesn't the E in EBS stand for "elastic", meaning something like resizing on the fly?
Currently the MongoDB team is working on finishing sharding, which will give you horizontal scaling by partitioning data across different servers. Give it a month or two and it will work fine. The developers are quite good at keeping their promises.
http://api.mongodb.org/wiki/current/Sharding%20Introduction.html
http://api.mongodb.org/wiki/current/Sharding%20Limits.html
You could slave the bigger disk off the smaller until it's caught up
or
fsync+lock and take a file system snapshot and copy it onto the bigger disk.
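The fsync+lock part looks roughly like this in the mongo shell (helper names as in current shells; older versions used the fsync command with lock:1 directly):

// flush pending writes and block new ones while the snapshot is taken
db.fsyncLock()
// take the filesystem/EBS snapshot here, then release the lock
db.fsyncUnlock()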
Well, I am using MongoDB now. I am pretty amazed by the performance it delivers, especially on some simple sorting.
I believe it's a good tool for simple web application logic. The remaining concern for me is how to scale and back up. I will continue to explore.
The only disadvantage I have is that I don't have any good tools to inspect the data stored inside. For example, I want to put my logging from MySQL into Mongo as well. However, it's pretty difficult for me to view the logs. Previously, I could use a MySQL query to fetch what I want easily.
Anyway, it's a good tool and I will continue to use it.