List of all Kubernetes pod statuses [duplicate]

This question already has answers here:
List of all possible status / reasons in Kubernetes
I am writing some monitoring tools that will track and alert on Kubernetes pod statuses. I know of a handful of common ones (e.g. Running, CrashLoopBackOff, ImagePullBackOff, Exiting, etc.), but I can't seem to find a full list.

Answer as of master commit id 8a5e9ecf8febdfe5ca72d4d99340ce22cbf55cbb:
// Container event reason list
CreatedContainer = "Created"
StartedContainer = "Started"
FailedToCreateContainer = "Failed"
FailedToStartContainer = "Failed"
KillingContainer = "Killing"
PreemptContainer = "Preempting"
BackOffStartContainer = "BackOff"
ExceededGracePeriod = "ExceededGracePeriod"
// Pod event reason list
FailedToKillPod = "FailedKillPod"
FailedToCreatePodContainer = "FailedCreatePodContainer"
FailedToMakePodDataDirectories = "Failed"
NetworkNotReady = "NetworkNotReady"
// Image event reason list
PullingImage = "Pulling"
PulledImage = "Pulled"
FailedToPullImage = "Failed"
FailedToInspectImage = "InspectFailed"
ErrImageNeverPullPolicy = "ErrImageNeverPull"
BackOffPullImage = "BackOff"
// kubelet event reason list
NodeReady = "NodeReady"
NodeNotReady = "NodeNotReady"
NodeSchedulable = "NodeSchedulable"
NodeNotSchedulable = "NodeNotSchedulable"
StartingKubelet = "Starting"
KubeletSetupFailed = "KubeletSetupFailed"
FailedAttachVolume = "FailedAttachVolume"
FailedDetachVolume = "FailedDetachVolume"
FailedMountVolume = "FailedMount"
VolumeResizeFailed = "VolumeResizeFailed"
VolumeResizeSuccess = "VolumeResizeSuccessful"
FileSystemResizeFailed = "FileSystemResizeFailed"
FileSystemResizeSuccess = "FileSystemResizeSuccessful"
FailedUnMountVolume = "FailedUnMount"
FailedMapVolume = "FailedMapVolume"
FailedUnmapDevice = "FailedUnmapDevice"
WarnAlreadyMountedVolume = "AlreadyMountedVolume"
SuccessfulDetachVolume = "SuccessfulDetachVolume"
SuccessfulAttachVolume = "SuccessfulAttachVolume"
SuccessfulMountVolume = "SuccessfulMountVolume"
SuccessfulUnMountVolume = "SuccessfulUnMountVolume"
HostPortConflict = "HostPortConflict"
NodeSelectorMismatching = "NodeSelectorMismatching"
InsufficientFreeCPU = "InsufficientFreeCPU"
InsufficientFreeMemory = "InsufficientFreeMemory"
NodeRebooted = "Rebooted"
ContainerGCFailed = "ContainerGCFailed"
ImageGCFailed = "ImageGCFailed"
FailedNodeAllocatableEnforcement = "FailedNodeAllocatableEnforcement"
SuccessfulNodeAllocatableEnforcement = "NodeAllocatableEnforced"
UnsupportedMountOption = "UnsupportedMountOption"
SandboxChanged = "SandboxChanged"
FailedCreatePodSandBox = "FailedCreatePodSandBox"
FailedStatusPodSandBox = "FailedPodSandBoxStatus"
// Image manager event reason list
InvalidDiskCapacity = "InvalidDiskCapacity"
FreeDiskSpaceFailed = "FreeDiskSpaceFailed"
// Probe event reason list
ContainerUnhealthy = "Unhealthy"
// Pod worker event reason list
FailedSync = "FailedSync"
// Config event reason list
FailedValidation = "FailedValidation"
// Lifecycle hooks
FailedPostStartHook = "FailedPostStartHook"
FailedPreStopHook = "FailedPreStopHook"
UnfinishedPreStopHook = "UnfinishedPreStopHook"
This can be found in https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/events/event.go
Unfortunately, I'm not sure which release versions have this.
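If you want to see what your own cluster actually reports rather than rely on a source listing, here is a minimal sketch (my addition, not part of the answer above) using the official Kubernetes Python client. It assumes a working kubeconfig, and the exact reasons you observe will depend on your cluster version:
# Minimal sketch: enumerate pod phases and container waiting/terminated reasons.
# Assumes a working kubeconfig; field names follow the public Pod API.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in a pod
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces().items:
    phase = pod.status.phase  # Pending, Running, Succeeded, Failed, Unknown
    reasons = []
    for cs in (pod.status.container_statuses or []):
        if cs.state.waiting:       # e.g. CrashLoopBackOff, ImagePullBackOff
            reasons.append(cs.state.waiting.reason)
        elif cs.state.terminated:  # e.g. Completed, Error, OOMKilled
            reasons.append(cs.state.terminated.reason)
    print(pod.metadata.namespace + "/" + pod.metadata.name, phase, reasons)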

Related

Get AKS public IP / load balancer IP of a Kubernetes cluster after applying Bedrock Terraform

Currently I'm applying the following Terraform template in order to create a Kubernetes cluster, and everything works as I expect.
module "subnet" {
source = "git::https://github.com/microsoft/bedrock//cluster/azure/subnet/?ref=master"
subnet_name = var.subnet_name
vnet_name = var.vnet_name
resource_group_name = data.azurerm_resource_group.keyvault.name
address_prefixes = [var.subnet_prefix]
}
module "aks-gitops" {
source = "git::https://github.com/microsoft/bedrock//cluster/azure/aks-gitops/?ref=master"
acr_enabled = var.acr_enabled
agent_vm_count = var.agent_vm_count
agent_vm_size = var.agent_vm_size
cluster_name = var.cluster_name
dns_prefix = var.dns_prefix
flux_recreate = var.flux_recreate
gc_enabled = var.gc_enabled
gitops_ssh_url = var.gitops_ssh_url
gitops_ssh_key_path = var.gitops_ssh_key_path
gitops_path = var.gitops_path
gitops_poll_interval = var.gitops_poll_interval
gitops_label = var.gitops_label
gitops_url_branch = var.gitops_url_branch
kubernetes_version = var.kubernetes_version
resource_group_name = data.azurerm_resource_group.cluster_rg.name
service_principal_id = var.service_principal_id
service_principal_secret = var.service_principal_secret
ssh_public_key = var.ssh_public_key
vnet_subnet_id = module.subnet.subnet_id
network_plugin = var.network_plugin
network_policy = var.network_policy
oms_agent_enabled = var.oms_agent_enabled
}
The next step in Terraform is to configure the CDN/domain setup, which requires the public IP address (already created in the steps above under module "aks-gitops"), but the module outputs do not seem to return that IP address.
Any ideas? I've already dug through every resource on the internet.
Every comment is appreciated!
Thanks, mates!
To retrieve the FQDN which resolves to the public IP of the cluster, create a data resource that references the newly created cluster.
data "azurerm_kubernetes_cluster" "aks-cluster" {
name = var.cluster_name
resource_group_name = data.azurerm_resource_group.cluster_rg.name
}
The FQDN of the newly created cluster can then be accessed via data.azurerm_kubernetes_cluster.aks-cluster.fqdn.
You can follow a similar pattern to retrieve details of a load balancer, or any other resource that is not returned in the module outputs.

Zope/Plone code reload after deployment on production

Is there a way to reload the code without restarting Zope in production?
New features are implemented almost every two days and have to be uploaded to the server. The only way it currently works is by restarting the ZEO server and all instances. I can't use "plone.reload" as it only works in the development environment when debug mode is on. Below is the buildout.cfg content:
[buildout]
parts =
# instance
zeo
client1
client2
client3
zopepy
zopeskel
test
# mysql
# varnish-build
# varnish
supervisor
pidproxy
extends =
https://dist.plone.org/versions/zope-2-13-19-versions.cfg
find-links =
https://dist.plone.org/release/4.2.4
https://dist.plone.org/thirdparty
extensions =
mr.developer
# buildout.dumppickedversions
sources = sources
versions = versions
develop =
[versions]
plone.recipe.zeoserver = 1.3.1
plone.recipe.zope2instance = 4.2.8
five.localsitemanager = 2.0.5
Products.PluginRegistry = 1.3
Products.CMFCore = 2.2.7
Products.GenericSetup = 1.7.3
Products.ZSQLMethods = 2.13.4
zope.interface = 3.6.7
zope.app.publication = 3.12.0
#setuptools = 17.1.1
funcsigs = 0.4
openpyxl = 2.4.0
plone.reload = 2.0.2
[zeo]
recipe = plone.recipe.zeoserver
zeo-address = 127.0.0.1:9100
zeo-var = ${buildout:directory}/var
blob-storage = ${zeo:zeo-var}/blobstorage
#ggs = plone.app.blob
[client1]
recipe = plone.recipe.zope2instance
http-address = 9081
zeo-client = on
zeo-address = ${zeo:zeo-address}
shared-blob = on
blob-storage = ${zeo:zeo-var}/blobstorage
user = admin:Slick_RP#21!
products = ${buildout:directory}/matrix_git/prod/
debug-mode = off
verbose-security = off
eggs =
# pillow
mysql-python
simplejson
haversine
openpyxl
requests
httpagentparser
ordereddict
python-memcached
# python-crontab
# setuptools
Products.CMFCore
Products.ZMySQLDA
# Products.SQLAlchemyDA
Products.PluggableAuthService
# Products.ZopeProfiler
# Products.MemoryProfiler
# reportlab
Products.BeakerSessionDataManager
collective.fsexternalmethod
plone.reload
zope-conf-additional =
extensions ${buildout:directory}/matrix_git/Extensions
<product-config beaker>
session.type file
session.data_dir ${buildout:directory}/var/sessions/data
session.lock_dir ${buildout:directory}/var/sessions/lock
session.key beaker.session
session.secret secret
</product-config>
zcml =
collective.fsexternalmethod
plone.reload
event-log-max-size = 5 MB
event-log-old-files = 5
access-log-max-size = 20 MB
access-log-old-files = 10
[client2]
recipe = plone.recipe.zope2instance
http-address = 9082
zeo-client = ${client1:zeo-client}
zeo-address = ${client1:zeo-address}
blob-storage = ${client1:blob-storage}
shared-blob = ${client1:shared-blob}
user = ${client1:user}
products = ${client1:products}
debug-mode = off
verbose-security = off
eggs = ${client1:eggs}
zcml = ${client1:zcml}
zope-conf-additional = ${client1:zope-conf-additional}
event-log-max-size = ${client1:event-log-max-size}
event-log-old-files = ${client1:event-log-old-files}
access-log-max-size = ${client1:access-log-max-size}
access-log-old-files = ${client1:access-log-old-files}
[client3]
recipe = plone.recipe.zope2instance
http-address = 9083
zeo-client = ${client1:zeo-client}
zeo-address = ${client1:zeo-address}
blob-storage = ${client1:blob-storage}
shared-blob = ${client1:shared-blob}
user = ${client1:user}
products = ${client1:products}
debug-mode = off
verbose-security = off
eggs = ${client1:eggs}
zcml = ${client1:zcml}
zope-conf-additional = ${client1:zope-conf-additional}
event-log-max-size = ${client1:event-log-max-size}
event-log-old-files = ${client1:event-log-old-files}
access-log-max-size = ${client1:access-log-max-size}
access-log-old-files = ${client1:access-log-old-files}
[zopepy]
recipe = zc.recipe.egg
eggs = ${client1:eggs}
interpreter = zopepy
scripts = zopepy
[test]
recipe = zc.recipe.testrunner
defaults = ['--auto-color', '--auto-progress']
eggs =
${client1:eggs}
[zopeskel]
recipe = zc.recipe.egg
eggs =
ZopeSkel
PasteScript
[mysql]
recipe = zest.recipe.mysql
# Note that these urls usually stop working after a while... thanks...
mysql-url = http://downloads.mysql.com/archives/mysql-5.0/mysql-5.0.86.tar.gz
mysql-python-url = http://pypi.python.org/packages/source/M/MySQL-python/MySQL-python-1.2.3.tar.gz
[varnish-build]
recipe = zc.recipe.cmmi
url = ${varnish:download-url}
[varnish]
recipe = plone.recipe.varnish
daemon = ${buildout:parts-directory}/varnish-build/sbin/varnishd
bind = 127.0.0.1:8000
backends = 127.0.0.1:8080
cache-size = 50M
[pidproxy]
recipe = zc.recipe.egg
eggs = supervisor
scripts = pidproxy
[supervisor]
recipe = collective.recipe.supervisor
port = 127.0.0.1:24007
serverurl = http://127.0.0.1:24007
programs =
# 10 mysql ${buildout:directory}/bin/pidproxy [${buildout:directory}/var/mysql/mysql.pid ${buildout:directory}/parts/mysql/install/bin/mysqld_safe --pid-file=${buildout:directory}/var/mysql/mysql.pid --socket=${buildout:directory}/var/mysql.socket] ${buildout:directory} true
20 zeo ${buildout:directory}/bin/zeo [console] ${buildout:directory} true
30 client1 ${buildout:directory}/bin/client1 [console] ${buildout:directory} true
40 client2 ${buildout:directory}/bin/client2 [console] ${buildout:directory} true
50 client3 ${buildout:directory}/bin/client3 [console] ${buildout:directory} true
If you are deploying this frequently, you can deploy at low-traffic times (e.g. at night).
If the website must always be up, you could run two sets of Plone instances: one set active and serving requests, the other inactive.
When updating, the offline servers are updated, and once they are done a switch is flipped (HAProxy, for example) to replace the active servers.
You could even keep all servers available at all times, but take some offline while they are being updated.
As others, and you yourself, have pointed out, I would never use plone.reload or similar development tools in production.
Yes, there is a way, although I'd never do it in production; it's a great time-saver when developing. Do the reload within a browser view:
from plone.reload.code import reload_code
from Products.Five.browser import BrowserView

class View(BrowserView):
    def __call__(self):
        reload_code()
        return 'Code loaded.'
Then call the view on the site under the name you registered it with. This even works in non-debug mode while the instance is running in the background. Tested with a standalone instance (non-ZEO).
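For example (a hypothetical usage sketch I'm adding; the port, site id, view name and credentials below are placeholders, not values from the answer), the view could then be triggered remotely over HTTP:
# Hypothetical usage sketch: call the reload browser view after a deployment.
# Port, site id, view name and credentials are placeholders.
import requests

resp = requests.get(
    "http://localhost:9081/Plone/@@reload-code",
    auth=("admin", "password"),
)
print(resp.status_code, resp.text)  # expect 200 and 'Code loaded.'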

Cannot convert GSM to Unicode

Dear all,
I am using the Kannel 1.5.0 gateway with SMPP on RHEL 6, and when I receive an SMS I get these errors:
2016-01-28 13:28:07 [8613] [6] WARNING: Could not convert GSM (0xd4) to Unicode.
2016-01-28 13:28:07 [8613] [6] WARNING: Could not convert GSM (0xf2) to Unicode.
.....
and the messages arrive at my application garbled. Here is the captured request:
http://127.0.0.1:9091/services/smsReceive?msisdn=%2B353872849216&coding=0&smsText=%C3%85%3CH%C3%B9a%C3%91%C3%B9%25evM%C3%B9)zX%C3%ACp&DCS=-1&charset=UTF-8'
and this is my Kannel configuration:
group = core
admin-port = 13001
smsbox-port = 13002
admin-password = bar
log-file = "/home/user/logs/kannellogs/SmscGateway.log"
log-level = 0
box-deny-ip = "*.*.*.*"
box-allow-ip = "127.0.0.1;172.*.*.*;192.*.*.*;10.*.*.*"
admin-allow-ip = "127.0.0.1;172.*.*.*;192.*.*.*;10.*.*.*"
admin-deny-ip = "*.*.*.*"
access-log = "/home/user/logs/kannellogs/access.log"
# SMSBOX SETUP
group = smsbox
bearerbox-host = localhost
sendsms-port = 13013
log-file="/home/user/logs/kannellogs/smsbox.log"
log-level = 0
access-log="/home/user/logs/kannellogs/sms_access.log"
reply-couldnotfetch = "Service is down, please try again later.(notfetch)"
reply-couldnotrepresent = "Service is down, please try again later.(notrepresent)"
reply-requestfailed = "Service is down, please try again later.(failed)"
reply-emptymessage = ""
mo-recode = true
# SEND-SMS USERS
group = sendsms-user
username=test
password=test
user-allow-ip = "*.*.*.*"
concatenation = true
split-chars = "#!^&*("
max-messages = 10
# SMPP PARAMETERS for SMSC account
group = smsc
smsc = smpp
smsc-id =Smsc12345
smsc-username = Voda
smsc-password = 12345678
host = 123.222.111.11
port = 1040
system-type = Vodafone403
interface-version = 34
source-addr-autodetect = false
source-addr-ton = 0
source-addr-npi = 1
dest-addr-ton = 1
dest-addr-npi = 1
reconnect-delay = false
reconnect-delay = 10
transceiver-mode = true
throughput = 10
address-range = "^12345$"
max-pending-submits = 3
group = sms-service
accepted-smsc = "Smsc12345"
keyword = default
get-url = "http://127.0.0.1:9091/services/smsReceive?msisdn=%p&coding=%c&smsText=%a&DCS=%m&charset=%C"
catch-all=true
max-messages = 0
I am new to Kannel; please help if I am doing anything wrong.
You should check the Kannel docs:
for a "normal" message, it will be "GSM" (coding=0), "binary" (coding=1), or "UTF-16BE" (coding=2).
What I see in your URL is
&coding=0
when it should be
&coding=2
Also take care that the text is URL-encoded correctly, and watch the length of Unicode messages (if you are using aggregators, not all of them support concatenation and long messages).
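As an illustration (a sketch I'm adding, not part of the original answer): once the SMSC delivers the text as UCS-2 (coding=2), the receiving application can decode the percent-encoded payload as UTF-16BE. The sample value below is made up:
# Sketch: decode a percent-encoded UCS-2 (UTF-16BE) smsText parameter.
# The sample value is made up for illustration.
from urllib.parse import unquote_to_bytes

raw = "%04%1F%04%40%04%38%04%32%04%35%04%42"  # hypothetical UCS-2 payload
text = unquote_to_bytes(raw).decode("utf-16-be")
print(text)  # -> "Привет" for this sample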
Hope it helps.
Vedran

Quartz schedulers controlled from an external app

I am currently working with Quartz.NET (version 2.3.1). I have created different schedulers with different jobs using the code below (for each scheduler):
NameValueCollection properties = new NameValueCollection();
properties["quartz.scheduler.instanceName"] = "QuartzSchedulerTest";
properties["quartz.scheduler.instanceId"] = AUTO;
properties["quartz.threadPool.type"] = "Quartz.Simpl.SimpleThreadPool, Quartz";
properties["quartz.threadPool.threadPriority"] = "Normal";
properties["quartz.jobStore.misfireThreshold"] = "60000";
properties["quartz.jobStore.clustered"] = "true";
properties["quartz.jobStore.tablePrefix"] = "QRTZ_";
properties["quartz.jobStore.type"] = "Quartz.Impl.AdoJobStore.JobStoreTX, Quartz";
properties["quartz.jobStore.dataSource"] = "default";
properties["quartz.jobStore.useProperties"] = "false";
properties["quartz.jobStore.driverDelegateType"] = "Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz";
properties["quartz.dataSource.default.connectionString"] = "myConnString"
properties["quartz.dataSource.default.provider"] = "SqlServer-20";
// Get scheduler
ISchedulerFactory sf = new StdSchedulerFactory(properties);
IScheduler scheduler = sf.GetScheduler();
Now I have all the scheduling information stored in a SQL database and everything works.
I created a new console application because I need to manage all the schedulers (get the scheduler list, the jobs for each scheduler, send commands to pause and resume triggers, etc.).
This is the code I wrote to try to get handles to all existing schedulers:
NameValueCollection properties = new NameValueCollection();
properties["quartz.threadPool.type"] = "Quartz.Simpl.SimpleThreadPool, Quartz";
properties["quartz.threadPool.threadPriority"] = "Normal";
properties["quartz.jobStore.misfireThreshold"] = "60000";
properties["quartz.jobStore.clustered"] = "true";
properties["quartz.jobStore.tablePrefix"] = "QRTZ_";
properties["quartz.jobStore.type"] = "Quartz.Impl.AdoJobStore.JobStoreTX, Quartz";
properties["quartz.jobStore.dataSource"] = "default";
properties["quartz.jobStore.useProperties"] = "false";
properties["quartz.jobStore.driverDelegateType"] = "Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz";
properties["quartz.dataSource.default.connectionString"] = "myConnString"
properties["quartz.dataSource.default.provider"] = "SqlServer-20";
// Get scheduler
ISchedulerFactory sf = new StdSchedulerFactory(properties);
var schedulers = sf.AllSchedulers;
But no handles are returned (the scheduler count is 0). Can anyone tell me how I can get all the schedulers? Is it possible?
Sorry for my English, and thanks in advance.
You have to connect to each scheduler instance directly using remoting. The schedulers are not aware of each other and there is no way to get a list of all of the schedulers that are in a cluster.
Once you connect to each scheduler then you'll be able to pull a list of running jobs and manipulate the job schedule as necessary. If all of the schedulers are in a cluster, then you don't have to connect to all of them to manipulate the jobs themselves. You can do that from any of the instances. However, the list of running jobs has to be compiled by asking each scheduler individually.

One node of a cluster does not show up in the Ganglia web portal

In Ganglia, I have configured 2 clusters. Cluster A has 2 nodes and cluster B has 13 nodes. Cluster B works well, while cluster A shows only 1 node. The other node has exactly the same gmond.conf file, which is shown below:
globals {
daemonize = yes
setuid = yes
user = ganglia
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
host_dmax = 0 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
send_metadata_interval = 0
}
cluster {
#name = "unspecified"
name = "rpt"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
host {
location = "unspecified"
}
udp_send_channel {
#mcast_join = 239.2.11.71
host = qt-dw-master
port = 8557
ttl = 1
}
/*
udp_recv_channel {
#mcast_join = 239.2.11.71
port = 8557
#bind = 239.2.11.71
#bind = qt-dw-master
}
*/
tcp_accept_channel {
port = 8557
}
gmetad.conf on qt-dw-master is shown below:
data_source "rpt" 60 rpt0:8557 rpt1-db:8557
I have tried using multicast, but it does not work. I also wanted to find the gmond log files, but failed. Can anyone help with this problem?
Are all gmond daemons running in cluster A? Use the command "service gmond status" to confirm it.
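As a quick complementary check (my addition, not part of the answer), you can also ask the missing node's gmond for its XML dump on the tcp_accept_channel port from the config above; the hostname below is a placeholder:
# Sketch: fetch gmond's XML dump over its tcp_accept_channel (port 8557 above).
# Replace the hostname with the node that is missing from the web portal.
import socket

with socket.create_connection(("rpt1-db", 8557), timeout=5) as s:
    data = b""
    while True:
        chunk = s.recv(4096)
        if not chunk:
            break
        data += chunk
print(data[:200].decode(errors="replace"))  # should start with the <GANGLIA_XML> dump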