Understanding get_misses in memaslap - memcached

I am trying to understand the behaviour of memcached by running memaslap against it.
The command I use is
memaslap -s localhost:11212 -T 64 -c 64 --verbose -S 1s -t 30s
and this is the output I get (on local machine)
Get Statistics (3157196 events)
Min: 36
Max: 17957
Avg: 545
Geo: 533.99
Std: 167.05
Log2 Dist:
4: 0 0 11 271
8: 2813 1379137 1706845 66407
12: 1110 446 152 4
Set Statistics (350829 events)
Min: 60
Max: 16184
Avg: 547
Geo: 536.33
Std: 161.65
Log2 Dist:
4: 0 0 3 28
8: 313 151210 191252 7861
12: 112 40 10
Total Statistics (3508025 events)
Min: 36
Max: 17957
Avg: 545
Geo: 534.22
Std: 167.17
Log2 Dist:
4: 0 0 14 299
8: 3126 1530347 1898097 74268
12: 1222 486 162 4
cmd_get: 3157252
cmd_set: 350837
get_misses: 1716802
written_bytes: 608669765
read_bytes: 1610227982
object_bytes: 381710656
Run time: 30.0s Ops: 3508089 TPS: 116914 Net_rate: 70.5M/s
As you can see, I have 1716802 get_misses, which is more than 50% of the get requests sent by memaslap. I am curious about what a get_miss means. I checked the docs at
http://docs.libmemcached.org/bin/memaslap.html
and they plainly say
get_misses
How many objects can’t be gotten from server
What does it mean to not get a key from the server? Is it because the key is not present, or is it because of latencies?
If it's because the key is not present, what exactly is memaslap testing by sending this request?

It says how many times we got a cache miss while running the get command; in other words, how many times a key was not found.
The main reason for cache misses is eviction: memcached throws some items out of memory to free space for new ones.
The number of get_misses should drop if memcached is given more RAM to use (for instance, -m 1024 for 1 GB of RAM).
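To confirm that on your setup, you can watch the server's own counters while memaslap runs. A quick sketch using the memcached text protocol (your netcat may need -q or -w so it closes the connection):
$ echo stats | nc -q 1 localhost 11212 | egrep 'get_hits|get_misses|evictions|curr_items|limit_maxbytes'
A growing evictions counter alongside get_misses points at memory pressure. As for the latency question: a miss means the server explicitly replied that the key is not there (an END with no VALUE in the text protocol); it is never a timeout.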

Related

Liferay 7, Hikari-Pool connection not available error, happen on production but not on pre-production environment

First of all, I have two environments configured strictly identically (except for IPs), with two VMs each: one environment in pre-production and one in production (currently in the configuration phase). In each there is one VM with a Liferay 7.0.6 Tomcat bundle (built from the 7.0.6 cumulative patch of the community-security-team) and another with PostgreSQL 9.4.26.
Everything works fine on pre-production environment.
On the production environment, a few hours after I began creating users in Liferay, I ran into this error (full stack trace at the end):
Caused by: java.sql.SQLTransientConnectionException: HikariPool-2 - Connection is not available, request timed out after 937980ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:591)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:194)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:146)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.getTargetConnection(LazyConnectionDataSourceProxy.java:403)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.invoke(LazyConnectionDataSourceProxy.java:376)
at com.sun.proxy.$Proxy7.prepareStatement(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:534)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:452)
at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:161)
at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1700)
at org.hibernate.loader.Loader.doQuery(Loader.java:801)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.loadEntity(Loader.java:2037)
... 50 more
So I checked for differences between my Liferay configuration on pre-production and the one on production using comparison software, and apart from the IPs I found nothing. Same for the PostgreSQL configurations on both environments.
I also checked time synchronization between the VMs; they are both synchronized via NTP against the Debian pool servers.
Database VM:
$ ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
0.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
1.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
2.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
3.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
+mail.klausen.dk 193.67.79.202 2 u 116 1024 377 14.435 -0.214 0.358
+any.time.nl 85.199.214.99 2 u 871 1024 377 1.666 -0.183 0.258
-rag.9t4.net 131.188.3.221 2 u 102 1024 377 16.491 -3.769 0.571
*ntp1.m-online.n 212.18.1.106 2 u 318 1024 377 16.608 -0.263 0.240
-tor-relais1.lin 131.188.3.223 2 u 199 1024 377 14.149 0.272 0.661
-www.kashra.com .DCFp. 1 u 150 1024 377 22.623 1.126 0.816
and Liferay VM:
$ ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
0.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
1.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
2.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
3.debian.pool.n .POOL. 16 p - 64 0 0.000 0.000 0.002
+stratum2-4.ntp. 129.70.137.82 2 u 200 1024 377 30.471 1.069 2.414
+138.201.16.225 131.188.3.221 2 u 696 1024 377 16.613 1.357 0.397
-kuehlich.com 131.188.3.221 2 u 373 1024 377 22.656 2.025 0.885
-time.cloudflare 10.21.8.19 3 u 566 1024 377 8.123 -0.640 0.277
*a.chl.la 131.188.3.222 2 u 167 1024 377 14.472 1.033 2.448
+195.50.171.101 145.253.3.52 2 u 266 1024 377 10.804 0.928 0.395
I also noticed an error line in the PostgreSQL log about a cancel request for an unknown PID, happening exactly that many (937980) milliseconds before the timeout error in Liferay:
LOG: PID 1767 in the cancel request does not match any process
I have tried re-installing Liferay from scratch, but nothing changed.
There must be a difference between pre-production and production, since everything works fine on pre-production, but I can't find it.
The HikariCP configuration in Liferay is the default on both environments:
jdbc.default.connectionTimeout=30000
jdbc.default.driverClassName=org.postgresql.Driver
jdbc.default.idleConnectionTestPeriod=60
jdbc.default.idleTimeout=600000
jdbc.default.initialPoolSize=10
jdbc.default.liferay.pool.provider=hikaricp
jdbc.default.maxActive=100
jdbc.default.maxIdleTime=3600
jdbc.default.maxLifetime=0
jdbc.default.maxPoolSize=100
jdbc.default.maximumPoolSize=100
jdbc.default.minIdle=10
jdbc.default.minPoolSize=10
jdbc.default.minimumIdle=10
and the same for PostgreSQL:
max_connections = 100
Full HikariPool stack trace from Liferay:
2021-06-30 00:12:32.397 ERROR [liferay/scheduler_dispatch-6][JDBCExceptionReporter:234] HikariPool-2 - Connection is not available, request timed out after 937980ms.
2021-06-30 00:12:32.401 ERROR [liferay/scheduler_dispatch-6][BasePersistenceImpl:264] Caught unexpected exception
com.liferay.portal.kernel.exception.SystemException: com.liferay.portal.kernel.dao.orm.ORMException: org.hibernate.exception.GenericJDBCException: could not load an entity: [com.liferay.counter.model.impl.CounterImpl#com.liferay.counter.kernel.model.Counter]
at com.liferay.portal.kernel.service.persistence.impl.BasePersistenceImpl.processException(BasePersistenceImpl.java:270)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._obtainIncrement(CounterFinderImpl.java:391)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._competeIncrement(CounterFinderImpl.java:339)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._competeIncrement(CounterFinderImpl.java:325)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl.increment(CounterFinderImpl.java:111)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl.increment(CounterFinderImpl.java:100)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl.increment(CounterFinderImpl.java:95)
at com.liferay.counter.service.impl.CounterLocalServiceImpl.increment(CounterLocalServiceImpl.java:42)
at sun.reflect.GeneratedMethodAccessor638.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.liferay.portal.spring.aop.ServiceBeanMethodInvocation.proceed(ServiceBeanMethodInvocation.java:163)
at com.liferay.portal.spring.transaction.CounterTransactionExecutor.execute(CounterTransactionExecutor.java:50)
at com.liferay.portal.spring.transaction.TransactionInterceptor.invoke(TransactionInterceptor.java:58)
at com.liferay.portal.spring.aop.ServiceBeanMethodInvocation.proceed(ServiceBeanMethodInvocation.java:137)
at com.liferay.portal.spring.aop.ServiceBeanAopProxy.invoke(ServiceBeanAopProxy.java:169)
at com.sun.proxy.$Proxy78.increment(Unknown Source)
at com.liferay.counter.kernel.service.CounterLocalServiceUtil.increment(CounterLocalServiceUtil.java:238)
at com.liferay.portal.kernel.systemevent.SystemEventHierarchyEntryThreadLocal.push(SystemEventHierarchyEntryThreadLocal.java:134)
at com.liferay.portal.kernel.systemevent.SystemEventHierarchyEntryThreadLocal.push(SystemEventHierarchyEntryThreadLocal.java:96)
at com.liferay.portal.repository.capabilities.TemporaryFileEntriesCapabilityImpl._runWithoutSystemEvents(TemporaryFileEntriesCapabilityImpl.java:313)
at com.liferay.portal.repository.capabilities.TemporaryFileEntriesCapabilityImpl.deleteExpiredTemporaryFileEntries(TemporaryFileEntriesCapabilityImpl.java:113)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener.deleteExpiredTemporaryFileEntries(TempFileEntriesMessageListener.java:111)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener$1.performAction(TempFileEntriesMessageListener.java:134)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener$1.performAction(TempFileEntriesMessageListener.java:130)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery.performAction(DefaultActionableDynamicQuery.java:405)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery$1.call(DefaultActionableDynamicQuery.java:315)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery$1.call(DefaultActionableDynamicQuery.java:277)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery.doPerformActions(DefaultActionableDynamicQuery.java:335)
at com.liferay.portal.kernel.dao.orm.DefaultActionableDynamicQuery.performActions(DefaultActionableDynamicQuery.java:86)
at com.liferay.document.library.web.internal.messaging.TempFileEntriesMessageListener.doReceive(TempFileEntriesMessageListener.java:139)
at com.liferay.portal.kernel.messaging.BaseMessageListener.receive(BaseMessageListener.java:26)
at com.liferay.portal.kernel.scheduler.messaging.SchedulerEventMessageListenerWrapper.receive(SchedulerEventMessageListenerWrapper.java:66)
at com.liferay.portal.kernel.messaging.InvokerMessageListener.receive(InvokerMessageListener.java:74)
at com.liferay.portal.kernel.messaging.ParallelDestination$1.run(ParallelDestination.java:52)
at com.liferay.portal.kernel.concurrent.ThreadPoolExecutor$WorkerTask._runTask(ThreadPoolExecutor.java:756)
at com.liferay.portal.kernel.concurrent.ThreadPoolExecutor$WorkerTask.run(ThreadPoolExecutor.java:667)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.liferay.portal.kernel.dao.orm.ORMException: org.hibernate.exception.GenericJDBCException: could not load an entity: [com.liferay.counter.model.impl.CounterImpl#com.liferay.counter.kernel.model.Counter]
at com.liferay.portal.dao.orm.hibernate.ExceptionTranslator.translate(ExceptionTranslator.java:34)
at com.liferay.portal.dao.orm.hibernate.SessionImpl.get(SessionImpl.java:205)
at com.liferay.portal.kernel.dao.orm.ClassLoaderSession.get(ClassLoaderSession.java:326)
at com.liferay.counter.service.persistence.impl.CounterFinderImpl._obtainIncrement(CounterFinderImpl.java:369)
... 36 more
Caused by: org.hibernate.exception.GenericJDBCException: could not load an entity: [com.liferay.counter.model.impl.CounterImpl#com.liferay.counter.kernel.model.Counter]
at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:140)
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:128)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.loader.Loader.loadEntity(Loader.java:2041)
at org.hibernate.loader.entity.AbstractEntityLoader.load(AbstractEntityLoader.java:86)
at org.hibernate.loader.entity.AbstractEntityLoader.load(AbstractEntityLoader.java:76)
at org.hibernate.persister.entity.AbstractEntityPersister.load(AbstractEntityPersister.java:3294)
at org.hibernate.event.def.DefaultLoadEventListener.loadFromDatasource(DefaultLoadEventListener.java:496)
at org.hibernate.event.def.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:477)
at org.hibernate.event.def.DefaultLoadEventListener.load(DefaultLoadEventListener.java:227)
at org.hibernate.event.def.DefaultLoadEventListener.lockAndLoad(DefaultLoadEventListener.java:403)
at org.hibernate.event.def.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:155)
at org.hibernate.impl.SessionImpl.fireLoad(SessionImpl.java:1090)
at org.hibernate.impl.SessionImpl.get(SessionImpl.java:1075)
at org.hibernate.impl.SessionImpl.get(SessionImpl.java:1066)
at com.liferay.portal.dao.orm.hibernate.SessionImpl.get(SessionImpl.java:201)
... 38 more
Caused by: java.sql.SQLTransientConnectionException: HikariPool-2 - Connection is not available, request timed out after 937980ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:591)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:194)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:146)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.getTargetConnection(LazyConnectionDataSourceProxy.java:403)
at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.invoke(LazyConnectionDataSourceProxy.java:376)
at com.sun.proxy.$Proxy7.prepareStatement(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:534)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:452)
at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:161)
at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1700)
at org.hibernate.loader.Loader.doQuery(Loader.java:801)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.loadEntity(Loader.java:2037)
... 50 more
I'm out of other ideas to test; any help would be appreciated.
Finally we found the solution.
On the pre-production environment the two VMs are on the same VLAN, which was not the case on production.
Solution: putting the VMs on the same VLAN solved the problem.
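A plausible mechanism, though only an assumption on my part: a stateful firewall or router between the two VLANs silently dropping long-idle TCP flows, which would leave Hikari waiting on connections that never answer. A quick way to test that hypothesis from the Liferay VM (host, user and database names are placeholders):
$ psql -h <db-vm-ip> -U <user> -d <db> -c 'SELECT 1'               # basic reachability
$ psql -h <db-vm-ip> -U <user> -d <db> -c 'SELECT pg_sleep(1200)'  # 20 minutes with no traffic on the wire
If the second command hangs only on production, the network path between the VLANs, not Liferay or PostgreSQL, is the difference.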

remove description lines and add time to the first column

AWK experts, I have a file as described below, and I wonder if it is possible to easily convert it to the form that I want:
The file contains multiple variables over one month (ONLY one observation per day, but some days may be missing). The format for each day is the same except for the dates/values. However, there are some description lines (containing words and numbers) at the end of each day, and the number of description lines varies from day to day.
KBO BTA Observations at 12Z 01 Feb 2020
-----------------------------------------------------------------------------
PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
hPa m C C % g/kg deg knot K K K
-----------------------------------------------------------------------------
1000.0 92
925.0 765
850.0 1516
754.0 2546 13.0 9.3 78 9.85 150 2 310.2 340.6 312.0
752.0 2569 14.0 9.2 73 9.80 149 2 311.5 342.0 313.4
700.0 3173 -9.20 7.5 89 9.38 120 6 312.6 341.9 314.4
Station information and sounding indices
Station elevation: 2546.0
Lifted index: 1.83
Pres [hPa] of the Lifted Condensation Level: 693.42
1000 hPa to 500 hPa thickness: 5798.00
Precipitable water [mm] for entire sounding: 21.64
8022 KBO BTA Observations at 00Z 02 Feb 2020
-----------------------------------------------------------------------------
PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
hPa m C C % g/kg deg knot K K K
-----------------------------------------------------------------------------
1000.0 97
925.0 758
850.0 1515
753.0 2546 10.8 6.8 76 8.30 190 3 307.9 333.4 309.5
750.0 2580 12.6 7.9 73 8.99 186 3 310.2 338.1 311.9
Here is what I want: remove all the description lines, read the date/time information, and put it in the first column.
Time PRES HGHT TEMP DWPT RELH MIXR DRCT SKNT THTA THTE THTV
20200201t12Z 754.0 2546 13.0 9.3 78 9.85 150 2 310.2 340.6 312.0
20200201t12Z 752.0 2569 14.0 9.2 73 9.80 149 2 311.5 342.0 313.4
20200201t12Z 700.0 3173 -9.2 7.5 89 9.38 120 6 312.6 341.9 314.4
20200202t00Z 753.0 2546 10.8 6.8 76 8.30 190 3 307.9 333.4 309.5
20200202t00Z 750.0 2580 12.6 7.9 73 8.99 186 3 310.2 338.1 311.9
Any help is appreciated.
Kelly
something like this...
$ awk 'function m(x)
{return sprintf("%02d",int(index("JanFebMarAprMayJunJulAugSepOctNovDec",x)-1)/3+1)}
NR==1 {print "time PRES TEMP WDIR WSPD RELH"}
/^-+$/ {f=!f}
f {date=p[n] m(p[n-1]) p[n-2]}
!f {n=split($0,p)}
NF==11 && !/[^ 0-9.-]/ {print date,$0}' file | column -t
time PRES TEMP WDIR WSPD RELH
20200201 1000 10 230 5 90
20200201 900 9 200 6 85
20200201 800 9 100 6 87
20200202 1000 9.2 233 5 90
20200202 900 9.1 200 4 80
20200202 800 9 176 2 80
Explanation
the m() function just returns the month number from the month string, by looking up its index in the name list and converting it to a zero-padded number;
f tracks the dashed lines, so that the date can be parsed from the line that precedes the first dashed line of each block;
finally, the heuristic for finding the data lines is the field count plus the absence of any non-numeric characters (only digits, spaces, dots, or minus signs are allowed).
$ cat tst.awk
/^-+$/ && ( ((++dashCnt) % 2) == 1 ) {
mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec",p[n-1])+2)/3
time = sprintf("%04d%02d%02d", p[n], mthNr, p[n-2])
}
/^[[:upper:][:space:]]+$/ && !doneHdr++ { print "Time", $0 }
/^[-0-9.[:space:]]+$/ { print time, $0 }    # "-" added to the class so negative values like -9.20 are kept
{ n = split($0,p) }
$ awk -f tst.awk file | column -t
Time PRES TEMP WDIR WSPD RELH
20200201 1000 10 230 5 90
20200201 900 9 200 6 85
20200201 800 9 100 6 87
20200202 1000 9.2 233 5 90
20200202 900 9.1 200 4 80
20200202 800 9 176 2 80
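Note that neither output above carries the 12Z/00Z hour that the requested 20200201t12Z format includes. Here is a minimal, untested variant of the same idea that also grabs the hour token, assuming every block's date line contains "Observations at" and ends with "<HH>Z <DD> <Mon> <YYYY>":
$ awk '
  function m(x) { return sprintf("%02d", (index("JanFebMarAprMayJunJulAugSepOctNovDec", x) + 2) / 3) }
  /Observations at/ { n = split($0, p)                      # date line of each block
                      time = p[n] m(p[n-1]) sprintf("%02d", p[n-2]) "t" p[n-3] }
  /^ *PRES/ && !hdr++ { print "Time", $0 }                  # column header, printed once
  NF == 11 && /^[- 0-9.]+$/ { print time, $0 }              # complete numeric rows only
' file | column -t
The NF == 11 test is what drops the short rows like "1000.0 92" that the desired output omits.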

CEPH raw space usage

I can't understand where my Ceph raw space has gone.
cluster 90dc9682-8f2c-4c8e-a589-13898965b974
health HEALTH_WARN 72 pgs backfill; 26 pgs backfill_toofull; 51 pgs backfilling; 141 pgs stuck unclean; 5 requests are blocked > 32 sec; recovery 450170/8427917 objects degraded (5.341%); 5 near full osd(s)
monmap e17: 3 mons at {enc18=192.168.100.40:6789/0,enc24=192.168.100.43:6789/0,enc26=192.168.100.44:6789/0}, election epoch 734, quorum 0,1,2 enc18,enc24,enc26
osdmap e3326: 14 osds: 14 up, 14 in
pgmap v5461448: 1152 pgs, 3 pools, 15252 GB data, 3831 kobjects
31109 GB used, 7974 GB / 39084 GB avail
450170/8427917 objects degraded (5.341%)
18 active+remapped+backfill_toofull
1011 active+clean
64 active+remapped+wait_backfill
8 active+remapped+wait_backfill+backfill_toofull
51 active+remapped+backfilling
recovery io 58806 kB/s, 14 objects/s
OSD tree (each host has 2 OSDs):
# id weight type name up/down reweight
-1 36.45 root default
-2 5.44 host enc26
0 2.72 osd.0 up 1
1 2.72 osd.1 up 0.8227
-3 3.71 host enc24
2 0.99 osd.2 up 1
3 2.72 osd.3 up 1
-4 5.46 host enc22
4 2.73 osd.4 up 0.8
5 2.73 osd.5 up 1
-5 5.46 host enc18
6 2.73 osd.6 up 1
7 2.73 osd.7 up 1
-6 5.46 host enc20
9 2.73 osd.9 up 0.8
8 2.73 osd.8 up 1
-7 0 host enc28
-8 5.46 host archives
12 2.73 osd.12 up 1
13 2.73 osd.13 up 1
-9 5.46 host enc27
10 2.73 osd.10 up 1
11 2.73 osd.11 up 1
Real usage:
/dev/rbd0 14T 7.9T 5.5T 59% /mnt/ceph
Pool size:
osd pool default size = 2
Pools:
ceph osd lspools
0 data,1 metadata,2 rbd,
rados df
pool name category KB objects clones degraded unfound rd rd KB wr wr KB
data - 0 0 0 0 0 0 0 0 0
metadata - 0 0 0 0 0 0 0 0 0
rbd - 15993591918 3923880 0 444545 0 82936 1373339 2711424 849398218
total used 32631712348 3923880
total avail 8351008324
total space 40982720672
Raw usage is 4x the real usage. As I understand it, it should be 2x?
Yes, it should be 2x. But I'm not really sure that the real usage is 7.9T. Why do you check this value on the mapped disk? (Note that the 31109 GB used is roughly 2 × the 15252 GB of data the pools report; the 7.9T is only what the filesystem inside the RBD image currently references.)
These are my pools:
pool name KB objects clones degraded unfound rd rd KB wr wr KB
admin-pack 7689982 1955 0 0 0 693841 3231750 40068930 353462603
public-cloud 105432663 26561 0 0 0 13001298 638035025 222540884 3740413431
rbdkvm_sata 32624026697 7968550 31783 0 0 4950258575 232374308589 12772302818 278106113879
total used 98289353680 7997066
total avail 34474223648
total space 132763577328
You can see that the total amount of used space is roughly 3 times the used space in the rbdkvm_sata pool.
ceph -s shows the same result too:
pgmap v11303091: 5376 pgs, 3 pools, 31220 GB data, 7809 kobjects
93736 GB used, 32876 GB / 123 TB avail
I don't think you have just one RBD image. The result of "ceph osd lspools" indicates that you have 3 pools, and one of the pools is named "metadata" (maybe you were using CephFS). /dev/rbd0 appeared because you mapped that image, but you could have other images as well. To list the images you can use "rbd list -p <pool-name>", and you can see a single image's info with "rbd info -p <pool-name> <image-name>".
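For example (the image name is a placeholder, and rbd du needs a reasonably recent release):
$ rbd list -p rbd                  # every image in the pool, not just the mapped one
$ rbd info -p rbd <image-name>     # size, format and order of a single image
$ rbd du -p rbd                    # provisioned vs. actually used space per image
Also note that deleting files in the filesystem on /dev/rbd0 does not free the underlying RADOS objects unless the filesystem is mounted with discard or you run fstrim, so df inside the image can report much less than the pool actually stores.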

prstat on solaris - can you make it flash when size exceeds a limit?

I have been told to make prstat flash the background from white to black a few times when any value in the SIZE column passes a threshold. Is there a way to modify the command to make this happen, or will this never happen?
I'm not trying to be mean, but whoever asked for this is not being reasonable or does not understand. I would guess the "asker" has no clue about prstat. Look at these two examples:
example% prstat -u root -n 5 -P 1,2 1 1
PID USERNAME SWAP RSS STATE PRI NICE TIME CPU PROCESS/LWP
306 root 3024K 1448K sleep 58 0 0:00.00 0.3% sendmail/1
102 root 1600K 592K sleep 59 0 0:00.00 0.1% in.rdisc/1
250 root 1000K 552K sleep 58 0 0:00.00 0.0% utmpd/1
288 root 1720K 1032K sleep 58 0 0:00.00 0.0% sac/1
1 root 744K 168K sleep 58 0 0:00.00 0.0% init/1
TOTAL: 25, load averages: 0.05, 0.08, 0.12
example% prstat -S rss -n 5 -vc -u root,john
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWP
1 root 0.0 0.0 - - - - 100 - 0 0 0 0 init/1
102 root 0.0 0.0 - - - - 100 - 0 0 3 0 in.rdisc/1
250 root 0.0 0.0 - - - - 100 - 0 0 0 0 utmpd/1
1185 john 0.0 0.0 - - - - 100 - 0 0 0 0 csh/1
240 root 0.0 0.0 - - - - 100 - 0 0 0 0 powerd/4
TOTAL: 71, load averages: 0.02, 0.04, 0.08
So, what value do you look for? There are lots of things prstat can display, so you have to learn all of them and then write code to handle whatever each of the many possible outputs means.
To do this:
You will have to run prstat, with the arguments entered on the command line, in a child process; read and interpret everything it produces; then map it to output and flash the screen as appropriate. You can do this with coprocesses in ksh or zsh, or by using FIFOs in bash. Consider running prstat in -e mode regardless of what the user enters so you have full screens to read and manipulate.
Flashing the screen can be done with escape sequences, like changing background color or whatever you want. Here is a starting point for Windows based terminals:
ANSI escape sequences
And for Vt100 (UNIX)
terminal escape codes
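To make the idea concrete, here is a rough, untested bash sketch; the threshold, the column position and the size-suffix handling are all assumptions that depend on the Solaris release and the prstat options used (it also assumes integer sizes such as 3024K and a sleep that accepts fractions):
#!/bin/bash
# Flash the terminal when any process SIZE exceeds a threshold (sketch only).
limit_kb=$((512 * 1024))                # example threshold: 512 MB

flash() {                               # invert the whole screen a few times
    for i in 1 2 3; do
        printf '\033[?5h'; sleep 0.2    # VT100 DECSCNM: reverse video on
        printf '\033[?5l'; sleep 0.2    # reverse video off
    done
}

prstat -c 1 |                           # -c: print new reports below previous ones
while read -r pid user size rest; do
    case $size in                       # SIZE is the 3rd column of default output
        *K) kb=${size%K} ;;
        *M) kb=$(( ${size%M} * 1024 )) ;;
        *G) kb=$(( ${size%G} * 1024 * 1024 )) ;;
        *)  continue ;;                 # header, TOTAL and non-matching lines
    esac
    [ "$kb" -ge "$limit_kb" ] && flash
done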

uwsgi long timeouts

I am using Ubuntu 12, nginx, uWSGI 1.9 (over a socket), and Django 1.5.
Config:
[uwsgi]
base_path = /home/someuser/web/
module = server.manage_uwsgi
uid = www-data
gid = www-data
virtualenv = /home/someuser
master = true
vacuum = true
harakiri = 20
harakiri-verbose = true
log-x-forwarded-for = true
profiler = true
no-orphans = true
max-requests = 10000
cpu-affinity = 1
workers = 4
reload-on-as = 512
listen = 3000
Client test from Windows 7:
C:\Users\user>C:\AppServ\Apache2.2\bin\ab.exe -c 255 -n 5000 http://www.someweb.com/about/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.someweb.com (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Finished 5000 requests
Server Software: nginx
Server Hostname: www.someweb.com
Server Port: 80
Document Path: /about/
Document Length: 1881 bytes
Concurrency Level: 255
Time taken for tests: 66.669814 seconds
Complete requests: 5000
Failed requests: 1
(Connect: 1, Length: 0, Exceptions: 0)
Write errors: 0
Total transferred: 10285000 bytes
HTML transferred: 9405000 bytes
Requests per second: 75.00 [#/sec] (mean)
Time per request: 3400.161 [ms] (mean)
Time per request: 13.334 [ms] (mean, across all concurrent requests)
Transfer rate: 150.64 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 8 207.8 1 9007
Processing: 10 3380 11480.5 440 54421
Waiting: 6 1060 3396.5 271 48424
Total: 11 3389 11498.5 441 54423
Percentage of the requests served within a certain time (ms)
50% 441
66% 466
75% 499
80% 519
90% 3415
95% 36440
98% 54407
99% 54413
100% 54423 (longest request)
I have also set the following options:
echo 3000 > /proc/sys/net/core/netdev_max_backlog
echo 3000 > /proc/sys/net/core/somaxconn
So:
1) The first ~3000 requests are super fast. I see progress in ab and in the uwsgi request logs -
[pid: 5056|app: 0|req: 518/4997] 80.114.157.139 () {30 vars in 378 bytes} [Thu Mar 21 12:37:31 2013] GET /about/ => generated 1881 bytes in 4 msecs (HTTP/1.0 200) 3 headers in 105 bytes (1 switches on core 0)
[pid: 5052|app: 0|req: 512/4998] 80.114.157.139 () {30 vars in 378 bytes} [Thu Mar 21 12:37:31 2013] GET /about/ => generated 1881 bytes in 4 msecs (HTTP/1.0 200) 3 headers in 105 bytes (1 switches on core 0)
[pid: 5054|app: 0|req: 353/4999] 80.114.157.139 () {30 vars in 378 bytes} [Thu Mar 21 12:37:31 2013] GET /about/ => generated 1881 bytes in 4 msecs (HTTP/1.0 200) 3 headers in 105 bytes (1 switches on core 0)
I don't have any broken pipes or worker respawns.
2) Subsequent requests run very slowly or time out. It looks like some buffer becomes full and I am waiting for it to empty.
3) The buffer becomes empty.
4) ~500 requests are processed super fast.
5) Some timeout.
6) see Nr. 4
7) see Nr. 5
8) see Nr. 4
9) see Nr. 5
....
....
I need your help.
Check with netstat and dmesg. You have probably exhausted ephemeral ports or filled the conntrack table.
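For example, a quick sketch of those checks on the server (the nf_conntrack paths assume the conntrack module is loaded; older kernels use the ip_conntrack names instead):
$ netstat -ant | awk '{print $6}' | sort | uniq -c | sort -rn   # thousands of TIME_WAIT => ephemeral port exhaustion
$ cat /proc/sys/net/ipv4/ip_local_port_range                    # ports available for outgoing connections
$ dmesg | grep -i conntrack                                     # e.g. "nf_conntrack: table full, dropping packet"
$ cat /proc/sys/net/netfilter/nf_conntrack_count /proc/sys/net/netfilter/nf_conntrack_max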