How many times to run stego-image to find Performance Metrics (psnr, mse, rmse)?

How many times to run stego-image to find Performance Metrics (psnr, mse, rmse)? - average

i want to calculate the psnr of stego-image for PVD(Pixel Differing Value) and each time psnr is different even with same 'secret message'.
Thus, I want to get the average psnr value of the stego-image. So, how many times should i run the code (inserting 'secret message',calculating psnr) to get an average value of psnr for my research paper?
(can it be 150 times)?

Related

Anylogic - Parameter Variation Experiment - Varying Number of Replications - Distribution and Error percent

I am trying to use a parameter variation experiment to perform replications with a variable number based on the confidence interval. I have some problems understanding the calculation basis behind this feature and understanding the final number of replications that result here.
On which distribution is the calculation based (normal or t-distribution)? I couldn't find anything online regarding this question. Also, when I calculate the intervals manually (both with normal and t-distribution), I sometimes get a higher percentage error than specified. Can you tell me on which calculation basis the number of replications is determined? Is it possible that the "Error percent" value does not have to be given in percent?

Removing outlier with multiple consecutive values similar to a step

I am processing an ocean wave data, where I have a timeseries of the Peak Wave Period (Tp (s)). The typical values for Tp ranges from 2s-15s for this location. However, it may reach higher values above 15s during extreme events such as a storm. Hence, removing data based on a threshold value is not suitable.
As you can see in the figure below, there are multiple values that are outliers. The high values occurred for a small duration and then dropped down. An extreme event would last for hours.
I have tried the functions filloutlier and medfilt1, but they are not successful in removing the outlier, which I presume is because multiple consecutive outlier data points exists.
Is there a built-in Matlab function exist to handle such situation?
Else, if I need to write my own function to filter such signals, could you provide some guidance.
Attaching a small data sample here as well: Download Data
Dataset plot (Only the segment in the provided data above)
Zoomed in plot at one of the outliers.

If we know that we need the values to be in the range of (2,15), we can clip the values > 15 to 15.
Another way is to use the value of a high percentile (say 95) of the observations and clip values about it.
filloutlier, medfilt1 methods are not removing values like 18 because they are not treating them as outliers. 18 is not very far away from the typical range of (2, 15).

Prometheus query quantile of pod memory usage performance

I'd like to get the 0.95 percentile memory usage of my pods from the last x time. However this query start to take too long if I use a 'big' (7 / 10d) range.
The query that i'm using right now is:
quantile_over_time(0.95, container_memory_usage_bytes[10d])
Takes around 100s to complete
I removed extra namespace filters for brevity
What steps could I take to make this query more performant ? (except making the machine bigger)
I thought about calculating the 0.95 percentile every x time (let's say 30min) and label it p95_memory_usage and in the query use p95_memory_usage instead of container_memory_usage_bytes, so that i can reduce the amount of points the query has to go through.
However, would this not distort the values ?

As you already observed, aggregating quantiles (over time or otherwise) doesn't really work.
You could try to build a histogram of memory usage over time using recording rules, looking like a "real" Prometheus histogram (consisting of _bucket, _count and _sum metrics) although doing it may be tedious. Something like:
- record: container_memory_usage_bytes_bucket
labels:
le: 100000.0
expr: |
container_memory_usage_bytes > bool 100000.0
+
(
container_memory_usage_bytes_bucket{le="100000.0"}
or ignoring(le)
container_memory_usage_bytes * 0
)
Repeat for all bucket sizes you're interested in, add _count and _sum metrics.
Histograms can be aggregated (over time or otherwise) without problems, so you can use a second set of recording rules that computes an increase of the histogram metrics, at much lower resolution (e.g. hourly or daily increase, at hourly or daily resolution). And finally, you can use histogram_quantile over your low resolution histogram (which has a lot fewer samples than the original time series) to compute your quantile.
It's a lot of work, though, and there will be a couple of downsides: you'll only get hourly/daily updates to your quantile and the accuracy may be lower, depending on how many histogram buckets you define.
Else (and this only came to me after writing all of the above) you could define a recording rule that runs at lower resolution (e.g. once an hour) and records the current value of container_memory_usage_bytes metrics. Then you could continue to use quantile_over_time over this lower resolution metric. You'll obviously lose precision (as you're throwing away a lot of samples) and your quantile will only update once an hour, but it's much simpler. And you only need to wait for 10 days to see if the result is close enough. (o:

The quantile_over_time(0.95, container_memory_usage_bytes[10d]) query can be slow because it needs to take into account all the raw samples for all the container_memory_usage_bytes time series on the last 10 days. The number of samples to process can be quite big. It can be estimated with the following query:
sum(count_over_time(container_memory_usage_bytes[10d]))
Note that if the quantile_over_time(...) query is used for building a graph in Grafana (aka range query instead of instant query), then the number of raw samples returned from the sum(count_over_time(...)) must be multiplied by the number of points on Grafana graph, since Prometheus executes the quantile_over_time(...) individually per each point on the displayed graph. Usually Grafana requests around 1000 points for building smooth graph. So the number returned from sum(count_over_time(...)) must be multiplied by 1000 in order to estimate the number of raw samples Prometheus needs to process for building the quantile_over_time(...) graph. See more details in this article.
There are the following solutions for reducing query duration:
To add more specific label filters in order to reduce the number of selected time series and, consequently, the number of raw samples to process.
To reduce the lookbehind window in square brackets. For example, changing [10d] to [1d] reduces the number of raw samples to process by 10x.
To use recording rules for calculating coarser-grained results.
To try using other Prometheus-compatible systems, which may process heavy queries at faster speed. Try, for example, VictoriaMetrics.

Negative option prices for certain input values in MATLAB?

In the course of testing an algorithm I computed option prices for random input values using the standard pricing function blsprice implemented in MATLAB's Financial Toolbox.
Surprisingly ( at least for me ) ,
the function seems to return negative option prices for certain combinations of input values.
As an example take the following:
> [Call,Put]=blsprice(67.6201,170.3190,0.0129,0.80,0.1277)
Call =-7.2942e-15
Put = 100.9502
If I change time to expiration to 0.79 or 0.81, the value becomes non-negative as I would expect.
Did anyone of you ever experience something similar and can come up with a short explanation why that happens?

I don't know which version of the Financial Toolbox you are using but for me (TB 2007b) it works fine.
When running:
[Call,Put]=blsprice(67.6201,170.3190,0.0129,0.80,0.1277)
I get the following:
Call = 9.3930e-016
Put = 100.9502
Which is indeed positive

Bit late but I have come across things like this before. The small negative value can be attributed to numerical rounding error and / or truncation error within the routine used to compute the cumulative normal distribution.
As you know computers are not perfect and small numerical error always persists in all calculations, in my view therefore the question one should must ask instead is - what is the accuracy of the input parameters being used and therefore what is the error tolerance for outputs.
The way I thought about it when I encountered it before was that, in finance, typical annual stock price return variance are of the order of 30% which means the mean returns are typically sampled with standard error of roughly 30% / sqrt(N) which is roughly of the order of +/- 1% assuming 2 years worth of data (so N = 260 x 2 = 520, any more data you have the other problem of stationarity assumption). Therefore on that basis the answer you got above could have been interpreted as zero given the error tolerance.
Also we typically work to penny / cent accuracy and again on that basis the answer you had could be interpreted as zero.
Just thought I'd give my 2c hope this is helpful in some ways if you are still checking for answers!

Using randi in MATLAB to get random values: values are not distributed uniformly

I am generating a random population of strings made of 0s and 1s. I am using randi(2)-1 to get a randomly generated single value 0 or 1. I expect to get 1s almost as frequently as 0s. Instead, when I view all the individuals in the population, they mostly consist of 1s. Below is the code - what is wrong?
for iInd=1:individualsCount
individual(attrCount) = 0;
for i=1:attrCount
individual(i) = randi(2)-1;
end
population{iInd} = individual;
end

Firstly, you don't need a loop to generate a random string of 0's and 1's. Try this instead:
individual = randi([0 1],[attrCount,1]);
Secondly, again, you don't need a loop to construct your population cell. Try this instead:
population=arrayfun(#(x)randi([0 1],[attrCount,1]),1:individualsCount,'UniformOutput',false)
You might have to change the order of rows and columns depending on how you want to set it up.
Now, coming to your question, you ought to understand that these distributions are stochastic and approach a truly uniform distribution of 50% 1s and 50% 0s only as your sample size approaches infinity. If your attrCount is small enough, do not be surprised if you don't find numbers close to 50% for each. That doesn't mean it is wrong. It is what it is.
Here's how the distribution of 1s looks like for a random binary vector of different sample sizes. You can see that for small sample sizes, there is high variability (and by no means is this exact... it will be different each time), whereas as you start approaching large sample sizes of 1000 and above, your distribution of 1s gets closer and closer to 50%, eventually being exactly 50% at infinity.