This was my interview question: write MC/DC test cases for (A&&B)||A.

How many test cases are needed for this boolean expression?
If I follow the MC/DC rules, I get only two test cases:
Test  A  B  (A&&B)||A decision
1     T  F  T
2     F  T  F
MC/DC needs at least one true/true combination for the AND, but I don't know how to add it without breaking the other rules.

I believe you need to create a minimum of 10 test cases. First, there are two conditions, A and B, and you need two cases for each, so 4 test cases. Second, you need two test cases for each combination of conditions, which adds another 4. Finally, you need two test cases for the overall decision made by the expression, which adds another 2. That makes a total of 10 test cases.
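For reference, a quick way to sanity-check any proposed test set is to enumerate the full truth table of the decision and look for pairs of rows in which exactly one condition changes and the decision flips. A minimal sketch in Python (purely illustrative):

from itertools import product, combinations

# Truth table of (A and B) or A
rows = []
for A, B in product([True, False], repeat=2):
    decision = (A and B) or A
    rows.append((A, B, decision))
    print(f"A={A!s:5} B={B!s:5} decision={decision}")

# MC/DC candidate pairs: exactly one condition differs and the decision flips
for (a1, b1, d1), (a2, b2, d2) in combinations(rows, 2):
    if d1 != d2 and ((a1 != a2) ^ (b1 != b2)):
        changed = "A" if a1 != a2 else "B"
        print(f"toggling {changed}: ({a1},{b1})->{d1} vs ({a2},{b2})->{d2}")

Running this prints such pairs only for A: because (A&&B)||A simplifies to A, no pair of rows lets B independently flip the decision, which is exactly the difficulty the question runs into.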

Related

Finding Conditional Moments in a Markov Process

This question combines math and programming. I will first describe the general problem and then give an example that is (hopefully) simpler to understand.
General Question: Consider a Markov-chain process with N states and transition matrix Π. Each state n is associated with a value x_n (n in {1,…,N}). Our goal is to find the unconditional average of the first two moments (mean and variance) along T-period paths, conditional on (i) the path starting in a subset of states, N_0, (ii) ending in a subset of states, N_T, and (iii) not passing through a subset of states, N_not, in any of the periods 1 to T-1. By the unconditional average of these two moments, I mean the average of these two moments under the stationary distribution. To make this concrete, let me illustrate the goal in a simple case.
Simple Example: Consider a 3-state Markov-chain process with transition matrix Π, and let the three states be denoted by A, B, and C. Each state is associated with a value (x_A, x_B, and x_C, respectively). We are interested in what happens along paths that satisfy the following condition: the path starts in state A, after 3 periods ends in either state B or state C, and between periods 1 and 3 never passes through state A again. Denote this condition by (#). For example, a path we are interested in would be {A,B,B,C}, with the associated values {x_A, x_B, x_B, x_C}. We are interested in the mean and standard deviation along such paths. In particular, we would like to find the unconditional average of these first two moments over paths that satisfy (#).
Let me now propose a solution based on simulating the process. Since both T and N are quite large, this solution is too slow for my purpose.
Simulation Solution: Starting from some initial point simulate the process for a very long time period, and drop the first τ periods. Extract all paths along the simulation that satisfy condition (#) and compute the mean and std along each of these paths. Finally, simply take the average across these paths.
I'm hoping there is a better and more efficient way to achieve the goal. Since I want the solution to be accurate, and given the sizes of T and N, the simulation takes a long time.
I would love to hear your thoughts and if you know of efficient methods to achieve this goal. Please let me know if something is not clear and I'll try to clarify it.
Thank you!!!
I think I know how to do this if N_0 consists of one state, let's call that state A.
The long run probability of being in A is pi(A) and can be obtained by solving pi = pi*P, with P the transition matrix.
The other thing you need to calculate is the probability of those transient paths. You probably need to introduce a modified P, where all states i in the set N_not are absorbing (i.e. P[i,i] = 1 and P[i,j] = 0 for j ≠ i). Then, starting from a vector p(0) which has a 1 in the element corresponding to state A and 0 otherwise, you can keep calculating p(n) = p(n-1)*P (with the modified P) to get the probabilities of your transient paths.
Multiply the result of that by pi(A) to get the unconditional probability.
You can probably do something like this as well when N_0 is a set, but I don't know how you should select p(0) in that case.
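A minimal numpy sketch of this idea, using a made-up 3-state chain (the matrix, T, and the state sets are illustrative, and it only computes the probability weight of the admissible paths, which is the piece described above). One detail left implicit above: if the starting state A is itself in N_not, the first step has to use the original row for A, otherwise the chain can never leave A; the sketch assumes that reading.

import numpy as np

# Illustrative 3-state chain; states A=0, B=1, C=2
P = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.4, 0.3],
              [0.4, 0.3, 0.3]])
T = 3              # path length, as in the simple example
N_not = [0]        # forbidden intermediate states (here: A)
N_T = [1, 2]       # allowed terminal states (here: B and C)

# Stationary distribution pi solving pi = pi*P, here by power iteration
pi = np.full(3, 1.0 / 3.0)
for _ in range(5000):
    pi = pi @ P
pi_A = pi[0]

# Modified matrix: forbidden states made absorbing
P_mod = P.copy()
for i in N_not:
    P_mod[i, :] = 0.0
    P_mod[i, i] = 1.0

# Distribution after period 1 uses the ORIGINAL row for A (the path leaves A);
# later periods use P_mod, so paths re-entering a forbidden state get trapped
# there and drop out of the count at the end.
p = P[0].copy()
for _ in range(T - 1):
    p = p @ P_mod

# Probability, given a start in A, of being in N_T at time T without visiting
# a forbidden state in periods 1..T-1; weight by pi_A for the unconditional one.
prob_given_A = p[N_T].sum()
print(pi_A * prob_given_A)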

Programming practice: Does not creating variables at first lead to faster computation?

I have a, b and A.
a = some expression 1
b = some expression 2
A = a + b
vs
A = some expression 1 + some expression 2
In my code there are not just a and b, but many such variables. By using the latter method, i.e. not creating intermediate variables and just summing all the expressions directly in A, my program runs about 1 s faster: it drops from about 11 s to about 10 s. This is confirmed over many tests. Is this due to not creating the variables first? Does not creating variables first lead to faster computation?
I need to run a lot of for loops and an ODE solver, with long computations. The variables are calculated and created inside the loop. If I can get about a 10% decrease, that is good.
In general (not just MATLAB).
In your first scenario these additional steps are required, which do not apply to the second scenario:
When a variable is created, memory needs to be allocated where the value for the variable can be stored.
When a value is assigned to that variable, that value needs to be written to the variable's space in memory.
When the calculation is requested, the value for each variable needs to be retrieved from memory.
Many compilers optimize away these overheads using various techniques, but many interpreted languages do not. (This is not a hard and fast rule, though; there are smart interpreted languages and stupid compiled ones.)
I do not know exactly how the internals of MATLAB work, but I think it is interpreted, which means the additional steps will likely incur additional overhead.
The problem with your second scenario is that it is less readable and maintainable in the long run. It is easier to read computations and intermediate steps when variable names are used. The trick is to balance performance and readability.
I'm not sure how much of a difference it would make in terms of performance, but I don't think it would be a sizeable difference. Maybe a few hundredths of a second.
You can test it for yourself by using the tic and toc functions, for example:
x = rand(1, 1e6);            % example data (placeholder)

tic
a = sin(x).^2;               % some expression 1 (placeholder)
b = cos(x).^2;               % some expression 2 (placeholder)
A = a + b;
toc

vs

tic
A = sin(x).^2 + cos(x).^2;   % some expression 1 + some expression 2
toc
As mentioned in the other answer, readability is the main difference. You want to keep your code as simple as possible so that if there is a problem you know exactly where it is and hopefully why there was a problem!

How to test properties of a random generator

Using Scala, I have a method that returns a set of 5 random numbers, each of which should be between 1 and a constant LIMIT.
What's the best approach to test that the method never returns more or fewer than 5 elements, and that all elements are between 1 and LIMIT? Writing a single simple test is easy. But should I run a loop of, let's say, 1000 iterations to test it more thoroughly? Or is there a feature in unit testing frameworks for such cases?
I'm using Scala and ScalaTest's FunSuite.
Take a look at https://softwareengineering.stackexchange.com/questions/147134/how-should-i-test-randomness
My approach would be to generate 100 sets with limit 20 and check whether the occurrences of each number are roughly equal.
You could also try QuickTheories, a property-based testing framework.
It runs each test against many different generated inputs.
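The loop-plus-invariants idea is sketched below in Python for brevity (pick_five is a hypothetical stand-in for the method under test); the same structure translates directly into a ScalaTest FunSuite test or a property-based check.

import random
from collections import Counter

LIMIT = 20      # illustrative value of the constant
RUNS = 1000

def pick_five():
    # Hypothetical stand-in for the method under test:
    # 5 distinct numbers between 1 and LIMIT
    return random.sample(range(1, LIMIT + 1), 5)

counts = Counter()
for _ in range(RUNS):
    result = pick_five()
    # Invariants that must hold on every call
    assert len(result) == 5
    assert all(1 <= n <= LIMIT for n in result)
    counts.update(result)

# Rough uniformity check: each value should appear about RUNS * 5 / LIMIT times
expected = RUNS * 5 / LIMIT
for value in range(1, LIMIT + 1):
    assert abs(counts[value] - expected) < 0.4 * expected, (value, counts[value])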

Chi-square type-1 error

I have a question about the chi-square test.
I have two between-subject factors, each with two levels (so 4 conditions). Furthermore, I have one dependent variable (qualitative), also consisting of two levels.
Now I want to make pairwise comparisons (so I have 6 chi-square tests in total). Is there any way I can control the type-1 error rate? In the literature I have seen that interactions are often tested with a chi-square test. Is this the way to do it, and if so, how do I do it?
I can work with both SPSS and Matlab.
Thanks in advance!
Niels

Algorithm generation

I have a rather large (not too large, but possibly 50+) set of conditions that must be placed on a set of data (or rather, the data should be manipulated to fit the conditions).
For example, suppose I have a sequence of binary digits of length n;
if n = 5, then an element of the data might be {0,1,1,0,0} or {0,0,0,1,1}, etc.
But there might be a set of conditions such as
x_3 + x_4 = 2
sum(x_even) <= 2
x_2*x_3 = x_4 mod 2
etc...
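To make the toy conditions above concrete, a direct check might look like the sketch below (assuming 1-based indices on a length-5 element; the values and helper name are purely illustrative).

def satisfies(x):
    # x is a length-5 binary sequence, referred to as x_1..x_5 below
    x1, x2, x3, x4, x5 = x
    return (x3 + x4 == 2                   # x_3 + x_4 = 2
            and x2 + x4 <= 2               # sum over even positions <= 2
            and (x2 * x3) % 2 == x4 % 2)   # x_2 * x_3 = x_4 mod 2

print(satisfies([0, 1, 1, 1, 0]))   # True:  x3 = x4 = 1, x2 + x4 = 2, x2*x3 = 1 = x4
print(satisfies([0, 1, 1, 0, 0]))   # False: x3 + x4 = 1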
Because the conditions are quite complex, in that they come from experiment (although they can be written down in logical form), and are hard to diagnose, I would like instead to use a large sample set of valid data, i.e. data that I know satisfies the conditions and is a pretty large set. In other words, it is easier to collect the data than it is to deduce the conditions that the data must abide by.
Having said that, what I'm doing is basically very similar to neural networks. The difference is that I would like an actual algorithm, in some sense optimal, in some form of code that I can run instead of the network.
It might not be clear what I'm actually trying to do. What I have is a set of data in some raw format that is unique and unambiguous but not appropriate for my needs (in a sense, the amount of data is too large).
I need to map the data into another set that actually is ambiguous to some degree but also has a certain specific set of constraints that all the data follows (certain things just cannot happen while others are preferred).
The unique constraints and preferences are hard to figure out. That is, the mapping from the non-ambiguous set to the ambiguous set is hard to describe (which is why it is ambiguous). The goal, actually, is to obtain an unambiguous map by supplying the right constraints, if at all possible.
So, in the vein of my initial example, I'm given (or supply) a set of elements and need some way to derive a list of constraints similar to what I've listed.
In a sense, I simply have a set of valid data and train on it, very much like a neural network.
Then, after this "training", I'm given a mapping function that I can use on any element in my dataset, and it will produce a new element satisfying the constraints or, if it can't, give a result as close to unambiguous as possible.
The main difference between neural networks and what I'm trying to achieve is that I'd like to have an algorithm in code to use instead of having to run a neural network. The algorithm would probably be a lot less complex, not need potential retraining, and be a lot faster.
Here is a simple example.
Suppose my "training set" are the binary sequences and mappings
01000 => 10000
00001 => 00010
01010 => 10100
00111 => 01110
then from the "Magical Algorithm Finder"(tm) I would get a mapping out like
f(x) = x rol 1 (rol = rotate left)
or whatever way one would want to express it.
Then I could simply apply f(x) to any other element, such as x = 011100 and could apply f to generate a hopefully unambiguous output.
Of course there are many such functions that will work on this example, but the goal is to supply enough of the dataset to narrow it down to, hopefully, a few functions that make the most sense (at the very least, ones that will always map the training set correctly).
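As a sanity check, here is that rotate-left hypothesis applied to the training pairs above (a small Python sketch; rol1 is just an illustrative name for the candidate f(x)).

def rol1(bits):
    # Rotate a binary string left by one position
    return bits[1:] + bits[0]

training = {"01000": "10000", "00001": "00010",
            "01010": "10100", "00111": "01110"}

for x, y in training.items():
    assert rol1(x) == y, (x, rol1(x), y)
print("rotate-left-by-1 reproduces every training pair")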
In my specific case I could easily convert my problem into mapping the set of binary digits of length m to the set of base-B digits of length n. The constraints prevent some numbers from having an inverse; e.g., the mapping is injective but not surjective.
My algorithm could be a simple collection of if statements acting on the digits, if need be.
I think what you are looking for here is an application of Learning Classifier Systems (LCS - wiki). There are actually quite a few open-source LCS implementations available, but you may need to experiment with the parameters in order to get a good result.
LCS/XCS/ZCS have the features that you are looking for, including individual rules that can be heavily optimized, pressure to reduce the rule set, and of course a human-readable, understandable set of rules (unlike a neural net).