What is the difference between Coefficient of Regression and Elasticity?

I am studying elasticity of demand and how to get the optimal price from elasticity using regression. I have referred to R-bloggers and Medium posts to understand the concepts, but I still have a doubt. Say I have a linear equation as below:
Sales of Eggs = 137.37 − 16.12(Price.Eggs) + 4.15(Ad.Type) − 8.71(Price.Cookies)
Mean of Price.Eggs= 4.43,
Mean of Price.Cookies= 4.37,
Mean of Sales of Eggs= 30
We can interpret the coefficients as: a unit increase in the price of eggs decreases sales of eggs by 16.12, and a unit increase in the price of cookies decreases sales of eggs by 8.71.
But in the case of elasticity, we apply the formula and get an elasticity of −2.38 for the price of eggs and −1.27 for the price of cookies, which also describes how the dependent variable changes when price changes. What is the difference between these two? I know the values are different, but both mean the same thing, right? Please advise and correct me if I am wrong.

Well it depends. I'm going to simplify the model a bit to one product (eggs for example):
Assuming elasticity is not constant and the demand curve is linear:
E = Elasticity
Q = Quantity Demanded
P = Price
t = time
b0 = constant
b1 = coefficient (slope)
See the breakdown for elasticity here
Picture a graph of the Demand Curve with Q on the vertical axis and P on the horizontal axis - because we're assuming Quantity Demanded will change in response to changes in Price.
I can't emphasize this enough - in the case where demand is linear and elasticity is not constant along the entire demand curve:
The coefficient (slope) is the change (difference) in the dependent variable (Q) divided by the change in the independent variable (P) measured in units - it is the derivative of your linear equation. The coefficient is the change in Q units with respect to a change in P units. Using your eggs example, increasing the price of eggs by 1 unit will decrease the quantity demanded of eggs by 16.12 units - regardless of whether the price of eggs increases from 1 to 2 or 7 to 8, the quantity demanded will decrease by 16.12 units.
From the link above, you can see that elasticity adds a bit more information. That is because elasticity is the percent change in Quantity Demanded divided by the percent change in Price, i.e. the relative difference in Quantity Demanded with respect to the relative difference in Price. Let's use your eggs model but exclude Ad.Type and Price.Cookies:
Sales of Eggs = 137.37 - 16.12 * Price.Eggs
"P" "Qd" "E"
1.00 121.25 -0.13
2.00 105.13 -0.31
3.00 89.01 -0.54
4.00 72.89 -0.88
5.00 56.77 -1.42
6.00 40.65 -2.38
7.00 24.53 -4.60
8.00 8.41 -15.33
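For reference, here is a short R sketch that reproduces the table above. It only uses the intercept and price coefficient from the question, plus the standard point-elasticity formula for a linear demand curve, E = (dQ/dP) · (P/Q) = b1 · P/Q:
b0 <- 137.37   # intercept from the question's model
b1 <- -16.12   # price coefficient (slope), i.e. dQ/dP
P  <- 1:8
Qd <- b0 + b1 * P    # quantity demanded along the linear demand curve
E  <- b1 * P / Qd    # point elasticity: (dQ/dP) * (P / Q)
round(data.frame(P, Qd, E), 2)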
See graph of Demand Curve vs Elasticity
In the table you can see that as P increases by 1.00, Qd decreases by 16.12, regardless of whether the increase is from 1.00 to 2.00 or from 7.00 to 8.00.
Elasticity, however, does change rather significantly relative to changes in price, so even if the change in units for each variable remains the same, the percent change for each variable will change.
A price increase from 1 to 2 is a 100% increase and would result in a change in quantity demanded from 121.25 to 105.13 which is a 13% decrease.
A price change from 7 to 8 is a 14% increase and would result in a quantity demanded change from 24.53 to 8.41 which is a 66% decrease.
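A quick check of those two percentage changes, using the same demand equation as above:
Q <- function(P) 137.37 - 16.12 * P               # demand curve from the example
pct_change <- function(from, to) (to - from) / from * 100
pct_change(1, 2)          #  100% price increase
pct_change(Q(1), Q(2))    # about -13% change in quantity demanded
pct_change(7, 8)          # about  14% price increase
pct_change(Q(7), Q(8))    # about -66% change in quantity demanded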
If you're interested in learning more about different ways to measure elasticity, I highly recommend these lecture slides, especially slide 6.26.

Related

Does DBI (data bus inversion) conserve Entropy?

I have been reading up on DBI on Wikipedia, which references this research paper: http://www.cs.columbia.edu/~cs4823/handouts/stan-burleson-tvlsi-95.pdf
The paper says:
While the maximum number of transitions is reduced by half the decrease in the average number of transitions is not as good. For an 8-bit bus for example the average number of transitions per time-slot by using the Bus-invert coding becomes 3.27 (instead of 4), or 0.41 (instead of 0.5) transitions per bus-line per time-slot.
However, this would suggest it reduces the entropy of the 8-bit message, no?
So the entropy of a random 8-bit message is 8 bits (duh). Add a DBI bit, which shifts the probability distribution to the left, but it (I thought) wouldn't reduce the area under the curve. You should still be left with a minimum of 8 bits of entropy, just spread over 9 bits. But they claim the average is now 0.41 instead of 0.5, which suggests the entropy is now −log2((0.59)^9) ≈ 6.85. I would have assumed the average would (at best) become 0.46 (−log2(0.54^9) ≈ 8).
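Here is that arithmetic in R form, with 0.41 taken as the paper's average transitions per bus-line per time-slot and 0.46 as the figure I would have expected:
round(3.27 / 8, 2)        # 0.41 average transitions per bus-line per time-slot (bus-invert)
-log2((1 - 0.41)^9)       # ~6.85, the entropy figure computed above
-log2((1 - 0.46)^9)       # ~8.00, the "at best" figure computed above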
Am I misunderstanding something?

or-tools: how is the solver choosing vehicles?

The situation we have (using VRP):
We are using only the first solution strategy, without local search metaheuristics.
We are using a span cost coefficient for each vehicle, as well as a fixed cost.
We also have pallet, weight, and volume capacities in place.
We have 80 vehicles and 374 visits.
If we make all vehicles the same (same span cost, same fixed cost, same capacities, etc.), we get a best solution with 54 routes and a total solution price of 7917. The solver produces 99 solutions.
If we change the span cost of a single vehicle (and only by a tiny margin), the solution becomes significantly worse, still with 54 routes but with a total solution price of 8149. The solver produces 390 solutions (significantly more).
Why is the solver creating more solutions but giving a worse outcome?
And the next situation.
All data is the same and all solution-making configuration is the same.
If we change the pallet capacity of two vehicles from 36 pallets to 19 pallets and make them slightly cheaper (fixed price 49 compared to 50, and span cost 0.34 compared to 0.35), then the solver uses those two vehicles.
It seems logical at first, but it's a local optimum, not a global one, because the total solution price is worse than that of a solution using the pricier vehicles with the bigger pallet capacity.
Can you explain how the solver chooses vehicles for the solution? Is it reaching for a local optimum (best price for each route) or a global one (best price for the total solution)?

Lottery Ticket Hypothesis - Iterative Pruning

I was reading about The Lottery Ticket Hypothesis and it was mentioned in the paper:
we focus on iterative pruning, which repeatedly trains, prunes, and resets the network over n rounds; each round prunes (p^(1/n))% of the weights that survive the previous round.
Can someone please explain this round by round with numbers, say for n = 5 rounds and a desired final sparsity p = 70%?
In this example, the numbers I computed are as follows:
Round   (p^(1/n))% of weights pruned
  1     0.93114999
  2     0.86704016
  3     0.80734437
  4     0.75175864
  5     0.70000000
According to these calculations, it seems that the first round prunes approximately 93.11% of the weights, whereas the fifth round prunes 70% of the weights. It's as if, as the rounds progress, the percentage of weights being pruned decreases.
What am I doing wrong?
Thanks!
Look at what you are actually computing: the exponent in your table changes with the round (0.7^(1/5), 0.7^(2/5), ..., 0.7^(5/5)), so of course the value shrinks toward 0.7 as the rounds progress. The per-round quantity p^(1/n) from the paper is a single fixed number; your column is that number compounded over successive rounds, not the percentage pruned in each individual round.
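A small numeric sketch in R may help; here 0.7 is simply treated as the value the compounding converges to, purely to reproduce the numbers in your table:
p <- 0.7          # the value being compounded toward over all rounds (from your example)
n <- 5            # number of pruning rounds
per_round  <- p^(1/n)           # the single fixed per-round factor, ~0.9311
cumulative <- per_round^(1:n)   # that factor compounded over rounds 1..n
round(per_round, 4)    # 0.9311
round(cumulative, 4)   # 0.9311 0.8670 0.8073 0.7518 0.7000  <- exactly your column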

What do you do if the sample size for an A/B test is larger than the population?

I have a list of 7337 customers (selected because they only had one booking from March-August 2018). We are going to contact them and are trying to test the impact of these activities on their sales. The idea is that contacting them will cause them to book more and increase the sales of this largely inactive group.
I have to set up an A/B test and am currently stuck on the sample size calculation.
Here's my sample data:
Data
The first column is their IDs and the second column is the total sales for this group over 2 weeks in January (I took 2 weeks because the customers in this group purchase very infrequently).
The metric I settled on was revenue per customer (RPC = total revenue / total customers), so I can take into account both the number of orders and the average order value of the group.
The RPC for this group is $149,482.70 / 7337 = $20.4.
I'd like to be able to detect at least a 5% increase in this metric at 80% power and 5% significance level. First I calculated the effect size.
Standard Deviation of the data set = 153.9
Effect Size = (1.05*20.4-20.4)/153.9 = 0.0066
I then used the pwr package in R to calculate the sample size.
pwr.t.test(d=0.0066, sig.level=.05, power = .80, type = 'two.sample')
Two-sample t test power calculation
n = 360371.048
d = 0.0066
sig.level = 0.05
power = 0.8
alternative = two.sided
The sample size I am getting, however, is 360,371. This is larger than the size of my population (7337).
Does this mean I cannot run my test at sufficient power? The only way I can see to lower the sample size without compromising on significance or power is to increase the effect size so that I detect a minimum increase of 50%, which would give me n = 3582.
That sounds like a pretty high impact, and I'm not sure an impact that high is reasonable to expect.
Does this mean I can't run an A/B test here to measure impact?
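One way to sanity-check that 50% figure: split the 7337 customers evenly (roughly 3668 per group, an assumption just for illustration), let pwr solve for the smallest detectable standardized effect d, and convert it back into a percentage lift in RPC:
library(pwr)
n_per_group <- 7337 / 2    # assume an even 50/50 split between control and treatment
rpc <- 20.4                # baseline revenue per customer
s   <- 153.9               # standard deviation of the data set
res <- pwr.t.test(n = n_per_group, sig.level = 0.05, power = 0.80, type = "two.sample")
res$d                      # ~0.065, the smallest detectable standardized effect
res$d * s / rpc            # ~0.49, i.e. roughly the 50% minimum increase mentioned above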

Average Effectiveness of Variables in Binary Logistic Model

I have constructed a binary logistic model to check the effect of various variables on the probability that a consumer buys. I have 5 different brands, and in the model I have 5 price variables which are specific to each brand (an interaction between the brand dummy and price). So my output looks like this:
                Coefficient   P-value
Price_Brand_A      0.25        0.02
Price_Brand_B      0.50        0.01
Price_Brand_C      0.10        0.09
Price_Brand_D      0.40        0.15
Price_Brand_E      0.65        0.02
What I would like to ask is whether it is correct to say something about the overall effect of price, rather than about specific brands. For instance, would it be correct to take the average of the coefficients and say that the average effect of price is equal to 0.38? Or is there some statistical procedure I should follow to report the overall effect of price? And would the same apply to the p-values?
I am working with SPSS and I am new to modelling, so any help would be appreciated.
Many Thanks
If you test an interaction hypothesis, you have to include a number of terms in your model. In this case, you would have to include:
Base effect of price
Base effects of brands (dummies)
Interaction effects of brand dummies * price.
Since you have 5 brands, you will have to include 4 of the 5 dummy variables. The dummy you leave out will be your reference category. The same goes for the interaction terms. In this case, the base effect of price will be the effect of price for the reference brand. The base effects of the dummies will be the differences between brands when price is 0. The interaction effects can be interpreted in two ways. One way is to say that an interaction term is the additional price effect for that brand, compared to the reference brand. The other way is to say that the interaction effect is the additional difference between that brand and the reference brand when price increases by one.
If you want to know what the average effect of price is, why would you include interaction terms at all? In that case, I would leave out the interactions in a first model, and then include them in a second model to show that the average effect of price is not accurate once you look at the effect for each brand separately.
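In R notation (you are using SPSS, so this is just a sketch with hypothetical names buy, price, and brand in a data frame df), the two models would look like this:
# Model 1: one common (average) price effect, plus brand dummies
m1 <- glm(buy ~ price + brand, data = df, family = binomial)
# Model 2: brand-specific price effects (dummies, price, and their interactions)
m2 <- glm(buy ~ price * brand, data = df, family = binomial)
summary(m1)                    # the price coefficient here is the overall price effect
summary(m2)                    # interaction terms show how each brand deviates from the reference brand
anova(m1, m2, test = "Chisq")  # does allowing brand-specific price effects improve the fit?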
Maybe you could post some more output? I think you got more out of it than you posted in your question?
Good luck!
