Create a conditional variable by group and time and day over the panel data

Create a conditional variable by group and time and day over the panel data - postgresql

I am trying to analyse connection between probability of call and distance to vehicles.
The example dataset (here csv) looks like this:
id day time called d
1 2009-06-24 1700 0 1037.6
1 2009-06-24 1710 1 1191.9
1 2009-06-24 1720 0 165.5
The real dataset has 10 million rows. There are ids that represent locations that called or not, in different time windows of (here) 10 minutes.
I would like to first drop all rows with the same id that never called at this time at any date during the whole period.
Then I am left with rows that represent ids that called at some day during the analysis at the given time.
I would like to create a variable that at the row of the call has value of 0 and day before (or hour, week, month, whatever, but here day) at the same time it equals -1 and day after +1 etc. Later I would use that variable as input together with called and distance for analysis and comparison across different locations
I have looked for other answered questions but did not find something that fits. So answer or pointer to one would be appreciated. I am using Stata 13, but solving this with Postgres 9.3 or R would be welcome as well.
I shall need to repeat this procedure multiple times for several datasets, so ideally I would like to automatise as much as possible.
Update:
Here is example of the desired result:
id day time called d newvar newvar2
1 2009-06-24 1700 0 1037.6 null
1 2009-06-24 1710 1 1191.9 0 -2
1 2009-06-24 1720 0 165.5 -1
1 2009-06-25 1700 0 526.7 null
1 2009-06-25 1710 0 342.5 1 -1
1 2009-06-25 1720 1 416.1 0
1 2009-06-26 1700 0 428.3 null
1 2009-06-26 1710 1 240.7 2 0
1 2009-06-26 1720 0 228.7 1
1 2009-06-27 1700 0 282.5 null
1 2009-06-27 1710 0 182.1 3 1
1 2009-06-27 1720 0 195.5 2
2 2009-06-24 1700 0 198.0 -1
2 2009-06-24 1710 0 157.4 null
2 2009-06-24 1720 0 234.9 null
2 2009-06-25 1700 1 247.0 0
I added the newvar2 because some locations might call several times at the given time window

When looking for a Stata solution, it is best to provide a data example using dataex (from SSC).
The problem is hard to visualize until the data is sorted by id and time (and further sorted by day). I did not convert the day variable to a Stata numeric date because, as constructed, the string sort order matches the natural date order.
You appear to want, for each call within a group of id time, the date offset in relation to the day of the call. This can be done by generating an order variable to track the index of the current observation within each id time group and then subtracting the index of the observation which makes the call.
Since you can have more than one call per time slot, you have to loop over the maximum number of calls in any given time slot in the data.
There's one difference with the results generated by this solution compared to yours: you seem to ignore the call on 2009-06-27 at 1710 for id == 2.
In the following example, the original data is presented sorted by id time day to provide the readers with a better intuition of what's going on.
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id str10 day int time byte called float distance str4 newvar byte newvar2
1 "2009-06-24" 1700 0 1037.6 "null" .
1 "2009-06-25" 1700 0 526.7 "null" .
1 "2009-06-26" 1700 0 428.3 "null" .
1 "2009-06-27" 1700 0 282.5 "null" .
1 "2009-06-24" 1710 1 1191.9 "0" -2
1 "2009-06-25" 1710 0 342.5 "1" -1
1 "2009-06-26" 1710 1 240.7 "2" 0
1 "2009-06-27" 1710 0 182.1 "3" 1
1 "2009-06-24" 1720 0 165.5 "-1" .
1 "2009-06-25" 1720 1 416.1 "0" .
1 "2009-06-26" 1720 0 228.7 "1" .
1 "2009-06-27" 1720 0 195.5 "2" .
2 "2009-06-24" 1700 0 198 "-1" .
2 "2009-06-25" 1700 1 247 "0" .
2 "2009-06-26" 1700 0 188.7 "1" .
2 "2009-06-27" 1700 0 203.5 "2" .
2 "2009-06-24" 1710 0 157.4 "null" .
2 "2009-06-25" 1710 0 221.3 "null" .
2 "2009-06-26" 1710 0 283.8 "null" .
2 "2009-06-27" 1710 1 91.7 "null" .
2 "2009-06-24" 1720 0 234.9 "null" .
2 "2009-06-25" 1720 0 249.6 "null" .
2 "2009-06-26" 1720 0 279.7 "null" .
2 "2009-06-27" 1720 0 198.2 "null" .
3 "2009-06-24" 1700 0 156.1 "-1" .
3 "2009-06-25" 1700 1 19.9 "0" .
3 "2009-06-26" 1700 0 195.2 "1" .
3 "2009-06-27" 1700 0 306.2 "2" .
3 "2009-06-24" 1710 0 150.1 "null" .
3 "2009-06-25" 1710 0 163.7 "null" .
3 "2009-06-26" 1710 0 288.2 "null" .
3 "2009-06-27" 1710 0 311.7 "null" .
3 "2009-06-24" 1720 0 135.1 "-2" .
3 "2009-06-25" 1720 0 186 "-1" .
3 "2009-06-26" 1720 1 297.2 "0" .
3 "2009-06-27" 1720 0 375.9 "1" .
end
* order observations by date within a id time group
sort id time day
by id time: gen order = _n
* number of calls at any given time
by id time: gen call = sum(called)
* repeat enough to cover the max number of calls per time
sum call, meanonly
local n = r(max)
forvalues i = 1/`n' {
// the index of the called observation in the id time group
by id time: gen index = order if called & call == `i'
// replicate the index for all observations in the id time group
by id time: egen gindex = total(index)
// the relative position of each obs in groups with a call
gen wanted`i' = order - gindex if gindex > 0
drop index gindex
}
list, sepby(id time) noobs compress
and the results
. list, sepby(id time) noobs compress
+----------------------------------------------------------------------------------------+
| id day time cal~d dist~e new~r new~2 order call wan~1 wan~2 |
|----------------------------------------------------------------------------------------|
| 1 2009-06-24 1700 0 1037.6 null . 1 0 . . |
| 1 2009-06-25 1700 0 526.7 null . 2 0 . . |
| 1 2009-06-26 1700 0 428.3 null . 3 0 . . |
| 1 2009-06-27 1700 0 282.5 null . 4 0 . . |
|----------------------------------------------------------------------------------------|
| 1 2009-06-24 1710 1 1191.9 0 -2 1 1 0 -2 |
| 1 2009-06-25 1710 0 342.5 1 -1 2 1 1 -1 |
| 1 2009-06-26 1710 1 240.7 2 0 3 2 2 0 |
| 1 2009-06-27 1710 0 182.1 3 1 4 2 3 1 |
|----------------------------------------------------------------------------------------|
| 1 2009-06-24 1720 0 165.5 -1 . 1 0 -1 . |
| 1 2009-06-25 1720 1 416.1 0 . 2 1 0 . |
| 1 2009-06-26 1720 0 228.7 1 . 3 1 1 . |
| 1 2009-06-27 1720 0 195.5 2 . 4 1 2 . |
|----------------------------------------------------------------------------------------|
| 2 2009-06-24 1700 0 198 -1 . 1 0 -1 . |
| 2 2009-06-25 1700 1 247 0 . 2 1 0 . |
| 2 2009-06-26 1700 0 188.7 1 . 3 1 1 . |
| 2 2009-06-27 1700 0 203.5 2 . 4 1 2 . |
|----------------------------------------------------------------------------------------|
| 2 2009-06-24 1710 0 157.4 null . 1 0 -3 . |
| 2 2009-06-25 1710 0 221.3 null . 2 0 -2 . |
| 2 2009-06-26 1710 0 283.8 null . 3 0 -1 . |
| 2 2009-06-27 1710 1 91.7 null . 4 1 0 . |
|----------------------------------------------------------------------------------------|
| 2 2009-06-24 1720 0 234.9 null . 1 0 . . |
| 2 2009-06-25 1720 0 249.6 null . 2 0 . . |
| 2 2009-06-26 1720 0 279.7 null . 3 0 . . |
| 2 2009-06-27 1720 0 198.2 null . 4 0 . . |
|----------------------------------------------------------------------------------------|
| 3 2009-06-24 1700 0 156.1 -1 . 1 0 -1 . |
| 3 2009-06-25 1700 1 19.9 0 . 2 1 0 . |
| 3 2009-06-26 1700 0 195.2 1 . 3 1 1 . |
| 3 2009-06-27 1700 0 306.2 2 . 4 1 2 . |
|----------------------------------------------------------------------------------------|
| 3 2009-06-24 1710 0 150.1 null . 1 0 . . |
| 3 2009-06-25 1710 0 163.7 null . 2 0 . . |
| 3 2009-06-26 1710 0 288.2 null . 3 0 . . |
| 3 2009-06-27 1710 0 311.7 null . 4 0 . . |
|----------------------------------------------------------------------------------------|
| 3 2009-06-24 1720 0 135.1 -2 . 1 0 -2 . |
| 3 2009-06-25 1720 0 186 -1 . 2 0 -1 . |
| 3 2009-06-26 1720 1 297.2 0 . 3 1 0 . |
| 3 2009-06-27 1720 0 375.9 1 . 4 1 1 . |
+----------------------------------------------------------------------------------------+

Related

How do I create a list of 1's in KDB?

I'm trying to create a list of 1's in KDB.
I've tried:
{1} each (til 500)
which worked, but is ugly. Is there a more elegant way? (I tried ' as well, but this didn't work).

You can use the 'take' (#) operator for this.
q)500#1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1..

K-map ( karnaugh map ) 8,4,-2,-1 to binary code conversion

I'm taking computer science courses and some digital design knowledge is required, so I'm taking digital design 101.
Image above is representing the conversion process of 8,4,-2,-1 to binary using K-map (Karnaugh map).
I have no idea why 0001, 0011, 0010, 1100, 1101, 1110 are marked as 'X'.
For 0001, 0011, 0010, they could be expressed as 8,4,-2,-1 as 0111, 0110, 0101.
And for 1100, 1101, 1110,
1110 can still be expressed as 1100 in 8,4,-2,-1 form as 1100.
rests cannot be expressed in 8,4,-2,-1 since 1100 is the biggest amount of number in 8,4,-2,-1 binary form (I think).
Is there something I'm missing?
I understand the excess-3 to binary code conversion provided from my textbook example ( m10-m15 are marked as 'X' since excess-3 were used to express only 0-9.)

According to the definition of BCD, 1 decimal digit (NOT one number) is represented by 4 bits.
The 4 given inputs can therefore represent only values from interval from 0 to 9.
The corresponding and complete truth-table looks like this:
decimal | 8 4 -2 -1 | decimal || BCD
/index | A B C D | result || W X Y Z
----------------------------------||---------
0 | 0 0 0 0 | 0 || 0 0 0 0 ~ 0
1 | 0 0 0 1 | -1 || X X X X
2 | 0 0 1 0 | -2 || X X X X
3 | 0 0 1 1 | -2-1=-3 || X X X X
4 | 0 1 0 0 | 4 || 0 1 0 0 ~ 4
5 | 0 1 0 1 | 4-1=3 || 0 0 1 1 ~ 3
6 | 0 1 1 0 | 4-2=2 || 0 0 1 0 ~ 2
7 | 0 1 1 1 | 4-2-1=1 || 0 0 0 1 ~ 1
8 | 1 0 0 0 | 8 || 1 0 0 0 ~ 8
9 | 1 0 0 1 | 8-1=7 || 0 1 1 1 ~ 7
10 | 1 0 1 0 | 8-2=6 || 0 1 1 0 ~ 6
11 | 1 0 1 1 | 8-2-1=5 || 0 1 0 1 ~ 5
12 | 1 1 0 0 | 8+4=12 || X X X X
13 | 1 1 0 1 | 8+4-1=11 || X X X X
14 | 1 1 1 0 | 8+4-2=10 || X X X X
15 | 1 1 1 1 | 8+4-2-1=9 || 1 0 0 1 ~ 9
The K-maps then match the truth-table by its indexes:
Using the K-maps, it can be indeed simplified to these boolean expressions:
W = A·B + A·¬C·¬D
X = ¬B·C + ¬B·D + B·¬C·¬D
Y = ¬C·D + C·¬D
Z = D

Circuit that accept 4-bit number and generate its triple, what does that mean? Can someone give me an example of it?

I have to design a combinational circuit that accepts a 4-bit number and generate its triple, what does it mean? Can someone please give me an example of a specific input and its output so I can understand the question?
If you could give me also any hint for designing this circuit I would be very grateful.
Thank you.

You'll want to make a truth table and then derive the combinational logic from either boolean algebra (sum of truths), or from a k-map.
If, for example, the input is 0100 which is decimal 4, then the triple would be 12, or 1100. Since the highest number is 1111 (15), then your output has to be able to represent 45, or 101101 (6 bits).
Hence, you'll have something like:
Input | Output
-----------------
abcd uvwxyz
0000 | 000000
0001 | 000011
0010 | 000110
0011 | 001001
0100 | 001100
0101 | 001111
0110 | 010010
0111 | 010101
1000 | 011000
1001 | 011011
1010 | 011110
1011 | 100001
1100 | 100100
1101 | 100111
1110 | 101010
1111 | 101101
From that you can build a k-map for each output bit and find the minimum combinational logic required per output bit.
For example, to find the combinational logic for bit u then you would use the following k-map:
AB
00 01 11 10
CD 00 0 0 1 0
01 0 0 1 0
11 0 0 1 1
10 0 0 1 0
Which reduces to ACD + AB
Repeat for the other 5 bits (v-z) and you'll have the full combinational logic needed to implement the solution.

Start with a truth table:
IN OUT
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 3 0 0 0 0 1 1
2 0 0 1 0 6 0 0 0 1 1 0
3 0 0 1 1 9 0 0 1 0 0 1
4 0 1 0 0 C 0 0 1 1 0 0
5 0 1 0 1 F 0 0 1 1 1 1
6 0 1 1 0 12 0 1 0 0 1 0
7 0 1 1 1 15 0 1 0 1 0 1
8 1 0 0 0 18 0 1 1 0 0 0
9 1 0 0 1 1B 0 1 1 0 1 1
A 1 0 1 0 1E 0 1 1 1 1 0
B 1 0 1 1 21 1 0 0 0 0 1
C 1 1 0 0 24 1 0 0 1 0 0
D 1 1 0 1 27 1 0 0 1 1 1
E 1 1 1 0 2A 1 0 1 0 1 0
F 1 1 1 1 2D 1 0 1 1 0 1
Then use a standard technique such as a Karnaugh Map to deduce the input/output expressions.

How to apply logical indexing on multi-dimension matrix in matlab

My question is simple. I have an rgb image and a logical matrix. I want to set the pixel which is true in the corresponding element of logical matrix to (150,160,170).
For example:
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
r= 1 1 1 1 1 g= 1 1 1 1 1 b=1 1 1 1 1 logical_mat =1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
I want it results in
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
r= 150 1 1 1 1 g= 160 1 1 1 1 b=170 1 1 1 1
150 150 1 1 1 160 160 1 1 1 170 170 1 1 1
150 150 150 1 1 160 160 160 1 1 170 170 170 1 1
I have tried logical index, if set pixel into same color is easy
lm = repmat(logical_mat,[1 1 3]);
rgb(lm) = 150;
But I dont know how to set the value channel by channel.
Thanks in advance.

You're already creating the right logical matrix:
lm = repmat(logical_mat,[1 1 3]);
You need to create a 3-channel color matrix the same size.
cm = repmat(cat(3,150,160,170), size(lm,1), size(lm,2))
Then, index into the color matrix with lm:
rgb(lm) = cm(lm);

Pattern multiply - sed?

I have these kind of rows
0 1 1
I would like to multiply it by let's say 2 or 4 to get this pattern
0 0 0 0 1 1 1 1 1 1 1 1
Now, I have some piece of old code, which basically does this in the case of multiplying by 5.
But I cannot convert this script to do it for example 2 or 4 times...
Can anyone help me to figure it out?
Here is the code:
sed -e 's/\([01]\)/\1\1\1\1/7g ; s/\([01]\{2,\}\)/\1\1\1/g ; s/\b\([01]\)\b/\1\1\1\1\1/g ; s/\([01]\)\B/\1 /g'

$ echo '0 1 1' | sed -r 's/\S/& & & & &/g'
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1

using sed repeat 4 times:
kent$ echo "0 1 1
1 1 0"|sed 's/[01]/& & & &/g'
0 0 0 0 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 0 0 0 0
with awk, you can give the times you want to repeat as parameter: e.g. say repeat 5 times:
kent$ echo "0 1 1
dquote> 1 1 0"|awk -v t=5 '{f=1;while(f<=NF){ n=1;while(n<=t){printf "%s ",$f;n++;}f++;} print "";}'
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 0 0 0 0 0

This might work for you:
echo -e '0 1 1\n1 1 1 0 0' | sed "s/\S/$(echo {1..4}| sed 's/\S*/\&/g')/g"
0 0 0 0 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
This provides an OTT solution but it is programmable i.e. change 4 to any value you wish to multiply by.