Using a growth formula for grouped observations

I have a dataset which is shown below:
clear
input year price growth id
2008 5 -0.444 1
2009 . . 1
2010 7 -0.222 1
2011 9 0 1
2011 8 -0.111 1
2012 9 0 1
2013 11 0.22 1
2012 10 0 2
2013 12 0.2 2
2013 . . 2
2014 13 0.3 2
2015 17 0.7 2
2015 16 0.6 2
end
I want to generate the variable growth, which is the growth of price relative to a base year. The formula is:
growth = (price in the current year - price in the base year) / price in the base year
The base year is always 2012.
How can I generate this growth variable for each group of observations (by id)?

The base price can be picked out directly by egen:
bysort id: egen price_b = total(price * (year == 2012))
generate wanted = (price - price_b) / price_b
Note that total() is used on the assumption that each id has exactly one observation with year == 2012; with more than one, the base prices would be summed.
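For readers outside Stata, the same base-year logic can be sketched in Python. This is a minimal illustration under stated assumptions, not a translation of egen: the names `rows`, `base_prices` and `growth` are made up for this sketch, missing prices are represented as `None`, and a second base-year row for an id raises an error rather than being summed.

```python
# Rows mirror the question's data as (year, price, id); None marks a missing price.
rows = [
    (2008, 5, 1), (2009, None, 1), (2010, 7, 1), (2011, 9, 1),
    (2011, 8, 1), (2012, 9, 1), (2013, 11, 1),
    (2012, 10, 2), (2013, 12, 2), (2013, None, 2),
    (2014, 13, 2), (2015, 17, 2), (2015, 16, 2),
]
BASE_YEAR = 2012


def base_prices(rows, base_year=BASE_YEAR):
    """Map each id to its base-year price, insisting on exactly one such row."""
    found = {}
    for year, price, group in rows:
        if year == base_year:
            if group in found:
                raise ValueError(f"id {group} has more than one {base_year} row")
            found[group] = price
    return found


def growth(rows, base_year=BASE_YEAR):
    """Return (price - base) / base for each row, or None where it is undefined."""
    base = base_prices(rows, base_year)
    result = []
    for _, price, group in rows:
        b = base.get(group)
        if price is None or b is None or b == 0:
            result.append(None)  # missing price, no base-year row, or zero base
        else:
            result.append((price - b) / b)
    return result


wanted = growth(rows)
```

Raising on a duplicate base year makes the "one observation per id" assumption explicit instead of silently adding the two prices together.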

The following works for me:
bysort id: generate obs = _n
generate double wanted = .
levelsof id, local(ids)
foreach x of local ids {
summarize obs if id == `x' & year == 2012, meanonly
bysort id: replace wanted = (price - price[`=r(min)']) / ///
price[`=r(min)'] if id == `x'
}
If the id values are consecutive, then the following will be faster:
forvalues i = 1 / 2 {
summarize obs if id == `i' & year == 2012, meanonly
bysort id: replace wanted = (price - price[`=r(min)']) / ///
price[`=r(min)'] if id == `i'
}
Results:
list, sepby(id)
+-----------------------------------------------+
| year price growth id obs wanted |
|-----------------------------------------------|
1. | 2008 5 -.444 1 1 -.44444444 |
2. | 2009 . . 1 2 . |
3. | 2010 7 -.222 1 3 -.22222222 |
4. | 2011 9 0 1 4 0 |
5. | 2011 8 -.111 1 5 -.11111111 |
6. | 2012 9 0 1 6 0 |
7. | 2013 11 .22 1 7 .22222222 |
|-----------------------------------------------|
8. | 2012 10 0 2 1 0 |
9. | 2013 12 .2 2 2 .2 |
10. | 2013 . . 2 3 . |
11. | 2014 13 .3 2 4 .3 |
12. | 2015 17 .7 2 5 .7 |
13. | 2015 16 .6 2 6 .6 |
+-----------------------------------------------+

Related

Problems with fprintf format (Matlab)

I want to fix the format of the variables in a text file (shown at the end; the spaces should become tabs), using the following MATLAB code, which runs after the data have been imported:
id = fopen('datoscorfecha.txt', 'w');
fprintf(id, '%5s %3s %3s %3s %4s %3s %6s\n',...
'fecha', 'dia','mes', 'ano', 'hora', 'min', 'abs370');
datos = cat(2,dia, mes, ano, hora, min1, abs370);
datos = datos';
fecha = Fecha'; % Imported as a string
fprintf(id, '%16s %2i %2i %4i %2i %2i %8.4f\n',...
fecha, datos);
fclose(id);
type datoscorfecha.txt
But I get this error:
Error using fprintf
Unable to convert 'string' value to 'int64'.
The text file looks like this:
Fecha dia mes ano hora min abs370
03/06/2016 00:00 3 6 2016 0 0 29.356218
03/06/2016 00:05 3 6 2016 0 5 30.45703
03/06/2016 00:10 3 6 2016 0 10 27.53877
03/06/2016 00:15 3 6 2016 0 15 23.19832
03/06/2016 00:20 3 6 2016 0 20 22.333924
03/06/2016 00:25 3 6 2016 0 25 22.086426
03/06/2016 00:30 3 6 2016 0 30 20.933898

Something like this may let you replace the spaces with tabs. Here I read the text file with the textscan() function, which separates the columns, and parse every value as a string. The writematrix() function then writes the data to a new text file with the Delimiter set to tab.
Text.txt (Input)
Fecha dia mes ano hora min abs370
03/06/2016 00:00 3 6 2016 0 0 29.356218
03/06/2016 00:05 3 6 2016 0 5 30.45703
03/06/2016 00:10 3 6 2016 0 10 27.53877
03/06/2016 00:15 3 6 2016 0 15 23.19832
03/06/2016 00:20 3 6 2016 0 20 22.333924
03/06/2016 00:25 3 6 2016 0 25 22.086426
03/06/2016 00:30 3 6 2016 0 30 20.933898
datoscorfecha.txt (Output)
Fecha dia mes ano hora min abs370
03/06/2016 00:00 3 6 2016 0 0 29.3562
03/06/2016 00:05 3 6 2016 0 5 30.4570
03/06/2016 00:10 3 6 2016 0 10 27.5388
03/06/2016 00:15 3 6 2016 0 15 23.1983
03/06/2016 00:20 3 6 2016 0 20 22.3339
03/06/2016 00:25 3 6 2016 0 25 22.0864
03/06/2016 00:30 3 6 2016 0 30 20.9339
Full Script:
File_ID = fopen("Text.txt");
Data = textscan(File_ID, '%s %s %s %s %s %s %s %s', 'Delimiter',' ');
fclose(File_ID);
% Data = readtable("Text.txt");
Column_1 = string(Data{:,1});
Column_2 = string(Data{:,2});
Column_3 = string(Data{:,3});
Column_4 = string(Data{:,4});
Column_5 = string(Data{:,5});
Column_6 = string(Data{:,6});
Column_7 = string(Data{:,7});
Column_8 = string(Data{:,8});
for Index = 2: length(Column_8)
Number = str2double(char(Column_8(Index,1)));
Number = num2str(Number);
Decimal_String = split(Number,".");
Decimal_String = Decimal_String{2};
if length(Decimal_String) ~= 4
Number = string(Number) + "0";
end
Column_8(Index,1) = Number;
end
Table = [Column_1 Column_2 Column_3 Column_4 Column_5 Column_6 Column_7 Column_8];
writematrix(Table,"datoscorfecha.txt",'Delimiter','tab');
type datoscorfecha.txt
Ran using MATLAB R2019b
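The same idea (split on spaces, fix the last column to four decimals, re-join with tabs) can be sketched outside MATLAB. The Python below is only an illustration under assumptions: `to_tab_delimited` is a hypothetical helper name, the input is taken as a list of lines rather than a file, and any row whose last field is not numeric is treated as the header.

```python
def to_tab_delimited(lines, decimals=4):
    """Split each line on whitespace, format the last column, and re-join with tabs."""
    out = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            # Fixed number of decimals, e.g. 30.45703 becomes 30.4570.
            fields[-1] = f"{float(fields[-1]):.{decimals}f}"
        except ValueError:
            pass  # header row: the last field is a name, not a number
        out.append("\t".join(fields))
    return out


sample = [
    "Fecha dia mes ano hora min abs370",
    "03/06/2016 00:00 3 6 2016 0 0 29.356218",
    "03/06/2016 00:05 3 6 2016 0 5 30.45703",
]
converted = to_tab_delimited(sample)
```

Formatting with a fixed number of decimals avoids the manual zero-padding step in the loop above.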

How can I merge two data sets with ID variation in stata

I have following two data sets.
The first one from children looks like this.
ID year Q1 Q2 Q3 Q4 ....
101 2014 1 2 2 2
101 2016 1 2 2 2
101 2017 1 2 2 2
101 2018 1 2 2 2
401 2014 1 2 2 2
401 2015 1 2 3 3
401 2016 1 2 2 2
401 2017 1 2 1 1
401 2018 1 2 2 2
402 2014 1 1 0 3
402 2015 1 1 2 2
402 2016 1 1 2 2
402 2017 1 1 3 3
402 2018 1 1 2 3
Here's the second one from their parents.
ID year Q101 Q102
1 2014 1 3
1 2015 1 3
1 2016 1 3
1 2017 1 2
1 2018 1 2
2 2014 2 .
2 2015 1 2
2 2016 . .
2 2017 1 3
2 2018 2 .
4 2014 1 3
4 2015 1 3
4 2016 1 3
4 2017 1 3
4 2018 1 3
So the parent ID can be matched to the child ID once the child's last two digits are removed. It seems that parent ID 4 has two children.
I tried
merge 1:m ID using kids data as the master data set.
but it didn't work.
Thank you.

Good answers are more likely if you (a) show the code you tried and (b) give data in the form of code that anybody using Stata can run. The code here follows from editing your post and is close to what dataex would produce directly; see the Stata tag wiki, or help dataex in an up-to-date Stata or in one where you have installed dataex from SSC.
clear
input ID year Q1 Q2 Q3 Q4
101 2014 1 2 2 2
101 2016 1 2 2 2
101 2017 1 2 2 2
101 2018 1 2 2 2
401 2014 1 2 2 2
401 2015 1 2 3 3
401 2016 1 2 2 2
401 2017 1 2 1 1
401 2018 1 2 2 2
402 2014 1 1 0 3
402 2015 1 1 2 2
402 2016 1 1 2 2
402 2017 1 1 3 3
402 2018 1 1 2 3
end
gen IDP = floor(ID/100)
save children
clear
input ID year Q101 Q102
1 2014 1 3
1 2015 1 3
1 2016 1 3
1 2017 1 2
1 2018 1 2
2 2014 2 .
2 2015 1 2
2 2016 . .
2 2017 1 3
2 2018 2 .
4 2014 1 3
4 2015 1 3
4 2016 1 3
4 2017 1 3
4 2018 1 3
end
rename ID IDP
merge 1:m IDP year using children
list
+----------------------------------------------------------------------+
| IDP year Q101 Q102 ID Q1 Q2 Q3 Q4 _merge |
|----------------------------------------------------------------------|
1. | 1 2014 1 3 101 1 2 2 2 matched (3) |
2. | 1 2015 1 3 . . . . . master only (1) |
3. | 1 2016 1 3 101 1 2 2 2 matched (3) |
4. | 1 2017 1 2 101 1 2 2 2 matched (3) |
5. | 1 2018 1 2 101 1 2 2 2 matched (3) |
|----------------------------------------------------------------------|
6. | 2 2014 2 . . . . . . master only (1) |
7. | 2 2015 1 2 . . . . . master only (1) |
8. | 2 2016 . . . . . . . master only (1) |
9. | 2 2017 1 3 . . . . . master only (1) |
10. | 2 2018 2 . . . . . . master only (1) |
|----------------------------------------------------------------------|
11. | 4 2014 1 3 401 1 2 2 2 matched (3) |
12. | 4 2015 1 3 401 1 2 3 3 matched (3) |
13. | 4 2016 1 3 402 1 1 2 2 matched (3) |
14. | 4 2017 1 3 401 1 2 1 1 matched (3) |
15. | 4 2018 1 3 402 1 1 2 3 matched (3) |
|----------------------------------------------------------------------|
16. | 4 2014 1 3 402 1 1 0 3 matched (3) |
17. | 4 2015 1 3 402 1 1 2 2 matched (3) |
18. | 4 2016 1 3 401 1 2 2 2 matched (3) |
19. | 4 2017 1 3 402 1 1 3 3 matched (3) |
20. | 4 2018 1 3 401 1 2 2 2 matched (3) |
+----------------------------------------------------------------------+
As far as the merge is concerned the essentials are identifiers with the same name(s) in both datasets and the correct pattern for merging. The parent identifier is only implied by the children dataset.
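The two essentials (derive the parent identifier from the child ID, then match on identifier and year) can be sketched in plain Python. This is an illustration under assumptions, not what merge does internally: the rows are a small subset of the example data, `merge_one_to_many` is a hypothetical name, and only matched rows are kept, whereas Stata's merge also keeps unmatched observations and flags them in _merge.

```python
# A small subset of the example data, one dict per observation.
children = [
    {"ID": 101, "year": 2014, "Q1": 1},
    {"ID": 401, "year": 2015, "Q1": 1},
    {"ID": 402, "year": 2015, "Q1": 1},
]
parents = [
    {"IDP": 1, "year": 2014, "Q101": 1},
    {"IDP": 2, "year": 2014, "Q101": 2},
    {"IDP": 4, "year": 2015, "Q101": 1},
]


def merge_one_to_many(parents, children):
    """Attach each child row to the parent row with the same (IDP, year)."""
    lookup = {}
    for p in parents:
        key = (p["IDP"], p["year"])
        if key in lookup:
            raise ValueError(f"parent key {key} is not unique")
        lookup[key] = p
    merged = []
    for c in children:
        # Dropping the last two digits of the child ID gives the parent ID.
        key = (c["ID"] // 100, c["year"])
        p = lookup.get(key)
        if p is not None:
            merged.append({**p, **c})
    return merged


merged = merge_one_to_many(parents, children)
```

Integer division by 100 plays the role of floor(ID/100), and the uniqueness check corresponds to the "1" side of a 1:m merge.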

Perl function localtime giving incorrect values for years between 1964 and 1967

I was getting some whacky values from localtime function in Perl. The following is some code for which I get incorrect values.
In particular, this code is meant to determine the weekday for the first of each year.
#!/usr/bin/perl
use strict 'vars';
use Time::Local;
use POSIX qw(strftime);
mytable();
sub mytable {
print "Year" . " "x4 . "Jan 1st (localtime)" . " "x4 . "Jan 1st (Gauss)\n";
foreach my $year ( 1964 .. 2017 )
{
my $janlocaltime = evalweekday( 1,1,$year);
my $jangauss = gauss($year);
my $diff = $jangauss - $janlocaltime;
printf "%4s%10s%-12s ",$year,"",$janlocaltime;
printf "%12s",$jangauss;
printf " <----- ERROR: off by %2s", $diff if ( $diff != 0 );
print "\n";
}
}
sub evalweekday {
## Using "localtime"
my ($day,$month,$year) = @_;
my $epoch = timelocal(0,0,0, $day,$month-1,$year-1900);
my $weekday = ( localtime($epoch) ) [6];
return $weekday;
}
sub gauss {
## Alternative approach
my ($year) = @_;
my $weekday =
( 1 + 5 * ( ( $year - 1 ) % 4 )
+ 4 * ( ( $year - 1 ) % 100 )
+ 6 * ( ( $year - 1 ) % 400 )
) % 7;
return $weekday;
}
Here is the output which shows the years with incorrect values:
Year Jan 1st (localtime) Jan 1st (Gauss)
1964 2 3 <----- ERROR: off by 1
1965 4 5 <----- ERROR: off by 1
1966 5 6 <----- ERROR: off by 1
1967 6 0 <----- ERROR: off by -6
1968 1 1
1969 3 3
1970 4 4
1971 5 5
1972 6 6
1973 1 1
1974 2 2
1975 3 3
1976 4 4
1977 6 6
1978 0 0
1979 1 1
1980 2 2
1981 4 4
1982 5 5
1983 6 6
1984 0 0
1985 2 2
1986 3 3
1987 4 4
1988 5 5
1989 0 0
1990 1 1
1991 2 2
1992 3 3
1993 5 5
1994 6 6
1995 0 0
1996 1 1
1997 3 3
1998 4 4
1999 5 5
2000 6 6
2001 1 1
2002 2 2
2003 3 3
2004 4 4
2005 6 6
2006 0 0
2007 1 1
2008 2 2
2009 4 4
2010 5 5
2011 6 6
2012 0 0
2013 2 2
2014 3 3
2015 4 4
2016 5 5
2017 0 0
In fact, the errors seem to extend as far back as 1900, but I just haven't verified that they are in fact wrong prior to 1964.
perl --version returns the following:
This is perl 5, version 18, subversion 2 (v5.18.2) built for darwin-thread-multi-2level
(with 2 registered patches, see perl -V for more detail)
Copyright 1987-2013, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
I'm not sure whether it's relevant, but my operating system is macOS Sierra Version 10.12.3.
I've read through the documentation, but I don't see anything (or I'm being blind) regarding values returned prior to 1968. I've also tried to do a websearch but am not pulling up anything beyond the typical misunderstandings of array values and the numbering of months and days of the year.
Could someone help me out and explain what I'm getting wrong? Or, if this is an issue with my version of Perl, let me know what I can do to fix it.

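This is likely to do with how Time::Local interprets the year argument and handles negative epoch values. Have a look at perldoc Time::Local (the sections on year value interpretation and on negative epoch values).
On my Linux box (perl 5.20), your code demonstrates the issue nicely. If you print out the epoch value received, you will see the issue, namely that the epoch returned by timelocal becomes huge instead of more negative. The huge values correspond to the years 2064 to 2067, which is consistent with Time::Local's documented treatment of year values 0..99 as a rolling century around the current year. Here $year-1900 yields 64..67, so passing the four-digit year to timelocal should avoid the jump: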
Year Epoch Jan 1st (localtime) Jan 1st (Gauss)
1964 2966342400 2 3 <----- ERROR: off by 1
1965 2997964800 4 5 <----- ERROR: off by 1
1966 3029500800 5 6 <----- ERROR: off by 1
1967 3061036800 6 0 <----- ERROR: off by -6
1968 -63185400 1 1
1969 -31563000 3 3
1970 -27000 4 4
1971 31509000 5 5
1972 63045000 6 6
Why don't you try using the DateTime library instead:
use DateTime;
my $dt = DateTime->new(
year => 1966, # Real Year
day => 1, # 1-31
month => 1, # 1-12
hour => 0, # 0-23
second => 0, # 0-59
);
print $dt->dow . "\n";
6
6 = Saturday (DateTime numbers weekdays from 1 = Monday to 7 = Sunday), which matches Wikipedia: Jan 1, 1966 was a Saturday.
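As an independent cross-check of the Gauss column in the question, the comparison can be repeated in Python, whose datetime module uses the proleptic Gregorian calendar and does not depend on epoch seconds. This is a sketch for verification only; the function names are made up here, and weekdays are renumbered to 0 = Sunday through 6 = Saturday to match Perl's localtime.

```python
from datetime import date


def jan1_weekday(year):
    """Weekday of 1 January, numbered 0 = Sunday ... 6 = Saturday."""
    # isoweekday() returns 1 = Monday ... 7 = Sunday, so Sunday wraps to 0.
    return date(year, 1, 1).isoweekday() % 7


def gauss_jan1_weekday(year):
    """The formula from the question's gauss() sub, same numbering."""
    y = year - 1
    return (1 + 5 * (y % 4) + 4 * (y % 100) + 6 * (y % 400)) % 7


# Years in the question's range where the two methods disagree.
mismatches = [
    y for y in range(1964, 2018)
    if jan1_weekday(y) != gauss_jan1_weekday(y)
]
```

An empty `mismatches` list supports the reading that the Gauss values are right and the localtime column is the one that went wrong for 1964 to 1967.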

Stata merge with multiple match variables

I am having difficulty combining datasets for a project. Our primary dataset is organized by individual judges. It is an attribute dataset.
judge
j | x | y | z
----|----|----|----
1 | 2 | 3 | 4
2 | 5 | 6 | 7
The second dataset is a case database. Each observation is a case and judges can appear in one of three variables.
case
case | j1 | j2 | j3 | year
-----|----|----|----|-----
1 | 1 | 2 | 3 | 2002
2 | 2 | 3 | 1 | 1997
We would like to merge the case database into the attribute database, matching by judge. So, for each case that a judge appears in j1, j2, or j3, an observation for that case would be added creating a dataset that looks like below.
combined
j | x | y | z | case | year
---|----|----|----|-------|--------
1 | 2 | 3 | 4 | 1 | 2002
1 | 2 | 3 | 4 | 2 | 1997
2 | 5 | 6 | 7 | 1 | 2002
2 | 5 | 6 | 7 | 2 | 1997
My best guess is to use
rename j1 j
merge 1:m j using case
rename j j1
rename j2 j
merge 1:m j using case
However, I am unsure that this will work, especially since the merging dataset has three possible variables that the j identification can occur in.

Your examples are clear, but even better would be to present them as code that needs no editing to remove the table scaffolding. See dataex from SSC (ssc inst dataex).
It's a case of the missing reshape, I think.
clear
input j x y z
1 2 3 4
2 5 6 7
end
save judge
clear
input case j1 j2 j3 year
1 1 2 3 2002
2 2 3 1 1997
end
reshape long j , i(case) j(which)
merge m:1 j using judge
list
+-------------------------------------------------------+
| case which j year x y z _merge |
|-------------------------------------------------------|
1. | 1 1 1 2002 2 3 4 matched (3) |
2. | 2 3 1 1997 2 3 4 matched (3) |
3. | 2 1 2 1997 5 6 7 matched (3) |
4. | 1 2 2 2002 5 6 7 matched (3) |
5. | 2 2 3 1997 . . . master only (1) |
|-------------------------------------------------------|
6. | 1 3 3 2002 . . . master only (1) |
+-------------------------------------------------------+
drop if _merge < 3
list
+---------------------------------------------------+
| case which j year x y z _merge |
|---------------------------------------------------|
1. | 1 1 1 2002 2 3 4 matched (3) |
2. | 2 3 1 1997 2 3 4 matched (3) |
3. | 2 1 2 1997 5 6 7 matched (3) |
4. | 1 2 2 2002 5 6 7 matched (3) |
+---------------------------------------------------+
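The reshape-then-merge idea carries over to other tools. The Python below is a minimal sketch under assumptions: the helper names `reshape_long` and `attach_attributes` are invented for illustration, the judge table is held as a dict keyed by j, and rows for judges with no attributes (judge 3 here) are dropped, as drop if _merge < 3 does above.

```python
# Judge attributes keyed by judge id, and cases in wide form.
judges = {1: {"x": 2, "y": 3, "z": 4}, 2: {"x": 5, "y": 6, "z": 7}}
cases = [
    {"case": 1, "j1": 1, "j2": 2, "j3": 3, "year": 2002},
    {"case": 2, "j1": 2, "j2": 3, "j3": 1, "year": 1997},
]


def reshape_long(cases, slots=("j1", "j2", "j3")):
    """Turn one row per case into one row per (case, judge slot)."""
    long_rows = []
    for c in cases:
        for which, slot in enumerate(slots, start=1):
            long_rows.append(
                {"case": c["case"], "which": which, "j": c[slot], "year": c["year"]}
            )
    return long_rows


def attach_attributes(long_rows, judges):
    """Keep only rows whose judge has attributes and add those attributes."""
    return [{**row, **judges[row["j"]]} for row in long_rows if row["j"] in judges]


combined = sorted(
    attach_attributes(reshape_long(cases), judges),
    key=lambda r: (r["j"], r["case"]),
)
```

Sorting by judge and case reproduces the order of the combined table the question asks for.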

Need a Logic to say Bingo

I am creating an iphone app where I have a grid view of 25 images as:
0 1 2 3 4
5 6 7 8 9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
When any 5 consecutive images are selected it should say Bingo; for example, if 0, 6, 12, 18, 24 are selected it should say Bingo.
How can I do that? Please help me.
Many Thanks for your help.
Rs
iPhone Developer

-----------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 |
-----------------------------------
| 6 | 7 | 8 | 9 | 10 | 11 |
-----------------------------------
| 12 | 13 | 14 | 15 | 16 | 17 |
-----------------------------------
| 18 | 19 | 20 | 21 | 22 | 23 |
-----------------------------------
| 24 |
-----------------------------------
I hope this is what your grid looks like.
Associate each cell with an array. The array will contain the list of all neighbor elements of that cell.
For example, the neighbor array of the cell [ 6 ] will look like array(0, 7, 12), which are all the immediate neighbors of [ 6 ].
Set counter = 0;
Now, when someone clicks an element, increment the counter (Now counter = 1)
When he clicks the second element, check if the element is in the neighbor list of the previous element OR the 1st element.
If the element clicked is in the neighbor list, increment the counter (now counter = 2)
ELSE
If the element clicked is not in the neighbor array, reset the counter (counter = 0) and start over.
Check if the value of counter = 5. If it is, Say Bingo!
The algorithm is not fully correct, but I hope you got the idea :)
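A different approach from neighbor tracking is to precompute every winning line and test whether the selected cells contain one. The Python below is a sketch of that alternative, not the algorithm above. It assumes the 5 x 5 grid numbered 0 to 24 from the question and assumes "Bingo" means a complete row, column or diagonal; `winning_lines` and `is_bingo` are names invented for this example.

```python
SIZE = 5  # the question's grid is 5 x 5, numbered 0..24 row by row


def winning_lines(size=SIZE):
    """Return every row, column and diagonal as a set of cell indices."""
    rows = [{r * size + c for c in range(size)} for r in range(size)]
    cols = [{r * size + c for r in range(size)} for c in range(size)]
    diagonal = {i * size + i for i in range(size)}
    anti_diagonal = {i * size + (size - 1 - i) for i in range(size)}
    return rows + cols + [diagonal, anti_diagonal]


def is_bingo(selected, size=SIZE):
    """True if the selected cells contain at least one complete line."""
    chosen = set(selected)
    return any(line <= chosen for line in winning_lines(size))
```

Because this checks set containment, the order in which the images are tapped does not matter, and extra selected cells do not break a completed line.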
