Need to write a GPA calculator using the provided dictionary to output the gpa based on the 4 arguments of letter grades. I can get the code to run in google colab or other IDEs, but I get no output in CL. Can someone point me to what I am missing?
'''
import sys
#print (sys.argv[1])
#print (sys.argv[2])
#print (sys.argv[3])
#print (sys.argv[4])
def gpa_calculator():
grades = [sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4]]
grades_upper = [each_string.upper() for each_string in grades]
points = 0
grade_chart = {'A':4.0, 'A-':3.66, 'B+':3.33, 'B':3.0, 'B-':2.66,
'C+':2.33, 'C':2.0, 'C-':1.66, 'D+':1.33, 'D':1.00, 'D-':.66, 'F':0.00}
for grade in grades_upper:
points += grade_chart[grade]
gpa = points / len(grades)
rounded_gpa = round(gpa,2)
return rounded_gpa
print (rounded_gpa)
gpa_calculator()'''
return rounded_gpa
print (rounded_gpa)
You seem to be returning the value from the function before you reach the print statement. I'm guessing the value is returned correctly, but you don't do anything with the return value when calling the function, so nothing is output to the screen.
You should move the print(...) call above the return statement, or print out the result when calling the function:
print(gpa_calculate())
That's because you return first before print.
In jupyter notebook like google colab, each cell will print anything the last line returns (if any). That's why in such environment you get output.
Corrected code:
import sys
#print (sys.argv[1])
#print (sys.argv[2])
#print (sys.argv[3])
#print (sys.argv[4])
def gpa_calculator():
grades = [sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4]]
grades_upper = [each_string.upper() for each_string in grades]
points = 0
grade_chart = {'A':4.0, 'A-':3.66, 'B+':3.33, 'B':3.0, 'B-':2.66,
'C+':2.33, 'C':2.0, 'C-':1.66, 'D+':1.33, 'D':1.00, 'D-':.66, 'F':0.00}
for grade in grades_upper:
points += grade_chart[grade]
gpa = points / len(grades)
rounded_gpa = round(gpa,2)
print(rounded_gpa)
return rounded_gpa
gpa_calculator()
output:
C:\Users\XXXXX\Desktop>python3 a.py A B C D
2.5
C:\Users\XXXXX\Desktop>python3 a.py A A A A
4.0
Related
Im trying to follow an exercise on calculating the maximum drawdown and maximum drawdown duration of a market market neutral vs a long-only trading strategy.
I followed the code to the T and has worked perfectly up until now, and I seem to be getting a ValueError Exception. What code do I need to change for my code to work?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from MaxDD_Function import calculateMaxDD
# CALCUALTING MAXDD AND CREATING THE FUNCTION.
def calculateMaxDD(cumret):
highwatermark = np.zeros(cumret.shape)
drawdown = np.zeros(cumret.shape)
drawdownduration = np.zeros(cumret.shape)
for t in np.arange(1, cumret.shape[0]):
highwatermark[t] = (np.maximum(highwatermark[t -1]), cumret[t])
drawdown[t] = ((1+ cumret[t] )/(1 + highwatermark[t]) - 1)
if drawdown[t] == 0:
drawdownduration[t] == 0
else:
drawdownduration[t] = drawdownduration[t -1] + 1
maxDD, i = np.min(drawdown, np.argmin(drawdown)) # drawdown < 0 always
maxDDD = np.max(drawdownduration)
return (maxDD, maxDDD, i)
# First part of example. Read the csv data and calculate.
#The first dataframe/set for my strategy
df = pd.read_csv('IGE_daily.csv')
#print (df.head())
df.sort_values(by= 'Date', inplace = True)
dailyret = df.loc[:, 'Adj Close'].pct_change()
excessRet = ((dailyret - 0.04)/252)
sharpeRatio = ((np.sqrt(252)*np.mean(excessRet))/np.std(excessRet))
print (sharpeRatio)
#Second part of example
#This is the second dataframe/set for my strategy.
df2 = pd.read_csv('SPY.csv')
#The new data frame, with both datasets.
df = pd.merge (df, df2, on = 'Date', suffixes = ('_IGE', '_SPY'))
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace = True)
df.sort_index(inplace = True)
dailyret = df [['Adj Close_IGE', 'Adj Close_SPY' ]].pct_change() # Daily
Returns
dailyret.rename(columns = {"Adj Close_IGE": "IGE", "Adj Close_SPY": "SPY"
}, inplace = True)
netRet = (dailyret['IGE'] - dailyret['SPY'])/2
sharpeRatio = np.sqrt(252) * np.mean(netRet)/np.std(netRet)
print (sharpeRatio)
cumret = np.cumprod(1 + netRet) - 1 #Cumalative return
#print (plt.plot(cumret))
#print (plt.show()) # Remember to always run plt.show to see the plot in
terminal.
maxDrawdown, maxDrawdownDuration, startDrawdownDay =
calculateMaxDD(cumret.values)
maxDrawdown = calculateMaxDD(cumret.values)
print (maxDrawdown)
Here are the results I got from my above mentioned code:
Ivies-MacBook-Pro:Quant_Trading Ivieidahosa$ python Ex3_4.py
-46.10531783058014
0.7743286831426566
Traceback (most recent call last):
File "Ex3_4.py", line 76, in <module>
maxDrawdown = calculateMaxDD(cumret.values)
File "Ex3_4.py", line 15, in calculateMaxDD
highwatermark[t] = (np.maximum(highwatermark[t -1]), cumret[t])
ValueError: invalid number of arguments
I expected the output on themaxDrawdown to be -0.09529268047208683,maxDrawdwnduration to be 497 andstartDrawdownday to be 1223.
Q: What code do I need to change for my code to work?
Your code uses a call to a numpy function having a defined a minimum-call-signature as: np.maximum( <array_like_A>, <array_like_B> )
This simply fails to meet the expected behaviour once only one of the expected pair of values was delivered in the reported line ( see the closing parenthesis ), or a scalar or any other, non-array-like type of object(s) were attempted to be delivered into the call-signature:
highwatermark[t] = ( np.maximum( highwatermark[t-1] ), cumret[t] )
where a tuple was attempted to get constructed on the right hand side of the value-assignment (well, actually an object-reference gets assigned in python, sure, but was trying to remain short here to tell that fast for an easy reading ), the first item of which was expected to get assigned to a returned value from a call to the above documented np.maximum(...) function. And Hic Sunt Leones ...
May like to start further bug-tracing with a cross-check-ing of the state of objects and the call-signature:
try:
for t in np.arange( 1, cumret.shape[0] ):
print( "The shape of <highwatermark[t-1]>-object was: ",
highwatermark[t-1].shape, " for t == ", t
)
except:
print( "The <highwatermark[t-1]>-object was not a numpy array",
" for t == ", t
)
finally:
print( np.maximum.__doc__ )
I need to find the next neighbor using the euklid-distance-algorithm.
Given are two hashes with each 100 elements in this format:
$hash{$i}{price}
$hash{$i}{height}
I need to compare each hash elements with each other so that my result is a 100 x 100 matrix.
After that my matrix shoul be sorted after the following rules:
$hash{$i,$j} {lowest_euklid_distance}
$hash{$i,$j+1}{lowest_euklid_distance + 1}
$hash{$i,$j+2}{lowest_euklid_distance + 2}
.
.
$hash{$i+1,$j}{lowest_euklid_distance}
$hash{$i+1,$j}{lowest_euklid_distance + 2}
.
.
$hash{$i+n,$j+n}{lowest_euklid_distance+n}
My problem is to sort these elements properly.
Any adivices?
Thanks in advance.
/edit: adding more information:
I create the distance with the following subroutine:
sub euklid_distance{
#w1 = price_testdata
#w2 = height_testdata
#h1 = price_origindata
#h2 = height_origindata
my $w1 = trim($_[0]);
my $w2 = trim($_[1]);
my $h1 = trim($_[2]);
my $h2 = trim($_[3]);
my $result = (((($w2-$w1)**2)+(($h2-$h1)**2))**(1/2));
return $result;
}
I fetch the test and the origindata from two seperate lists.
The resulthash with the 100x100 matrix is created by the following code:
my %distancehash;
my $countvar=0;
for (my $j=0;$j<100;$j++){
for (my $i=0;$i<100;$i++){
$distancehash{$countvar}{distance} = euklid_distance( $origindata{$i}{price}, $testdata{$j}{price}, $origindata{$i}{height}, $testdata{$j}{height} );
$distancehash{$countvar}{originPrice} = $origindata{$i}{price};
$distancehash{$countvar}{originHeight} = $origindata{$i}{height};
$distancehash{$countvar}{testPrice} = $testdata{$j}{price};
$distancehash{$countvar}{testHeight} = $testdata{$j}{height};
$countvar++;
}
}
where $j goes over the testdata and $i over the origindata.
My goal is to have a new hash which is sorted by the lowest distance from the current $j ascending to the highest.
When using the rapid annotator tool brat, it appears that the created annotations file will present the annotation in the order that the annotations were performed by the user. If you start at the beginning of a document and go the end performing annotation, then the annotations will naturally be in the correct offset order. However, if you need to go earlier in the document and add another annotation, the offset order of the annotations in the output .ann file will be out of order.
How then can you rearrange the .ann file such that the annotations are in offset order when you are done? Is there some option within brat that allows you to do this or is it something that one has to write their own script to perform?
Hearing nothing, I did write a python script to accomplish what I had set out to do. First, I reorder all annotations by begin index. Secondly, I resequence the label numbers so that they are once again in ascending order.
import optparse, sys
splitchar1 = '\t'
splitchar2 = ' '
# for brat, overlapped is not permitted (or at least a warning is generated)
# we could use this simplification in sorting by simply sorting on begin. it is
# probably a good idea anyway.
class AnnotationRecord:
label = 'T0'
type = ''
begin = -1
end = -1
text = ''
def __repr__(self):
return self.label + splitchar1
+ self.type + splitchar2
+ str(self.begin) + splitchar2
+ str(self.end) + splitchar1 + self.text
def create_record(parts):
record = AnnotationRecord()
record.label = parts[0]
middle_parts = parts[1].split(splitchar2)
record.type = middle_parts[0]
record.begin = middle_parts[1]
record.end = middle_parts[2]
record.text = parts[2]
return record
def main(filename, out_filename):
fo = open(filename, 'r')
lines = fo.readlines()
fo.close()
annotation_records = []
for line in lines:
parts = line.split(splitchar1)
annotation_records.append(create_record(parts))
# sort based upon begin
sorted_records = sorted(annotation_records, key=lambda a: int(a.begin))
# now relabel based upon the sorted order
label_value = 1
for sorted_record in sorted_records:
sorted_record.label = 'T' + str(label_value)
label_value += 1
# now write the resulting file to disk
fo = open(out_filename, 'w')
for sorted_record in sorted_records:
fo.write(sorted_record.__repr__())
fo.close()
#format of .ann file is T# Type Start End Text
#args are input file, output file
if __name__ == '__main__':
parser = optparse.OptionParser(formatter=optparse.TitledHelpFormatter(),
usage=globals()['__doc__'],
version='$Id$')
parser.add_option ('-v', '--verbose', action='store_true',
default=False, help='verbose output')
(options, args) = parser.parse_args()
if len(args) < 2:
parser.error ('missing argument')
main(args[0], args[1])
sys.exit(0)
I have two files:
regions.txt: First column is the chromosome name, second and third are start and end position.
1 100 200
1 400 600
2 600 700
coverage.txt: First column is chromosome name, again second and third are start and end positions, and last column is the score.
1 100 101 5
1 101 102 7
1 103 105 8
2 600 601 10
2 601 602 15
This file is very huge it is about 15GB with about 300 million lines.
I basically want to get the mean of all scores in coverage.txt that are in each region in regions.txt.
In other words, start at the first line in regions.txt, if there is a line in coverage.txt which has the same chromosome, start-coverage is >= start-region, and end-coverage is <= end-region, then save its score to a new array. After finish searching in all coverages.txt print the region chromosome, start, end, and the mean of all scores that have been found.
Expected output:
1 100 200 14.6 which is (5+7+8)/3
1 400 600 0 no match at coverages.txt
2 600 700 12.5 which is (10+15)/2
I built the following MATLAB script which take very long time since I have to loop over coverage.txt many time. I don't know how to make a fast awk similar script.
My matlab script
fc = fopen('coverage.txt', 'r');
ft = fopen('regions.txt', 'r');
fw = fopen('out.txt', 'w');
while feof(ft) == 0
linet = fgetl(ft);
scant = textscan(linet, '%d%d%d');
tchr = scant{1};
tx = scant{2};
ty = scant{3};
coverages = [];
frewind(fc);
while feof(fc) == 0
linec = fgetl(fc);
scanc = textscan(linec, '%d%d%d%d');
cchr = scanc{1};
cx = scanc{2};
cy = scanc{3};
cov = scanc{4};
if (cchr == tchr) && (cx >= tx) && (cy <= ty)
coverages = cat(2, coverages, cov);
end
end
covmed = median(coverages);
fprintf(fw, '%d\t%d\t%d\t%d\n', tchr, tx, ty, covmed);
end
Any suggestions to make an alternative using AWK, Perl, or , ... etc I will aslo be pleased if someone can teach me how to get rid of all loops in my matlab script.
Thanks
Here is a Perl solution. I use hashes (aka dictionaries) to access the various ranges via the chromosome, thus reducing the number of loop iterations.
This is potentially efficient, as I don't do a full loop over regions.txt on every input line. Efficiency could perhaps be increased further when multithreading is used.
#!/usr/bin/perl
my ($rangefile) = #ARGV;
open my $rFH, '<', $rangefile or die "Can't open $rangefile";
# construct the ranges. The chromosome is used as range key.
my %ranges;
while (<$rFH>) {
chomp;
my #field = split /\s+/;
push #{$ranges{$field[0]}}, [#field[1,2], 0, 0];
}
close $rFH;
# iterate over all the input
while (my $line = <STDIN>) {
chomp $line;
my ($chrom, $lower, $upper, $value) = split /\s+/, $line;
# only loop over ranges with matching chromosome
foreach my $range (#{$ranges{$chrom}}) {
if ($$range[0] <= $lower and $upper <= $$range[1]) {
$$range[2]++;
$$range[3] += $value;
last; # break out of foreach early because ranges don't overlap
}
}
}
# create the report
foreach my $chrom (sort {$a <=> $b} keys %ranges) {
foreach my $range (#{$ranges{$chrom}}) {
my $value = $$range[2] ? $$range[3]/$$range[2] : 0;
printf "%d %d %d %.1f\n", $chrom, #$range[0,1], $value;
}
}
Example invocation:
$ perl script.pl regions.txt <coverage.txt >output.txt
Output on the example input:
1 100 200 6.7
1 400 600 0.0
2 600 700 12.5
(because (5+7+8)/3 = 6.66…)
Normally, I would load the files into R and calculate it, but given that one of them is so huge, this would become a problem. Here are some thoughts that might help you solving it.
Consider splitting coverage.txt by chromosomes. This would make the calculations less demanding.
Instead of looping over coverage.txt, you first read the regions.txt full into memory (I assume it is much smaller). For each region, you keep a score and a number.
Process coverage.txt line by line. For each line, you determine the chromosome and the region that this particular stretch belongs to. This will require some footwork, but if regions.txt is not too large, it might be more efficient. Add the score to the score of the region and increment number by one.
An alternative, most efficient way requires both files to be sorted first by chromosome, then by position.
Take a line from regions.txt. Record the chromosome and positions. If there is a line remaining from previous loop, go to 3.; otherwise go to 2.
Take a line from coverage.txt.
Check whether it is within the current region.
yes: add the score to the region, increment number. Move to 2.
no: divide score by number, write the current region to output, go to 1.
This last method requires some fine tuning, but will be most efficient -- it requires to go through each file only once and does not require to store almost anything in the memory.
Here's one way using join and awk. Run like:
join regions.txt coverage.txt | awk -f script.awk - regions.txt
Contents of script.awk:
FNR==NR && $4>=$2 && $5<=$3 {
sum[$1 FS $2 FS $3]+=$6
cnt[$1 FS $2 FS $3]++
next
}
{
if ($1 FS $2 FS $3 in sum) {
printf "%s %.1f\n", $0, sum[$1 FS $2 FS $3]/cnt[$1 FS $2 FS $3]
}
else if (NF == 3) {
print $0 " 0"
}
}
Results:
1 100 200 6.7
1 400 600 0
2 600 700 12.5
Alternatively, here's the one-liner:
join regions.txt coverage.txt | awk 'FNR==NR && $4>=$2 && $5<=$3 { sum[$1 FS $2 FS $3]+=$6; cnt[$1 FS $2 FS $3]++; next } { if ($1 FS $2 FS $3 in sum) printf "%s %.1f\n", $0, sum[$1 FS $2 FS $3]/cnt[$1 FS $2 FS $3]; else if (NF == 3) print $0 " 0" }' - regions.txt
Here is a simple MATLAB way to bin your coverage into regions:
% extract the regions extents
bins = regions(:,2:3)';
bins = bins(:);
% extract the coverage - only the start is needed
covs = coverage(:,2);
% use histc to place the coverage start into proper regions
% this line counts how many coverages there are in a region
% and assigns them proper region ids.
[h, i]= histc(covs(:), bins(:));
% sum the scores into correct regions (second output of histc gives this)
total = accumarray(i, coverage(:,4), [numel(bins),1]);
% average the score in regions (first output of histc is useful)
avg = total./h;
% remove every second entry - our regions are defined by start/end
avg = avg(1:2:end);
Now this works assuming that the regions are non-overlapping, but I guess that is the case. Also, every entry in the coverage file has to fall into some region.
Also, it is trivial to 'block' this approach over coverages, if you want to avoid reading in the whole file. You only need the bins, your regions file, which presumably is small. You can process the coverages in blocks, incrementally add to total and compute the average in the end.
I'm trying to convert Chinese lunar system to Gregorian using Perl, both for fun and for learning purposes. I thought there was some kind of easy mathematical algorithm to do the job but turned out I was wrong. Anyway, after a little googling around I came aross some ready SAS code that could do the job. Well, it was not too difficult for me to figure out how to rewrite most of the code in Perl. But there is something like this:
Convert2Gregorian = INTNX ('day', conDate, AddDay-1);
Google tells me that INTNX is a function that can return a date after which a certain date interval has been added. But using the keywords "Perl INTNX" gave me nothing useful. I then found a script written in Javascript, the general approach is about the same except that the Javascript uses Dateadd function, something like:
Return DateAdd(DateInterval.Day, AddDay - 1, conDate)
I also tried searching using PPM but was unable to find the module I wanted. Can someone kindly give me some pointers?
Thanks in advance :)
Update
Thanks to #psilva and #hobbs :)
Haha, finally I can translate one programming language to another. It's fun :)
Just for fun and maybe for later reference:
Here's the original SAS code and the Perl code translated by me. Correct me if I'm wrong :)
Note: Data incomplete.
SAS CODE IS AS FOLLOWS:
data DateNtoG (drop = Ngstrings conDate AddMonth AddDay Mrunyue I);
array NGlist{43} $18_TEMPORARY_(
"010110101010170130" /*(1968)*/
"010101101010000217" /*(1969)*/
"100101101101000206" /*(1970)*/
"010010101110150127" /*(1971)*/
);
input tYear tMonth tDay RunYue;
if (tyear >1967 and tyear<2011) then do;
NGstrings = NGlist(tYear - 1967);
conDate = MDY (substr (NGstrings,15,2),(NGstrings, 17,2), tYear);
AddMonth = tMonth;
select (substr(NGstrings, 14, 1));
when ("A" ) Mrunyue=10;
when ("B" ) Mrunyue=11;
when ( "C" ) Mrunyue=12;
otherwise Mrunyue = substr (NGstrings, 14,1);
end;
if ((RunYue=1) and (tMonth=Mrunyue) ANDMrunyue>0)or ((tMonth Mrunyue) AND Mrunyue>0) then
do;
Addmonth = tMonth+1;
end;
AddDay = tDay;
do i = 1 To AddMonth-1;
AddDay = AddDay + 29 + substr(NGstrings,i,1);
end;
dNtoG = INTNX ('day', conDate, AddDay - 1);
put "Transfereddate:" dNtoGdate9.;
end;
TRANSLATED Perl CODE IS AS FOLLOWS:
didn't handle the undefined situations for the moment
(I Changed the original SAS variable names)
#!perl
# A Chinese-Gregorian Calendar System Converter just for Testing
use Date::Calc qw(Add_Delta_Days);
use integer;
use strict;
use warnings;
$_ = shift #ARGV;
if (length == 8) {
$_.=0;
}
my ($Lunar_Year,$Lunar_Month,$Lunar_Day,$Leap_Month) = /(\d\d\d\d)(\d\d)(\d\d)(\d)/;
my %Lunar_Year_Patterns = qw/1968 010110101010170130 1969 010101101010000217 1970 100101101101000206 1971 010010101110150127/;
if (substr ($Lunar_Year_Patterns{$Lunar_Year},13,1) =~ /A/) {
$Leap_Month = 10;
} elsif (substr ($Lunar_Year_Patterns{$Lunar_Year},13,1)=~ /B/){
$Leap_Month = 11;
} elsif (substr ($Lunar_Year_Patterns{$Lunar_Year},13,1)=~ /C/){
$Leap_Month = 12;
} else {
$Leap_Month = substr($Lunar_Year_Patterns{$Lunar_Year},13,1);
}
my $First_Lunar_Day_In_Gregorian_Month = substr($Lunar_Year_Patterns{$Lunar_Year},14,2);
my $First_Lunar_Day_In_Gregorian_Day = substr($Lunar_Year_Patterns{$Lunar_Year},16,2);
my $AddMonth;
if ( (($Leap_Month ==1) && ($Lunar_Month == $Leap_Month) && ($Leap_Month > 0)) || (($Lunar_Month > $Leap_Month) && ($Leap_Month>0) ) ){
$AddMonth = $Lunar_Month +1;
} else {
$AddMonth = $Lunar_Month;
}
my $AddDay;
$AddDay = $Lunar_Day;
for(my $i = 1; $i <= $AddMonth - 1; $i++){
$AddDay = $AddDay +29 + substr($Lunar_Year_Patterns{$Lunar_Year},$i,1);
}
my #Gregorian = Add_Delta_Days($Lunar_Year,$First_Lunar_Day_In_Gregorian_Month,$First_Lunar_Day_In_Gregorian_Day,$AddDay -1);
print #Gregorian;
DateTime is the 800-pound gorilla of date handling. It's quite large, but it's large because it does a lot, and more importantly, it does it right.
With DateTime you would simply construct the DateTime object for the starting date, and then obtain the ending date by adding: $dt->add(days => $add_days).
Also, there's a DateTime::Calendar::Chinese that you can compare your results with, even if you want to reinvent this particular wheel for fun :)
Check out the delta functions in Date::Calc and Date::Calc::Object, and Date::Calc::PP. Specifically, you might want to look at the DateCalc_add_delta_days subroutine in the PP source.
You could also try looking at the source of Calendar::China.