Python pandas group by aggregate statement print - eclipse

I am new to pandas and want to use a group-by statement. It worked, but I am unable to print the result after the groupby call.
I am using the Eclipse IDE.
Here is my code:
import pandas as pd
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
'foo', 'bar', 'foo', 'foo'],
'B' : ['one', 'one', 'two', 'three',
'two', 'two', 'one', 'three'],
'C' : [1,2,3,4,5,6,0,2]})
grouped = df.groupby('C')
print grouped
And I get this as output: <pandas.core.groupby.DataFrameGroupBy object at 0x02FCA7D0>
My question is: how can I print the grouped variable with the correct result?

I've always forced it to print using .head()
grouped.head()
What is happening is you are grouping, but not aggregating the result.
grouped = df.groupby('A')
grouped
grouped_sum = df.groupby('A').sum()
grouped_sum
C
A
bar 12
foo 11
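To see what is inside the grouped object without aggregating, you can also iterate over it. The following is a minimal sketch for Python 3 (hence print() as a function), reusing the DataFrame from the question. It selects column C explicitly before summing, because recent pandas versions may otherwise also try to "sum" the string column B.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
    "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "C": [1, 2, 3, 4, 5, 6, 0, 2],
})

grouped = df.groupby("A")

# A GroupBy object is lazy; iterating yields (key, sub-DataFrame) pairs.
for key, group in grouped:
    print(key)
    print(group)

# Aggregating one numeric column gives one value per group.
sums = grouped["C"].sum()
print(sums)
```

Printing the GroupBy object itself only shows its repr, which is the line seen in the question; printing each group or an aggregate shows actual data.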

Return multiple output from stored proc in perl DBI returns extra 0 value [duplicate]

I'm a beginner in SQL. I have created the procedure as follows:
create procedure testprocedure2 as
select 'one'
select 'three'
select 'five'
When I execute the query in the database, it shows the three results one, three, five. The SQL query is exec TEST_ABC_DB.dbo.testprocedure2
When I run the same query from Perl, it gives only one record, which is one.
$sth = $dbh->prepare("exec TEST_ABC_DB.dbo.testprocedure2");
$sth->execute();
while (@row = $sth->fetchrow_array())
{
print $row[0]."\t";
print "\n";
}
I don't know what the problem is. How can I fix it? I hope this answer will help with yesterday's question.
Through the driver (e.g. DBD::ODBC)
Since you're using DBD::ODBC, you can use more_results provided by that driver to get the results of multiple queries in one execute.
This is the example they show in the documentation.
do {
my @row;
while (@row = $sth->fetchrow_array()) {
# do stuff here
}
} while ($sth->{odbc_more_results});
If we want to do this with your example queries, it's pretty much the same. You run your stored procedure, and then proceed with the do {} while construct (note that do {} while is not a real loop block, so you cannot use next or last inside it!).
my $sth = $dbh->prepare("exec TEST_ABC_DB.dbo.testprocedure2");
$sth->execute;
do {
while (my @row = $sth->fetchrow_array()) {
print $row[0]."\t";
print "\n";
}
} while ($sth->{odbc_more_results});
This should print your expected result.
one
three
five
Some other drivers also provide this. If they do, you can call $sth->more_results instead of using the internals as described below.
Workaround if your driver doesn't support this
There is no way for DBI itself to return the result of multiple queries at once. You can run them, but you cannot get the results.
If you really need three separate queries in your procedure and want all of the results, the answers by Shakheer and Shahzad to use a UNION are spot on.
However, your example is probably contrived. You probably don't have the same number of columns in each of those queries, and you need to distinguish the results of each of the queries.
We have to change SQL and Perl code for this.
To get that to work, you can insert additional rows that you can later use to map each stack of results to each query.
Let's say the procedure looks like this:
create procedure testprocedure3 as
select 'one'
select 'three', 'three', 'three'
select 'five', 'five', 'five', 'five', 'five'
This is still just one row per query, but it should do as an example. With the UNION approach, it first becomes this:
create procedure testprocedure3 as
select 'one'
union all
select 'three', 'three', 'three'
union all
select 'five', 'five', 'five', 'five', 'five'
If you run this, it might fail. In ANSI SQL a UNION needs to have the same number of columns in all its queries, so I assume SQLServer also wants this. We need to fill them up with NULLs. Add them to all the queries so they match the number of columns in the one with the largest number of columns.
create procedure testprocedure3 as
select 'one', NULL, NULL, NULL, NULL
union all
select 'three', 'three', 'three', NULL, NULL
union all
select 'five', 'five', 'five', 'five', 'five'
If we now loop over it in Perl with the following code, we'll get something back.
use Data::Dumper;
my $sth = $dbh->prepare("exec TEST_ABC_DB.dbo.testprocedure3");
$sth->execute;
while ( my $row = $sth->fetchrow_arrayref ) {
print Dumper $row;
}
We'll see output similar to this (I didn't run the code, but wrote the output manually):
$VAR1 = [ 'one', undef, undef, undef, undef ];
$VAR1 = [ 'three', 'three', 'three', undef, undef ];
$VAR1 = [ 'five', 'five', 'five', 'five', 'five' ];
We have no way of knowing which line belongs to which part of the query. So let's insert a delimiter.
create procedure testprocedure3 as
select 'one', NULL, NULL, NULL, NULL
union all
select '-', '-', '-', '-', '-'
union all
select 'three', 'three', 'three', NULL, NULL
union all
select '-', '-', '-', '-', '-'
union all
select 'five', 'five', 'five', 'five', 'five'
Now the result of the Perl code will look as follows:
$VAR1 = [ 'one', undef, undef, undef, undef ];
$VAR1 = [ '-', '-', '-', '-', '-' ];
$VAR1 = [ 'three', 'three', 'three', undef, undef ];
$VAR1 = [ '-', '-', '-', '-', '-' ];
$VAR1 = [ 'five', 'five', 'five', 'five', 'five' ];
This might not be the best choice of delimiter, but it nicely illustrates what I am planning to do. All we have to do now is split this into separate results.
use Data::Dumper;
my @query_results;
my $query_index = 0;
my $sth = $dbh->prepare("exec TEST_ABC_DB.dbo.testprocedure3");
$sth->execute;
while ( my $row = $sth->fetchrow_arrayref ) {
    # move to the next query if we hit the delimiter
    if ( join( q{}, @$row ) eq q{-----} ) {
        $query_index++;
        next;
    }
    # store a copy: DBI reuses the same array reference for every fetch
    push @{ $query_results[$query_index] }, [ @$row ];
}
print Dumper \@query_results;
I've defined two new variables. @query_results holds all the results, sorted by query number. $query_index is the index for that array. It starts with 0.
We iterate over all the resulting rows. It's important that $row is lexical here. It must be created with my in the loop head. (You are using use strict, right?) If we see the delimiter, we increment $query_index and move on. If we don't, we have a regular result line, so we push it onto our @query_results array at the current query's index. Because fetchrow_arrayref returns the same array reference for each row, we store a copy ([ @$row ]) rather than the reference itself.
The overall result is an array with arrays of arrays in it.
$VAR1 = [
[
[ 'one', undef, undef, undef, undef ]
],
[
[ 'three', 'three', 'three', undef, undef ]
],
[
[ 'five', 'five', 'five', 'five', 'five' ]
],
];
If you have actual queries that return many rows this starts making a lot of sense.
Of course you don't have to store all the results. You can also just work with the results of each query directly in your loop.
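The splitting step itself needs no database, so it can be tried in isolation. Here is a small sketch of the same logic in Python, chosen only so that it runs stand-alone. The function name split_result_sets and the sample rows are illustrative, not part of any driver API. Unlike the Perl version, it compares every column to the delimiter instead of joining the row into one string.

```python
def split_result_sets(rows, delimiter="-"):
    """Group rows into one list per query, splitting at all-delimiter rows."""
    result_sets = [[]]
    for row in rows:
        # A row consisting only of the delimiter marks the start of the next query.
        if all(col == delimiter for col in row):
            result_sets.append([])
            continue
        result_sets[-1].append(row)
    return result_sets


# Sample rows shaped like the UNION ALL output above (None stands in for NULL).
rows = [
    ("one", None, None, None, None),
    ("-", "-", "-", "-", "-"),
    ("three", "three", "three", None, None),
    ("-", "-", "-", "-", "-"),
    ("five", "five", "five", "five", "five"),
]
result_sets = split_result_sets(rows)
print(len(result_sets))
```

A delimiter made of real data values can collide with genuine rows, so in practice a value that cannot occur in the data would be a safer choice.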
Disclaimer: I've run none of the code in this answer as I don't have access to an SQLServer. It might contain syntax errors in the Perl as well as the SQL. But it does demonstrate the approach.
The procedure you created returns 3 result sets, and you are capturing only the first. If you don't care about keeping the sets separate, combine them into a single result with UNION ALL:
create procedure testprocedure2 as
select 'one'
union all
select 'three'
union all
select 'five'
Edit:
If you want to capture multiple result sets returned from a stored procedure, here is a good example, explained with a MySQL database: Multiple data sets in MySQL stored procedures
Simply use UNION ALL like this; then only one table is shown with the data.

Perl DBI 2 dynamic arrays in 1 query

I currently have a webpage with 2 multiselect boxes that returns 2 different strings, which will be used in my SQL queries.
I am currently only using 1 string in my queries, but wish to add another and am unsure of where to go from here.
I split the string into an array
@sitetemp = split ',', $siteselect;
my $params = join ', ' => ('?') x @sitetemp;
and am able to use a query with $params
$mysql_inquire = "SELECT starttime, SUM(duration) FROM DB WHERE feature = \"$key\" and starttime >= $start and starttime <= $end and site IN ($params) group by starttime order by starttime";
$sth = $DBH->prepare($mysql_inquire);
$sth->execute(@sitetemp);
Essentially my question is how could I do the same thing, using 2 different arrays?
I assume the line $sth->execute(@sitetemp, @otherarray); would not work.
Your approach will work.
You can pass as many arrays into a function as you want. Arrays are just lists of values.
Consider the following example:
use Data::Dumper;
sub foo {
    print Dumper \@_;
}
my @a = ( 1, 2, 3 );
my @b = ( 4, 5, 6 );
foo( @a, @b, ( 7, (8), 9, ( ( (10) ) ) ) );
This will print:
$VAR1 = [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10
];
When you say foo(@a, @b), each array is simply evaluated into a list. You can combine as many lists as you want; they will always be flattened into one.
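Applied to the query in the question, this means one group of placeholders per array, with the two arrays passed in the same order as the placeholders appear. Here is a minimal sketch of that pattern in Python with the standard sqlite3 module, so it runs without a server. The table usage_log, its columns, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_log (site TEXT, feature TEXT, duration INTEGER)")
conn.executemany(
    "INSERT INTO usage_log VALUES (?, ?, ?)",
    [("a", "x", 1), ("a", "y", 2), ("b", "x", 4), ("c", "x", 8)],
)

sites = ["a", "b"]
features = ["x"]

# One "?" per value, built separately for each list.
site_marks = ", ".join("?" for _ in sites)
feature_marks = ", ".join("?" for _ in features)

sql = (
    "SELECT SUM(duration) FROM usage_log "
    f"WHERE site IN ({site_marks}) AND feature IN ({feature_marks})"
)

# The bind values are the two lists concatenated, in placeholder order.
total = conn.execute(sql, sites + features).fetchone()[0]
print(total)
```

Only the placeholder strings are interpolated into the SQL; the values themselves always go through the bind parameters.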

How to write hash key and value pairs into MongoDB documents as field values in Perl?

my %Hash= (2012=> 1, 1982=>12, 2010=>0);
The hash keys and values all need to go into the same field, named 'time', as an array:
$mycollection->insert(
{
'field1' => $var1,
'field2' => $var2,
#right here I need to know how to add above hash key and values
# like below
#'time': ["2012.1","1982.12","2010.0"]
}
);
Any suggestions or ideas will be appreciated. This can probably be accomplished with a series of update statements, but my requirements call for a single insert statement.
I suppose your %Hash variable is something like this:
my %Hash= (2012=> 1, 1982=>12, 2010=>0);
So your array "time" is built this way:
my @time = map { $_ . "." . $Hash{$_} } keys %Hash;
and finally:
$mycollection->insert({
'field1' => $var1,
'field2' => $var2,
'time' => \@time
});

Perl DBI Oracle not preserving column order after SELECT

I'm using Perl v5.12.3 built by ActiveState on Windows. DBD::Oracle version 1.27. DBI version 1.616. I'm selecting the data below in this particular order and wanting the resulting data to be retrieved in that same order. Below are the code samples and some examples.
SQL Snippet (contents of $report_sql below)
select student_number, lastfirst, counselor,
a.dateenrolled as "Date Enrolled 1", a.dateleft as "Date Left 1", a.termid as "Term ID 1", a.course_number as "Course Number 1",
a.expression as "Expression 1", b.dateenrolled as "Date Enrolled 2", b.dateleft as "Date Left 2",
b.termid as "Term ID 2", b.course_number as "Course Number 2", b.expression as "Expression 2"
Perl code snippet
## run the resulting query
my $report_result = $dbh->prepare( $report_sql );
$report_result->execute();
while( my $report_row = $report_result->fetchrow_hashref())
{
print Dumper(\$report_row); ## contents of this posted below
Contents of print Dumper for $report_row
$VAR1 = \{
'Expression 2' => 'x',
'LASTFIRST' => 'xx',
'Term ID 1' => 'xx',
'Date Enrolled 2' => 'xx',
'Course Number 1' => 'xx',
'Term ID 2' => 'xx',
'STUDENT_NUMBER' => 'xx',
'Date Left 2' => 'xx',
'Expression 1' => 'xx',
'COUNSELOR' => 'xx',
'Date Left 1' => 'xx',
'Course Number 2' => 'xx',
'Date Enrolled 1' => 'xx'
};
Order I EXPECTED it to be printed
$VAR1 = \{
'STUDENT_NUMBER' => 'xx',
'LASTFIRST' => 'xx',
'COUNSELOR' => 'xx',
'Date Enrolled 1' => 'xx',
'Date Left 1' => 'xx',
'Term ID 1' => 'xx',
'Course Number 1' => 'xx',
'Expression 1' => 'xx',
'Date Enrolled 2' => 'xx',
'Date Left 2' => 'xx',
'Term ID 2' => 'xx',
'Course Number 2' => 'xx',
'Expression 2' => 'x'
};
One thing to note is that this query is one of many being run. This particular script runs through a series of queries and generates reports based on the returned results. The queries are stored in files on the hard drive alongside the Perl script. The queries are read in and then run. It's not always the same columns being selected.
You used a hash. Hash elements have no controllable order*. The order of elements in an array can be controlled. If you want to present the fields in the order in which they were received, use an array instead of a hash.
If you actually need the names, you can get the ordered names of the fields using @{ $sth->{NAME} }. You should still use an array for efficiency reasons, but you could use a hash if you wanted to.
* — Just like array elements are returned in the order they are "physically" organised in the array, hash elements are returned in the order they are physically organised in the hash. You cannot control where an element is physically placed in a hash, and the position changes as the hash is changed. With an array, you decide the physical position of an element, and it will remain in that position.
When the order of columns in a DBI result matters you can get the column names and values as array references.
...
my $names = $report_result->{NAME}; # or NAME_lc or NAME_uc
while( my $report_row = $report_result->fetchrow_arrayref() ) {
for my $col_idx ( 0 .. $#{$names} ) {
print "$names->[$col_idx]: $report_row->[$col_idx]\n";
}
}
Back before I had to worry about internationalization, I used this a lot to generate CSV reports: pass the NAME array to Text::CSV before passing the result arrays, and writing a report becomes just writing a query.
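The same header-then-rows pattern can be tried without an Oracle instance. The sketch below uses Python's standard sqlite3 and csv modules purely for illustration. The students table and its single row are invented, and cursor.description plays the role that $sth->{NAME} plays in DBI.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_number INTEGER, lastfirst TEXT, counselor TEXT)"
)
conn.execute("INSERT INTO students VALUES (1, 'Doe, Jane', 'Smith')")

cur = conn.execute(
    'SELECT student_number, lastfirst, counselor AS "Counselor 1" FROM students'
)

# cursor.description lists the columns in SELECT order; item 0 of each entry is the name.
names = [col[0] for col in cur.description]

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(names)   # header row, in query order
writer.writerows(cur)    # each row is a tuple in the same column order
print(out.getvalue())
```

Because both the names and the row values come back as ordered sequences, the report columns follow the SELECT list no matter which query file was loaded.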

Having an SQL SELECT query, how do I get number of items?

I'm writing a web app in Perl using Dancer framework. The database is in sqlite and I use DBI for database interaction.
I'm fine with select statements, but I wonder whether there is a way to count the selected rows.
E.g. I have
get '/' => sub {
my $content = database->prepare(sprintf("SELECT * FROM content LIMIT %d",
$CONTNUM));
$content->execute;
print(Dumper($content->fetchall_arrayref));
};
How do I count all items in the result without issuing another query?
What I want to achieve this way is showing 30 items per page and knowing how many pages there would be. Of course I can run SELECT COUNT (*) foo bar, but it looks wrong and redundant to me. I'm looking for a more or less general, DRY and not too heavy on database way to do so.
Any SQL or Perl hack or a hint what should I read about would be appreciated.
// I know using string concatenation for queries is bad
You have to do it the hard way: one query to get the count and another to get your desired slice of the row set:
my $count = $database->prepare('SELECT COUNT(*) FROM content');
$count->execute();
my $n = $count->fetchall_arrayref()->[0][0];
my $content = $database->prepare('SELECT * FROM content LIMIT ?');
$content->execute($CONTNUM);
#...
I'm not too familiar with Perl, but I assume you can just store the result of $content->fetchall_arrayref and get the count from that array before you print it.
[edit]
Something like
my $ref = $content->fetchall_arrayref;
my $count = scalar(@$ref);
Don't use sqlite myself but the following might work:
select * from table join (select count(*) from table);
Whether the above works or not the first thing I'd look for is scrollable cursors if you are going to page through results - I doubt sqlite has those. However, in DBI you can use fetchall_arrayref with a max_rows to fetch a "page" at a time. Just look up the example in the DBI docs under fetchall_arrayref - it is something like this:
my $rowcache = [];
while( my $row = ( shift(@$rowcache) || shift(@{$rowcache=$sth->fetchall_arrayref(undef,100)||[]}) )
) {
# do something here
}
UPDATE: Added what you'd get with selectall_hashref assuming the table is called content with one integer column called "a":
$ perl -le 'use DBI; my $h = DBI->connect("dbi:SQLite:dbname=fred.db"); my $r = $h->selectall_hashref(q/select * from content join (select count(*) as count from content)/, "a");use Data::Dumper;print Dumper($r);'
$VAR1 = {
'1' => {
'count' => '3',
'a' => '1'
},
'3' => {
'count' => '3',
'a' => '3'
},
'2' => {
'count' => '3',
'a' => '2'
}
};
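If you want to know how many results there will be, as well as getting the results themselves, all in one query, then get the count as an extra column. Note that a bare COUNT(*) next to * does not do this: SQLite collapses the result into a single row, and stricter engines reject the query. A scalar subquery works:
SELECT (SELECT COUNT(*) FROM Table WHERE ...) AS num_rows, * FROM Table WHERE ...
Now the row count will be the first column of every row of your result set, so simply strip it off before presenting the data.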
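Here is a minimal sketch of the count-in-every-row idea combined with paging, using Python's standard sqlite3 module so it can be run as-is. The table content with one integer column a follows the earlier answer; the seven sample rows and the page size are arbitrary. It takes the count from a scalar subquery, because in SQLite a bare COUNT(*) next to * would collapse the result into a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (a INTEGER)")
conn.executemany("INSERT INTO content VALUES (?)", [(i,) for i in range(1, 8)])

page_size = 3
rows = conn.execute(
    "SELECT (SELECT COUNT(*) FROM content) AS num_rows, a "
    "FROM content ORDER BY a LIMIT ?",
    (page_size,),
).fetchall()

# The total is repeated in column 0 of every row; strip it before display.
total = rows[0][0] if rows else 0
items = [row[1:] for row in rows]
pages = -(-total // page_size)  # ceiling division
print(total, pages, items)
```

If the page is empty, no row carries the count, so the sketch falls back to 0; a separate COUNT(*) query avoids that edge case.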