Formatting rows of crystal report output - crystal-reports

I am new to Crystal Reports (2008) and need help with a formatting problem.
I have sample output like the following in Crystal Reports:
srNo Name ID assigned_number
==================================
1 aaa 111 1
2 bbb 222 2
3 ccc 333 3
4 ddd 444 23
5 fff 445 32
6 ggg 432 1
7 ffr 435 2
8 rty 654 43
9 ttt 434 33
10 trt 343 1
11 rre 346 2
12 gth 543 3
13 fgr 644 54
14 yyy 431 2
15 tut 323 3
16 hyj 777 4
17 juu 322 32
Look at the last column, assigned_number. I want to highlight the rows (with a row color) whenever the last column values are 1, 2, 3 consecutively (not just 1, 2 or 2, 3).
So here srNo 1 to 3 and 10 to 12 should be highlighted with a row color, as their last column values are 1, 2, 3 (consecutively).
Let me know if it's not clear.
Thanks

You right-click on the field in your assigned_number column and choose Format Field. Then in the Border tab you check the Background box and enter a conditional formula under the "x+2" icon next to Background.
The formula is a little tricky. I have not tested this but it could go something like:
if previous ({assigned_number}) = 1 and
next({assigned_number}) = 3 then crRed
else crWhite
This will color the row with the 2 in it. Unfortunately "next" and "previous" only reach one record each way, so the same test cannot be written for the rows containing 1 and 3.
EDIT:
This formula will work, but it also highlights 1,2 and 2,3 combos. Even a formula that tries to get the previous 2 records (the 1,2 when you're at 3) or the next 2 (the 2,3 when you're at 1) doesn't work.
if {assigned_number} in [1, 2, 3] and
previous({assigned_number}) = 1 and
next({assigned_number}) = 3 or
{assigned_number} = 1 and
next({assigned_number}) = 2 or
{assigned_number} = 3 and
previous({assigned_number}) = 2
then crRed
else crWhite
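The one-record reach of "next" and "previous" is what makes this hard inside the report. Outside Crystal Reports, for example when preparing the data, the same check is a sliding window over three rows. The following is only a Python sketch of that logic, not Crystal syntax; the function name rows_in_runs is made up for illustration.

```python
def rows_in_runs(values, pattern=(1, 2, 3)):
    """Return the 0-based indexes of all rows that are part of a full run of `pattern`."""
    n = len(pattern)
    hits = set()
    for start in range(len(values) - n + 1):
        # compare each window of n consecutive rows against the pattern
        if tuple(values[start:start + n]) == tuple(pattern):
            hits.update(range(start, start + n))
    return hits


# assigned_number column from the question, in srNo order
assigned = [1, 2, 3, 23, 32, 1, 2, 43, 33, 1, 2, 3, 54, 2, 3, 4, 32]
highlight = rows_in_runs(assigned)

# srNo is 1-based, so add 1 to each index
print(sorted(i + 1 for i in highlight))  # [1, 2, 3, 10, 11, 12]
```

Rows 6-7 (1, 2) and 14-15 (2, 3) are not flagged, because only complete 1, 2, 3 windows count.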

If Right(assigned_number) in [1,2,3]
Then crred
else crwhite.
You can now extend this formula to any number of values.

Related

Plot selected rows with the average and standard deviation (GNUPlot)

I have a csv file with experiment results that goes like this:
64 4 8 1 1 2 1 ttt 62391 4055430 333 0.0001 10 161 108 288 0
64 4 8 1 1 2 1 ttt 60966 3962810 322 0.0001 10 164 112 295 0
64 4 8 1 1 2 1 ttt 61530 3999475 325 0.0001 10 162 112 291 0
64 4 8 1 1 2 1 ttt 61430 4054428 332 0.0001 10 158 110 286 0
64 4 8 1 1 2 1 ttt 63891 4152938 339 0.0001 9 149 109 274 0
64 4 32 1 1 2 1 ttt 63699 4204182 345 0.0001 4 43 179 240 0
64 4 32 1 1 2 1 ttt 63326 4116218 336 0.0001 4 45 183 248 0
64 4 32 1 1 2 1 ttt 62654 4135211 340 0.0001 4 48 178 248 0
64 4 32 1 1 2 1 ttt 63192 4107506 339 0.0001 4 49 175 245 0
64 4 32 1 1 2 1 ttt 62707 4138666 345 0.0001 4 46 179 245 0
64 4 64 1 1 2 1 ttt 60968 3962929 323 0.0001 4 46 191 256 0
64 4 64 1 1 2 1 ttt 58765 3819787 305 0.0001 4 50 196 267 0
64 4 64 1 1 2 1 ttt 58946 3831499 308 0.0001 5 52 187 260 0
64 4 64 1 1 2 1 ttt 60646 3942047 321 0.0001 4 47 187 254 0
64 4 64 1 1 2 1 ttt 59723 3882044 311 0.0001 4 46 201 269 0
64 8 8 1 1 2 1 ttt 63414 4185382 382 0.0001 33 517 109 643 0
64 8 8 1 1 2 1 ttt 62429 4057899 372 0.0001 33 538 110 667 0
64 8 8 1 1 2 1 ttt 60622 3940452 384 0.0001 33 556 115 689 0
64 8 8 1 1 2 1 ttt 64433 4188192 369 0.0001 33 519 110 644 0
My goal is to plot various combinations of the columns before the "ttt" (choosing which ones, in different charts) together with the average and standard deviation of selected columns after "ttt", grouping the rows by the columns before "ttt".
Is this possible in GNUPlot and if yes how? If not, do you have any alternate suggestions regarding my problem?
Here is a completely revised and more general version.
Since you want to filter by 3 columns you need to have 3 properties to distinguish the data in the plot. This would be for example color, x-position and pointtype. What the script basically does:
Generates random data for testing (take your file instead)
$Data looks like this:
8 64 57773 0
4 32 64721 2
8 32 56757 1
4 16 56226 2
8 8 56055 1
8 64 59874 0
8 32 58733 0
4 16 55525 2
8 32 58869 0
8 64 64470 0
4 32 60930 1
8 64 57073 2
...
the variables ColX, ColC, ColP, and ColS define which columns are taken for x-position, color, pointtype and statistics.
find unique values of ColX, ColC, ColP, (check help smooth frequency) and put them to datablocks $ColX, $ColC, and $ColP.
put the unique values to arrays ArrX, ArrC, ArrP
loop all possible combinations and do statistics on ColS and put it to $Data2. Add 3 columns at the beginning for color, x-position and pointtype.
$Data2 looks like this:
1 1 1 0 8 4 61639.4 2788.4
1 1 2 0 8 8 59282.1 2740.2
1 2 1 0 16 4 59372.3 2808.6
1 2 2 0 16 8 60502.3 2825.0
1 3 1 0 32 4 59850.7 2603.8
1 3 2 0 32 8 60617.7 1979.8
1 4 1 0 64 4 60399.4 3273.6
1 4 2 0 64 8 59930.7 2919.8
2 1 1 1 8 4 59172.6 2288.2
2 1 2 1 8 8 58992.2 2888.0
2 2 1 1 16 4 59350.1 2364.6
2 2 2 1 16 8 61034.0 2368.5
2 3 1 1 32 4 59920.8 2867.6
2 3 2 1 32 8 59711.9 3464.2
2 4 1 1 64 4 60936.7 3439.7
2 4 2 1 64 8 61078.7 2349.3
3 1 1 2 8 4 58976.0 2376.3
3 1 2 2 8 8 61731.5 1635.7
3 2 1 2 16 4 58276.0 2101.7
3 2 2 2 16 8 58594.5 3358.5
3 3 1 2 32 4 60471.5 3737.6
3 3 2 2 32 8 59909.1 2024.0
3 4 1 2 64 4 62044.2 1446.7
3 4 2 2 64 8 60454.0 3215.1
Finally, plot the data. I couldn't figure out how the plotting style with yerror works properly together with variable pointtypes, so instead I split it into two plot commands, one with vectors and one with points. The third one, keyentry, is just to get an empty line in the legend, and the fourth one is to get the pointtype into the legend.
I hope you can figure out all the other details and adapt it to your data.
Code:
### grouped statistics on filtered (unsorted) data
reset session
set colorsequence classic
# generate some random test data
rand1(n) = 2**(int(rand(0)*2)+2) # values 4,8
rand2(n) = 2**(int(rand(0)*4)+3) # values 8,16,32,64
rand3(n) = int(rand(0)*10000)+55000 # values 55000 to 65000
rand4(n) = int(rand(0)*3) # values 0,1,2
set print $Data
do for [i=1:200] {
    print sprintf("% 3d% 4d% 7d% 3d", rand1(0), rand2(0), rand3(0), rand4(0))
}
set print
print $Data # (just for test purpose)
ColX = 2 # column for x
ColC = 4 # column for color
ColP = 1 # column for pointtype
ColS = 3 # column for statistics
# get unique values of the columns
set table $ColX
plot $Data u (column(ColX)) smooth freq
unset table
set table $ColC
plot $Data u (column(ColC)) smooth freq
unset table
set table $ColP
plot $Data u (column(ColP)) smooth freq
unset table
# put unique values into arrays
set table $Dummy
array ArrX[|$ColX|-6] # gnuplot creates 6 extra lines
array ArrC[|$ColC|-6]
array ArrP[|$ColP|-6]
plot $ColX u (ArrX[$0+1]=$1)
plot $ColC u (ArrC[$0+1]=$1)
plot $ColP u (ArrP[$0+1]=$1)
unset table
print ArrX, ArrC, ArrP # just for test purpose
# define filter function
Filter(c,x,p) = ArrX[x]==column(ColX) && ArrC[c]==column(ColC) && \
                ArrP[p]==column(ColP) ? column(ColS) : NaN
# loop all values and do statistics, write data into $Data2
set print $Data2
do for [c=1:|ArrC|] {
    do for [x=1:|ArrX|] {
        do for [p=1:|ArrP|] {
            undef var STATS*
            stats $Data u (Filter(c,x,p)) nooutput
            if (exists('STATS_mean') && exists('STATS_stddev')) {
                print sprintf("% 3d% 3d% 3d% 3d% 3d% 3d% 9.1f % 7.1f", c, x, p, ArrC[c], ArrX[x], ArrP[p], STATS_mean, STATS_stddev)
            }
        }
    }
    print ""; print ""
}
set print
# print $Data2 # just for testing purpose
set xlabel sprintf("Column %d", ColX)
set ylabel sprintf("Column %d", ColS)
set xrange[0.5:|ArrX|+1]
set xtics () # remove all xtics
do for [x=1:|ArrX|] { set xtics add (sprintf("%d",ArrX[x]) x)} # set xtics "manually"
# function for x position and offsets,
# actually not dependent on 'n' but to shorten plot command
# columns in $Data2: 1=color, 2=x, 3=pointtype
width = 0.5 # float number!
step = width/(|ArrC|-1)
PosX(n) = column(2) - width/2.0 + step*(column(1)-1) + (column(3)-1)*step*0.3
plot \
for [c=1:|ArrC|] $Data2 u (PosX(0)):($7-$8):(0):(2*$8) index c-1 w vectors \
heads size 0.04,90 lw 2 lc c ti sprintf("%g",ArrC[c]),\
for [c=1:|ArrC|] '' u (PosX(0)):7:($3*2+4):(c) index c-1 w p ps 1.5 pt var lc var not, \
keyentry w p ps 0 ti "\n", \
for [p=1:|ArrP|] '' u (0):(NaN) w p pt p*2+4 ps 1.5 lc rgb "black" ti sprintf("%g",ArrP[p])
### end of code
Result:
I do not think gnuplot can produce exactly what you are asking for in a single plot command. I will show you two alternatives in the hope that one or both is a useful starting point.
Alternative 1: standard boxplot
spacing = 1.0
width = 0.25
unset key
set xlabel "Column 3"
set ylabel "Column 9"
plot 'data' using (spacing):9:(width):3 with boxplot lw 2
This collects points based on the value in column 3 and for each such value it produces a boxplot. This is a widely used method of showing the distribution of point values in a category, but it is a quartile analysis not a display of mean + standard deviation.
Alternative 2: calculate mean and standard deviation for categories known in advance
The boxplot analysis has the advantage that you do not need to know in advance what values may be present in column 3. Gnuplot can calculate mean and standard deviation based on a column 3 value, but you need to specify in advance what that value is. Here is a set of commands tailored to the specific example file you provided. It calculates, but does not plot, the requested categorical mean and standard deviation. You can use these numbers to construct a plot, but that will require additional commands. You could, for example, save the values for each category in a new file, or array, or datablock and then go back and plot these together.
col3entry = "8 32 64"
do for [i in col3entry] {
stats "data" using ($3 == real(i) ? $9 : NaN) name "Condition".i nooutput
print i, ": ", value("Condition".i."_mean"), value("Condition".i."_stddev")
}
output:
8: 62345.1111111111 1259.34784220021
32: 63115.6 392.552977316438
64: 59809.6 881.583711283279
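If doing the grouping outside gnuplot is an option, the per-category mean and standard deviation can be computed first and the result written to a file for plotting. The following is a minimal Python sketch under that assumption; grouped_stats is a hypothetical helper, and it uses the population standard deviation, which is what reproduces the 392.55 shown above for category 32.

```python
from collections import defaultdict
from statistics import mean, pstdev


def grouped_stats(rows, key_col, value_col):
    """Group whitespace-separated rows by one column (1-based, as in gnuplot)
    and return {key: (mean, population standard deviation)} for another column."""
    groups = defaultdict(list)
    for row in rows:
        fields = row.split()
        groups[fields[key_col - 1]].append(float(fields[value_col - 1]))
    return {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}


# the five rows of the question's sample that have 32 in column 3
sample = [
    "64 4 32 1 1 2 1 ttt 63699 4204182 345 0.0001 4 43 179 240 0",
    "64 4 32 1 1 2 1 ttt 63326 4116218 336 0.0001 4 45 183 248 0",
    "64 4 32 1 1 2 1 ttt 62654 4135211 340 0.0001 4 48 178 248 0",
    "64 4 32 1 1 2 1 ttt 63192 4107506 339 0.0001 4 49 175 245 0",
    "64 4 32 1 1 2 1 ttt 62707 4138666 345 0.0001 4 46 179 245 0",
]

stats = grouped_stats(sample, key_col=3, value_col=9)
for key, (m, s) in stats.items():
    print(key, m, s)
```

Because the keys are discovered from the data, the categories do not have to be listed in advance the way col3entry is.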

how to look back on rows until criteria matched

Consider the following sheet example:
A1 A2
1 5 10
2 6 12
3 -3 9
4 1 10
5 5 15
6 -4 11
7 9 20
How do I look back from row 6 and sum all A2 rows until a previous negative A1 row?
In this example: 15 + 10 = 25
Assuming -3 is in A3, in C4 and copied down to suit:
=IF(A3<0,0,C3+B3)
This creates a running total, starting immediately after the first negative in the left hand column, that resets after each negative in the left hand column.
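To see what the copied-down formula produces on the sample sheet, here is a small Python sketch of the same recurrence. It assumes the running total starts at 0 in the first row (the answer does not say what sits above C4), and running_totals is a name chosen only for this illustration.

```python
def running_totals(a, b):
    """Mirror =IF(A3<0,0,C3+B3) copied down: each entry sums the B values of the
    rows above it, back to (but not including) the most recent negative A row."""
    totals = [0]  # assumption: nothing above the first row, so it starts at 0
    for prev_a, prev_b in zip(a[:-1], b[:-1]):
        totals.append(0 if prev_a < 0 else totals[-1] + prev_b)
    return totals


a = [5, 6, -3, 1, 5, -4, 9]        # left-hand column of the sample
b = [10, 12, 9, 10, 15, 11, 20]    # right-hand column of the sample

totals = running_totals(a, b)
print(totals)     # [0, 10, 22, 0, 10, 25, 0]
print(totals[5])  # value next to row 6: 25
```

The entry for row 6 is 10 + 15 = 25, matching the expected result in the question.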

Matlab : how to read a constant width text file and turn it into a matrix?

I have an ASCII text file in which each row has this format:
------------------------------
Variable Columns Type
------------------------------
ID 1-11 Character
YEAR 12-15 Integer
MONTH 16-17 Integer
ELEMENT 18-21 Character
VALUE1 22-26 Integer
MFLAG1 27-27 Character
QFLAG1 28-28 Character
SFLAG1 29-29 Character
VALUE2 30-34 Integer
MFLAG2 35-35 Character
QFLAG2 36-36 Character
SFLAG2 37-37 Character
. . .
. . .
. . .
VALUE31 262-266 Integer
MFLAG31 267-267 Character
QFLAG31 268-268 Character
SFLAG31 269-269 Character
------------------------------
I only need the variables "year", "month", "element" and "valuei" for i = 1,2,...,31 (there are 31 values in each row).
The flag parameters (like MFLAGi) can hold either a character or white-space.
Also, a value might not fill all of its space with digits, so it can be padded with spaces.
Two sample lines from my text file:
USC00190736189301TMAX 33 6 117 6 0 I6 -89 6 -28 6 -83 6 -67 6 -67 6 -28 6 -6 6 -139 6 -111 6 -117 6 -89 6 -106 6 -111 6 -106 6 -106 6 -39 6 -78 6 -61 6 -33 6 -6 6 6 6 39 6 28 6 6 6 -61 6 61 6 56 6 0 6
USC00190736189301TMIN -56 6 11 I6 -106 6 -161 6 -106 6 -133 6 -144 6 -117 6 -161 6 -156 6 -206 6 -183 6 -161 6 -161 6 -139 6 -178 6 -189 6 -161 6 -133 6 -150 6 -156 6 -156 6 -100 6 -50 6 -39 6 -67 6 -78 6 -111 6 -94 6 -33 6 -50 6
For example, in line 1 value1 uses only 2 of its 5 character positions (' 33'),
and both MFLAG1 and QFLAG1 are white-space.
I want to put "year", "month", "element" and "valuei" in a matrix and, depending on the "element" value, choose some of the rows to make my final matrix. How can I do that?
What I have thought of:
%open file
fid = fopen('myt.txt')
% read from file
%'whitespace','' do not overlook white spaces in counting
C = textscan(fid , formatspec ,'whitespace','')
I have two problems with this:
(1) I think the formatspec should be
'%*11c %4d %2d %4c %5d %*3c'
ignore year month element valuei ignore
------------------
repeat this part 31 times
How can I repeat that part 31 times and concatenate all the parts together?
(2) I end up with a cell array C. Since "element" is a string, I can't change it into a matrix. Apparently C is stored column by column, and each column is a whole string. How can I then access the data row by row to select the rows I need (according to the value of "element")?
Am I using the wrong method for what I want? What should I do?
for (1), you can use repmat:
idspec = ['%*11c %4d %2d %4c '];
valuespec = repmat('%5d %*3c',[1 31]);
filespec = [idspec valuespec];
(or something similar)
for (2), I can see a few options:
a) You could read the file twice, once ignoring the character column, and using the 'collectoutput' option, so that C would basically contain a matrix. You can read again by ignoring everything but ELEMENT, so that C would have the remaining info.
b) Using 'collectoutput', you'd have C with the year and month, then the ELEMENT, and then the rest.
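Since every field sits at a fixed character position, another way to think about the problem is plain slicing by column number. The sketch below shows that idea in Python rather than MATLAB. parse_record is a hypothetical helper, and the test line is synthetic (made-up values laid out according to the table in the question), because the sample lines above have lost their original spacing.

```python
def parse_record(line):
    """Slice one fixed-width record using the column positions from the question."""
    year = int(line[11:15])     # columns 12-15
    month = int(line[15:17])    # columns 16-17
    element = line[17:21]       # columns 18-21
    values = []
    for day in range(31):
        start = 21 + 8 * day    # each day: 5-char value followed by 3 one-char flags
        values.append(int(line[start:start + 5]))  # int() ignores the space padding
    return year, month, element, values


# synthetic record: ID, year, month, element, then 31 x (value, MFLAG, QFLAG, SFLAG)
fake_values = list(range(-15, 16))
line = "USC00190736" + "1893" + "01" + "TMAX" + "".join(f"{v:5d}  6" for v in fake_values)

year, month, element, values = parse_record(line)
print(year, month, element, values[:3])  # 1893 1 TMAX [-15, -14, -13]

# selecting rows by element is then an ordinary filter
records = [parse_record(line)]
tmax_rows = [r for r in records if r[2] == "TMAX"]
```

Keeping the element as a separate string per record is what makes the row-by-row selection straightforward; the numeric parts can be stacked into a matrix afterwards.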

How to use the merge function to merge the common values in two DataFrames?

I have two DataFrames that I want to merge on the column "Id".
df1 :
Id Reputation
1 10
3 5
4 40
df2 :
Id Reputation
1 10
2 5
3 5
6 55
I want the output to be:
dfOutput :
Id Reputation
1 10
2 5
3 5
4 40
6 55
I wish to keep all values from both DataFrames but merge the duplicate values into one. I know I have to use the merge() function, but I don't know what arguments to pass.
You could concatenate the DataFrames, groupby Id, and then aggregate by taking the first item in each group.
In [62]: pd.concat([df1,df2]).groupby('Id').first()
Out[62]:
Reputation
Id
1 10
2 5
3 5
4 40
6 55
[5 rows x 1 columns]
Or, to preserve Id as a column rather than an index, use as_index=False:
In [68]: pd.concat([df1,df2]).groupby('Id', as_index=False).first()
Out[68]:
Id Reputation
0 1 10
1 2 5
2 3 5
3 4 40
4 6 55
[5 rows x 2 columns]
KarlD. suggests an excellent idea; use combine_first:
In [99]: df1.set_index('Id').combine_first(df2.set_index('Id')).reset_index()
Out[99]:
Id Reputation
0 1 10
1 2 5
2 3 5
3 4 40
4 6 55
[5 rows x 2 columns]
This solution appears to be faster for large DataFrames:
import pandas as pd
import numpy as np
N = 10**6
df1 = pd.DataFrame({'Id':np.arange(N), 'Reputation': np.random.randint(5, size=N)})
df2 = pd.DataFrame({'Id':np.arange(10, 10+N), 'Reputation':np.random.randint(5, size=N)})
In [95]: %timeit df1.set_index('Id').combine_first(df2.set_index('Id')).reset_index()
10 loops, best of 3: 174 ms per loop
In [96]: %timeit pd.concat([df1,df2]).groupby('Id', as_index=False).first()
1 loops, best of 3: 221 ms per loop
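For readers who want to see what these one-liners compute without pandas, the sketch below reproduces the result for the question's data in plain Python. It is only an illustration of the semantics (one row per Id, the first frame's value wins when an Id appears in both); combine_first_rows is a made-up name, not a pandas function.

```python
df1_rows = [(1, 10), (3, 5), (4, 40)]            # (Id, Reputation) pairs of df1
df2_rows = [(1, 10), (2, 5), (3, 5), (6, 55)]    # (Id, Reputation) pairs of df2


def combine_first_rows(first, second):
    """Union of the rows by Id; on duplicate Ids the value from `first` is kept."""
    merged = dict(second)
    merged.update(dict(first))  # entries from `first` overwrite those from `second`
    return sorted(merged.items())


print(combine_first_rows(df1_rows, df2_rows))
# [(1, 10), (2, 5), (3, 5), (4, 40), (6, 55)]
```

In the question's data the duplicated Ids carry identical Reputation values, so it does not matter which frame wins.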

sorting a timer in matlab

OK, it seems like a simple problem, but I am having trouble with it.
I have a timer for each data set which resets improperly, and as a result my timing gets mixed up.
Any ideas on how to correct it without losing any data?
Example
timer col ideally should be
timer , mine reads
1 3
2 4
3 5
4 6
5 1
6 2
How do I change column 2, or make a new column which reads like column 1, without changing the order of the rows which have data?
This is just an example, as my files are 86000 rows long. I also have missing timers which I do not want to lose; a missing timer implies no data for that period of time.
thanks
EDIT: I do not want to change the other columns. Column 1 is the GPS counter, so it does not sync with the comp timer due to some other issues. I just want to change row one such that it goes from high to low without affecting the other rows, while also taking care of missing points (if I did not care about missing points, a simple n=1:max would work).
Missing data in this case is indicated by a missing timer. For example, I have 4,5,8,9 with 6,7 missing.
OK, let me try to edit again.
It's an 8600 x 80 matrix of data:
the timer is one row which should go from 0 to 8600,
but the timer starts at odd times, so my data starts from the middle, let's say 3400, and in the middle of the day my timer goes to 0 and then back to 1.
My other rows are fine. I just need to plot the other sets with the timer as time.
I cannot use T = 1:length(file), as it then ignores missed time stamps (timers).
for example my data reads like
timer , mine reads
1 3
2 4
3 5
4 8
5 9
8 1
9 2
So you can see time stamps 6,7 are missing.
If I used n=1:length(file)
I would have got
1 2 3 4 5 6 7
which is wrong.
I want
1 2 3 4 5 8 9
without changing the order of the other rows, so I cannot use sort on the whole file.
I assume the following problem
data says
3 100
4 101
5 102
NaN 0
1 104
2 105
You want
1 100
2 101
3 102
NaN 0
4 104
5 105
I'd solve the problem like this:
%# create test data
data = [3 100
4 101
5 102
NaN 0
1 104
2 105];
%# find good rows (if missing data are indicated by zeros, use
%# goodRows = data(:,1) > 0; instead)
goodRows = isfinite(data(:,1));
%# count good rows
nGoodRows = sum(goodRows);
%# replace the first column with sequential numbers, but only in good rows
data(goodRows,1) = 1:nGoodRows;
data =
1 100
2 101
3 102
NaN 0
4 104
5 105
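The same idea, numbering only the rows that have a timer and leaving the gap rows alone, can be written as a short loop. The following is a Python sketch for comparison, not a translation meant to replace the MATLAB code; renumber_timer is a hypothetical name.

```python
import math


def renumber_timer(rows):
    """Give every row that has a timer the next number 1, 2, 3, ...;
    rows whose timer is NaN are left untouched and the row order is kept."""
    out, counter = [], 0
    for timer, value in rows:
        if math.isfinite(timer):
            counter += 1
            out.append((counter, value))
        else:
            out.append((timer, value))
    return out


data = [(3, 100), (4, 101), (5, 102), (math.nan, 0), (1, 104), (2, 105)]
result = renumber_timer(data)
for row in result:
    print(row)
```

The second column comes out in its original order (100, 101, 102, 0, 104, 105), and the NaN row keeps its place between timers 3 and 4.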
EDIT 1
Maybe I understand your question this time
data says
4 101
5 102
1 104
2 105
You want
1 4 101
2 5 102
4 1 104
5 2 105
This can be achieved the following way
%# test data
data = [4 101
5 102
1 104
2 105];
%# use sort to get the correct order of the numbers and add it to the left of data
out = [sort(data(:,1)),data]
out =
1 4 101
2 5 102
4 1 104
5 2 105
EDIT 2
Note that out is the result from the solution in EDIT 1
It seems you want to plot the data so that there is no entry for missing values. One way to do this is to make a plot with dots - there won't be a dot for missing data.
plot(out(:,1),out(:,3),'.')
If you want to plot a line that is interrupted, you have to insert NaNs into out
%# create outNaN, that has NaN-rows for missing entries
outNaN = NaN(max(out(:,1)),size(out,2));
outNaN(out(:,1),:) = out;
%# plot
plot(outNaN(:,1),outNaN(:,3))
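The NaN padding is the step that makes the line break at the missing time stamps. Below is a minimal Python sketch of just that step, using the EDIT 1 test data; pad_gaps is a made-up helper, and the plotting call itself is left out.

```python
import math


def pad_gaps(timers, values):
    """Return (xs, ys) covering 1..max(timers); time stamps that are missing
    get NaN in both lists, so a line plot is interrupted there."""
    lookup = dict(zip(timers, values))
    xs, ys = [], []
    for t in range(1, max(timers) + 1):
        if t in lookup:
            xs.append(t)
            ys.append(lookup[t])
        else:
            xs.append(math.nan)
            ys.append(math.nan)
    return xs, ys


sorted_timers = sorted([4, 5, 1, 2])   # as in EDIT 1: [1, 2, 4, 5]
values = [101, 102, 104, 105]          # data column, kept in its original row order
xs, ys = pad_gaps(sorted_timers, values)
print(xs)  # [1, 2, nan, 4, 5]
print(ys)  # [101, 102, nan, 104, 105]
```

Most plotting libraries skip NaN points when drawing lines, which gives the interrupted line described above.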