I have a single trained classifier that I tested on 2 related multiclass classification tasks. Because each trial in one task is related to the trial at the same index in the other, the 2 sets of predictions constitute paired data. I would like to run a paired permutation test to find out whether the difference in classification accuracy between the 2 prediction sets is significant.
So my data consists of 2 lists of predicted classes, where each prediction is related to the prediction in the other test set at the same index.
Example:
actual_classes = [1, 3, 6, 1, 22, 1, 11, 12, 9, 2]
predictions1 = [1, 3, 6, 1, 22, 1, 11, 12, 9, 10] # 90% acc.
predictions2 = [1, 3, 7, 10, 22, 1, 7, 12, 2, 10] # 50% acc.
H0: There is no difference in classification accuracy between the 2 prediction sets.
How do I go about running a paired permutation test to test significance of the difference in classification accuracy?
I have been thinking about this and I'm going to post a proposed solution and see if someone approves or explains why I'm wrong.
import random

actual_classes = [1, 3, 6, 1, 22, 1, 11, 12, 9, 2]
predictions1 = [1, 3, 6, 1, 22, 1, 11, 12, 9, 10]  # 90% acc.
predictions2 = [1, 3, 7, 10, 22, 1, 7, 12, 2, 10]  # 50% acc.
paired_predictions = [list(pair) for pair in zip(predictions1, predictions2)]

def accuracy(predictions):
    # proportion of times the predictions equal actual_classes at the same index
    return sum(p == a for p, a in zip(predictions, actual_classes)) / len(actual_classes)

actual_test_statistic = accuracy(predictions1) - accuracy(predictions2)  # 0.9 - 0.5 = 0.4
number_of_iterations = 10000  # arbitrary choice
all_simulations = []  # empty list
for _ in range(number_of_iterations):
    random.shuffle(paired_predictions)  # only shuffle between pairs, not within
    simulated_predictions1 = [pair[0] for pair in paired_predictions]
    simulated_predictions2 = [pair[1] for pair in paired_predictions]
    simulated_accuracy1 = accuracy(simulated_predictions1)
    simulated_accuracy2 = accuracy(simulated_predictions2)
    all_simulations.append(simulated_accuracy1 - simulated_accuracy2)  # Put the simulated difference in the list
p = sum(abs(d) > abs(actual_test_statistic) for d in all_simulations) / number_of_iterations
If you have any thoughts, let me know in the comments. Or better still, provide your own corrected version in your own answer. Thank you!
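For comparison, a common formulation of the paired permutation test swaps the two predictions within each trial (which flips the sign of that trial's correctness difference) rather than reordering trials, so every prediction stays aligned with its own actual class. Below is a minimal sketch of that variant, not a verdict on the proposal above. The function name `paired_permutation_test`, the iteration count, the seed, and the add-one p-value correction are my own assumptions.

```python
import random

def paired_permutation_test(actual, preds_a, preds_b, n_iter=10000, seed=0):
    """Two-sided paired permutation test on the accuracy difference (sketch)."""
    rng = random.Random(seed)
    # Per-trial difference in correctness: +1, 0 or -1.
    diffs = [int(a == y) - int(b == y) for y, a, b in zip(actual, preds_a, preds_b)]
    observed = sum(diffs)
    extreme = 0
    for _ in range(n_iter):
        # Swapping the two predictions of a trial negates its difference.
        simulated = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(simulated) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_iter + 1)
    return observed / len(diffs), p_value

actual_classes = [1, 3, 6, 1, 22, 1, 11, 12, 9, 2]
predictions1 = [1, 3, 6, 1, 22, 1, 11, 12, 9, 10]
predictions2 = [1, 3, 7, 10, 22, 1, 7, 12, 2, 10]

accuracy_difference, p_value = paired_permutation_test(
    actual_classes, predictions1, predictions2
)
print(accuracy_difference, p_value)
```

In this example only 4 trials differ in correctness and all 4 favor the first prediction set, so the exact two-sided p-value under within-pair swapping would be 2/16 = 0.125; the simulated estimate should land close to that.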
I am new to constraint programming and OR-Tools. A brief description of the problem: there are 8 positions, and for each position I need to decide which move of type A (move_A) and which move of type B (move_B) should be selected so that the value achieved from the combination of the 2 moves (at each position) is maximized. (This is only part of a bigger problem, though.) I want to use the AddElement constraint to do the subsetting.
Please see my attempt below:
from ortools.sat.python import cp_model
model = cp_model.CpModel()
# value achieved from combination of different moves of type A
# (moves_A (rows)) and different moves of type B (moves_B (columns))
# for e.g. 2nd move of type A and 3rd move of type B will give value = 2
value = [
[ -1, 5, 3, 2, 2],
[ 2, 4, 2, -1, 1],
[ 4, 4, 0, -1, 2],
[ 5, 1, -1, 2, 2],
[ 0, 0, 0, 0, 0],
[ 2, 1, 1, 2, 0]
]
# 6 moves of type A
num_moves_A = len(value)
# 5 moves of type B
num_moves_B = len(value[0])
num_positions = 8
type_move_A_position = [model.NewIntVar(0, num_moves_A - 1, f"move_A[{i}]") for i in range(num_positions)]
type_move_B_position = [model.NewIntVar(0, num_moves_B - 1, f"move_B[{i}]") for i in range(num_positions)]
value_position = [model.NewIntVar(0, 10, f"value_position[{i}]") for i in range(num_positions)]
# I am getting an error when I run the below
objective_terms = []
for i in range(num_positions):
    model.AddElement(type_move_B_position[i], value[type_move_A_position[i]], value_position[i])
    objective_terms.append(value_position[i])
The error is as follows:
Traceback (most recent call last):
File "<ipython-input-65-3696379ce410>", line 3, in <module>
model.AddElement(type_move_B_position[i], value[type_move_A_position[i]], value_position[i])
TypeError: list indices must be integers or slices, not IntVar
In MiniZinc, the code below would have worked:
var int: obj = sum(i in 1..num_positions ) (value [type_move_A_position[i], type_move_B_position[i]])
I know that in OR-Tools we have to create some intermediate variables to store results first, so the MiniZinc approach above will not work directly. But I am struggling to do so.
I could always create 2 matrices of binary variables, one of size num_moves_A * num_positions and the other of size num_moves_B * num_positions, add the relevant constraints, and achieve the same purpose.
But I want to learn how to do the same thing via the AddElement constraint.
Any help on how to rewrite the AddElement snippet is highly appreciated. Thanks.
AddElement is 1D only.
The way it is translated from MiniZinc to CP-SAT is to create an intermediate variable p == index1 * num_columns + index2, where num_columns is the number of values index2 can take (max(index2) + 1 for 0-based indices), and use it in an element constraint with a flattened matrix.
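The flattening arithmetic can be checked in plain Python before it goes into a model. This is a small sketch using the value matrix from the question; the helper name `flat_index` is made up for illustration. It also shows why the multiplier has to be the number of columns rather than the largest column index.

```python
value = [
    [-1, 5, 3, 2, 2],
    [2, 4, 2, -1, 1],
    [4, 4, 0, -1, 2],
    [5, 1, -1, 2, 2],
    [0, 0, 0, 0, 0],
    [2, 1, 1, 2, 0],
]
num_rows = len(value)
num_cols = len(value[0])

# Row-major flattening: row 0 first, then row 1, and so on.
value_flat = [v for row in value for v in row]

def flat_index(i, j):
    """Position of value[i][j] inside value_flat (0-based indices)."""
    return i * num_cols + j

# Every (row, column) pair maps to the matching entry of the flat list.
round_trip_ok = all(
    value_flat[flat_index(i, j)] == value[i][j]
    for i in range(num_rows)
    for j in range(num_cols)
)

# With the correct multiplier every pair gets its own flat index.
distinct_correct = len({i * num_cols + j for i in range(num_rows) for j in range(num_cols)})
# Using the largest column index (num_cols - 1) as multiplier makes pairs collide,
# e.g. (0, 4) and (1, 0) both map to 4.
distinct_wrong = len({i * (num_cols - 1) + j for i in range(num_rows) for j in range(num_cols)})

print(round_trip_ok, distinct_correct, distinct_wrong)
```

In a CP-SAT model the same expression would be posted as a linear equality between the flat index variable and the two move variables, with `value_flat` as the list passed to the element constraint.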
Following Laurent's suggestion (using AddElement constraint):
from ortools.sat.python import cp_model
model = cp_model.CpModel()
# value achieved from combination of different moves of type A
# (moves_A (rows)) and different moves of type B (moves_B (columns))
# e.g. the 2nd move of type A and the 3rd move of type B will give value = 2
value = [
[-1, 5, 3, 2, 2],
[2, 4, 2, -1, 1],
[4, 4, 0, -1, 2],
[5, 1, -1, 2, 2],
[0, 0, 0, 0, 0],
[2, 1, 1, 2, 0],
]
min_value = min([min(i) for i in value])
max_value = max([max(i) for i in value])
# 6 moves of type A
num_moves_A = len(value)
# 5 moves of type B
num_moves_B = len(value[0])
# number of positions
num_positions = 5
# flattened matrix of values
value_flat = [value[i][j] for i in range(num_moves_A) for j in range(num_moves_B)]
# flattened indices
flatten_indices = [
index1 * len(value[0]) + index2
for index1 in range(len(value))
for index2 in range(len(value[0]))
]
type_move_A_position = [
model.NewIntVar(0, num_moves_A - 1, f"move_A[{i}]") for i in range(num_positions)
]
model.AddAllDifferent(type_move_A_position)
type_move_B_position = [
model.NewIntVar(0, num_moves_B - 1, f"move_B[{i}]") for i in range(num_positions)
]
model.AddAllDifferent(type_move_B_position)
# below intermediate decision variable is created which
# will store index corresponding to the selected move of type A and
# move of type B for each position
# this will act as index in the AddElement constraint
flatten_index_num = [
    # valid flat indices run from 0 to len(flatten_indices) - 1
    model.NewIntVar(0, len(flatten_indices) - 1, f"flatten_index_num[{i}]")
    for i in range(num_positions)
]
# another intermediate decision variable is created which
# will store value corresponding to the selected move of type A and
# move of type B for each position
# this will act as the target in the AddElement constraint
value_position_index_num = [
model.NewIntVar(min_value, max_value, f"value_position_index_num[{i}]")
for i in range(num_positions)
]
objective_terms = []
for i in range(num_positions):
    model.Add(
        flatten_index_num[i]
        == (type_move_A_position[i] * len(value[0])) + type_move_B_position[i]
    )
    model.AddElement(flatten_index_num[i], value_flat, value_position_index_num[i])
    objective_terms.append(value_position_index_num[i])
model.Maximize(sum(objective_terms))
# Solve
solver = cp_model.CpSolver()
status = solver.Solve(model)
solver.ObjectiveValue()
for i in range(num_positions):
    print(
        str(i)
        + "--"
        + str(solver.Value(type_move_A_position[i]))
        + "--"
        + str(solver.Value(type_move_B_position[i]))
        + "--"
        + str(solver.Value(value_position_index_num[i]))
    )
The version below uses the AddAllowedAssignments constraint to achieve the same purpose (per Laurent's alternate approach):
from ortools.sat.python import cp_model
model = cp_model.CpModel()
# value achieved from combination of different moves of type A
# (moves_A (rows)) and different moves of type B (moves_B (columns))
# e.g. the 2nd move of type A and the 3rd move of type B will give value = 2
value = [
[-1, 5, 3, 2, 2],
[2, 4, 2, -1, 1],
[4, 4, 0, -1, 2],
[5, 1, -1, 2, 2],
[0, 0, 0, 0, 0],
[2, 1, 1, 2, 0],
]
min_value = min([min(i) for i in value])
max_value = max([max(i) for i in value])
# 6 moves of type A
num_moves_A = len(value)
# 5 moves of type B
num_moves_B = len(value[0])
# number of positions
num_positions = 5
type_move_A_position = [
model.NewIntVar(0, num_moves_A - 1, f"move_A[{i}]") for i in range(num_positions)
]
model.AddAllDifferent(type_move_A_position)
type_move_B_position = [
model.NewIntVar(0, num_moves_B - 1, f"move_B[{i}]") for i in range(num_positions)
]
model.AddAllDifferent(type_move_B_position)
value_position = [
model.NewIntVar(min_value, max_value, f"value_position[{i}]")
for i in range(num_positions)
]
tuples_list = []
for i in range(num_moves_A):
    for j in range(num_moves_B):
        tuples_list.append((i, j, value[i][j]))
for i in range(num_positions):
    model.AddAllowedAssignments(
        [type_move_A_position[i], type_move_B_position[i], value_position[i]],
        tuples_list,
    )
model.Maximize(sum(value_position))
# Solve
solver = cp_model.CpSolver()
status = solver.Solve(model)
solver.ObjectiveValue()
for i in range(num_positions):
    print(
        str(i)
        + "--"
        + str(solver.Value(type_move_A_position[i]))
        + "--"
        + str(solver.Value(type_move_B_position[i]))
        + "--"
        + str(solver.Value(value_position[i]))
    )
I'm tackling a VRPTW problem, and the solver finds no solution for any data set except a small artificial one.
The setting is as follows.
There are several depots and locations to visit. Each location has a time window. Each vehicle has a break time and a work time. In addition, the locations have some requirements, and only the vehicles that satisfy them can visit a given location.
Based on this setting, I wrote the code below.
As I said, it seems to work with small artificial data, but with real data it never finds a solution. I tried 5 different data sets.
Although I set a 7200-second time limit here, I previously let it run for longer than 10 hours and the result was the same.
The scale of the data is 40~50 vehicles and 200~300 locations.
Does this code have a problem? If not, what should I change in my approach, and in what order (for example initialization, search method, and so on)?
(Edited to use integer for time matrix)
from dataclasses import dataclass
from typing import List, Tuple
from ortools.constraint_solver import pywrapcp
from ortools.constraint_solver import routing_enums_pb2
# TODO: Refactor
BIG_ENOUGH = 100000000
TIME_DIMENSION = 'Time'
TIME_LIMIT = 7200
@dataclass
class DataSet:
    time_matrix: List[List[int]]
    locations_num: int
    vehicles_num: int
    vehicles_break_time_window: List[Tuple[int, int, int]]
    vehicles_work_time_windows: List[Tuple[int, int]]
    location_time_windows: List[Tuple[int, int]]
    vehicles_depots_indices: List[int]
    possible_vehicles: List[List[int]]
def execute(data: DataSet):
    manager = pywrapcp.RoutingIndexManager(data.locations_num,
                                           data.vehicles_num,
                                           data.vehicles_depots_indices,
                                           data.vehicles_depots_indices)
    routing_parameters = pywrapcp.DefaultRoutingModelParameters()
    routing_parameters.solver_parameters.trace_propagation = True
    routing_parameters.solver_parameters.trace_search = True
    routing = pywrapcp.RoutingModel(manager, routing_parameters)

    def time_callback(source_index, dest_index):
        from_node = manager.IndexToNode(source_index)
        to_node = manager.IndexToNode(dest_index)
        return data.time_matrix[from_node][to_node]

    transit_callback_index = routing.RegisterTransitCallback(time_callback)
    routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)
    routing.AddDimension(
        transit_callback_index,
        BIG_ENOUGH,
        BIG_ENOUGH,
        False,
        TIME_DIMENSION)
    time_dimension = routing.GetDimensionOrDie(TIME_DIMENSION)
    # set time window for locations start time
    # set condition restrictions
    possible_vehicles = data.possible_vehicles
    for location_idx, time_window in enumerate(data.location_time_windows):
        index = manager.NodeToIndex(location_idx + data.vehicles_num)
        time_dimension.CumulVar(index).SetRange(time_window[0], time_window[1])
        routing.SetAllowedVehiclesForIndex(possible_vehicles[location_idx], index)
    solver = routing.solver()
    for i in range(data.vehicles_num):
        routing.AddVariableMinimizedByFinalizer(
            time_dimension.CumulVar(routing.Start(i)))
        routing.AddVariableMinimizedByFinalizer(
            time_dimension.CumulVar(routing.End(i)))
    # set work time window for vehicles
    for vehicle_index, work_time_window in enumerate(data.vehicles_work_time_windows):
        start_index = routing.Start(vehicle_index)
        time_dimension.CumulVar(start_index).SetRange(work_time_window[0],
                                                      work_time_window[0])
        end_index = routing.End(vehicle_index)
        time_dimension.CumulVar(end_index).SetRange(work_time_window[1],
                                                    work_time_window[1])
    # set break time for vehicles
    node_visit_transit = {}
    for n in range(routing.Size()):
        if n >= data.locations_num:
            node_visit_transit[n] = 0
        else:
            node_visit_transit[n] = 1
    break_intervals = {}
    for v in range(data.vehicles_num):
        vehicle_break = data.vehicles_break_time_window[v]
        break_intervals[v] = [
            solver.FixedDurationIntervalVar(vehicle_break[0],
                                            vehicle_break[1],
                                            vehicle_break[2],
                                            True,
                                            'Break for vehicle {}'.format(v))
        ]
        time_dimension.SetBreakIntervalsOfVehicle(
            break_intervals[v], v, node_visit_transit
        )
    search_parameters = pywrapcp.DefaultRoutingSearchParameters()
    search_parameters.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
    search_parameters.local_search_metaheuristic = (
        routing_enums_pb2.LocalSearchMetaheuristic.GREEDY_DESCENT)
    search_parameters.time_limit.seconds = TIME_LIMIT
    search_parameters.log_search = True
    solution = routing.SolveWithParameters(search_parameters)
    return solution
if __name__ == '__main__':
    data = DataSet(
        time_matrix=[[0, 0, 4, 5, 5, 6],
                     [0, 0, 6, 4, 5, 5],
                     [1, 3, 0, 6, 5, 4],
                     [2, 1, 6, 0, 5, 4],
                     [2, 2, 5, 5, 0, 6],
                     [3, 2, 4, 4, 6, 0]],
        locations_num=6,
        vehicles_num=2,
        vehicles_depots_indices=[0, 1],
        vehicles_work_time_windows=[(720, 1080), (720, 1080)],
        vehicles_break_time_window=[(720, 720, 15), (720, 720, 15)],
        location_time_windows=[(735, 750), (915, 930), (915, 930), (975, 990)],
        possible_vehicles=[[0], [1], [0], [1]]
    )
    solution = execute(data)
    if solution is not None:
        print("solution is found")
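One way to narrow down why no solution is found on real data is to check, outside the solver, whether any single location is already impossible on its own. The sketch below is only a necessary-condition check under my own assumptions: the function name `find_unreachable_locations` is hypothetical, it ignores breaks and all other visits, and it assumes (as in the code above) that location nodes follow the depot nodes in the time matrix. It uses the small example data from the question.

```python
def find_unreachable_locations(time_matrix, depots, work_windows,
                               location_windows, possible_vehicles,
                               first_location_node):
    """Return (location_idx, reason) pairs for locations no allowed vehicle can serve."""
    problems = []
    for loc_idx, (tw_start, tw_end) in enumerate(location_windows):
        node = first_location_node + loc_idx
        allowed = possible_vehicles[loc_idx]
        if not allowed:
            problems.append((loc_idx, "no allowed vehicle"))
            continue
        feasible = False
        for v in allowed:
            work_start, work_end = work_windows[v]
            depot = depots[v]
            # Earliest arrival if the vehicle drives straight from its depot
            # (waiting for the window to open is allowed).
            earliest_arrival = max(work_start + time_matrix[depot][node], tw_start)
            if earliest_arrival > tw_end:
                continue  # the window closes before the vehicle can arrive
            if earliest_arrival + time_matrix[node][depot] > work_end:
                continue  # the vehicle cannot get back before its work time ends
            feasible = True
            break
        if not feasible:
            problems.append((loc_idx, "time window unreachable for every allowed vehicle"))
    return problems

time_matrix = [[0, 0, 4, 5, 5, 6],
               [0, 0, 6, 4, 5, 5],
               [1, 3, 0, 6, 5, 4],
               [2, 1, 6, 0, 5, 4],
               [2, 2, 5, 5, 0, 6],
               [3, 2, 4, 4, 6, 0]]
depots = [0, 1]
work_windows = [(720, 1080), (720, 1080)]
location_windows = [(735, 750), (915, 930), (915, 930), (975, 990)]
possible_vehicles = [[0], [1], [0], [1]]

problems = find_unreachable_locations(
    time_matrix, depots, work_windows, location_windows,
    possible_vehicles, first_location_node=2)
print(problems)
```

An empty list does not prove that the full model is feasible; it only rules out locations that could never be visited even in isolation. A non-empty list points at data that needs fixing before tuning the search.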
https://plot.ly/python/bar-charts/#bar-chart-with-line-plot
I want to create a bar chart with a line plot, like in the example above, using plotly and IPython. However, I want the bar chart to be a horizontal stacked bar chart, like in the example below. How do I do this?
https://plot.ly/python/bar-charts/#colored-bar-chart
y_saving_yes = [1, 2, 4, 6, 7, 7]
y_saving_no = [10, 10, 10, 10, 10, 10]
y_net_worth = [93453, 81666, 69889, 78381, 141395, 92969]
x_saving = ['Premium', 'Spot Shadow', 'Slow Motion', 'Highlight Music','Extra Text', 'Top Play']
x_net_worth = ['Premium', 'Spot Shadow', 'Slow Motion', 'Highlight Music','Extra Text', 'Top Play']
trace1 = Bar(
    x=y_saving_yes,  # was the undefined name y_saving
    y=x_saving,
    marker=Marker(
        color='rgba(50, 171, 96, 0.6)',
        line=Line(
            color='rgba(50, 171, 96, 1.0)',
            width=1,
        ),
    ),
    name='Highlight Properties',
    orientation='h',
)
trace2 = Bar(
    x=y_saving_no,  # was the undefined name y_saving
    y=x_saving,
    marker=Marker(
        color='rgba(50, 171, 96, 0.6)',
        line=Line(
            color='rgba(50, 171, 96, 1.0)',
            width=1,
        ),
    ),
    name='Highlight Properties',
    orientation='h',
)
data = Data([trace1, trace2])
layout = Layout(barmode='stack')
fig1 = Figure(data=data, layout=layout)
trace3 = Scatter(
x=y_net_worth,
y=x_net_worth,
mode='lines+markers',
line=Line(
color='rgb(128, 0, 128)',
),
name='Highlight Views',
)
fig = tools.make_subplots(rows=1, cols=2, specs=[[{}, {}]], shared_xaxes=True,
shared_yaxes=False, vertical_spacing=0.001)
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace3, 1, 2)
fig['layout'].update(layout)
py.iplot(fig, filename='oecd-networth-saving-bar-line')
Andrew from Plotly here. Super close! I think you just missed a fig.append_trace(trace2, 1, 1). Here's a simple example doing basically the same thing for reference.
import plotly.plotly as py
from plotly import tools
from plotly.graph_objs import Bar, Data, Figure, Layout, Marker, Scatter
x_0 = [1, 2, 4, 6, 7, 7]
x_1 = [10, 10, 10, 10, 10, 10]
y_0 = [2, 3, 4, 2, 3, 3]
trace1 = Bar(
x=x_0,
marker=Marker(color='#001f3f'),
orientation='h',
)
trace2 = Bar(
x=x_1,
marker=Marker(color='#0074D9'),
orientation='h',
)
trace3 = Scatter(y=y_0)
fig = tools.make_subplots(1, 2)
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 1, 1)
fig.append_trace(trace3, 1, 2)
fig['layout'].update(barmode='stack')
py.iplot(fig, filename='oecd-networth-saving-bar-line')
I am creating a flex table dynamically with the following code.
for (int CurrentRow=1;CurrentRow<2;CurrentRow++)
{
Label lblGettingName = new Label("Getting Name...");
View.getMainFlex().setWidget(CurrentRow, 0, lblGettingName);
Button btnViewDetails = new Button("View Details");
View.getMainFlex().setWidget(CurrentRow, 1, btnViewDetails);
Label lblGettingBid = new Label("Getting Bid...");
View.getMainFlex().setWidget(CurrentRow, 2, lblGettingBid);
View.getMainFlex().getFlexCellFormatter().setStyleName(CurrentRow, 2, "BackNormalNotBold");
Label lblGettingBidDesription = new Label("Getting Bid Desription...");
lblGettingBidDesription.setStyleName("BidDesc");
View.getMainFlex().setWidget(CurrentRow, 3, lblGettingBidDesription);
View.getMainFlex().getCellFormatter().setWidth(CurrentRow, 3, "40");
View.getMainFlex().getFlexCellFormatter().setStylePrimaryName(CurrentRow, 3, ".BidDesc");
Label lblCalculating = new Label("Calculating..");
Label lblCalculatingTime = new Label("Calculating Time...");
View.getMainFlex().setWidget(CurrentRow, 4, lblCalculatingTime);
View.getMainFlex().getFlexCellFormatter().setStyleName(1,4, "BackNormalNotBold");
TextBox textBox = new TextBox();
View.getMainFlex().setWidget(CurrentRow+1, 3, textBox);
View.getMainFlex().getCellFormatter().setWidth(CurrentRow+1, 3, "40");
View.getMainFlex().getFlexCellFormatter().setStyleName(CurrentRow, 0, "BackNormalNotBold");
View.getMainFlex().getFlexCellFormatter().setStyleName(CurrentRow, 1, "BackNormalNotBold");
View.getMainFlex().getFlexCellFormatter().setStyleName(CurrentRow, 2, "BackNormalNotBold");
View.getMainFlex().getFlexCellFormatter().setStyleName(CurrentRow+1, 3, "BackNormalNotBold");
View.getMainFlex().getFlexCellFormatter().setRowSpan(CurrentRow, 4, 3);
View.getMainFlex().getFlexCellFormatter().setRowSpan(CurrentRow, 2, 3);
View.getMainFlex().getFlexCellFormatter().setRowSpan(CurrentRow, 1, 3);
View.getMainFlex().getFlexCellFormatter().setRowSpan(CurrentRow, 0, 3);
View.getMainFlex().getFlexCellFormatter().setColSpan(CurrentRow+1, 3, 2);
View.getMainFlex().getFlexCellFormatter().setColSpan(CurrentRow, 3, 2);
View.getMainFlex().getFlexCellFormatter().setColSpan(CurrentRow-1, 3, 2);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow, 1, HasHorizontalAlignment.ALIGN_CENTER);
View.getMainFlex().getCellFormatter().setVerticalAlignment(CurrentRow, 1, HasVerticalAlignment.ALIGN_MIDDLE);
View.getMainFlex().getCellFormatter().setVerticalAlignment(CurrentRow, 0, HasVerticalAlignment.ALIGN_MIDDLE);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow, 0, HasHorizontalAlignment.ALIGN_CENTER);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow, 2, HasHorizontalAlignment.ALIGN_CENTER);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow+1, 3, HasHorizontalAlignment.ALIGN_CENTER);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow, 3, HasHorizontalAlignment.ALIGN_CENTER);
Button btnPlaceBid = new Button("Bid!");
View.getMainFlex().setWidget(CurrentRow+2, 3, btnPlaceBid);
View.getMainFlex().getCellFormatter().setWidth(CurrentRow+2, 3, "20");
btnPlaceBid.setSize("66px", "26px");
ToggleButton tglbtnAutomate = new ToggleButton("Automate");
View.getMainFlex().setWidget(3, 4, tglbtnAutomate);
View.getMainFlex().getCellFormatter().setWidth(3, 4, "20");
tglbtnAutomate.getDownHoveringFace().setText("TurnOFF");
tglbtnAutomate.getUpHoveringFace().setText("TurnON");
tglbtnAutomate.getDownDisabledFace().setText("Enable");
tglbtnAutomate.setHTML("Auto:OFF");
tglbtnAutomate.getUpFace().setHTML("Auto:OFF");
tglbtnAutomate.getDownFace().setHTML("Auto:ON");
tglbtnAutomate.setSize("54px", "18px");
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow+2, 3, HasHorizontalAlignment.ALIGN_RIGHT);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow-1, 4, HasHorizontalAlignment.ALIGN_CENTER);
View.getMainFlex().getCellFormatter().setHorizontalAlignment(CurrentRow, 4, HasHorizontalAlignment.ALIGN_CENTER);
}
FlexTableHelper.fixRowSpan(View.getMainFlex());
When the loop executes only once, the correct layout is generated, but when I try to create more than 1 row, the layout degenerates.
Most likely, problems with FlexTable are caused by the setRowSpan/setColSpan methods, which can easily wreak havoc on the layout. Instead of using those methods, you can create a composite widget or HTML and place it in the cells, so the FlexTable needs fewer rows and columns.
FlexTable uses old td element attributes such as width, height, align, etc.
It has not yet been adapted to use CSS instead, even in the newest release. For example, IE11 no longer supports <td align="..."> at all.