unable to add layer in keras - neural-network

I am trying to follow this suggestion.
outputs = Conv2DTranspose(3, (1, 1), activation='sigmoid') (c9)
model = Model(inputs=[inputs], outputs=[outputs])
model = multi_gpu_model(model, gpus=8)
model.compile(optimizer='adam', loss = bce, metrics = [mean_iou])
model.add(Lambda(lambda x: K.batch_flatten(x)))
But at that last line of code, I receive the following error:
'Model' object has no attribute 'add'
I understand that since I didn't instantiate the model as Sequential() as in the linked post, the add() method might not be available to me. However, I'm not sure how to work around this.

Corrected to reflect the working solution:
outputs = Conv2DTranspose(3, (1, 1), activation='sigmoid') (c9)
outputs = Lambda(lambda x: K.batch_flatten(x))(outputs)
model = Model(inputs=[inputs], outputs=[outputs])
model = multi_gpu_model(model, gpus=8)
model.compile(optimizer='adam', loss = bce, metrics = [mean_iou])

Going off #Today's answer in the OP's comments,
outputs = Lambda(lambda x: K.batch_flatten(x))(outputs)
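For context, a minimal sketch of the difference (layer sizes here are arbitrary): Sequential exposes .add(), while a functional Model does not, so extra layers have to be applied as calls on tensors before Model(...) is built, exactly as in the corrected snippet above.
from keras.models import Sequential
from keras.layers import Dense, Lambda
from keras import backend as K

# Sequential models have .add(), so appending a layer after the fact works:
seq = Sequential()
seq.add(Dense(8, input_shape=(4,)))
seq.add(Lambda(lambda x: K.batch_flatten(x)))

# A functional Model has no .add(); any extra layer must be applied to a
# tensor *before* Model(inputs, outputs) is constructed, as done above with
# outputs = Lambda(lambda x: K.batch_flatten(x))(outputs)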

Related

Optuna - what is a suitable metric for TFKerasPruningCallback?

I've been using optuna to perform a hyperparameter search for a Keras neural network model (using scikit-learn's wrapper, KerasRegressor). I have been trying to implement the TFKerasPruningCallback function to prune unpromising trials, but keep getting the following error: UserWarning: The metric 'val_accuracy' is not in the evaluation logs for pruning. Please make sure you set the correct metric name.
A recommended metric provided in the docs is 'val_acc', or 'val_accuracy' for TensorFlow version > 2, as shown in this example. I'm using v2.9.1.
Here's a simplified version of my code:
# imports assumed from the description above (SciKeras wrapper, Optuna's Keras integration)
import numpy as np
import optuna
from optuna.integration import TFKerasPruningCallback
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import cross_val_score

def nn(n_layers, neurons):
    model = Sequential()
    model.add(Dense(neurons, input_dim=3, activation='relu'))
    model.add(Dense(1, activation='linear'))
    model.compile(loss='mean_absolute_error', optimizer='Adam')
    return model

def objective(trial, X_train, y_train):
    # callbacks
    cb = [TFKerasPruningCallback(trial, 'val_accuracy')]
    params = {'neurons': trial.suggest_int('neurons', 1, 10, 1),
              'batch_size': trial.suggest_int('batch_size', 10, 200, 10),
              'epochs': trial.suggest_int('epochs', 100, 500, 20),
              'callbacks': cb}
    # create model and perform cross-validation
    model = KerasRegressor(model=nn, **params)
    score = np.mean(cross_val_score(model, X_train, y_train, cv=5,
                                    scoring='neg_root_mean_squared_error'))
    return score

study = optuna.create_study(direction='maximize', pruner=optuna.pruners.MedianPruner(n_startup_trials=5))
study.optimize(lambda trial: objective(trial, X_train, y_train), n_trials=100)
Everything works fine if I don't bother to implement the callback. I haven't found any reports of this issue so it might be something really silly that I've overlooked, but I'm stuck. Any advice?
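As a general illustration (independent of the Optuna/KerasRegressor setup above): Keras only writes the compiled loss and metrics into its training logs, and the val_-prefixed keys only exist when validation data is supplied, so a model compiled like the one above never produces a 'val_accuracy' entry. A minimal sketch:
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

X = np.random.rand(100, 3)
y = np.random.rand(100)

model = Sequential([Dense(4, input_shape=(3,), activation='relu'),
                    Dense(1, activation='linear')])
# Only 'loss' and the metrics listed here end up in the logs
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['mae'])

history = model.fit(X, y, validation_split=0.2, epochs=2, verbose=0)
print(history.history.keys())  # e.g. dict_keys(['loss', 'mae', 'val_loss', 'val_mae'])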

RandomForestClassifier has no attribute transform, so how to get predictions?

How do you get predictions out of a RandomForestClassifier? Loosely following the latest docs here, my code looks like...
# Split the data into training and test sets (30% held out for testing)
SPLIT_SEED = 64 # some const seed just for reproducibility
TRAIN_RATIO = 0.75
(trainingData, testData) = df.randomSplit([TRAIN_RATIO, 1-TRAIN_RATIO], seed=SPLIT_SEED)
print(f"Training set ({trainingData.count()}):")
trainingData.show(n=3)
print(f"Test set ({testData.count()}):")
testData.show(n=3)
# Train a RandomForest model.
rf = RandomForestClassifier(labelCol="labels", featuresCol="features", numTrees=36)
rf.fit(trainingData)
#print(rf.featureImportances)
preds = rf.transform(testData)
When running this, I get the error
AttributeError: 'RandomForestClassifier' object has no attribute 'transform'
Examining the Python API docs, I see nothing that looks like it relates to generating predictions from the trained model (nor feature importances, for that matter). I don't have much experience with MLlib, so I'm not sure what to make of this. Anyone with more experience know what to do here?
By looking closely at the documentation:
>>> model = rf.fit(td)
>>> model.featureImportances
SparseVector(1, {0: 1.0})
>>> allclose(model.treeWeights, [1.0, 1.0, 1.0])
True
>>> test0 = spark.createDataFrame([(Vectors.dense(-1.0),)], ["features"])
>>> result = model.transform(test0).head()
>>> result.prediction
you will notice that rf.fit returns a fitted model (a RandomForestClassificationModel), which is different from the original RandomForestClassifier estimator.
The fitted model has the transform() method and also the feature importances.
So in your code:
# Train a RandomForest model.
rf = RandomForestClassifier(labelCol="labels", featuresCol="features", numTrees=36)
model = rf.fit(trainingData)
#print(rf.featureImportances)
preds = model.transform(testData)
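As a follow-up sketch of what the fitted model then gives you (column names are the Spark ML defaults; the evaluator is just one way to score the predictions):
# Feature importances live on the fitted model, not on the estimator
print(model.featureImportances)

# transform() appends the default 'rawPrediction', 'probability' and 'prediction' columns
preds.select("labels", "prediction", "probability").show(5)

# e.g. accuracy on the test set
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
evaluator = MulticlassClassificationEvaluator(labelCol="labels",
                                              predictionCol="prediction",
                                              metricName="accuracy")
print(evaluator.evaluate(preds))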

Multiheaded Model in Keras - error while merging

I am trying to implement a multiheaded model with a variable number of 1D inputs, each of length sps.
So I define the Inputs in a loop and later merge them into a single model, and I get the error:
dense = (Dense(locChannels, activation=locActivation, input_shape=merged.output_shape)) (merged)
AttributeError: 'Tensor' object has no attribute 'output_shape'
If I remove the input_shape-parameter from the dense object I get the following:
UserWarning: Model inputs must come from keras.layers.Input (thus holding past layer metadata), they cannot be the output of a previous non-Input layer. Here, a tensor specified as input to your model was not an Input tensor, it was generated by layer
flatten_1.
Note that input tensors are instantiated via tensor = keras.layers.Input(shape).
Do you have an idea how to fix this?
I think I should clarify how my data looks; maybe I have an error in my structure.
locChannels is the number of different features I have. Every feature is 1D and has exactly sps samples in it.
The desired output is a one-hot-encoded array.
differentModels = list()
for index in range(0, locChannels):
    name = 'Input_' + str(index)
    visible = Input(shape=(sps, 1), name=name)
    cnn1 = Conv1D(filters=8, kernel_size=2, activation=locActivation)(visible)
    cnn1 = MaxPooling1D(pool_size=2)(cnn1)
    cnn1 = Flatten()(cnn1)
    #print(visible)
    differentModels.append(cnn1)
merged = Concatenate()(differentModels)
dense = (Dense(locChannels, activation=locActivation, input_shape=merged.output_shape))(merged)
for index in range(2, locLayers):
    dense = (Dropout(rate=locDropoutRate))(dense)
    dense = (Dense(locChannels, activation=locActivation, input_shape=(locChannels,)))(dense)
output = Dense(units=locClasses, activation='softmax')(dense)
model = Model(inputs=differentModels, outputs=output)
I just found out what my mistake was.
In the line
model = Model(inputs=differentModels, outputs= output)
the inputs need to be the actual Input layers (the heads), not the last layer of each branch. So the following works as expected:
inputheads = list()
myinputs = list()
for index in range(0, features):
    input_a = Input(shape=(sps, 1), name='Input_' + str(index))
    cnn1 = Conv1D(filters=8, kernel_size=2, activation='selu')(input_a)
    cnn1 = MaxPooling1D(pool_size=2)(cnn1)
    cnn1 = Flatten()(cnn1)
    inputheads.append(cnn1)
    myinputs.append(input_a)
merged = Concatenate()(inputheads)
dense = Dense(20)(merged)
predictions = Dense(10, activation='softmax')(dense)
model = Model(inputs=myinputs, outputs=predictions)
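One practical note on feeding this model: a multi-input Model expects one array per Input head, passed in the same order as myinputs. A minimal sketch with dummy data, reusing features and sps from the snippet above:
import numpy as np

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

n_samples = 100                                                   # dummy sample count
X = [np.random.rand(n_samples, sps, 1) for _ in range(features)]  # one array per head
y = np.eye(10)[np.random.randint(0, 10, size=n_samples)]          # dummy one-hot labels

model.fit(X, y, epochs=2, batch_size=16)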

Pytorch: NN function approximator, 2 in 1 out

[Please be aware of the Edit History below, as the major problem statement has changed.]
We are trying to implement a neural network in PyTorch that approximates a function f(x,y)=z. So there are two real numbers as input and one as output; we therefore want 2 nodes in the input layer and one in the output layer. We constructed a data set of 5050 samples and had pretty good results for that task in Keras with the TensorFlow backend, using 3 hidden layers with a node configuration of 2(in) - 4 - 16 - 4 - 1(out), ReLU activation functions on all hidden layers, and linear activations on the input and output.
Now in PyTorch we tried to implement a similar network, but our loss function literally explodes: it changes in the first few steps and then converges to some value around 10^7. In Keras we had an error of around 10 percent. We have already tried different network configurations without any improvement. Could someone have a look at our code and suggest a change?
To explain: tr_data is a list containing 5050 2x1 numpy arrays, which are the inputs for the network. tr_labels is a list containing 5050 numbers, which are the outputs we want to learn. loadData() just loads those two lists.
import torch  # needed for torch.zeros / torch.from_numpy / torch.optim below
import torch.nn as nn
import torch.nn.functional as F

BATCH_SIZE = 5050
DIM_IN = 2
DIM_HIDDEN_1 = 4
DIM_HIDDEN_2 = 16
DIM_HIDDEN_3 = 4
DIM_OUT = 1
LEARN_RATE = 1e-4
EPOCH_NUM = 500

class Net(nn.Module):
    def __init__(self):
        #super(Net, self).__init__()
        super().__init__()
        self.hidden1 = nn.Linear(DIM_IN, DIM_HIDDEN_1)
        self.hidden2 = nn.Linear(DIM_HIDDEN_1, DIM_HIDDEN_2)
        self.hidden3 = nn.Linear(DIM_HIDDEN_2, DIM_HIDDEN_3)
        self.out = nn.Linear(DIM_HIDDEN_3, DIM_OUT)

    def forward(self, x):
        x = F.relu(self.hidden1(x))
        x = F.tanh(self.hidden2(x))
        x = F.tanh(self.hidden3(x))
        x = self.out(x)
        return x

model = Net()
loss_fn = nn.MSELoss(size_average=False)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARN_RATE)

tr_data, tr_labels = loadData()
tr_data_torch = torch.zeros(BATCH_SIZE, DIM_IN)
tr_labels_torch = torch.zeros(BATCH_SIZE, DIM_OUT)
for i in range(BATCH_SIZE):
    tr_data_torch[i] = torch.from_numpy(tr_data[i])
    tr_labels_torch[i] = tr_labels[i]

for t in range(EPOCH_NUM):
    labels_pred = model(tr_data_torch)
    loss = loss_fn(labels_pred, tr_labels_torch)
    #print(t, loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
I have to say, these are our first steps in PyTorch, so please forgive me if there are some obvious, dumb mistakes. I appreciate any help or hint.
Thank you!
EDIT 1 ------------------------------------------------------------------
Following the comments and answers, we improved our code. The loss function now reaches reasonable values for the first time, around 250 (in part simply because the loss is now averaged rather than summed over the 5050 samples). Our new class definition looks like:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        #super().__init__()
        self.hidden1 = nn.Sequential(nn.Linear(DIM_IN, DIM_HIDDEN_1), nn.ReLU())
        self.hidden2 = nn.Sequential(nn.Linear(DIM_HIDDEN_1, DIM_HIDDEN_2), nn.ReLU())
        self.hidden3 = nn.Sequential(nn.Linear(DIM_HIDDEN_2, DIM_HIDDEN_3), nn.ReLU())
        self.out = nn.Linear(DIM_HIDDEN_3, DIM_OUT)

    def forward(self, x):
        x = self.hidden1(x)
        x = self.hidden2(x)
        x = self.hidden3(x)
        x = self.out(x)
        return x
and the loss function:
loss_fn = nn.MSELoss(size_average=True, reduce=True)
As we stated before, we already had far more satisfying results in Keras with the TensorFlow backend. The loss was around 30, with a similar network configuration. I share the essential parts(!) of our Keras code here:
model = Sequential()
model.add(Dense(4, activation="linear", input_shape=(2,)))
model.add(Dense(16, activation="relu"))
model.add(Dense(4, activation="relu"))
model.add(Dense(1, activation="linear"))
model.summary()
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mse"])
history = model.fit(np.array(tr_data), np.array(tr_labels),
                    validation_data=(np.array(val_data), np.array(val_labels)),
                    batch_size=50, epochs=200, callbacks=[cbk])
Thank you already for all the help! If anybody still has suggestions to improve the network, we would be happy about it. As somebody already asked for the data, we want to share a pickle file here:
https://mega.nz/#!RDYxSYLY!P4a9mEDtZ7A5Bl7ZRjRk8EzLXQt2gyURa3wN3NCWFPA
together with the code to access it:
import pickle

f = open("data.pcl", "rb")
tr_data = pickle.load(f)
tr_labels = pickle.load(f)
val_data = pickle.load(f)
val_labels = pickle.load(f)
f.close()
It might be worth looking at the differences between torch.nn and torch.nn.functional (see here). Essentially, it could be that your backpropagation graph is not executed entirely correctly due to a different specification.
As pointed out by previous commenters, I would suggest defining your layers including the activations. My personal favorite way is to use nn.Sequential(), which allows you to chain multiple operations together, like so:
self.hidden1 = nn.Sequential(nn.Linear(DIM_IN, DIM_HIDDEN_1), nn.ReLU())
and then simply calling self.hidden1 later (without wrapping it in F.relu()).
May I also ask why you do not call the commented super(Net, self).__init__() (which is the generally recommended way)?
Additionally, if that should not fix the problem, can you maybe just share the code for Keras in comparison?
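A further possible tweak, going off the Keras settings shown above (batch_size=50 and a mean squared error): the PyTorch loop trains on the full 5050-sample batch with a summed loss, so mirroring the Keras setup with a DataLoader and the default mean-reduced MSELoss might make the two runs more comparable. A rough sketch, reusing the tensors and constants from the question:
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(tr_data_torch, tr_labels_torch)
loader = DataLoader(dataset, batch_size=50, shuffle=True)

loss_fn = nn.MSELoss()  # default 'mean' reduction, like Keras' mean_squared_error
optimizer = torch.optim.Adam(model.parameters(), lr=LEARN_RATE)

for epoch in range(EPOCH_NUM):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()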

Trying to balance my dataset through sample_weight in scikit-learn

I'm using RandomForest for classification, and I have an unbalanced dataset: 5830 no, 1006 yes. I try to balance my dataset with class_weight and sample_weight, but I can't.
My code is:
X_train,X_test,y_train,y_test = train_test_split(arrX,y,test_size=0.25)
cw='auto'
clf=RandomForestClassifier(class_weight=cw)
param_grid = { 'n_estimators': [10,50,100,200,300],'max_features': ['auto', 'sqrt', 'log2']}
sw = np.array([1 if i == 0 else 8 for i in y_train])
CV_clf = GridSearchCV(estimator=clf, param_grid=param_grid, cv= 10,fit_params={'sample_weight': sw})
But I don't get any improvement in my TPR, FPR, or ROC ratios when using class_weight and sample_weight.
Why? Am I doing anything wrong?
Nevertheless, if I use the function called balanced_subsample, my ratios improve considerably:
def balanced_subsample(x, y, subsample_size):
    class_xs = []
    min_elems = None
    for yi in np.unique(y):
        elems = x[(y == yi)]
        class_xs.append((yi, elems))
        if min_elems is None or elems.shape[0] < min_elems:
            min_elems = elems.shape[0]
    use_elems = min_elems
    if subsample_size < 1:
        use_elems = int(min_elems * subsample_size)
    xs = []
    ys = []
    for ci, this_xs in class_xs:
        if len(this_xs) > use_elems:
            np.random.shuffle(this_xs)
        x_ = this_xs[:use_elems]
        y_ = np.empty(use_elems)
        y_.fill(ci)
        xs.append(x_)
        ys.append(y_)
    xs = np.concatenate(xs)
    ys = np.concatenate(ys)
    return xs, ys
My new code is:
X_train_subsampled,y_train_subsampled=balanced_subsample(arrX,y,0.5)
X_train,X_test,y_train,y_test = train_test_split(X_train_subsampled,y_train_subsampled,test_size=0.25)
cw='auto'
clf=RandomForestClassifier(class_weight=cw)
param_grid = { 'n_estimators': [10,50,100,200,300],'max_features': ['auto', 'sqrt', 'log2']}
sw = np.array([1 if i == 0 else 8 for i in y_train])
CV_clf = GridSearchCV(estimator=clf, param_grid=param_grid, cv= 10,fit_params={'sample_weight': sw})
This is not a full answer yet, but hopefully it'll help get there.
First some general remarks:
To debug this kind of issue it is often useful to have deterministic behavior. You can pass the random_state parameter to RandomForestClassifier and the various scikit-learn objects that have inherent randomness to get the same result on every run. You'll also need:
import numpy as np
np.random.seed(0)  # any fixed seed
import random
random.seed(0)
for your balanced_subsample function to behave the same way on every run.
Don't grid search on n_estimators: more trees is always better in a random forest.
Note that sample_weight and class_weight have a similar objective: the actual sample weights will be sample_weight multiplied by the weights inferred from class_weight (see the sketch at the end of this answer).
Could you try:
Using subsample_size=1 in your balanced_subsample function. Unless there's a particular reason not to do so, we're better off comparing results on a similar number of samples.
Using your subsampling strategy with class_weight and sample_weight both set to None.
EDIT: Reading your comment again I realize your results are not so surprising!
You get a better (higher) TPR but a worse (higher) FPR.
It just means your classifier tries hard to get the samples from class 1 right, and thus makes more false positives (while also getting more of those right of course!).
You will see this trend continue if you keep increasing the class/sample weights in the same direction.
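To make the sample_weight/class_weight interplay mentioned above concrete, a minimal sketch with made-up labels (compute_sample_weight is only used here for illustration):
import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

y_train = np.array([0, 0, 0, 0, 1, 1])        # imbalanced toy labels
manual_sw = np.where(y_train == 0, 1.0, 8.0)  # the hand-written weights from the question

# class_weight='balanced' corresponds to per-sample weights like these
balanced_sw = compute_sample_weight('balanced', y_train)

# The effective weight each sample gets is (roughly) the product of both
print(manual_sw * balanced_sw)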
There is an imbalanced-learn API that helps with oversampling/undersampling data, which might be useful in this situation. You can pass your training set into one of the methods and it will output the oversampled data for you. See the simple example below:
from imblearn.over_sampling import RandomOverSampler
ros = RandomOverSampler(random_state=1)
x_oversampled, y_oversampled = ros.fit_resample(orig_x_data, orig_y_data)
Here is the link to the API: http://contrib.scikit-learn.org/imbalanced-learn/api.html
Hope this helps!
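One design note when combining this with a train/test split: resampling is usually applied to the training portion only, so the held-out test set keeps the original class distribution. A rough sketch, reusing the orig_x_data/orig_y_data names from above:
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import RandomOverSampler

X_train, X_test, y_train, y_test = train_test_split(orig_x_data, orig_y_data,
                                                    test_size=0.25, random_state=1)

ros = RandomOverSampler(random_state=1)
X_train_res, y_train_res = ros.fit_resample(X_train, y_train)  # oversample the training data only
# fit the classifier on (X_train_res, y_train_res) and evaluate on the untouched (X_test, y_test)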