Cyclical learning rate with R and Keras

An example on how to perform cyclical learning rate with EfficientNet in R and Keras.

Etienne Rolland https://github.com/Cdk29
06-30-2021

Efficientnet with R and Tf2

In this blog post I will share a way to perform cyclical learning rate scheduling with R. I built on top of some source code I found by chance on another blog, but I adjusted things to make it more similar to the fast.ai approach. Also, my blog is syndicated on R-bloggers, so other R users who want to use cyclical learning rates with R will have less trouble finding it. Many things are possible in R, but since our community is smaller, we don’t have as many resources or tutorials as the Python community.

What is a cyclical learning rate? In a nutshell, it consists of varying the learning rate between a minimum and a maximum value over the course of an epoch. The benefits are that: 1) you don’t need to keep trying different learning rates, and 2) it acts as a form of regularization. It also trains the network faster, a phenomenon named “super convergence”.
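To make the idea concrete, here is a minimal sketch of the triangular policy from Leslie Smith’s cyclical learning rate paper, written as a Keras callback that updates the learning rate at every batch. The values of base_lr, max_lr and step_size are made up for illustration, and model is assumed to be an already compiled Keras model; this is not the exact implementation used later in this post.

library(keras)

#Minimal sketch of a triangular cyclical learning rate.
#base_lr, max_lr and step_size are illustrative values only.
base_lr   <- 1e-4   #lower bound of the cycle
max_lr    <- 1e-3   #upper bound of the cycle
step_size <- 2000   #number of batches in half a cycle

iteration <- 0
clr_callback <- callback_lambda(
  on_batch_begin = function(batch, logs) {
    iteration <<- iteration + 1
    cycle  <- floor(1 + iteration / (2 * step_size))
    x      <- abs(iteration / step_size - 2 * cycle + 1)
    new_lr <- base_lr + (max_lr - base_lr) * max(0, 1 - x)
    k_set_value(model$optimizer$lr, new_lr)
  }
)
#the callback would then be passed to fit() via callbacks = list(clr_callback)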

About the data

I wrote this code in the first place in the context of the Cassava Leaf Disease Classification, a Kaggle competition where the goal was to train a model to identify diseases on cassava leaves. Like last time, I will use an EfficientNetB0.

#reticulate::py_install(packages = "tensorflow", version = "2.3.0", pip=TRUE)
library(tidyverse)
library(tensorflow)
tf$executing_eagerly()
[1] TRUE
tensorflow::tf_version()
[1] '2.3'

Here I flex with my own version of keras. Basically, it is a fork with application wrappers for EfficientNet.

Disclaimer: I did not write the code for these really handy application wrappers. It came from this commit, for which the PR was put on hold until the full release of TF 2.3, as stated in this PR. I am not sure why the PR is closed.

devtools::install_github("Cdk29/keras", dependencies = FALSE)
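Once the fork is installed, the wrapper can be used like any other keras application. The snippet below is only a sketch of how the model could be instantiated: the function name application_efficientnet_b0() and its arguments are assumed to follow the usual keras application conventions, and this is not necessarily how the model is built later on.

#Sketch only: assuming the fork exposes application_efficientnet_b0() with the
#usual keras application arguments (include_top, weights, input_shape).
library(keras)
base_model <- application_efficientnet_b0(
  include_top = FALSE,            #drop the imagenet classification head
  weights = "imagenet",
  input_shape = c(448, 448, 3)
)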
labels<-read_csv('train.csv')
head(labels)
# A tibble: 6 x 2
  image_id       label
  <chr>          <dbl>
1 1000015157.jpg     0
2 1000201771.jpg     3
3 100042118.jpg      1
4 1000723321.jpg     1
5 1000812911.jpg     3
6 1000837476.jpg     3
levels(as.factor(labels$label))
[1] "0" "1" "2" "3" "4"
idx0<-which(labels$label==0)
idx1<-which(labels$label==1)
idx2<-which(labels$label==2)
idx3<-which(labels$label==3)
idx4<-which(labels$label==4)
labels$CBB<-0
labels$CBSD<-0
labels$CGM<-0
labels$CMD<-0
labels$Healthy<-0
labels$CBB[idx0]<-1
labels$CBSD[idx1]<-1
labels$CGM[idx2]<-1
labels$CMD[idx3]<-1

“Would it have been easier to write a function to convert the labels?” you may ask.

labels$Healthy[idx4]<-1

Probably.
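For the record, here is a sketch of what such a function could look like. It is only an illustration and not the code used in the rest of this post.

#Sketch of a more compact labelling: map the numeric label (0 to 4) to its
#class name and one-hot encode it. Illustration only, not used below.
one_hot_labels <- function(labels) {
  class_names <- c("CBB", "CBSD", "CGM", "CMD", "Healthy")
  for (i in seq_along(class_names)) {
    labels[[class_names[i]]] <- as.numeric(labels$label == i - 1)
  }
  labels
}
#labels <- one_hot_labels(labels)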

#labels$label<-NULL
head(labels)
# A tibble: 6 x 7
  image_id       label   CBB  CBSD   CGM   CMD Healthy
  <chr>          <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
1 1000015157.jpg     0     1     0     0     0       0
2 1000201771.jpg     3     0     0     0     1       0
3 100042118.jpg      1     0     1     0     0       0
4 1000723321.jpg     1     0     1     0     0       0
5 1000812911.jpg     3     0     0     0     1       0
6 1000837476.jpg     3     0     0     0     1       0

The following code is taken from this online notebook named simple-convnet, which used a better approach to create the validation set than I did in the first place (not at random, but with stratification):

set.seed(6)

tmp = splitstackshape::stratified(labels, c('label'), 0.90, bothSets = TRUE)

train_labels = tmp[[1]]
val_labels = tmp[[2]]

#the following line is for knowledge distillation:
write.csv(val_labels, file='validation_set.csv', row.names=FALSE, quote=FALSE)


train_labels$label<-NULL
val_labels$label<-NULL

head(train_labels)
         image_id CBB CBSD CGM CMD Healthy
1: 3903787097.jpg   1    0   0   0       0
2: 1026467332.jpg   1    0   0   0       0
3:  436868168.jpg   1    0   0   0       0
4: 2270851426.jpg   1    0   0   0       0
5: 3234915269.jpg   1    0   0   0       0
6: 3950368220.jpg   1    0   0   0       0
head(val_labels)
         image_id CBB CBSD CGM CMD Healthy
1: 1003442061.jpg   0    0   0   0       1
2: 1004672608.jpg   0    0   0   1       0
3: 1007891044.jpg   0    0   0   1       0
4: 1009845426.jpg   0    0   0   1       0
5: 1010648150.jpg   0    0   0   1       0
6: 1011139244.jpg   0    0   0   1       0
summary(train_labels)
   image_id              CBB               CBSD       
 Length:19256       Min.   :0.00000   Min.   :0.0000  
 Class :character   1st Qu.:0.00000   1st Qu.:0.0000  
 Mode  :character   Median :0.00000   Median :0.0000  
                    Mean   :0.05079   Mean   :0.1023  
                    3rd Qu.:0.00000   3rd Qu.:0.0000  
                    Max.   :1.00000   Max.   :1.0000  
      CGM              CMD           Healthy      
 Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
 Median :0.0000   Median :1.000   Median :0.0000  
 Mean   :0.1115   Mean   :0.615   Mean   :0.1204  
 3rd Qu.:0.0000   3rd Qu.:1.000   3rd Qu.:0.0000  
 Max.   :1.0000   Max.   :1.000   Max.   :1.0000  
summary(val_labels)
   image_id              CBB               CBSD       
 Length:2141        Min.   :0.00000   Min.   :0.0000  
 Class :character   1st Qu.:0.00000   1st Qu.:0.0000  
 Mode  :character   Median :0.00000   Median :0.0000  
                    Mean   :0.05091   Mean   :0.1023  
                    3rd Qu.:0.00000   3rd Qu.:0.0000  
                    Max.   :1.00000   Max.   :1.0000  
      CGM              CMD            Healthy      
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :0.0000   Median :1.0000   Median :0.0000  
 Mean   :0.1116   Mean   :0.6147   Mean   :0.1205  
 3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
image_path<-'cassava-leaf-disease-classification/train_images/'
#data augmentation
datagen <- image_data_generator(
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.5,
  horizontal_flip = TRUE,
  fill_mode = "reflect"
)
img_path<-"cassava-leaf-disease-classification/train_images/1000015157.jpg"

img <- image_load(img_path, target_size = c(448, 448))
img_array <- image_to_array(img)
img_array <- array_reshape(img_array, c(1, 448, 448, 3))
img_array<-img_array/255
# Generator that will flow augmented images
augmentation_generator <- flow_images_from_data(
  img_array, 
  generator = datagen, 
  batch_size = 1 
)
op <- par(mfrow = c(2, 2), pty = "s", mar = c(1, 0, 1, 0))
for (i in 1:4) {
  batch <- generator_next(augmentation_generator)
  plot(as.raster(batch[1,,,]))
}