Object Detection Stage:

We use Faster R-CNN as the object detection network.
Implementation: https://github.com/tryolabs/luminoth/tree/master/luminoth (TensorFlow implementation)

Experiment 1:

We used ResNet-101 as the base network to extract CNN features. We started from the pre-trained weights and fine-tuned the network after block 2.
Optimizer: SGD with momentum 0.9, learning rate 0.0003
Anchors: base_size: 256, scales: [0.25, 0.5], ratios: [1, 2]
This configuration has given the best result so far.
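
To make the anchor settings above concrete, here is a minimal sketch of how base_size, scales and ratios could combine into anchor shapes. It assumes a common convention (ratio = height / width, with the anchor area fixed at (base_size * scale)^2); the function name anchor_shapes is hypothetical, and this is not taken from the Luminoth source, which may use a different convention.

```python
import math


def anchor_shapes(base_size, scales, ratios):
    """Return (width, height) pairs, one per (ratio, scale) combination.

    Assumed convention: ratio = height / width, and every anchor keeps
    the area (base_size * scale) ** 2 regardless of its ratio.
    """
    shapes = []
    for ratio in ratios:
        for scale in scales:
            side = base_size * scale
            width = side / math.sqrt(ratio)
            height = side * math.sqrt(ratio)
            shapes.append((width, height))
    return shapes


# Values from Experiment 1.
shapes = anchor_shapes(base_size=256, scales=[0.25, 0.5], ratios=[1, 2])
for width, height in shapes:
    print(f"{width:.1f} x {height:.1f}")
```

Under these assumptions the settings give four anchor shapes per location: two square ones (64 and 128 pixels per side) and two that are twice as tall as they are wide with the same areas.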

Experiment 2:

Limited the maximum class detections to 17 and changed the optimizer to Adam. This did not perform well:
Average Precision (AP) @ [0.50] = 0.920
Average Precision (AP) @ [0.75] = 0.566
Average Precision (AP) @ [0.50:0.95] = 0.526
Average Recall (AR) @ [0.50:0.95] = 0.603
Removed the limit on maximum class detections and kept Adam as the optimizer. Performance was still poor:
Average Precision (AP) @ [0.50] = 0.961
Average Precision (AP) @ [0.75] = 0.588
Average Precision (AP) @ [0.50:0.95] = 0.552
Average Recall (AR) @ [0.50:0.95] = 0.637
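
The bracketed values in the metrics above look like COCO-style IoU thresholds. Assuming that convention, [0.50:0.95] means the metric is averaged over the ten thresholds 0.50, 0.55, ..., 0.95. The sketch below only illustrates the IoU computation and the threshold list; it does not compute AP or AR, and the example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds 0.50, 0.55, ..., 0.95 (assumed COCO convention).
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Hypothetical boxes: the prediction is shifted 2 px to the right.
ground_truth = (0, 0, 10, 10)
predicted = (2, 0, 12, 10)
overlap = iou(predicted, ground_truth)

# Thresholds at which this prediction would count as a match.
matches = [t for t in thresholds if overlap >= t]
print(f"IoU = {overlap:.3f}, matched at thresholds {matches}")
```

A detection that overlaps the ground truth only loosely counts at 0.50 but not at 0.75, which is consistent with AP @ [0.50] being much higher than AP @ [0.75] in the results above.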

Experiment 3: