End-to-end PLDC frameworks have also been investigated, with the aim of avoiding the need for manual prostate segmentation. Yang, et al. [2] incorporated a CNN for automatic prostate segmentation ahead of the PLDC. However, the shallow network (i.e., five layers) may extract insufficient prostate image features, which could substantially degrade the overall segmentation accuracy. Later, Wang, et al. [30] proposed a deeper prostate segmentation model capable of detecting more complex features. Apart from improving segmentation performance, fusing spatial features using 3D CNNs is another means of enhancing PCa classification accuracy. Mehta, et al. [31] employed a patient-level 3D model for binary classification using volumetric mpMRI, achieving AUCs of 0.79 and 0.86 on their local cohort dataset and PROSTATEx, respectively. However, the model was evaluated only on single-cohort datasets, and domain shift would occur if it were applied directly to an unseen cohort [18-19]. The very few studies provided with mpMRI sequences from multiple cohorts (e.g., Mehta, et al. [31]) simply combined the heterogeneous images directly. This yields enough samples for model training but inevitably ignores the heterogeneity of the data sources. The resulting model would be prone to severe domain shift, with its predictions biased toward particular cohorts.