© 2020 IEEE. Pedestrian Detection (PD) is one of the most challenging problems for advanced driving assistance systems. Although deep models have achieved significant performance, their limited generalization ability prevents real-life deployment. The reason behind this limitation is shown to be the data dependency of deep learning, especially when the model architecture and parameters are tweaked to boost performance. As a result, pre-trained deep models tend to perform much worse when they are tested on a different dataset. Focusing on this issue, this paper investigates the use of ensembles consisting of vanilla-style deep models as weak classifiers. The main question is whether an ensemble of multiple models used with their default parameters (without any fine-tuning based on test set performance) reaches better generalization capacity and consistent performance on different datasets. Designing an ensemble for PD differs significantly from designing conventional classifier ensembles because of the importance of the bounding box properties (location, size, etc.) found by the detector. Thus, extensive experiments are carried out on how to combine the model score with bounding box properties. Both simple ensemble strategies and learning-based methods are tested in order to present a simple and effective pipeline. Finally, non-maximum suppression, which is currently being used for bounding box regression, is modified to develop an ensemble method, and the results show that it provides larger improvements than existing ensemble strategies.
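As a rough illustration of how detector outputs might be combined, the sketch below pools the detections of several models and applies plain greedy non-maximum suppression to the pooled set. It is a baseline under stated assumptions, not the modified procedure proposed in this paper: the box format `(x1, y1, x2, y2)`, the function names `iou` and `pooled_nms`, and the IoU threshold of 0.5 are all hypothetical choices made for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pooled_nms(per_model: List[List[Detection]],
               iou_thr: float = 0.5) -> List[Detection]:
    """Pool detections from all models, then apply standard greedy NMS.

    Illustrative baseline only; the threshold is an assumed value.
    """
    # Flatten all models' detections and sort by descending score.
    pooled = sorted(
        (det for dets in per_model for det in dets),
        key=lambda det: det[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for box, score in pooled:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, kept_box) <= iou_thr for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical outputs of two detectors on the same image.
model_a = [((0.0, 0.0, 10.0, 10.0), 0.9)]
model_b = [((1.0, 1.0, 11.0, 11.0), 0.8), ((50.0, 50.0, 60.0, 60.0), 0.7)]
merged = pooled_nms([model_a, model_b])
```

In this toy input, the second model's first box overlaps the first model's box with an IoU of about 0.68, so it is suppressed, while the non-overlapping box is kept. Plain pooling of this kind uses only the score and the overlap; combining the score with other bounding box properties, as investigated in the paper, would require a different merging rule.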