The identification of rainfall-runoff models requires the selection of appropriate data for model calibration. Traditionally, hydrologists rely on rules of thumb to choose a period of hydrological data for calibration (e.g., 6 years of data), and no numerical indices exist to guide this choice quantitatively. Two questions arise: how long should the calibration data be (e.g., 6 months), and from which period should they be drawn (e.g., which 6 months)? In this study, indices for selecting calibration data of adequate length and from an appropriate period are proposed by examining the spectral properties of the data sequences before calibration. With the validation data determined beforehand, we assume that the more similar the calibration data set is to the validation set, the better the rainfall-runoff model should perform after calibration. Three approaches are applied to measure the similarity between the validation and calibration data sets: the flow-duration curve, the Fourier transform, and wavelet analysis. Calibration data sets are generated by designing three scenario groups with fixed lengths of 6, 12, and 24 months, respectively, from 8 years of continuous observations in the Brue catchment, United Kingdom. Scenarios within each group have different starting times and therefore cover different periods with distinct hydrological characteristics. With a predetermined 18 month validation set and the probability distributed model chosen as the rainfall-runoff model, all three approaches produce useful indices for certain scenario groups. The information cost function, an entropy-like function based on the decomposition results of the discrete wavelet transform, is found to be the most effective index for calibration data selection. 
The study demonstrates that the information content of the calibration data is more important than the data length; thus, 6 months of data may provide more useful information than a longer series. This matters for hydrological modelers because shorter but more informative data sets allow models to be built more efficiently and effectively. The idea presented in this paper also shows potential for making better use of calibration data, especially in data-limited catchments.
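To make the wavelet-based index concrete, the following is a minimal sketch of an entropy-like information cost computed from discrete wavelet transform coefficients. It rests on several assumptions that are not stated in the abstract: a Haar wavelet, a Shannon-type entropy of the normalised coefficient energies, and a ranking of candidate calibration windows by how close their cost is to that of the validation set. The function names `haar_dwt`, `information_cost`, and `rank_candidates` are hypothetical and chosen for illustration only.

```python
import numpy as np


def haar_dwt(signal, levels):
    """Multi-level orthonormal Haar DWT (assumed wavelet, for illustration).

    Returns a list [detail_1, ..., detail_L, approx_L].
    """
    approx = np.asarray(signal, dtype=float)
    coeffs = []
    for _ in range(levels):
        if approx.size < 2:
            break
        if approx.size % 2:  # drop the last sample so that pairs line up
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)        # approximation coefficients
    coeffs.append(approx)
    return coeffs


def information_cost(signal, levels=4):
    """Shannon-type entropy of the normalised wavelet coefficient energies.

    This particular form is an assumption; the source only describes the
    index as "entropy-like".
    """
    c = np.concatenate(haar_dwt(signal, levels))
    energy = c ** 2
    total = energy.sum()
    if total == 0.0:
        return 0.0
    p = energy / total
    p = p[p > 0]  # skip zero-energy terms, since 0 * log(0) is taken as 0
    return float(-(p * np.log(p)).sum())


def rank_candidates(validation, candidates, levels=4):
    """Order candidate calibration windows by closeness of their cost to the
    validation cost (hypothetical selection rule, most similar first)."""
    target = information_cost(validation, levels)
    gaps = [abs(information_cost(c, levels) - target) for c in candidates]
    return [int(i) for i in np.argsort(gaps)]
```

In this sketch, a series whose energy is concentrated in a few wavelet coefficients receives a low cost, while a series whose energy is spread across many scales and positions receives a high one. A candidate window would be passed in as a one-dimensional array of flow (or rainfall) values sampled at the same time step as the validation series.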