Background
There is no consensus on the use of quality scores in diagnostic systematic reviews. The objective of this study was to apply different methods of weighting the items of QUADAS, a quality assessment tool for diagnostic accuracy studies, to produce an overall quality score, and to examine the effects of incorporating these scores into a systematic review.

Methods
We developed five schemes for weighting QUADAS items to produce quality scores. We used three methods to investigate the effects of quality scores on estimates of test performance. We used a set of 28 studies that assessed the accuracy of ultrasound for the diagnosis of vesico-ureteral reflux in children.

Results
Different methods of weighting individual items from the same quality assessment tool produced different quality scores. The scoring schemes ranked the studies in different orders; this was especially evident for studies of intermediate quality. Comparing the results of studies stratified as "high" and "low" quality on the basis of quality scores led to different conclusions about the effects of quality on estimates of diagnostic accuracy, depending on the method used to produce the quality score. A similar effect was observed when quality scores were included in meta-regression analysis as continuous variables, although the differences were less apparent.

Conclusion
Quality scores should not be incorporated into diagnostic systematic reviews. Incorporating the results of the quality assessment into a systematic review should involve investigating the association of individual quality items with estimates of diagnostic accuracy, rather than using a combined quality score.
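The finding that different weightings of the same items can reorder studies may be easier to see in a toy calculation. The sketch below is illustrative only: the three studies, the four binary items, and the three weighting schemes are all hypothetical and are not the QUADAS items, the five schemes, or the 28 studies used in this work. It assumes a quality score is a weighted sum of the items a study satisfies, scaled by the total weight.

```python
# Hypothetical illustration only: the items, weights, and studies below are
# invented for this sketch and do not come from the study described above.

# Item-level assessments: 1 = criterion met, 0 = not met or unclear.
ASSESSMENTS = {
    "study_A": [1, 1, 0, 0],
    "study_B": [0, 0, 1, 1],
    "study_C": [1, 0, 1, 0],
}

# Three assumed weighting schemes applied to the same four items.
SCHEMES = {
    "equal": [1, 1, 1, 1],
    "front_loaded": [3, 2, 1, 1],
    "back_loaded": [1, 1, 2, 3],
}


def quality_score(items, weights):
    """Weighted sum of the criteria met, scaled to the range 0 to 1."""
    if len(items) != len(weights):
        raise ValueError("items and weights must have the same length")
    return sum(i * w for i, w in zip(items, weights)) / sum(weights)


def rank_studies(assessments, weights):
    """Return study names from highest to lowest score (ties broken by name)."""
    scores = {
        name: quality_score(items, weights)
        for name, items in assessments.items()
    }
    return sorted(scores, key=lambda name: (-scores[name], name))


if __name__ == "__main__":
    for scheme, weights in SCHEMES.items():
        print(scheme, rank_studies(ASSESSMENTS, weights))
```

Under equal weights the three hypothetical studies tie, while the two unequal schemes place study_A and study_B at opposite ends of the ranking. The item-level assessments never change; only the weights do, which is the kind of instability the Results describe.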