Objective: To determine the diagnostic accuracy of tuning fork tests for detecting fractures.

Design: Systematic review of primary studies evaluating the diagnostic accuracy of tuning fork tests for the presence of fracture.

Data sources: We searched MEDLINE, CINAHL, AMED, EMBASE, SPORTDiscus, CAB Abstracts and Web of Science from inception to November 2012. We also manually searched the reference lists of review papers and of all relevant studies identified.

Study selection and data extraction: Two reviewers independently reviewed the list of potentially eligible studies and rated the studies for quality using the QUADAS-2 tool. Data were extracted to form 2×2 contingency tables. The primary outcome measure was the accuracy of the test, measured by its sensitivity and specificity with 95% CIs.

Data synthesis: We included six studies (329 patients) covering two types of tuning fork test (pain induction and loss of sound transmission). Patients in the included studies were aged 7 to 60 years. The prevalence of fracture ranged from 10% to 80%. The sensitivity of the tuning fork tests was high, ranging from 75% to 100%. The specificity of the tests was highly heterogeneous, ranging from 18% to 95%.

Conclusions: Based on the studies in this review, tuning fork tests have some value in ruling out fractures, but are not sufficiently reliable or accurate for widespread clinical use. The small sample sizes of the studies and the observed heterogeneity make generalisable conclusions difficult.
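
The primary outcome described above (sensitivity and specificity with 95% CIs from a 2×2 contingency table) can be illustrated with a minimal Python sketch. The Wilson score interval is an assumption here, since the abstract does not state which interval method was used. The function names and the counts in the usage example are hypothetical and are not data from the included studies.

```python
import math


def wilson_ci(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name a CI method; Wilson is used
    here only as a common choice for small samples.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each with a 95% CI, from a 2x2 table.

    tp: test positive, fracture present
    fp: test positive, fracture absent
    fn: test negative, fracture present
    tn: test negative, fracture absent
    """
    sensitivity = tp / (tp + fn)  # proportion of fractures detected
    specificity = tn / (tn + fp)  # proportion of non-fractures correctly cleared
    return {
        "sensitivity": (sensitivity, *wilson_ci(tp, tp + fn)),
        "specificity": (specificity, *wilson_ci(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    result = diagnostic_accuracy(tp=18, fp=12, fn=2, tn=28)
    for name, (estimate, lower, upper) in result.items():
        print(f"{name}: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A test with high sensitivity but low specificity, the pattern reported in several of the included studies, produces few false negatives but many false positives. This is why such a test is more useful for ruling out a fracture than for confirming one.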