Cochrane Database Syst Rev. 2017 Nov; 2017(11): CD011221.
Monitoring Editor: Sarah Hull, Vijay Tailor, Sara Balduzzi, Jugnoo Rahi, Christine Schmucker, Gianni Virgili, Annegret Dahlmann‐Noor
NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, 162 City Road, London, UK, EC1V 2PD
University of Modena and Reggio Emilia, Cochrane Italy, Department of Diagnostic, Clinical and Public Health Medicine, Via del Pozzo 71, Modena, Italy, 41124
UCL Institute of Child Health and UCL Institute of Ophthalmology, Department of Epidemiology, London, UK
Medical Center – Univ. of Freiburg, Faculty of Medicine, Univ. of Freiburg, Cochrane Germany, Breisacher Straße 153, Freiburg, Germany, 79110
University of Florence, Department of Translational Surgery and Medicine, Eye Clinic, Largo Brambilla, 3, Florence, Italy, 50134
Abstract
Background
Strabismus (misalignment of the eyes) is a risk factor for impaired visual development both of visual acuity and of stereopsis. Detection of strabismus in the community by non‐expert examiners may be performed using a number of different index tests that include direct measures of misalignment (corneal or fundus reflex tests), or indirect measures such as stereopsis and visual acuity. The reference test to detect strabismus by trained professionals is the cover‒uncover test.
Objectives
To assess and compare the accuracy of tests, alone or in combination, for detection of strabismus in children aged 1 to 6 years, in a community setting by non‐expert screeners or primary care professionals to inform healthcare commissioners setting up childhood screening programmes. Secondary objectives were to investigate sources of heterogeneity of diagnostic accuracy.
Search methods
We searched the Cochrane Central Register of Controlled Trials (CENTRAL; 2016, Issue 12) (which contains the Cochrane Eyes and Vision Trials Register) in the Cochrane Library, the Health Technology Assessment Database (HTAD) in the Cochrane Library (2016, Issue 4), MEDLINE Ovid (1946 to 5 January 2017), Embase Ovid (1947 to 5 January 2017), CINAHL (January 1937 to 5 January 2017), Web of Science Conference Proceedings Citation Index‐Science (CPCI‐S) (January 1990 to 5 January 2017), BIOSIS Previews (January 1969 to 5 January 2017), MEDION (to 18 August 2014), the Aggressive Research Intelligence Facility database (ARIF) (to 5 January 2017), the ISRCTN registry (www.isrctn.com/editAdvancedSearch; searched 5 January 2017), ClinicalTrials.gov (www.clinicaltrials.gov; searched 5 January 2017) and the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (www.who.int/ictrp/search/en; searched 5 January 2017). We did not use any date or language restrictions in the electronic searches for trials. In addition, orthoptic journals and conference proceedings without electronic listings were searched.
Selection criteria
All prospective or retrospective population‐based test accuracy studies of consecutive participants were included. Studies compared a single index test or a combination of index tests with the reference test. Only studies reporting sufficient data to calculate sensitivity and specificity, and so determine diagnostic accuracy, were included. Participants were aged 1 to 6 years. Studies reporting participants outside this range were included if subgroup data were available. Permitted settings included population‐based vision screening programmes or opportunistic screening programmes, such as those performed in schools.
Data collection and analysis
We used standard methodological procedures expected by Cochrane.
In brief, two review authors independently assessed titles and abstracts for eligibility and extracted the data, with a third senior author resolving any disagreement. We analysed data primarily for specificity and sensitivity.
Main results
One study, from a total of 1236 papers, abstracts and trials, was eligible for inclusion. It had a total of 335 participants, of whom 271 completed both the screening test and the gold standard test. The screening test using an automated photoscreener had a sensitivity of 0.46 (95% confidence interval (CI) 0.19 to 0.75) and specificity of 0.97 (CI 0.94 to 0.99). The overall number affected by strabismus was low at 13 (4.8%).
Authors' conclusions
There are very limited data in the literature to ascertain the accuracy of tests for detecting strabismus in the community as performed by non‐expert screeners. A large prospective study to compare methods would be required to determine which tests have the greatest accuracy.
Plain language summary
Tests for detecting strabismus in children aged one to six years in the community
Results and conclusion
As only one study could be included in this review, it was not possible to conclude which test is the most accurate for screening for strabismus. Further studies would be needed to determine this. However, they would need to include very large numbers of children to support statistically valid conclusions.
Summary of findings
Summary of findings 2: Data extraction from included studies
Background
Target condition being diagnosed
Strabismus is a physical condition in which the eyes are not aligned. It is associated with deficient binocularity, the mechanism that integrates visual information from both eyes. Strabismus can be primary, or it can be a consequence of poor vision in one eye or of refractive errors. Less commonly, strabismus can be caused by lesions affecting the oculomotor, trochlear or abducens nerve, or higher neurological pathways. Strabismus is rarely caused by developmental or traumatic defects of the extraocular muscles. Strabismus is a risk factor for the development of amblyopia during the 'sensitive period' of vision development. During this period, neural plasticity is greatest, and it begins to decline around the age of 6 years; clinical interventions are typically offered to children up to the age of 10 years. Screening programmes therefore attempt to identify children with amblyogenic risk factors before the age of 6 years, to allow remedial treatment.
Prevalence figures for strabismus vary. The most recent screening study in Baltimore, USA, found a prevalence of manifest strabismus of 3.3% in Caucasian and 2.1% in African American children aged 6 to 71 months (Friedman 2009). Other population‐based studies have reported a prevalence of childhood strabismus between 0.01% and 3.1%, indicating that prevalence may vary greatly by ethnicity, age, type of strabismus and definitions used (Graham 1974; Matsuo 2007a; Matsuo 2007b; Preslan 1996; Traboulsi 2008; Turacli 1995; Wedner 2000; Appendix 1).
Relevance of strabismus in children
There are many subtypes of strabismus. In the context of childhood vision screening programmes, the most relevant distinction is between manifest and latent strabismus. Manifest strabismus is a risk factor for the development of amblyopia, the commonest vision disorder in children (prevalence 1.6% to 3.6% in Western societies) (Simons 1996a).
Amblyopia is a developmental anomaly of spatial vision, usually associated with strabismus, anisometropia or from deprivation early in life (Ciuffreda 1991). Amblyopes have reduced visual acuity in one or both eyes, reduced contrast sensitivity and reduced contour integration. Clinical definitions of amblyopia are based on visual acuity only, taking into consideration the age of the child and progressive improvement of 'normal visual acuity' in the early years. Unilateral amblyopia is often defined as an interocular difference in best‐corrected visual acuity (of 2 logMAR or Snellen chart lines) (Friedman 2009), or best‐corrected visual acuity of 0.30 logMAR or worse in either eye (Rahi 2002; Traboulsi 2008). In 3‐year‐old children, bilateral amblyopia is suspected if best‐corrected visual acuity is worse than 0.40 logMAR in one eye and worse than 0.3 logMAR in the other eye in the presence of a bilateral amblyogenic risk factor. In 4‐year‐old children, the thresholds are 0.3 and 0.18 logMAR, respectively (Schmidt 2004). Strabismus has a profound effect on stereopsis or perception of depth. Stereopsis normally develops within the first 3 to 4 months of age and reaches adult levels by the age of 24 to 36 months (Braddick 1980; Fawcett 2005; Fox 1980; Petrig 1981; Takai 2005). Two studies reported that stereoacuity continues to develop beyond the age of 3 years, and may not yet be fully mature at 5 years or 12 years of age, respectively (Simons 1981a; Walraven 1993). Normal adult stereopsis is 50 to 60 seconds of arc; some childhood vision screening programmes have used a threshold of 400 seconds of arc for "suspicion of amblyopia" (Traboulsi 2008). Reduced stereopsis adversely affects motor skills, particularly fine motor skills (Grant 2007; Hrisos 2006; O'Connor 2010; Webber 2008). 
Significant misalignment can affect development (through unilateral reduced acuity, lack of depth perception and limitation of peripheral visual field), social interactions, and emotional well‐being. In children with infantile esotropia, surgical correction of strabismus leads to improvement in general development as measured by the Bayley scale (Rogers 1982). In children with strabismus, scores on anxiety and depression scales such as the National Eye Institute Visual Functioning Questionnaire and the Hospital Anxiety and Depression Scale differ significantly from those of non‐strabismic children, and improve following surgical strabismus correction (Bernfeld 1982; Chai 2009). Children with strabismus may have significantly greater conduct and externalising problems (Koklanis 2006). Strabismus can also be an indicator of severe eye and health problems. As it can indicate poor vision, it may in rare cases be the first sign of childhood cataract, glaucoma, or tumours of the eye, optic nerve, orbit or brain, such as retinoblastoma, glioma, or rhabdomyosarcoma. Gross misalignment of the eyes is usually noticed by members of the family or carers. Small angles of deviation are not necessarily apparent. In young children, features such as a broad nasal bridge or certain lid positions and shape (epicanthus) can give rise to pseudostrabismus, i.e. a perception of strabismus when in fact the eyes are straight.
Diagnosis of strabismus: the cover test
The cover test is based on the observation of a refixation movement of a deviated eye when the fixing eye is covered (Gamble 1950; McKean 1976; Romano 1971; Scott 1973). The basic form of the cover test, the cover‒uncover test, establishes the diagnosis of manifest strabismus. An occluder is introduced in front of one eye, then removed, re‐establishing binocular viewing. If an eye moves when the other is covered, this indicates that this eye was not fixing before the cover was introduced.
Any eye movement is interpreted as 'test positive' and 'manifest strabismus present'; the magnitude of the movement is often categorised as small, moderate or large. This test is used in some screening programmes to detect strabismus (Fogt 2000). The accuracy of the cover test in detecting strabismus may be affected by the child's age at screening, with better test performance in children over the age of 3 years (Williams 2001). Variations of the cover test are used to diagnose latent strabismus, and to measure the magnitude of both manifest and latent strabismus. The presence of latent strabismus is assessed by using the alternate cover test. The occluder is moved from eye to eye, allowing viewing of the target with one eye only, without permitting binocular viewing. The observer notes refixation movements of either eye as the cover is removed. Quantitative measurements are obtained by neutralising the strabismus with prisms held in front of one eye whilst performing the cover‒uncover test (simultaneous prism cover test) or the alternate cover test (prism alternate cover test); the endpoint of measurements is the prism with which no refixation movement is observed when the cover is removed. To trained professionals (orthoptists) refixation on cover test can indicate strabismus; however, this has not been used in published screening studies. All cover tests are carried out with the participant fixing on a target presented at distances of 6, 4 or 3 metres, and then at near distances (33 cm or 40 cm). In children, the distance target is often presented at 3 metres. In very young children the test is often only carried out at near fixation. The cover‒uncover test aims to detect strabismus, but not refractive errors, the other significant group of amblyogenic risk factors. Its accuracy as a standalone amblyopia screening test is therefore limited (Schmidt 2004). 
Conversely, addition of the cover‒uncover test to vision screening tests increases the detection rate of strabismus (VIP 2007). Vision screening programmes for children between 4 and 6 years traditionally use optotype testing to determine visual acuity (matching or naming letters or pictures), with or without a cover test to detect strabismus. In an effort to screen younger children to identify and treat problems early, these 'manual' screening programmes are increasingly supplemented or even replaced by the use of devices such as photorefractors, which also aim to provide information about refractive amblyogenic risk factors. The American Association of Pediatric Ophthalmology and Strabismus (AAPOS) recently published updated recommendations for automated screening programmes (Donahue 2013). Screening methods were categorised into refractive and non‐refractive screening instruments. With regards to detection of strabismus, the AAPOS recommends that non‐refractive screening devices should detect manifest strabismus greater than 8 prism dioptres (PD) in primary position (Donahue 2013). UK recommendations suggest that screening at age 4 to 5 years old provides the most accuracy and allows adequate time to treat (Solebo 2015).
Index test(s)
Different tests are in use to detect strabismus in a community or primary care setting by non‐expert screeners or primary care professionals.
We planned to include studies that report combinations of several index tests. Other tests for strabismus, such as controlled binocular acuity test, suppression tests, blur test, and tests designed to detect reduced fusional reserve (prism reflex test, prism fusion range) are not used by lay screeners, but only by trained professionals (orthoptists). Orthoptist‐delivered screening is not within the remit of this review.
Principles underlying each type of index test
Detailed information about each index test is given in Appendix 2.
Type 1: tests which directly identify ocular misalignment
In manifest strabismus with childhood onset, information from the deviating eye is suppressed, so that a person does not perceive double vision. The principle of the cover test is that when the fixating eye is covered, the deviating eye will move to primary position (looking straight ahead position) to take up fixation, as long as it has some vision and does not have eccentric fixation or severe eye movement deficit. Presence, speed and magnitude of this refixation movement are the outcomes of the cover test.
Type 2: tests of binocular function ‒ stereopsis
The visual axes need to be within a certain angle to each other in order to detect information that is presented in stereotests. Strabismus may be associated with reduced stereopsis.
Type 3: tests designed to detect reduced central vision/visual acuity
Though not a specific indicator, visual acuity tests may indicate the presence of strabismus‐induced amblyopia.
Type 4: automated refraction devices designed to report ocular misalignment
Some autorefractors indicate asymmetry of corneal reflections.
Clinical pathway
Childhood vision screening programmes vary around the world, and may differ between high‐ and low‐ to middle‐income countries.
World Health Organization (WHO) Member States are grouped into low‐ and middle‐income countries (LMIC) by WHO region, separating out high‐income countries within each of these regions, based on the World Bank's gross national income per capita (World Bank; World Health Organization 2014). Low‐ and middle‐income states have a gross national income per capita of less than USD 12,276 whereas high‐income states have a gross national income per capita of USD 12,276 or more. High‐income countries often have national guidelines for screening, though these are not necessarily matched by an established national screening programme. For example in Israel and Sweden, there are established national screening programmes (Schmucker 2009). In the UK, the National Screening Committee recommends vision screening of children during the school year of their fifth birthday, to be delivered under the supervision of orthoptists with the focus on screening for visual impairment as the target condition, not other risk factors for amblyopia (Hall 2003; UK National Screening Committee). Abnormal screening results trigger referral for a comprehensive eye examination. Despite this recommendation, implementation of childhood vision screening continues to show regional variation. In the USA, Canada, Belgium and Germany there are no national screening programmes although regional programmes do exist. There are, however, guidelines in the USA which recommend vision and alignment screening between the age of 3 and 3.5 years by a suitably trained individual (American Academy of Ophthalmology 2012). In Canada, there are national guidelines for screening visual acuity and ocular alignment at 3 to 5 years of age, but no established screening programme (Canadian Paediatric Society 2009). In many countries office‐based paediatricians, ophthalmologists and optometrists offer annual 'child health' or 'eye health' checks, respectively, but these occur outside national programmes. 
In low‐ and middle‐income countries, little information on national screening programmes is available in the literature or online (World Health Organization 2014). One exception is Iran, where a national screening programme of 3 to 6 year olds performed by kindergarten teachers assessing visual acuity with illiterate 'E' Snellen charts has been in place since 1996 with an estimated uptake of 67% of eligible children in 2005 (Khandekar 2009). In India there is no national screening (Jose 2009). There are efforts to find cost‐effective strategies for screening in developing countries such as a remote photoscreener system piloted in Brazil and China, and a home‐based screening programme in China performed by parents (Donahue 2008; Lan 2012).
Rationale
Strabismus is a risk factor for the development of amblyopia. Whilst large deviations may be detected by family, friends or lay screeners, small deviations may go unnoticed, leading to suppression of visual information from the deviated eye. Childhood vision screening programmes use varying combinations of tests, depending on the age at which children are tested, and the type of professionals carrying out tests. Strabismus tests as part of a combination of tests may increase the precision of childhood amblyopia screening (VIP 2007). Published vision screening studies often lack specific information on strabismus detection, instead reporting overall precision in detecting amblyopia. This review does not propose screening for strabismus that is of no aesthetic concern or visual consequence, but it is important to summarise current evidence on accuracy of tests in detecting strabismus in a screening setting, in order to enable healthcare commissioners to implement the most effective screening programme.
Objectives
To assess and compare the accuracy of tests, alone or in combination, for detection of strabismus in children aged 1 to 6 years, in a community setting by non‐expert screeners or primary care professionals to inform healthcare commissioners setting up childhood screening programmes.
Secondary objectives
Other objectives were to investigate sources of heterogeneity of diagnostic accuracy, including:
Methods
Criteria for considering studies for this review
Types of studies
We have included all prospective or retrospective population‐based test accuracy studies of consecutive participants. By 'population‐based' we mean not only screening studies, implying sampling based on census, but also studies recruiting from community services such as schools or paediatric health districts. Hospital cohorts were excluded, unless the sampling from a community service was clearly described. We have included studies that compare a single index test, or a combination of index tests, with a reference standard (cover test, performed as a standalone test or as part of a comprehensive eye examination). Case‐control studies, in which children are selected based on their disease status, have been excluded unless they are nested in large prospective consecutive studies. Studies had to provide sufficient data to calculate diagnostic accuracy (sensitivity, specificity). We planned to include studies in which only a subgroup of participants had undergone the reference test; the result from these to be considered by subgroup analysis.
Participants
We included children aged 1 to 6 years old. Strabismus is a risk factor for the development of amblyopia during the 'sensitive period' of vision development. During this period, neural plasticity is greatest, and it begins to decline around the age of 6 years; clinical interventions are typically offered to children up to the age of 10 years. Screening programmes therefore attempt to identify children with amblyogenic risk factors before the age of 6 years, to allow remedial treatment. We set the lower limit of the age range at 1 year to avoid overlap with early postnatal eye screening programmes. In countries where children start school in the academic year of their fifth birthday, screening programmes aim to capture children aged 4 to 5 years, i.e. during their first year at school.
In other countries, the year of school entry can be earlier or later, and vision screening programmes may be carried out in the first year of school, or independent from schools. An age range of 1 to 6 years allows inclusion of all population‐based studies in children at risk of developing amblyopia from strabismus. When studies included children outside the range of 1 to 6 years, we tried to obtain subgroup data. If we did not obtain subgroup data we excluded these studies. We intended to include these studies if the proportion of children beyond age 6 was less than an agreed threshold, e.g. 20%, and we would have conducted sensitivity or subgroup analyses as appropriate. We considered children attending population‐based vision screening programmes. We included opportunistic screening programmes, such as those including children attending schools. We excluded orthoptist‐delivered programmes, as these include the reference standard.
Index tests
We included any test used by lay screeners to detect strabismus, either directly by identifying misalignment, or indirectly by identifying a consequence of strabismus such as loss of stereovision. The participant age range means that different tests may be used, as appropriate for the age of participants in each particular study. We described the index tests by test type rather than enumerating individual tests.
If the search had revealed several high‐quality studies for each test type for inclusion in this review, we would have considered splitting the review by test type group. Finally, we did not consider tests that require specialist skills, such as the 4‐dioptre prism test, since we are concerned with population screening which is typically carried out by non‐expert professionals, not by orthoptists, optometrists or ophthalmologists, who would directly use our reference standard, the cover‒uncover test.
Target conditions
The target condition is constant or intermittent manifest strabismus of any magnitude and type (esotropia, microtropia, exotropia, hyper/hypotropia).
Reference standards
The reference standard considered in this review was the cover‒uncover test, whether used alone or within a comprehensive ophthalmic examination, or in combination with other tests, by trained personnel. We included studies that used the cover‒uncover test, regardless of the type of professional performing the test. We planned to note the type of professional (ophthalmologist, orthoptist, optometrist, trained technician, non‐expert screener) and to analyse it by subgroup. We included studies in which the cover‒uncover or alternate cover test was used as part of a comprehensive eye examination, which often also includes visual acuity, biomicroscopy and refraction. For the latter scenario there is a risk of incorporation bias. This bias can be avoided by ensuring that the tests that are part of the reference standard 'comprehensive eye examination' do not belong to the same type of test as the index test included in that analysis. We excluded the whole study if a single test was assessed and there was incorporation bias, and we excluded part of the study data if a study comparing several index tests suffered from incorporation bias regarding a specific test.
Search methods for identification of studies
Electronic searches
The Cochrane Eyes and Vision Information Specialist searched the following electronic databases. There were no language or publication year restrictions.
Searching other resources
We used the weblink pcwww.liv.ac.uk/~rowef/index_files/Page646.htm to search the following orthoptic journals and conference proceedings which are not electronically listed: British and Irish Orthoptic Journal, American Orthoptic Journal, Australian Orthoptic Journal, European Strabismus Association, International Strabismus Association and the International Orthoptic Congress. We contacted study authors for further clarification when required.
Data collection and analysis
Selection of studies
Two review authors (SH, VT) independently assessed the titles and abstracts for eligibility. We sorted the abstracts into 'definitely exclude' and 'possibly include' categories, recognising that sometimes it is not possible to judge from the abstract whether a reference fulfils the criteria or not. We placed all abstracts selected by at least one review author in the 'possibly include' category. We resolved disagreements at each step by discussion between the two review authors and a third senior author (AD‐N).
Data extraction and management
We extracted the number of:
Uninterpretable test results at individual participant level were recorded in primary publications when a child did not comply with a test, i.e. refused to give an answer during assessment of visual acuity or stereovision, or did not fixate on targets for automated devices, or in case of ocular abnormalities affecting the clarity of cornea, lens or vitreous, or a combination of the three. For each study, we recorded how such cases were treated in the analyses. Two review authors independently extracted the data to ensure consistency and entered these into Cochrane's statistical software, Review Manager 5 (RevMan 5) (Review Manager 2014). We have extracted the data shown in Table 3, which we have displayed in the Characteristics of included studies tables.
1. Data extraction from included studies
Assessment of methodological quality
We used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)‐2 tool to evaluate the risk of bias and applicability of primary studies (www.bris.ac.uk/quadas/quadas‐2). QUADAS‐2 consists of four key domains: patient selection, index test, reference standard, and flow and timing. The tool is completed in four phases.
Each domain is assessed in terms of the risk of bias and the first three are also assessed in terms of concerns regarding applicability. To help reach a judgement on the risk of bias, signalling questions are included. These flag aspects of study design related to the potential for bias and aim to help review authors make risk of bias judgements. Two review authors independently assessed the methodological quality of the included studies. A third senior author resolved disagreements on study quality. Table 4 shows the guidance the review authors used when judging the methodological quality of studies.
2. QUADAS‐2 assessment guidance
We scored the risk of bias signalling questions as 'yes/no/unclear' as detailed in Table 4. Risk of bias was judged as 'low', 'high' or 'unclear'. When we answered 'yes' for all signalling questions for a domain then we could judge the risk of bias 'low'. If we answered any question as 'no', this flagged the potential for bias. When this occurred, we followed the guidelines developed in phase 2 of the quality assessment process to judge the risk of bias. We used the 'unclear' category only when insufficient data were reported to permit a judgement. We judged applicability of primary studies to the review question in a similar manner. We also recorded study sponsorship.
Statistical analysis and data synthesis
We used two‐by‐two data of index and reference test results to calculate the sensitivities and specificities, with their 95% confidence intervals. We used the RevMan 5 software for descriptive analyses, and plotted individual studies in forest plots. Considering test threshold across different test types was the most important analytic issue in this review. We planned to use a continuous output measure for most tests: ocular misalignment as prism dioptres (PD) or degrees for test type 1; stereoscopic acuity as seconds of arc for test type 2; visual acuity in logMAR for test type 3 (acknowledging that comparison of values may be hampered by use of charts with different optotype size steps, and that simple mathematical conversion from Snellen to logMAR may be inaccurate); and millimetres or a ratio for test type 4. Other tests listed in Appendix 2 and used in the diagnosis of, but not in screening for, strabismus are not based on an explicit common measure. However, in practice the heterogeneous execution and technical characteristics of the tests made it difficult to consider using an explicit threshold in statistical analyses, and implicit threshold effects are more likely.
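The two‐by‐two calculation described above can be sketched in a few lines of Python. This is an illustration only, not the RevMan 5 implementation: the review does not state which interval method was applied, so the exact (Clopper‐Pearson) binomial interval used here is an assumption. The counts are those reported in the Findings section for the single included study; the false positive count assumes that all 14 children referred for ocular misalignment were among the 271 who completed both tests. All function and variable names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target: float, iters: int = 80) -> float:
    """Bisection on [0, 1] for f(p) = target, where f increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for x successes in n trials (assumed method)."""

    def upper_tail(k: int):
        # Returns p -> P(X > k), which increases with p.
        return lambda p: 1 - binom_cdf(k, n, p)

    lower = 0.0 if x == 0 else _solve_increasing(upper_tail(x - 1), alpha / 2)
    upper = 1.0 if x == n else _solve_increasing(upper_tail(x), 1 - alpha / 2)
    return lower, upper


# Counts as reported in the Findings section of this review.
completed = 271             # interpretable photographs and gold standard exam
strabismus = 13             # strabismus on gold standard examination
referred_misalignment = 14  # index test positive for ocular misalignment
tp = 6                      # strabismus and referred for misalignment
fn = strabismus - tp        # 2 referred for refractive error + 5 passed
fp = referred_misalignment - tp        # assumption: all 14 completed both tests
tn = completed - strabismus - fp

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
sens_ci = exact_ci(tp, tp + fn)
spec_ci = exact_ci(tn, tn + fp)

print(f"sensitivity {sensitivity:.2f}, 95% CI {sens_ci[0]:.2f} to {sens_ci[1]:.2f}")
print(f"specificity {specificity:.2f}, 95% CI {spec_ci[0]:.2f} to {spec_ci[1]:.2f}")
```

Under these assumptions the point estimates round to the reported 0.46 and 0.97, and the exact interval for 6 of 13 is roughly 0.19 to 0.75. The width of that interval reflects how few children with strabismus the study contained.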
Analyses within each test type
We intended to analyse different tests within each test type group, using the following strategy. For each study, we intended to extract data at specific thresholds if available. We attempted to extract cut‐offs of 8 PD for horizontal and 1 PD for vertical deviations in test type 1; 400 arc seconds for test type 2; and visual acuity 0.2 logMAR for test type 3. UK screening recommendations specify "less than 0.2 logMAR" as referral threshold (UK National Screening Committee); guidelines from the AAPOS specify that optotype‐based screening (which covers test type 3) should detect visual acuity of less than 0.176 logMAR (Snellen 20/30) at all ages. Threshold values for test type 4 have not been published; we therefore used "any asymmetry, in millimetres or as ratio" as threshold. Thresholds are summarised in Table 5.
3. Thresholds for analysis
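The equivalence between 0.176 logMAR and Snellen 20/30 quoted above follows from the standard definition of logMAR as the base‐10 logarithm of the minimum angle of resolution, i.e. the Snellen denominator divided by the numerator. A minimal sketch of that arithmetic follows; the function names are hypothetical. As the statistical analysis section notes, a purely mathematical conversion may be inaccurate when charts use different optotype size steps, so this shows only the nominal relationship.

```python
from math import log10


def snellen_to_logmar(numerator: float, denominator: float) -> float:
    """Nominal logMAR for a Snellen fraction, e.g. 20/30 -> log10(30/20)."""
    return log10(denominator / numerator)


def logmar_to_snellen_denominator(logmar: float, numerator: float = 20.0) -> float:
    """Snellen denominator corresponding to a logMAR value, for a given numerator."""
    return numerator * 10**logmar


# AAPOS optotype threshold quoted in the text: Snellen 20/30.
print(round(snellen_to_logmar(20, 30), 3))

# Threshold of 0.2 logMAR used for test type 3, as a 20-foot Snellen denominator.
print(round(logmar_to_snellen_denominator(0.2), 1))
```

On this nominal scale, 0.2 logMAR sits just beyond 20/30, between the 20/30 and 20/40 lines of a Snellen chart.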
Investigations of heterogeneity
The framework for likely sources of heterogeneity was described previously and mainly includes setting and study population, particularly regarding referral method and inclusion criteria; type of professional executing the reference standard; and study quality assessment. We planned to investigate heterogeneity in the first instance through visual examination of forest plots of sensitivities and specificities and through visual examination of the receiver operating characteristic (ROC) plot of the raw data. However, we had insufficient data to investigate these secondary objectives.
Sensitivity analyses
Where appropriate (i.e. if not already explored in our analyses of heterogeneity) and if sufficient data were available, we planned to explore the sensitivity of any summary accuracy estimates to aspects of study quality such as nature of masking and type of reference standard, guided by the anchoring statements developed in our QUADAS‐2 exercise.
Assessment of reporting bias
We did not assess publication bias since there is no standard method to achieve this in diagnostic test accuracy reviews (Deeks 2005). For selective outcome reporting issues, such as the use of a specific cut‐off of ocular misalignment, we did not search for a protocol to assess within‐study reporting bias, since protocols of diagnostic accuracy studies are not routinely reported.
Results
Results of the search
The searches yielded a total of 2327 records (Figure 1). After de‐duplication, two authors (SH, VT) independently screened 1236 studies/papers and excluded 1129 because they did not meet the inclusion criteria (including age range, examiner type and primary use of cover test), lacked relevance or lacked results. The remaining 107 studies underwent full text review, with disagreements resolved by a third author (AD‐N) (Figure 1).
The authors of six posters were contacted to ascertain relevant publications and data for those posters; three replied, and the one publication identified did not meet the inclusion criteria because no strabismus outcomes were reported (Shallo‐Hoffman 2004). In addition, the authors of three published studies were contacted where full analysis of the results required additional data (Enzenauer 2000; Robinson 1999; Tung 2006); two of these authors responded but further data were unavailable (summarised in the Characteristics of excluded studies table). One study met the inclusion criteria and had sufficient data for analysis (Arthur 2009).

Methodological quality of included studies
Arthur 2009 was a prospective study performed in a community setting with all eligible children invited for screening and all screened participants offered a gold standard examination (Table 1). Eligible children for the study were all junior kindergarten students in a specific school district of Ontario, Canada; 98% of those enrolled were 4 or 5 years of age. The screening was conducted by certified dental assistants conjointly with an existing dental screening programme. The dental assistants underwent training on the plusoptiX S04 photoscreener (Plusoptix GmbH); the defined criterion for failing the test was a corneal reflex more than 10 degrees from the centre. Bias assessment indicated an unclear risk of bias for the patient selection domain but a low risk of bias for all other QUADAS‐2 domains (Figure 2; Figure 3). The unclear risk was due to a relatively low uptake of screening (25%), with included children volunteering rather than being sampled randomly or consecutively. There were no available data on the prevalence of strabismus in non‐responders compared to responders. There were no adverse outcomes reported.

Summary of findings 1. Summary of findings table
Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Findings
Three hundred and six children were screened by the photoscreener. Two hundred and seventy‐one had both interpretable screening photographs and completed the gold standard examination, the others having declined (n = 14), been unable to attend within the study timeframe (n = 11), become uncontactable (n = 6), had uninterpretable photographs (n = 3) or had an incomplete examination (n = 1). The photoscreener was used to ascertain refractive error, anisocoria and ocular misalignment, with 14 children referred specifically for ocular misalignment. A total of 13 children were identified to have strabismus on gold standard examination, of whom six had been referred for ocular misalignment, two had been referred for refractive error and five had passed the screening test. The two participants who were referred for refractive error rather than ocular misalignment and were found to have strabismus were considered as false negatives for calculating accuracy. The main outcomes were a sensitivity of 0.46 (95% confidence interval (CI) 0.19 to 0.75), and a specificity of 0.97 (95% CI 0.94 to 0.99) (Figure 4). The estimated prevalence of strabismus in the screened population was 4.8%. The types of strabismus identified were intermittent exotropia (n = 3 well‐controlled, n = 2 poorly‐controlled), esotropia (n = 4), hypertropia (n = 2) and exotropia (n = 2).

Forest plot of 1 Photoscreener.

Discussion
Screening for strabismus in children in the community may be achieved by tests that directly ascertain misalignment of the eyes (corneal or fundus reflections) or indirectly detect associated reduced vision or stereopsis.
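The accuracy estimates reported for Arthur 2009 can be reproduced from the counts given in the Findings. The Python sketch below is illustrative rather than the review's own analysis: it assumes that all 14 children referred for ocular misalignment were among the 271 analysed (giving 6 true positives, 8 false positives, 7 false negatives and 250 true negatives), and it uses an exact (Clopper‐Pearson) interval computed by bisection, which may differ from the method used by the review software. Function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target: float, increasing: bool) -> float:
    """Find p in [0, 1] with func(p) = target for a monotonic func."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson interval for x successes out of n trials."""
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


# Assumed 2x2 table derived from the counts reported in the Findings.
tp, fn = 6, 7      # 13 children with strabismus on gold standard examination
fp = 14 - tp       # referred for misalignment but no strabismus found
tn = 271 - (tp + fn) - fp

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
prevalence = (tp + fn) / (tp + fn + fp + tn)

print(f"sensitivity = {sensitivity:.2f}")     # 0.46
print(f"specificity = {specificity:.2f}")     # 0.97
print(f"prevalence  = {100 * prevalence:.1f}%")  # 4.8%
print("sensitivity 95% CI:", exact_ci(tp, tp + fn))
print("specificity 95% CI:", exact_ci(tn, tn + fp))
```

With only 13 children with strabismus, the interval around the sensitivity estimate is wide, which is why a single study of this size cannot support firm conclusions.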
Small deviations may not be noticed by family but may have a significant impact on visual development, hence the rationale for screening.

Summary of main results
There are limited data available on strabismus screening in the community as performed by lay examiners, with the majority of published screening studies focusing on amblyopia screening. One study was identified that met the full inclusion criteria for this review, in which all children screened with a photoscreener were offered a gold standard examination (Arthur 2009). There was an unclear risk of bias. The results indicated high specificity but low sensitivity, implying the potential for a substantial number of false negatives. Absolute numbers found to have strabismus by gold standard examination were small, at 13 in total out of 271 children.

Strengths and weaknesses of the review
Only one study was analysed in this review, prohibiting any conclusion on the accuracy of screening tests. It remains unclear whether other screening modalities would have significant accuracy for screening in this context.

Applicability of findings to the review question
The findings have limited applicability to the review question, which was to assess and compare the accuracy of multiple tests in screening for strabismus. The single study included suggests that the plusoptiX S04 photoscreener for detecting ocular misalignment could provide a specific but not sensitive test, but this single study is not sufficient for a robust conclusion (95% CI for sensitivity about 20% to 75%). Further studies are needed. Due to the lack of relevant studies, the secondary objectives to investigate sources of heterogeneity of diagnostic accuracy could not be assessed. Through review of the literature, other studies were identified with relevant results that did not meet the inclusion criteria for this review and as such could not be included.
These included the Vision in Preschoolers (VIP) study, a large multicentre trial performed in two phases to ascertain the accuracy of various screening tools for children aged 3 to 4 years (VIP 2007). In phase I, trained eye care professionals performed the screening assessment, but in phase II, trained nurses and lay screeners performed the screening tests. The population screened was enriched from a preceding generalised screening programme, with all those who failed screening included in the VIP study as well as a proportion of those who did not. The aim was to enrich for ocular pathology within the study to better ascertain accuracy of screening methods. As such, this study could not be included in this review but still has relevant conclusions. VIP 2007 specifically assessed methods for screening for strabismus; for the lay and nurse screeners it included four tests: Retinomax autorefraction, SureSight Vision Screener autorefractor, LEA symbols visual acuity testing and Stereo Smile Test II stereoacuity testing. Of 4040 children screened, 157 (3.9%) were found to have strabismus. For lay screeners, two combinations (the Stereo Smile test with the SureSight autorefractor, and the Stereo Smile test with the Retinomax autorefractor) were associated with a statistically significant increase in sensitivity of strabismus detection at 90% specificity, but no such increase was observed for other test combinations or for the nurse screeners. The study concluded that the addition of tests for eye alignment to acuity or refraction tests alone would depend on a screening programme's goals and resources. It also indicates that tests of visual acuity alone would be insufficient for identifying all cases of strabismus.

A large prospective, consecutively enrolled study of all 3 to 6 year olds in an eastern province of Taiwan used two different index tests for screening, with all children offered a gold standard examination by a single ophthalmologist (Tung 2006).
Screening for strabismus was performed in 2003 on 2868 children by trained kindergarten teachers, using both a National Taiwan University (NTU) random dot stereogram to detect stereopsis less than 300 seconds of arc and Hirschberg corneal reflexes at 1 metre, with any displacement of the light reflexes considered abnormal. The number screened and then unavailable for the gold standard assessment was not disclosed. Detailed outcome numbers were not provided and as such this study could not be included in this review. However, the overall sensitivity and specificity were 38.9% and 90.4% respectively for the NTU random dot stereogram, and 75% and 98.9% respectively for the Hirschberg light reflex, suggesting good efficacy for the Hirschberg light reflexes as a screening modality.

In summary, the applicability of available studies to primary screening programmes is limited. Future screening studies should also consider the optimum screening age, which for optotype‐based tests is around 4 to 5 years (Solebo 2015). Lastly, we would recommend further research into long‐term visual and psychosocial outcomes of childhood strabismus, to explore the benefits of early detection.

Authors' conclusions
Implications for practice
Identifying strabismus as part of a screening programme is most important if it impacts on visual acuity (leading to amblyopia) or stereopsis. Therefore screening in the community does not need to directly test for strabismus by ocular misalignment, although there is the suggestion from other studies that sensitivity is increased by doing so. There is a lack of evidence for which tests are most accurate in detecting strabismus specifically in a normal population being screened by non‐expert screeners.

Implications for research
Cochrane Reviews of the accuracy of screening tests to detect anisometropia and amblyopia would complement the evidence review on screening strategies.
Given the prevalence of amblyopia and amblyogenic risk factors, primary vision and strabismus screening studies would require large numbers of children to be screened. Such studies may be cost‐effective if run alongside existing vision screening programmes. As visual acuity alone may not be sufficiently sensitive to detect strabismus, addition of autorefractor, stereoacuity, corneal light reflex testing or novel devices should be considered (VIP 2007). Although sensitivity of screening tests should be around 80%, specificity may not need to be as high, as the further assessment for amblyopia is non‐invasive and does not carry a risk of harm.

Acknowledgements
Cochrane Eyes and Vision (CEV) created and executed the electronic search strategies. We wish to thank Mrs Angela Coleman, Head Orthoptist at Moorfields Eye Centre at Bedford Hospital, for the critical review of the protocol; and Tess Garretty, Helen Griffiths and Fiona Rowe for external peer review comments on the protocol and review. We thank Anupa Shah, Managing Editor for CEV, for her assistance throughout the review process.

Appendices
Appendix 1. Prevalence of strabismus
Appendix 2. Index tests
Corneal reflection test (Hirschberg) (Hirschberg 1881)
In the literature there is some confusion around the terms visual axis, pupillary axis, optical axis, line of sight, angle kappa and angle lambda. The visual axis is the line connecting the fovea and the nodal point of the eye and continuing anteriorly through the cornea. The pupillary axis is the line perpendicular to the cornea that intersects the centre of the pupil. It is a clinical approximation of the visual axis. The optical or anatomical axis of the eye connects the centre of the curvature of the cornea and the centre of the curvature of the posterior pole. The line of sight is the line that connects the fixation point and the centre of the pupil; it is a clinical approximation of the optical axis.

The angle between visual and optical axis is called angle kappa. Landolt originally defined angle kappa as "the angle between the visual axis and the so‐called central pupillary line (the pupillary axis)" (Emsley 1948). Lancaster then defined angle lambda as the angle between the pupillary axis and the line of sight. LeGrand finally re‐defined angle kappa exactly the way Lancaster had defined angle lambda, stating that the nodal point of the eye is a theoretical concept, and that for all practical purposes the visual axis is identical to the line of sight (LeGrand 1980). In addition, angle lambda and angle kappa are nearly identical when the point of fixation is not very close to the eye.

Angle kappa is the angle between the visual and optical axis, or between the pupillary axis and the line of sight (LeGrand 1980). By convention it is normally positive. Individuals with exotropia have higher angle kappa values than esotropic and orthotropic individuals (Basmak 2007). A large angle kappa may also give rise to pseudo‐exotropia.
When the fovea is situated nasal to the optical axis, such as in high myopia or ectopic fovea, for example after retinopathy of prematurity, the angle kappa is negative, the corneal reflection (CR) is located temporal to the centre of the cornea, and pseudo‐esotropia may be present.

In ocular misalignment, the CR is displaced: nasally in exotropia, temporally in esotropia (Hirschberg 1881). When the CR is located at the border of the pupil, the deviation is approximately 15 prism dioptres (PD). If it lies midway across the iris, the deviation measures around 30 PD, and when the reflection is near the limbus, around 45 PD. Hirschberg's original observations indicated a ratio of 12 to 14 PD per millimetre displacement of the CR from the pupillary axis, the so‐called Hirschberg ratio. Later evaluations of the Hirschberg test using photography to standardise measurements indicated a ratio of 19.5/1 (Wick 1980), 21/1 (Brodie 1987; DeRespinis 1989), 22/1 (Eskridge 1988) or 24/1 (Carter 1978), with little change from birth to adulthood (Hasebe 1998; Riddell 1994; Wick 1980). Photographs acquired whilst fixating with first the preferred eye, then the deviating eye, in primary position and in slightly eccentric fixation may allow a highly accurate measurement of the ocular misalignment (Romano 2006). Based mainly on reasons of photographic technique, some consider the limbus a more accurate landmark than the centre of the pupil (Barry 1997; Romano 2006); however, displacement from the centre of the pupil remains the more commonly used value.

Use of the Hirschberg test as a screening test for ocular alignment has been recommended in young preverbal and also in pre‐school children (daSilva 1991; Sansonetti 2004). To allow standardisation, videographic techniques have been proposed (Miller 1993). These can be applied when using video refractors or photoscreeners developed for the automated assessment of refractive errors (Griffin 1989; Hasebe 1995; Moghaddam 2012; Schaeffel 2002; Weinand 1998).
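The Hirschberg ratio converts a measured displacement of the corneal reflection into an estimated angle of deviation. The Python sketch below is a minimal illustration of that conversion using the ratios cited above; the function name, the default ratio of 21 PD/mm and the dictionary labels are choices made for this example only, and the linear relationship is assumed to hold across the range of displacements considered.

```python
# Published Hirschberg ratios in prism dioptres (PD) per millimetre,
# as cited in the text above.
HIRSCHBERG_RATIOS_PD_PER_MM = {
    "Hirschberg 1881 (lower)": 12.0,
    "Hirschberg 1881 (upper)": 14.0,
    "Wick 1980": 19.5,
    "Brodie 1987; DeRespinis 1989": 21.0,
    "Eskridge 1988": 22.0,
    "Carter 1978": 24.0,
}


def deviation_pd(displacement_mm: float, ratio_pd_per_mm: float = 21.0) -> float:
    """Estimate ocular deviation (PD) from corneal reflection displacement (mm).

    Assumes a simple linear relationship: deviation = displacement x ratio.
    """
    if displacement_mm < 0 or ratio_pd_per_mm <= 0:
        raise ValueError("displacement must be >= 0 and ratio must be > 0")
    return displacement_mm * ratio_pd_per_mm


# The same 1 mm displacement gives different estimates depending on the ratio.
for source, ratio in HIRSCHBERG_RATIOS_PD_PER_MM.items():
    print(f"{source}: {deviation_pd(1.0, ratio):.1f} PD")
```

For a displacement of 1 mm the estimate ranges from 12 PD to 24 PD depending on the ratio chosen, which illustrates why the choice of ratio matters when small deviations are to be detected.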
Automated assessment of CR on digital photographs and videographs is currently in development (Almeida 2012; Model 2012; Yang, 2012). Accuracy of the Hirschberg test may be in the range of ± 9 to 10 PD, which would make it unsuitable to detect or exclude microtropia. In orthoptic practice, Hirschberg and Krimsky tests are reserved for very young, preverbal patients or those with profound visual impairment which prevents fixation with the affected eye(s). The Hirschberg test is useful to demonstrate pseudo‐strabismus in young children with a broad nasal bridge and epicanthal folds or in individuals with wide interpupillary distance.

Coaxial fundus reflex test (Brückner)

Stereovision tests
Tests of stereovision may indicate ocular misalignment. As other causes, such as uncorrected refractive errors and reduced visual acuity, also impair stereopsis, specificity for any particular cause may be poor. Stereopsis is most commonly tested at near. Near stereotests fall into two categories: contour tests (Titmus fly, Wirt ring test) and random dot tests. Contour tests achieve horizontal image disparity by vectographic techniques and require polarised glasses to view a three‐dimensional (3D) picture embedded in polarised filter sheets made from plastic. By stacking two of these sheets at a perpendicular angle, a separate image is shown to each eye. When viewed without the glasses, the picture can still be seen, but its 3D qualities can only be perceived through the polarising glasses. As contour tests give some monocular cues to the position of the 3D shapes, many clinicians prefer random dot tests for testing stereovision. Random dot images do not contain any contour lines. Shapes can only be seen and depth can only be perceived when true binocular stereopsis with central (foveal) fusion is present. Random dot tests include the Frisby, Lang, TNO, Randot and Randot‐E stereotests (Broadbent 1990; Lang 1983; Rosner 1984; Simons 1981b).
The Frisby test consists of three Perspex plates of different thickness. On each plate there are four square areas which contain triangular shapes apparently distributed in a random pattern. On one of the squares, some shapes arranged in a geometric pattern, such as a circle, are printed onto the back surface of the Perspex plate, whilst the remaining square is filled with triangles printed onto the front surface. The physical thickness of the Perspex plate and the distance between the shapes printed onto the front and the back of the plate induce horizontal image disparity. This test is simple to perform, does not require 3D viewers and is popular with children. Preverbal children may point to the 3D shape or may direct their eyes towards it, similar to their response in preferential looking tests.

The Lang stereotest combines two methods of three‐dimensional image perception: random dots and cylindrical gratings. The cylindrical gratings use a prismatic effect to achieve the slight horizontal image displacement required for a 3D effect. The advantage of this method is that it does not require special glasses for viewing. Essentially, the separation of the two images is achieved by a system of fine parallel cylindrical shapes. Beneath each cylinder are two fine strips of picture, one seen by the right eye, the other by the left eye. In the Lang test, random dot images hide simple shapes such as a star, a cat or a car. As it does not require 3D viewers, it is easy to use with children and is commonly used in vision screening programmes. Like the Frisby test, it can be used in preverbal children by observing their behavioural response.

The Randot and Random dot E stereotests require polarising glasses for viewing. Images of animals and geometric shapes are horizontally displaced using vectographic techniques. These tests allow fine grading of stereopsis, but not all children will like wearing the polarising glasses.
The TNO test creates a 3D effect by using red‒green anaglyphs. Red‒green anaglyphs are based on two images showing the same scene from a slightly different angle. One image is processed through a red, and the other image through a green or blue or mixed (cyan) filter. The resulting images are superimposed, but slightly offset. When viewing these pictures through glasses with one red and one green lens, a stereoscopic effect results.

Near stereotests are not sufficiently accurate to be used as standalone vision screening tests (Donahue 2013; Huynh 2005; Ohlsson 2002; Schmidt 2003; VIP 2005; VIP 2007). Testability is affected by age (Pai 2012; Schmidt 2003).

Distance stereopsis can be measured with the distance Frisby stereotest, a cabinet which houses Perspex plates which present random images at slightly different distances from the observer (Adams 2005; Holmes 2005; Kaye 2005), or with a distance Randot test (Fu 2006). Despite reports of high sensitivity to detect vision defects (Rutstein 2000), distance stereopsis has not been evaluated in vision screening programmes. Distance stereoacuity can be reduced in convergence excess esotropia and intermittent distance exotropia (Hatt 2008).

Visual acuity tests
In older children and adults, visual acuity (VA) is assessed by reading a chart of characters, or optotypes, at a defined distance. In very young children, assessment of visual acuity relies on observation of behavioural responses to visual targets. Preferential looking cards showing patterns of high‐contrast black and white stripes are used in children under the age of 2 years to determine “grating acuity” (Dobson 1978). In strabismic amblyopia, grating acuity is reduced to a lesser degree than linear letter acuity and results may overestimate the level of vision. From the age of 2 years, single symbols such as Kay pictures can be used (Kay 1983). From the age of 3 years, crowded linear optotypes such as HOTV, or crowded Kay or Lea pictures can be used. These tests are often used in childhood vision screening programmes (Schmidt 2004; Hered 1997; Anonymous 2004). Crowded optotypes (several characters next to each other) viewed one eye at a time (monocularly), such as on HOTV or logMAR charts, are considered the 'gold standard' for visual acuity testing (Schmidt 2004). The use of crowded optotypes instead of single optotypes is particularly important in amblyopia screening, as single optotype testing can overestimate visual acuity. All visual acuity tests can be performed either by the child calling out the name of the picture or letter, or by the child matching the target optotype with a chart held by a parent/guardian. More detailed letter‒optotype tests include the Keeler logMAR and the Sonksen Silver logMAR tests. In order to both increase portability of charts and reduce variation of illumination levels, computer‐based testing applications are available and used in some screening settings (Thomson 1999). Visual acuity tests can be administered by any suitably trained person.
In the UK, screening programmes are delivered by qualified orthoptists, health care technicians or school nurses trained by orthoptists (Hall 2003; UK National Screening Committee). In the USA, paediatric vision screening is usually performed by suitably trained nurses or lay screeners.

Autorefractors/Photorefractors
Photorefraction analyses the reflection of light emitted from a small flashlight placed close to the camera lens. Three types of photorefraction have been developed: orthogonal, isotropic and eccentric (also called photoretinoscopy). Refractive errors result in certain patterns of photographic appearances, which vary with the degree to which the eye is defocused with respect to the plane of the camera (Howland 1974; Howland 2009). Photorefractors are mainly used to obtain refractive values. Some devices combine a photographic Brückner test and eccentric photorefraction to detect amblyogenic risk factors (Cibis 1994; VanEenwyk 2008). Several current photorefractors also detect strabismus as asymmetry of corneal light reflections (Arnold 2013; Dahlmann‐Noor 2009a; Dahlmann‐Noor 2009b; Moghaddam 2012; Silbert 2013).

The following table summarises possible test outcomes, pass/fail thresholds and examples of published screening studies that have used these tests. The variation of tests used in different studies for each group of index tests means that many specific tests have only been used in one or a small number of studies.
Other tests used in the diagnosis of strabismus, but not in primary care or community screening settings delivered by lay screeners or primary care professionals
Krimsky test
Prism reflection test
Controlled binocular acuity (CBA) test of strabismus
Suppression tests
Fusion tests
Blur test

Appendix 3. The Cochrane Library search strategy
#1 MeSH descriptor: [Vision Tests] explode all trees

Appendix 4. MEDLINE (Ovid) search strategy
1. exp vision tests/

Appendix 5. Embase (Ovid) search strategy
1. exp vision test/

Appendix 6. CINAHL (EBSCO) search strategy
S34 S28 and S33

Appendix 7. Web of Science Conference Proceedings Citation Index – Science (CPCI‐S) search strategy
#20 #15 AND #18 AND #19

Appendix 8. BIOSIS Previews search strategy
#20 #15 AND #18 AND #19

Appendix 9. MEDION search strategy
Database was searched on ICPC code field, using code "f" for ophthalmology.

Appendix 10. ARIF search strategy
strabismus OR amblyopia

Appendix 11. ISRCTN search strategy
(strabismus OR amblyopia) AND (test OR screen OR diagnosis OR assess)

Appendix 12. ClinicalTrials.gov search strategy
(strabismus OR amblyopia) AND (test OR screen OR diagnosis OR assess)

Appendix 13. ICTRP search strategy
strabismus OR amblyopia = Condition AND test OR screen OR diagnosis OR assess = Intervention

Appendix 14. Glossary
Accommodation: mechanism by which an eye focuses on a near object; accommodation involves contraction of the ciliary muscle, which relaxes the fibres holding the lens inside the eye; the lens then assumes a more rounded shape.

Data
Presented below are all the data for all of the tests entered into the review.

Characteristics of studies
Characteristics of included studies [ordered by study ID]
Arthur 2009
Characteristics of excluded studies [ordered by study ID]
Contributions of authors
All authors helped to draft the protocol. Sarah Hull and Vijay Tailor reviewed the abstracts provided by the search as well as full articles which potentially met our inclusion criteria. They extracted relevant data, wrote the Results section and updated this article from protocol to full review. Annegret Dahlmann‐Noor provided a third opinion on study inclusion where required. Gianni Virgili reviewed the analysis and provided critical review of findings. Jugnoo Rahi and Christine Schmucker critically reviewed the final version of this article.

Sources of support
Internal sources
External sources
Declarations of interest
The authors have no interests to declare.

References
References to studies included in this review
Arthur 2009 {published data only}
References to studies excluded from this review
Enzenauer 2000 {published and unpublished data}
Robinson 1999 {published and unpublished data}
Shallo‐Hoffman 2004 {published data only}
Tung 2006 {published and unpublished data}
VIP 2007 {published data only}
Additional references
Adams 2005
Almeida 2012
American Academy of Ophthalmology 2012
Amitava 2012
Anonymous 2004
Arnold 2000
Arnold 2013
Barry 1997
Basmak 2007
Bernfeld 1982
Braddick 1980
Broadbent 1990
Brodie 1987
Brückner 1965
Canadian Paediatric Society 2009
Carrera 1993
Carter 1978
Chai 2009
Choi 1998
Cibis 1994
Ciuffreda 1991
Dahlmann‐Noor 2009a
Dahlmann‐Noor 2009b
daSilva 1991
Deeks 2005
DeRespinis 1989
Dobson 1978
Donahue 2008
Donahue 2013
Emsley 1948
Eskridge 1988
Fawcett 2005
Fogt 2000
Fox 1980
Friedman 2009
Fu 2006
Gamble 1950
Graf 2012
Graham 1974
Grant 2007
Griffin 1986
Griffin 1989
Hall 2003
Hasebe 1995
Hasebe 1998
Hatt 2008
Hered 1997
Hirschberg 1881
Holmes 2005
Howland 1974
Howland 2009
Hrisos 2006
Huynh 2005
Jose 2009
Kaakinen 1979
Kay 1983
Kaye 2005
Khandekar 2009
Koklanis 2006
Kothari 2007
Krimsky 1951
Lan 2012
Lang 1983
LeGrand 1980
Matsuo 2007a
Matsuo 2007b
McCormick 2002
McKean 1976
Miller 1993
Miller 1995
Model 2012
Moghaddam 2012
Nuzzi 1986
O'Connor 2010
Ohlsson 2002
Pai 2012
Paysse 2001
Petrig 1981
Pott 1998
Pott 2003
Prakash 1996
Preslan 1996
Rahi 2002
Review Manager 2014 [Computer program]
Riddell 1994
Rogers 1982
Romano 1971
Romano 2006
Rosner 1984
Rutstein 2000
Sansonetti 2004
Schaeffel 2002
Schmidt 2003
Schmidt 2004
Schmucker 2009
Scott 1973
Silbert 2013
Simons 1981a
Simons 1981b
Simons 1996a
Simons 1996b
Smith 1985
Solebo 2015
Takai 2005
Thomson 1999
Tongue 1981
Tongue 1987
Traboulsi 2008
Turacli 1995
UK National Screening Committee
VanEenwyk 2008
VIP 2005
Walraven 1993
Walsh 2000
Webber 2008
Wedner 2000
Weinand 1998
Wick 1980
Williams 2001
World Bank
World Health Organization 2014
Yang, 2012
References to other published versions of this review
Tailor 2014