Translation of the BSS-R: Psychometric approaches to validation

The psychometric properties of a newly translated version of the BSS-R can be evaluated in a number of ways. The approaches described below assume that the translation process itself was robust and followed established guidelines, for example forward-backward translation.

Sample Size

A primary goal of psychometric validation following translation is to ensure equivalence to the original UK English-language version of the BSS-R (Hollins Martin & Martin, 2014). Some of the statistical procedures involved, particularly those that evaluate the measurement model of the BSS-R, require a minimum sample size. A balance also has to be struck between psychometric rigour, any publishable outputs envisaged from the translation, and adoption within a clinical context. Evaluating the measurement model generally requires the largest sample of any of the procedures, so its minimum represents a realistic minimum for a full psychometric evaluation involving a number of tests of validity and reliability.

Evaluation of the measurement model

The BSS-R is underpinned conceptually by a tri-dimensional measurement model comprising (i) Stress experienced during childbirth, (ii) Women’s attributes and (iii) Quality of care. These domains comprise 4, 2 and 4 items respectively and represent the sub-scales of the BSS-R. The measurement model assumes that the domains represented by the sub-scale items are correlated, an observation found consistently in validation studies of the BSS-R. Evaluation of the tri-dimensional measurement model is usually undertaken using confirmatory factor analysis (CFA), with the findings considered against threshold levels on established measures of ‘model fit’. A translation of the BSS-R that has followed due process will invariably offer a good fit to the data when the tri-dimensional measurement model is tested using CFA with a sufficient sample. We recommend a minimum sample size of N=200 for undertaking the CFA.

We have undertaken a simulation study which suggests that the minimum sample size could be a little lower (N=185). However, we would still recommend a minimum of N=200, as this represents a realistic threshold and would also be considered acceptable by most journals if a validation paper were submitted for publication.
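As a rough illustration of what the CFA of the correlated three-factor model involves, the sketch below counts the model's degrees of freedom for the 4, 2 and 4 item structure and converts a chi-square value into one commonly reported fit measure, the RMSEA. It is not a substitute for fitting the model in SEM software. It assumes simple structure (each item loads on one factor), factor variances fixed to 1 and no correlated residuals. The chi-square value of 60.0 is purely hypothetical, and the RMSEA formula uses the (N - 1) convention, which differs slightly between software packages.

```python
import math


def cfa_degrees_of_freedom(items_per_factor):
    """Degrees of freedom for a correlated-factors CFA with simple structure.

    Assumes factor variances are fixed to 1 for identification, each item
    loads on exactly one factor, and residuals are uncorrelated.
    """
    n_items = sum(items_per_factor)
    n_factors = len(items_per_factor)
    # Unique variances and covariances among the observed items.
    observed_moments = n_items * (n_items + 1) // 2
    free_parameters = (
        n_items                               # factor loadings
        + n_items                             # residual variances
        + n_factors * (n_factors - 1) // 2    # correlations between factors
    )
    return observed_moments - free_parameters


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA using the (N - 1) convention."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Sub-scale sizes taken from the text: 4, 2 and 4 items.
df = cfa_degrees_of_freedom([4, 2, 4])
print("degrees of freedom:", df)

# Hypothetical chi-square from a fitted model with N=200 participants.
print("RMSEA:", round(rmsea(chi_square=60.0, df=df, n=200), 3))
```

In practice the chi-square and fit indices would come from the CFA output itself; the helper only shows how sample size enters the calculation.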

Internal consistency

The generally accepted measure of internal consistency is Cronbach’s alpha (Cronbach, 1951). Alpha should be 0.70 or above for the whole scale, and should be near to or exceed 0.70 for the Stress experienced during childbirth and Quality of care sub-scales. The two-item Women’s attributes sub-scale may be more appropriately evaluated using the inter-item correlation, with a minimum of 0.15 for sub-scale acceptability (Clark & Watson, 1995).
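The two internal consistency checks described above can be sketched as follows, using only the Python standard library. The response data are invented for illustration, the function names are our own, and the sketch assumes that any reverse-scored items have already been recoded.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical responses to a four-item sub-scale (one row per respondent).
four_item_subscale = [
    [4, 3, 4, 3],
    [2, 2, 1, 2],
    [3, 3, 3, 4],
    [1, 0, 1, 1],
    [4, 4, 3, 4],
    [2, 1, 2, 2],
]
alpha = cronbach_alpha(four_item_subscale)
print("alpha:", round(alpha, 2), "meets 0.70 criterion:", alpha >= 0.70)

# Hypothetical responses to the two Women's attributes items.
item_a = [3, 2, 4, 1, 4, 2]
item_b = [4, 2, 3, 1, 4, 3]
r = pearson_r(item_a, item_b)
print("inter-item r:", round(r, 2), "meets 0.15 criterion:", r >= 0.15)
```

The same `cronbach_alpha` function can be applied to all ten items for the whole-scale coefficient.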

Known-groups discriminant validity

Known-groups discriminant validity (KGDV) may be evaluated in a number of ways, with hypotheses about group differences at whole-scale or sub-scale level tested to confirm this validity domain. Examples are studies that have compared unassisted vaginal delivery with an intervention delivery (Romero-Gonzalez et al., 2019; Skvirsky, Taubman-Ben-Ari, Hollins Martin, & Martin, 2019). However, any hypothesis-driven and evidence-informed group comparison may be undertaken in the same way. Depending on the profile of the data, parametric or non-parametric tests may be used. For example, when two groups are compared and the data characteristics suit a parametric test, the independent t-test would be appropriate.
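A minimal sketch of the two-group comparison is given below. It computes the pooled-variance (Student's) independent t statistic, its degrees of freedom and Cohen's d as an effect size. The scores and group labels are invented for illustration, and the sketch assumes roughly equal variances in the two groups. The p-value would then be read from the t distribution with the stated degrees of freedom, typically by a statistics package.

```python
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, degrees of freedom, Cohen's d) for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    diff = mean(group_a) - mean(group_b)
    t = diff / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    cohens_d = diff / pooled_var ** 0.5
    return t, df, cohens_d


# Hypothetical BSS-R total scores for two delivery groups.
unassisted = [31, 28, 34, 29, 33, 30, 27, 32]
intervention = [24, 27, 22, 29, 25, 23, 26, 21]

t, df, d = independent_t(unassisted, intervention)
print("t =", round(t, 2), "df =", df, "Cohen's d =", round(d, 2))
```

Where the data do not suit a parametric test, a rank-based alternative would be used in place of this function.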

Divergent validity

Divergent validity assumes no relationship between BSS-R total or sub-scale scores and a domain that is not expected to have a relationship with BSS-R scores. The usual statistical approach is to calculate a Pearson’s correlation coefficient (or the non-parametric equivalent), with the expectation that a statistically significant correlation will not be found.
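The correlation check for divergent validity might look like the sketch below. The data are invented, the comparison measure is a placeholder for whatever unrelated domain is chosen, and the p-value uses Fisher's z transformation with a normal approximation rather than the exact t-based test that most statistics packages report, so it should be treated as approximate, particularly in small samples.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for the null hypothesis of zero correlation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical BSS-R totals and scores on a conceptually unrelated measure.
bssr_total = [31, 24, 28, 35, 22, 30, 27, 33, 26, 29]
unrelated = [12, 15, 9, 14, 11, 16, 10, 13, 17, 8]

r = pearson_r(bssr_total, unrelated)
p = fisher_z_p_value(r, len(bssr_total))
print("r =", round(r, 2), "approximate p =", round(p, 3))
print("consistent with divergent validity:", p >= 0.05)
```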

Test-retest reliability

Test-retest reliability may be evaluated by re-administering the BSS-R to the same group of participants at a second observation point and testing for a statistically significant relationship between the two observation points using a Pearson’s correlation coefficient (or the non-parametric equivalent). It is noteworthy that test-retest findings reported in research papers generally use an interval before the second observation point that is too short to evaluate test-retest reliability. We would suggest a minimum test-retest period of three months, consistent with the recommendations of Kline (2000).
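Where the data do not suit a Pearson’s correlation, the non-parametric equivalent is commonly taken to be Spearman’s rank correlation. The sketch below computes it for paired scores from two observation points by ranking each set (tied scores share the average rank) and correlating the ranks. The scores are invented for illustration and the function names are our own.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two sets of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical BSS-R totals for the same participants at two observation points.
time_1 = [31, 24, 28, 35, 22, 30, 27, 33]
time_2 = [29, 25, 28, 34, 20, 31, 24, 33]

print("Spearman's rho:", round(spearman_rho(time_1, time_2), 2))
```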

The above are some suggestions for psychometric evaluation of a translated version of the BSS-R. We would recommend further reading to consider the range of statistical approaches that may be undertaken and the characteristics of the study that should be taken into account. A good paper covering a number of these issues is that of Martin and Savage-McGlynn (2013).


Clark, L.A., Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment. 7(3), 309-319.

Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika. 16(3), 297–334.

Hollins Martin, C.J., Martin, C.R. (2014). Development and psychometric properties of the Birth Satisfaction Scale-Revised (BSS-R). Midwifery. 30(6), 610-619. Doi: 10.1016/j.midw.2013.10.006

Kline, P. (2000). A Psychometrics Primer. London: Free Association Books.

Martin, C.R., Savage-McGlynn, E. (2013). A ‘good practice’ guide for the reporting of design and analysis for psychometric evaluation. Journal of Reproductive and Infant Psychology. 31, 449-455.

Romero-Gonzalez, B., Peralta-Ramirez, M.I., Caparros-Gonzalez, R.A., Cambil-Ledesma, A., Hollins Martin, C.J., Martin, C.R. (2019). Spanish validation and factor structure of the Birth Satisfaction Scale-Revised (BSS-R). Midwifery. 70, 31-37. Doi: 10.1016/j.midw.2018.12.009

Skvirsky, V., Taubman-Ben-Ari, O., Hollins Martin, C.J., Martin, C.R. (2019). Validation of the Hebrew Birth Satisfaction Scale-Revised (BSS-R) and its relationship to perceived traumatic labour. Journal of Reproductive and Infant Psychology. 1-7. Doi: 10.1080/02646838.2019.1600666
