Reproducible Research Peer Assignment 2 Bmis

Citation: Donini LM, Poggiogalle E, Mosca V, Pinto A, Brunani A, Capodaglio P (2013) Disability Affects the 6-Minute Walking Distance in Obese Subjects (BMI>40 kg/m2). PLoS ONE 8(10): e75491.

Editor: Reury F. P. Bacurau, University of Sao Paulo, Brazil

Received: April 17, 2013; Accepted: August 14, 2013; Published: October 11, 2013

Copyright: © 2013 Donini et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: No current external funding sources for this study.

Competing interests: The authors have declared that no competing interests exist.


In obese subjects, the relative reduction of skeletal muscle strength [1], the reduced cardio-pulmonary capacity and tolerance to effort [2], [3], the higher metabolic costs and, therefore, the increased inefficiency of gait [4], together with the increased prevalence of co-morbid conditions, might interfere with walking. Pain from overloaded joints [5]–[7] is a frequent complaint during walking in obese subjects, who tend to walk slower and report dyspnea more frequently than their lean counterparts [8]. On the other hand, walking often represents the most accessible means of exercise for weight management. The ability to walk for a distance is a quick and inexpensive measure of physical function, and an important component of quality of life, since it reflects the capacity to undertake the activities of daily living [4], [5]. Performance tests, such as the six-minute walking test (6MWT), can unveil the limitations in cardio-respiratory and motor functions underlying the obesity-related disability [2], [3].

After the publication of the official 6MWT guidelines by the American Thoracic Society in 2002, several authors studied the determinants of the 6-minute walking distance (6MWD) in healthy adults. Predictive equations considering age, sex, weight and height were proposed for clinical use [9]–[13]. They aimed at representing a reference test for populations with different ethnicities and clinical conditions. These studies varied with respect to the number of individuals (with the exception of two large ones) [14], [15] but presented similar designs, and the reference equations were obtained using multiple linear regression models including demographic and anthropometric features (age, sex, stature and weight in almost all studies) [16]. Only a few studies correlated the 6MWD with the severity of obesity; moreover, although the results were shown to be highly reproducible, they were also shown to be influenced by the severity of obesity, reduced strength and aerobic capacity [17], [18].

According to the predictive equations from the literature, obese subjects consistently show a deficit in the distance walked and in the work exerted for walking when compared with normal-weight subjects [19]. Reference values obtained from healthy, normal-weight populations would therefore predictably underline the reduced performance capacity of obese individuals. Instead, reference values specific for this population would serve as a benchmark to assess baseline functional capacity, to prescribe proper and safe exercise intensity and to supervise changes after rehabilitation interventions. Recently Capodaglio et al. [18] developed a reference equation for predicting the 6MWD specifically in adult obese subjects, to be used in clinical practice. For many authors, clinical applicability of the test was the guiding criterion for avoiding the inclusion of other parameters correlated with the results of the walking test. From a mathematical point of view, the correlation with the 6MWD would certainly benefit from the inclusion of several other factors in the predictive formula. Hulens et al. [8] found that 75% of the variance in walking performance was explained by the combination of the following variables: body mass index (BMI), peak aerobic capacity, knee extension torque, age and hours of TV viewing, with BMI explaining 59% of the variance by itself. Among the predictors of the distance walked, other physiological factors (heart rate, oxygen saturation, blood pressure, muscle strength), lifestyle factors (physical activity levels) and the degree of disability may well play a role. Although their inclusion in an equation appears impractical for clinical use, we need to further investigate the determinants of the distance walked by obese individuals, as this would likely increase the predictive capacity of the equation and deepen the comprehension of the limitations of obese subjects.
Also, pre- and post-assessments after combined interventions in obese subjects revolve around the main expected outcome of weight loss. The expected functional correlate is an increase in the distance walked secondary to weight loss. However, if co-morbid disabling conditions are present, distance might not necessarily increase as expected on the basis of weight loss alone. Conversely, if weight loss is accompanied by an improved tolerance to effort after aerobic conditioning, the formula may underestimate the real performance. Hence, we hypothesized that the degree of disability of obese subjects should be part of their functional assessment. In fact, their disability was shown to affect the basic activities of daily living and to be mainly related to mobility impairment. Recently, an obesity-specific disability scale was developed [20] and was also shown to be able to measure changes after multidisciplinary rehabilitation interventions [21], [22]. Therefore, the aims of the present study were: to further explore the determinants of the 6MWD in obese subjects, and in particular whether measures of disability would affect the results; and to investigate the predictors of interruption of the walk test in obese subjects.



Caucasian adult obese patients (BMI>40 kg/m2) were recruited at the Metabolic, Nutritional and Psychological Rehabilitation Unit of the “Villa delle Querce” Clinical and Rehabilitation Institute (Nemi, Rome, Italy) from January 2009 to December 2011, among all the obese patients hospitalised in the facility during that period. Eligibility criteria for admission to an intensive rehabilitation treatment were: BMI>40 kg/m2 associated with a significant disability level (as assessed by the TSD-OC test, the SIO test for obesity-related disabilities described below, with a disability score>33% [20]) and the presence of at least one clinical comorbidity. Patients aged less than 18 years or more than 80 years were excluded from the study. In addition, bed-ridden patients and patients presenting contraindications for the 6MWT (acute cardiac diseases in the previous month, unstable angina, uncontrolled hypertension (higher than 180/100 mmHg), major orthopaedic or neurological conditions interfering with the test) were excluded [23].

The study protocol was approved by the Ethical Committee of the “Sapienza” University of Rome and oral and written informed consent was obtained from all the subjects.


The following data were measured within the first week after the admission:

  • anthropometric measures, according to the procedures described in the “Anthropometric standardisation reference manual” by Lohman et al. [24], by a trained operator. Body weight was measured to the nearest 0.1 kg using a standard column body scale SECA (Hamburg, Germany). Body height (using a rigid stadiometer – SECA, Hamburg, Germany), waist and arm circumferences (WC and AC respectively) (using a measuring tape) were determined to the nearest 0.1 cm. Triceps skinfold thickness (TSF) was measured using a Harpenden Skinfold Caliper (British Indicators Ltd, St. Albans, Herts, UK).

Then, the following indexes were calculated:

    • BMI = weight / height², in kg/m2 (weight in kg, height in m)
    • mid-upper arm muscle circumference  = AC - (π * TSF)
  • Body composition [fat mass (FM) and fat free mass (FFM)] was estimated by bioelectrical impedance analysis (BIA): whole-body impedance vector components, resistance (R) and reactance (Xc), were measured with a single-frequency 50-kHz analyzer STA-BIA (AKERN Bioresearch SRL, Pontassieve, FL, Italy). Measurements were obtained following standardized procedures [25]. The external calibration of the instrument was checked with a calibration circuit of known impedance value. Estimations of FFM and FM by BIA were obtained using sex-specific BIA prediction equations developed by Sun et al. in a large population including extremes of BMI values [26]. Fat mass index (FMI) and fat-free mass index (FFMI) were calculated as FM or FFM (in kg) divided by the square of body height (in m2).
  • The specific short-form questionnaire for obesity-related disabilities (TSD-OC test) proposed by the Italian Society of Obesity was completed by all the participants [20]. The TSD-OC test addresses adults and does not target a specific sex. It is composed of 7 sections (pain: 5 items; stiffness: 2 items; activities of daily living and indoor mobility: 7 items; housework: 7 items; outdoor activities: 5 items; occupational activities: 4 items; social life: 6 items) for a total of 36 items. Patients were asked to subjectively rate their difficulty in each item on a 0–10 visual analogue scale (10 indicating the highest level of disability and 0 no difficulty in performing the task). The total score (0 to 360) represents the disability status of the patient;
  • Fitness status was assessed by:
    • hand grip strength (HGST), measured using a Lafayette hand grip (Mod. 78011). The maximum value (kg) out of three trials using the dominant hand was recorded. Between two consecutive trials, a 1-minute recovery was provided [27];
    • Spine flexion, together with hip and shoulder flexion, extension, and abduction were measured with a standard goniometer by a skilled physiotherapist. The floor-fingertip distance (in centimeters) was considered as a measure of spinal flexibility;
  • The 6MWT was performed according to the instructions by the American Thoracic Society [23]. In particular, conditions for the execution of a safe test were respected: an easily accessible corridor for emergencies, the test interruption criteria, such as chest pain, severe dyspnea, muscle cramps, dizziness, and sudden paleness, were considered when applicable. The test was performed in an undisturbed 20-meter hospital corridor marked every 2 meters with colored tape on the floor; starting and finishing points were marked on the floor. Before the test, at 1, 3 and 5 minutes after the start and at the end of the test, pulse, respiratory rate, blood pressure and perceived fatigue on Borg's scale were measured [28]. Subjects were instructed to walk as fast as they could. They were allowed to stop or rest during the test if necessary. The 6MWD was calculated.
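The derived indexes listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the triceps skinfold is assumed to be read in millimetres and converted to centimetres before being subtracted from the arm circumference, and the TSD-OC percentage is assumed to be the total score divided by the 360-point maximum (the text reports percentages but does not spell out the conversion).

```python
import math

def bmi(weight_kg, height_m):
    """Body mass index in kg/m2: weight divided by squared height."""
    return weight_kg / height_m ** 2

def mass_index(mass_kg, height_m):
    """FMI or FFMI in kg/m2: fat mass or fat-free mass over squared height."""
    return mass_kg / height_m ** 2

def arm_muscle_circumference_cm(ac_cm, tsf_mm):
    """Mid-upper arm muscle circumference = AC - (pi * TSF).

    Assumption: TSF is recorded in mm and converted to cm here so that
    both terms share the same unit.
    """
    return ac_cm - math.pi * (tsf_mm / 10.0)

def tsd_oc_percent(item_scores):
    """TSD-OC total (36 items, each 0-10) as a percentage of the 360 maximum.

    Assumption: the percentage score is total / 360 * 100.
    """
    if len(item_scores) != 36:
        raise ValueError("the TSD-OC test has 36 items")
    if any(not 0 <= s <= 10 for s in item_scores):
        raise ValueError("each item is rated on a 0-10 scale")
    return 100.0 * sum(item_scores) / 360.0

# Hypothetical patient, values invented for illustration only.
example_bmi = bmi(130.0, 1.70)
example_fmi = mass_index(60.0, 1.70)
example_ffmi = mass_index(70.0, 1.70)
example_disability = tsd_oc_percent([5] * 36)
```

Because FM and FFMI are divided by the same squared height, FMI and FFMI add up to BMI whenever FM and FFM add up to body weight, which is a convenient sanity check on the inputs.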


First, the correlations between 6MWD and the potential independent variables (anthropometric parameters, body composition, muscle strength, flexibility and disability) were analysed. After verification of the normal distribution of the variables, t-test and the analysis of variance (ANOVA) were performed to describe differences between means of the groups, and chi-square test was used to compare observed and expected frequencies. A linear regression analysis (Pearson's r) was performed to verify the association among continuous variables.

In a second phase, the variables that were individually correlated with the response variable were considered as potential explanatory variables for a multivariate regression model. The variables with the highest correlations were used and redundant ones were excluded, to minimize the confounding effect of collinearity, in accordance with the principle of parsimony.

The multiple linear regression models obtained were expressed in the following algebraic form:

y = α + β1x1 + β2x2 + … + βnxn

where “y” represents the outcome variable (6MWD), “x” the values of the independent variables, “β” the unstandardized coefficients of the independent variables and “α” the constant intercept coefficient.

The efficacy of the regression model was analysed according to the value of the determination coefficient R2 (comparing the explained variance of the model's predictions with the total variance of the data) and the R2adjusted (considering a correction for inclusion of variables). The standard error of the estimate (SEE), representing a measure of the accuracy of predictions (standard deviation of the differences between the actual values of the dependent variables (results) and the predicted values), was calculated.
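As a rough illustration of the quantities just described, the sketch below fits a multiple linear regression by ordinary least squares with NumPy and computes R2, adjusted R2 and the SEE. It is not the authors' SPSS procedure: the data are synthetic, the predictor names (handgrip strength and disability score) are used only as labels, and the SEE is computed with the usual n − k − 1 degrees of freedom, which is an assumption about the convention used.

```python
import numpy as np

def fit_linear_model(X, y):
    """Least-squares fit of y = alpha + X @ beta, with fit statistics."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])     # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)         # unexplained sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    see = float(np.sqrt(ss_res / (n - k - 1)))    # standard error of the estimate
    return {"alpha": float(coef[0]), "beta": coef[1:], "r2": r2,
            "adj_r2": adj_r2, "see": see}

# Synthetic data, invented for illustration only (not the study data).
rng = np.random.default_rng(0)
n_subjects = 200
hgst = rng.normal(28.0, 7.0, n_subjects)          # handgrip strength, kg
disability = rng.uniform(0.0, 100.0, n_subjects)  # disability score, %
distance = 400.0 + 3.0 * hgst - 1.5 * disability + rng.normal(0.0, 70.0, n_subjects)

result = fit_linear_model(np.column_stack([hgst, disability]), distance)
print(result)
```

Adjusted R2 can never exceed R2, and the gap grows with the number of predictors relative to the sample size, which is why it is the fairer figure when comparing models with different numbers of variables.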

Finally, the correlations between nutritional and functional parameters and test interruption were investigated.

Differences were considered to be statistically significant at p<0.05. Statistical analysis was performed using SPSS 10.0 statistical software (SPSS Inc Wacker Drive, Chicago, IL, USA).


Characteristics of the study sample (Table 1)

354 subjects (87 males, mean age 48.5±14 years - range 19–74 years, 267 females, mean age 49.8±15 years - range 19–80 years) were enrolled in the study. All of the subjects had a BMI>40 kg/m2 (44.7±8 versus 43.7±8 kg/m2, respectively for males and females) with a significantly increased WC (133.3±13 versus 117.8±15 cm, respectively for males and females; p<0.05). Statistically significant differences (p<0.05) were found between males and females, in particular for the 6MWD (444.3±106 versus 418.8±80 m), handgrip strength (36.7±7 versus 25.4±6 kg) and articular mobility.

Determinants of the 6MWD

In Table 2, the correlations between the considered variables and the distance walked are described. Based on these results, a multivariate regression analysis was performed using only the independent variables significantly correlated with the outcome variable at the univariate analysis: age, weight, height, BMI, FMI, FFMI, HGST and disability (TSD-OC test score). Variables with an analogous biological meaning but a lower correlation were excluded. Sex was not part of the predictive model: distance walked by males and females did not significantly differ in our sample (Table 1). Data from the resulting models and indicators of their precision in describing the 6MWT results are reported in Table 3. The R2 of the regression analysis ranged from 0.21 for model 1, considering only HGST and TSD-OC (SEE: 82.0 m), to 0.47 for model 5, considering also age, FMI and FFMI (SEE: 66.7 m). Slightly lower results were obtained with models using BMI or body weight and height. Model 5 showed a significant correlation with the real distance walked by patients (r = 0.644; p<0.001): the mean difference between real and predicted results was 38.7±79 m (range −42.5 m to 106.1 m).

Predictors of the 6MWT interruption (Table 4)

15 males (17.2%) and 54 females (20.2%) interrupted the test according to the described criteria (p>0.05).
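The comparison of interruption frequencies between sexes can be checked with a chi-square test on the 2×2 table built from the reported counts (15 of 87 males and 54 of 267 females). The sketch below is an assumption about how such a test may be run, not the authors' SPSS output: it uses the uncorrected Pearson statistic (the text does not say whether a continuity correction was applied) and obtains the p-value for one degree of freedom from the complementary error function.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom.
    """
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Rows: males, females. Columns: interrupted, completed.
chi2, p_value = chi_square_2x2(15, 87 - 15, 54, 267 - 54)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```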

Obese men who interrupted the test showed a higher body weight (144.2±33 versus 131.1±18 kg), BMI (49.5±7 versus 44.1±7 kg/m2) and WC (143.5±16 versus 132.6±11 cm) (p<0.05) than the rest of the sample. Disability as measured by TSD-OC test was more severe: 48.7±22 versus 27.4±26% (p<0.05). Flexibility, except for spine flexion, was significantly lower (p<0.05). Although non-significantly, among those who interrupted the test, HGST showed a tendency to be lower and FM higher.

Obese women who interrupted the test showed a higher body weight (121.3±23 versus 106.8±20 kg), BMI (47.7±9 versus 43.4±7 kg/m2), a larger WC (126.4±17 versus 117.2±14 cm) and higher FM (47.6±4 versus 44.8±4%) than obese women completing the test (p<0.05). The degree of disability was also higher (44.1±28 versus 33.5±24%; p<0.05), whereas HGST (25.4±7 vs 27.2±5 kg) and flexibility were significantly lower (p<0.05). Males and females did not differ significantly with respect to age.


The present study demonstrated the impact of the degree of disability in obese subjects on the 6MWD. The latter was correlated to the following variables: age, anthropometric data (body weight, height, BMI), body composition (FMI, FFMI), strength (HGST) and disability (TSD-OC test).

Previously several authors addressed the identification of determinants of the 6MWD by healthy adults and proposed reference equations. The large majority of them considered only body height, age and body weight [16]. Troosters et al. [10] concluded that these variables accounted for 66% of the variance in a sample of 53 healthy Caucasian adults aged 50 to 85 years, who were not previously hospitalized and did not show any chronic condition potentially hindering physical capacity [29]. Enright [30] performed the 6MWT in 290 healthy adults aged 40 to 80 years with BMI<35 kg/m2, finding a significant difference depending on height, sex and age. There is a general consensus about the fact that shorter individuals and females present a shorter step length and, consequently, shorter distances walked at the 6MWT. Likewise, in elderly sarcopenic individuals, similarly to patients with cognitive impairment or musculoskeletal disorders, reduction in the 6MWD was described [14], [30].

Muscle strength, depression, reduced perceived quality of life, medications, inflammatory disease and impaired pulmonary function are other factors that can influence the test performance [31]–[34]. In particular, in a study by Enright and Sherrill [9], a BMI>30 kg/m2 was considered an exclusion criterion, since the research addressed the healthy adult population. A paper by Hulens et al. [8] was also in line with these considerations, underlining that the test results were highly affected by the degree of obesity. Ben Saad et al. [13] showed that when BMI was included in the final reference equation, the 6MWD decreased by 5.27 meters for each one-unit increase in BMI. In a later study [30], Enright reported that the 6MWT results were affected by muscle strength in individuals with reduced mobility and aerobic capacity. Thus, our results are consistent with the extant literature: mobility and muscle strength are key factors for predicting the 6MWD in obese individuals. Body composition was considered relevant by some authors in influencing results at the 6MWT, more significantly than BMI per se [18], [30]. Although the BMI is a useful epidemiological index of obesity, it cannot be considered the best index to determine the amount of body fat. Moreover, the correlation between body composition and the 6MWD is usually more robust than the correlation between the 6MWD and BMI [14], [30], [31], [34]. In our sample, these findings were confirmed: FMI, FFMI and HGST all correlated with the 6MWD. We also aimed at ascertaining to what extent disability may affect test results. In a previous study Enright [30] concluded that disability in activities of daily living and occupational activities is an important factor. Disability may also impair the test performance at the emotional and psychological level, as it may induce depression, which ultimately impacts on the 6MWT results, according to several authors [14], [35]–[37].
In fact, the American Thoracic Society guidelines published in 2002 [23] also recommended the use of standardized encouragement to avoid bias in the results, on the basis that improving the emotional state may enhance 6MWD results by 30%. Although significantly correlated with the distance walked, the proposed multivariate models explained less than half of the variance of the phenomenon. The other models in the literature show R2 ranging from 0.20 [14] to 0.78 [37]. The population considered in our study may in part explain the relatively low reliability of the proposed model, despite the inclusion of variables all individually correlated with the outcome variable. In fact, our population consisted of subjects admitted to multidisciplinary metabolic-nutritional rehabilitation due to severe obesity-related comorbidities. They were in frail functional and clinical conditions. Other variables more focused on the clinical aspects may perhaps increase the validity of the model. Other authors [18], [35]–[38] commented that some features linked to specific comorbidities may affect test results; our data about the subjects who were not able to complete the 6MWT seem to be consistent with this. In fact, obese subjects who failed to complete the test showed greater functional impairment and disability, reduced muscle strength and higher fat mass compared to their counterparts who finished the test. Therefore, the 6MWT appears more as a global performance test than a mere measure of motor capacity. It remains true that the inclusion of those variables hinders the daily use of the predictive equation in non-specialist facilities. However, those variables should be considered in the baseline assessment of obese patients to optimize the rehabilitation programs and increase their effectiveness.
The variables adopted in our model define a more complex equation than those already available in the literature. However, the main goal of our study was not to provide an evaluation tool for everyday practice, but rather to highlight the differences in the 6MWT results due to obesity-related disability and to define the elements that may account for such different performances, whether causes or consequences of disability.

The present study has certain limitations that need to be taken into account. Despite having acknowledged all the indications suggested by the American Thoracic Society, the length of the walkway we used in this study was shorter than that used by Enright (20 versus 30 m) [14]. This difference might have biased the results, although it appears very unlikely, as already commented by other authors [35], that this particular circumstance might have caused such a marked difference in the results.

In our study a greater number of females was enrolled. In the literature, as in our study, males normally walk a longer 6MWD. Although the distribution of FM, that is different between males and females, may play a role in influencing this result, evidence suggests that the impact of sex on joint mobility does not appear relevant. Accordingly, in our sample, the correlation between disability and 6MWD does not change as a function of sex.

Some parameters that were shown to be correlated with the performance during the 6MWT (such as customary physical activity, smoking habits, socioeconomic status, depression, lower cognition) [16] were not considered in our study. Although important, these aspects were beyond our goals.

Finally, we did not consider in our study the relationship between the 6MWD and parity, an interesting factor in developing nations (4.3 in North Africa and 1.6 in Europe and North America). It seems that parity accelerates the decline of the 6MWD [13]. Although only Caucasian subjects were enrolled in our sample, there is a large number of immigrants in Italy, so this association should be evaluated in future studies.

In conclusion, the 6MWD by obese subjects is not only influenced by age, sex and height, as reported in the majority of reference equations in the extant literature. Disability should be a pivotal variable of the predictive model of the distance walked by obese subjects at the 6MWT.

Author Contributions

Conceived and designed the experiments: LMD. Performed the experiments: VM EP. Analyzed the data: AP AB LMD. Wrote the paper: LMD PC EP.


  1. Capodaglio P, Vismara L, Menegoni F, Baccalaro G, Galli M, et al. (2009) Strength characterization of knee flexor and extensor muscles in Prader-Willi and obese patients. BMC Musculoskelet Disord 10: 47.
  2. Salvadori A, Fanari P, Mazza P, Agosti R, Longhini E (1992) Work capacity and cardiopulmonary adaptation of the obese subject during exercise testing. Chest 101: 674–679.
  3. Salvadori A, Fanari P, Fontana M, Buontempi L, Saezza A, et al. (1999) Oxygen uptake and cardiac performance in obese and normal subjects during exercise. Respiration 66: 25–33.
  4. Malatesta D, Vismara L, Menegoni F, Galli M, Romei M, et al. (2009) Mechanical external work and recovery at preferred walking speed in obese subjects. Med Sci Sports Exerc 41: 426–434.
  5. Wearing SC, Hennig EM, Byrne NM, Steele JR, Hills AP (2006) Musculoskeletal disorders associated with obesity: a biomechanical perspective. Obes Rev 7: 239–250.
  6. Wearing SC, Hennig EM, Byrne NM, Steele JR, Hills AP (2006) The biomechanics of restricted movement in adult obesity. Obes Rev 7: 13–24.
  7. Capodaglio P, Castelnuovo G, Brunani A, Vismara L, Villa V, et al. (2010) Functional limitations and occupational issues in obesity: a review. Int J Occup Saf Ergon 16: 507–523.
  8. Hulens M, Vansant G, Claessens AL, Lysens R, Muls E (2003) Predictors of 6-minute walk test results in lean, obese and morbidly obese women. Scand J Med Sci Sports 13: 98–105.
  9. Enright PL, Sherrill DL (1998) Reference equations for the six-minute walk in healthy adults. Am J Respir Crit Care Med 158: 1384–1387.
  10. Troosters T, Gosselink R, Decramer M (1999) Six minute walking distance in healthy elderly subjects. Eur Respir J 14: 270–274.
  11. Chetta A, Zanini A, Pisi G, Aiello M, Tzani P, et al. (2006) Reference values for the 6-min walk test in healthy subjects 20-50 years old. Respir Med 100: 1573–1578.
  12. Gibbons WJ, Fruchter N, Sloan S, Levy RD (2001) Reference values for a multiple repetition 6-minute walk test in healthy adults older than 20 years. J Cardiopulm Rehabil 21: 87–93.
  13. Ben Saad H, Prefaut C, Tabka Z, Mtir AH, Chemit M, et al. (2009) 6-minute walk distance in healthy North Africans older than 40 years: influence of parity. Respir Med 103: 74–84.
  14. Enright PL, McBurnie MA, Bittner V, Tracy RP, McNamara R, et al. (2003) The 6-min walk test: a quick measure of functional status in elderly adults. Chest 123: 387–98.
  15. Li AM, Yin J, Au JT, So HK, Tsang T, et al. (2007) Standard reference for the six-minute-walk test in healthy children aged 7 to 16 years. Am J Respir Crit Care Med 176: 174–80.
  16. Dourado VZ (2011) Reference Equations for the six-minute walk test in healthy individuals. Arq Bras Cardiol 96: e128–e138.
  17. Beriault K, Carpentier AC, Gagnon C, Ménard J, Baillargeon JP, et al. (2009) Reproducibility of the 6-minute walk test in obese adults. Int J Sports Med 10: 725–7.
  18. Capodaglio P, De Souza SA, Parisio C, Precilios H, Vismara L, et al. (2013) Reference values for the six-minute walk test in obese subjects. Disabil Rehabil 35: 1199–203.
  19. Larsson UE, Reynisdottir S (2008) The six-minute walk test in outpatients with obesity: reproducibility and known group validity. Physiother Res Int 13: 84–93.
  20. Donini LM, Brunani A, Sirtori A, Savina C, Tempera S, et al. (2011) SIO-SISDCA task force: assessing disability in morbidly obese individuals: the Italian Society of Obesity test for obesity-related disabilities. Disabil Rehabil 33: 2509–2518.
  21. Precilios H, Brunani A, Cimolin V, Tacchini E, Donini LM, et al. (2013) Measuring changes after multidisciplinary rehabilitation of obese individuals. J Endocrinol Invest 36: 72–7.
  22. Capodaglio P, Cimolin V, Tacchini E, Precilios H, Brunani A (2013) Effectiveness of in-patient rehabilitation in obesity-related orthopedic conditions. J Endocrinol Invest Mar 19.
  23. American Thoracic Society Statement: guidelines for the six-minute walk test (2002) Am J Respir Crit Care Med 166: 111–117.
  24. Lohman TG, Roche AF, Martorell editors (1988) Anthropometric standardization reference manual. Human Kinetics Book, Champaign (IL – USA) 183.
  25. Kushner RF (1992) Bioelectrical impedance analysis: a review of principles and applications. J Am Coll Nutr 11: 199–209.
  26. Sun SS, Chumlea WC, Heymsfield SB, Lukashi HC, Schoeller D, et al. (2003) Development of bioelectrical impedance analysis prediction equations for body composition with the use of a multicomponent model for use in epidemiological surveys. Am J Clin Nutr 77: 331–340.
  27. Andrews AW, Thomas MW, Bohannon RW (1996) Normative values for isometric muscle force measurements obtained with hand-held dynamometers. Phys Ther 76: 248–259.
  28. Borg G (1990) Psychophysical scaling with applications in physical work and the perception of exertion. Scand J Work Environ Health 16 (Suppl 1): 55–8.
  29. Alameri H, Al-Majed S, Al-Howaikan A (2009) Six-minute walk test in a healthy adult Arab population. Respir Med 103: 1041–1046.
  30. Enright PL (2003) The six minute walk test. Respir Care 48: 783–5.
  31. Gosselink R, Troosters T, Decramer M (1996) Peripheral muscle weakness contributes to exercise limitation in COPD. Am J Respir Crit Care Med 153: 976–80.

BMI data and models

Population for empirical evaluation. The Swiss Health Survey (SHS) is a population-based cross-sectional survey. Since 1992, it has been conducted every five years by the Swiss Federal Statistical Office [17]. For this study, we restricted the sample from the 2012 survey to 16,427 individuals aged between 18 and 74 years. Height and weight were self-reported by telephone interview. Records with extreme values of height or weight were excluded (highest and lowest percentile by sex). Smoking status was categorized into never smoked, former smokers, light smokers (1 – 9 cigarettes per day), moderate smokers (10 – 19), and heavy smokers (> 19). Individuals who never smoked stated that they did not currently smoke and never regularly smoked for longer than a six-month period; former smokers had quit smoking but had smoked for more than 6 months during their life. One cigarillo or pipe was counted as two cigarettes, and one cigar was counted as four cigarettes. The following adjustment variables were included: fruit and vegetable consumption, physical activity, and alcohol intake. Information on the number of days per week fruits and vegetables were consumed was available. We chose to categorize as close to the “5-a-day” recommendation as possible [18]. Fruit and vegetable consumption was combined in one binary variable that comprised the information on whether both fruits and vegetables were consumed daily or not. The variable describing physical activity was defined as the number of days per week a subject started to sweat during leisure time physical activity and was categorized as > 2 days, 1 – 2 days, or none. Alcohol intake was included using the continuous variable grams per day. Education was included as highest degree obtained and was categorized as mandatory (International Standard Classification of Education, ISCED 1-2), secondary II (ISCED 3-4), or tertiary (ISCED 5-8) [19]. Nationality had two categories: Swiss and foreign. Language region reflecting cultural differences within Switzerland was categorized as German/Romansh, French, or Italian.
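The smoking categorization described above can be sketched as a small Python helper. The function and argument names are hypothetical, and one detail is an assumption not covered by the text: a current smoker reporting fewer than one cigarette equivalent per day is placed in the light category.

```python
def cigarette_equivalents(cigarettes=0, cigarillos=0, pipes=0, cigars=0):
    """Daily cigarette equivalents: cigarillo or pipe = 2, cigar = 4."""
    return cigarettes + 2 * (cigarillos + pipes) + 4 * cigars

def smoking_category(currently_smokes, smoked_over_six_months, equivalents_per_day=0):
    """Map survey answers to the five smoking categories.

    smoked_over_six_months: whether the person ever smoked regularly for
    longer than six months (only relevant for non-current smokers).
    """
    if not currently_smokes:
        return "former" if smoked_over_six_months else "never"
    if equivalents_per_day > 19:
        return "heavy"
    if equivalents_per_day >= 10:
        return "moderate"
    # Assumption: current smokers below 1 equivalent per day count as light.
    return "light"

# Hypothetical respondent: 3 cigarettes, 1 pipe and 1 cigar per day.
daily = cigarette_equivalents(cigarettes=3, pipes=1, cigars=1)
category = smoking_category(True, True, daily)
```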

Models for BMI distributions. Binary logistic regression, ordered, and unordered polytomous logistic regression [20] were previously applied to the analysis of BMI distributions based on ad hoc categorized BMI values. We will review the corresponding parameterizations and compare the model parameters in the common framework of model (1) before introducing the novel continuous outcome logistic regression for the analysis of BMI distributions.

  • Binary logistic regression. For a binary outcome, such as non-obesity vs. obesity (BMI_30 = I(BMI ≤ 30)), the regression function is defined for non-obese individuals only

        r(30 | smk, sex, x) = α_30 + γ_smk:sex + xβ,

    with intercept α_30, main and interaction parameters γ of smoking and sex, and regression coefficients or covariate parameters β. This model evaluates the conditional distribution function for BMI only at b = 30. Note that a change of the BMI cut-off point b leads to a different model, and thus different parameter estimates for all parameters α_b, γ, and β. Such models have been reported for b = 25 or b = 30 [11,12].

  • Ordered polytomous logistic regression. This model is also known as proportional odds logistic regression for an ordered categorical outcome, such as the WHO categories [3] underweight ($\text{BMI}_{18.5} = I(\text{BMI} \le 18.5)$), normal weight ($\text{BMI}_{(18.5,25]} = I(18.5 < \text{BMI} \le 25)$), overweight ($\text{BMI}_{(25,30]} = I(25 < \text{BMI} \le 30)$), and obese ($\text{BMI} > 30$). For these four categories, the model is defined by three category-specific regression functions

    $$r(18.5 \mid \text{smk}, \text{sex}, x) = \alpha_{18.5} + \gamma_{\text{smk:sex}} + x\beta$$

    $$r(25 \mid \text{smk}, \text{sex}, x) = \alpha_{(18.5,25]} + \gamma_{\text{smk:sex}} + x\beta$$

    $$r(30 \mid \text{smk}, \text{sex}, x) = \alpha_{(25,30]} + \gamma_{\text{smk:sex}} + x\beta$$

    or, in more compact notation, by $r(b \mid \text{smk}, \text{sex}, x) = \alpha(b) + \gamma_{\text{smk:sex}} + x\beta$ with intercept function

    $$\alpha(b) = \begin{cases} \alpha_{18.5} & \text{for } b = 18.5 \\ \alpha_{(18.5,25]} & \text{for } b = 25 \\ \alpha_{(25,30]} & \text{for } b = 30. \end{cases}$$

    The parameters $\gamma$ and $\beta$ are the same for all three regression functions and can be interpreted as category-independent log-odds ratios as a consequence of the proportional odds assumption on these parameters. The intercept function increases monotonically. Ordered polytomous logistic regression can be understood as a series of binary logistic regression models where only the intercept is allowed to change with increasing BMI values at cut-off points chosen ad hoc. Self-reported BMI values categorized by the WHO criteria have been analyzed by such a model in [7]. The BMI distribution of children categorized at marginal percentiles has been analyzed by a proportional odds model in [13].

    An extension of ordered polytomous regression to continuous responses, treating the intercept function $\alpha$ as a step function at the observations with subsequent non-parametric maximum likelihood estimation, was recently suggested in [21]. Unlike the model and estimation procedure discussed here, that method does not allow for the different likelihood contributions presented in the next section.

  • Unordered polytomous logistic regression. Multinomial logistic regression is equivalent to polytomous logistic regression for an unordered outcome and is a generalization of the proportional odds model, as it allows for category-specific parameters $\gamma(b)$ and $\beta(b)$ in the regression function

    $$r(b \mid \text{smk}, \text{sex}, x) = \alpha(b) + \gamma(b)_{\text{smk:sex}} + x\beta(b)$$

    for $b \in \{18.5, 25, 30\}$. The model can be used to test the proportional odds assumption, i.e., $\gamma \equiv \gamma(b)$ and $\beta \equiv \beta(b)$ for all $b \in \{18.5, 25, 30\}$. Typically, the model is introduced as a model of the conditional density by the relationship between density and distribution function for discrete variables (as in (2)). This model is very popular for the analysis of BMI-related outcomes [8–10].
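To make the relationship between the cumulative regression functions and the category probabilities concrete, the following Python sketch evaluates a proportional odds model at the three WHO cut-off points. The intercept values and the covariate shift (standing in for γ_smk:sex + xβ) are hypothetical numbers chosen for illustration, not estimates from the survey data.

```python
import math


def expit(z):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


CATEGORIES = ("underweight", "normal weight", "overweight", "obese")


def who_category_probabilities(alphas, shift=0.0):
    """Category probabilities implied by a proportional odds model.

    alphas: increasing intercepts at the cut-offs 18.5, 25 and 30.
    shift:  the term gamma_smk:sex + x*beta, shared by all cut-offs.
    """
    # Cumulative probabilities P(BMI <= b) at the cut-offs, then 1 for b = inf.
    cdf = [expit(a + shift) for a in alphas] + [1.0]
    # Differences of adjacent cumulative probabilities give the categories.
    probs = [cdf[0]] + [hi - lo for lo, hi in zip(cdf, cdf[1:])]
    return dict(zip(CATEGORIES, probs))


alphas = (-3.0, 0.2, 1.8)  # hypothetical intercepts
reference = who_category_probabilities(alphas)
shifted = who_category_probabilities(alphas, shift=-0.5)
print(reference)
print(shifted)
```

In this parameterization a negative shift lowers every cumulative probability P(BMI ≤ b) and therefore moves probability mass towards the higher BMI categories. The binary logistic model at b = 30 corresponds to the last cumulative probability alone.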

The novel continuous outcome logistic regression model can be viewed as a generalization of the models introduced above from discrete to continuous outcomes. Like these discrete models, the continuous BMI logistic regression model does not require strong parametric assumptions for the conditional BMI distribution, yet it allows the conceptually continuous BMI variable to be modeled by a continuous distribution, regardless of the scale of the actual BMI measurements.

The most important aspect here is a smooth and monotonically increasing intercept function α(b). In an unconditional model for the marginal BMI distribution

                                     logit(ℙ(BMI ≤ b)) = r(b) = α(b),

such an intercept function can model arbitrary BMI distribution functions by the term expit(α(b)) (technical details of the specification and estimation of such an intercept function are given in the Appendix). This essentially removes the need to specify a strict parametric distribution, such as the normal, for BMI. Because of a potential impact of both smoking and sex of the individual on the entire distribution, we stratify this intercept function with respect to these two variables, i.e., one specific intercept function is dedicated to each combination of smoking and sex:

$$\operatorname{logit}(\mathbb{P}(\text{BMI} \le b \mid \text{smk}, \text{sex})) = r(b \mid \text{smk}, \text{sex}) = \alpha(b)_{\text{smk:sex}}.$$

This model is also assumption free, because arbitrary BMI distribution functions can be assigned to each combination of sex and smoking.
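As a minimal sketch of how a stratum-specific intercept function defines a distribution function, the Python code below uses the form α(b) = a + c·log(b) with c > 0. This functional form and all parameter values are assumptions made purely for illustration; they are not the smooth intercept functions specified in the Appendix, and they correspond to a log-logistic BMI distribution in each stratum.

```python
import math


def expit(z):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters (a, c) of alpha(b) = a + c * log(b), one pair
# per combination of smoking status and sex.
STRATA = {
    ("never smoked", "female"): (-31.0, 10.0),
    ("heavy smoker", "male"): (-36.0, 11.0),
}


def alpha(b, smk, sex):
    """Smooth, monotonically increasing intercept function of one stratum."""
    a, c = STRATA[(smk, sex)]
    return a + c * math.log(b)


def bmi_cdf(b, smk, sex):
    """P(BMI <= b | smk, sex) = expit(alpha(b)_smk:sex)."""
    return expit(alpha(b, smk, sex))


def bmi_median(smk, sex):
    """BMI value at which alpha(b) = 0, i.e. where the CDF equals 0.5."""
    a, c = STRATA[(smk, sex)]
    return math.exp(-a / c)


for stratum in STRATA:
    print(stratum, round(bmi_median(*stratum), 1), bmi_cdf(30.0, *stratum))
```

Because each stratum has its own intercept function, location, spread, and shape of the BMI distribution may all differ between strata.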

To facilitate model interpretation, we assume that regression coefficients β of the remaining covariates are constant across the entire BMI distribution in our final model

$$\operatorname{logit}(\mathbb{P}(\text{BMI} \le b \mid \text{smk}, \text{sex}, x)) = r(b \mid \text{smk}, \text{sex}, x) = \alpha(b)_{\text{smk:sex}} + x\beta. \tag{4}$$

The regression coefficients β are log-odds ratios of all possible events BMI ≤ b, b > 0. The interpretation of the parameters β is the same in logistic regression, proportional odds regression, and the novel continuous BMI logistic regression (4). Of course, these constant regression coefficients might be incorrectly specified. Residual analysis, for example using the residual U = ℙ(BMI ≤ b | smk, sex, x) for a subject with BMI b, can help to detect such misspecifications. Similar to Cox-Snell residuals, the residual U is uniform when the model is correct.
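The behaviour of the residual U can be illustrated with a small simulation. The sketch below assumes a single stratum with the hypothetical intercept function α(b) = A + C·log(b), one binary covariate x, and a hypothetical coefficient BETA; none of these values come from the fitted model. BMI values are simulated from the model, and residuals are then computed once with the correct coefficient and once with the covariate wrongly omitted.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


A, C, BETA = -31.0, 10.0, 1.0  # hypothetical model parameters


def simulate_bmi(x, rng):
    """Draw a BMI value by inverting expit(A + C*log(b) + x*BETA) = u."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep logit finite
    return math.exp((logit(u) - A - x * BETA) / C)


def residual(bmi, x, beta):
    """U = P(BMI <= b | x) evaluated at the observed BMI value b."""
    return expit(A + C * math.log(bmi) + x * beta)


rng = random.Random(1)
data = [(x, simulate_bmi(x, rng)) for x in (0, 1) * 2500]

# Correct model: residuals should look like draws from Uniform(0, 1).
u_correct = [residual(bmi, x, BETA) for x, bmi in data]
mean_correct = sum(u_correct) / len(u_correct)

# Misspecified model (beta set to 0): residuals for x = 1 are shifted.
u_wrong = [residual(bmi, x, 0.0) for x, bmi in data if x == 1]
mean_wrong = sum(u_wrong) / len(u_wrong)

print(mean_correct, mean_wrong)
```

Under the correct model the residual mean is close to 0.5, whereas omitting the covariate shifts the residuals of the subjects with x = 1 away from uniformity.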

Our model (4) can be understood as a joint model of all possible binary logistic regression models for the outcomes BMI ≤ b with b > 0 under two constraints: (1) the sex- and smoking-level-specific intercept increases with the cut-off point b and is not allowed to jump abruptly, so fewer parameters are required in this joint model; (2) the regression coefficients β are held constant as b increases. Instead of restricting our attention to specific binary logistic regression models defined by cut-off points chosen ad hoc, we can answer questions about the odds ratios for all or specific events BMI ≤ b post hoc based on this model.

The interpretation of the sex- and smoking-specific intercept functions, and thus the associations of smoking and sex with BMI, however, is fundamentally different from the interpretation of the regression coefficients β. Because we allow the entire BMI distribution to change with these two variables in more complex ways, there is no simple interaction term γ that captures these parameters in model (4). However, model (4) allows computation of the log-odds ratios for some event BMI ≤ b between, for example, female former smokers and females who never smoked for all x as

$$r(b \mid \text{former smoker}, \text{female}, x) - r(b \mid \text{never smoked}, \text{female}, x) = \alpha(b)_{\text{former smoker:female}} - \alpha(b)_{\text{never smoked:female}}.$$

In this way, the parameters and contrasts we are interested in are not directly parameterized in model (4) but nevertheless can be obtained from this model by relatively simple contrasts. The events BMI ≤ b are not restricted to those of a specific categorization of the BMI measurements (such as the WHO categories). Due to the smoothness of the underlying intercept functions, log-odds ratios can be computed for arbitrary BMI values b > 0.
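Such a contrast can be sketched as follows. The two intercept functions again use the illustrative form α(b) = a + c·log(b) with hypothetical parameters; only the difference of the intercept functions enters the contrast, since the term xβ cancels.

```python
import math

# Hypothetical parameters (a, c) of alpha(b) = a + c * log(b).
ALPHA_PARAMS = {
    ("never smoked", "female"): (-31.0, 10.0),
    ("former smoker", "female"): (-32.5, 10.4),
}


def alpha(b, smk, sex):
    """Stratum-specific intercept function evaluated at BMI value b."""
    a, c = ALPHA_PARAMS[(smk, sex)]
    return a + c * math.log(b)


def log_odds_ratio(b, smk, reference_smk, sex):
    """Log-odds ratio of the event BMI <= b between two smoking levels.

    The covariate term x*beta is identical in both regression functions
    and therefore drops out of the difference.
    """
    return alpha(b, smk, sex) - alpha(b, reference_smk, sex)


# The cut-off b is not tied to any categorization, e.g. b = 27.3 is valid.
for b in (18.5, 25.0, 27.3, 30.0):
    lor = log_odds_ratio(b, "former smoker", "never smoked", "female")
    print(b, round(lor, 3), round(math.exp(lor), 3))
```

With these hypothetical parameters the odds ratio changes with b, which is exactly the flexibility that a single interaction parameter γ could not express.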

Likelihoods for BMI models. Because the regression function r is defined for all possible BMI values b in model (4), the likelihood (2) can be evaluated for all types of intervals $(\underline{b}, \bar{b}]$ and also for “exact” BMI values computed as the ratio of weight and squared height. We distinguished between four different likelihood contributions corresponding to four different BMI measurement scales.

  • WHO categories (WHO). The BMI for each individual was reported in one of the four WHO categories corresponding to the intervals ≤ 18.5 (underweight), (18.5, 25] (normal weight), (25, 30] (overweight), and > 30 (obese). The likelihood contribution of a normal-weight individual is thus

    $$\operatorname{expit}(r(25 \mid \text{smk}, \text{sex}, x)) - \operatorname{expit}(r(18.5 \mid \text{smk}, \text{sex}, x)).$$

  • Other categories (Int 1). Other studies might have used a different categorization scheme, e.g., the 12 categories defined by BMI intervals of length two:

    $$\le 17,\ (17, 19],\ (19, 21],\ \ldots,\ (35, 37],\ > 37.$$

    An individual with a BMI value between 19 and 21 thus contributes

    $$\operatorname{expit}(r(21 \mid \text{smk}, \text{sex}, x)) - \operatorname{expit}(r(19 \mid \text{smk}, \text{sex}, x))$$

    to the likelihood.

  • Numeric intervals (Int 2). With weight measured in kilograms and height in meters, the BMI is calculated according to its definition as $\text{BMI} = \text{weight}/\text{height}^2$. However, for an individual 1.75 m tall and weighing 76 kg, all BMI values between $75.5/1.755^2 = 24.51$ and $76.5/1.745^2 = 25.12$ are consistent with the reported values because of rounding. Thus, this individual contributes

    $$\operatorname{expit}(r(25.12 \mid \text{smk}, \text{sex}, x)) - \operatorname{expit}(r(24.51 \mid \text{smk}, \text{sex}, x))$$

    to the likelihood, which automatically takes the measurement error into account. These intervals can be expected to be much larger in studies that rely on self-reported weights and heights.

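  • Exact measurements (Exact). If weight and height were measured with extreme precision, $\text{BMI} = \text{weight}/\text{height}^2$ can be considered an “exact” observation. Because the interval around this value is very narrow, one can approximate the likelihood contribution by the density of the conditional BMI distribution, i.e., the derivative of the conditional distribution function,

    $$\frac{\partial}{\partial b} \operatorname{expit}(r(b \mid \text{smk}, \text{sex}, x)),$$

    evaluated at the “exact” BMI value.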

It is important to note that the likelihood can also be evaluated when a mixture of these different BMI measurement scales is applied to subsets of the individuals. In subject-level meta-analyses, for example, it would be possible to estimate a joint model based on studies using different BMI categorizations or no categorization at all. From a purely theoretical point of view, the application of numeric intervals that take rounding error into account (Int 2) is most appropriate. The remaining three procedures must be considered approximate.
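The four likelihood contributions, and their combination in one log-likelihood, can be sketched in Python as follows. The regression function uses the hypothetical form r(b) = A + C·log(b) + shift, where shift stands in for xβ; this form, the parameter values, and the record layout are assumptions for illustration only. The rounding helper assumes weight reported to the nearest kilogram and height to the nearest centimeter.

```python
import math

A, C = -31.0, 10.0  # hypothetical parameters of alpha(b) = A + C * log(b)


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def r(b, shift=0.0):
    """Hypothetical regression function alpha(b) + x*beta (shift = x*beta)."""
    return A + C * math.log(b) + shift


def interval_contribution(lower, upper, shift=0.0):
    """P(lower < BMI <= upper); None marks an open lower or upper end."""
    hi = 1.0 if upper is None else expit(r(upper, shift))
    lo = 0.0 if lower is None else expit(r(lower, shift))
    return hi - lo


def exact_contribution(b, shift=0.0):
    """Density at b: derivative of expit(r(b)), where r'(b) = C / b here."""
    cdf = expit(r(b, shift))
    return cdf * (1.0 - cdf) * C / b


def rounding_interval(weight_kg, height_m, weight_step=1.0, height_step=0.01):
    """BMI values consistent with rounded weight and height (Int 2)."""
    lower = (weight_kg - weight_step / 2) / (height_m + height_step / 2) ** 2
    upper = (weight_kg + weight_step / 2) / (height_m - height_step / 2) ** 2
    return lower, upper


def log_likelihood(records, shift=0.0):
    """Sum of log-contributions over records of mixed measurement scales."""
    total = 0.0
    for record in records:
        if record[0] == "exact":
            total += math.log(exact_contribution(record[1], shift))
        else:
            total += math.log(interval_contribution(record[1], record[2], shift))
    return total


records = [
    ("who", 18.5, 25.0),                       # normal weight
    ("int1", 19.0, 21.0),                      # interval of length two
    ("int2", *rounding_interval(76.0, 1.75)),  # 76 kg, 1.75 m
    ("exact", 24.8),                           # precisely measured BMI
    ("who", 30.0, None),                       # obese
]
print(rounding_interval(76.0, 1.75))
print(log_likelihood(records))
```

For a very narrow interval, the interval contribution divided by the interval width approaches the density, which is why the "exact" contribution can be seen as a limiting case of the interval contributions.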
