Psychometric properties of the Minnesota Living with Chronic Heart Failure Questionnaire in a Colombian population

Introduction: Although the Minnesota Living with Heart Failure Questionnaire (MLHF-Q) is one of the most widely used tools to assess health-related quality of life (HRQoL) in patients with chronic heart failure (CHF), it has not been validated in Spanish-speaking populations of Latin America. Objective: To evaluate the internal consistency and construct validity of the MLHF-Q in patients with CHF from Colombia. Methods: The Spanish version of the MLHF-Q was administered to 200 patients. Cronbach’s alpha was used to evaluate internal consistency. Confirmatory factorial principal component analysis (PCA) and Rasch analysis were used to evaluate construct validity. Discriminative capacity was assessed using the Mann-Whitney U test. Results: Median age was 64 years, 63% of the patients were men, and 79.5% had a left ventricular ejection fraction (LVEF) ≤ 45%. The median total HRQoL score was 40 points (Q1=20; Q3=55), with 11 points (Q1=4; Q3=23) in the physical dimension and 7 points (Q1=3; Q3=13) in the emotional dimension. Global internal consistency of the MLHF-Q was 0.91 (95% CI 0.89 - 0.93). The three dimensions explained 54.0% of the variance in the PCA and 47.7% in the Rasch analysis, in which five items presented misfit. Worse HRQoL was observed among women than men in the emotional dimension (p=0.047). The overall MLHF-Q score and its subscales discriminated by age and New York Heart Association (NYHA) functional class (p<0.05). Conclusions: Our findings confirmed the three-factor structure of the MLHF-Q and a satisfactory level of internal consistency. Additionally, these results suggest that the questionnaire adequately reflects the severity of the disease. However, further studies are required to validate these findings in the Colombian population and to evaluate the sensitivity to change of the MLHF-Q in longitudinal designs.

The burden of CHF limits patients' ability to carry out activities of daily living and affects health-related quality of life (HRQoL) more severely than other chronic diseases 4 . Several studies have shown that the HRQoL of patients with HF is worse than that of the general population or of patients with other chronic diseases 4,5 . Furthermore, the decline in quality of life of HF patients is not temporary, but rather progressive over time 6 . Nevertheless, measuring HRQoL in HF remains a challenge, and despite the existence of several instruments (generic and disease-specific) for assessing HRQoL, no consensus has been reached on which instrument is most suitable 7 .
The Minnesota Living with Heart Failure Questionnaire (MLHF-Q) is a disease-specific instrument consisting of 21 items that address a wide range of HRQoL domains, and it is the most frequently used instrument of its kind internationally. Since 1987, the MLHF-Q has been translated into more than 30 languages, including Spanish [8][9][10][11][12][13] , and it has been used as an outcome measure in multiple clinical trials, showing the best psychometric properties in terms of validity, reliability and sensitivity to change [14][15][16][17] .
Even though Spanish is spoken by 95% of Latin America's population, Brazil, where Portuguese is spoken, is the only country in the region where the MLHF-Q has been validated 18 , and no data are available for Colombia on the reliability and validity of the MLHF-Q. Therefore, we aimed to evaluate the internal consistency and construct validity of the MLHF-Q in patients with CHF in Colombia.

Study population
A cross-sectional study was conducted between February and October 2015 in the Heart Failure and Heart Transplant Clinic of the Cardiovascular Foundation in the city of Floridablanca, Santander, Colombia. We included patients if they (i) were 18 years old or older and (ii) had an HF diagnosis confirmed by a cardiologist. Patients with altered mental status or communication limitations were excluded. All patients gave written informed consent, and the Research Ethics Committee of the institution approved the research protocol.
In calculating our sample size, care was taken to comply with the criterion of 10 patients per analyzed item, which is considered adequate for factorial analysis 19 . The sample was selected in a non-probabilistic way; all patients were invited consecutively to participate by a previously trained nurse, who conducted the interviews during follow-up medical appointments.

Clinical screening
HRQoL was measured with the MLHF-Q 11 , a specific self-report instrument for CHF patients. The questionnaire is made up of 21 items graded by the patient on a 6-point Likert-type scale ranging from 0 (no impairment) to 5 (very much impairment). The MLHF-Q yields three scores: a physical dimension (8 items), an emotional dimension (5 items), and an overall HRQoL score (21 items). The remaining eight items, which do not assess a single construct or dimension of HRQoL, measure social and economic impairments of patients with HF and contribute to the overall score. The total score ranges from 0 to 105 points, the physical dimension from 0 to 40, the emotional dimension from 0 to 25, and the separate items on socio-economic impairments from 0 to 40. High scores on the MLHF-Q indicate impaired HRQoL. The MLHF-Q has a global internal consistency, measured by Cronbach's alpha, of 0.94 (95% CI, 0.91 to 0.95) and a general intraclass correlation coefficient of 0.84, characteristics that make it suitable for use 17 .
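The scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only: the text gives the number of items per dimension but not which items they are, so the item-to-dimension mapping below is an assumption (the assignment commonly reported for the MLHF-Q), and the function name `score_mlhfq` is hypothetical.

```python
from typing import Dict, Sequence

# ASSUMPTION: item numbers per dimension are not listed in the text above;
# this is the mapping commonly reported for the MLHF-Q.
PHYSICAL_ITEMS = (2, 3, 4, 5, 6, 7, 12, 13)   # 8 items, range 0-40
EMOTIONAL_ITEMS = (17, 18, 19, 20, 21)        # 5 items, range 0-25


def score_mlhfq(responses: Sequence[int]) -> Dict[str, int]:
    """Sum 21 item responses (each 0-5) into total and subscale scores."""
    if len(responses) != 21:
        raise ValueError("the MLHF-Q has exactly 21 items")
    if any(r not in range(6) for r in responses):
        raise ValueError("each response must be an integer from 0 to 5")

    # Items are numbered from 1 in the questionnaire.
    item = {number: r for number, r in enumerate(responses, start=1)}
    physical = sum(item[i] for i in PHYSICAL_ITEMS)
    emotional = sum(item[i] for i in EMOTIONAL_ITEMS)
    total = sum(responses)
    return {
        "total": total,                           # range 0-105
        "physical": physical,
        "emotional": emotional,
        "other": total - physical - emotional,    # 8 remaining items, 0-40
    }
```

A patient answering 5 to every item reaches the maximum of each range stated above (105, 40, 25 and 40), which is a quick consistency check on the item counts.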

Statistical analysis
Continuous variables are reported as median and quartiles (Q) unless stated otherwise, and categorical variables are presented as percentages. Internal consistency was evaluated through Cronbach's alpha coefficient 20 . The Kaiser-Meyer-Olkin (KMO) index and Bartlett's test of sphericity were estimated to establish the pertinence of factorial analysis. KMO ≥0.7 was considered acceptable 21,22 .
To evaluate the construct validity of the questionnaire, two different approaches were used. First, the structure of the model originally proposed by Rector and Cohn 8 was examined by means of confirmatory factorial principal component analysis (PCA). The dimensional structure was identified through varimax orthogonal rotation, and factor loadings ≥0.4 were considered acceptable 13,23 . Second, a polytomous Rasch rating scale model was used to assess each specific questionnaire dimension according to the factorial structure proposed in the literature 24 . The first step was to evaluate the functioning of the rating scale categories; a clearly progressive level of difficulty across item categories was expected as a criterion of adequate functioning. We also examined the standardized (ZSTD) fit statistics of persons, for which values within ±3 were expected.
To evaluate dimensionality, which is a fundamental requirement for construct validity, we applied the following criteria: (i) the information-weighted mean square statistic (infit) and the outlier-sensitive mean square statistic (outfit), with values between 0.7 and 1.3 indicating good fit, and (ii) PCA of the residuals 25 . Unidimensionality was considered violated if, besides the first factor, other factors had eigenvalues >3. Local dependency was assessed through the item residual correlations, where values >0.5 may indicate that the response to one item is determined by another. To detect the presence of differential item functioning (DIF), which occurs when groups within the sample respond differently to an individual item, we compared distinct levels of the trait by sex and age group (≤65 vs. >65 years). A statistically significant Welch's t-test (p<0.05) together with a difficulty difference ≥0.5 logits was considered evidence of uniform DIF.
Finally, the discriminative capacity of the questionnaire was assessed by its ability to differentiate among subgroups of patients with different levels of CHF severity. The Mann-Whitney U test was used to test the following hypothesis: women, older patients, patients in a higher New York Heart Association (NYHA) functional class and those with a left ventricular ejection fraction (LVEF) under 45% would have higher MLHF-Q scores. All statistical tests were two-sided and a p-value <0.05 was considered significant. Data were analyzed using Stata Statistical Software, version 14, and Winsteps 3.80.0.
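For readers unfamiliar with the coefficient, the following is a minimal sketch of how Cronbach's alpha can be computed from a patients-by-items response matrix. It uses the standard formula (ratio of summed item variances to the variance of the total score) and only the Python standard library; it is not the Stata routine used in the study, and the function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per patient, one column per item."""
    n_patients, k = len(rows), len(rows[0])
    if k < 2 or n_patients < 2:
        raise ValueError("need at least two items and two patients")

    # Sample variance (n - 1 denominator) of each item across patients.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each patient's summed score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Values closer to 1 indicate that the items vary together; perfectly parallel items give exactly 1.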
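The loading criterion described above can be illustrated with a short sketch that assigns each item to the factor on which it loads most strongly, provided that loading reaches 0.4. The loadings in the usage note are invented for illustration and are not the study's results; using the absolute value of the loading is an assumption, and `assign_items` is a hypothetical helper name.

```python
from typing import Dict, Optional, Sequence


def assign_items(
    loadings: Dict[int, Sequence[float]], threshold: float = 0.4
) -> Dict[int, Optional[int]]:
    """Map each item to the index of its strongest factor, or None if below threshold.

    ASSUMPTION: strength is judged on the absolute value of the rotated loading.
    """
    assignment: Dict[int, Optional[int]] = {}
    for item, row in loadings.items():
        # Index of the factor with the largest absolute loading for this item.
        best = max(range(len(row)), key=lambda f: abs(row[f]))
        assignment[item] = best if abs(row[best]) >= threshold else None
    return assignment
```

With made-up loadings such as `{1: [0.72, 0.10, 0.05], 2: [0.20, 0.35, 0.30]}`, item 1 is assigned to the first factor (index 0), while item 2 is left unassigned because none of its loadings reaches 0.4.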
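The fit and DIF criteria above can be made concrete with a brief sketch. It uses the usual definitions of the Rasch mean-square statistics (outfit as the mean of squared standardized residuals, infit as the sum of squared residuals divided by the sum of model variances) and applies the thresholds stated in the text. The inputs (observed responses, model-expected scores and model variances for one item) are assumed to come from a fitted Rasch model; this is not the Winsteps implementation, and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_mean_squares(
    observed: Sequence[float],
    expected: Sequence[float],
    variance: Sequence[float],
) -> Tuple[float, float]:
    """Return (infit, outfit) mean squares for one item across persons."""
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals (outlier-sensitive).
    outfit = sum(s / v for s, v in zip(squared_residuals, variance)) / len(squared_residuals)
    # Infit: information-weighted, i.e. residuals and variances are summed first.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


def within_fit_range(mean_square: float, low: float = 0.7, high: float = 1.3) -> bool:
    """Apply the 0.7-1.3 criterion for good fit used in the text."""
    return low <= mean_square <= high


def uniform_dif(difference_logits: float, p_value: float) -> bool:
    """Flag uniform DIF: significant Welch's t (p<0.05) and a gap of at least 0.5 logits."""
    return p_value < 0.05 and abs(difference_logits) >= 0.5
```

Both conditions must hold for an item to be flagged for DIF, so a statistically significant but small difference (for example 0.3 logits) is not flagged.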
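As an illustration of the subgroup comparison described above, the sketch below computes the Mann-Whitney U statistic with midranks for ties and a two-sided p-value from the normal approximation with tie correction. It is a simplified standard-library version under stated assumptions (no continuity correction, no exact small-sample p-value) and is not the Stata procedure used in the study; the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation tied: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value
```

In practice the two arguments would be the MLHF-Q scores of the two subgroups being compared, for example women and men, or NYHA class I-II and III-IV.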

Characteristics of the study population
There were no missing data. During the recruitment period, 200 CHF patients fulfilled the selection criteria, agreed to participate and completed the questionnaire. The median age of participants was 64 years (Q1=53; Q3=73), 63.0% were men, 79.5% had an LVEF ≤ 45%, and 24.0% of subjects were in NYHA functional class III-IV. Sociodemographic and clinical characteristics of the study population are shown in Table 1.

Internal reliability
Cronbach's alpha coefficients ranged from 0.73 (social dimension) to 0.91 (physical dimension and total score) in the MLHF-Q, indicating a satisfactory level of internal consistency. Descriptive analysis and internal consistency of the MLHF-Q are shown in Table 2 and Supplementary Material Table S1.

Construct validity
The KMO statistic was 0.90, indicating sampling adequacy (Supplementary Material Table S2), and Bartlett's test of sphericity was statistically significant (χ²(210) = 2126.20; p<0.001), suggesting that the data were appropriate for factorial analysis 22 .
All items in the first factor were associated with signs and symptoms of HF; this factor was identified as the physical dimension. The second factor included four of the five items from the original questionnaire, all related to the patient's psychological response to the disease; this factor was recognized as the emotional dimension. Finally, three items in the third factor were related to the patient's social relationships, so this factor was named the social dimension. The confirmatory factorial PCA with three factors explained 54.03% of the total variance in the study population, of which 30.6% was explained by the first factor, 15.8% by the second factor and 7.6% by the third factor. The eigenvalue was 6.43 for the physical, 3.31 for the emotional, and 1.59 for the social dimension (Supplementary Material Table S3).
Regarding the Rasch analysis of the total score, the average measures of the MLHF-Q rating scale were ordered, progressing from -0.84 logits for rating scale category zero (no impairment) to 0.27 logits for rating scale category five (very much impairment). Disordered thresholds (response categories not working logically) were corrected by combining adjacent categories (Supplementary Material Figure S1); the result was a 3-point scale that met the criteria for rating scale functioning. Eight persons had a ZSTD outside the expected range and were excluded from the analysis. In the analysis of the physical, emotional and social dimensions, disordered rating scale categories were not observed; therefore, all analyses were made with the original MLHF-Q coding. The items of the social dimension explained 44.8% of the variance and had an eigenvalue of 1.7 in the first contrast, and the residuals did not present any important correlation. Item 8 (working to earn a living difficult) of the social dimension had a value slightly below the expected range.
The eight items of the physical dimension presented an eigenvalue of 2 in the first contrast and explained 57.4% of the raw variance; two of the items were above and one item was below the expected range, as shown in Table 4. In the emotional dimension, one item presented a severe misfit (Table 4); the variance explained by these items was 54.0%, with an eigenvalue of 1.7 in the first contrast, and a correlation of -0.51 between items 19 and 20 was found. After eliminating item 20 and analyzing the remaining four items, all statistics were within the expected values (Supplementary Material Table S4) and the variance explained improved (Supplementary Material Table S5). No uniform DIF by sex or age group was detected in any dimension. Wright maps are presented for each dimension evaluated (Supplementary Material Figure S2).
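As a rough check on the PCA figures reported in this section, the percentage of variance attributed to each factor can be recovered from its eigenvalue. The worked arithmetic below assumes that the PCA was run on the correlation matrix of the 21 items, so that the total variance equals the number of items (p = 21); the text does not state this explicitly.

```latex
\[
\frac{\lambda_k}{p}\times 100\%:\qquad
\frac{6.43}{21}\approx 30.6\%,\quad
\frac{3.31}{21}\approx 15.8\%,\quad
\frac{1.59}{21}\approx 7.6\%
\]
\[
\frac{6.43+3.31+1.59}{21}=\frac{11.33}{21}\approx 54.0\%
\]
```

Under that assumption, the three per-factor percentages match those reported, and their sum is close to the reported 54.03%; the small difference presumably reflects rounding of the published eigenvalues.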

Contrast validity
The MLHF-Q subscales and the overall score discriminated between age groups and between NYHA functional classes (p<0.05). Worse HRQoL was observed among women than men in the emotional dimension (p=0.047). Although HRQoL impairment was higher in patients with LVEF ≤ 45% than in those with LVEF >45%, the difference was not statistically significant (Supplementary Material Table S6).

Discussion
To the best of our knowledge, the present work is the first study to assess the psychometric properties of the MLHF-Q in a Spanish-speaking population of Latin America. We evaluated the internal consistency, the construct validity through two methods (PCA and Rasch analysis), and the discriminative capacity of the MLHF-Q in outpatients with CHF in Colombia.
Regarding construct validity, we found that the three factors explained 47.7% and 54.03% of the variance in the Rasch analysis and PCA, respectively, similar to results previously reported by another study 27 , while in other studies the variance explained by these three factors has been higher (64.1 to 72%) 9,10,19,28 . We also found the following similarities with other authors: Heo, et al. 27 found that items 1 and 9 loaded on the physical dimension and that items 14 and 16 had loadings <0.4. Ho, et al. 19 showed that item 1 loaded on the physical dimension. Finally, Moon, et al. 28 found that items 1, 9 and 10 loaded on the physical dimension.
Item 1 (swelling in your ankles, legs) does not belong to the physical dimension in the original version; however, it has been reported that up to two thirds of patients admitted with acute HF present signs of hypervolemia such as jugular venous distension and peripheral edema, typical pathophysiological manifestations of HF [29][30] , which supports its correlation with the physical dimension. On the other hand, determining the most plausible dimension for item 10 (sexual activities difficult) is complicated because the sexual activity of HF patients has a multifactorial explanation (psychological, emotional, physical and medical) 31 . Possible explanations for the differences from other authors in factor structure, variance explained and eigenvalues include sample size, culture, and demographic and clinical characteristics, among others 10,17 .
The MLHF-Q is interpreted by its total score, which results from summing the scores of all 21 items. However, this assumes that the total score is unidimensional, and the Rasch analysis of the total score did not find evidence of unidimensional functioning. Moreover, it showed misfit in five items (10, 14, 15, 16, 20), thereby confirming the existence of some problematic items in the composition of the total score. Elimination of items has been reported as a solution 10,27 . Exclusion of problematic items in our study improved the general fit to the Rasch model.
Similar findings have been reported by Munyombwe, et al. 10 , who found that several items (7, 8, 10, 14, 16) presented misfit. Bilbao, et al. 24 also reported two misfitting items (1 and 10). Considering that misfitting items have been identified in a third factor representing the social dimension, as also shown in the current study, several authors have suggested adding a third factor to the total score 9,10,19,23,28 . However, it remains a challenge to reach a consensus on which of the different social factors proposed is the most appropriate and has the best psychometric properties; future studies should therefore examine this further using confirmatory techniques.
Regarding response categories, we found difficulties in distinguishing between the response options very little (1) and little (2), and between much (4) and very much (5). This pattern was also reported by Munyombwe, et al. 10 , who suggest that it could be explained by the sample size or by an excess of response categories.
According to the item maps for the subscales, some patients are located at the bottom of the person-item maps of the emotional and physical subscales, denoting floor effects. This finding is consistent with Munyombwe, et al. 10 , and suggests that those subscales need more items to cover all levels of the underlying trait. Nevertheless, some studies have reported either floor or ceiling effects in the analysis of the total score of the MLHF-Q 10 .
In relation to other variables that measure different stages of disease severity, our results are consistent with our a priori hypothesis. The MLHF-Q scores clearly discriminate between different NYHA functional classes and age groups. This has also been observed in observational studies 9,11,13 and in clinical trials, which are the ideal setting for assessing sensitivity to change [32][33][34] .

Strengths and limitations
The strengths of our study include an adequate sample size, as also shown by the KMO statistic. We also provide complete analyses of structural validity, using both factorial PCA and Rasch analysis. The present study has, however, some important limitations to consider. First, our study was conducted in a single HF center. Accordingly, the results cannot be considered a representative description of HRQoL across all HF clinics in Colombia. Second, the MLHF-Q is a self-administered questionnaire, but in our study it was administered by a nurse because a high percentage of our population had a low educational level, which could have affected the measurement of HRQoL.

Conclusions
In conclusion, we have assessed the internal consistency, construct validity and discriminative capacity of the MLHF-Q in patients with CHF from Colombia. As in previous studies, we confirmed the three-factor structure of the MLHF-Q and a satisfactory level of internal consistency. Additionally, these results suggest that the questionnaire adequately reflects the severity of the disease. However, further studies are required in the Colombian population to validate these findings and to evaluate the sensitivity to change of the MLHF-Q in longitudinal designs.