Impact of preanalytical factors on fecal immunochemical tests: need for new strategies in comparison of methods
Int J Biol Markers 2015; 30(3): e269 - e274
Article Type: ORIGINAL RESEARCH ARTICLE
Authors: Tiziana Rubeca, Filippo Cellai, Massimo Confortini, Callum G. Fraser, Stefano Rapi
Harmonization of fecal immunochemical tests for hemoglobin (FIT-Hb) is crucial to compare clinical outcomes in screening programs. The lack of reference materials and standard procedures does not allow the use of usual protocols to compare methods. We propose 2 protocols, based on artificial biological samples (ABS), to discriminate preanalytical and analytical variation and investigate clinical performances. The protocols were used to compare 2 FIT systems available on European markets: the OC-Sensor Diana (Eiken, Tokyo, Japan) and HM-JACKarc (Kyowa-Medex, Tokyo, Japan).
ABS were obtained by adding Hb to Hb-free feces. In the first procedure, 35 ABS were collected for each collection device and analyzed on both systems. In the second, 188 ABS (106 positive and 82 negative) were collected and tested on the specific systems. Passing-Bablok (PB) regression, Pearson’s correlation coefficients (R) and Bland-Altman difference analysis were used to compare data.
PB, R and mean standard errors for Bland-Altman analysis (Diana vs. Arc) results were 0.93x-0.56: R = 0.97 and 19%; and 1.09x + 5.60: R = 0.96 and -18%; for Diana and Arc devices, respectively. No correlations and no difference in positive/negative assessment were observed with the second protocol.
A good correlation was observed in comparing data generated using collection devices on the 2 systems. Manufacturers have developed different sample collection procedures for feces: therefore, data from different systems cannot easily be compared. Adoption of protocols to discriminate preanalytical and analytical variation would be a significant contribution to harmonization of FIT, facilitating data comparison and information acquisition for sample collection strategy and effect of buffers on systems.
- Submitted on 10/11/2014
- Accepted on 23/02/2015
- Available online on 20/05/2015
- Published online on 22/07/2015
In recent years, the introduction of colorectal cancer (CRC) screening programs has been recommended by the EU Council (1), and these have been implemented as centralized programs in many member states (2, 3). Quantitative fecal immunochemical tests (FIT) for hemoglobin (Hb) seem to be the best noninvasive strategy for centralized CRC screening programs (4), in particular allowing selection of a cutoff fecal Hb concentration according to colonoscopy requirements, standardization and monitoring the analytical process and the comparison of data obtained in different screening programs. Qualitative FIT and opportunistic screening programs do not provide such benefits. As recently reported by the Expert Working Group on FIT for Screening, of the Colorectal Cancer Screening Committee (CRCSC), of the World Endoscopy Organization (WEO), the lack of reference materials and standard procedures for analytical systems and, in particular, for sample collection and preanalytical techniques, represents major problems for the comparison of both analytical and outcome data (5). Many sources of variability are related to feces and bleeding characteristics. Consistency of feces and design of the sample collection devices (pickers) affect the amount of collected material: the intermittent nature of bleeding of neoplastic and nonneoplastic lesions and the heterogeneity of blood distribution in the fecal mass affect Hb concentrations in samples and the overall results of FIT-based screening programs. Moreover, each system has its own extraction buffer and extraction procedure in the sample collection device, and therefore analytical methods are strictly related to the characteristics of the specific devices regarding:
- amount of collected feces;
- feces to buffer ratio and subsequent range of the Hb concentration;
- chemical conditions in devices: buffer composition affects pH and ionic strength of the medium, and the preservatives and stabilizers can cause interference in the overall reaction.
The consequence of this system specificity in fecal sample collection strategy is the impossibility of comparing FIT system performance using the standard NCCLS (National Committee for Clinical Laboratory Standards) protocols for methods comparison (6) or investigation based on a single sampling strategy (7).
Since 1997, our laboratories have been involved in the investigation of fecal methods for Hb measurements in CRC screening (8). In this study, we present our internal strategy used for quantitative FIT investigation, through an assessment of 2 commercial systems. Our proposed strategy is based on the use of artificial biological samples (ABS), obtained by addition of Hb to Hb-free fecal specimens to investigate method performance characteristics according to an assessment of positive and negative samples and a 2-step procedure to discriminate between preanalytical and analytical variations of both devices and analytical systems. In the first stage, a series of ABS were collected using the specific sample collection devices and analyzed on both systems. This procedure allowed comparison of the analytical performance and the effects of buffer on the overall results of the 2 systems. This kind of strategy eliminates preanalytical variation in Hb measurements.
In the second stage, a large series of ABS were sampled with system-specific sample devices and analyzed on the appropriate system. The aim was a preliminary assessment of the clinical performance of systems using ABS of known Hb concentrations.
Material and methods
The OC-Sensor Diana (Eiken Chemical Co. Ltd., Tokyo, Japan; hereafter termed Diana), used in the CRC screening program in Florence for many years (8, 9) and well established in the European market, was compared with the HM-JACKarc (Kyowa Medex Co. Ltd, Tokyo, Japan; distributed in continental Europe by Menarini Diagnostics; termed Arc), recently introduced in Italy. Systems for quantitative FIT measurements are usually composed of system-specific sample collection devices, automated analyzers and reagents, calibrators and quality control materials. Sample collection devices for both systems consist of a probe integral to the cap of the device for fecal collection and a tube containing a volume of sample buffer with Hb preservatives and stabilizers (both patent-protected by manufacturers). Manufacturers specify masses of 10 mg and 2 mg of feces collected for Diana and Arc, respectively, and the same volume of buffer (2 mL), corresponding to theoretical fecal relative concentrations of 5 g/L for Diana and 1 g/L for Arc (10) and a Hb to feces ratio of 5 and 1 μg/g, respectively, for Diana and Arc systems (10). In both systems, the Hb measurement is based on the turbidity induced by the reaction between human Hb and a latex-bound antibody measured on a dedicated automated instrument.
Artificial biological samples (ABS) were prepared by adding different amounts of a human blood lysate to freshly collected Hb-free human fecal samples. Starting lysates for ABS were obtained by adding 10 μL of red blood cells from residual laboratory materials to 5 mL of distilled water. Further dilutions, covering Hb concentrations from 0.1 to 10 mg/mL, were obtained using an aqueous sodium chloride solution (156 mmol/L) as diluent. A series of volumes (50-100 μL) of the final solution were added to a set of fecal samples (1-3 g).
The Hb concentrations in ABS were set taking into account both the cutoff concentration and analytical range suggested by the manufacturer (cutoff 100 ng/mL and 30 ng/mL equivalent to 20 μg/g and 30 μg/g, analytical range 50-1,000 ng/mL and 7-400 ng/mL, equivalent to 10-200 μg/g and 7-400 μg/g, for Diana and Arc, respectively). ABS were kept at 2°C-4°C during the study period. To obtain a full knowledge of the new Arc system, imprecision profile, calibration stability, recovery and hook effect were investigated before the methods comparison (11, 12).
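The equivalence between buffer concentrations (ng/mL) and fecal concentrations (μg/g) quoted above follows from the fecal mass and buffer volume stated by the manufacturers. The minimal sketch below illustrates that arithmetic; the function and dictionary names are hypothetical, and it assumes the nominal 10 mg and 2 mg of feces in 2 mL of buffer are collected exactly.

```python
def ng_per_ml_to_ug_per_g(conc_ng_ml, feces_mg, buffer_ml):
    """Convert Hb in device buffer (ng/mL) to Hb per mass of feces (ug/g).

    ng/mL times mL gives ng of Hb in the device; dividing by mg of feces
    gives ng/mg, which is numerically equal to ug/g.
    """
    return conc_ng_ml * buffer_ml / feces_mg


# Nominal device characteristics as stated by the manufacturers (assumed exact).
DEVICES = {
    "Diana": {"feces_mg": 10.0, "buffer_ml": 2.0},
    "Arc": {"feces_mg": 2.0, "buffer_ml": 2.0},
}

# Manufacturer cutoffs expressed in ng/mL of buffer.
diana_cutoff_ug_g = ng_per_ml_to_ug_per_g(100, **DEVICES["Diana"])
arc_cutoff_ug_g = ng_per_ml_to_ug_per_g(30, **DEVICES["Arc"])

# Analytical ranges expressed in ng/mL of buffer.
diana_range_ug_g = tuple(
    ng_per_ml_to_ug_per_g(c, **DEVICES["Diana"]) for c in (50, 1000)
)
arc_range_ug_g = tuple(
    ng_per_ml_to_ug_per_g(c, **DEVICES["Arc"]) for c in (7, 400)
)
```

Because the conversion depends on the mass actually collected, any deviation from the nominal fecal mass shifts the reported μg/g value proportionally.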
The imprecision profile was investigated using 2 control materials at different Hb concentrations (C1 = 60-96 μg/g; C2 = 178-250 μg/g: Hemo-Dev FOB-Bilevel; BIO-DEV, Milan, Italy) and 2 ABS concentrations (ABS1: at cutoff concentration; ABS2: moderately positive). The manufacturer documented the same expected values for the Hemo-Dev materials on all analytical systems.
C1 and C2 were analyzed on 11 working days (2 daily runs) over 35 days, to assess calibration stability. Two calibration curves were kept for 7 days each, the system was then recalibrated, and the curve retained for 21 days. Linearity and sensitivity were investigated using a recovery test with a series of 10 dilutions of 2 ABS with different Hb concentrations (20 μg/g and 80 μg/g). Hook effect (prozone) was investigated using the relationship between the expected Hb concentration (ng/mL) and integrated sphere turbidity (IST) values reported on the system. The prozone effect was investigated using 15 serial dilutions of a high Hb concentration ABS (ca. 30,000 μg/g). Carryover on the Arc system was investigated considering the IST of samples of very low Hb concentrations following samples with high Hb concentrations during the study.
First protocol for method comparison
To obtain information about the effect of the collection devices on the specific system and the overall bias using Diana and Arc sample devices, 35 ABS were sampled using the specific sample devices (tubes and pickers) for Diana and Arc and analyzed on both systems. In a preliminary evaluation, the same Hb concentrations were obtained when comparing sampling tubes and the extracted content on the appropriate analyzers. The samples were first run on the appropriate system (Diana devices on Diana system, and Arc devices on Arc system), and then the sampling devices were opened and contents pipetted into traditional sample cups and analyzed on the other system.
Hb concentrations below the analytical range of Diana (10 μg/g) were excluded from further investigations. A lower Hb concentration in Diana devices was expected in relation to the lower amount of feces collected by the Arc sample device.
Second protocol for method comparison
A series of 106 positive and 82 negative ABS was sampled with both sample collection devices to obtain 188 tubes for Diana and Arc, respectively, and the devices were then analyzed on the appropriate system to evaluate the effects of sample collection on analytical outcomes.
Contingency tables were created using the known Hb concentrations in ABS to simulate the diagnostic assessments of 2 systems. Positivity or negativity of each ABS was assessed considering the cutoff advocated by the manufacturers: 20 and 30 μg/g on the Diana and Arc, respectively.
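The contingency assessment described above can be sketched as follows. This is an illustration only, not the procedure used in the study: the function names and the example values are hypothetical, and treating a result equal to the cutoff as positive is an assumption.

```python
from collections import Counter

# Manufacturer cutoffs in ug Hb/g feces, as given in the text.
CUTOFF_UG_G = {"Diana": 20.0, "Arc": 30.0}


def is_positive(hb_ug_g, system):
    """Classify a result; results at or above the cutoff count as positive (assumption)."""
    return hb_ug_g >= CUTOFF_UG_G[system]


def contingency(expected_positive, measured_ug_g, system):
    """Cross-tabulate the known ABS status against the status measured on one system."""
    counts = Counter()
    for expected, hb in zip(expected_positive, measured_ug_g):
        observed = is_positive(hb, system)
        if expected:
            counts["TP" if observed else "FN"] += 1
        else:
            counts["FP" if observed else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FN", "FP", "TN")}


# Hypothetical example: known status of 4 ABS and the Hb measured on each system.
expected_status = [True, True, False, False]
measured = [45.0, 12.0, 8.0, 25.0]
diana_table = contingency(expected_status, measured, "Diana")
arc_table = contingency(expected_status, measured, "Arc")
```

In this made-up example the sample measured at 25 μg/g is positive at the Diana cutoff but negative at the Arc cutoff, which shows how the choice of cutoff alone can change the classification of borderline samples.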
Method imprecision was reported as mean and standard deviation (SD) of result, and expressed as coefficient of variation (CV, %). The stability of calibration was expressed as bias between the first and the last days of the results obtained with the control materials (days 1-7 for the first and the second, and days 1-21 for the third). Recovery performance was investigated using linear regression analysis (theoretical vs. recovered Hb) and Bland-Altman analysis (theoretical Hb vs. bias, %: [(expected – observed value) vs. expected value]).
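As a minimal sketch of the two quantities defined above, the coefficient of variation and the percentage recovery bias could be computed as follows. The function names and example values are hypothetical; the sign convention (expected minus observed, relative to expected) follows the text, and the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) from the mean and sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def percent_bias(expected, observed):
    """Recovery bias (%) per sample: (expected - observed) relative to expected."""
    return [100.0 * (e - o) / e for e, o in zip(expected, observed)]


# Hypothetical replicate results for one control material (ug/g).
replicates = [90.0, 100.0, 110.0]
replicate_cv = cv_percent(replicates)

# Hypothetical recovery experiment: theoretical vs. recovered Hb (ug/g).
recovery_bias = percent_bias([20.0, 80.0], [18.0, 84.0])
```

A positive bias here means the system recovered less Hb than expected, and a negative bias means it recovered more.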
Data obtained with both procedures for methods comparison were investigated using Passing-Bablok (PB) regression, Pearson’s correlation coefficients (R) and Bland-Altman analysis of differences, as percentages. All data were reported considering Arc vs. Diana and (Diana−Arc) vs. mean Hb. Statistical analysis was performed using Microsoft Office Excel 2003 (Windows 7 Professional).
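The study computed these statistics in Excel; the sketch below only illustrates the three techniques in Python and is not the authors' implementation. It uses the standard Passing-Bablok shifted-median estimator without confidence intervals, and it assumes that the percentage difference in the Bland-Altman analysis is taken relative to the mean of each pair. All names and example values are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, median, stdev


def passing_bablok(x, y):
    """Passing-Bablok slope and intercept (point estimates only)."""
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # identical points carry no slope information
        s = math.copysign(math.inf, dy) if dx == 0 else dy / dx
        if s != -1:  # slopes of exactly -1 are excluded by the method
            slopes.append(s)
    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if n % 2:
        slope = slopes[(n - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[n // 2 - 1 + k] + slopes[n // 2 + k])
    intercept = median([yi - slope * xi for xi, yi in zip(x, y)])
    return slope, intercept


def pearson_r(x, y):
    """Pearson's correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman_percent(diana, arc):
    """Mean and SD of (Diana - Arc) as a percentage of the pair mean (assumption)."""
    diffs = [100.0 * (d - a) / ((d + a) / 2) for d, a in zip(diana, arc)]
    return mean(diffs), stdev(diffs)


# Hypothetical paired results (ug/g): Diana on x, Arc on y.
diana = [10.0, 20.0, 30.0, 40.0, 50.0]
arc = [2 * v + 1 for v in diana]
pb_slope, pb_intercept = passing_bablok(diana, arc)
r = pearson_r(diana, arc)
mean_diff, sd_diff = bland_altman_percent([110.0, 105.0], [90.0, 95.0])
```

Passing-Bablok regression is preferred over ordinary least squares for method comparison because it does not assume that one method is free of measurement error and is robust to outliers.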
Results
Imprecision data obtained for the 2 control materials (C1, C2) and the 2 ABS on Arc during the familiarization period are reported in the table below.
Imprecision profile obtained for OC-Sensor Diana and HM-JACKarc on third-party control materials C1 (assigned: 60-96 μg/g) and C2 (assigned: 178-250 μg/g) and 2 artificial biological samples (ABS)
| Sample | System | Number | Mean (μg/g) | SD (μg/g) | Total CV (%) |
CV = coefficient of variation; SD = standard deviation.
No prozone effect was observed up to an IST value of 115,000 (corresponding to 650 μg/g), and a gradual reduction of IST was observed at higher Hb concentrations in the IST range 115,000-75,000 (600-5,000 μg/g). The overall IST count remained steadily higher than IST of 45,000, corresponding to a fecal Hb concentration of 400 ng/mL (the stated upper analytical limit). Three samples with IST = 0 were observed after samples with Hb values out of range. Despite the low limit of detection (7 μg/g), no evidence of carryover effect was observed on the Arc system. All results generated were consistent with the manufacturer’s specifications.
First protocol for method comparison
Linear regression plot of hemoglobin (Hb) concentrations measured on analysis of samples collected in Diana devices on both systems. PB = Passing-Bablok analysis.
Linear regression plot of hemoglobin (Hb) concentrations measured on analysis of samples collected in Arc devices on both systems. PB = Passing-Bablok analysis.
PB values and confidence intervals were Diana = 0.92x - 0.56 (95% confidence interval [95% CI]: Slope 0.85 to 1.04, Intercept -2.56 to 0.13) and Arc = 1.09x + 5.60 (95% CI: Slope 0.95 to 1.21, Intercept -4.60 to 21.0). Pearson’s values (95% CI) were 0.966 (0.935-0.983) and 0.959 (0.919-0.979) for Diana and Arc, respectively. Mean concentrations and range of Hb obtained on Diana devices by Diana and Arc systems were 39 μg/g (13-112) and 33 μg/g (10-78), respectively. Mean values and range of Hb concentration obtained on Arc devices by Diana and Arc systems were 98 μg/g (27-176) and 116 μg/g (34-204). Bland-Altman difference plots are reported in the two figures that follow.
Bland-Altman plot of Diana sample collection devices. Data are reported as differences between Diana and Arc concentrations (percentage differences).
Bland-Altman plot of Arc sample collection devices. Data are reported as differences between Diana and Arc concentrations (percentage differences).
A mean standard error of 19% ± 15% (Diana vs. Arc) was observed on Diana devices, whereas a mean standard error of -18% ± 15% (Diana vs. Arc) was observed starting with Arc devices. Lower differences were observed (14% and -13%, respectively, for Diana and Arc devices) for concentrations higher than the cutoff concentration recommended for Diana (20 μg/g).
Second protocol for method comparison
The distribution plot of data obtained from the analysis of 188 ABS collected with specific sample devices on the appropriate systems (Diana devices on Diana system and Arc devices on Arc system) is shown in the figure that follows.
Analysis of 188 artificial biological samples (ABS; 106 positive and 82 negative samples) collected with system-specific sample devices and performed on the appropriate systems. Hb = hemoglobin.
Discussion
Application of standard techniques for the comparison of analytical methods facilitates understanding of a large part of the between-method variability before their introduction into clinical settings. Differences in measurement technology, traceability of calibrants, composition of reagents, mass of feces collected with sample devices, volume of buffer and preservatives and stabilizers contained in the collection buffers are indicated as the main problems hindering data comparison between FIT systems (10, 13, 14). A close relation between the components of sampling strategy, sample collection and analytical system has been shown in our study, so a single device cannot be tested on different systems without introducing significant sources of error. For this reason, specific protocols should be developed and used to compare fecal methods.
Until now, the strategies to compare data from different FIT methods have been based on the comparison of outcomes obtained from large cohorts of participants in screening programs (15-17). The use of ABS and the introduction of staged protocols to discriminate preanalytical and analytical components of the overall variation of systems could be a useful step to assess the performance and characteristics of fecal tests before their clinical use. ABS based on a fecal artificial matrix, although more complex than samples obtained by adding Hb to feces, as used in other work (13), seem to be more useful for the estimation of the amount of feces collected during sampling and for the evaluation of the effect of sampling on results.
In the present investigation, a good correlation between the 2 systems was observed using both sample devices, considering the analytical phase and also in terms of clinical assessments using specific system-sampling procedures and cutoff concentrations; in contrast, a total lack of alignment between methods was confirmed when fecal sampling was included in the procedure. These data confirm the close relation between FIT results and their specific sampling strategies and collection devices. Discrimination between the preanalytical and analytical phase facilitated the collection of some interesting information on the effect of sampling buffer on specific methods.
The mean amounts of feces, and then of Hb, collected with other sample devices are different, highlighting a lack of consistency in reporting fecal Hb concentrations in ng/mL and the need to report the results as μg/g (18, 19).
Both sampling buffers show a significant positive effect on the specific system, in particular at lower Hb concentrations.
Moreover, in both systems, the differences in Hb measurement became smaller when only samples with Hb concentrations of clinical interest were considered (19% vs. 14% for Diana devices; -18% vs. -13% for Arc devices, for all samples vs. values higher than the cutoff concentrations of the systems).
Interesting data from this short investigation were also related to ABS with borderline Hb and cutoff concentration evaluation of the 2 systems. In the 2 series of ABS investigated during the first stage, 10 (28%) and 18 (54%) samples were below the Diana cutoff concentration for Diana and Arc devices, respectively, whereas all samples were positive on Arc with both sample devices, confirming data found previously on the Kyowa-Medex system (16) suggesting a higher rate of positive test results, although the 2 HM-JACK analytical systems have many significant differences. Until now, all manufacturers have developed different systems with specific strategies for fecal sampling and a total lack of standardization is observed on preanalytical variation: in consequence, a harmonization of fecal sampling procedures across systems and manufacturers could be a fruitful target to reduce the overall variability among FIT results (7).
The introduction of a new strategy for sampling, collection devices and the use of specific procedures to discriminate preanalytical and analytical components of variation of fecal tests, may provide more accurate information on these aspects of methods and better address differences among them, supporting items already suggested by the Expert Working Group on FIT for Screening of the WEO’s CRCSC (5, 18). We recognize that our suggested protocol stages are incomplete, mainly from the statistical point of view, but we suggest that they do represent at least a starting point toward complete and shared procedures in a neglected field in laboratory medicine. The development of specific protocols and procedures to discriminate preanalytical and analytical components of variation in fecal tests could be a major step in the improvement of FIT harmonization, allowing better comparison of analytical results of Hb concentration (19) and data on clinical outcomes.
1. Report from the Commission to the Council, the European Parliament, the European Economic and Social Committee and the Committee of the Regions - Implementation of the Council Recommendation of 2 December 2003 on cancer screening (2003/878/EC). Official Journal L 327 of 16.12.2003.
Segnan N, Patnick J, von Karsa L, eds. European guidelines for quality assurance in colorectal cancer screening and diagnosis. 1st ed. Luxembourg: Publications Office of the European Union; 2010.
6. Method Comparison and Bias Estimation Using Patient Samples; Approved Guideline, Second Edition.
11. Evaluation of precision performance of quantitative measurement methods: approved guideline. Vol. 22, No. 2; 2nd ed. NCCLS document EP5-A2. 2004.
12. Evaluation of the linearity of quantitative measurement procedures: a statistical approach: approved guideline. NCCLS document EP6-A. 2003.
13. NHS Bowel Cancer Screening Southern Programme Hub. Evaluation of quantitative faecal immunochemical tests for haemoglobin. 2013.
- Rubeca, Tiziana 1
- Cellai, Filippo 1
- Confortini, Massimo 1
- Fraser, Callum G. 2
- Rapi, Stefano 3, * Corresponding Author (firstname.lastname@example.org)
1 Cancer Prevention and Research Institute (ISPO), Florence - Italy
2 Centre for Research into Cancer Prevention and Screening, University of Dundee, Ninewells Hospital and Medical School, Dundee - Scotland
3 Central Laboratory, Laboratory Department, Careggi Hospital, Florence - Italy