Machine learning powered detection of biological toxins in association with confined lateral flow immunoassay (c-LFA)

Seoyeon Choi ab, Seongmin Ha a, Chanmi Kim b, Cheng Nie a, Ju-Hong Jang c, Jieun Jang cd, Do Hyung Kwon cd, Nam-Kyung Lee cd, Jangwook Lee cd, Ju Hwan Jeong e, Wonjun Yang *c and Hyo-Il Jung *ab
aSchool of Mechanical Engineering, Yonsei University, Seoul, 03722, Republic of Korea. E-mail: uridle7@yonsei.ac.kr
bTheDABOM Inc., Seoul, 03722, Republic of Korea
cBiotherapeutics Translational Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Republic of Korea. E-mail: wonjun@kribb.re.kr
dDepartment of Biomolecular Science, Korea Research Institute of Bioscience and Biotechnology, School of Bioscience, Korea University of Science and Technology, Daejeon, 34113, Republic of Korea
eChem-Bio Technology Center, Agency for Defense Development, Daejeon, 34186, Republic of Korea

Received 22nd April 2024, Accepted 19th July 2024

First published on 5th August 2024


Abstract

Biological weapons, primarily dispersed as aerosols, can spread beyond the targeted area to adjacent regions as wind moves the air. Because biological weapons are among the most destructive threats, there is a growing demand for toxin analysis. Such analysis should be hand-held, rapid, and easy to use, because current methods are time-consuming and require well-trained personnel. Our study demonstrates a novel lateral flow immunoassay with a confined, double-barbell-shaped detection area (the so-called c-LFA) for detecting toxins such as staphylococcal enterotoxin B (SEB), ricin from Ricinus communis (Ricin), and botulinum neurotoxin type A (BoNT-A). Additionally, we explored the integration of machine learning (ML), specifically a toxin chip boosting (TOCBoost) hybrid algorithm, for improved sensitivity and specificity. The ML powered c-LFA concurrently categorized three biological toxin types with an average accuracy as high as 95.5%. To our knowledge, the sensor proposed in this study is the first attempt to utilize ML for the assessment of toxins. The c-LFA provides a versatile and robust platform for the rapid, on-site detection of various toxins, including SEB, Ricin, and BoNT-A. Our platform enables accessible, on-site toxin monitoring by non-experts and can potentially be applied to biosecurity.


1. Introduction

Biological agents are weapons that use organisms such as bacteria, viral pathogens, and toxins to cause fatalities.1–4 Approximately 40 types of biological agents are currently known.3,4 They are easy to weaponize and can cause mass casualties even in small quantities. Staphylococcal enterotoxin B (SEB), a significant contributor to food poisoning, is difficult to detect precisely because its symptoms resemble complications from various respiratory pathogens, resulting in a high risk of misdiagnosis. Ricin, a protein derived from the seeds of the castor plant (Ricinus communis), is notorious for its potent toxicity and ranks among the most lethal plant toxins. Botulinum neurotoxin (BoNT), counted among the most potent toxins worldwide, typically triggers foodborne intoxication but can also enter the body through wounds or the respiratory system. These agents can induce symptoms even at very low (nanogram) levels when individuals inhale them as aerosols, and they have diverse incubation periods and onset times.1,2,5 Consequently, identifying the specific biological agent is challenging, as initial infections can go undetected or present symptoms similar to those of other diseases. Because the chances of recovery are limited and the potential for severe physical harm is high, it is imperative to identify the specific agent swiftly and accurately before initiating treatment, especially in incidents or bioterrorism scenarios. Conventional detection methods, which often depend on laboratory facilities, are typically time-consuming,6,7 which may limit their suitability in critical situations.

Lateral Flow Immunoassay (LFA) has garnered significant attention as a rapid and portable detection platform for various analytes, including biological weapons and toxins.8,9 LFA offers several advantages, including simplicity, affordability, and ease of use, making it suitable for on-site and point-of-care applications. However, typical LFA has limitations in terms of sensitivity and specificity. These limitations undermine the prompt and accurate identification of specific agents. Recent advancements in biosensors and bioelectronics have addressed these limitations and significantly enhanced the capabilities of LFA for measuring biological weapons and toxins.10–13 These advancements involve the integration of confined structures within the LFA platform,14–17 facilitating improved analyte capture and signal amplification.18–22 In a previous report, we developed a strip for signal amplification using a wax barrier.23 The incorporation of nanoparticles, microfluidics, and porous materials into the LFA architecture enables the detection of low concentrations of target analytes with high sensitivity.

Machine learning (ML) enables the creation of prediction models and supports accurate decision-making.24–26 ML is widely used in a variety of fields, such as the classification of biomarkers via biosensors,27 field monitoring,28 the food industry,29 and healthcare.30 It is an essential tool for overcoming the limitations of multi-class learning, classifying concentrations with high predictive accuracy, and improving the analytical precision of biosensors. This is particularly important because collecting extensive datasets from biosensors with different components to enhance detection sensitivity can be time-consuming, labor-intensive, and expensive. Therefore, we use a hybrid algorithm that combines the advantages of several ML methods to achieve large-scale learning effects even with a modest amount of data, prevent overfitting, and mitigate biased learning results.

Our study focuses on the development of the ML powered confined lateral flow immunoassay (c-LFA) for the high-sensitivity and high-specificity measurement of biological weapons and toxins (Fig. 1). We employ a TOCBoost hybrid algorithm that combines least square boost (LSBoost),26 which is specialized for predicting concentration; convolutional neural networks (CNN),31 which are well known for classifying large-scale image datasets; and support vector machines (SVM),32 which are specialized in multi-class classification. The ML powered c-LFA classified three types of biological toxins with an average accuracy reaching 95.5%. To the best of our knowledge, the current study is the first to apply the confinement strip together with ML technology to toxin detection. This study underscores the importance of ongoing R&D in this field to safeguard public health and security. The ML powered c-LFA has the potential to protect civilians and soldiers in biological warfare, terrorism, military, healthcare, and emergency response scenarios.


Fig. 1 Schematic illustration of the ML powered confined lateral flow immunoassay (c-LFA) process. Biological samples (either SEB, Ricin, or botulinum neurotoxin type A (BoNT-A)) are loaded into the c-LFA, and images are acquired. The data mining process uses a dataset of these images and the ImageJ software program to build an intensity dataset of c-LFA, expressed according to toxin type and concentration. Then, the morphological characteristics of the marker and the eigenvector values of multiple indexes are extracted through pre-processing the datasets. The classification results of the toxin classes are analyzed using the toxin chip boosting (TOCBoost) hybrid algorithm.

2. Materials and methods

The SEB antibody, botulinum neurotoxin type A (BoNT-A) antibody, and antigens were developed by the Korea Research Institute of Bioscience and Biotechnology (KRIBB, Korea). The AntoXa Corporation (Canada) provided the Ricin antibody. Detailed descriptions of all reagents and materials, the preparation of the detection antibody–gold nanoparticle (AuNP) conjugate, and the extraction of morphological characteristics are provided in the ESI. Acquiring the toxin products was difficult because of cost barriers, limited accessibility owing to restricted quantities, and eventual discontinuation. Therefore, the initial testing procedure used human serum albumin (HSA, Sigma-Aldrich, USA).

2.1 Design and fabrication of lateral flow immunoassay strip

The sample pad, nitrocellulose membrane, and absorbent pad were glued onto a backing card. The conventional straight strip (typical LFA) was manufactured using a cutting machine (Taewoo, TBC-50ND, Korea) with a width of 6 mm and a speed of 30%. For the c-LFA, the barbell and double barbell were fabricated using laser cutting (Taejin Acrylic, Korea) at a width of 3 mm (Fig. S1).

Capture antibodies were used at the concentration of 1 mg mL−1 for each antibody (SEB, Ricin, and BoNT-A). Capture and control antibodies were dispensed onto a nitrocellulose membrane at the detection area of the test and control spots using a pipette (Gilson, USA) at 0.5 μL. The membranes were dried at 37 °C and a humidity of less than 30% for 2 hours (SERIMA, S-THSC31R1, Korea). After drying, the strips were stored at room temperature and humidity of less than 30% in a desiccator (GOODSGOOD, ESD-600S, Korea).

2.2 Assay performance

For the experiments, a typical LFA and a c-LFA were used to analyze each standard solution and tap water spiked with specific analytes. Standard solutions were prepared for each analyte (SEB, Ricin, and BoNT-A). Standard solutions of SEB were prepared at concentrations of 0, 10, 25, 50, and 100 ng mL−1. Standard solutions of Ricin and BoNT-A were prepared at concentrations of 0, 10, 25, 50, 100, and 200 ng mL−1 (n = 5).

Tap water was prepared to dilute the SEB, Ricin, and BoNT-A samples. The tap water was filtered using a 0.45 μm syringe filter. The SEB solutions were spiked into tap water, and the concentrations were fixed at 0, 10, 25, 50, and 100 ng mL−1 for all dilutions (n = 5). Similarly, solutions of Ricin and BoNT-A were spiked into tap water, and the concentrations of Ricin and BoNT-A were fixed at 0, 10, 25, 50, 100, and 200 ng mL−1 (n = 5).

The standard solution was mixed with the antibody–AuNP conjugate (1:1, v/v). After a 2-minute incubation, the mixture (100 μL) was loaded onto the sample part of the cartridge. Following a 20-minute incubation, the c-LFAs were scanned using an iBright™ CL1500 Imaging System (Thermo Fisher Scientific, Korea). Each sample was tested at least three times by the testing laboratories. Digital images were used to quantify the color intensity of the test spots using the ImageJ software (NIH, Bethesda, MD, USA).

2.3 Dataset preparation

The dataset used for training in this study comprises images of the toxin marker region of c-LFA and a numerical dataset containing various intensities, including several concentrations of SEB, Ricin, and BoNT-A toxins. The training set of the algorithm was preprocessed using ImageJ, following standard toxin concentrations such as 0, 10, 25, 50, and 100 ng mL−1. This preprocessing ensured consistent image resolution and preserved the normality of the numerical data, resulting in a dataset with 15 classes. Seventy percent of this dataset was allocated for training, while the remaining 30% was used for validation.
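As an illustration of the 70%/30% allocation described above, the following Python sketch splits a labeled dataset class by class. It is a minimal sketch under stated assumptions: the source does not say whether the split was stratified, and the function name `stratified_split`, the label format, and the count of 10 images per class are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class contributes train_fraction to training.

    Assumption: per-class (stratified) splitting; the source only states 70%/30%.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        val_idx.extend(members[cut:])
    return sorted(train_idx), sorted(val_idx)


TOXINS = ["SEB", "Ricin", "BoNT-A"]
STANDARD_NG_PER_ML = [0, 10, 25, 50, 100]
IMAGES_PER_CLASS = 10  # hypothetical count, for illustration only

# 3 toxins x 5 standard concentrations = 15 classes, as in the text.
labels = [
    f"{toxin}-{conc}"
    for toxin in TOXINS
    for conc in STANDARD_NG_PER_ML
    for _ in range(IMAGES_PER_CLASS)
]
train_idx, val_idx = stratified_split(labels)
print(len(set(labels)), len(train_idx), len(val_idx))
```

Splitting within each class keeps all 15 classes represented in both subsets, which matters when each class has only a few images.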

For the blind testing of the algorithm, a separate test set of 9 classes was created using non-standard concentrations (17.5, 37.5, and 75 ng mL−1). This test set was arranged randomly and labeled without using standard labels. Further details for each dataset can be found in Table S1.

2.4 Architecture of the TOCBoost hybrid algorithm

To extract concentration-specific features of each toxin class and normalize intensity values in the detection area, we employed a least square boost (LSBoost) hybrid algorithm26 specialized in concentration prediction. This involved conducting a correlation matrix analysis with dimensions of 25 × 50 × 3 rows and columns for each concentration, utilizing a parallel learning structure, and implementing an automatic weighting function. The hyperparameters were set with a leaf size of 8, 500 epochs, 50 iterations, and a learning rate of 0.01. Also, to enhance the accuracy of c-LFA image classification, we applied a hybrid algorithm that combines a convolutional neural network (CNN) model,38,39 which is well known for image classification, with an SVM classifier32 specialized in multiclass classification. The hyperparameters for the training set were configured as follows: input image size of 724 × 242 × 3, batch size of 32, 500 epochs, 50 iterations, and learning rate of 0.02. For the test set, the hyperparameters were set to an input image size of 1533 × 719 × 3, 1000 epochs, and 100 iterations (Fig. 1).
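The idea behind least-squares boosting can be sketched in a few lines of Python. This is not the authors' MATLAB implementation: it is a minimal illustration that fits one-split regression stumps to the residuals of a one-dimensional input, reusing the learning rate of 0.01 and 500 rounds from the text only as example settings. The function names and the intensity values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    cuts = sorted(set(xs))
    for lo, hi in zip(cuts, cuts[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def lsboost_fit(xs, ys, n_rounds=500, learning_rate=0.01):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def lsboost_predict(model, x):
    """Sum the shrunken stump outputs on top of the base prediction."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


def mse(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Hypothetical normalized test-spot intensities and their concentrations (ng/mL).
intensities = [0.05, 0.12, 0.25, 0.41, 0.70]
concentrations = [0.0, 10.0, 25.0, 50.0, 100.0]

model = lsboost_fit(intensities, concentrations)
baseline_mse = mse(concentrations,
                   [sum(concentrations) / len(concentrations)] * len(concentrations))
boosted_mse = mse(concentrations, [lsboost_predict(model, x) for x in intensities])
print(baseline_mse, boosted_mse)
```

A small learning rate shrinks each stump's contribution, so many rounds are needed, but the training error cannot increase from one round to the next.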

To validate the reliability of the results of classifying the c-LFA types and concentrations using the hybrid TOCBoost algorithm, we employed K-fold cross-validation (KFCV).40 The dataset was divided into 10 folds, with seven folds allocated to the training set and the remaining three used for validation. Fig. S2 illustrates the data annotation workflow used to generate both the training and test datasets. All the experiments were performed using MATLAB R2022b on a PC equipped with an Intel Core i9-12900KS CPU running at 3.40 GHz, 32 GB of RAM, and an NVIDIA GeForce RTX 3090 Ti graphics card.
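The 10-part division with seven parts for training and three for validation can be sketched as follows. This is only one possible reading: the text does not state how the validation parts were chosen or rotated, so the rotation scheme, the function names, and the dataset size of 150 samples are assumptions.

```python
import random


def make_folds(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def rotating_splits(folds, n_val_folds=3):
    """Yield (train, validation) index lists, rotating the validation window.

    Assumption: three consecutive folds form the validation set in each round.
    """
    k = len(folds)
    for start in range(k):
        val_ids = {(start + offset) % k for offset in range(n_val_folds)}
        train = [i for f, fold in enumerate(folds) if f not in val_ids for i in fold]
        val = [i for f in sorted(val_ids) for i in folds[f]]
        yield train, val


folds = make_folds(150)  # hypothetical dataset size
splits = list(rotating_splits(folds))
for train, val in splits[:2]:
    print(len(train), len(val))
```

Rotating the validation window lets every sample serve in validation in some rounds and in training in others, so the reported accuracy does not depend on a single split.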

2.5 Extraction of morphological characteristic

Evaluation metrics were used to verify the regression results after extracting features from the intensity values of the M-Chip dataset. The algorithm's regression learning results were assessed quantitatively using the mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and relative root mean square error (RRMSE), calculated using eqn (1), (2), (3), and (4), respectively. The quality of the model predictions was assessed from the percentage value of RRMSE, categorized into three levels: a value between 0% and 10% indicates good performance, a value between 11% and 20% suggests poor performance, and a value between 21% and 100% indicates a failure in the learning process.33,34
 
image file: d4an00593g-t1.tif(1)
 
image file: d4an00593g-t2.tif(2)
 
image file: d4an00593g-t3.tif(3)
 
image file: d4an00593g-t4.tif(4)
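The four metrics can be computed as in the Python sketch below. MAE, MSE, and RMSE follow their standard definitions. The RRMSE normalization (RMSE divided by the mean observed value, as a percentage) is one common convention and is an assumption here, since the exact form of eqn (4) is not reproduced in the text. The function names and the example values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, MSE, RMSE, and RRMSE (%) for paired observations."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_observed = sum(observed) / n
    if mean_observed == 0:
        raise ValueError("RRMSE is undefined when the mean observed value is zero")
    # Assumption: RRMSE normalizes RMSE by the mean observed value.
    rrmse = 100.0 * rmse / mean_observed
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "RRMSE": rrmse}


def rrmse_category(rrmse_percent):
    """Map an RRMSE percentage onto the three levels described in the text."""
    if rrmse_percent <= 10:
        return "good"
    if rrmse_percent <= 20:
        return "poor"
    return "failure"


# Hypothetical normalized intensities, for illustration only.
observed = [1.0, 1.0, 1.0, 1.0]
predicted = [1.02, 0.98, 1.01, 0.99]
metrics = regression_metrics(observed, predicted)
print(metrics, rrmse_category(metrics["RRMSE"]))
```

The text leaves small gaps between the bands (for example between 10% and 11%); this sketch assigns such values to the higher band.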

To analyze the correlation between all toxin types, we constructed a correlation coefficient specialized for multi-class correlation analysis.35 The correlation coefficient was visualized as a circular pattern using eqn (5). The output layer of the TOCBoost algorithm incorporates these verification techniques.

 
image file: d4an00593g-t5.tif(5)

We employed a length scale, which is utilized for object size detection, to calculate the distinct vectors for each class by eqn (6). The MATLAB Toolbox's superpixel36 was used to identify the detection area within the c-LFA images. The marker coordinates were identified as spatial vectors.37 A blue grid pattern represents the distribution density.

This can be expressed as follows:

 
image file: d4an00593g-t6.tif(6)
here, ‘x’ and ‘y’ represent the coordinates containing contour information for the spot boundaries, while ‘n’ signifies the number of coordinates for the spot areas detected by the superpixels.

The c-LFA images were split into red, green, and blue components to obtain the RGB values. The minimum and maximum pixel values of each color image within the range of 0–255 were recorded and subsequently calculated using eqn (7), as follows:

 
image file: d4an00593g-t7.tif(7)
here, ‘n’ represents the number of color channels in the c-LFA image. The hidden layer of the CNN-SVM incorporates these two feature extraction techniques.

2.6 Simulation condition

The fluid flow within the strip was simulated and analyzed for velocity and streamline distribution in the channel using COMSOL Multiphysics. The models were based on a straight channel with dimensions of 17 mm (height) and 6 mm (width), with modifications to the detection area for the barbell and double-barbell structures. All the parts of the strip were simulated under the same conditions.

2.7 Statistical analysis

The data presented in this study are expressed as mean ± standard error of the mean (n = 5). Statistical analysis was conducted using an unpaired t-test or one-way ANOVA. A p-value < 0.05 was considered statistically significant. In the graphs, asterisks indicate statistical significance as follows: *p < 0.05, **p < 0.01, and ***p < 0.001. A p-value greater than 0.05 was considered not statistically significant and is represented as “ns”.

3. Results and discussion

3.1 Design and fabrication of confined lateral flow immunoassay (c-LFA) strip

The strip sections, except for the membrane, had uniform conditions owing to their negligible impact. Fluid velocity significantly influenced antigen–antibody interaction time, with higher velocity shortening the interaction time and reducing sensitivity, while lower velocity increased sensitivity. The overall flow distribution and maximum velocity of the various membrane models were obtained by considering these factors. The simulation results are aligned with those in Fig. 2A, showing that narrower detection areas lead to an increased flow velocity and intensity of the test line. The test line intensity of the double barbell was higher compared to the straight and barbell channels by about 2-fold when using a 5 μg mL−1 concentration of HSA (Fig. 2B).
Fig. 2 (A) The barbell and double barbell channels exhibited faster fluid flow compared with the straight channel. (B) The intensity of the double barbell was higher compared to the straight and barbell channels when using a 5 μg mL−1 concentration of HSA. (C) Optimization of sample dilution based on the concentration of HSA. (D) The calibration curve of HSA at various concentrations (i.e., 0, 0.5, 1, 2.5, 5, 10, 25, and 50 μg mL−1). Quantitative measurement of (E) SEB, (F) Ricin, and (G) BoNT-A in standard solution and tap water using the ML powered c-LFA. The intensity of the test spot was determined after subtracting the blank signal (n = 5).

Additionally, using HSA, optimal sample dilution ratios were determined (Fig. 2C), and measurements were conducted at concentrations of 0, 0.5, 1, 2.5, 5, 10, 25, and 50 μg mL−1 on both the straight and double barbell strips for comparison (Fig. 2D). The experimental results led to the selection of the double barbell structure, which could distinguish between concentrations, even at low levels (n = 5). Signal intensities were measured using ImageJ software.

3.2 Analytical performance

In the analytical performance analysis, we evaluated the ML-powered c-LFA using spiked tap water and a standard solution (Fig. 2E–G) to demonstrate its efficacy in real-world scenarios. All data represent the mean ± standard deviation of at least triplicate measurements.

By comparing the standard curves of our c-LFA in the standard solution and spiked tap water, we established a robust correlation for SEB concentration in the standard solution (Fig. 2E). With increasing concentration of spiked tap water, we observed a noticeable reduction in the color intensity within the test spot. This decrease in intensity can be attributed to the significant interfering effect of tap water (Table S2), along with the antigen-blocking effect within the detection area.

In the standard curve of our strip, conducted in both standard solution and spiked tap water, we validated a strong correlation between ricin concentration levels in these respective solutions (Fig. 2F). Importantly, this comparison revealed no concerns regarding blocking or tap water interference within the detection area. In the standard curve of our strip constructed in both standard solution and spiked tap water, we established a robust correlation between BoNT-A concentration levels in these distinct solutions (Fig. 2G).

These findings highlight the potential applicability of our toxin measurement system in resource-constrained areas and regions affected by conflict. The attributes of our toxin strip, including its power-free operation, portability, and user-friendly testing procedures, offer substantial advantages in the field (Table 1).

Table 1 Recovery of three toxins at different concentrations in tap water
Target Spiked concentration (ng mL−1) Recovery (%) RSD (%)
SEB 10.0 91.1 0.8
SEB 25.0 107.9 18.3
SEB 50.0 94.9 4.9
SEB 100.0 104.2 4.9
Ricin 10.0 N/A N/A
Ricin 25.0 112.0 24.0
Ricin 50.0 84.5 11.1
Ricin 100.0 104.2 8.5
Ricin 200.0 99.7 6.2
BoNT-A 10.0 142.3 18.2
BoNT-A 25.0 111.2 11.3
BoNT-A 50.0 91.6 7.1
BoNT-A 100.0 100.5 3.1
BoNT-A 200.0 100.1 1.6
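Recovery and relative standard deviation (RSD) values of the kind reported in Table 1 are conventionally computed from replicate measurements, as in the Python sketch below. The text does not give the formulas, so the usual definitions are assumed: recovery is the mean measured concentration divided by the spiked concentration, and RSD is the sample standard deviation divided by the mean, both as percentages. The replicate values are hypothetical.

```python
import statistics


def recovery_and_rsd(measured, spiked):
    """Return (recovery %, RSD %) for replicate measurements of one spiked level.

    Assumptions: recovery = mean(measured) / spiked * 100 and
    RSD = sample standard deviation (n - 1) / mean * 100.
    """
    mean_measured = statistics.mean(measured)
    recovery = 100.0 * mean_measured / spiked
    rsd = 100.0 * statistics.stdev(measured) / mean_measured
    return recovery, rsd


# Hypothetical replicate readings (ng/mL) for a 50 ng/mL spike, n = 5.
replicates = [46.0, 48.0, 50.0, 52.0, 54.0]
recovery, rsd = recovery_and_rsd(replicates, spiked=50.0)
print(f"recovery = {recovery:.1f}%, RSD = {rsd:.1f}%")
```

A recovery near 100% with a low RSD indicates that the tap water matrix neither suppresses nor inflates the signal at that concentration.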


3.3 Correlation coefficient and regression analysis of multiple concentration classes

To increase classification accuracy, correlation analysis between multiple concentration classes was essential to identify and extract features effective for learning. The analysis of correlation coefficients for the concentrations of all toxin classes using the TOCBoost algorithm is shown in Fig. 3. The correlation coefficient pattern represented the relationship between the standard and non-standard concentrations of the three toxins and the intensity of c-LFA, calculated using eqn (5). Each concentration's correlation coefficient ranged between −1 and 1; the closer to 1, the simpler the correlation, and the closer to −1, the more complex it was.35 Higher values were highlighted in blue, and lower values in red (Fig. 3). The analysis showed that extractable features increased when classifying three toxins simultaneously rather than a single toxin. Moreover, the 14-class correlation coefficient from SEB-0 ng mL−1 to Ricin-50 ng mL−1 was −0.1, and including all 24 classes resulted in a correlation coefficient of −0.9, indicating the highest number of potential features. This approach successfully extracted features that improved classification accuracy, including all classes with concentrations not matching the standard curve values. The RMSE values of each concentration-specific regression of the three classes of toxins on the intensity of the c-LFA were calculated using eqn (4). Analyzing the correlation of the TOCBoost algorithm, RMSE tended to increase as the concentration value for each toxin type increased. Specifically, SEB-100 ng mL−1 exhibited the highest error with an RMSE of 0.5. These results indicated that SEB-100 ng mL−1 contained the most data with overlapping intensity values. The analyzed RMSE showed a sequential increase within the interval 0.11 and 0.5 in the sub-concentration range of 0 to 100 ng mL−1 for each toxin type. Despite this, the maximum RMSE among all classes remained below 0.5, demonstrating high accuracy and precision. 
Thus, successful classification was achieved at the evaluation interval with an RRMSE of less than 10% for all concentrations of each toxin class (see Table 2).
Table 2 Evaluation index of TOCBoost algorithm for predicting the concentration of three toxin groups at various concentration values (0–100 ng mL−1)
Concentration (ng mL−1) SEB RMSE (10−3) SEB RRMSE (%) SEB Accuracy (RRMSE < 10%) Ricin RMSE (10−3) Ricin RRMSE (%) Ricin Accuracy (RRMSE < 10%) BoNT-A RMSE (10−3) BoNT-A RRMSE (%) BoNT-A Accuracy (RRMSE < 10%)
0.0 13.64 1.4 Excellent 16.60 1.7 Excellent 11.97 1.2 Excellent
10.0 11.30 1.1 Excellent 17.63 1.8 Excellent 17.07 1.7 Excellent
17.5 12.86 1.3 Excellent 16.90 1.7 Excellent 18.71 1.9 Excellent
25.0 13.30 1.3 Excellent 15.23 1.5 Excellent 19.98 2.0 Excellent
37.5 19.45 2.0 Excellent 18.07 1.8 Excellent 21.36 2.1 Excellent
50.0 25.78 2.6 Excellent 23.60 2.4 Excellent 12.85 1.3 Excellent
75.0 38.19 3.8 Excellent 23.85 2.4 Excellent 19.63 1.9 Excellent
100.0 42.68 4.3 Excellent 23.46 2.3 Excellent 20.32 2.0 Excellent



Fig. 3 Correlation coefficient results for standard (0, 10, 25, 50, and 100 ng mL−1) and non-standard (17.5, 37.5, and 75 ng mL−1) concentrations of three toxin classes (SEB, Ricin, and BoNT-A) using the TOCBoost hybrid algorithm.

3.4 Eigenvectors of each class based on marker morphological characteristics

We employed two equations to extract the morphological characteristics of the marker at varying concentrations on the c-LFA, enabling the definition of the eigenvectors and RGB indices41 for each class. To extract the morphological characteristics of the circular markers for each concentration class of the three types of toxins, the Superpixel36 tool was used to compute the spatial vectors located on the outline of the markers. By comparing the distinct eigenvector ranges for each class using the length scale calculated in eqn (6) with the identical concentration values for each class, it became evident that each class possessed unique eigenvector values. Additionally, the RGB index technique was utilized to extract distribution characteristics of different RGB values latent in markers at each concentration. As per eqn (7), the calculated RGB index represents the point with the highest RGB value in the marker pixel area (10 × 10) based on the concentration of each class, shown in black, indicating that each toxin class reacted at different concentration ranges. Specifically, the RGB index value increased significantly from 37.5 ng mL−1 for the SEB and BoNT-A classes and 75 ng mL−1 for the Ricin class (Fig. 4). Our analysis revealed that BoNT-A exhibited the most circular morphology, whereas the SEB and Ricin classes displayed morphological characteristics resembling a left half-oval shape. Consequently, our evaluation of the results obtained from the markings, informed by the adjusted RGB index, revealed distinct morphological features for each class. Moreover, an increase in the RGB index values (>10) was also observed with higher concentrations for the three types of toxins (Table S4).
Fig. 4 Eigenvectors of three classes extracted from morphological characteristics. We define the length scales and RGB indices for each class using two equations to extract the morphological characteristics of the marker at various concentrations on the c-LFA. Initially, we observed that the length scale calculated by eqn (6) exhibited different eigenvector values among the three classes (SEB, Ricin, BoNT-A). Additionally, eqn (7) allowed us to represent the point with the highest RGB index value in the marker area based on the concentration in black. This revealed the distinct morphological features for each class based on the adjusted RGB index (detailed information is described in Table S4).

3.5 Classification of c-LFA using the machine learning based TOCBoost algorithm

In this study, we evaluated the classification performance of the TOCBoost hybrid algorithms for three toxin strips: SEB, Ricin, and BoNT-A. We utilized multi-dimensional intensity data extracted from c-LFA images to use a CNN-based machine-learning model for classifying various morphological states within the detection area. The c-LFAs were employed as the sample dataset for comparative analysis, allowing us to assess the sensitivity of the model compared to conventional methods such as SVM classification, visual inspection, and semiquantitative detection.

The TOCBoost algorithm was trained using a dataset based on standard concentration curve values, and a blind test was subsequently performed using a non-standard concentration dataset. For the overall dataset, we presented the confusion matrix for the classification results of all classes, including standard and non-standard concentrations of SEB (Fig. 5A), and demonstrated the classification performance using ROC plots42 (Fig. 5B). Similarly, to assess the performance of the learning model, confusion matrix results (Fig. 5C) and ROC plots for the classification of Ricin classes were provided (Fig. 5D). Finally, confusion matrix results and ROC plots for the classification of BoNT-A classes were also included (Fig. 5E and F, respectively).
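The area under a ROC curve for one class against all others can be computed directly from classifier scores using the rank-based (Mann-Whitney) formulation, as in the Python sketch below. It illustrates how a per-class AUC is obtained in general and is not the authors' MATLAB code. The function name, the scores, and the labels are hypothetical.

```python
def auc_one_vs_rest(scores, is_positive):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, flag in zip(scores, is_positive) if flag]
    negatives = [s for s, flag in zip(scores, is_positive) if not flag]
    if not positives or not negatives:
        raise ValueError("both positive and negative samples are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for one concentration class versus all other classes.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
is_positive = [True, True, False, True, False, False]
auc = auc_one_vs_rest(scores, is_positive)
print(round(auc, 3))
```

An AUC of 1.00 means every sample of the class scored above every sample outside it, while 0.5 corresponds to chance-level ranking.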


Fig. 5 Upper panel graphs represent the integrated results of classification learning using the TOCBoost algorithm. (A) Confusion matrix of the learning results for standard and non-standard concentrations of the SEB class. (B) ROC plot of the classification performance for all SEB concentration classes using the TOCBoost algorithm. The AUC values for each concentration are SEB-0 and 17.5 ng mL−1: 1.00; SEB-10 ng mL−1: 0.96; SEB-37.5 ng mL−1: 0.99; SEB-50 and 100 ng mL−1: 0.98; and SEB-75 ng mL−1: 0.97. The labels for each class are differentiated using color. (C) Confusion matrix of the classification learning results for all concentration classes of the Ricin class. (D) ROC plot of the classification performance for the Ricin class using the TOCBoost algorithm. The AUC values for each concentration are Ricin-0, 10, 50, and 100 ng mL−1: 0.99; Ricin-17.5, 25, and 75 ng mL−1: 0.97; and Ricin-37.5 ng mL−1: 0.98. The labels for each class are color-coded. (E) Confusion matrix of the classification learning results for all concentrations of the BoNT-A class. (F) ROC plot of the classification performance for the BoNT-A class using the TOCBoost algorithm. The AUC values for each concentration are BoNT-A-0 ng mL−1: 0.98; BoNT-A-10, 25, and 50 ng mL−1: 0.99; BoNT-A-17.5 ng mL−1: 0.96; BoNT-A-37.5 ng mL−1: 0.91; and BoNT-A-75 and 100 ng mL−1: 0.95. The labels for each class are differentiated using color.

The confusion matrices for classification based on the length scale and RGB index showed an average accuracy of 96.5% for SEB, 92.1% for Ricin, and 92.2% for BoNT-A. The sensitivity and specificity of the ML-powered c-LFA were calculated within the TOCBoost hybrid algorithm, whereas those of the typical LFA and c-LFA were extracted using ImageJ software. The results are summarised in Table 3. The complete dataset is presented in ESI Fig. S3 and Table S3.

Table 3 Comprehensive summary of the sensitivity and specificity
Class Sensitivity^a (%) Specificity^b (%)
Typical LFA c-LFA ML-powered c-LFA Typical LFA c-LFA ML-powered c-LFA
SEB 70.0 89.7 96.5 66.7 85.7 94.9
Ricin 82.9 95.8 93.4 35.7 54.5 86.2
BoNT-A 88.6 95.5 95.7 38.5 66.7 88.1
^a Sensitivity (%) = true positives/(true positives + false negatives).
^b Specificity (%) = true negatives/(true negatives + false positives).
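The sensitivity and specificity definitions in the Table 3 footnotes can be applied to a multi-class confusion matrix by treating one class as positive and all others as negative. The Python sketch below shows this one-vs-rest calculation. The function name and the 3 × 3 matrix of counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(confusion, class_index):
    """One-vs-rest sensitivity and specificity (%) for one class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp
    fp = sum(confusion[row][class_index] for row in range(n_classes)) - tp
    tn = sum(sum(row) for row in confusion) - tp - fn - fp
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts: rows are true classes, columns are predicted classes.
confusion = [
    [18, 1, 1],
    [2, 17, 1],
    [0, 2, 18],
]
sens, spec = sensitivity_specificity(confusion, class_index=0)
print(sens, spec)
```

Computing both values per class makes it visible when a classifier detects a class reliably (high sensitivity) but also assigns other classes to it too often (low specificity).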


The TOCBoost algorithm demonstrated strong classification performance, achieving AUC values ranging from 0.97 to 1.00 for SEB, 0.97 to 0.99 for Ricin, and 0.91 to 1.00 for BoNT-A. This indicates that the hybrid algorithm performed well even for concentrations that do not match the standard curve values.

4. Conclusions

In this study, we developed a c-LFA for the highly sensitive detection of biological toxins such as SEB, Ricin, and BoNT-A and subsequently improved the classification accuracy using a machine learning (ML)-based hybrid algorithm. Our c-LFA can be used outside laboratory settings with a simple method, potentially by non-experts.

This was achieved by integrating confined c-LFAs within a designated detection area. By adjusting the shape of the detection area by modulating the width, and length, and normalization of Intensity values, we enabled semi-quantitative sensitivity measurements for the three toxin biomarkers. This approach surmounts inherent sensitivity limitations and upholds the simplicity of well-established lateral flow tests.

We employed the TOCBoost hybrid algorithm to categorize three toxin types with an average accuracy as high as 95.5%. In addition, all types of toxins were predicted with RMSE values of less than 0.5, depending on the concentration. To the best of our knowledge, no prior study has assessed toxins using an ML-based c-LFA that simultaneously extracts two morphological features and performs multi-class concentration prediction. While previous studies mainly relied on single morphological features or limited prediction models, we simultaneously extracted and analyzed the morphological contour information of the toxin markers and their color change characteristics. Accordingly, when classifying the three toxin classes simultaneously by combining c-LFA and ML, we analyzed the correlation between the toxin types and extracted potential features helpful for classification learning, which increased the average accuracy of toxin type classification. Moreover, in blind tests with non-standard curve concentrations, the TOCBoost algorithm achieved an average accuracy of 90.5%. This method of accurately classifying the concentrations of different toxin types therefore highlights the importance of c-LFA image-based classification and represents a notable advance in the field.
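
For reference, the RMSE quoted above is the square root of the mean squared difference between predicted and true values. The short sketch below shows the calculation on hypothetical concentration values; the numbers and the function name are illustrative assumptions, and the scale on which the study's RMSE was computed is not stated here.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical true and predicted concentrations (illustrative only).
true_conc = [10.0, 25.0, 50.0, 75.0]
pred_conc = [10.3, 24.6, 50.5, 74.8]

value = rmse(true_conc, pred_conc)
print(f"RMSE = {value:.3f}")
```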

These findings underscore the practical utility of our sensor for toxin detection, with potential applications in identifying other biological threats, including those relevant to the military, public health, national security, and emergency response. This methodology also holds promise, especially in the military sector and in the commercial LFA market for point-of-care applications.

Author contributions

S. C. conceived the idea and designed the experiments. S. H., C. K., and C. N. visualized the data. J.-H. J., J. J., D. H. K., N.-K. L., and J. L. developed the antibodies and antigens. J. H. J. validated the data. W. Y. and H.-I. J. supervised the experiments. The manuscript was written through the contributions of all authors. All authors have approved the final version of the manuscript.

Data availability

The code and additional data supporting the figures are freely available on GitHub (https://github.com/SeongminHA/Enhanced-Toxin-Detection-Strategies-Real-time-Measurement-of-Biological-Toxin-Detection-with-et-al.)

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

The authors would like to acknowledge AntoXa Corporation for their contribution to the process of this work. This research was supported by the Defense Acquisition Program Administration (ADD-911255202).

References

  1. F. Detrick and M. Frederick, Usamriid's Medical Management of Biological Casualties Handbook, 2004.
  2. H. K. Kim, E. Philipp and H. Chung, North Korea's Biological Weapons Program, 2017.
  3. H. Sohrabi, M. R. Majidi, P. Khaki, A. Jahanban-Esfahlan, M. de la Guardia and A. Mokhtarzadeh, State of the art: Lateral flow assays toward the point-of-care foodborne pathogenic bacteria detection in food samples, Crit. Rev. Food Sci. Nutr., 2022, 21(2), 1868–1912.
  4. A. V. Orlov, S. L. Znoyko, V. R. Cherkasov, M. P. Nikitin and P. I. Nikitin, Multiplex Biosensing Based on Highly Sensitive Magnetic Nanolabel Quantification: Rapid Detection of Botulinum Neurotoxins A, B, and E in Liquids, Anal. Chem., 2016, 88, 10419–10426.
  5. H. P. Cheng and H. S. Chuang, Rapid and Sensitive Nano-Immunosensors for Botulinum, ACS Sens., 2019, 4, 1754–1760.
  6. X. Cai, Y. Luo, C. Zhu, D. Huang and Y. Song, Rhodium nanocatalyst-based lateral flow immunoassay for sensitive detection of staphylococcal enterotoxin B, Sens. Actuators, B, 2022, 367, 132066.
  7. L. Feldberg, E. Elhanany, O. Laskar and O. Schuster, Rapid, Sensitive and Reliable Ricin Identification in Serum Samples Using LC-MS/MS, Toxins, 2021, 13, 79.
  8. D. Stern, D. Pauly, M. Zydek, C. Müller, M. A. Avondet, S. Worbs, F. Lisdat, M. B. Dorner and B. G. Dorner, Simultaneous differentiation and quantification of ricin and agglutinin by an antibody-sandwich surface plasmon resonance sensor, Biosens. Bioelectron., 2016, 78, 111–117.
  9. A. Sena-Torralba, R. Álvarez-Diduk, C. Parolo, A. Piper and A. Merkoçi, Toward Next Generation Lateral Flow Assays: Integration of Nanomaterials, Chem. Rev., 2022, 122, 14881–14910.
  10. S. Worbs, M. Skiba, M. Söderström, M. L. Rapinoja, R. Zeleny, H. Russmann, H. Schimmel, P. Vanninen, S.Å Fredriksson and B. G. Dorner, Characterization of Ricin and R. communis Agglutinin Reference Materials, Toxins, 2015, 7, 4906–4934.
  11. K. H. Ching, A. Lin, J. A. McGarvey, L. H. Stanker and R. Hnasko, Rapid and selective detection of botulinum neurotoxin serotype-A and -B with a single immunochromatographic test strip, J. Immunol. Methods, 2012, 380, 23–29.
  12. L. Babrak, A. Lin, L. H. Stanker, J. McGarvey and R. Hnasko, Rapid Microfluidic Assay for the Detection of Botulinum Neurotoxin in Animal Sera, Toxins, 2016, 8, 13.
  13. C. C. Tam, A. R. Flannery and L. W. Cheng, A Rapid, Sensitive, and Portable Biosensor Assay for the Detection of Botulinum Neurotoxin Serotype A in Complex Food Matrices, Toxins, 2018, 10, 476.
  14. K. Misawa, T. Yamamoto, Y. Hiruta, H. Yamazaki and D. Citterio, Text-Displaying Semiquantitative Competitive Lateral Flow Immunoassay Relying on Inkjet-Printed Patterns, ACS Sens., 2020, 5, 2076–2085.
  15. J. Turner, E. Lay, U. Jungwirth, V. Varenko, H. Gill, P. Estrela and H. Leese, 3D-Printed Hollow Microneedle-Lateral Flow Devices for Rapid Blood-Free Detection of C-Reactive Protein and Procalcitonin, Adv. Mater. Technol., 2023, 2300259.
  16. D. Hristov, C. Rodriguez-Quijada, J. Gomez-Marquez and K. Hamad-Schifferli, Designing Paper-Based Immunoassays for Biomedical Applications, Sensors, 2019, 19, 554.
  17. V. G. Panferov, N. A. Ivanov, D. Brinc, A. Fabros and S. N. Krylov, Electrophoretic Assembly of Antibody–Antigen Complexes Facilitates 1000 Times Improvement in the Limit of Detection of Serological Paper-Based Assay, ACS Sens., 2023, 8, 1792–1798.
  18. X. He, S. McMahon and R. Rasooly, Evaluation and comparison of three enzyme-linked immunosorbent assay formats for the detection of ricin in milk and serum, Biocatal. Agric. Biotechnol., 2012, 1, 105–109.
  19. X. Jia, C. Wang, Z. Rong, J. Li, K. Wang, Z. Qie, R. Xiao and S. Wang, Dual dye-loaded Au@Ag coupled to a lateral flow immunoassay for the accurate and sensitive detection of Mycoplasma pneumoniae infection, RSC Adv., 2018, 8, 21243.
  20. J. Hwang, S. Lee and J. Choo, Application of a SERS-based Lateral Flow Immunoassay Strip for Rapid and Sensitive Detection of Staphylococcal Enterotoxin B, Nanoscale, 2016, 8, 11418–11425.
  21. K. H. Wu, W. C. Huang, R. H. Shyu and S. C. Chang, Silver nanoparticle-base lateral flow immunoassay for rapid detection of Staphylococcal enterotoxin B in milk and honey, J. Inorg. Biochem., 2020, 210, 111163.
  22. R. H. Shyu, H. F. Shyu, H. W. Liu and S. S. Tang, Colloidal gold-based immunochromatographic assay for detection of ricin, Toxicon, 2022, 40, 255–258.
  23. S. Choi, J. H. Lee, B. S. Kwak, Y. W. Kim, J. S. Lee, J. S. Choi and H. I. Jung, Signal amplification in a microfluidic paper based analytical device (u-PAD) by confinement of the fluidic flow, BioChip J., 2015, 9(2), 116–123.
  24. S. Kim, M. H. Lee, T. Wiwasuku, A. S. Day, S. Youngme, D. S. Hwang and J. Y. Yoon, Human sensor-inspired supervised machine learning of smartphone-based paper microfluidic analysis for bacterial species classification, Biosens. Bioelectron., 2021, 188, 113335.
  25. J. Park, S. M. Ha, J. Kim, J. W. Song, K. A. Hyun, T. Kamiya and H. I. Jung, Classification of circulation tumor cell clusters by morphological characteristics using conventional neural network-support vector machine, Sens. Actuators, B, 2024, 401, 134896.
  26. K. Lee, S. M. Ha, N. G. Gurudatt, W. Heo, K. A. Hyun, J. Kim and H. I. Jung, Machine learning-powered electrochemical aptasensor for simultaneous monitoring of di(2-ethylhexyl) phthalate and bisphenol A in variable pH environments, J. Hazard. Mater., 2024, 462, 132775.
  27. I. B. Ansah, M. Leming, S. H. Lee, J. Y. Yang, C. Mun, K. Noh and S. G. Park, Label-free detection and discrimination of respiratory pathogens based on electrochemical synthesis of biomaterials-mediated plasmonic composites and machine learning analysis, Biosens. Bioelectron., 2023, 227, 115178.
  28. K. C. Un, C. K. Wong, Y. M. Lau, J. C. Y. Lee, F. C. C. Tam, W. H. Lai and C. W. Siu, Observational study on wearable biosensors and machine learning-based remote monitoring of COVID-19 patients, Sci. Rep., 2021, 11(1), 4388.
  29. A. Singh, A. Sharma, A. Ahmed, A. K. Sundramoorthy, H. Furukawa, S. Arya and A. Khosla, Recent advances in electrochemical biosensors: Applications, challenges, and future scope, Biosensors, 2021, 11(9), 336.
  30. J. Wiens and E. S. Shenoy, Machine learning for healthcare: on the verge of a major shift in healthcare epidemiology, Clin. Infect. Dis., 2018, 66(1), 149–153.
  31. S. Albawi, T. A. Mohammed and S. Al-Zawi, Understanding of a convolutional neural network, in 2017 international conference on engineering and technology (ICET), IEEE, 2017, pp. 1–6.
  32. S. Suthaharan and S. Suthaharan, Support vector machine, Machine learning models and algorithms for big data classification: thinking with examples for effective learning, 2016, pp. 207–235.
  33. J. Peng, K. Manevski, K. Kørup, R. Larsen and M. N. Andersen, Random forest regression results in accurate assessment of potato nitrogen status based on multispectral data from different platforms and the critical concentration approach, Field Crops Res., 2021, 268, 108158.
  34. M. A. Ghorbani, S. Shamshirband, D. Z. Haghi, A. Azani, H. Bonakdari and I. Ebtehaj, Application of firefly algorithm-based support vector machines for prediction of field capacity and permanent wilting point, Soil Tillage Res., 2017, 172, 32–38.
  35. B. Ratner, The correlation coefficient: Its values range between +1/−1, or do they?, J. Targeting, Meas. Anal. Mark., 2009, 17(2), 139–142.
  36. R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua and S. Süsstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, IEEE Trans. Pattern Anal. Mach. Intell., 2012, 34(11), 2274–2282.
  37. S. T. Acton and D. P. Mukherjee, Scale space classification using area morphology, IEEE Trans. Image Process., 2000, 9(4), 623–635.
  38. W. Rawat and Z. Wang, Deep convolutional neural networks for image classification: A comprehensive review, Neural Comput., 2017, 29(9), 2352–2449.
  39. F. Sultana, A. Sufian and P. Dutta, Advancements in image classification using convolutional neural network, IEEE, 2018, pp. 122–129.
  40. T. Fushiki, Estimation of prediction error by using K-fold cross-validation, Stat. Comput., 2011, 21, 137–146.
  41. Y. Chai, V. Lempitsky and A. Zisserman, Bicos: A bi-level co-segmentation method for image classification, IEEE, 2011, pp. 2579–2586.
  42. L. Gonçalves, A. Subtil, M. R. Oliveira and P. de Zea Bermudez, ROC curve estimation: An overview, Revstat Stat. J., 2014, 12(1), 1–20.

Footnotes

Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4an00593g
These authors contributed equally to this work.

This journal is © The Royal Society of Chemistry 2024