Membrane filter removal in FTIR spectra through dictionary learning for exploring explainable environmental microplastic analysis

FTIR spectra

In this research, we investigate 22 types of plastics, namely Cellulose, high-density polyethylene (HDPE), low-density polyethylene (LDPE), polycarbonate (PC), polyetheretherketone (PEEK), polyoxymethylene (POM), polypropylene (PP), polytetrafluoroethylene (PTFE), polyvinyl chloride (PVC), polyvinyl alcohol (PVA), Acrylic, Nylon, poly(butylene succinate) (PBS), poly(ethylene terephthalate) (PET), polylactide (PLA), polybutylene adipate terephthalate (PBAT), ethylene propylene diene monomer rubber (EPDM), epoxidized natural rubber (ENR), polyethylenimine (PEI), polymethyl methacrylate (PMMA), polyurethane (PU), and polystyrene (PS). The objective of the spectral analysis conducted in this study is to discriminate the chemical composition of neat MP materials from MPs adhered to a membrane filter substrate, to assist in the removal of the membrane filter’s spectrum and the MP identification. To obtain the dataset for supporting a supervised learning approach with input-label sample pairs, the FTIR spectra-gathering process involves three separate datasets under varied conditions, where the intensity of all spectra is normalized to enhance the training efficiency and performance of the methods (Fig. 1):

1. Measured clean spectra dataset: This involves obtaining 10 spectra per MP type using a controlled setup, exclusively measuring the MP material. These spectra serve as a reference for the pure material’s spectral signature.

2. Measured membrane filter spectra dataset: Thirty spectra are collected by measuring only the membrane filter. This dataset is designed to help comprehend the spectral characteristics of the membrane filter substrate.

3. Measured noisy spectra dataset: Sixty spectra for each MP type are acquired by depositing MPs onto membrane filters. This dataset simulates the spectra we expect to obtain from the analysis of water field samples.

Figure 1. FTIR spectra from the three measured datasets used in this work: measured clean spectra dataset, measured membrane filter spectra dataset, and measured noisy spectra dataset (from left to right).

Data synthesis

To examine the capability of our membrane filter removal method, the evaluation is carried out across various SNR levels. Because only a limited number of spectra were acquired, obtaining measured spectra at specific SNR levels is difficult, so data synthesis is needed. We adopted a data synthesis method similar to that used in ref. 36. The simulation of noisy spectra from MPs adhered to membrane filters involves three steps (Fig. 2).

Figure 2. FTIR spectra from the three synthetic datasets used in this work: synthetic clean spectra dataset (top-left), synthetic membrane filter spectra dataset (top-right), and synthetic noisy spectra dataset (bottom). (Bottom, from left to right) FTIR spectra from the synthetic noisy spectra dataset with SNR at 0 dB, -10 dB, -20 dB, and -30 dB, respectively.

First, we generate a synthetic clean spectra dataset, denoted as \({\textbf{Y}}\). This dataset results from a weighted summation of two randomly selected, normalized spectra of the same MP type from the measured clean spectra dataset. Each synthetic clean spectrum is normalized to ensure that its intensity is between zero and one. These synthetic spectra aim to replicate variations observed in the spectra of pure materials.

Second, we synthesize a dataset of synthetic membrane filter spectra, denoted as \({\textbf{Z}}\). This dataset is created through a weighted summation of two randomly selected, normalized spectra from the measured membrane filter spectra dataset, simulating variations in the membrane filter’s spectra.
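The weighted summation used in the first two steps can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the convex weights (w and 1 - w, with w drawn uniformly) are an assumption, since the text only specifies a "weighted summation" followed by normalization to the range zero to one.

```python
import numpy as np

def min_max_normalize(spectrum):
    """Rescale a 1-D spectrum so that its intensities lie in [0, 1]."""
    lo, hi = spectrum.min(), spectrum.max()
    if hi == lo:  # flat spectrum: avoid division by zero
        return np.zeros_like(spectrum, dtype=float)
    return (spectrum - lo) / (hi - lo)

def synthesize_spectra(measured, n_synthetic, rng):
    """Blend random pairs of measured spectra (one per row) into synthetic ones.

    Assumption: convex weights w and 1 - w with w ~ Uniform(0, 1).
    """
    measured = np.asarray(measured, dtype=float)
    normalized = np.array([min_max_normalize(s) for s in measured])
    out = np.empty((n_synthetic, measured.shape[1]))
    for k in range(n_synthetic):
        # Pick two distinct measured spectra of the same group.
        i, j = rng.choice(len(normalized), size=2, replace=False)
        w = rng.uniform(0.0, 1.0)
        blend = w * normalized[i] + (1.0 - w) * normalized[j]
        out[k] = min_max_normalize(blend)
    return out

# Toy stand-in for five measured spectra of one type (random values, not real data).
rng = np.random.default_rng(0)
toy_measured = rng.random((5, 64))
Y = synthesize_spectra(toy_measured, n_synthetic=8, rng=rng)
print(Y.shape)
```

The same function would be applied once per MP type to the measured clean spectra and once to the measured membrane filter spectra.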
Each synthetic membrane filter spectrum is also normalized.

Finally, a synthetic noisy spectrum for the b-th MP type, denoted as \({\textbf{s}}^{(b)}\), is obtained by combining a synthetic clean spectrum for the b-th MP type (\({\textbf{y}}^{(b)} \in {\textbf{Y}}\)) with a synthetic membrane filter spectrum (\({\textbf{z}} \in {\textbf{Z}}\)):

$$\begin{aligned} {\textbf{s}}^{(b)} = {\textbf{y}}^{(b)} + \beta {\textbf{z}}, \end{aligned} \tag{1}$$
where \(\beta\) represents the amplitude scale of the synthetic membrane filter spectrum, ensuring that the SNR (in dB) of \({\textbf{s}}^{(b)}\) is equal to \(-20 \log _{10}(\beta )\). The synthetic clean spectrum \({\textbf{y}}^{(b)}\) serves as the ground truth for the denoising task. All synthetic noisy spectra (\({\textbf{s}}^{(b)}\)) are min-max normalized as well.

It is important to note that the measured clean spectra dataset and the measured membrane filter spectra dataset are each divided into two equal groups. One group is designated for training our model and the other ML models, while the other is reserved for model evaluation. This separation ensures that the training and test data are independent, because they are synthesized from distinct sets of measured spectra.

MP identification performance

Our dictionary-learning-based method is compared against the SOTA denoising methods, i.e., AE (ref. 25) and UNet (ref. 28). The performance in MP identification is measured by classification accuracy, i.e., the percentage of spectra whose MP type is predicted correctly. All methods classify a denoised spectrum by assigning it the material type of the synthetic clean spectrum with which it has the maximum gradient correlation.

In all experiments, the learning matrix used in our method consists of 100 synthetic membrane filter spectra from \({\textbf{Z}}\) and 10 synthetic clean spectra per class from \({\textbf{Y}}\). The number of components or atoms is set to 50 and the regularization parameter is set to 1.0. The orthogonal matching pursuit (OMP) technique (ref. 37) was used to learn the dictionary.

The AE uses a multilayer perceptron as the backbone architecture. It consists of two dense layers with 512 and 256 units on the encoder side, with LeakyReLU (ref. 38) as the activation function. Similarly, on the decoder side, there are two dense layers with 256 and 512 units, with LeakyReLU as the activation function.
The output layer utilizes the tanh activation function.

The UNet model is modified to use 1D convolutional layers instead of the 2D convolutional layers of the original work (ref. 28). The UNet model comprises four encoder blocks, followed by a middle block, and closes with four decoder blocks. The ReLU activation function is used, with the number of filters ranging from 16 to 128 and a kernel size of \(5 \times 1\). The output layer is a 1D convolutional layer with a kernel size of \(3 \times 1\) and a tanh activation function.

Additionally, we include a baseline method representing results without any preprocessing. It assigns to each noisy spectrum the material type of the clean spectrum with which it has the maximum gradient correlation.

We validated our approach using the measured noisy spectra dataset, representing instances from real-world scenarios. The outcomes of our proposed method, AE, UNet, and the baseline method are presented in Table 1. For AE and UNet, the synthetic datasets between 0 dB and -30 dB SNR were used as the training data, since these methods require noisy spectra as the input and clean spectra as the ground truth. The findings suggest that while our method performs similarly to other methods in real-world scenarios, the superior explainability inherent in our method enhances the robustness of the results, contributing to its stability across different levels of SNR.

Table 1. The accuracy of our method and others on the measured noisy spectra dataset.

To investigate the limitations and robustness of our method, all methods are evaluated on the synthetic noisy spectra dataset at different levels of SNR, i.e., 0 dB, -10 dB, -20 dB, and -30 dB. At each level of SNR, the test set consists of 100 synthetic spectra per MP type. The classification results in Table 2 show that our method achieves higher accuracy at low SNR levels (between -10 dB and -30 dB). Although our method is outperformed by AE at high SNR, i.e., at 0 dB, it demonstrates more stable behavior.
UNet and AE exhibit an undesirable and significant drop of 10% or more in accuracy between -20 dB and -30 dB. While our method may have lower representation power, leading to lower performance at high SNR, it proves to be more robust, with consistent accuracy as SNR decreases. Additionally, our method is simpler than the other methods in terms of training procedure, computational cost, and model size. The training procedure of our method differs from that of supervised learning techniques like AE and UNet. Our method relies solely on clean spectra obtained in the laboratory for training, whereas supervised learning methods typically require both training and test sets stemming from the same distribution or a procedure to simulate this, such as data augmentation. This process often requires the collection and analysis of actual water field samples to obtain the ground truth and effectively train models.

Table 2. The accuracy of our method and others on the test set at different levels of SNR.

Quality of spectral reconstruction

Figure 3. The flowchart illustrates the key steps of our method (yellow boxes) and the analysis of our method, particularly the dictionary learning process related to environmental MP analysis (orange boxes). The solid arrows represent the training phase of dictionary learning, while the dashed arrows indicate the inference phase.

The primary objective may be to detect the presence of MPs, but various aspects of the dictionary learning process and the results from its downstream tasks need to be discussed (Fig. 3). One notable aspect is the ability to reconstruct spectra after membrane filter removal, which can significantly aid chemists in comprehending and endorsing the method’s predictions by increasing its explainability. To measure the quality of the reconstructed spectra, we compute the gradient correlation between the reconstruction of the synthetic noisy spectrum and the corresponding synthetic clean spectrum.
Ideally, the reconstructed spectrum should mirror the synthetic clean spectrum, since the synthetic noisy spectrum is a composite of the membrane filter spectrum and the synthetic pure MP spectrum. The gradient correlation value can serve as an indicator of the similarity between the two spectra.

The average of the maximum gradient correlations over the test set is reported in Table 3. The results show that our method yields high-quality reconstructed spectra, exhibiting a higher gradient correlation compared to other methods. Moreover, as SNR decreases, the gap between our method and others widens. Between 0 dB and -30 dB, the gradient correlation of the baseline, AE, and UNet drops by around 64%, 43%, and 34%, respectively, while that of our method drops by only 24%. At -30 dB SNR, our method outperforms the others by 18% or more in gradient correlation. This suggests that the results from our method deteriorate at a lower rate compared to other methods, indicating the robustness of our method in the denoising task.

Table 3. The gradient correlation of our method and others on the test set at different levels of SNR.

Figure 4. (From left to right) the ground truth/measured clean spectrum, measured noisy spectrum, and the reconstructed spectra using dictionary learning, AE, and UNet, respectively, of measured noisy Acrylic spectrum (top) and POM spectrum (bottom).

Figure 4 visually compares the spectral reconstruction of measured noisy spectra using our dictionary learning, AE, and UNet. The ground truth, i.e., the measured clean spectra, is given as the reference for what the reconstructed spectra should look like. Our method demonstrates superior reconstruction quality, since the results closely resemble the ground truth. On the other hand, the reconstructed spectra from AE may capture large peaks accurately, but they introduce signal fluctuations, which are undesirable for chemists and make it challenging to recognize small peaks.
UNet, in turn, fails to eliminate some membrane filter peaks, causing the remaining peaks to mix with the peaks of the MPs and complicating both the differentiation of peaks and the MP identification for chemists. Hence, both the qualitative findings in Fig. 4 and the quantitative results in Table 3 indicate that the level of explainability inherent in our method correlates positively with its robustness.

Atom profile analysis

Figure 5. The heatmap of non-zero coefficients (non-zeros in dark blue and zeros in yellow) of 50 atoms across all 22 MPs and the membrane filter spectra from the learning matrix \({\textbf{X}}\). The red dashed lines indicate the atom indices of the membrane filter, which do not overlap with the atom indices of the other MPs.

The key component of our methodology is the set of atoms, or components, learned by dictionary learning, where a spectrum can be expressed as a weighted sum of these atoms. The portion of a spectrum captured by each atom is referred to as the atom profile. These atoms aim to capture distinctive patterns from spectra in the learning matrix \({\textbf{X}}\), which consists of the synthetic clean spectra and the synthetic membrane filter spectra. These learned atoms play a crucial role in reconstructing the spectra and illustrating the spectrum reconstruction ability. Dictionary learning employs sparsity-inducing penalties, which encourage a sparse or minimal set of non-zero coefficients of these atoms. The sparse coefficient matrix offers a concise representation of the spectra in the learning matrix \({\textbf{X}}\). The coefficient values signify the relative importance of atoms within the learned dictionary for each spectrum. Figure 5 illustrates the non-zero coefficients of all spectra in the learning matrix for each material type, demonstrating that most material types respond to a different set of atoms, especially the membrane filter.
This highlights the desirable property of our method that aids in characterizing the membrane filter removal and reconstruction processes.

Analyzing the atom profiles highlights the highly desirable property in the practical application of our method, which is the explainability of the spectrum reconstruction process. Two key observations in atom profiles shed light on how our method achieves membrane filter spectrum removal and spectrum reconstruction:

1. The profile of the atom with the highest coefficient captures the unique features in MP spectra for all types, as shown in Fig. 6. This dominant atom appears to capture significant information aligned with the distinctive patterns presented in different MP types. This observation suggests the meaningful association between this specific atom and the distinct characteristics of each MP type. It emphasizes the method’s capacity to capture and represent the unique features of each MP type through atoms computed by dictionary learning. Additionally, the profile of the atom associated with the membrane filter resembles the unique pattern of the membrane filter spectrum, as shown in Fig. 7, which is distinguishable from the atom profile of MPs in Figs. 8 and 9. This property enables our method to separate the membrane filter spectrum from the input spectrum.

2. The coefficient values can indicate the presence of MPs and membrane filters in the spectrum. In Fig. 10, as the SNR decreases, the coefficient values of atoms corresponding to MPs (at index 24 for Acrylic and index 5 for POM) decrease. This suggests that when the level of noise overwhelms the MP signal, the coefficients corresponding to the occurrence of MP diminish. At the same time, the coefficient values of atoms corresponding to the membrane filter (e.g., at indices 13, 28, 37, and 41) emerge when the level of noise is high at SNR between 0 dB and -30 dB (Fig. 10). Reducing the influence of coefficients related to the membrane filter enhances the performance of MP spectrum reconstruction. By excluding these membrane-filter-related coefficients, the remaining information is primarily related to MPs, which facilitates the effective reconstruction of MP spectra.
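The coefficient-exclusion step described in the second observation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary here is a hand-built toy of four Gaussian peaks rather than a learned 50-atom dictionary, the functions `omp` and `atoms_used` are hypothetical helpers, and the final min-max normalization of the noisy spectrum is omitted. The noisy input is built with Eq. (1), using \(\beta = 10^{-\text{SNR}/20}\).

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit: x ~ D @ coef with few non-zeros.

    D holds unit-norm atoms as columns.
    """
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support, sol = [], np.zeros(0)
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:  # no new atom improves the fit
            break
        support.append(k)
        # Refit all selected atoms jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef

def atoms_used(D, spectra, n_nonzero):
    """Indices of atoms with non-zero coefficients for any given spectrum."""
    used = set()
    for x in spectra:
        used.update(np.flatnonzero(omp(D, x, n_nonzero)).tolist())
    return sorted(used)

# Toy dictionary: four well-separated Gaussian peaks (not learned atoms).
grid = np.linspace(0.0, 1.0, 200)
centers = np.array([0.15, 0.35, 0.60, 0.85])
peaks = np.exp(-0.5 * ((grid[:, None] - centers[None, :]) / 0.02) ** 2)
D = peaks / np.linalg.norm(peaks, axis=0)

y = 1.0 * D[:, 0] + 0.5 * D[:, 1]   # toy clean MP spectrum
z = 0.8 * D[:, 2] + 0.6 * D[:, 3]   # toy membrane filter spectrum

snr_db = -20.0
beta = 10.0 ** (-snr_db / 20.0)     # SNR = -20 log10(beta)
s = y + beta * z                    # Eq. (1)

# Atoms that respond to the membrane filter spectrum alone.
membrane = atoms_used(D, [z], n_nonzero=2)

coef = omp(D, s, n_nonzero=4)
coef[membrane] = 0.0                # exclude membrane-filter coefficients
reconstruction = D @ coef
print(membrane, float(np.abs(reconstruction - y).max()))
```

In this toy setting the atoms are almost orthogonal, so removing the membrane-filter coefficients recovers the clean spectrum nearly exactly; with a learned dictionary and real spectra, the separation is only as good as the non-overlap of atom indices shown in Fig. 5.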

Figure 6. Comparison of the original spectrum (left) with the profile of the highest coefficient atom (right) of the Acrylic (top) and POM (bottom).

Figure 7. The non-zero coefficient atoms are presented according to the atom profiles of the membrane filter, sorted by atoms’ coefficients in descending order from top to bottom.

Figure 8. The non-zero coefficient atoms are presented according to the atom profiles of the Acrylic (left) and PMMA (right) spectra, sorted by atoms’ coefficients in descending order from top to bottom.

Figure 9. The non-zero coefficient atoms are presented according to the atom profiles of the PTFE (left) and PVA (right) spectra, sorted by atoms’ coefficients in descending order from top to bottom.

Figure 10. Heatmap of coefficient values in noisy and reconstructed spectrum with SNR at 0 dB, -10 dB, -20 dB, and -30 dB, respectively, of the Acrylic (top) and POM (bottom) spectra calculated by dictionary learning technique (left) and after removing coefficients of the membrane filter following our method (right).

Figure 11. Confusion matrix on the results using our method on the synthetic noisy spectra dataset at SNR of -20 dB (left) and -30 dB (right).

Classification error analysis

We examine the limitations of our methodology through an error analysis of the classification performance, focusing on cases where our method failed to accurately identify MPs. This examination begins with the computation of the confusion matrix on the synthetic noisy spectra dataset at SNR of -20 dB (Fig. 11 (left)). The result highlights a pair of materials that our method often confuses with each other: Acrylic and PMMA. Addressing this confusion involves exploring the atom profiles extracted from each type of MP. Figure 8 depicts the atom profiles of Acrylic and PMMA ranked by their significance, revealing noticeable similarities among these atom profiles, as they share the same set of atoms.
In fact, the spectra of this MP pair are also similar to each other, posing a challenging task even for experts attempting to differentiate between them. Apart from these tricky cases, our method accurately classifies 21 out of the 22 types of MPs in the experiment.

We further investigate the atom profiles to explain why our method gives incorrect predictions for this pair of MPs, i.e., Acrylic–PMMA. The potential for confusion during the prediction process is made evident as the reconstruction of the MP pair utilizes the same set of atoms (Fig. 8). It turns out that the atom profiles of both Acrylic and PMMA share the same set of atoms but with different orders of coefficients (atom indices 24 and 34). This confusion can be further explained in terms of the chemical properties of the polymers involved. PMMA is actually one of the structural variants of Acrylic materials. Therefore, the characteristic FTIR bands of both classes are almost indistinguishable, even for experts.

However, when examining the confusion matrix on the synthetic noisy spectra dataset at SNR of -30 dB (Fig. 11 (right)), our method starts misclassifying another pair of materials: PTFE and PVA. This error can be attributed to the overwhelming noise present in the spectra, combined with the fact that PTFE exhibits a low number of peaks (Fig. 9), some of which coincide with peaks in the membrane filter (particularly in the wavenumber range of 1000–1200 cm⁻¹). As a result, when the peaks of PTFE and the membrane filter are merged due to noise, they become indistinguishable in some cases, leading our method to recognize them as peaks of MPs. Consequently, our method selects spectra that bear a higher resemblance to the membrane filter, such as PVA, which possesses a greater number of peaks (Fig. 9). Moreover, our method also starts getting confused between Acrylic and PMMA, unlike before when they were just indistinguishable, resulting in all being categorized as PMMA.
While this change may improve accuracy, it suggests that our method has reached its limitation and can no longer effectively differentiate between Acrylic and PMMA or classify PTFE accurately.
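The classification rule and the confusion matrix used in this analysis can be sketched as follows. This is a minimal sketch: the text does not define gradient correlation, so the code assumes it is the Pearson correlation between the first derivatives of two spectra. The function names and the toy Gaussian-peak spectra are hypothetical stand-ins, not the data or code of this study.

```python
import numpy as np

def gradient_correlation(a, b):
    """Pearson correlation between the first derivatives of two spectra.

    Assumed definition; the source does not spell it out.
    """
    ga, gb = np.gradient(a), np.gradient(b)
    ga = ga - ga.mean()
    gb = gb - gb.mean()
    denom = np.linalg.norm(ga) * np.linalg.norm(gb)
    return float(ga @ gb / denom) if denom > 0 else 0.0

def classify(spectrum, references, labels):
    """Assign the label of the reference with maximum gradient correlation."""
    scores = [gradient_correlation(spectrum, r) for r in references]
    return labels[int(np.argmax(scores))]

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[index[t], index[p]] += 1
    return cm

# Toy reference spectra: one Gaussian peak per hypothetical class.
grid = np.linspace(0.0, 1.0, 200)
ref_a = np.exp(-0.5 * ((grid - 0.3) / 0.03) ** 2)
ref_b = np.exp(-0.5 * ((grid - 0.7) / 0.03) ** 2)

# A rescaled, offset copy of class "A"; derivatives ignore the offset.
query = 0.5 * ref_a + 0.2
predicted = classify(query, [ref_a, ref_b], ["A", "B"])

cm = confusion_matrix(["A", "B", "B"], ["A", "A", "B"], ["A", "B"])
print(predicted)
print(cm)
```

Off-diagonal entries of such a matrix correspond to the confused pairs discussed above, for example Acrylic spectra predicted as PMMA.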
