100X Improvement in Efficiency of Atomic Chemical Compound Analysis with Deep Learning
Applications of Electron Microscopy
Electron microscopy (EM), a method of generating nanoscale and mesoscale images by capturing how electrons interact with materials, is becoming increasingly powerful.
EM has a range of important applications:
- Microbiology: providing insight into the structure of bacteria, viruses, and cells.
- Medicine: offering insight into disease diagnostics and damaged tissues/cancers.
- Forensics: analyzing trace evidence such as gunshot residue, hairs, fibers, glass, paint fragments, and fingerprints.
- Quality of materials: ensuring that materials are fit for purpose, preventing material failure, and designing new materials. This plays an essential role in aerospace, electronics, chemistry, and energy.
- Semiconductors: detailed topographical information, which EM provides, is vital to developing high-performing semiconductors.
One particularly important use case is defect detection. Here the authors focused on analyzing the locations and sizes of defects in metal alloys under irradiation.
Defect detection has traditionally been done manually by skilled researchers, which raises a number of issues. Manual identification:
- Is extremely time-consuming.
- Lacks consistency and reproducibility.
- Does not scale.
To address these concerns, the authors implemented an algorithm, relying predominantly on machine learning, to automate the defect detection process.
Current ML approaches recognize objects in the image accurately but do not have mechanisms for extracting quantitative defect structure and defect distribution information, such as defect dimensions and areal number density.
Deep Learning Model
The model consists of 3 modules:
Module 1. Detection — Cascade Object Detector
An ensemble method that passes the input data (pixels in our case) through a series of classifier stages, each of which makes a binary accept/reject decision; a region is reported only if every stage accepts it. This was used to detect possible defects in the raw data.
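To make the cascade idea concrete, here is a minimal sketch in plain Python. It is not the authors' detector: the two feature functions (`mean_intensity`, `ring_contrast`), their thresholds, and the toy image are all hypothetical, chosen only to show how a sliding window is rejected as soon as one stage fails.

```python
from typing import Callable, List, Sequence, Tuple

Window = Sequence[Sequence[float]]               # 2D patch of pixel intensities
Stage = Tuple[Callable[[Window], float], float]  # (feature function, threshold)


def mean_intensity(win: Window) -> float:
    """Hypothetical cheap first-stage feature: average pixel value."""
    values = [p for row in win for p in row]
    return sum(values) / len(values)


def ring_contrast(win: Window) -> float:
    """Hypothetical second-stage feature: border mean minus centre pixel."""
    h, w = len(win), len(win[0])
    border = [win[y][x] for y in range(h) for x in range(w)
              if y in (0, h - 1) or x in (0, w - 1)]
    return sum(border) / len(border) - win[h // 2][w // 2]


def cascade_accepts(win: Window, stages: List[Stage]) -> bool:
    """A window survives only if every stage's feature clears its threshold."""
    for feature, threshold in stages:
        if feature(win) < threshold:
            return False  # early rejection: later stages are never evaluated
    return True


def detect(image: Window, size: int, stages: List[Stage], step: int = 1):
    """Slide a size x size window over the image; return (x, y, w, h) hits."""
    hits = []
    for y in range(0, len(image) - size + 1, step):
        for x in range(0, len(image[0]) - size + 1, step):
            win = [row[x:x + size] for row in image[y:y + size]]
            if cascade_accepts(win, stages):
                hits.append((x, y, size, size))
    return hits


# Toy image: a bright ring with a dark centre (illustrative values only).
toy_image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
toy_stages = [(mean_intensity, 0.2), (ring_contrast, 0.5)]
candidates = detect(toy_image, 3, toy_stages)
```

In a trained cascade the stages and thresholds are learned from labeled examples rather than written by hand; the early-exit loop is the part that carries over.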
Module 2. Screening — ConvNet
Once passed through the detector, the candidate detections were further pruned by a CNN trained to classify whether each one is a true defect.
Module 3. Analysis — Local Image Analysis
Finally, the first two steps can be considered dimensionality reduction & auto-labeling methods for the final module, where a watershed flood algorithm (image segmentation technique) was used to extract & predict the actual size & shape of the defect.
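The flooding idea behind watershed segmentation can be sketched in a few lines. The version below is a simplified marker-based priority flood in plain Python, assuming seed markers are already given; the function name `watershed_flood`, the toy image, and the markers are illustrative and not taken from the paper. In practice a library routine such as scikit-image's `skimage.segmentation.watershed` would normally be used instead.

```python
import heapq
from typing import List


def watershed_flood(image: List[List[float]],
                    markers: List[List[int]]) -> List[List[int]]:
    """Grow labeled seed regions outward, always flooding the lowest pixel first.

    image   -- 2D grid of intensities (treated as a height map)
    markers -- same shape; 0 means unlabeled, positive integers are seed labels
    Returns a label grid in which every reachable pixel carries a seed label.
    """
    h, w = len(image), len(image[0])
    labels = [row[:] for row in markers]
    heap = []
    order = 0  # insertion counter, used only to break ties deterministically
    for y in range(h):
        for x in range(w):
            if labels[y][x] != 0:
                heapq.heappush(heap, (image[y][x], order, y, x))
                order += 1
    while heap:
        _, _, y, x = heapq.heappop(heap)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] == 0:
                labels[ny][nx] = labels[y][x]  # neighbour joins this basin
                heapq.heappush(heap, (image[ny][nx], order, ny, nx))
                order += 1
    return labels


# Two basins (low values) separated by a ridge (high value), one seed in each.
toy_heights = [[0, 1, 5, 1, 0]]
toy_markers = [[1, 0, 0, 0, 2]]
toy_labels = watershed_flood(toy_heights, toy_markers)
```

Each basin fills from its own seed, and the two regions meet at the ridge. Once every pixel has a label, the pixels belonging to one label give the shape of that region, from which a size can be measured.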
Leveraging this pipeline, the authors identified loops and extracted the loop shape information of interest in an automated way with minimal tunable parameters.
Recall + Precision: the capability to find correct positions of the loop defects inside the image
Defect size distribution metrics:
- total number of defects in the image
- the average diameter (length of the major axis) of the loop defects in the image
- the standard deviation of the diameters of the loop defects in the image.
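As a rough sketch of how these three numbers could be computed from segmented regions: the code below assumes each defect is available as a list of pixel coordinates, approximates the major-axis length by the largest distance between any two pixels of the region, and uses the population standard deviation. All three choices are assumptions for illustration; the source does not say how the authors computed them.

```python
import math
import statistics
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int]


def approx_major_axis(pixels: Sequence[Pixel]) -> float:
    """Largest pixel-to-pixel distance in a region (a stand-in for major axis)."""
    return max((math.dist(a, b) for a, b in combinations(pixels, 2)), default=0.0)


def size_distribution(regions: List[Sequence[Pixel]]) -> Dict[str, float]:
    """Count, mean diameter, and standard deviation of diameters for one image."""
    diameters = [approx_major_axis(r) for r in regions]
    return {
        "count": len(diameters),
        "mean_diameter": statistics.fmean(diameters) if diameters else 0.0,
        # Population std is an assumption; use statistics.stdev for sample std.
        "std_diameter": statistics.pstdev(diameters) if diameters else 0.0,
    }


# Three made-up regions, each given only by its two extreme pixels.
toy_regions = [
    [(0, 0), (3, 4)],    # diameter 5
    [(0, 0), (6, 8)],    # diameter 10
    [(1, 1), (1, 16)],   # diameter 15
]
toy_summary = size_distribution(toy_regions)
```

Diameters here are in pixels; converting to physical units would need the image scale, which is not given in this summary.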
Modules I and II were evaluated on recall and precision metrics. Module III was evaluated on the defect size distribution metrics.
The overall performance of the automated defect analysis approach was the combination of the recall and the precision from the combined modules I and II and the image analysis results from module III.
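For reference, precision is TP / (TP + FP) and recall is TP / (TP + FN), where a true positive (TP) is a predicted defect that matches a human-labeled one. The sketch below assumes a simple matching rule, greedy one-to-one matching of centre points within a distance tolerance; that rule, the `max_dist` parameter, and the toy coordinates are illustrative assumptions, not the authors' evaluation protocol.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def count_true_positives(predicted: List[Point], truth: List[Point],
                         max_dist: float) -> int:
    """Greedily match each prediction to its nearest unused ground-truth point."""
    unmatched = list(truth)
    true_positives = 0
    for px, py in predicted:
        best_index, best_dist = None, max_dist
        for i, (tx, ty) in enumerate(unmatched):
            d = math.hypot(px - tx, py - ty)
            if d <= best_dist:
                best_index, best_dist = i, d
        if best_index is not None:
            unmatched.pop(best_index)  # each labeled defect is matched at most once
            true_positives += 1
    return true_positives


def precision_recall(predicted: List[Point], truth: List[Point],
                     max_dist: float = 5.0) -> Tuple[float, float]:
    """Return (precision, recall) for one image."""
    tp = count_true_positives(predicted, truth, max_dist)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


# Made-up centres: 3 predictions, 4 human-labeled defects, 2 of which match.
toy_predicted = [(10, 10), (50, 50), (90, 90)]
toy_truth = [(11, 10), (52, 49), (30, 30), (70, 70)]
toy_precision, toy_recall = precision_recall(toy_predicted, toy_truth)
```

With these toy points, two of the three predictions land on a labeled defect and two of the four labeled defects are found, so precision and recall differ. This is why both numbers are reported rather than a single accuracy figure.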
Barring a few exceptions, the machine learning algorithm offers performance similar or superior to that achieved by human experts, whilst boasting significant improvements in efficiency & reproducibility.
Recall + Precision: The machine learning model averaged higher precision & recall scores than its (expert) human counterparts, likely due to differences in how individuals handle ambiguous loops.
Speed: The model was approximately 80x faster than human researchers. With access to greater compute, or by embedding the model in image extraction (STEM) systems, this figure could be improved by orders of magnitude.
Number of Loops: The machine did an excellent job of determining a total loop number similar to that provided by human labeling.
Mean diameter: The automated defect detection algorithm also showed excellent agreement with the human labeling — barring image 3.
Standard deviation: The machine was also in very good agreement with human labeling for standard deviation, barring image 3.