100X Improvement in Efficiency of Atomic Chemical Compound Analysis with Deep Learning

Applications of Deep Learning in Material Science & Chemical Engineering

Zach Wolpe
4 min read · Jul 3, 2021

#PaperReview

Applications of Electron Microscopy

Electron microscopy (EM), a method of generating nanoscale and mesoscale images by capturing how electrons interact with materials, is becoming increasingly powerful.

EM has a range of important applications:

Life Sciences

  • Microbiology: Providing insight into bacteria, viruses & other cells.
  • Medicine: Offering insight into disease diagnostics and damaged tissues/cancers.
  • Forensics: Analyzing trace evidence such as gunshot residue, hairs, fibers, glass, paint fragments, and fingerprints.

Material Sciences

EM is used to assess the quality of materials, ensuring that they are fit for purpose, to prevent material failure, and to design new materials. It plays an essential role in aerospace, electronics, chemistry, and energy.

Semiconductor Sciences

Detailed topographical information is vital to developing high-performing semiconductors, and EM is needed to obtain it.

Application

One particularly important use case is defect detection. Here the authors focused on:

Analyzing the locations and sizes of defects in metal alloys under irradiation.

Note: Reference paper available here.

Existing Methods

Defect detection has traditionally been done manually by skilled researchers. Manual identification has a number of issues:

  1. It is extremely time-consuming.
  2. It is error-prone.
  3. It lacks consistency and reproducibility.
  4. It does not scale.

To address these concerns, the authors implemented an algorithm (predominantly reliant on Machine Learning) to automate the defect detection process.

Notable contribution:

Current ML approaches recognize objects in an image accurately but have no mechanism for extracting quantitative information about defect structure and defect distribution, such as defect dimensions and areal number density (defects per unit area). The authors' approach adds this quantitative step.

Deep Learning Model

Figure 1. A sample of the data after fitting took place.

The model consists of 3 modules:

Module 1. Detection — Cascade Object Detector

A cascade is an ensemble method that passes the input data (image pixels in this case) through a series of binary classification stages, each of which either rejects a candidate region or passes it on to the next stage. It was used to detect possible defects in the raw images.
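To make the cascade idea concrete, here is a minimal sketch in plain Python. It is not the detector from the paper: the two stage tests (`mean_intensity`, `contrast`) and their thresholds are hypothetical stand-ins for trained stage classifiers. What it illustrates is the early-rejection structure, where a window is dropped as soon as any stage says no.

```python
from typing import Callable, List, Sequence

Window = Sequence[Sequence[float]]  # a small grayscale patch, as rows of pixel values
Stage = Callable[[Window], bool]    # returns True to pass the window to the next stage


def mean_intensity(window: Window) -> float:
    pixels = [p for row in window for p in row]
    return sum(pixels) / len(pixels)


def contrast(window: Window) -> float:
    pixels = [p for row in window for p in row]
    return max(pixels) - min(pixels)


# Hypothetical stages (assumed for illustration), ordered from loosest to strictest.
STAGES: List[Stage] = [
    lambda w: mean_intensity(w) > 0.2,  # stage 1: not an empty background patch
    lambda w: contrast(w) > 0.5,        # stage 2: contains a strong intensity change
]


def cascade_accepts(window: Window, stages: List[Stage] = STAGES) -> bool:
    """Return True only if every stage passes; stop at the first rejection."""
    for stage in stages:
        if not stage(window):
            return False
    return True


background = [[0.0, 0.1], [0.1, 0.0]]   # rejected at stage 1
flat_bright = [[0.6, 0.6], [0.6, 0.6]]  # passes stage 1, rejected at stage 2
loop_like = [[0.9, 0.1], [0.1, 0.9]]    # passes both stages
```

Because most windows in a micrograph are background, most of them are discarded by the first, cheapest stage, which is what makes a cascade fast.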

Module 2. Screening — ConvNet

The candidate regions produced by the detector were then screened by a CNN trained to classify each one as a defect or not, which pruned false positives.

Module 3. Analysis — Local Image Analysis

The first two modules can be seen as dimensionality reduction and auto-labeling steps for the final module, in which a watershed flood algorithm (an image segmentation technique) was used to extract the size and shape of each defect.
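The watershed idea can be sketched in a few lines of plain Python. This is a simplified marker-based flood, not the implementation used in the paper: it treats pixel values as heights, starts from labelled seed pixels, and grows each region outward in order of increasing height. The function name `watershed` and the seed format are assumptions made for this illustration, and no explicit ridge lines are drawn between regions.

```python
import heapq
from typing import Dict, List, Tuple


def watershed(height: List[List[float]],
              markers: Dict[Tuple[int, int], int]) -> List[List[int]]:
    """Flood outward from labelled seed pixels, lowest pixels first.

    height  -- 2D grid of pixel values treated as terrain height
    markers -- {(row, col): label} seed pixels, with labels >= 1
    Returns a grid of the same shape holding the region label of each pixel.
    """
    rows, cols = len(height), len(height[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet flooded"
    heap: List[Tuple[float, int, int]] = []
    for (r, c), label in markers.items():
        labels[r][c] = label
        heapq.heappush(heap, (height[r][c], r, c))

    while heap:
        _, r, c = heapq.heappop(heap)  # lowest pixel on the flood front
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]  # neighbour joins this region
                heapq.heappush(heap, (height[nr][nc], nr, nc))
    return labels


# Two basins separated by a ridge at column 3, with one seed in each basin.
terrain = [[0, 1, 2, 5, 2, 1, 0]]
regions = watershed(terrain, {(0, 0): 1, (0, 6): 2})
```

Each basin ends up with its own label, and counting the pixels per label gives a rough region size. In practice a library routine such as scikit-image's `skimage.segmentation.watershed` would normally be used instead of a hand-written flood.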

Leveraging this pipeline, the authors were able to identify loops and extract the loop shape information automatically, with minimal tunable parameters.
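The way the three modules chain together can be sketched as follows. This is an architectural outline only: `detect`, `screen` and `measure` are hypothetical callables standing in for the cascade detector, the CNN and the watershed analysis, and the fake versions below exist just so that the sketch runs.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of a candidate region


def analyze_image(
    image: Any,
    detect: Callable[[Any], List[Box]],
    screen: Callable[[Any, Box], bool],
    measure: Callable[[Any, Box], float],
) -> List[float]:
    """Run detection, screening and measurement in sequence on one image."""
    candidates = detect(image)                                     # module 1
    confirmed = [box for box in candidates if screen(image, box)]  # module 2
    return [measure(image, box) for box in confirmed]              # module 3


# Stand-in modules (assumptions for illustration, not the trained models).
def fake_detect(image: Any) -> List[Box]:
    return [(0, 0, 12, 8), (20, 20, 3, 3), (40, 10, 9, 15)]


def fake_screen(image: Any, box: Box) -> bool:
    return box[2] >= 5 and box[3] >= 5  # drop implausibly small candidates


def fake_measure(image: Any, box: Box) -> float:
    return float(max(box[2], box[3]))   # longest side as a crude diameter


diameters = analyze_image(None, fake_detect, fake_screen, fake_measure)
```

Keeping the modules as separate callables means each one can be retrained or swapped without touching the other two.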

Model Evaluation

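Recall + Precision: the capability to find the correct positions of the loop defects inside the image. Recall is the fraction of true loops that the model finds, and precision is the fraction of the model's detections that are true loops.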
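As a rough sketch of how position-based precision and recall can be computed, the function below greedily matches each predicted position to the nearest unmatched labelled position. The matching rule and the `max_dist` tolerance are assumptions made for this example; the paper's exact matching criterion may differ.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def precision_recall(predicted: List[Point], truth: List[Point],
                     max_dist: float) -> Tuple[float, float]:
    """Match predictions to labelled positions, then score the matches."""
    unmatched = list(truth)
    true_positives = 0
    for px, py in predicted:
        # Nearest labelled position that has not been claimed yet.
        best = min(unmatched,
                   key=lambda t: hypot(t[0] - px, t[1] - py),
                   default=None)
        if best is not None and hypot(best[0] - px, best[1] - py) <= max_dist:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up positions: two of three predictions land near a labelled loop.
predicted = [(10, 10), (50, 50), (90, 90)]
labelled = [(11, 10), (52, 49), (30, 30), (70, 70)]
precision, recall = precision_recall(predicted, labelled, max_dist=5)
```

Here two of the three predictions are correct (precision 2/3) and two of the four labelled loops are found (recall 1/2).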

Defect size distribution metrics:

  • total number of defects in the image
  • the average diameter (length of the major axis) of the loop defects in the image
  • the standard deviation of the diameters of the loop defects in the image.
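These three metrics are simple to compute once the per-loop diameters are known. The sketch below uses Python's `statistics` module. The diameter values are made up for illustration, and the use of the sample standard deviation is an assumption, since the article does not say which form was used.

```python
from statistics import mean, stdev
from typing import Dict, List


def size_distribution(diameters: List[float]) -> Dict[str, float]:
    """Summarize the loop diameters measured in one image."""
    count = len(diameters)
    return {
        "count": count,
        "mean_diameter": mean(diameters) if count else 0.0,
        # Sample standard deviation; undefined for fewer than two loops.
        "std_diameter": stdev(diameters) if count > 1 else 0.0,
    }


# Hypothetical diameters (in pixels) for one image.
human = size_distribution([10.0, 12.0, 14.0])
model = size_distribution([11.0, 12.0, 13.0, 14.0])
```

Comparing the `human` and `model` summaries image by image is the kind of agreement check reported in the results.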

Modules 1 and 2 were evaluated on recall and precision. Module 3 was evaluated on the defect size distribution metrics.

The overall performance of the automated defect analysis approach was assessed by combining the recall and precision of modules 1 and 2 with the image analysis results from module 3.

Results

Barring a few exceptions, the machine learning algorithm offers performance similar or superior to that of human experts, whilst boasting significant improvements in efficiency & reproducibility.

Results Breakdown

Recall + Precision: The machine learning model averaged higher precision & recall scores than its (expert) human counterparts, likely due to differences in how individuals handle ambiguous loops.

Speed: The model was approximately 80x faster than human researchers. With access to greater compute, or by embedding the model in image extraction (STEM) systems, this could be improved by orders of magnitude.

Number of Loops: The machine did an excellent job of determining a total loop number similar to that provided by human labeling.

Mean diameter: The automated defect detection algorithm also showed excellent agreement with the human labeling — barring image 3.

Standard deviation: The machine also was in very good agreement with human labeling for standard deviation — barring image 3.

Figure 2. A sample of the results is shown here. The image numbers correspond to the image number in figure 1 (above).
