Here is the problem: when a neural network trains on incompletely labeled data, the loss function (usually binary cross-entropy) treats the unlabeled regions as 0 (background). Even when the network correctly raises the probability in such a mislabeled region, the loss pushes it back toward 0 because of the false-negative labels.
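To see why, here is a minimal sketch of per-pixel binary cross-entropy in plain Python (an illustration of the general formula, not any particular framework's implementation):

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for a single pixel.

    p -- predicted probability of 'fault'
    y -- label (1 = fault, 0 = background)
    """
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A pixel that truly contains a fault but is labeled 0 (a false
# negative): the more confident the correct prediction, the larger
# the loss, so the gradient drives the probability back toward 0.
loss_confident = bce(0.9, 0)  # correct prediction, heavily penalized
loss_timid     = bce(0.1, 0)  # wrong prediction, barely penalized
```

With a false-negative label, BCE rewards the network for being wrong and punishes it for being right, which is exactly the behavior described above.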
The solution to the problem lies in the loss function, since it is what computes the mistie (mismatch) between predicted and ground-truth data, provides the gradient, and drives the training. In our commercial software, we use an advanced loss function to make our network less sensitive to incorrect labeling. And, finally, there is a publication that addresses the problem.
I am reading the second excellent paper by YiMin Dou et al., "Efficient Training of High-Resolution Representation Seismic Image Fault Segmentation Network by Weakening Anomaly Labels." The authors propose using a region-based loss (Masked Dice) instead of a distribution-based loss (BCE), which shows a better ability to ignore false-negative labels. They demonstrate how the proposed loss function and new network architecture generalize to various datasets around the world.
If you are interested in recent advancements in AI/ML for seismic interpretation, I highly recommend reading this paper. It's worth it. Thanks!
Links are in the comments (including a link to the GitHub repo with PyTorch scripts and a trained model).
Link to the GitHub repo: https://github.com/douyimin/FaultNet