What they have is not ground truth, it's bad data. Why is it bad data? Because any model that uses it, or any metric based on it, will be worse. That's in opposition to the definition and purpose of ground truth data: it's not supposed to make things worse.
You're both right. Perfection isn't possible or practical. But their "ground truth" (in that example) is obviously shite that nobody should be using for training or for any sort of metric, since it will make both worse. You're also right that you can name a dataset "ground truth", but the name doesn't mean much when the data is in opposition to the intent.