Adversarial examples are inputs to machine learning models that have been intentionally modified to cause the model to make incorrect predictions. These small, often imperceptible changes can trick models, such as those used in image recognition, into misclassifying objects. For instance, an image of a cat might be altered slightly so that a neural network mistakenly identifies it as a dog.
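One common way such perturbations are constructed is the fast gradient sign method (FGSM): take the gradient of the loss with respect to the input and step a small amount in the direction of its sign. The sketch below applies this idea to a toy logistic-regression classifier; the weights, input, and epsilon are illustrative assumptions, not values from any real model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Logistic-regression confidence that x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm_perturb(x, w, b, y_true, epsilon):
    """Fast gradient sign method against a logistic-regression model.

    For cross-entropy loss, the gradient with respect to the input is
    (p - y_true) * w, so stepping epsilon in the sign of that gradient
    increases the loss as much as possible under an L-infinity budget.
    """
    p = predict(x, w, b)
    grad_x = [(p - y_true) * wi for wi in w]
    return [xi + epsilon * math.copysign(1.0, gi)
            for xi, gi in zip(x, grad_x)]

# Toy classifier with hand-picked (hypothetical) weights
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.5], 1.0          # clean input, true label = class 1

x_adv = fgsm_perturb(x, w, b, y, epsilon=0.3)

print(round(predict(x, w, b), 3))      # 0.622: classified as class 1
print(round(predict(x_adv, w, b), 3))  # 0.401: flipped to class 0
```

Even though each input coordinate moves by only 0.3, the step is aligned exactly against the model's decision boundary, so the prediction flips; the same principle scales to high-dimensional images, where the per-pixel change can be far too small to see.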
These examples highlight vulnerabilities in artificial intelligence systems, raising concerns about their reliability and security. Researchers study adversarial examples to improve model robustness and ensure that AI applications, like self-driving cars or facial recognition, can operate safely in real-world scenarios.