Entropy is used for a lot of things in data science, including building classification trees, quantifying the relationship between two variables with mutual information, and measuring distances between probability distributions with relative entropy and cross entropy.
Surprise is inversely related to probability: when the probability of an event is low, the surprise is high, and when the probability is high, the surprise is low.
Surprise is calculated as the log of the inverse of the probability of an event: Surprise(x) = log(1/p(x)) = -log(p(x)). This gives a curve where surprise grows without bound as the probability approaches zero, but surprise is zero when the probability is one, since log(1/1) = log(1) = 0.
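As a minimal sketch of this calculation (assuming base-2 logarithms, which measure surprise in bits; the function name `surprise` is just for illustration):

```python
import numpy as np

def surprise(p):
    """Surprise (in bits) of an event with probability p, using log base 2."""
    return np.log2(1.0 / p)

# Surprise is large for rare events and drops to zero when p = 1.
for p in [0.01, 0.25, 0.5, 0.9, 1.0]:
    print(f"p = {p:4}: surprise = {surprise(p):.3f} bits")
```

Running this prints a surprise of about 6.644 bits for p = 0.01 and exactly 0 bits for p = 1, matching the curve described above.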