Published at SB Nation.
The concept is pretty simple. For every shot, you assign an “expected goals” value based on characteristics like the location on the pitch, whether the shot is taken with the foot or the head, whether the shot is assisted by a cross or through-ball, and so on. This is in no way a comprehensive list of the characteristics of each shot, but it provides a reasonable estimate when dealing with larger samples. A club’s expected goals, then, is simply the sum of the expected goals values of all its shots.
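To make the summing step concrete, here is a minimal Python sketch. The club names and per-shot values are made up for illustration, and the function name `club_expected_goals` is hypothetical. It assumes each shot already has an expected goals value and says nothing about how that value is modeled.

```python
from collections import defaultdict

# Hypothetical shot records: (club, expected goals value for that shot).
# These numbers are invented for illustration, not real match data.
shots = [
    ("Home FC", 0.30),
    ("Home FC", 0.05),
    ("Away United", 0.10),
    ("Home FC", 0.12),
    ("Away United", 0.45),
]


def club_expected_goals(shots):
    """Sum the per-shot expected goals values for each club."""
    totals = defaultdict(float)
    for club, xg in shots:
        totals[club] += xg
    return dict(totals)


totals = club_expected_goals(shots)
for club, xg in totals.items():
    print(f"{club}: {xg:.2f} expected goals")
```

Note that the club taking more shots does not necessarily come out ahead: here three low-value shots add up to less than two better ones.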
The relationship between goals and points is humongously complex. As Howard Hamilton showed in his work on the “soccer pythagorean”, having three possible match results worth unequal points (three for a win, one for a draw, none for a loss) creates a weird, non-linear relationship. So instead of dealing with the math, I’m just simulating the games and comparing projected points to real points. There, we should expect a simple linear relationship if the projections are good.
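One way such a simulation could work is sketched below. This is an assumption, not necessarily the method used here: it treats every shot as an independent coin flip that scores with probability equal to its expected goals value, replays the match many times, and averages the points each side earns. The function names, the shot values, and the number of simulations are all hypothetical.

```python
import random


def simulate_goals(shot_xgs, rng):
    """Count goals in one simulated match; each shot scores with probability xg."""
    return sum(1 for xg in shot_xgs if rng.random() < xg)


def projected_points(home_xgs, away_xgs, n_sims=10_000, seed=0):
    """Average points per side over n_sims simulated replays of one match.

    Assumes shots are independent of each other, which is a simplification.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    home_points = 0
    away_points = 0
    for _ in range(n_sims):
        home_goals = simulate_goals(home_xgs, rng)
        away_goals = simulate_goals(away_xgs, rng)
        if home_goals > away_goals:
            home_points += 3
        elif home_goals < away_goals:
            away_points += 3
        else:
            # A draw gives one point to each side.
            home_points += 1
            away_points += 1
    return home_points / n_sims, away_points / n_sims


# Invented per-shot values for a single match.
home, away = projected_points([0.30, 0.05, 0.12], [0.10, 0.45])
print(f"Projected points: home {home:.2f}, away {away:.2f}")
```

Summing these per-match projections over a season gives a projected points total for each club, which can then be plotted against the real table.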