Wrestlestat is one of my favorite wrestling resources, bar none.
As for the algorithm, chess-style Elo has limitations. The biggest weakness is that it seems to lack a recency bias, placing too much weight on matches that happened 2 years ago. What a kid did this month is much more indicative of how he is performing than what he did last year or before. There also seems to be no heavier weighting assigned to head-to-head matchups.
Do I think WS could be a far better predictor with some different math behind the algorithm? Absolutely, and I wish he would explore several options. Simply test each one against prior years' data to validate the impact.
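To make the "test it on prior years" idea concrete, here is a rough Python sketch of a backtest. Everything in it is my own assumption for illustration: the function names, the (wrestler_a, wrestler_b, a_won) layout for a held-out season, and the idea that each candidate algorithm is wrapped as a predict(a, b) function returning the probability that a wins. It is not how WS works, just one way to compare options on the same matches.

```python
import math

def log_loss(predictions):
    """Average negative log-likelihood; lower is better.

    predictions: list of (probability_a_wins, a_actually_won) pairs.
    """
    eps = 1e-12  # keep log() away from 0
    total = 0.0
    for p, a_won in predictions:
        p = min(max(p, eps), 1.0 - eps)
        total += -math.log(p if a_won else 1.0 - p)
    return total / len(predictions)

def accuracy(predictions):
    """Share of matches where the favored wrestler actually won."""
    correct = sum((p > 0.5) == a_won for p, a_won in predictions)
    return correct / len(predictions)

def backtest(predict, held_out_matches):
    """Score any rating system on matches it was NOT fitted on.

    predict: hypothetical callable (a, b) -> probability that a beats b.
    held_out_matches: list of (a, b, a_won) tuples from a prior season.
    """
    preds = [(predict(a, b), a_won) for a, b, a_won in held_out_matches]
    return {"log_loss": log_loss(preds), "accuracy": accuracy(preds)}
```

Fit each candidate on seasons up to some cutoff, run backtest on the following season, and compare the numbers. Log loss is the more informative of the two, since it rewards a system for being confident when it is right and punishes it for being confident when it is wrong.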
Here is what Grok had to say:
"For one-on-one sports like tennis or wrestling, where direct matchups are central but not everyone competes against everyone else, the Bradley-Terry model stands out as a strong alternative to Elo. It's particularly well-suited if you want to emphasize head-to-head records more directly, as it fits a probabilistic model to the entire dataset of pairwise outcomes at once, rather than updating ratings sequentially like Elo (which can dilute the impact of older matches over time).
Why Bradley-Terry Over Elo?
Core Mechanism: The model estimates a "strength" parameter for each player such that the probability of player A beating player B is strength_A / (strength_A + strength_B). This is similar to Elo's logistic expectation but computed via maximum likelihood estimation across all matches simultaneously.
Head-to-Head Emphasis: It directly incorporates the full history of wins and losses between any pair of players. If two competitors have faced off multiple times, those results heavily influence their relative strengths in the global ranking—more so than in Elo, where past games are baked into current ratings but can be overshadowed by recent performances against others. You can also weight matches by recency or importance (e.g., major tournaments) to customize it.
Handling Incomplete Data: In sports like tennis or wrestling, where schedules are sparse, it accounts for the network of results transitively (e.g., if A beats B and B beats C, it infers something about A vs. C) while prioritizing actual head-to-head data where available.
Advantages: It's statistically robust, scalable for large player pools, and has been used or adapted in contexts like ranking Smash Bros. players, horse racing, and academic sports analyses. Unlike Elo, which is "online" (real-time updates), Bradley-Terry is "batch" processed, making it ideal for end-of-season rankings or recalculating with full historical data.
Drawbacks: It's computationally more intensive for very large datasets (though efficient algorithms exist), and it doesn't inherently handle uncertainty or player improvement over time as elegantly as extensions like generalized Bradley-Terry models.
Other Strong Alternatives
If Bradley-Terry doesn't fit, consider these, which also address Elo's limitations in varying ways:
Glicko-2: An extension of Elo that adds a rating deviation (uncertainty) and volatility factor. It values head-to-head through updates but better handles inactive players or sparse data. Used in some online gaming and chess variants.
TrueSkill: A Bayesian system (from Microsoft) that's probabilistic like Bradley-Terry but includes skill uncertainty. It's great for 1v1 and can emphasize recent head-to-head by design. Commonly used in esports and adaptable to sports.
Points-Based Systems (e.g., ATP-Style for Tennis): Not purely algorithmic like the others, but rewards tournament performance with points scaled by event prestige. Head-to-head indirectly matters (it determines match winners), but the system emphasizes consistency and quality wins over direct rivalries. Used officially in tennis and similar to some wrestling federations' points for competitions.
In practice, "best" depends on your goals—predictive accuracy, fairness, or incentives. For pure head-to-head valuation in a computational sense, start with Bradley-Terry. If implementing, libraries like Python's scipy or statsmodels can fit it via logistic regression on pairwise data."
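For anyone who wants to play with this, below is a minimal pure-Python sketch of a recency-weighted Bradley-Terry fit. To be clear about what is assumed: the function names, the (winner, loser, days_ago) input layout, the 180-day half-life, and the small "prior" (one virtual win and one virtual loss against a reference wrestler of strength 1, which keeps undefeated or winless wrestlers from running off to infinity or zero) are all my own choices for illustration. The fitting loop is the standard iterative (minorize-maximize) update for Bradley-Terry rather than the logistic-regression route mentioned in the quote.

```python
from collections import defaultdict

def fit_bradley_terry(matches, half_life_days=180.0, prior=0.1, iters=500):
    """Fit strengths so that P(a beats b) = s_a / (s_a + s_b).

    matches: list of (winner, loser, days_ago) tuples (assumed layout).
    A match loses half its weight every `half_life_days`.
    """
    weighted = [(w, l, 0.5 ** (age / half_life_days)) for w, l, age in matches]
    players = {p for w, l, _ in weighted for p in (w, l)}

    # Weighted win totals never change during fitting.
    wins = defaultdict(float)
    for w, _, wt in weighted:
        wins[w] += wt

    strength = {p: 1.0 for p in players}
    for _ in range(iters):
        # Start each denominator with the two virtual matches against
        # the fixed reference wrestler of strength 1.
        denom = {p: 2.0 * prior / (strength[p] + 1.0) for p in players}
        for w, l, wt in weighted:
            share = wt / (strength[w] + strength[l])
            denom[w] += share
            denom[l] += share
        strength = {p: (wins[p] + prior) / denom[p] for p in players}
    return strength

def win_probability(strength, a, b):
    """Bradley-Terry probability that a beats b."""
    return strength[a] / (strength[a] + strength[b])

# Made-up results: A beat B twice recently, B beat C twice recently,
# and C upset A almost two years ago.
example_matches = [
    ("A", "B", 10), ("A", "B", 20),
    ("B", "C", 15), ("B", "C", 30),
    ("C", "A", 700),
]
example_strength = fit_bradley_terry(example_matches)
```

Raising half_life_days makes the old upset count for more, and lowering it makes this month's results dominate, so that one knob is the recency bias. If you want numbers that look like Elo ratings, 400 * log10(strength) puts the strengths on the same scale, because Elo's expected score 1 / (1 + 10^((R_B - R_A) / 400)) has exactly the same form as strength_A / (strength_A + strength_B).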