Title: Skill Estimation Based on a Single Bayesian Network
Description: Most estimators implemented by the video game industry cannot obtain reliable initial estimates nor guarantee comparability between distant estimates. TrueSkill Through Time solves both problems by modeling the entire history of activities as a single Bayesian network, which lets information propagate correctly throughout the system. The algorithm requires only a few iterations to converge, so millions of observations can be analyzed on any low-end computer. The core ideas implemented in this project were developed by Dangauthier P, Herbrich R, Minka T, Graepel T (2007), "TrueSkill Through Time: Revisiting the History of Chess", <https://dl.acm.org/doi/10.5555/2981562.2981605>.
Authors: Gustavo Landfried [aut, cre]
Maintainer: Gustavo Landfried <[email protected]>
License: GPL (>= 3)
Version: 0.1.1
Built: 2024-11-22 05:52:24 UTC
Source: https://github.com/glandfried/trueskillthroughtime.r
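As a quick orientation, here is a minimal sketch using only the API documented below (it assumes the package is attached under its CRAN name, and the composition is illustrative):

# Assumption: the package is installed as TrueSkillThroughTime.
library(TrueSkillThroughTime)

# Three games between players "a", "b", and "c". With no explicit
# results, the first team listed in each game is the winner.
composition = list(list(c("a"), c("b")), list(c("b"), c("c")), list(c("c"), c("a")))
h = History(composition, gamma = 0.03)

# Propagate information through the single Bayesian network and
# inspect the resulting learning curves.
h$convergence()
lc = h$learning_curves()
lc[["a"]]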
Game class
Game(teams, result = vector(), p_draw = P_DRAW)

posteriors(g)
## S4 method for signature 'Game'
posteriors(g)
teams
A list of teams, where each team is a vector of Player objects.
result
A vector of numbers with the score obtained by each team, or an empty vector. The default value is an empty vector; in this case, the outcome is defined by the order in which the teams list was initialized: teams appearing earlier in the list defeat those appearing later.
p_draw
A number, the probability of a draw. The default value is P_DRAW = 0.
g
A Game object.
Game object
a1 = Player(Gaussian(mu = 0, sigma = 6), beta = 1, gamma = 0.03)
a2 = Player(); a3 = Player(); a4 = Player()
team_a = c(a1, a2)
team_b = c(a3, a4)
teams = list(team_a, team_b)

g = Game(teams)
post = posteriors(g)
lhs = g@likelihoods
post[[1]][[1]] == lhs[[1]][[1]] * a1@prior
ev = g@evidence
ev == 0.5

ta = c(a1)
tb = c(a2, a3)
tc = c(a4)
teams_3 = list(ta, tb, tc)
result = c(1, 0, 0)
g3 = Game(teams_3, result, p_draw = 0.25)
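For additional intuition, the evidence slot can be read as the prior probability of the observed outcome. A sketch (the player strengths are illustrative):

# With no explicit result the first team is the winner, so a game
# won by the a-priori stronger player should be less surprising
# (higher evidence) than the reverse upset.
strong = Player(Gaussian(mu = 2, sigma = 1))
weak = Player(Gaussian(mu = 0, sigma = 1))
g_expected = Game(list(c(strong), c(weak)))
g_upset = Game(list(c(weak), c(strong)))
g_expected@evidence > g_upset@evidence  # TRUE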
Gaussian class
Gaussian(mu = 0, sigma = 1)

Pi(N)
## S4 method for signature 'Gaussian'
Pi(N)

Tau(N)
## S4 method for signature 'Gaussian'
Tau(N)

forget(N, gamma, t)
## S4 method for signature 'Gaussian,numeric,numeric'
forget(N, gamma, t)

isapprox(N, M, tol = 1e-04)
## S4 method for signature 'Gaussian,Gaussian,numeric'
isapprox(N, M, tol = 1e-04)

## S4 method for signature 'Gaussian,Gaussian'
e1 + e2
## S4 method for signature 'Gaussian,Gaussian'
e1 - e2
## S4 method for signature 'Gaussian,Gaussian'
e1 * e2
## S4 method for signature 'Gaussian,Gaussian'
e1 / e2
## S4 method for signature 'Gaussian,Gaussian'
e1 == e2

## S4 method for signature 'Player'
performance(a)
mu
A number, the mean of the Gaussian distribution.
sigma
A number, the standard deviation of the Gaussian distribution.
N
A Gaussian object.
gamma
A number, the dynamic factor: the dynamic uncertainty added over elapsed time.
t
A number, the elapsed time.
M
A Gaussian object.
tol
The tolerance threshold for comparisons.
e1
A Gaussian object.
e2
A Gaussian object.
a
A Player object.
Gaussian object
N01 = Gaussian(0, 1); N12 = Gaussian(mu = 1, sigma = 2)
N06 = Gaussian(); Ninf = Gaussian(0, Inf)
N01 * Ninf == N01
N01 * N12
N01 / N12
N01 + N12
N01 - N12
Pi(N12) == 1/(N12@sigma^2)
Tau(N12) == N12@mu/(N12@sigma^2)
Nnew = forget(N = N01, gamma = 0.01, t = 100)
isapprox(Nnew, Gaussian(N01@mu, sqrt(N01@sigma^2 + 100*(0.01^2))), tol = 1e-6)
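The arithmetic operators follow the standard Gaussian identities: `*` and `/` act on the natural parameters Pi and Tau, while `+` and `-` act on the moments of independent variables. A short check of these identities (a sketch; the tolerances are illustrative):

N01 = Gaussian(0, 1); N12 = Gaussian(1, 2)

# Multiplying two Gaussian densities adds precisions (Pi) and
# precision-weighted means (Tau).
Nprod = N01 * N12
abs(Pi(Nprod) - (Pi(N01) + Pi(N12))) < 1e-9
abs(Tau(Nprod) - (Tau(N01) + Tau(N12))) < 1e-9

# Summing two independent Gaussian variables adds means and variances.
Nsum = N01 + N12
Nsum@mu == N01@mu + N12@mu
abs(Nsum@sigma - sqrt(N01@sigma^2 + N12@sigma^2)) < 1e-9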
History class
composition
A list of lists of players' names (ids). Each position of the list represents a game: a list containing one vector of names per team, giving the composition of each team in that game.
results
A list of numeric vectors representing the outcome of each game. It must have the same length as the composition list.
times
A numeric vector, the timestamp of each game. It must have the same length as the composition list.
priors
A hash object, a dictionary of Player objects indexed by the players' names (ids).
mu
A number, the prior mean. The default value is: MU = 0.
sigma
A number, the prior standard deviation. The default value is: SIGMA = 6.
beta
A number, the standard deviation of the performance. The default value is: BETA = 1.
gamma
A number, the amount of uncertainty (standard deviation) added to the estimates between events. The default value is: GAMMA = 0.03.
p_draw
A number, the probability of a draw. The default value is P_DRAW = 0.
epsilon
A number, the convergence threshold used to stop the convergence procedure. The default value is EPSILON = 1e-6.
iterations
A number, the maximum number of iterations for convergence, used to stop the convergence procedure. The default value is ITERATIONS = 30.
History object
size
A number, the number of games.
batches
A vector of Batch objects, each containing the games that occur at the same timestamp.
agents
A hash, a dictionary indexed by the players' names (ids).
time
A boolean indicating whether the history was initialized with timestamps or not.
mu
A number, the default prior mean in this particular History object.
sigma
A number, the default prior standard deviation in this particular History object.
beta
A number, the default standard deviation of the performance in this particular History object.
gamma
A number, the default dynamic uncertainty in this particular History object.
p_draw
A number, the probability of a draw in this particular History object.
h_epsilon
A number, the convergence threshold in this particular History object.
h_iterations
A number, the maximum number of iterations for convergence in this particular History object.
convergence(epsilon = NA, iterations = NA, verbose = TRUE)
initialize(
composition,
results = list(),
times = c(),
priors = hash(),
mu = MU,
sigma = SIGMA,
beta = BETA,
gamma = GAMMA,
p_draw = P_DRAW,
epsilon = EPSILON,
iterations = ITERATIONS
)
learning_curves()
log_evidence()
c1 = list(c("a"), c("b"))
c2 = list(c("b"), c("c"))
c3 = list(c("c"), c("a"))
composition = list(c1, c2, c3)
h = History(composition, gamma = 0.0)

trueskill_learning_curves = h$learning_curves()
ts_a = trueskill_learning_curves[["a"]]
ts_a[[1]]$N; ts_a[[2]]$N
ts_a[[1]]$t; ts_a[[2]]$t

h$convergence()
trueskillThroughTime_learning_curves = h$learning_curves()
ttt_a = trueskillThroughTime_learning_curves[["a"]]
ttt_a[[1]]$N; ttt_a[[2]]$N
ttt_a[[1]]$t; ttt_a[[2]]$t

## Not run:
# Synthetic example
library(hash)
N = 100
skill <- function(experience, middle, maximum, slope){
  return(maximum/(1 + exp(slope*(-experience + middle))))
}
target = skill(seq(N), N/2, 2, 0.075)
opponents = rnorm(N, target, 0.5)

composition = list(); results = list(); times = c(); priors = hash()
for(i in seq(N)){ composition[[i]] = list(c("a"), c(toString(i))) }
for(i in seq(N)){
  results[[i]] = if(rnorm(1, target[i]) > rnorm(1, opponents[i])){ c(1, 0) }else{ c(0, 1) }
}
for(i in seq(N)){ times = c(times, i) }
for(i in seq(N)){ priors[[toString(i)]] = Player(Gaussian(opponents[i], 0.2)) }

h = History(composition, results, times, priors, gamma = 0.1)
h$convergence()
lc_a = h$learning_curves()$a
mu = c()
for(tp in lc_a){ mu = c(mu, tp[[2]]@mu) }
plot(target)
lines(mu)

# Plotting learning curves
# First solve your own example. Here is a dummy one.
agents <- c("a", "b", "c", "d", "e")
composition <- list()
for (i in 1:500) {
  who = sample(agents, 2)
  composition[[i]] <- list(list(who[1]), list(who[2]))
}
h <- History(composition = composition, gamma = 0.03, sigma = 1.0)
h$convergence(iterations = 6)

# Then plot some learning curves
lc <- h$learning_curves()
colors <- c(rgb(0.2,0.2,0.8), rgb(0.2,0.8,0.2), rgb(0.8,0.2,0.2))
colors_alpha <- c(rgb(0.2,0.2,0.8,0.2), rgb(0.2,0.8,0.2,0.2), rgb(0.8,0.2,0.2,0.2))
plot(0, 0, xlim = c(0, 500), ylim = c(-1, 1), xlab = "t", ylab = "skill", type = "n")
for (i in 1:3) {
  agent <- agents[i]
  t <- c(); mu <- c(); sigma <- c()
  for(x in lc[[agent]]){
    t <- c(t, x$t)
    mu <- c(mu, x$N@mu)
    sigma <- c(sigma, x$N@sigma)
  }
  lines(t, mu, col = colors[i], lwd = 2, type = "l")
  polygon(c(t, rev(t)), c(mu + sigma, rev(mu - sigma)), col = colors_alpha[i], border = NA)
}
legend("topright", legend = agents[1:3], col = colors, lwd = 2)
## End(Not run)
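The log_evidence() method listed above is not exercised in the examples; here is a sketch of how it could be used to compare hyperparameter choices (the gamma values are illustrative):

# Higher log-evidence means the model anticipated the observed
# outcomes better, so it can be used to select gamma.
composition = list(list(c("a"), c("b")), list(c("b"), c("c")), list(c("c"), c("a")))
h1 = History(composition, gamma = 0.01); h1$convergence(verbose = FALSE)
h2 = History(composition, gamma = 0.10); h2$convergence(verbose = FALSE)
h1$log_evidence(); h2$log_evidence()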
Print a list of Gaussians using the Python and Julia syntax
lc_print(lc.a)
lc.a
A list of Gaussian objects.
No return value; called for its side effect of printing a list of Gaussians using the Python and Julia syntax.
Player class
Player(prior = Nms, beta = BETA, gamma = GAMMA)

performance(a)
prior
A Gaussian object, the prior belief distribution of the skills. The default value is: Nms = Gaussian(MU, SIGMA).
beta
A number, the standard deviation of the performance. The default value is: BETA = 1.
gamma
A number, the amount of uncertainty (standard deviation) added to the estimates at each time step. The default value is: GAMMA = 0.03.
a
A Player object.
Player object
a1 = Player(prior = Gaussian(0, 6), beta = 1, gamma = 0.03); a2 = Player()
a1@gamma == a2@gamma
N = performance(a1)
N@mu == a1@prior@mu
N@sigma == sqrt(a1@prior@sigma^2 + a1@beta^2)
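The gamma parameter drives the forget() method of the Gaussian class: between events, a player's skill uncertainty grows with the elapsed time. A sketch (the time span is illustrative):

p = Player(Gaussian(0, 1), beta = 1, gamma = 0.03)
# After t time steps without evidence, sigma grows to
# sqrt(sigma^2 + t * gamma^2), as computed by forget().
later = forget(p@prior, p@gamma, t = 10)
later@sigma > p@prior@sigma  # TRUE: uncertainty increased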