How to lose 15 euros scraping soccer websites (and betting, of course)

Everybody wants to become rich. There are a few ways to get there: you could work hard for many years or, if you want something easier and faster, you could bet on sports, for instance, and cross your fingers.
I chose the second option because it was easier, faster and more fun, and because I really like soccer.
So I decided to build a web app (a Shiny app written in R) that could tell me what will happen in a soccer match (Spanish League).
In Spain we have what we call «La Quiniela», a game (organized by the Government) where you need to predict the results of 15 matches.

I decided to apply one thing: logic. Even though it is a game, Real Madrid beating Leganes (with all due respect to Leganes supporters) is more probable than the white team losing.

To apply logic to the results I decided to use statistics (R is a great language for that), so the steps were:

1) Read previous results (historical results)
2) Apply algorithm to predict
3) Paint results and predictions

1) Read previous results

I decided to use an R library called rvest, which lets you read a web page and extract information from it.
Once I had found a good website to read from, I used the following R code:

# rvest provides read_html(), html_nodes(), html_table() and the %>% pipe
library(rvest)

html <- read_html(url)

Where url is the address of the page with the results.
Then we need to process the results and put them into a table or another structure suited to managing data (in R, a data frame is a good one):

# Select every node that carries the CSS class "tablaresultados"
table <- html %>%
  html_nodes(".tablaresultados")

## The 4th table is the classification
dfClassification <- table[[4]] %>%
  html_table(fill = TRUE)
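To try the same two steps without depending on the live site, here is a minimal sketch that parses a small inline HTML string. The table contents and column names are made up for illustration; only the CSS class comes from the code above. It also checks how many tables were found before indexing, since picking a table by position breaks silently if the page layout changes.

```r
library(rvest)

# Hypothetical stand-in for the real page: one table with the class used above.
page <- read_html('
  <table class="tablaresultados">
    <tr><th>Team</th><th>Played</th><th>Points</th></tr>
    <tr><td>Real Madrid</td><td>10</td><td>24</td></tr>
    <tr><td>Leganes</td><td>10</td><td>9</td></tr>
  </table>')

# Same steps as in the post: select by CSS class, then convert to a data frame.
nodes <- html_nodes(page, ".tablaresultados")

# Guard against a layout change before indexing by position.
stopifnot(length(nodes) >= 1)

standings <- html_table(nodes[[1]])

# Scraped columns are not always typed as expected, so coerce explicitly.
standings$Points <- as.integer(standings$Points)
print(standings)
```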

Once I have the previous results, it is time to predict.

2) Apply algorithm to predict

I decided to develop my own algorithm. It takes the following factors into account:

1) Goals scored at home by the home team.
2) Goals conceded at home by the home team.
3) Goals scored away by the visiting team.
4) Goals conceded away by the visiting team.

Then I decided to give some importance (a configurable weight) to the most recent matches, in order to account for a team's current form (a team may be bottom of the table and yet unbeaten in its last 5 matches). How many matches should we look back? That is configurable too.
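The post does not spell out the formula, so the following is only one possible way to combine the four factors with a weight on recent form, not the app's actual algorithm. The data frame, the function names, the parameters (n_recent, recent_weight, draw_margin) and the way the factors are averaged are all assumptions. The output uses the 1 / X / 2 signs of La Quiniela (home win, draw, away win).

```r
# Hypothetical match history, one row per played match, in chronological order.
matches <- data.frame(
  home       = c("A", "B", "A", "C", "B", "C"),
  away       = c("B", "C", "C", "A", "A", "B"),
  home_goals = c(2, 1, 3, 0, 1, 2),
  away_goals = c(0, 1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Blend the whole-history mean with the mean of the last `n_recent` values.
# recent_weight = 0 ignores recent form, recent_weight = 1 uses only recent form.
blended_mean <- function(x, n_recent, recent_weight) {
  if (length(x) == 0) return(NA_real_)
  (1 - recent_weight) * mean(x) + recent_weight * mean(tail(x, n_recent))
}

predict_match <- function(matches, home_team, away_team,
                          n_recent = 5, recent_weight = 0.3,
                          draw_margin = 0.25) {
  h <- matches[matches$home == home_team, ]  # home team's home matches
  a <- matches[matches$away == away_team, ]  # away team's away matches

  # The four factors from the list above.
  home_scored   <- blended_mean(h$home_goals, n_recent, recent_weight)
  home_conceded <- blended_mean(h$away_goals, n_recent, recent_weight)
  away_scored   <- blended_mean(a$away_goals, n_recent, recent_weight)
  away_conceded <- blended_mean(a$home_goals, n_recent, recent_weight)

  # Assumed combination: average what one side scores with what the other concedes.
  home_expected <- (home_scored + away_conceded) / 2
  away_expected <- (away_scored + home_conceded) / 2

  diff <- home_expected - away_expected
  if (abs(diff) < draw_margin) "X" else if (diff > 0) "1" else "2"
}

print(predict_match(matches, "A", "B"))
```

Both n_recent and recent_weight are plain function arguments here, which mirrors the "configurable" knobs mentioned in the text.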

With all these factors we can then apply the algorithm and get predictions.

3) Paint results and predictions

I decided to use the R library ggplot2 to paint the results. For instance, the evolution over time of the number of correct predictions:

[Screenshot: evolution of correct predictions over time]
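A chart of this kind could be drawn with ggplot2 roughly as follows. The numbers are invented for illustration and are not the app's real results; the column names are assumptions as well.

```r
library(ggplot2)

# Hypothetical log: matchday number and how many of the 15 predictions were right.
history <- data.frame(
  matchday = 1:6,
  correct  = c(6, 8, 7, 9, 8, 10)
)
history$hit_rate <- history$correct / 15

# Line chart of the share of correct predictions per matchday.
p <- ggplot(history, aes(x = matchday, y = hit_rate)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(limits = c(0, 1)) +
  labs(x = "Matchday", y = "Share of correct predictions")

print(p)
```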

With the DT library, the application shows the prediction in one column and the real result in another. It updates in real time and gives a clear table: green means a correct prediction, red means a wrong one:

[Screenshot: table of predictions and real results]
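As a rough sketch of how such a colored table might be built with DT, the example below marks each row as a hit or a miss and colors the whole row accordingly. The match data and column names are hypothetical, and the app may well do this differently.

```r
library(DT)

# Hypothetical predictions next to the real results (1 = home win, X = draw, 2 = away win).
results <- data.frame(
  match      = c("A - B", "C - D", "E - F"),
  prediction = c("1", "X", "2"),
  actual     = c("1", "2", "2"),
  stringsAsFactors = FALSE
)

# 1 when the prediction matched the real result, 0 otherwise.
results$hit <- as.integer(results$prediction == results$actual)

# Color the whole row based on the value of the "hit" column.
tbl <- datatable(results, rownames = FALSE)
tbl <- formatStyle(
  tbl, "hit",
  target = "row",
  backgroundColor = styleEqual(c(1, 0), c("lightgreen", "salmon"))
)

tbl
```

Inside a Shiny app the same widget would typically be returned from renderDT() and shown with DTOutput().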

After some matches, the app adjusts the configurable weights in order to learn and predict better the following week.
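The post does not say how the weights are adjusted. One simple option, shown here purely as an assumption, is a grid search: try several values for the weight on recent form, score each one against matches that have already been played, and keep the best. The backtest data and all names below are made up.

```r
# Hypothetical backtest data: for each past match, the goal-difference estimate
# from the whole season, the estimate from recent form only, and the real sign.
past <- data.frame(
  season_diff = c(1.2, -0.4, 0.1,  0.8, -1.0, 0.3),
  recent_diff = c(0.2, -1.1, 0.9, -0.6, -0.7, 0.1),
  actual      = c("1", "2",  "1",  "X",  "2", "X"),
  stringsAsFactors = FALSE
)

# Turn an expected goal difference into a 1 / X / 2 sign.
to_sign <- function(diff, draw_margin = 0.25) {
  ifelse(abs(diff) < draw_margin, "X", ifelse(diff > 0, "1", "2"))
}

# Share of correct signs for a given weight `w` on recent form.
accuracy <- function(w) {
  blended <- (1 - w) * past$season_diff + w * past$recent_diff
  mean(to_sign(blended) == past$actual)
}

# Try weights from 0 to 1 and keep the first one with the highest accuracy.
candidates  <- seq(0, 1, by = 0.1)
scores      <- vapply(candidates, accuracy, numeric(1))
best_weight <- candidates[which.max(scores)]

print(data.frame(weight = candidates, accuracy = scores))
print(best_weight)
```

With only a handful of matches a search like this fits noise easily, so the chosen weight should be treated with caution until more results are available.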

At this point I am not rich. In fact, I lost 15 euros betting on «La Quiniela», but maybe in the future we can win those euros back…

The app itself is here:

Sebastián Revuelta
