Thursday, May 24, 2012

Tout le monde aura un prix

Having tried to find out whether every team in the Premiership could be characterised as a winner in some (involved and in some cases not very realistic) way, and now that Ligue 1 has finished for the season, it seems only fair to try the same for them.  

Same rules as before – each achievement must be at least arguably positive (i.e. no ‘best at letting in goals’), and the Occam’s Razor approach applies: start with the obvious stuff and only apply filters where necessary.  And same source as before, files available at, with a little bit of help from L'Equipe, and Mr Gibney and Mr Benneworth on Twitter when I got stuck. 

And thanks to Wikisource for the French version of Alice.  

  • ‘accuracy’ is the percentage of shots deemed on target; ‘efficiency’ the percentage of shots on target resulting in a goal; ‘conversion rate’ is the percentage of total shots resulting in a goal; ‘save rate’ is the percentage of shots on target against that did not result in a goal; 
  • filters applied were the home / away record, top half v bottom half, 2011 / 2012, and first / second half.
  1. Montpellier - most points (82), most wins (25), most shots (589), most on target (223), most home wins (16), most home clean sheets (13), and fewest goals conceded (34, joint with Toulouse). And some other stuff. They won! There would be. Anyway.
  2. Paris-SG - most goals (75), most away goals (33), most goals in a single home game (6 v Sochaux), most HT losses turned around for wins (3, v Evian, Lorient and Toulouse), highest conversion rate (14%) and highest efficiency (38%). 
  3. Lille - most home goals (48), fewest goals conceded away (16), most clean sheets away (8), and highest shooting accuracy (39%)
  4. Lyon - scored in every home game
  5. Bordeaux - longest winning streak (6 - and that at the end of the season) and unbeaten run (7 games), most away points in 2012 (18), and most half-time leads (17)
  6. Rennes - most goals in a single away game (6 - poor Sochaux), best travellers (46.7% pts were won away) and equal most away wins (9, with MHSC, PSG and LOSC)
  7. St-Etienne - most corners (243)
  8. Toulouse - fewest conceded (34, joint with MHSC) and most clean sheets (19), a 79% save rate, rising to 85% at home. Take a bow, Ali Ahamada, playing all 38 games in the TFC net there.
  9. Evian TG - longest away unbeaten run (7 games)
  10. Marseille - most shots at home (344)
  11. Nancy - only the one red card for the boys from ASNL. 
  12. Valenciennes - biggest home advantage (1.53 pts per game, 83.7% pts taken at home)
  13. Nice - best bottom half performance against the top half (21 pts, 24 goals)
  14. Sochaux - only 46 yellow cards.
  15. Brest - best away save rate (79.2%) and the early bird prize (61.3% goals scored in first half)
  16. AC Ajaccio - biggest final day jump up the table (2 places - but as that was from 18th to 16th, we're taking it)
  17. Lorient - fewest fouls (466) and the late bloomer prize (74.3% goals came in the second half)
  18. Caen - erm...
  19. Dijon - well...
  20. Auxerre - got nothing.

Again, it must be stressed that all this proves (or entails) absolutely nothing, but it was interesting to see that it was slightly easier to find 'winnings' for teams throughout the table than it was for the Premiership, perhaps a function of the relatively closer standings in Ligue 1.  See the Crazy Scores Comparison dashboard for more detail.  There was less need for multiple applied filters or the 2011/12 split, or going down to results by month.  I possibly could have found something for the remaining teams with a bit more cross-cutting, but as they have all been relegated, perhaps they are best left as they are.

The danger of picking a single metric without context is again well displayed; in the 'mixed blessing' corner, Valenciennes' home advantage might be reassuring for season ticket holders but also results from them being more than a bit iffy away from home (7pts), whereas Rennes travelling well suggests that things might not be all spiffy at home; and Brest being best at getting in early has the corollary that they tended to fade away in the second half, while Lorient being the best second half scorers, similarly, that they did start games rather slowly.
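For anyone wanting to check the Valenciennes numbers, both fall out of one home / away split. A rough sketch in Python: the 7 away points are quoted above, but the 36 home points are my back-calculation from the 83.7% figure (an assumption, not a number read from the files), and 19 games each way is simply half of a 38-game season. The function name is made up for the example.

```python
def home_away_split(home_pts, away_pts, games_each=19):
    """Return (% of points won at home, home advantage in points per game)."""
    total = home_pts + away_pts
    home_share = 100 * home_pts / total
    # 'Home advantage' here means home points per game minus away points per game.
    advantage = (home_pts - away_pts) / games_each
    return round(home_share, 1), round(advantage, 2)


# Valenciennes: 7 away points from the post, 36 home points assumed (see above).
print(home_away_split(36, 7))  # (83.7, 1.53)
```

Read that way, the 'biggest home advantage' is as much a measure of the 7 away points as of anything good happening at home.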

So, again, I have failed in my challenge, but in a slightly less arse-about-face way than last time. Which is nice. 

Saturday, May 19, 2012

Staying Up Late in Ligue 1

The standings at the bottom:

12 - Valenciennes: 40pts, GD -12 (home to Caen)
13 - Nice: 39, -8 (away at OL)
14 - Lorient: 39, -13 (home to PSG)
15 - Sochaux: 39, -21 (home to OM)
16 - Brest: 38, -8 (away at Evian)
17 - Caen: 38, -18 (away at Valenciennes)
18 - Ajaccio: 38, -23 (away at Toulouse)
19 - Dijon: 36, -20 (away at Rennes)
20 - Auxerre: 34 (relegated - home to MHSC)

There's only one 'six-pointer' in this, Valenciennes v Caen, but there are a lot of possible configurations for the final standings, so as I am trying to avoid worrying about MHSC's chances of winning the title, I have been trying to work them out*.
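The configurations can also be brute-forced rather than worked out by hand. What follows is only a sketch under some loud assumptions: every win or defeat is by exactly one goal (so the big goal-difference swings are ignored), teams level on points and goal difference are split by their current position, and Auxerre are left out as already relegated, which leaves two more places to fill. The names `TABLE`, `FIXTURES` and `relegated` are mine, invented for the example.

```python
from itertools import product

# Standings before the final day, as listed above: team -> (points, goal difference).
TABLE = {
    "Valenciennes": (40, -12), "Nice": (39, -8), "Lorient": (39, -13),
    "Sochaux": (39, -21), "Brest": (38, -8), "Caen": (38, -18),
    "Ajaccio": (38, -23), "Dijon": (36, -20),
}

# Final-day fixtures; None means the opponent is outside the relegation picture.
FIXTURES = [
    ("Valenciennes", "Caen"), ("Nice", None), ("Lorient", None),
    ("Sochaux", None), ("Brest", None), ("Ajaccio", None), ("Dijon", None),
]

RESULTS = ("W", "D", "L")  # from the first-named team's point of view
FLIP = {"W": "L", "D": "D", "L": "W"}


def update(table, team, result):
    """Apply one result to a team, assuming a one-goal margin for wins and losses."""
    pts, gd = table[team]
    if result == "W":
        table[team] = (pts + 3, gd + 1)
    elif result == "D":
        table[team] = (pts + 1, gd)
    else:
        table[team] = (pts, gd - 1)


def relegated(outcome):
    """Return the two teams joining Auxerre for one combination of results."""
    table = dict(TABLE)
    for (team, opponent), result in zip(FIXTURES, outcome):
        update(table, team, result)
        if opponent is not None:
            update(table, opponent, FLIP[result])
    order = list(TABLE)  # current position, used only as a last tie-break
    ranked = sorted(table, key=lambda t: (-table[t][0], -table[t][1], order.index(t)))
    return set(ranked[-2:])


def relegation_counts():
    """Count, over every combination of results, how often each team goes down."""
    counts = {team: 0 for team in TABLE}
    total = 0
    for outcome in product(RESULTS, repeat=len(FIXTURES)):
        total += 1
        for team in relegated(outcome):
            counts[team] += 1
    return total, counts


total, counts = relegation_counts()
for team, n in counts.items():
    print(f"{team}: down in {n} of {total} combinations")
```

The counts treat every result as equally likely, which they obviously are not (PSG away is not a coin toss), so they say what is possible, not what is probable.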

Dijon stay up if they beat Rennes and at least two of the four teams above them lose, with the outside chance also of overtaking Lorient or Nice if either of them gets so massively tonked that the current GD gaps (7 and 12 goals respectively) are overturned. This seems highly unlikely, but you never know with PSG. Anyway. Not looking good, but not totally hopeless as Ajaccio, Brest, and Sochaux could well get beaten.  But beating Rennes will be a big ask.

Ajaccio could stay up with a draw if two of the following things happen: Dijon drop points, Brest lose, Caen lose, or Sochaux lose by 3. If they win, they stay up if any of the five teams above them drop points, or Valenciennes lose.

Caen also could stay up with a draw if two of the following happen: Dijon drop points, Ajaccio drop points,  Brest lose, Sochaux lose by three. A win would see them stay up if any of Ajaccio or the four above them drop points (as they are playing Valenciennes).  If they lose, they're in a pickle if two of: Dijon win, Ajaccio or Brest draw or win.

Brest, and their highly suggestive goal difference, will stay up if they draw and any of the three below them drop points, or any of the three above them lose.  If they win, again any points dropped by any of the other teams other than Auxerre would see them safe.

Sochaux, mired in the red zone for much of the season, could still chuck it all away.  They go down if they lose, and any two of the following: Dijon win, Ajaccio win or draw with Sochaux losing by 3, Caen win / draw, Brest win / draw.  If they draw, they only need one of the three below them or two above them to lose.  If they win, they just need one of the same teams to drop points, or Valenciennes to lose.

Lorient could have fewer likely outcomes than some of the teams below them as PSG will be trying quite hard on Sunday evening.  If they lose, they will be looking for two of Caen, Ajaccio and Dijon to drop points. Or Sochaux to be hammered for an over 8-goal swing. A draw, and they only need one of the four below them or one above them to drop points, or Valenciennes to lose by 2.

Nice are OK if they lose if any two of the following: Lorient lose, Sochaux lose, Brest / Caen / Ajaccio drop points, Dijon don't beat Rennes with a 12-goal swing.  Get a point and they just need somebody from Valenciennes or the five below them to drop points. Think they're OK then...

Valenciennes could technically go down but only if they lose and Brest, Caen, Ajaccio, Sochaux all win, and Nice and Lorient pick up points.  If they draw, they're safe.

So - erm.....Dijon look gone, as they have to win to stand any chance. Ajaccio might just cling on with a draw (a win seems highly unlikely - admittedly, a draw is pretty unlikely) as Dijon will almost certainly drop points and one of the other losses needed will probably come in. But they seem the most likely two to join Auxerre in Ligue 2 next season.

Caen will be nervous, but with Dijon unlikely to win, they would need both Ajaccio and Brest to draw or better to send them down.  Brest could well manage that (Evian look a bit disjointed due to injuries and suspensions) but Ajaccio probably not.  Assuming Lorient lose, it would take a hell of an accumulator to put them down as Caen, Ajaccio and Dijon all winning would not be a treble to put on sober.

Thus - in conclusion the current bottom three will probably not change.  All that for that. Pah. Anyway...

*There's almost certainly at least one massive error in my workings here. Unfortunately there are no prizes for spotting one...sorry.

Wednesday, May 16, 2012

Beautiful Game, Fairer Sex, Yadda Yadda Yadda...

This Saturday sees the epic clash between Bayern München and Chelsea – the latter responsible for dashing the hopes of those wanting to see Barcelona become the first team to win back-to-back Champions League trophies.


On Thursday 17 May, we might still see that happen*, as Olympique Lyonnais take on FFC Frankfurt in the Women’s Champions League Final 2012, at the Olympiastadion in Munich.    

(Good job it's across town, or they'd have to be sure to leave the turf tidy. Didier might trip.) 

Lyon are the defending champions, with a strong record in the competition; semi-finalists in 2008 and 2009 and finalists in 2010 (losing to Potsdam on penalties), before winning last year’s get-your-own-back clash 2-0 at Craven Cottage against Potsdam, who they also beat in this year’s semi-finals. 

Potsdam were in fact the only team not to lose – indeed, the only team to score an actual goal - against the reigning French champions, with the two legs running out 5-1 and 0-0.

Lyon have been scoring like, erm, high-scoring things in this year’s campaign – beating both Olympia Cluj and Sparta Prague 12-0 on aggregate (yes, each) before easing off to beat Brondby only 8-0 on aggregate in the quarter finals.  Thus, their tally over the eight games played so far is 37 for, 1 against, and Eugénie Le Sommer and Camille Abily are joint top scorers in the competition with 8 apiece.  Woof.  They can also call on the talents of a fair proportion of the French national team, currently top of their group for the Euro 2013 qualifiers with 18 points (context: Scotland are second on 7) – 11 of the current 21 Bleues are from OL.

Frankfurt have been rather less prolific, beating both Stabaek and PSG 4-2 on aggregate and then Malmo 3-1 in the quarters, before managing their only dual win of the campaign over Arsenal in the semi-finals, finishing 4-1 on aggregate.  They, thus, don’t make the top scorer charts, with former German international midfielder Kerstin Garefrekes top on 4.  They have 4 current German internationals on their side.

Looking at Lyon’s results, both in this competition and domestically, does point up one noticeable problem with the women’s game – the smaller nature of the game in simple number terms means a bigger gulf in class between teams.  There is a massive drop off in Division 1 in France between the top four (traditionally those who make up the French national team – Lyon, Juvisy, PSG and Montpellier) and the rest.  Lyon’s last five games in the league? 7-0, 3-0, 4-0, 8-0, 6-0. They have a goal difference of +100 at the moment. So, well done Muret, for keeping it down to three (they’re bottom of the table). 

Meanwhile, Frankfurt, runners up last season, are fourth in the Frauen Bundesliga, which also shows a big split between the top four (Potsdam, Duisburg, Wolfsburg are above them) and the rest. They however are less dominant – they most recently got beaten by Potsdam 3-1; given the current domestic league positions and the history of the Women’s CL more generally, Lyon might legitimately be feeling that they have got the hardest challenge out of the way already.

The stats suggest a convincing win for Lyon is on the cards, and going for over 2.5 goals would look safe (which is why it’s 27/40 at present).  The odds at the time of writing are OL 4/7, draw 14/5, Frankfurt 15/4 – I’ve gone for Lyon to win 4-1 at 22/1, just to have something to aim for.  
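For those who don't think in fractions, the odds quoted above can be turned into implied probabilities: fractional odds of a/b imply a chance of b / (a + b). A small sketch, with `implied_probability` being a name made up for the example; the odds themselves are the ones quoted in this post.

```python
from fractions import Fraction


def implied_probability(numerator, denominator):
    """Implied probability of fractional odds numerator/denominator."""
    return Fraction(denominator, numerator + denominator)


match_odds = {"Lyon": (4, 7), "draw": (14, 5), "Frankfurt": (15, 4)}
probs = {outcome: implied_probability(*odds) for outcome, odds in match_odds.items()}

# Anything above 100% in total is the bookmaker's margin (the overround).
overround = sum(probs.values()) - 1

for outcome, p in probs.items():
    print(f"{outcome}: {float(p):.1%}")
print(f"overround: {float(overround):.1%}")
print(f"over 2.5 goals at 27/40: {float(implied_probability(27, 40)):.1%}")
```

That comes out at roughly 64% Lyon, 26% draw and 21% Frankfurt, which adds up to about 111%; the extra 11% is the bookmaker's cut rather than anyone's actual chance of winning.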

What I remember most from last year’s clash was the close control, a precise short-passing game, the commentator making the inevitable ‘fox in the box’ joke when Wendie Renard scored, and an epic performance by Sarah Bouhaddi in the Lyon goal, thus giving the lie to the stereotype that women, being short, are shit goalies. Think about it – Wayne Hennessey’s 6 ft 6. He’s not very good.  Anyway, I am hoping for more of the same, plus goals goals goals, and for Lyon to retain their crown. 

As well as the goalscorers, key players for Lyon will be Louisa Necib, defenders Wendie Renard and Sonia Bompastor, and for Frankfurt, Fatmire Bajramaj and, inevitably, goalkeeper Desirée Schumann (standing in since the quarter-finals for knee-knacked German international Nadine Angerer).  They may not be given much chance, but if Frankfurt can get an early goal, that would keep things interesting.

The game is being shown on Direct 8 in France, ARD in Germany, and Eurosport in the UK and US.  I will be live-tweeting the game for @FrenchFtWeekly, hoping that Frankfurt's Ana Maria Crnogorčević has a quiet match.

*Technically, it’s already happened – this is only the third season for the Women’s Champions League; before that, it was called the UEFA Women’s Cup, won in 2003 and 2004 by Umeå IK from Sweden.  Although that also means that certain bands of men have also managed the back-to-back thing (easy...) pre-1992 when the competition was the European Cup.

Monday, May 14, 2012

All Must Have Prizes

Following my earlier piece on the limitations of a statistical approach to football analysis, the sight of Bolton at the top of the shooting accuracy chart caused me to set myself a little challenge: to see if each team in the Premiership could in fact claim to be ‘number one’ at something.

Fig. 1 - the Anfield Cat

Now, I’ve rigged numbers before (things like fig. 1 don’t just happen), considered it on several other occasions (giving up plotting the relative positions of Didier Drogba’s head and ground level in Chelsea v Barcelona because sitting down at half-time was too difficult to force on the x-axis), and screwed up a couple of times (the form guide that suggested certain teams were averaging more than three points a game early on in the season) but I can assure you that all these numbers are echt, coming as they do from the lovely people at (thus, certain stats such as shots / on target may differ from other data sources, and I don’t have possession / pass completion stats).

Not Opta, I’m afraid. I can’t afford Opta. If anyone is interested, my birthday is in January...

The rules – each achievement must be positive (i.e. no ‘best at letting in goals’), so yes, I was most worried about finding something for Villa; and the Occam’s Razor approach applies, start with the obvious stuff and only apply filters where necessary.

  • ‘accuracy’ is the percentage of shots deemed on target; ‘efficiency’ the percentage of shots on target resulting in a goal; ‘conversion rate’ is the percentage of total shots resulting in a goal; ‘save rate’ is the percentage of shots on target against that did not result in a goal. 
  • Filters applied include home / away record, 2011 / 2012 split, and, when I got desperate, results by month and multiple combinations of the above.
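The four definitions above, written out as a minimal sketch so there is no ambiguity about what is divided by what. The function names are mine; the worked example uses Manchester City's season totals as given in the results in this post (93 goals, 666 shots, 399 on target), and the save-rate figures in the last line are made-up round numbers purely for illustration.

```python
def pct(part, whole):
    """Percentage to two decimals; None when there is nothing to divide by."""
    return round(100 * part / whole, 2) if whole else None


def shooting_metrics(goals, shots, on_target):
    """Accuracy, efficiency and conversion rate as defined in the post."""
    return {
        "accuracy": pct(on_target, shots),    # shots on target / total shots
        "efficiency": pct(goals, on_target),  # goals / shots on target
        "conversion": pct(goals, shots),      # goals / total shots
    }


def save_rate(on_target_against, goals_against):
    """Share of shots on target against that did not end up as a goal."""
    return pct(on_target_against - goals_against, on_target_against)


# Manchester City, season totals quoted in this post.
print(shooting_metrics(goals=93, shots=666, on_target=399))
# {'accuracy': 59.91, 'efficiency': 23.31, 'conversion': 13.96}

# Hypothetical numbers: 100 shots on target faced, 21 goals conceded.
print(save_rate(100, 21))  # 79.0
```

Conversion is just accuracy multiplied by efficiency, which is why a team can top one chart and be nowhere on the others.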

Here are the results – well done everybody.
  1. Manchester City – most goals (93), most shots (666), most shots on target (399), most shots and shots on target in a single game (35, 24 – v QPR), fewest goals conceded (20), most home points (55), most home wins (18) and at that point I stopped writing down any more things. There were lots of others.
  2. Manchester United – most first half goals (40), most away points (42), most away wins (13) and again, stopped writing them down, and again, there were lots of others.
  3. Arsenal – highest 2011 away accuracy (64.18%), most goals in February (14)
  4. Tottenham Hotspur – highest number of corners in a game (19 v Villa – joint with MCFC v QPR and NUFC v SAFC), most points in September (9), November (9), January (10) and May (7), most goals in November (8), January (9) and May (7)
  5. Newcastle United – highest 2012 away efficiency (28.89%)
  6. Chelsea – most shots in 2011 (314)
  7. Everton – highest 2012 home save rate (92.31%) and efficiency (27.54%), most goals in April (14)
  8. Liverpool – most corners (308)
  9. Fulham – the only team with no red cards this season. Good lads, they are. Also, recovered two half time deficits to win, so highest turnarounds.
  10. West Bromwich Albion – most away goals in February (6)
  11. Swansea – fewest fouls (309) and yellow cards (40), highest home save rate (87%)
  12. Norwich – most away goals in January (4 – joint with Liverpool and Sunderland)
  13. Sunderland – highest 2012 efficiency (24.47%) and save rate (86.36%)
  14. Stoke – highest home efficiency (26.6%)
  15. Wigan Athletic – most points in April (9)
  16. Aston Villa – I swear I am not doing this on purpose but I genuinely can’t find a thing...
  17. Queens Park Rangers – highest percentage of points at home (70.27% - rising to 95% in 2012)
  18. Bolton – highest home accuracy (64.96%) and 2011 accuracy (67.45%)
  19. Blackburn Rovers – highest 2011 away efficiency (31.37%) and conversion rate (18.6%)
  20. Wolverhampton Wanderers – highest percentage of points away in 2012 (87.5%)
This of course proves absolutely nothing, but it was interesting to see how difficult it was to find 'winning metrics' for teams high up the table - Arsenal, for example; one might expect a more obvious skill or facet to the game to emerge from the dataset for a team that finishes third; ditto Newcastle and Chelsea. What this perhaps indicates is the weird up-and-down nature of the season, with teams having slow starts, slumps, collapses etc - and conversely, with Wigan and Everton doing their usual end-of-season thing.

While Villa is the only team I couldn't find anything for, the QPR and Wolves stats are, of course, a half-the-story story, meaning mostly that they were shocking away from / at home respectively. But I am a slave to the numbers.

Anyway, I have failed, so it really doesn't matter. As you were.

Wednesday, May 9, 2012

Giroud Interview with Sport Bild, per L'Equipe

Original (French) version here - this is not an official translation, meant only to help out non-French speakers. Any errors are my own. But I have tried. Promise.
Olivier Giroud spoke to Sport Bild on Wednesday, and revealed he was aware of the interest of Bayern München and that he felt ready to deal with competition with Mario Gomez: “I have not personally had discussions with Bayern but my agent has.  We are both the same height and have a similar profile, but there are also differences.  Mario Gomez is very adroit in front of goal, very strong in the area; I can also score from distance. Given the number of competitions that Bayern are in, Gomez and I could rotate.”

Giroud also confirmed that he has been advised by Franck Ribéry and that he would like to have his future settled before the Euros: “I have already spoken to Ribéry about the Bundesliga, the fans, the stadia and life in Germany.  I have to say that the Premier League remains a notch above the Bundesliga.  But I have regularly followed Bundesliga matches.  I would definitely like to settle my future before Euro 2012.”

Saturday, May 5, 2012

Caveat Statto: The Limitations of the Statistical Approach

Premiership Stats Visuals: See Dynamic Dashboard

Defining Terms

In a dataset for a football match, certain actions are fixed, and others a matter of opinion; however, they are fixed by the referee, which is the first problem.  The whistle is blown for fouls, free kicks, corners and goal kicks are awarded, offside calls made, cards handed out – all these things happen, and are therefore included in the dataset.  They may, however, be wrong.  This has a knock-on effect on the dataset; for example, QPR’s disallowed goal against Bolton in March.  Where is that in the dataset?  It was definitely a shot; it was definitely on target.  It was also definitely a goal, by usual objective standards – but not according to the referee-defined objectivity of the game.  In recording that action, the observer must not designate it ‘a goal’, as it wasn’t (although it was); that then feeds into the realm of opinion in its recording in the shots statistics.

In that realm, there is further confusion; the recording of shots and shots on target is ostensibly a record of fact, but that is not necessarily the case.  When Emile Heskey’s attempt against Manchester United sheared off for a throw-in, is that recorded as a shot? He clearly meant to shoot, but if the ball ends up way over there, it can cause confusion – would any forward (or sideways) pass in the attacking third then be a shot?  No, but only because they are not intended as such. Then we are reduced to considering the motives of players in performing particular actions, which is frowned upon in other circumstances, such as when considering whether or not player A is or is not that kind of player, or when in ‘but did he mean it?’ situations (eg Olivier Giroud – 25 secs in, Papiss Demba Cissé, Tim Howard) where it really doesn’t matter if he did or didn’t, it went in, and thus the result defines the previous action.

A further example of this difficulty is shown in the different records that may exist for the same match; in the Manchester Derby, there was much talk of Manchester United not mustering a shot on target – according to the dataset used in my analysis, they managed 4 shots, 2 of which were on target, and thus did not ‘do a Blackburn’ (where no shots are recorded in their match v Tottenham).  In this realm, therefore, there will also be variances between different datasets.

Between these two positions are other actions that definitely happen, but without the sanction of the referee; passes completed, tackles won, etc.  These simply need to be seen and recorded by the observer and included in the dataset.  This inclusion is however factual rather than entailing any particular judgment, which is connected to our second problem.

Which Metrics? Quantifying Quality

No single metric can define a match; it is of course getting the goal in the back of the net that counts, but with the other team attempting to do that too (unless they are Blackburn playing Tottenham), even goals scored is not sufficient.  ‘Points’ is the final definer of a result, of course, but is in itself a result of a combination of actions rather than an action in itself.

Some metrics, such as possession, pass completion rate, and assists are simultaneously lauded and derided as measures of quality.  The first two in particular are used to demonstrate the dominance of a team, which mostly works as the highest performers in these areas tend to be Barcelona; however there is no causation here (see next section).  When Swansea played Newcastle in April, they had 77% possession, and completed 835 passes to Newcastle’s 181. They also lost 2-0. Thus, high possession and pass completion rates are useful in terms of potential, but that potential still has to be realised.

The assist is a tricky beast – and here, the French refer to a ‘decisive pass’, which seems more useful, as otherwise Hazard’s rabona against PSG would probably not count as ‘an assist’ as it bounced off De Melo first, before Roux got to finish – as an assist could be a beautiful piece of individual skill to set up a tap-in, or just the last mug to touch the ball before the striker did all the work.  The same can be said of goals, of course, but as they are used primarily as a team-metric, and to define individual performance only as a subsidiary, this is less pronounced.

As statistical analysis becomes more prevalent in the footballing discourse, there is occasionally the feeling that analysts are searching for more esoteric metrics to distinguish them from the ball in the back of the net crowd.  This can make life difficult.  An example – shooting accuracy might be considered a good reflector of quality, but if we look at that metric alone (% of shots that are on target) a slight drawback emerges (Fig 1).   

Fig. 1 - Best Shooting Accuracy by Team
Alternatively, when Arsenal were shipping goals all over the place early on in the season, there was still an insistence that Wojciech Szczesny is a fine goalkeeper (and that David de Gea might not be).  Looking at the rankings for save rate over the season (% of shots on target against that do not result in a goal) is similarly surprising from that perspective (Fig. 2 - and Manchester United are at the top of this chart). 

Fig. 2 - Worst Save Rates by Team

If no single metric can stand alone in match analysis, a combination of metrics may be more useful.  However, none can define success.

Cause and Effect – Prophesying the Past

Win more corners and you’ll win more games, as, hopefully, the saying doesn’t go (Fig. 3).  Statistical analysis can assume causality from a metric that is actually an effect (attack more, and a team is more likely to win corners – they are also more likely to win; both are results of attacking more, but also then used to define the level of attack, circular reference warning ahoy).  Analysis can be dependent on results, and the interpretation of the metrics in the dataset behind that result can therefore change to fit the narrative, eg, Barcelona won because they had more possession, Barcelona lost because they didn’t capitalise on their possession.  The second statement (guess which match) is more accurate, and also gives the lie to the causality assumed in the first.

Fig. 3 - Most Corners Won by Team by Match
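The corners point can be shown with a toy simulation. To be clear, nothing here comes from the real dataset: the numbers are invented, and the model simply assumes a hidden 'attacking intent' that drives both corners and wins, with corners having no effect on the result at all. Corners and wins still end up correlated across all matches, and the link all but disappears once you only compare matches with a similar level of attack.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=20000, seed=1):
    """Invented matches: attack drives corners AND wins; corners never cause wins."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        attack = rng.random()                    # hidden cause, between 0 and 1
        corners = 10 * attack + rng.gauss(0, 1)  # more attack, more corners
        win = 1 if rng.random() < 0.2 + 0.6 * attack else 0  # more attack, more wins
        rows.append((attack, corners, win))
    return rows


rows = simulate()
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# Only matches with a similar level of attack: the common cause is held roughly fixed.
band = [r for r in rows if 0.45 <= r[0] <= 0.55]
within = pearson([r[1] for r in band], [r[2] for r in band])

print(f"corners v wins, all matches: {overall:.2f}")
print(f"corners v wins, similar attack only: {within:.2f}")
```

It is a cartoon, not evidence, but it is the same trap: a metric that moves with winning is not necessarily what does the winning.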

It’s the ball in the back of the net that counts, basically. Preferably the other team’s net.


Statistical analysis can be a useful addition to match reporting, but to me is more important in perceiving trends over a season rather than explaining a particular result, still less forecasting a game to come.  There are dangers at each end of the scale – over-reliance on particular metrics and an assumption of causality can lead to inconsistency as conclusions differ between matches; trying to take everything into account can render analysis so un-incisive that it is useless (or ends up being a simple statement of shit we already knew – you have to take your chances; or, Manchester City shoot quite a lot, Stoke don’t - Fig. 4).

Fig. 4 - Highest / Lowest Shots by Team

There is also the tension between objective and subjective in assessing the quality of a game – castigating Chelsea for playing ‘anti-football’ when they just beat arguably the best (subj) team in the world, by doing what had to be done, or lauding Swansea for playing beautiful football when they got beaten by Newcastle’s more direct approach, are two sides of the same coin.  A complex combination of metrics may approach expressing quality of play, but there is still no number that can adequately describe Cissé’s goal against Chelsea or Ben Arfa’s runs through confused defences.  The beauty of the beautiful game is difficult to convey other than by the use of the word woof.

There is also luck, of both flavours, and numerous hypotheticals around that – if Suarez hadn’t been bullied by a tree as a child leading him to take revenge on woodwork wherever he sees it, if Harry Redknapp wasn’t using a dartboard to determine where Bale is going to play, if Arsenal had had a functioning set of defenders throughout the season, well then, things would have been different.  But luck is a matter of chance.  And then there’s the refereeing – if there was goal-line technology...

Finally, connected to the causality issue above, there is the danger of assuming X therefore Y or relying on preconceptions – under-estimating the other team, setting up not to lose and then going a goal down, being happy in possession but failing to take chances.  At the end of the day, it’s the ball in the back of the net that counts – you still have to play better than the other team.

My name is PhilippaB, and I am a functioning statoholic.  But I am striving to be self-aware.