# All models are wrong, anyway

Today I gave a seminar on the ASA statement about p-values, the 5-sigma criterion, and other amenities. In the seminar I slipped in some comments on Bob Cousins’s stance on models, hypotheses, and laws of Nature, and ended up ranting about interventionist definitions of causality (following Pearl, mainly).

A few minutes ago I opened Twitter and found the somewhat excessive prodromes of an internet flame war sparked by a post by Sabine Hossenfelder titled “Predictions are Overrated”. I disagree with that post on a few key points, which are too long to describe on Twitter. Bear with me here.

## The argument of the shady forecaster

The way scam forecasters work is to generate a huge number of predictions, send them to random people, and then follow up only with the people who received the predictions that a posteriori proved to be correct. The scammers repeat this iteratively until, to a small set of people, they appear to have always provided correct predictions. It’s a well-known strategy, which I think was described extensively in a book by Nate Silver. Or Nassim Taleb. Or Levitt and Dubner. I read too much.
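The mechanics of the scheme are easy to simulate. What follows is a minimal sketch under my own assumptions, not taken from any of those books: each prediction is a binary call ("up" or "down") on a coin-flip event, the forecaster has zero skill, and the name `run_scam` and the numbers used are made up for illustration.

```python
import random


def run_scam(n_recipients, n_rounds, seed=0):
    """Return how many recipients still see a perfect track record after each round."""
    rng = random.Random(seed)
    survivors = n_recipients
    history = []
    for _ in range(n_rounds):
        # Half of the remaining recipients are told "up", the rest "down".
        told_up = survivors // 2
        told_down = survivors - told_up
        # The event itself is a coin flip: the forecaster knows nothing.
        outcome_is_up = rng.random() < 0.5
        # Only the people who received the correct call are contacted again.
        survivors = told_up if outcome_is_up else told_down
        history.append(survivors)
    return history


history = run_scam(n_recipients=1024, n_rounds=10)
print(history)
```

Starting from 1024 recipients, ten rounds leave exactly one person who has seen ten correct calls in a row, whatever the coin does. The "skill" lives entirely in the a-posteriori selection of the audience.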

Sabine uses this story to argue that, since any successful prediction can be successful just by chance (because of the size of the pool of scientists producing models), judging theories based on their predictive power is meaningless. However, the shady-forecaster example seems to me quite disconnected from the topic at hand: the shady forecaster relies on selecting targets based on a-posteriori considerations and conditional probabilities, whereas Sabine’s point is one of pure unconditional chance.

## The power of doing

Sabine remarks that epidemic models are incomplete because they don’t include the “actions society takes to prevent the spread”; that’s true, but Sabine hints that this is because “They require, basically, to predict the minds of political leaders”. The real point is instead that, to some extent, researchers cannot access the causal structure underlying the epidemic model because they are stuck with conditional probabilities and cannot transform them into interventional ones; in other words, they cannot fix some of the conditions to remove spurious causal links and highlight the true causal structure, mainly because they would need politicians to take certain actions that would sometimes be plainly unethical and sometimes carry too high a political cost.
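To make the distinction concrete, here is a sketch in Pearl’s notation. The symbols are my own labels for illustration, not something from Sabine’s post: $X$ is a policy measure, $Y$ an epidemic outcome, and $Z$ a set of confounders that I assume satisfies the back-door criterion.

```latex
% Observational (conditional) quantity: what we see in the cases
% where X happened to take the value x.
P(Y = y \mid X = x)

% Interventional quantity: what would happen if we set X to x ourselves.
% The adjustment formula below holds only under the back-door assumption on Z.
P(Y = y \mid \mathrm{do}(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)
```

The two quantities coincide only in the absence of confounding; in general, getting from the first to the second requires either actually intervening or knowing enough about $Z$ to adjust for it.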

## A theory should describe nature

The exact quote from Sabine’s article is “If I have a scientific theory, it is either a good description of nature, or it is not”. Books have been written on the meaning of a good description of nature, but the full sentence is simply what Quine would call a logical truth, that is, an expression which is true regardless of the actual content of the sentence: if I have a cup, it is either broken or not broken. I won’t go into deeper considerations about factual truth vs logical truth.

Sabine in any case goes on to define an explanatory power which “measures how much data you can fit from which number of assumptions. The fewer assumption you make and the more data you fit, the higher the explanatory power, and the better the theory”. This is ultimately an expression of Occam’s Razor, which is embedded in our mentality as scientists, and in Bayesian model selection.

Sabine also points out that ultimately there is a trade-off between obtaining a better fit and introducing more (ad-hoc) assumptions, which again is something deeply embedded in Bayesian model selection and in formal procedures such as the Fisher test for choosing the minimal complexity of a model we want to fit to the data. So far so good.

We diverge towards the end, where Sabine claims that “By now I hope it is clear that you should not judge the scientific worth of a theory by looking at the predictions it has made. It is a crude and error-prone criterion.” and laments that “it has become widely mistaken as a measure for scientific quality, and this has serious consequences, way beyond physics”.

To me the explanatory power of a model, or even better its interpretability, should indeed be a fundamental characteristic of a model, but I subscribe to Box’s “all models are wrong”. I want to have reasonable explanatory power or interpretability before considering a theory minimally acceptable as a physics model, but ultimately the sole judge of the success or failure of a model is indeed the data. Rather than focussing on the possibility that a model predicts the data by chance, I prefer to focus on requiring that a well-interpretable model predicts multiple data, in multiple scenarios, in multiple independent experiments. If its predictions are successful, I’ll take it as the current working assumption about how things work.

Particle physics, as pointed out by Bob Cousins in the paper linked above, is indeed a happy realm where we have tremendous predictive power and where we can often build models starting from first principles rather than just figuring out what type of line fits the data best (as is common in other sciences, as I recently experienced). Bob is also right when he remarks that when we go from Newtonian motion to special/general relativity the former is “the correct mathematical limits in a precise sense” rather than an approximation. However, all of this to me simply justifies the use of (quasi-)point null hypotheses: it does not imply any strong connection with a ground truth. More importantly, even our choice of those very reasonable assumptions (symmetries and whatnot) that generate the explanatory power or interpretability of the model might ultimately result in a very successful theory by chance alone. After all, I insist, all models are wrong anyway.

# My New Paper is a Manifesto for Good Practices!

Too much time has passed since my last post; I had a couple of busy months, and I will have a few more 🙂

Among my recent activities, I spent last week in Zürich attending Standard Model at the LHC 2019, where I presented the status of W and Z multiboson measurements in ATLAS and CMS. Together with Carlos from Oviedo (where I was previously based), I have produced a nice result, which was part of my presentation, on the WZ inclusive and differential cross section and the search for anomalous couplings, which was published in JHEP just a couple of days before the conference 🙂

The SM@LHC series actually consists of specialized workshops designed to bring together experienced researchers and have them discuss the open topics and points of improvement that concern Standard Model physics. Well, not only the Standard Model, actually; nowadays the precision of SM measurements is so high that we expect to be able to see sizeable discrepancies from SM predictions in case there is some new physics nested in the couplings (parameters representing the strength of an interaction between a set of particles).

In the “usual” HEP conferences, a talk on “multiboson measurements in ATLAS and CMS” would consist of a list of nice results with highlights about who did what with better precision; while showcasing results is very important, one sometimes feels the need for a more critical discussion of the results, to identify possible improvements and thereby inform future action.

Workshops like SM@LHC satisfy exactly this need; speakers are invited by the organizers to give talks focused more on the issues and open points than on the accomplishments. In order to prepare my overview of multiboson measurements, I have read in detail a number of ATLAS and CMS papers, following this mandate. Because of my tasks within the Collaboration (I review papers for the phrasing of statistical claims, for example), I have grown a bit picky on the topic of reporting results, and I started to notice things.

After preparing the talk I took a plane to Zürich right before Easter, spent the weekend visiting the town with my wife, and started thinking about systematizing the observations I had made, in order to possibly abstract some kind of guidelines from them.

During the four days of the workshop, I started jotting down a few ideas, and on Friday morning I submitted the result to the arXiv.

My Reporting Results in High Energy Physics Papers: a Manifesto is now out, and I have already received feedback from the community (quite good, so far!). If you feel like reading these 10 entertaining pages, make sure you drop me a line with additional feedback; I surely missed some points and the document can always be improved.

Last but not least, writing the acknowledgements section for this paper led me to look into my CMS membership; I realized that this year (in July, actually) marks my 10th year in CMS; I am not sure this is a milestone, but somehow it feels like one.

For sure, looking back, I realize how many things I now kind of understand, things I had absolutely no clue about when I started. And that feels good 😀

# Turok, Dark Matter, and the Issue of Telephone Games in Science

Chinese Whispers is a children’s game; according to the linked Wikipedia article, it’s called Telephone Game in American English, which better resembles the Italian telefono senza fili (literally, wireless phone).

Regardless of the name, which might stir up some discussion in its British version due to stereotype, the point is that it’s a game in which information gets progressively distorted at each step; or rather, I should say that the opportunity for distortion at each step is embedded in the rules of the game.

Information is usually distorted by the environment (i.e. by the challenge of quickly whispering words from one player to the next), but there’s always the chance that a player intentionally changes the message. This often makes the game a bit less fun (the funniest outcomes, at least to me, are the ones in which the changes are unintentional), but it is no big deal; the message has no real utility.

In the real world, messages are usually important, being meant to have some effect on the recipient, and intentional distortion becomes an issue because it is motivated by the hidden agenda of the player (or, in this context, actor) who distorts the message.

In science these issues can arise in the way scientific results are presented to the general public, and also in the way results are presented to an audience of peers; I will discuss two recent examples that bothered me a bit.

The first example is the popular book The Order of Time by Carlo Rovelli. In the book, Rovelli argues essentially that time is a sort of emergent property rather than a fundamental entity. The book has been followed by a series of interviews and articles in the press, which helped popularize it and certainly pumped up sales.

The book, and the general attitude shown in press articles and interviews, does huge harm, though, because the notion that sticks with the layman is precisely that time does not exist. While this is certainly an interesting theory, worth discussion and scientific exploration (if feasible), it is a theory. A fancy, interesting theory that is not supported by any evidence whatsoever, at this moment in time (pun intended).

I think that selling (because the issue here is selling) a theory as if it were a fact seriously damages both the public and the community, with the aggravating factor that the public is defenseless; the public just trusts whatever is written in a popular book or in a press article, regardless of the truth, as the Trump campaign taught us. Furthermore, the general public unfortunately does not go and check more informed reports, such as an article from Nature which points out that the theory is just Rovelli’s theory and that the layman should not buy it as if it were the truth.

If you think that I am exaggerating, consider that I am one of the administrators of what is probably the major Italian Facebook group on outreach on the topic of Quantum Mechanics, Meccanica Quantistica; Gruppo Serio; every couple of days we have users who post their thoughts “on the fact that time does not exist”, to the point that we stopped allowing those posts to pass through our filters. When we still accepted those discussions, I was able to experience firsthand that these people had read the book (or a press article about it) and had taken home the message that the state of the art of scientific knowledge is that time does not exist. And this is very bothersome. I think Rovelli messed up very badly in this, and I have the impression (I hope the incorrect impression) that he is unwilling to correct this mistake, or does not care to.

Rovelli’s book is not the only example of a book that does a disservice to outreach by projecting the theory or the biases of the author onto the general public; another recent example would be the book (and the blog post about the FCC) by Sabine Hossenfelder, in which she claims that a new particle collider would be a waste of money. I think that others have already written extensively about that, though, so I won’t delve into it in this blog post (I already did on Twitter), and my second example won’t be Sabine’s book.

My second example is a sneakier one that I witnessed last week at a seminar at my institution, Université catholique de Louvain. In the context of the awarding of some PhDs honoris causa to renowned scientists, Neil Turok was invited and gave a couple of lectures. One lecture was for the general public, and I missed it because of another commitment; you can find the full video on my institution’s website. The second lecture, the one I will focus on, was for a semi-general audience: not only researchers like me from CP3 (Centre for Cosmology, Particle Physics and Phenomenology; kudos for centre, though the Oxford comma is missing), but also bachelor and master students in Physics.

A seminar for specialists is pretty much an open field, where it’s assumed that the spectators will be actively engaged and will critically evaluate any bit of information transmitted by the speaker.

A lecture with bachelor and master students, who were encouraged to participate and ask questions, is a more delicate scenario, in which I would argue that you want to make sure that everything is communicated with the necessary caveats. Either well-established theories should be presented, or new, bizarre, untested theories; in the case of the latter, there should be ample warnings that the theories are not part of the scientific consensus. I am not saying that new/bizarre/untested theories should not be presented; on the contrary, it is good for the development of the students’ critical minds that debate is stirred up and that exciting possibilities are presented to them. What I am saying is that such possibilities should be presented as such, and not as the unquestionable truth; here is where I think Turok messed up pretty badly.

The lecture was about a CPT-symmetric universe; a couple slides into the talk, he presented a slide in which he wrote an equation and outlined the different components and the scientists that solved those pieces of the puzzle. There was an almost invisible (dark violet on black) bit of the equation that I was not able to read but that turned out to be pretty crucial; he claimed that he used to put disclaimers about that piece of the equation, because it referred to dark matter, but that recently he removed the disclaimer because that part of the puzzle has been solved.

At that point, I kind of woke up, because to this day we are pretty far from being able to state that “we solved Dark Matter”.

It became clear a few slides later that what he meant is that, in his theory, Dark Matter consists of right-handed (RH) neutrinos, and that consequently the Standard Model plus right-handed neutrinos is enough to explain the whole universe.

He then went on to state that competing theories such as freeze-out and freeze-in are full of ad-hoc assumptions, whereas his theory was simple and elegant; he even threw in some paternalistic comments saying that in astrophysics/cosmology lately people just produce bad papers for the sake of it, whereas he prefers simple solutions based on works from 50 years ago.

Now, it might be true that some people produce bad papers just for the sake of it, and it might be true that going back to the roots of a discipline can result in ideas with a newly found strength and solidity. But using this argument to bash competing models seems to me a bit arrogant and uncalled for. Particularly in front of undergraduate students.

During the Q&A, a couple of colleagues of mine argued on two different fronts; one argued that freeze-in mechanisms, contrary to what Turok stated, do not assume a huge number of new fields and ad-hoc assumptions. I am no expert on astrophysics, but in the past weeks we had two or three seminars about freeze-out and freeze-in mechanisms at CP3, and I am pretty sure my colleague was right; yet Turok dismissed him, basically saying that he was sure my colleague was wrong, and in the end the moderator had to use the traditional diplomatic “let’s continue discussing this during the coffee break” before things went awry.

The other colleague argued that the “very simple and standard-model only” model by Turok assumed not just the Standard Model but also right-handed neutrinos, after which a small exchange followed about whether RH neutrinos can be considered practically-Standard-Model or not. The discussion dragged on a bit, and at some point Turok admitted, although only in passing, that his model too relies on totally ad-hoc assumptions such as the Z2 symmetry that makes one and only one of the RH neutrinos stable. And yes, that assumption is totally ad-hoc and is apparently the only way in which the theory can explain why, of all RH neutrinos, only one should be stable and give rise to Dark Matter. Again, I think that while it’s healthy that students are exposed to debate and to new ideas, the way in which the theory was presented before the criticism was very problematic.

Summarizing, I think our duty as scientists is to give both the public and the students the most objective picture about whatever new theory we fancy at the moment—even if we ourselves devised that theory.

It is good to expose the public to some degree of the professional debate about some topics, although it probably depends on the topic; a debate about CPT does not have the same impact on the layman as a debate about black holes (remember when people believed the LHC would destroy the Earth?) or vaccines.

However, when speaking to (or writing for) people who do not have the means to critically sift through information, we should be very careful not to misrepresent the difference between the current scientific consensus and as-yet untested theories.

After all, not everything is about Turok (the Neil); the image above teaches us that Dark Matter is a pretty delicate issue in Turok (the game) as well 😀

# A Very Brief Detour on the Power of Organization

It’s grant-writing season, and I have a grant request submission deadline at the beginning of next week. On top of that, I have to finalize a paper I will present (also next week) at a statistics conference.

Last week I was busy finalizing my latest CMS paper, which is now on the arXiv, about WZ boson production (a very nice measurement, improving the current experimental picture quite a lot, I must say; highly suggested reading), and grading programming assignments for the course I am a teaching assistant for this semester at UCLouvain.

The nice thing is that, contrary to the past, I am not scrambling at the deadline while feeling the pressure of urgency; I am just doing exactly what I should be doing at this stage, i.e. polishing material that is mostly final.

And this, folks, is a hugely relaxing sensation.

So relaxing that, in fact, I am writing this while going to fetch the pizzas The Wife and I ordered for take-away 🙂

# Introducing a LaTeX Object into the Post

After the horrendous pun of the title, let’s get to a geeky post about tech, or specifically about the integration of typesetting systems into the WordPress editor.

You will have noticed that in What are your Expectations for 2019? I started using some simple mathematical formulas within the post.

I have long debated with myself whether to write the posts in Markdown (WordPress supports that) or using the WP editor, and I have finally chosen the WP editor; the reason is that the editor features a simple way of inserting $\LaTeX$ formulas. To obtain the same with Markdown, as far as I understand the WP Markdown plugin should support a Markdown $\LaTeX$ plugin, which I guess is asking much of WordPress; correct me if I am wrong (@pablodecm, I am thinking about you).

Long story short, it turns out that the way of embedding formulas in the WP editor is quite simple: you just have to enclose the formula between dollar signs, and the first dollar sign must be followed without spaces by the word “latex” (the dollar signs are delimiters of an interpreted environment, and the word specifies which interpreter is to be used). You can then add options to the formula by appending ampersands followed by the parameter=value syntax; it’s very useful for setting the colour of the text and of the background, for example (by default, the $\LaTeX$ interpreter assumes white background and black text, and is blind to the actual HTML/CSS style).
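As a sketch of that syntax, this is roughly how the expectation formula would be typed in the editor. The `fg` and `bg` parameter names and the hexadecimal colour values are written from memory of the WordPress.com LaTeX support page, so treat them as an assumption to double-check.

```latex
$latex E[gamble] = p(K)\times V(K) + p(nonK)\times V(nonK)&fg=000000&bg=ffffff$
```

Everything after the first ampersand is read as options rather than as part of the formula.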

One thing leaves me puzzled, though; one particular formula gets rendered in two mini-lines within a single text line, resulting in a pretty ugly effect:

$E[gamble] = p(K)\times V(K) + p(nonK)\times V(nonK) = 0.08\times 10 + 0.92\times(-10) =-8.4$.

As you can see, the last part gets put into an ugly mini-line that does not correspond to a separate main line; as a result, the two lines of the formula are squeezed into a single text line. In case you see the formula just fine (i.e. the -8.4 on the same line as the rest of the formula), please drop me a line in the comments below: I have tried a couple of different browsers both on a computer and on a phone, and the issue seems to be common, so I would exclude web browser issues. I have tried varying the spacing to no avail. I will keep digging, but if either of my two subscribers (literally; two people have subscribed so far) has any suggestion, I’ll be glad to try it out!

Until then, I will go back to the main theme of the blog 😉

# What are your expectations for 2019?

[Originally posted on December 31st, 2018. Revised a couple of sentences on January 2nd, 2019: sketching under pressure and without proofreading, while the wife is showering, before going to a New Year’s Eve celebration is not really the way to go 😀 ]

As promised in Active inactivity, here is a new blog post, before the end of the year!

Since New Year’s Eve is upon us, I think it is only fair to begin this introduction to De Finetti’s definition of probability with a preparatory introduction to the concept of expectation.

In Statistics, the word expectation has a somewhat peculiar meaning, which to me represents an improvement on the everyday meaning of the word; the layman’s definition of the word expectation, according to the Oxford Dictionary, is “A strong belief that something will happen or be the case.” Is this enough for the statistician?

Well, yes and no.

Yes, in the sense that the act of making a statement about the future is somehow maintained, at least for a suitable realization of the abstract definition of probability. No, in the sense that we are not interested in making a generic statement about what we believe will happen in the future; we want to make a statement that reasonably encompasses everything that could happen, resulting in a statement about the average outcome that we can expect.

Let’s make a simple example: say that you draw a card from a deck, and that you gamble such that you win 10 euros if you get a King, and you lose 10 euros if you get any other card. What do you expect to happen on average?

The relevant concept here is that of expectation: each outcome (King, anything but a King) has two numbers associated with it: the probability of obtaining that outcome, and the value or pay-off that you get if that particular outcome happens. For example, there are 4 Kings out of the 52 cards that make up the deck, so the probability of obtaining a King is $p(K) = \frac{4}{52} \simeq 0.08$, whereas the pay-off you get if the outcome happens is $V(K)=10$ euros. Conversely, the probability of obtaining not-a-King is $p(nonK) = \frac{48}{52} = 1- \frac{4}{52} \simeq 0.92$, and the value you stand to win is $V(nonK)=-10$ euros (negative, since you would incur a loss).

To know what you can reasonably expect to happen on average in this situation, it is necessary to think a bit. If the situation were simpler, for example if you stood to win 10 euros regardless of the outcome of the card draw, you could expect to win 10 euros. Conversely, if you stood to lose 10 euros regardless of the outcome, you could expect to lose 10 euros. But in our fictitious situation different outcomes are rewarded with different values; it is then crucial to have a way of estimating a global, average value for what you can expect. Since each pay-off will happen only if its corresponding outcome happens, a natural choice is to weight each possible pay-off by the probability that its corresponding outcome will actually happen. It turns out then that, following this line of reasoning, your expectation is given by an average of the pay-offs, weighted by the probability that each outcome happens. For our concrete case, $E[gamble] = p(K)\times V(K) + p(nonK)\times V(nonK) = 0.08\times 10 + 0.92\times(-10) =-8.4$; in words, the expected value you get out of your gamble is a loss of 8.4 euros.
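The same weighted average can be checked numerically. This is a minimal sketch: the `expectation` helper and the `outcomes` dictionary are names I made up for illustration, and exact fractions are used so that the effect of rounding the probabilities is visible.

```python
from fractions import Fraction

# Outcomes of the gamble: (probability, pay-off in euros).
outcomes = {
    "King": (Fraction(4, 52), 10),
    "not a King": (Fraction(48, 52), -10),
}


def expectation(outcomes):
    """Average of the pay-offs, weighted by the probability of each outcome."""
    # The outcomes must be exhaustive and mutually exclusive.
    assert sum(p for p, _ in outcomes.values()) == 1
    return sum(p * v for p, v in outcomes.values())


exact = expectation(outcomes)                      # uses 4/52 and 48/52 exactly
rounded = round(0.08 * 10 + 0.92 * (-10), 2)       # uses the two-decimal probabilities
print(exact, float(exact), rounded)
```

With exact probabilities the expectation is $-\frac{110}{13} \simeq -8.46$ euros; the $-8.4$ quoted in the text comes from rounding the probabilities to 0.08 and 0.92 before multiplying.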

You can already use such considerations to find the expected value of any kind of situation in which you can gamble on some well-known outcomes; this works, for example, for any gamble on a deck of cards, in which you can easily calculate the probability of any outcome by simply counting the cards (or combinations thereof) in the deck.

In order to interpret such a statement, and to really get to the bottom of the meaning of expected value, we need to take a small step back and look into how we can define and compute an essential element of the formula for the expectation: probabilities.

However, you will have to wait for the New Year, because my wife has finished showering, and we need to get ready for tonight’s party!

See what I did here? Not only have I described briefly the concept of expectation, but I have also given you a way of computing what is the value you expect to get from this blog for the first week of 2019: what is the probability you assign to me writing and publishing the next blog post before January 7th? What is the value you assign to the blog post coming out, and what is the value you assign to the blog post not coming out?

Try to get your probabilities and pay-offs figured out before midnight!