“Readers of some journals don’t read other journals” (or maybe they do?)

Last September I was in Crete for a conference and stayed on for a few days of vacation.

The wife and I did a wonderful hike walking down through the Samaría Gorge. I won’t go into the details: just be warned that it is a magnificent experience but also quite taxing. Bring food, water, hats, hiking shoes, and more water.

After the hike we had the opportunity to sunbathe on a nice rocky beach and take a swim, before taking a ferry back to the pick-up point for the bus back to Chaniá. During the bus trip the wife fell asleep, and I listened (albeit interrupted by the occasional loss of connection in the mountains) to an episode of the highly recommended EverythingHertz podcast, hosted by Dan Quintana and James Heathers, featuring a nice interview with Kristin Sainani.

The interview mentioned Sainani’s scientific writing course on Coursera, but I shamelessly forgot about it until last Sunday. While searching for a few references about writing, I serendipitously stumbled upon the course again, enrolled, and spent the last couple of nights going through the very cool material. If my prose in this blog is not improving it’s entirely my fault: the course is very good.

Long story short, I just watched two of the course’s interviews: one with Brad Efron (the real one, not a bootstrapped replica), which I cite merely because you should really go and watch it; and one with George Lundberg, which I cite because I want to speculate on a point Lundberg raised.

Lundberg, a medic and editor of many journals from what I could gather, states that you should choose the appropriate journal for your paper based on the typical readers you want to reach. On the one hand I agree (as Efron also mentioned, if you want readers interested in theoretical statistics you shouldn’t submit to the Journal of Applied Statistics); on the other hand I tend to disagree with one specific sentence: “readers of some journals don’t read other journals”.

This was probably true in the pre-internet era, when you had to go to your university’s library to pick up a printed journal: maybe (if the library did not have enough copies) you could only read it for a limited amount of time, to leave room for colleagues to read it too, and had to ruthlessly select the papers you wanted to photocopy. I imagine that bibliographic research was based on a similar approach too (my older readers, if any, are welcome to comment on this). I hear that people often snail-mailed authors to ask for (snail-mailed) copies of their papers.

Nowadays you tend to search for papers online; you can get practically any paper from any journal via a plethora of sources: preprints, open-access journals, university online subscriptions, or a colleague willing to send you a PDF—or sci-hub, if you feel particularly remorseless.

I think that the younger generation certainly still takes the perceived importance of a journal into account when choosing what to read. But I also think that the separation between readers of this or that journal might have washed out so much that readers should now rather be divided by search keywords: people searching for “sampling techniques” rather than “likelihood asymptotics” or “cats loving statistics”.

My hunch is totally anecdotal, but if you are interested let me know and we might think of setting up a study (if one does not exist already) in which scientists are interviewed about their reading habits.

Academic writing in High Energy Physics

It turns out I have not written on this blog since last April, which is a bit disappointing.

Since then, I have become more and more involved in academic writing; I have a couple of draft articles (not within the Collaboration) that I am now polishing, and I got a contract with a prestigious press for a textbook due next year. As a result, I started writing almost every working day, which is something that as a particle physicist you don’t really do.

The life of a particle physicist in a large experimental Collaboration revolves around doing analysis work and service work. The typical service work consists of accessory tasks like tuning some calibration of the detector, or reviewing a specific aspect of analyses you did not perform yourself, or other menial tasks that are nevertheless extremely important for the Collaboration to keep functioning. Not much writing there (except for emails. You will always be writing emails).

The typical analysis work can be roughly schematized in a workflow like this:

  • Design an analysis targeting an interesting physics case, and read the relevant literature (old analyses targeting the same case, related theory papers, etc);
  • Perform the analysis (select an interesting subset of your data sample, estimate some tricky accessory quantities you need, study the systematic uncertainties your analysis is affected by, extract estimates for the parameters you are targeting);
  • Present the analysis a few times in meetings to get feedback from other members of the Collaboration;
  • Write detailed internal documentation (the Analysis Note), and get some more feedback;
  • Write down a draft of the public documentation (journal paper or preliminary analysis summary);
  • Get the analysis approved from the point of view of the physics;
  • Get the paper approved from the point of view of the writing (including the best way of conveying the desired concepts, and style/grammar considerations).

I don’t claim total generality; I just find that most of the colleagues I know and I follow this workflow. You might have a different one, probably a better one, and that’s just fine.

The implication of such a workflow is that you end up writing the documentation (internal or external) only after having finalized the bulk of the analysis work; until that moment, the logical organization of the material is deferred to slides presented at meetings. When you write the documentation you are also generally under pressure to meet some deadline, usually a conference at which your result should be presented. Sadly, sometimes there is not even much organization of the material to be done, because most analyses have been performed and optimized in the past, and the modifications you can make are kind of adiabatic (plug in a different estimate for a specific background, train a classification algorithm, and so on). For new analyses, the track is predetermined anyway (tune your object identification, tune your event selection, estimate backgrounds, plug in some analysis method specific to the case at hand, estimate systematic uncertainties, calculate the final numbers representing your result).

That’s all fine, but the unintended consequence of this workflow is, in my opinion and experience, that academic writing ends up relegated to the role of a task that you have to do pretty quickly and that is a mere accessory to an analysis you have already done.

Things are made worse by the last stage of the workflow; the review of the paper text by the Collaboration (usually in the form of a Publication Committee) is designed to standardize the text of all the Collaboration’s papers and to ensure the highest standards of quality of the resulting text. The problem is that, while iterating with the internal reviewers on the text, you will often feel that your authorship is taken away from you. What I mean is that the set of rules and comments is designed to produce a perfect Collaboration text, and this will strip most of your personality (reflected in your personal writing style) away from the paper. Unless you discuss a lot and manage to slip some lively bits into it.

Just to make things clear, I am not complaining about the existence of these rules; it is certainly desirable that the Collaboration outputs papers with the highest standard of text quality, and setting internal reviews and writing rules is a necessity. It’s just that the papers end up being the Collaboration’s papers, not your papers.

In any case, my point is that this kind of workflow unwittingly teaches us that writing is the last thing you do after having done everything else, and that the final result is not entirely under your control, because it will be the product of the Collaboration.

If you look at other fields, maybe even going into social sciences or the humanities, writing tends to be seen more as a necessary tool to organize your thoughts. This generally applies to the point of using writing to organize your thoughts into a paper-like format, which helps you identify at any stage what you need from an analysis point of view, but it also applies in general to taking random notes to fix your thoughts and reorganize them.

Once I started writing for my own projects regularly, I realized that what in high school was a vague unidentified feeling is actually a clear truth: writing is probably the best way of interacting with your own mind, and that is true regardless of what you are writing about (work, feelings, life in general). Writing activates your mind and enhances its capabilities.

In addition to the projects I am working on, I started to regularly jot down notes on pretty much anything (meetings, random thoughts, summaries of papers I have read, etc). The result is that I feel more focussed, I feel like I am thinking more clearly about pretty much anything, and I am retaining information much more easily. A bonus is also that I can retrieve from my notes any information I have forgotten or not retained!

In high school I could write pretty easily, but I guess my ability has atrophied over the years; now I think I have regained it and pushed it even further. I can now probably be described as a writing junkie. A resource that helped me quite a lot in regaining momentum is Joli Jensen’s Write No Matter What, a very nice book whose main point is that in order to write you should have frequent, low-stress, and high-reward contacts with your writing.

How does all of this apply to this blog? Well, for a long time I thought that to write regularly I would need to regularly produce very long pieces of text, mainly because the blogs I usually enjoy reading are made of very long posts. Recently I started to follow, and enjoy a lot, a blog which mixes longer posts and very short random posts, and I finally came to terms with the idea that a blog can be entertaining and useful even if a post is very short or consists of jotting down a single random idea. I will try this new format. I actually started this post with the idea of writing just a few lines to kick off the blog again and look, here I am at 1310 words and a couple more paragraphs to go.

I even have plans for a whole series of posts. The COVID-19 boredom induced me to slip a couple of slides about “The interesting paper of the week” into the news slides of the weekly meeting I chair at my institution. It’s a meeting about the group’s CMS efforts, but all the papers I am slipping in are about Bayesian statistics or machine learning because that’s where my interests lie right now. Yesterday it suddenly dawned on me that porting those weekly slides to weekly posts would make for a great low-stress series.

So, basically, I’m back, with plans to finally kick this blog off on its intended course.