One very good question students ask themselves when setting out to write a doctoral dissertation or master’s thesis is: How long should it be?
Not finding a definitive answer, Jean-Hugues Roy, a former reporter for Radio-Canada and now a professor at Université du Québec à Montréal, decided to wade into the issue using data analysis. Inspired by University of Minnesota researcher Marcus Beck, who undertook a similar study confined to his alma mater, Mr. Roy set out to do the same for all Quebec universities.
Mr. Roy consulted the online repositories for theses and dissertations at Quebec’s 13 universities. In the process, he learned that some institutions are better than others at managing their data. “Concordia has great metadata, while Laval is a hodgepodge,” says Mr. Roy. “And McGill’s server is always crashing!”
To parse the data, he wrote several different Python scripts for each university. “It’s monks’ work,” he says, laughing.
Nevertheless, in just a few weeks he was able to index 55,000 papers, including 40,000 theses and 15,000 dissertations. “I must admit there are some gaps, and the numbers do not reflect every single paper out there. But I believe they account for 90 percent of the material written in Quebec over the past 25 years.”
From this mountain of information, he came up with an average. All disciplines combined, the average master’s thesis is 133.33 pages, while doctoral dissertations are 251.3 pages on average.
But the bulk of the analysis had yet to be done, as Mr. Roy wanted to see how the numbers broke down by discipline. To do this, he began by standardizing the data based on the OECD Field of Science and Technology Classification.
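The breakdown by discipline can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Mr. Roy's actual scripts: the records, the department labels and the `FIELD_MAP` lookup are all hypothetical, and the field names are simplified stand-ins for OECD categories.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (document type, local department label, page count).
# The values are invented for illustration and are not from Mr. Roy's data.
records = [
    ("thesis", "Droit", 140),
    ("thesis", "Faculty of Law", 160),
    ("dissertation", "Mathematiques", 80),
    ("dissertation", "Dept. of Statistics", 60),
    ("thesis", "Psychologie", 120),
]

# Hypothetical lookup from each university's own labels to a shared field
# name (a simplified stand-in for the OECD classification).
FIELD_MAP = {
    "Droit": "Law",
    "Faculty of Law": "Law",
    "Mathematiques": "Mathematics",
    "Dept. of Statistics": "Mathematics",
    "Psychologie": "Psychology",
}


def mean_pages_by_field(records, field_map):
    """Return the average page count for each (field, document type) pair."""
    pages = defaultdict(list)
    for doc_type, label, page_count in records:
        # Labels missing from the lookup are grouped rather than dropped.
        field = field_map.get(label, "Unclassified")
        pages[(field, doc_type)].append(page_count)
    return {key: mean(values) for key, values in pages.items()}


averages = mean_pages_by_field(records, FIELD_MAP)
for (field, doc_type), avg in sorted(averages.items()):
    print(f"{field:12} {doc_type:13} {avg:.1f}")
```

Mapping every local label to one shared field name first is what makes averages comparable across universities that describe their departments differently.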
The results vary greatly according to the field of expertise. For example, in law, some theses run to more than a thousand pages, while dissertations in statistics and mathematics tend to be only a few dozen pages long. “No surprises there,” he notes. “The world record for the shortest doctoral dissertation is held by a mathematician: nine pages for a PhD obtained from MIT in 1966.”
Mr. Roy also discovered that titles have become longer over time. “This would suggest that we are losing our capacity for abridgement,” he notes. At 378 characters, a 2012 thesis about maternal stress submitted to Laval University’s department of psychology had one of the longest titles. (That would be, “Stress maternel prénatal et développement précoce : données de naissance, attention et sécrétion cortisolaire à trois mois. Association entre le stress maternel prénatal, l’âge gestationnel et le poids de naissance du bébé : une analyse d’études prospectives. Association entre le stress maternel prénatal, l’attention/éveil et la sécrétion cortisolaire de l’enfant à trois mois.”)
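A trend in title length like the one described above could be checked by averaging character counts per year. The sketch below is a guess at the general approach, not Mr. Roy's method: the titles and years are made up, and `mean_title_length_by_year` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Made-up (year, title) pairs, for illustration only.
titles = [
    (1995, "On short proofs"),
    (1995, "Law and society"),
    (2012, "A longitudinal study of prenatal maternal stress and infant attention"),
]


def mean_title_length_by_year(titles):
    """Return the average title length, in characters, for each year."""
    lengths = defaultdict(list)
    for year, title in titles:
        # len() counts characters, including spaces and punctuation.
        lengths[year].append(len(title))
    return {year: mean(values) for year, values in sorted(lengths.items())}


by_year = mean_title_length_by_year(titles)
for year, avg in by_year.items():
    print(year, round(avg, 1))
```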
“There are some real treasures buried in there,” says Mr. Roy, who found himself engrossed for hours by some theses. He hopes to get people interested in dusting off these papers, mothballed all too quickly, and to raise public awareness of the knowledge that academia is producing, and perhaps also to encourage universities to work together to standardize their data. (All the data collected by Mr. Roy is available on his blog.)
“I hope others will be inspired to pick up the torch and carry it forward. It would be interesting to expand the project to include universities from the rest of Canada as well.”