Understanding how to effectively improve the impact of my research publications has always been a serious point of consideration for me throughout my years in academia. After earning my PhD, I began to look into this question in more detail. The more I read, the more I realized that the recommendations were somewhat incomplete and, at times, contradictory. Given my experience in machine learning (ML), I recently decided to download a database and apply ML techniques to see if I could mine the information and answer a few pertinent questions:
Q: How long should the title of a research article be?
A: 10 words (plus or minus three words)
It always seemed intuitive to me that titles play an essential role in capturing the overall purpose and meaning of a paper. Authors who previously investigated this question agreed that title length is significant and can affect readership or citation rates. However, no past study definitively suggested a specific number of words that authors ought to use. When scanning various Nature articles, it was apparent that the titles were typically short and to the point. I proceeded to analyze the following four databases: the top 100 articles published in Nature in 2014 (according to Google Scholar), the top 100 articles indexed by the Web of Science in 2014, the top 100 articles in Altmetric in 2018, and the 100 most highly cited papers published by the Multidisciplinary Digital Publishing Institute website in 2017. The analysis revealed a relatively consistent pattern: the titles of high-impact papers are usually short. Specifically, the overall length of impactful titles is 10 plus or minus three words. This range was calculated using 400 highly cited articles (selected from the millions of articles that have been published in the literature), with the assumption that “impactful titles” can be selected based on high citation rates. Interestingly, impactful titles do not have to contain a period or slash; instead, they typically use colons.
I also identified title keywords that could potentially attract readers: review, cancer, monitoring, recent, therapeutic, method, theory, analysis, applications, learning, protein, DNA, multiple, new, association, health, and study.
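The title checks described above can be sketched in a few lines of Python. This is only an illustration: the article does not say how the "10 plus or minus three" range was computed, so the sketch assumes it is a mean and a sample standard deviation of word counts, and the function names and sample titles are invented for the example rather than taken from the analyzed articles.

```python
import statistics


def title_stats(titles):
    """Return (mean, sample standard deviation) of title length in words."""
    lengths = [len(title.split()) for title in titles]
    return statistics.mean(lengths), statistics.stdev(lengths)


def in_recommended_range(title, center=10, spread=3):
    """True if the title has between center - spread and center + spread words."""
    return abs(len(title.split()) - center) <= spread


def punctuation_flags(title):
    """Report which of the punctuation marks discussed in the article appear."""
    return {"colon": ":" in title, "period": "." in title, "slash": "/" in title}


# Made-up titles for illustration only; not from the analyzed articles.
sample_titles = [
    "Deep learning for protein structure: a review of recent methods",
    "A new method for monitoring health",
    "Analysis",
]

mean_length, std_length = title_stats(sample_titles)
for title in sample_titles:
    print(len(title.split()), in_recommended_range(title), punctuation_flags(title))
```

Counting words by splitting on whitespace is a simplification; hyphenated terms and symbols would need a rule of their own.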
Q: How many authors should there be on a particular article?
A: Include six or more authors
I found a correlation between the number of citations and the number of authors, as there was a significant difference between highly and lowly cited papers. It seems that multi-author papers gain greater exposure from the authors’ institutions, labs, researchers, and students compared to single-author papers. In other words, each author has their own network, and bringing together all authors’ networks will increase the number of readers who share the same research interests, which will in turn increase the likelihood of citations. Moreover, multi-author papers can benefit from self-citations. One may intuitively assume that when forces are joined and more than one person contributes to the work, the quality of the methodology, performance of the experiment, acquisition of funding, and quality of the paper will also improve.
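The article does not say which statistical test was used to compare highly and lowly cited papers. As one possible approach, and purely as an assumption, the sketch below runs a two-sided permutation test on the difference in mean author counts. The function name and the author counts are invented for illustration and are not the article's data.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_permutations + 1)


# Invented author counts, for illustration only.
highly_cited = [6, 8, 7, 9, 12, 6, 10, 7]
lowly_cited = [1, 2, 3, 1, 2, 4, 2, 3]

p_value = permutation_p_value(highly_cited, lowly_cited)
print(f"p = {p_value:.4f}")
```

A small p-value here only indicates that the two groups differ in author count; it says nothing about whether adding authors would raise citations.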
Q: How many characters are appropriate?
A: Include 35,000 characters (no spaces) minimum
I also found that the number of characters, with no spaces, is significantly different between highly and lowly cited papers. Moreover, the number of characters needs to be more than 33,600, including references, which is approximately 5,600 words. This result is in agreement with the number of words accepted in one of the most highly impactful journals, Nature. According to Nature’s most recent formatting requirements, the maximum number of words accepted for publication is 6,500 words, including references. Note that Google’s metrics (h5-index and h5-median) ranked Nature in 2018 as the most impactful journal in the world.
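The character-to-word conversion above implies an average of about six characters per word (33,600 / 5,600). The sketch below makes that arithmetic explicit and shows how a manuscript's character count, excluding whitespace, could be checked against the threshold. The constant and function names are assumptions made for this example.

```python
MIN_CHARS_NO_SPACES = 33_600  # threshold reported in the article
APPROX_WORDS = 5_600          # the article's word equivalent

# Average word length implied by the two figures above.
chars_per_word = MIN_CHARS_NO_SPACES / APPROX_WORDS


def count_chars_no_spaces(text):
    """Count characters, ignoring spaces, tabs, and line breaks."""
    return sum(1 for ch in text if not ch.isspace())


def estimated_words(char_count, ratio=chars_per_word):
    """Convert a no-space character count into an approximate word count."""
    return round(char_count / ratio)


def meets_length_threshold(text, minimum=MIN_CHARS_NO_SPACES):
    """True if the text has at least `minimum` non-whitespace characters."""
    return count_chars_no_spaces(text) >= minimum


print(chars_per_word)
print(estimated_words(35_000))
```

At the same ratio, the 35,000 characters given in the short answer correspond to roughly 5,833 words, still below Nature's stated 6,500-word maximum.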
Q: How many figures should be included?
A: Include six figures minimum
To my knowledge, the number of figures has not been investigated in the current literature. Based on the results of my analysis, the number of figures is slightly different between highly and lowly cited papers. These results suggest that the more figures there are in an article, the more likely the publication will be cited. This could be because figures convey more information at a glance, helping readers understand the results more quickly. In open access journals, there is no limitation on the number of figures; however, some other journals require a specific and exact number of figures (in this case, combining multiple figures into one figure can be a good idea).
My analysis revealed that at least six figures are needed to reflect relevance and impact, which is in agreement with the number of figures accepted by Nature. According to Nature’s most recent formatting requirements, the maximum number of display items (figures or tables) is six.
Q: How many tables?
A: Include two tables minimum
To my knowledge, the number of tables has not been investigated in current literature. I found that the number of tables is significantly different between highly and lowly cited papers. Specifically, at least two tables are needed to represent publication results. Please note that the number of tables investigated here is independent of the number of figures.
Q: How many equations?
A: Use as many equations as necessary
To my knowledge, the number of equations has not been investigated in the current literature. I found that the number of equations is not significantly different between highly and lowly cited papers. Perhaps this is related to the fact that reviews are usually more commonly cited than articles that contain equations. Thus, authors can include as many equations as needed, or none at all.
Past research that attempts to determine the components of a highly read and cited paper has addressed some questions, but not all. At times, subjective answers from authors’ peers, mentors, or advisors were also provided. My analysis is an attempt to provide recommendations based on an objective measure, which is an excellent first step toward investigating this topic more thoroughly and completely. A small caution: the suggestions I provide here do not, of course, guarantee increased citation rates. Indeed, there are more essential features that improve citation rates and overall impact, such as the journal’s reputation, the authors’ fame, the originality of the work, the importance of the topic, the journal accessibility (i.e., open access vs. non-open access), the publication type (e.g., article, review, communication, etc.), and the feedback quality from the editor and reviewer.
Mohamed Elgendi is a postdoctoral fellow at the University of British Columbia.
The article argues that highly-cited articles exhibit certain characteristics – the “tips” in this case – but this does not demonstrate that these tips generate highly-cited articles. In other words, the author confuses necessary and sufficient conditions! And no machine learning will be able to correct this problem.
About the Title:
Formulating a good title is certainly important. However, an analysis based only on Nature papers is not representative. Journals specializing in specific research fields usually regulate the length of the title and stress the need for specificity in the words one uses. These words must mirror the final conclusion of the paper, i.e. the message the authors want to give their colleagues. The authors should consider using the “Main title plus subtitle” structure, in which the main title is general and orients the reader and the subtitle is specific. Uninformative words (some of which are mentioned in this analysis) should not be used, for example “study” (because a scientific paper must be a study), “new” (because the authors must publish original results or ideas), etc.
About the number of authors:
A correlation between the number of authors and the number of references is probably evident. Another question, however, is the value of self-citations, which are often simply not considered when the scientific activity of a researcher is evaluated.
About the number of characters:
Almost all specialized journals regulate the length of the text. Thus the numbers given here are not general. The authors must simply follow the instructions for authors given by the target journal.
The number of figures:
This is highly specific and depends on the subject of the paper. Theoretical works cannot show “six figures minimum” (imagine a paper in math), but a paper showing ultrastructural details must show all the figures needed to prove its final conclusion. In addition, figures may take up a lot of space. If the journal limits the number of pages (rather than the number of characters), the authors may have trouble fitting six figures.
The number of tables:
Tables usually contain numerical information, summarize details, etc. Thus their number is highly specific and depends on the subject of the paper, and no general rule can be given. Certainly, tables with too few numbers, or with details that can be described in words, must be avoided, and tables that are too big should be published as an appendix or as attached, online-only information.
The number of equations:
The text given here is evident.