How to write a scientific publication

Writing your first scientific publication is a difficult task. You need to know how to structure the manuscript and what kind of information is relevant. You also need an in-depth knowledge of your field and of related fields to develop, in a logical but concise fashion, the hypothesis for the study you have done. The latter is something between you and your supervisor, but with the former I can help you.

This website contains information, tips and a template on how to structure and write a scientific publication. The structure of publications differs from field to field, and the structure here is typical for medical, biological and related fields. The layout is modelled after that required for the Journal of Applied Physiology. Each journal you write for will have different layout requirements, but supervisors will often tell students writing their first publication just to model it after "a journal" and then go from there.

Writing publications involves a very steep learning curve. With the very first manuscript I drafted, I felt like I was banging my head against a brick wall: I reached the 13th (!) version of it before I gave up (for other reasons: the statistics were wrong). The first manuscript I wrote that ended up getting published felt like a massive struggle at the time, and even now when I open up this manuscript (http://dx.doi.org/10.1016/j.jelekin.2007.02.018), I think to myself "jeez, I could write that so much better/more concisely now." But it is a learning process, and it requires persistence and practice.

I wish you the best of luck in your endeavours. Additional resources can be found on this useful website: https://www.oacommunity.org/resources

You can download a Word version of the template here.

Title of paper (typically up to 150 characters including spaces)

(Note: Some journals don’t like titles that start with “Studies in...” or “Effect of...”)

Daniel L. Belavý1 B.Phty, PhD

Nigel Bevan1 MD, PhD

Otto Normalwissenschaftler1 Dipl. Sportwiss.

Jane Doe2 MD, PhD

1 Charité Universitätsmedizin Berlin, Center for Muscle and Bone Research, Hindenburgdamm 30, 12203 Berlin, Germany.

2 Some institution, some department, some street, postcode, city, country.

Correspondence: Daniel L. Belavý, B.Phty, PhD; Charité Universitätsmedizin Berlin, Center for Muscle and Bone Research, Hindenburgdamm 30; D-12203 Berlin; Germany

Tel: +49 30 8445 6666; Fax: +49 30 8445 0001; E-mail: belavy@gmail.com

Conflict of Interest: Jane Doe acts as a consultant to XY Corporation for the exploitation of this study’s results. All other authors have no conflicts of interest.

Running Title: short version of your title in no more than 45 characters

Abstract

(Different journals can have very different requirements. An absolute maximum is 250 words, and some journals allow a maximum of only 200 words. Some abstracts are structured as here, some are unstructured.)

Background: Very short introduction of the background literature and statement of problem/aim

Methods: Describe the main points of the methods and main outcome measures.

Results: Give the main results. Adding p-values in brackets helps to emphasise the main effects (p<.001)

Conclusions: State the main conclusions that can be made from the data at hand. Speculation that is not directly supported by the data presented in the paper should be strictly avoided. I have seen some abstracts that give “future research directions”, but I avoid this (because, well, it isn’t a conclusion and you usually have a hard enough time fitting what you want to say into the abstract without this extra stuff).

Key words: key words; are usually terms that people will; search for but are not in the abstract or title; search engines; will use these words, which are separated by semi-colons

Introduction

The introduction gives the reader an “introduction” to the background literature and the research problem. Usually you start off with very general statements (e.g. disease “x” is a problem and costs society so many dollars / e.g. the best way to train function “b” is unclear, but this is important for population “y”) and then narrow down very quickly to the things you want to examine in your study. You need to explain why these particular things are relevant (e.g. to disease “x”). It can be helpful to introduce the methodology in the introduction, but this is only “necessary” when the methodology is “novel”. For well-established methodologies, this is not really required.

An introduction length of more than two A4 pages at double-spaced 12-point Times New Roman font is “too long”. A “new author” will always tend to write too much. Setting this limit will ensure you stick to the most important things and do not waffle (and hence bore the reader/reviewer). Another thing is “abbreviations”: abbreviations must be defined at their first use (e.g. “Magnetic resonance imaging (MRI) is commonly used to……”) and you should save your abbreviations for those terms central to your manuscript. Just remember that in English “one sentence is not a paragraph”. I see it a lot, even amongst native speakers: try to avoid it.

Whilst you can follow these guidelines to help you write a focussed manuscript, the hard part is understanding your topic well enough to introduce your study in a concise fashion. Also, having a well-developed, broad knowledge of your field will make it (somewhat) easier to write the paper: you will be able to see alternative approaches for interpreting your data or alternative ways of introducing your topic. Just remember, if this is your first time writing a manuscript: it isn’t meant to be easy. The first manuscript I ever wrote (2) was a lot of hard work and I was very proud of it at the time. However, when I looked back at it a while ago, I thought, “my god... I could write that so much better now”. It is a very steep learning curve with writing publications, so if you feel disheartened at any point, just remember: it is difficult! But that is why at the end of it you will deserve your PhD, Masters or whatever degree you are going for.

In any case, in the final paragraph of your introduction, you want to be at the point where you state your goals. E.g.: “therefore, we wanted to examine the effect of intervention a in a population with disease b using methodology c.” It is also strongly advisable to state a clear hypothesis at the end of the introduction. Your hypothesis may look something like this: “We hypothesized that intervention z would have an impact on parameter c but that intervention y would have no effect. As a secondary goal, we expected that exercise form y would however affect parameter d”.

Materials and Methods

Subjects and study protocol

In the methods, you typically start off by describing the general characteristics of your study (e.g. the subjects) and what they went through. Ensure you always state who gave the ethical approval and that subjects gave their informed written consent. For example: “The study was approved by the ethical committee of the Charité Universitätsmedizin Berlin and the radiological examinations were approved by the Bundesamt für Strahlenschutz. All subjects gave their informed written consent prior to participation in the study.”

A sample size estimate is also important and will typically be demanded by ethical committees anyway. This will show the reader that you had the diligence to check how many subjects you need to be able to see if your intervention (or whatever) actually had an effect. It is important to state the source of the data used in the sample size estimate, the alpha-level (for statistical significance, typically 0.05) and the power (typically 0.8). If for some reason you cannot influence the subject numbers (e.g. because of costs or because someone else organised the study), you will need to perform a sensitivity analysis (i.e. determine the effect size you can detect given your study design). Here I can recommend the (free) software G*Power 3 which, as of writing this manuscript, is available here: http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3. To stress this point: I recently recommended a manuscript for rejection where the authors had a small sample size, did not provide any justification of the sample size, did not attempt a sensitivity analysis and did not present data on repeatability (see below). This was of course not the only reason, but such an issue can push a reviewer’s or editor’s decision one way or the other.
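As a rough illustration of the two calculations described above, the following Python sketch uses the normal approximation for a two-sided comparison of the means of two independent groups. The function names, the standardised effect size of 0.5 and the group size of 12 are assumptions chosen purely for illustration. Dedicated software such as G*Power works with the t-distribution and will usually give slightly larger subject numbers, so treat this as a sanity check and not as a replacement.

```python
from math import ceil, sqrt
from statistics import NormalDist


def subjects_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate number of subjects per group needed to detect a
    standardised difference (Cohen's d) between two independent groups
    with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def detectable_effect_size(n_per_group, alpha=0.05, power=0.8):
    """Sensitivity analysis: the smallest standardised difference that can
    be detected with a fixed number of subjects per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Hypothetical inputs: a "medium" effect, and a study fixed at 12 per group
    print("Subjects per group for d = 0.5:", subjects_per_group(0.5))
    print("Detectable d with 12 per group:", round(detectable_effect_size(12), 2))
```

The second function is the one to reach for when the subject numbers are fixed: it turns "we only had 12 subjects per group" into a statement about which effect sizes the study could realistically detect.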

Intervention description

Here you describe the intervention you did with the subjects (or whatever happened). You have to imagine that it should be possible for someone reading your paper to take the description and reproduce the study. Where machines were used as part of the intervention (though this is typically more an issue for the measurements), you need to state in brackets the company that produced the equipment and its location. How much detail you can go into for the intervention will “depend” and manuscript word-length restrictions can often have a role to play.

Name of outcome measurement 1

Here you describe the first outcome measurement. You say what machine/make it is and how the subjects were measured. Software versions should be stated. It is also important to give information on the measurement reproducibility. I can recommend this publication (5) for a general overview of repeatability, but for a publication where you just want to give the reader an idea of how repeatable the measurement is, reporting co-efficients of variation (CVs) and their confidence intervals is, in my view, the most readily understandable approach. This publication (4) has a good description of how to calculate CVs. Here is an example for peripheral quantitative computed tomography (pQCT):

An XCT 2000 (Stratec Medizintechnik, Pforzheim, Germany) was used to obtain pQCT scans from the left leg as described previously (7, 8). Scout-views were generated in the frontal plane to identify the tibio-talar cleft to position the reference line. Sectional images were then obtained from the tibia at 4% (distal epiphysis) and 66% (diaphysis) of its length. Measurements were performed prior to and immediately at the end of the intervention period. The integrated XCT 2000 software (version 6.20A) was used to analyse the pQCT images. The total bone mineral content (BMC) of the tibia was determined using a detection threshold of 180 mg/cm³ (contour mode 1, peel mode 1) for the epiphysis and cortical BMC of the tibial diaphysis was assessed with a threshold of 710 mg/cm³ (cortical mode 1). Gross anatomical muscle CSA was obtained from the scans taken at 66% of the tibia length (diaphysis) and was calculated as total bone area (detection threshold of 280 mg/cm³; contour mode 1, peel mode 1, filter 2: F03) subtracted from the combined muscle and bone area (detection threshold of 45 mg/cm³; contour mode 3, peel mode 1, filter 2: F03F05). The short-term co-efficient of variation (4) for the main outcome parameter (distal tibia total bone mineral density) in 24 subjects with complete subject repositioning between two repeated measurements on the same day is 0.37% (95% confidence interval: 0.29-0.51%).
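To make the repeatability calculation more concrete, here is a minimal Python sketch of how a short-term CV might be computed from repeated measurements. It assumes the root-mean-square approach, in which a CV is calculated for each subject and the squared CVs are then averaged across subjects. The function name and the example values are hypothetical (they are not data from the study above), and the confidence interval is left out because it needs chi-square quantiles, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def rms_cv_percent(repeated_measurements):
    """Root-mean-square co-efficient of variation (in %) across subjects.

    repeated_measurements: one sequence of repeated values per subject,
    each containing at least two measurements.
    """
    squared_cvs = []
    for values in repeated_measurements:
        subject_cv = stdev(values) / mean(values)  # within-subject SD / mean
        squared_cvs.append(subject_cv ** 2)
    return 100 * sqrt(mean(squared_cvs))


if __name__ == "__main__":
    # Hypothetical duplicate measurements (same day, with repositioning)
    duplicates = [(301.2, 302.5), (287.9, 286.8), (322.4, 322.0), (295.0, 296.1)]
    print(f"Short-term CV = {rms_cv_percent(duplicates):.2f}%")
```

Averaging the squared CVs (instead of the CVs themselves) gives more weight to subjects with poor repeatability, which is why the simple arithmetic mean of the individual CVs tends to look more flattering than it should.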

Name of outcome measurement 2

Often you will have more than one measurement. State the name in the title and describe as above.

Further data processing

Often, depending on the kind of data you have, some kind of processing will happen. This may be “image measurement” or this may be some kind of “signal processing”. Here you describe what happened to the data after you got it. You don’t need this section if the data from the machine gets fed more or less straight into the statistical models.

Statistical analyses

Here the statistical analyses are described. Your “advising statistician” should help you get this information. You need to give:

  • What kind of data were fed into the model (e.g. raw data, percentage change data)

  • What models were used, with a reference if appropriate.

  • How the models were set up (e.g. what factors, random effects, allowances for heterogeneity of variance etc.)

  • What co-variates (if any) were included. If co-variates (such as age, height, weight) had no influence on the outcome, state this here.

  • State what your “alpha level” was – i.e. what p-value was taken as “significant”. 0.05 is typical, but....

  • State the software used including version number

An example: “Where double baseline measurements were performed, these data were averaged prior to further analysis. Linear mixed-effects models (6) were used to assess changes with respect to subject-group (RVE, RE and CTR), study-date (“baseline”, day-5, day-10 and day-15) and a group×study-date interaction. Subject age, height and weight were included as linear co-variates. Where necessary, allowances for heterogeneity of variance according to study-date and/or subject-group were applied. Random effects for each subject were permitted. An α of 0.05 was taken for statistical significance. As multiple measurement sessions were undertaken on the same subjects, a Bonferroni adjustment was not performed; rather, we looked for consistent significant differences across time points. All analyses were performed in the “R” statistical environment (version 2.10.1, www.r-project.org).”

Results

At the start of your results you typically give general statements about study performance first. For example: were there drop-outs? Were data missing due to equipment failure or similar? (If so, state briefly “why”.) If you are reading this, it is probable that there were “different groups” in your study, hence it is appropriate in this sub-section of results to state whether there were differences (e.g. between intervention groups) at the beginning of the study. If not, a statement like: “At baseline, no differences were apparent between groups for any of the outcome measures (F all≤2.2, p all≥.13; Table 2)” helps, and adding the p-value gives the statement some more weight for the reader and helps to stress your thoroughness in data analysis.

Results for outcome 1 (replace this sub-section title with the name of the outcome measure)

Assuming you had some form of ANOVA in your analysis, it is, in my opinion, important to start with stating the F-statistics for the relevant main-effects (e.g. “study-date”) or interactions. An example might be:

“An influence of the interventions was seen on parameters a, b, c (group×study-date: F all≥2.2, p all≤.0065) but not on d, e, f (group×study-date: F all≤1.5, p all≥.20).”

This general statement will lead the reader into the following information on specific parameters. How the results are presented is a matter of taste, emphasis and also journal style. Journals typically have a limit on the number and size of tables and also on the number and size of figures. As a general rule of thumb, I usually put the main findings on the main parameters as figures and the rest of the results in tables, though this is not always the case. It is about communicating to the reader most effectively the main findings you want to get across from your study. After stating the ANOVA results, I typically then begin with statements like:

“Parameter a showed a marked decrease in intervention-group b over the course of the study, but not in group z (Figure y)”

The reference to the Figure will direct the reader to the appropriate place to find the data. In the text of the results you should avoid restating information that is already given in Figures or Tables (e.g. restating mean(SEM or SD) changes or restating p-values). For some reason which I – after ca. 40 publications – still have not understood, Figures and Tables are capitalized like “proper nouns” in the text with capital “F” and “T” respectively, even if such words actually are just “common nouns” (e.g. “Data on changes in parameter g are shown in Table 3”).

Results for outcome measurement 2

As above....

Discussion

In the first paragraph of your discussion you should summarize the findings of your study – what parameters were affected by the intervention, what not etc. Often, readers do not really understand the methods and results and hence just jump to the discussion for a summary. This belongs in the first paragraph. Avoid restating numbers already given in the results section, unless they are really really really really really important (so... as a “new author”, just don’t do it, even if you’re tempted).

In the second paragraph, one typically focuses on the main findings of the study in light of other studies – you try to get into the “meaning” of the results. How your discussion is structured will always depend on your study and what kind of results you have. Note that you do not “need” to discuss every single finding.

1) Focus on the main findings first

2) move down to secondary findings in the next paragraph(s)

3) perhaps discuss background issues, like:

(a) potential mechanisms of the effects (if appropriate)

(b) general implications such as implications for exercise/intervention prescription (if appropriate)

4) discuss limitations (see comment below) and maybe…

5) make recommendations for future research (if desirable) and then

6) write the conclusion (see below).

The length of a discussion will sometimes depend on the journal requirements: some journals have word limits which will restrict how much you can say. Some journals do not have restrictions, but as a rule of thumb, four A4 pages of double-spaced 12-point Times New Roman font should be regarded as an upper limit: if you feel the need to say more than this, then try to focus more on the more important things. A six-page, double-spaced 12-point Times New Roman discussion is an absolute upper limit.

Limitations: it is often appropriate to deal with the limitations of your study. This may be limited subject numbers, measurement repeatability, and potential influences of other (unmeasured) factors such as a “training effect” as a result of repeated measurement or the lack of an additional control group which may have helped the interpretation. These limitations can be mentioned as a formal separate paragraph or can also be fitted in with the relevant points of the discussion. Whilst I prefer integrating limitations with the rest of the discussion, my experience is that the majority of reviewers prefer a formal separate paragraph on limitations.

This example (adapted from ref. 3) has the limitations mentioned as a separate paragraph: “Whilst movement speed accuracy improved over the course of the study, no significant relationships existed between any of the movement accuracy variables and the EMG signal characteristics. It cannot, however, be wholly excluded that improvements in movement performance may have influenced the findings and it would be appropriate to study these electromyographic signal characteristics with skill acquisition.”

This example has the limitations (subject numbers and measurement repeatability) incorporated into the discussion of the findings: “The results also showed that there was little effect of the countermeasures on the bone changes at the distal tibia and distal radius during and after bed-rest. It may be possible to argue that the low numbers of subjects in each group coupled with measurement repeatability may have been part of the explanation for the non-significant impact of the countermeasures. Whilst this is conceivable, there was no consistency across parameters in, for example, losses in bone density or changes in bone structure in the exercise-group that may suggest a small (albeit non-significant) impact of the exercises......”

Concluding paragraph: make only conclusions that are supported by the findings of the study. I mean this seriously: (higher quality) journals ask reviewers to rate to what extent the conclusions of a manuscript are supported by the findings. So, focus ONLY on those things supported directly by your findings. Ensure the conclusions are given in a balanced way – e.g. that the intervention had an effect on parameter b but not on a or that the intervention affected “some, but not all” parameters (or something like that). Focusing only on the “significant” findings could lead to readers asking themselves why the authors ignored certain findings (e.g. perhaps some kind of underlying agenda) – so ensure they are “balanced”. It is sort of ok to make conclusions that are “strongly supported” by the data in light of available literature (e.g. if you didn’t have a control group for some kind of intervention but there is lots of literature stating that this particular intervention has no effect (or whatever) on a given parameter), though you need to be careful. It may also be ok to provide some recommendations for future work. An example sentence from a concluding paragraph (1): “Overall, whilst further countermeasure optimisation and investigation of other exercise principles (such as exercise “dose”) is required, the results provide evidence that whole-body vibration can increase the efficiency of exercise in preventing bone loss at some skeletal sites during and after prolonged bed-rest.”

Acknowledgements

Here acknowledge people who helped with the study (e.g. technical people). Thanking the subjects may be appropriate. Some journals (not many) require written permission from people with PhDs and/or MDs to be mentioned in the acknowledgements. Also mention funding sources (some journals require this funding information in a separate section, it just depends).

References

(The reference list style will depend upon the journal. The references here are done in the style for the Journal of Applied Physiology)

1. Belavý DL, Beller G, Armbrecht G, Perschel FH, Fitzner R, Bock O, Börst H, Degner C, Gast G, and Felsenberg D. Evidence for an additional effect of whole-body vibration above resistive exercise alone in preventing bone loss during prolonged bed-rest. Osteoporos Int 22: 1581-1591, 2011.

2. Belavý DL, Mehnert A, Wilson S, and Richardson CA. Analysis of burst and tonic electromyography characteristics in a repetitive-movement task: electromyographic simulation and comparison of novel morphological and linear-envelope approaches. J Electromyogr Kinesiol 19: 10-21, 2009.

3. Belavý DL, Richardson CA, Wilson SJ, Felsenberg D, and Rittweger J. Tonic to phasic shift of lumbo-pelvic muscle activity during 8-weeks of bed-rest and 6-months follow-up. J Appl Physiol 103: 48-54, 2007.

4. Glüer CC, Blake G, Lu Y, Blunt BA, Jergas M, and Genant HK. Accurate assessment of precision errors: how to measure the reproducibility of bone densitometry techniques. Osteoporos Int 5: 262-270, 1995.

5. Kottner J, Audigé L, Brorson S, Donner A, Gajewski BJ, Hróbjartsson A, Roberts C, Shoukri M, and Streiner DL. Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol 64: 96-106, 2011.

6. Pinheiro JC and Bates DM. Mixed-effects models in S and S-PLUS. Berlin: Springer, 2000.

7. Rittweger J, Beller G, Ehrig J, Jung C, Koch U, Ramolla J, Schmidt F, Newitt D, Majumdar S, Schiessl H, and Felsenberg D. Bone-muscle strength indices for the human lower leg. Bone 27: 319-326, 2000.

8. Rittweger J, Frost HM, Schiessl H, Ohshima H, Alkner B, Tesch P, and Felsenberg D. Muscle atrophy and bone loss after 90 days bed rest and the effects of flywheel resistive exercise and pamidronate: results from the LTBR study. Bone 36: 1019-1029, 2005.

This is an example of a “simple” table. A short title (sometimes subject to a word limit) goes above the table and the caption below then describes the information in detail. There should be one Table to a page and each Table should be able to be understood from the information given without having to refer to the text. The layout used here is the style of the Journal of Applied Physiology: a double line at the top, a single line between the titles and the data, a single line at the bottom, an empty line between the titles and the start of the data and another empty line after the last data line. Most journals do not have requirements for table layout or appearance, so I just use this style as I think it “looks nice”.

Table 1: Baseline subject characteristics

Values for age, weight and height are mean(SD). CTR: inactive control group; RE: resistive exercise only group; RVE: resistive exercise with whole-body vibration group. There were no differences between groups for any of these variables (F all<1.2, p>.33).

Note that the caption states what the data are (in this case mean(SD)), what the abbreviations mean and also gives extra information (in this case the group differences) as required. Keep the number of Tables to seven at most, and even with this many you have to start asking yourself whether you’ve chosen the appropriate format for presenting the data.

This is an example of a more “complex” table with multiple different variables for each of the groups over time.

Table 2: Percentage change in lumbar spine and whole body bone mineral content.

Values are mean(SEM) percentage change compared to baseline. CTR: inactive control group; RE: resistive exercise only group; RVE: resistive exercise with whole-body vibration group. Significance indicated by: a: p < 0.05; b: p < 0.01; c: p < 0.001. HDT: day of head down tilt bed-rest, R+: day of recovery. “d”, “e”, “f” indicate, respectively, significant differences in change compared to baseline in CTR vs. RE, RVE vs. RE, RVE vs. CTR comparisons. BMC: bone mineral content. DXA: dual X-ray absorptiometry.

Note that there are different “panels” with the description of each parameter written in italics.

Example figure labels (short) and captions (often longer to explain everything). The same principles apply as for Tables: each Figure should be able to be understood from the information given. An example is given here of a Figure showing “something happening” (Figure 1) and the second example (Figure 2) shows “data”. Seven Figures is the absolute maximum, and even that is a bit too much: you may need to reconsider whether the data are being presented in the most effective format.

Figure 1: Countermeasure exercise

Both the resistance exercise only (RE) and resistance exercise with whole-body vibration (RVE) groups performed their exercises on the specially designed Galileo Space exercise device. Subjects were positioned in head-down tilt on a moveable platform with shoulder pads and hand grips preventing downward movement and permitting application of force via the platform. A pneumatic system generated the force, applied through the moveable platform, against which the subject needed to resist and move (via the shoulder pads and hand grips). The feet were positioned on either side of a platform which was set to vibrate in the RVE group. Subjects were given visual feedback of their actual and target position in the exercise via a monitor placed in the subjects’ field of view. As the force output was dictated by the exercise device, the feedback focussed on ensuring the subjects performed the exercise in the desired range of motion and at the desired speed. During the heel raises the sport scientist monitored the range of movement and encouraged the subject to go to the end of range in each direction. Here the subject is performing a back extension exercise.

Figure 2: Percentage changes in tibia bone mineral content from pQCT.

Top image: distal tibia, total bone mineral content. Bottom image: tibial diaphysis, cortical bone mineral content. Values are mean(SEM) percentage change compared to baseline. CTR: inactive control group; RE: resistive exercise only group; RVE: resistive exercise with whole-body vibration group. Significance indicated by: a: p < 0.05; b: p < 0.01; c: p < 0.001. HDT: day of head down tilt bed-rest, R+: day of recovery. Superscripts “d”, “e”, “f” indicate, respectively, significant differences in change compared to baseline in CTR vs. RE, RVE vs. RE, RVE vs. CTR comparisons. See online supplementary material for further details on comparison between groups from ANOVA.