This is your initial task, but remember your primary task is to write. Therefore, we try to prepare for writing while we are reading.
Start by reviewing quality work first. Go to Scopus or Web of Science or similar. Try different keyword combinations, then sort your results by ‘highest citations.’ Try to find the most relevant but also well-cited papers to begin your review. This should help you get a good grasp of what is considered to be important/core to that area of interest.
Then download the PDF and also the EndNote reference (or the equivalent for whatever referencing app you use). This is crucial: if you are not using EndNote or a similar app, you are not following the fundamentals of good report writing. This does relate to marks (points for referencing, APA format) and also to professionalism in writing. Writing competent reports requires appropriate referencing. Using referencing software saves many hours of work and results in fewer citation/referencing mistakes.
You can’t depend on apps like EndNote to always get it right! You still have to correct many of the references within the app, which is typically easy, though it does require learning how the app works. Check APA and memorize its standards. Frequently, you will find issues with capitalization of article titles and/or journal names. Know the APA rules for these elements and check them in your EndNote file. It is best to fix them in EndNote before citing in your papers.
Read – critically! How do you usually read a research paper from a journal, or an organizational report such as one from the WHO? After the title, which is what decides whether we read on or not, what should we read next? If it is a research report or other report presenting original data and analyses, as opposed to an essay or similar, we might read as follows:
Start with the abstract. This will help orient you to the overall work. It is also the second most-read section (after the title), and many people don’t go beyond it. If the paper is probably not going to be useful for your needs, then go ahead and stop reading; you probably have too many potentially useful papers to read anyway. If you see something that seems useful to you, continue on to the paper.
Next, the method (we are skipping the intro for now). Option B: go to the Results first (for those who are competent in statistics and are looking for specific results). In the method, check some pertinent points (which will vary depending on the type of study and your purposes).
How many were in the sample? This is the most cited statistic, but many don’t cite it appropriately. For that type of research/study, is the sample size very small, small, moderate, or large? Remember, a qualitative study might be considered large with 20 participants, and a scale development study small with 300 participants.
Where did they get the sample? University students are often a red flag for a less than ideal study, unless the study relates specifically to students. Note that there are two main types of representativeness: how well a sample represents (is generalizable to) a specific population, and how well it represents manifestations of specific constructs.
The first type concerns a geographic population. So, does the sample match that population on gender, ethnicity, age levels, and other factors?
The second type of representativeness concerns the constructs under study. This aspect is rarely reported but is crucial to the integrity of the findings. The easiest way to look at this is through range restriction. Range restriction refers to a situation where participants do not cover the full range of possible scores on a specific measure. For example, if you were developing a new IQ test, you would want a very large sample that fully covers the range from the lowest possible to the highest possible score on your test. Did the study report on this matter? Can you see evidence that they did not include people at the lower or higher ends of the spectrum? Many studies are on very specific groups, such as clients/patients with a specific mental disorder. In such cases you will almost certainly see range restriction (e.g., no or very few low BDI-II scores). That can be fine, as it relates to study objectives, but it does negatively affect statistical analyses (check your statistics textbooks). To accurately show correlation-type associations (e.g., correlations, regression models, factor analysis), it is important to have a sample that covers the entire range well. That roughly means multiple cases for each data point.
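To see why range restriction matters for correlation-type analyses, a small simulation can help. The sketch below is illustrative only: the scores are simulated standard-normal values, and the true correlation of .60 and the cut-off of one standard deviation above the mean are assumptions chosen for the demonstration, not values from any study.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_R = 0.6              # assumed population correlation (hypothetical)
N = 5000                  # simulated participants

# Simulate two standardized measures that correlate about TRUE_R.
x = [rng.gauss(0, 1) for _ in range(N)]
y = [TRUE_R * xi + (1 - TRUE_R ** 2) ** 0.5 * rng.gauss(0, 1) for xi in x]

# Full-range sample versus a sample restricted to high scorers on x
# (e.g., a clinical group where low scores are absent).
r_full = pearson_r(x, y)
high = [(xi, yi) for xi, yi in zip(x, y) if xi > 1.0]
r_restricted = pearson_r([p[0] for p in high], [p[1] for p in high])

print(f"full range:  n = {N}, r = {r_full:.2f}")
print(f"restricted:  n = {len(high)}, r = {r_restricted:.2f}")
```

The restricted sample should show a clearly weaker correlation than the full sample, even though the underlying relationship is the same in both.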
What was the age range? Often more important than the mean or standard deviation. Range is important but how is the sample distributed? How many people are at each age? A range of 18-90 might look broad, but in a total sample of 1000 what if there were only 5 participants under 30 years old, but 900 over 70 years? That would be primarily an older sample and findings related to younger people may not be valid.
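If you have access to the raw data, or the paper reports counts per age group, this check is easy to do yourself. The sketch below rebuilds the hypothetical sample described above (1000 people, 5 under 30, 900 over 70); the individual ages and the band cut-offs are made up for illustration.

```python
from collections import Counter

def age_band(age):
    """Assign an age to one of three illustrative bands."""
    if age < 30:
        return "under 30"
    if age <= 70:
        return "30-70"
    return "over 70"

# Hypothetical sample: the range is 18-90, but most people are over 70.
ages = [18, 22, 25, 27, 29] + [50] * 95 + [75] * 899 + [90]

counts = Counter(age_band(a) for a in ages)

print(f"n = {len(ages)}, range = {min(ages)}-{max(ages)}")
for band in ("under 30", "30-70", "over 70"):
    share = counts[band] / len(ages)
    print(f"{band:>8}: {counts[band]:4d} ({share:.1%})")
```

The reported range looks broad, but the counts per band show that the sample says very little about people under 30.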
What was the gender/ethnic/etc. distribution? Note that equal proportions (e.g., 50/50) may be ideal but aren’t required; 70/30 splits for a dichotomy can be quite acceptable. There just has to be a sufficient number in the smallest group. What is sufficient? That depends on the area, the type of statistics, etc. Often, anything worse than a 90/10 split (e.g., 91% non-smokers, 9% smokers) will not be appropriate for many statistical analyses comparing those groups.
What about multiple distributions? A sample of 1000 might report 50/50 females/males and an age range of 18-80. That might look good, but what if 90% of females were over 50 years old, and 80% of males were under 40 years old? Some report writers state that a sample is representative because of basic matches on group proportions (e.g., the number in an ethnic group reasonably matches the population proportion). However, the data may show very skewed proportions, such as one ethnic group being mostly older and more female than another ethnic group. These details are almost never stated but might be considered in a critical review.
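A cross-tabulation makes this kind of hidden imbalance visible. The sketch below rebuilds the hypothetical sample described above (1000 people, 50/50 by gender, 90% of females over 50, 80% of males under 40); the individual ages and the band cut-offs are invented for illustration.

```python
from collections import Counter

def age_band(age):
    """Assign an age to one of three illustrative bands."""
    if age < 40:
        return "under 40"
    if age <= 50:
        return "40-50"
    return "over 50"

# Hypothetical records: (gender, age). Marginals look fine: 500/500, ages 18-80.
records = (
    [("female", 80)] + [("female", 65)] * 449 + [("female", 35)] * 50
    + [("male", 18)] + [("male", 30)] * 399 + [("male", 60)] * 100
)

by_gender = Counter(gender for gender, _ in records)
crosstab = Counter((gender, age_band(age)) for gender, age in records)

bands = ("under 40", "40-50", "over 50")
print(f"{'':8}" + "".join(f"{b:>10}" for b in bands))
for gender in ("female", "male"):
    row = "".join(f"{crosstab[(gender, b)]:>10}" for b in bands)
    print(f"{gender:8}{row}")
```

Each margin on its own looks balanced, but the joint table shows that gender and age are almost completely confounded in this sample.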
What measures/instruments did they use? Were they the best available? How did they perform in that study (often not reported)? Did they use both high and low performing measures for analyses and ignore the differences in quality? One thing to check is coefficient alpha (Cronbach’s α), but note that what is ‘good’ is debatable, and there are ways to manipulate (inflate) alpha. In general, any measure reporting α < .70 is not doing what was intended, which is to give a reasonably accurate depiction of that latent factor. Even values of .70 to .80 can be considered low internal consistency. For quality measures, we want to see approximately α ≥ .85 (for clinical work we often aim for ≈ .95).
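For readers who want to see what alpha is actually computed from, here is a minimal sketch using the usual formula, α = k/(k − 1) × (1 − sum of item variances / variance of total scores). The item scores are invented for illustration (6 respondents, 4 items scored 0 to 3) and do not come from any real scale.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: each row is one respondent, each column one item.
scores = [
    [3, 3, 2, 3],
    [1, 0, 1, 1],
    [2, 2, 2, 1],
    [0, 0, 1, 0],
    [3, 2, 3, 3],
    [1, 1, 0, 1],
]

alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f}")
```

In practice you would use a statistics package and a far larger sample; the point here is only that alpha depends on how much of the total-score variance is shared across items.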
Alpha inflation – this happens when alpha is calculated over too many items. For example, looking at some real data, for the DASS Depression scale (7 items), α = .94, but for the full DASS (21 items), α = .96. That is not a dramatic increase (something of a ceiling effect here), but it does show how adding extra items, from correlated but distinctly different factors, can artificially inflate alpha. Note that different factors should almost never be combined for calculating internal consistency; each alpha should represent a unidimensional factor.
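The effect of item count can be seen directly in the formula for standardized alpha, α = k × r̄ / (1 + (k − 1) × r̄), where k is the number of items and r̄ is the average inter-item correlation. In the sketch below, the two average correlations (.69 and .53) are hypothetical values chosen for illustration; they are not the actual DASS inter-item correlations.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha from item count k and average inter-item correlation."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

# Hypothetical: 7 items from a single factor, strongly intercorrelated.
alpha_single_factor = standardized_alpha(7, 0.69)

# Hypothetical: 21 items mixing three factors, so the average correlation is lower.
alpha_mixed_factors = standardized_alpha(21, 0.53)

print(f" 7 items, mean r = .69: alpha = {alpha_single_factor:.2f}")
print(f"21 items, mean r = .53: alpha = {alpha_mixed_factors:.2f}")
```

Even though the items in the longer set hang together less well on average, the larger item count pushes alpha higher, which is why a high alpha on a multi-factor total score says little about unidimensionality.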
Be critical – note both strengths and weaknesses, but be sceptical. Sometimes claims of representativeness etc. are not accurate.
Results. Next, check the main findings. Look closely at what they did, the type of analyses (note the validity of the measures/groupings used), and the significance of the results. For example, p values are important but don’t tell the full story. A difference between two groups may show p < .001, but with a large sample (e.g., 500+) the effect size could still be very small even though the difference is statistically significant. Check a table of effect sizes: Google this and bookmark a good site that compares different effect size statistics, with references. Also find at least one source to cite (usually Cohen, but there are many others). Effect sizes are how we interpret statistical outcomes as ‘small, medium, large,’ etc. These also don’t tell the full story but are very useful. Overall, what do the results say and how do they reflect strengths and weaknesses of the study?
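The gap between statistical significance and effect size can be checked from the summary statistics most papers report. The sketch below uses invented numbers (two groups of 1000, means of 102.5 and 100, SD of 15 in each) and a normal approximation for the p value, which is reasonable for samples this large; Cohen's commonly cited benchmarks for d are roughly 0.2 (small), 0.5 (medium), and 0.8 (large).

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

def two_sided_p_normal(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided p value for a mean difference (large-sample normal approximation)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = abs(mean1 - mean2) / se
    return math.erfc(z / math.sqrt(2))

# Hypothetical summary statistics for two large groups.
group_a = (102.5, 15.0, 1000)   # mean, SD, n
group_b = (100.0, 15.0, 1000)

d = cohens_d(*group_a, *group_b)
p = two_sided_p_normal(*group_a, *group_b)

print(f"d = {d:.2f}, p = {p:.5f}")
```

Here the difference is significant at p < .001, yet d falls below Cohen's benchmark for a small effect, so the result would need careful interpretation.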
Introduction/Discussion. Each of these is important, but your needs will determine which is more important. The Intro may help with your lit review and help you understand background issues. Intros can be great reviews of the literature and point you toward good sources, theories, etc. The Discussion should point out study strengths and weaknesses, main findings, and future directions. Do not just copy the stated strengths and weaknesses; treat this section critically also. Likewise, do not just copy the authors’ ideas for future work; you need to come up with your own future directions or state that you agree with certain authors in their proposals. Try to find some good ideas in these sections to guide your own writing, in terms of flow, terminology, sentence structure, etc.
Take notes. While you are reading, take notes from the source: write down main points, findings, etc., and cite using EndNote. This will make your later write-up easier.