This criterion assesses the extent to which the student’s report provides evidence of evaluation of the investigation and the results with regard to the research question and the accepted scientific context.
Although it may appear that the evaluation asks you to repeat the analysis of the data and the drawing of a conclusion, the focus is different. The data and conclusion come under scrutiny again but, in the evaluation, the conclusion is placed into the context of the research question. So, in the analysis, it may be concluded that there is a positive correlation between x and y; in the evaluation, you are expected to put this conclusion into the context of the original aim. In other words, does the conclusion support your original thinking on the topic? If not, considering why it does not will lead into an evaluation of the limitations of the method and suggestions as to how the method and approach could be adjusted to generate data that would support a firmer conclusion.
Variability of the data may well be mentioned again in the evaluation as this provides evidence for the reliability of the conclusion. This will also lead into an assessment of the limitations of the method. It is the focus on the limitations that is at issue in the evaluation, rather than a reiteration that there is variability.
Extra notes:
Writing Conclusions
Don't start by saying whether you got it right or not. Talk about the graph and talk about the data, making sure you use the appropriate scientific terminology.
Talk about patterns, trends and anomalies first. Make sure you refer to the data directly.
Standard deviation and error bars are often misunderstood and wrongly interpreted. Error bars should be discussed differently depending on whether you are describing a trend (curve of best fit) or looking for a significant difference (t-test), as in the sketch after this list:
Curve of best fit – are the error bars small enough to allow the trend to be identified easily? Or are the error bars so large that the best-fit curve could be drawn differently, even reversing the trend, i.e. weak or no correlation?
t-test – a large overlap means either that there is very little difference between the data sets, that there is a lot of natural variation making it hard to identify differences, or that the sample is too small and produces large error bars.
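A minimal sketch of both checks in Python, assuming repeat measurements for two treatments (all names and values here are hypothetical, for illustration only):

```python
import numpy as np
from scipy import stats

# Hypothetical repeat measurements for two treatments (n = 5 each)
treatment_a = np.array([12.1, 11.8, 12.4, 12.0, 11.9])
treatment_b = np.array([12.3, 12.6, 12.2, 12.8, 12.5])

# Error bars as mean ± one sample standard deviation
mean_a, sd_a = treatment_a.mean(), treatment_a.std(ddof=1)
mean_b, sd_b = treatment_b.mean(), treatment_b.std(ddof=1)
print(f"A: {mean_a:.2f} ± {sd_a:.2f}   B: {mean_b:.2f} ± {sd_b:.2f}")

# Do the error bars overlap? A large overlap suggests the difference
# between the data sets may not be significant.
print("error bars overlap:", mean_a + sd_a > mean_b - sd_b)

# t-test for a significant difference between the two data sets
t_stat, p_value = stats.ttest_ind(treatment_a, treatment_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # p < 0.05 -> significant
```

Note that with these made-up numbers the error bars overlap slightly yet the t-test still finds a significant difference, which is exactly why the two checks should be discussed separately.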
Associate findings with qualitative data, if appropriate.
This is your first mention of random error: how good is the curve fit? How much variation around the mean is there? Can you draw a definite conclusion? (See the sketch below.)
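One way to put a number on the quality of the fit is the coefficient of determination, R²: a value near 1 means little scatter around the fitted curve. A sketch assuming a simple straight-line fit (the data is hypothetical):

```python
import numpy as np

# Hypothetical (x, y) data: independent variable vs mean of the repeats
x = np.array([10, 20, 30, 40, 50])
y = np.array([4.2, 7.9, 12.3, 15.8, 20.1])

# Fit a straight line and look at the residuals
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# R^2 = fraction of the variation around the mean explained by the fit;
# a value near 1 means little scatter, so a firmer conclusion can be drawn
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"y = {slope:.3f}x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```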
Do your findings fit with accepted theory (or not)? To argue this most effectively, include references/quotes to published material, whether it be a textbook, weblink, scientific journal or another reliable source. Even better would be to compare with published data, if appropriate. Are your results consistent with what others have found before?
How do your findings fit with your hypothesis (if design is part of the investigation)? Use appropriate language for this: “supports my hypothesis”, not “proves” or “is correct”.
End by talking about the wider implications of your findings (if any) and suggest further steps / investigations.
Writing Evaluations
Evaluations are best done in a table format (an example layout is sketched below). This way you are more likely to address each point fully:
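For example, one row of such a table might look like this (the weakness and the suggested fix are hypothetical):

Source of error / weakness | Type | Significance | Realistic improvement
Timing was started only after the reagents were fully mixed, so every reading is a few seconds short | Systematic | All times are shifted the same way, so the calculated rates are consistently too high | Start timing as the first drop of reagent is added, or record the mixing on video and read the start time from the frames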
Identify all your sources of error/areas of weakness/control variables omitted from the design
What kind of error is it? Systematic or random?
Discuss how significant the error is
Suggest improvements which are realistic and can be implemented using the information you provide – name (and give the size of) the suggested equipment change; if you recommend extending the range of data collected, suggest values
Discussion of random error should first focus on the spread of the data – how wide are your error bars? The bigger the error bar, the more natural variation there is. Anomalous results may also indicate random error. This discussion may be backed up with qualitative data.
Discussion of systematic error covers how well the equipment and the method worked. The first port of call should be the measuring equipment you used and how you used it; then the method – how the variables were changed, measured and/or controlled (see the sketch below).
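The distinction can be made concrete: random error shows up as spread around your own mean, systematic error as a consistent offset from an accepted value. A minimal sketch with hypothetical readings:

```python
import numpy as np

# Hypothetical repeat readings of a quantity with a known accepted value
readings = np.array([9.58, 9.61, 9.55, 9.63, 9.59])  # e.g. g in m/s^2
accepted = 9.81

mean = readings.mean()
sd = readings.std(ddof=1)

# Random error: the spread of the readings around their own mean
print(f"spread (random error): ±{sd:.3f}, {100 * sd / mean:.1f}% of the mean")

# Systematic error: a consistent offset of the mean from the accepted value
# that is much larger than the spread points to the equipment or the method
print(f"offset from accepted value (systematic error): {mean - accepted:+.3f}")
```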
Other questions you should consider are:
Are there any obvious omissions? If so, discuss them.
What parts of the method could be carried out better if done differently?
Is the scope of the experiment correct? For example, is the age/gender of a subject important, or would it be better to select only females aged 16–18? Should the experiment be extended to a different species to see if all species respond in the same way?
Systematic error also includes limitations caused by lack of data which, although they should be fully addressed in the evaluation, will probably be first discussed in the conclusion:
Were there enough changes to the independent variable to identify the optimum value to your satisfaction?
Was the data range wide enough to confidently see diminishing returns?
If the error bars are large, do more samples need to be taken to make the findings more reliable? This point, of course, concerns both random and systematic error (see the sketch below).
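The effect of sample size on the error bar is easy to demonstrate: the standard error of the mean shrinks with the square root of the number of repeats. A quick sketch with a hypothetical standard deviation:

```python
import numpy as np

sd = 2.0  # hypothetical sample standard deviation of the repeats
for n in (3, 5, 10, 20, 50):
    sem = sd / np.sqrt(n)  # standard error of the mean
    print(f"n = {n:2d}: error bar (SEM) ≈ ±{sem:.2f}")
```

So quadrupling the number of repeats roughly halves the error bar, which is the kind of concrete, sized suggestion an evaluation should make.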
N.B. It should not be necessary to discuss human error or zero errors if the method was carried out correctly. If it was not, why did you not repeat or modify the lab where necessary?