alpha. Magnitude-based inference was used in all subsequent analysis. Magnitude-based inference offers a theoretically justified and practically useful approach in any behavioural research that involves statistical inference (van Schaik & Weston, 2016). The approach uses the smallest important effect in making an inference: a clear effect is never an artefact of sample size, as can happen in null-hypothesis testing when the hypothesis of no effect is tested. The spreadsheets with detailed results are presented as online supplementary materials (adapted from Hopkins, 2007). Outcomes of positivity, triviality and negativity are quantified with probabilities and matching qualitative descriptors, providing a rich type of inference. For interpretation of the obtained probabilities, the following qualitative probabilistic terms are applied: [0; 0.005>: most unlikely, almost certainly not; <0.005; 0.05]: very unlikely; <0.05; 0.25]: unlikely, probably not; <0.25; 0.75]: possibly; <0.75; 0.95]: likely, probably; <0.95; 0.995]: very likely; <0.995; 1]: most likely, almost certainly (Batterham & Hopkins, 2006). For quantitative analysis, descriptive statistics were produced with SPSS. Inferential statistics (t-tests) were run in SPSS and the results were then entered into spreadsheets to produce results of magnitude-based inference (Hopkins, 2007). Unrelated t-tests were conducted with regard to the smallest important effect size, as defined by the research team: in magnitude-based inference, results are presented for a small effect as the threshold for the smallest important beneficial or positive effect (d = 0.2 in t-tests) and for the smallest harmful or negative effect (d = -0.2) (Hopkins, 2010).
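As an illustration of how the probabilities and qualitative descriptors listed above fit together, the following minimal Python sketch computes the chances that a true effect is positive, trivial or negative and attaches a descriptor to each. It is not the spreadsheet of Hopkins (2007): the function names are hypothetical, a normal sampling distribution is assumed in place of the t distribution, and the treatment of values that fall exactly on a band boundary is an assumption.

```python
from statistics import NormalDist

# Upper bounds of the probability bands and their descriptors, as listed in
# the text (Batterham & Hopkins, 2006).
# Assumption: upper bounds are inclusive, except for the first band (< 0.005).
BANDS = [
    (0.05, "very unlikely"),
    (0.25, "unlikely, probably not"),
    (0.75, "possibly"),
    (0.95, "likely, probably"),
    (0.995, "very likely"),
    (1.0, "most likely, almost certainly"),
]


def describe(p: float) -> str:
    """Map a probability to its qualitative descriptor."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if p < 0.005:
        return "most unlikely, almost certainly not"
    for upper, label in BANDS:
        if p <= upper:
            return label
    return BANDS[-1][1]  # not reached for valid p; kept for completeness


def outcome_probabilities(effect: float, se: float, threshold: float = 0.2) -> dict:
    """Chances that the true effect is negative, trivial or positive.

    effect    : observed standardised effect (d)
    se        : standard error of the effect
    threshold : smallest important effect (d = 0.2 in the text)
    Assumption: normal approximation to the sampling distribution.
    """
    dist = NormalDist(mu=effect, sigma=se)
    p_negative = dist.cdf(-threshold)
    p_positive = 1.0 - dist.cdf(threshold)
    p_trivial = dist.cdf(threshold) - dist.cdf(-threshold)
    return {"negative": p_negative, "trivial": p_trivial, "positive": p_positive}


if __name__ == "__main__":
    # Hypothetical input values, for illustration only.
    for outcome, p in outcome_probabilities(effect=0.6, se=0.25).items():
        print(f"{outcome}: {p:.3f} ({describe(p)})")
```

For example, an observed effect equal to the threshold (`outcome_probabilities(0.2, 0.1)`) gives a probability of 0.5 that the true effect is positive, which `describe` labels as "possibly".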
The unrelated t-tests were used to analyse differences between the two treatment orders in (i) outcome measures at baseline, (ii) change in outcome measures from baseline for SM and TM separately and (iii) difference in change between SM and TM. Analysis (i) was to establish whether measures at baseline differed between the two treatment orders (SM first or TM first). If no difference occurred, then this would be an additional justification for analysing change from baseline for individual treatment orders. Analyses (ii) and (iii) were to test for a differential effect of treatment depending on treatment order. The results were used to decide whether subsequent analysis over the two combined treatment orders was valid or whether subsequent analysis would have to be conducted separately for each treatment order. Related t-tests were then conducted to test for change from baseline after each of the two treatments on outcome measures and for a difference of change between the two treatments. AD-ACL data after TM from one participant were missing. To avoid introducing bias, we did not replace the missing data with estimates. Qualitative content analysis was based on a summative approach, which goes beyond counting words to incorporate interpretation of the underlying meaning of the words (Hsieh & Shannon, 2005). Thematic construct analysis (Dismore et al., 2016) was used to explore the social constructions and experiences from all three interviews and diaries, using a biopsychosocialphysics model of complementary therapy (van Wersch et al., 2009) to frame data analysis. The biopsychosocialphysics model (van Wersch et al., 2009) is an extension of Engel’s (1977) biopsychosocial model that includes physics in a multidimensional understanding of health and wellbeing, as opposed to the one-dimensional biomedical model; this is especially important in the understanding of energy therapies.
Commonalities in data were identified, working from a critical perspective for which researchers remained as faithful as possible to participants’ own accounts, while also looking for discourses from a narrative or anecdotal perspective (in which deductive expressions were sought that were in line with the assumed massage benefits). Each interview transcript was read and re-read separately by two coders (to address potential bias, neither of whom had any qualification, experience or training in the delivery of massage and neither of whom was involved with the choice of massage type applied in this study) to ensure familiarisation with the data, following which coding began. Interesting aspects were identified through written notes, which formed the basis of repeated patterns. Once codes were established, they were compared and agreed across coders and sorted into potential themes with the relevant coded data extracts. Finally, quotes were reviewed, refined and organised into the final themes. Thematic analysis was carried out, blind to the participants’ names and details, on the diaries by two coders using the method of Krippendorff (2004). One of the coders had no qualification, experience or training in the delivery of massage and was not involved with the choice of massage types investigated in this study.

Results

Internal-consistency reliability was good for most subscales of the AD-ACL, with Cronbach’s alpha > 0.70. The exceptions were calmness (alpha = 0.53 after SM; 0.60 at baseline) and tiredness (alpha = 0.67 after TM). With one item removed from these subscales (placid for calmness; wakefulness [reversed] for tiredness), reliability became good (0.75 for calmness after TM; 0.73 for tiredness at baseline) or acceptable (0.61 for calmness after SM).
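For readers who wish to reproduce this kind of reliability check, the sketch below shows the standard formula for Cronbach’s alpha, together with alpha recomputed after dropping each item in turn. The function names and any data passed to them are hypothetical; the reliability values reported here were not produced with this code.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for one subscale.

    items[i][j] is the score of respondent j on item i.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per respondent, summed over the items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items: list[list[float]]) -> list[float]:
    """Alpha recomputed with each item removed in turn (needs >= 3 items)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]
```

Comparing `cronbach_alpha` with the values returned by `alpha_if_item_deleted` shows whether removing a single item, as was done for the calmness and tiredness subscales, raises reliability.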
However, irrespective of whether these two items were included, correlations between subscales with and without items removed were exceedingly high (≥ 0.94). Therefore, subscale total scores were calculated from all five items per subscale (Energy, Tiredness, Tension and Calmness); by summing subscales, scale total scores were calculated for desirable arousal (Energy + Tiredness reversed + Tension reversed + Calmness) and active arousal (Energy + Tiredness reversed + Tension + Calmness reversed). Subscale and scale scores were used in subsequent analysis.

Baseline scores. Descriptive statistics (See Table 1, b; Table 1, c) indicated that mean baseline scores were similar between the two treatment orders. In support of this observation, magnitude-based inference showed that there were no clear differences between the two treatment orders (See Supplementary Materials A, Tab 1 unrelated t-test dBaseline).

Place Table 1 (a, b and c) here. Descriptives and effect size of outcome measures

Change from baseline after treatment. Descriptive statistics and effect sizes (See Table 1, a) over all data indicated moderate-to-large improvements (d = 0.6 to 1.2; Hopkins et al., 2009) from baseline after SM on Desirable arousal, Active arousal, Energy and Tiredness (subscale). Thematic content analysis indicated that although both TM and SM showed improvements in sleep, with mental and physical relaxation and destressing effects, twice as many participants receiving SM reported these beneficial effects as those receiving TM. The results from content analysis at CO show that TM revealed experiences related to more physical energy and body awareness, such as: ‘Assists physical and emotional/ mental wellbeing/holistic’ (n = 8, 80%); ‘Good for energising/ motivating’ (n = 4, 40%); ‘encouraged to think about body/posture’ (n = 4, 40%).
By contrast, SM demonstrated a relaxing and calming effect which was experienced by all participants receiving SM: ‘Enables relaxation/ stress reducing/ very calming’ (n = 10, 100%). Benefits were also experienced in a musculoskeletal capacity: ‘Valuable for physical aches and pains’ (n = 6, 60%) (See Figure 1).

Place Figure 1 here. Quantitative content analysis.

Thematic construct analysis revealed discourse that was categorised into eight themes. The first four were found in both SM and TM: ‘Improved energy’; ‘Improved sleep’; ‘Relaxing and destressing’; and ‘Relief of muscular tension’. The latter four were found only in TM results: ‘Awakening/rejuvenating’; ‘Promoting motivation to engage with physical activity’; ‘Improved posture/flexibility’; and ‘Life changing/psychological stimulating/positivity’ (See Figure 2 and Figure 3).

Place Figure 2 here. Themes of beneficial effects for TM and SM.

Place Figure 3 here. Themes for beneficial effects of TM only.

Nine participants cited TM as improving energy or being energising: “I felt much more energised following my treatment”. By contrast, five participants commented on SM as improving energy, with two reporting the opposite effect of lethargy: “Relaxing, sleep inducing, but not invigorating. Felt really lethargic” 49 (5:57). (Throughout this study, the numbers following a direct quote taken from participant diaries represent the diary thematic analysis code number (participant number: diary line number).) More detailed quantitative analysis indicated a small-to-moderate improvement from baseline (d = 0.2 to 0.6; Hopkins et al., 2009) following SM, but a large improvement after TM when SM came before TM (Table 1, b). Moreover, there was a large improvement after both SM and TM when TM came before SM (See Table 1, c).
These results are indicative of a specific carryover effect: a large improvement after TM was maintained after subsequent SM. The results of magnitude-based inference (See Supplementary Materials A, Tab 2 unrelated t-test dSbdTb) provide statistical