CASE STUDIES


Poster at the European Food Safety Authority

posted 15 Oct 2018, 07:20 by Jean M Russell

Roberta Fabrisi is doing a PhD with the Grantham Centre on the portrayal of GMOs in the media. She has been exploring how GMOs were presented within the debate over Brexit, looking at data from January 2017 to December 2017. She had the chance to report her findings at the EFSA conference "Science, Food and Society". She has made extensive use of the NVivo software, one of the centrally provided packages aimed specifically at researchers. NVivo helps qualitative analysts manage their analysis, as they typically work with rich, lightly structured data.

A New Automated Solar Feature Recognition Facility: Sheffield Solar Catalogue (SSC)

posted 5 Oct 2018, 04:58 by Norbert G Gyenge   [ updated 5 Oct 2018, 05:11 ]

N. Gyenge, H. Yu (余海东 ), V. Vu, M. K. Griffiths and R. Erdélyi

Regular sunspot observations (sunspots are darker regions on the solar surface; see https://en.wikipedia.org/wiki/Sunspot) were established as early as the 16th century. Since then, the revolution in IT techniques and tools has reshaped the daily routine of the solar observatories, whose main task was, and still is, to build up long-term catalogues of a wide range of solar features. The mostly manual workload has gradually been replaced by automated solutions, such as robotic telescopes and automated feature recognition algorithms. Nevertheless, some manual elements still remain part of the normal daily routine of many astrophysical institutes.

The Sheffield Solar Catalogue (SSC) project (https://github.com/gyengen/SheffieldSolarCatalog) is a free and open-source software package for the analysis of solar data. It intends to establish a fully automated solar feature recognition environment, taking raw solar observations through to a user-friendly, science-ready data source. The underlying core program provides a real-time, comprehensive solar feature analysis environment aimed at assisting researchers in solar physics and astronomy.

At this stage of development, SSC is able to generate sunspot data fully automatically, based on white-light continuum and magnetogram observations by the Solar Dynamics Observatory (SDO) (https://en.wikipedia.org/wiki/Solar_Dynamics_Observatory) satellite [1]. Although the project currently focuses on sunspot group and sunspot identification, the database will later be extended to other solar features, such as solar pores, faculae, coronal holes, jets, spicules and other solar phenomena.

Figure 1 shows the flowchart of the project, where the rectangles indicate the most important parts of the source code. The source code can be separated into three different layers, as shown in the lower yellow rectangle. The backend (or engine) is responsible for data production, from the raw solar images to the scientific data (i.e. data tables). This program layer is written entirely in the Python 3 programming language.

Figure 1. The flowchart of the SSC project. The main parts of the source code are distinguished by the coloured rectangles.

In the first step, the raw observations are downloaded from the JSOC server (http://jsoc.stanford.edu/), which provides the SDO observations. The data need to be prepared before any actual scrutiny: the images must be validated and, if necessary, de-rotated. In the case of continuum images, limb darkening is corrected; this is an optical effect in solar images whereby the centre of the image appears significantly brighter than the edges. The magnetogram is corrected similarly. After these corrections, the algorithm begins to identify the physical boundaries of the sunspots. Additional information is required to identify each sunspot within every active region (AR): the data matrices (i.e. sub-images of the sunspots cut from the full observation) are selected for each AR using the HARP data (again, see the JSOC server). The HARP data provide an approximate boundary for each AR, which is an appropriate initial condition for the further analysis. The actual physical contours of the umbra and penumbra (almost every sunspot can be decomposed into these two regions, which differ in photon intensity) are then generated by the active contour model algorithm (https://en.wikipedia.org/wiki/Active_contour_model). The output is written to individual PDF and PNG files, as Figure 2 demonstrates. Finally, the scientifically valuable data are written into an SQL table, at which point the engine terminates. The appended SQL table is available to the further services; every few minutes the engine loops back to the first step with a new observation, and so on.
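To give a flavour of the boundary-detection step, the following minimal Python sketch detects candidate umbral and penumbral boundaries by intensity thresholding. It is an illustration only, not the SSC engine itself: it assumes sunpy and scikit-image are installed, and the FITS file name, the correct_limb_darkening() helper and the 0.9/0.6 thresholds are hypothetical.

# Minimal illustration of the boundary-detection step (not the actual SSC engine).
# Assumes sunpy and scikit-image are installed; the FITS file name, the
# correct_limb_darkening() helper and the 0.9/0.6 thresholds are hypothetical.

import numpy as np
import sunpy.map
from skimage import measure


def correct_limb_darkening(smap):
    """Placeholder: remove the centre-to-limb brightness variation.

    The real engine corrects limb darkening before any sunspot
    identification is attempted.
    """
    return smap


# One SDO/HMI continuum observation downloaded from the JSOC server.
continuum = sunpy.map.Map("hmi_ic_45s_example.fits")
continuum = correct_limb_darkening(continuum)

# Umbra and penumbra differ in photon intensity, so two thresholds relative to
# the quiet-Sun level give two nested sets of candidate boundaries. The SSC
# engine refines such initial boundaries with an active contour model.
quiet_sun = np.nanmean(continuum.data)
penumbra_boundaries = measure.find_contours(continuum.data, 0.9 * quiet_sun)
umbra_boundaries = measure.find_contours(continuum.data, 0.6 * quiet_sun)

print(f"Found {len(penumbra_boundaries)} penumbral and "
      f"{len(umbra_boundaries)} umbral contour candidates.")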


Figure 2. An example demonstrating the result of the active contour model algorithm. The blue lines mark the boundaries of the sunspots. The top panel displays the continuum image; the lower panel shows the magnetogram observation with the projected sunspot boundaries.


The next layer is the data storage, which holds the output of the engine. Here, the raw scientific data are transformed and stored in the widely used SQL format. Table 1 shows a few lines of the database. Each line represents one sunspot within a sunspot group and contains the most important pieces of information about the spot, such as the date and time of the observation, the coordinates in the Carrington heliographic, polar and helioprojective reference systems [2], and the area of the sunspot. The columns on the right-hand side show basic statistics of the pixels composing the sunspot (maximum, minimum, standard deviation and average of the sample).

The server also stores images of the processed sunspot groups and contours in FITS and PNG format, as demonstrated by Figure 2. The output for one set of observations takes around 50 MB of disk space. This means that with a 5-minute cadence (the currently chosen default cadence of the project), the program generates about 15 GB of data per day, resulting in more than 5 TB per annum. Ultimately, a one-minute cadence (the desired temporal resolution) would write out roughly 75 GB daily and 25 TB annually; however, this cadence requires massive parallelisation of the source code.


2017-07-13 | 20:31:23 | c | u | 12666 | 1 | 292.34 | 127.06 | 318.75 | 23.49 | 11.7 | 103.06 | 18.35 | 0.18 | 8.63 | 26258672 | 7483371 | 42519 | 19808 | 54286 | 10787
2017-07-13 | 20:31:23 | c | u | 12666 | 2 | 267.61 | 123.16 | 294.59 | 24.71 | 11.5 | 101.45 | 16.73 | 0.01 | 0.53 | 1625085 | 572434 | 52039 | 48777 | 54496 | 1633
2017-07-13 | 20:31:23 | c | u | 12666 | 3 | 273.33 | 134.4 | 304.58 | 26.18 | 12.17 | 101.86 | 17.15 | 0.5 | 24.2 | 73580522 | 23445947 | 47365 | 19811 | 55608 | 6900.6

Table 1. The columns are, in order: date and time of the observation; type of the data (continuum or magnetogram); type of the sunspot feature (umbra or penumbra); NOAA number of the AR; serial number of the sunspot within the sunspot group; x and y coordinates in the helioprojective reference system; R and Theta coordinates in the polar system; latitude, longitude and LCM in the heliographic coordinate system. The next two columns show the area of the selected feature. The last 5 columns show statistics of the pixels within the defined contours (total, mean, minimum, maximum and standard deviation of the pixels).
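For readers who want to reproduce the storage layer, a minimal sketch of such a table in SQLite is shown below. The column names and types are illustrative assumptions based on the caption above, not the exact SSC schema.

# Minimal sketch of the storage layer using SQLite (column names and types are
# illustrative assumptions, not the exact SSC schema).
import sqlite3

conn = sqlite3.connect("ssc_example.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS sunspots (
        obs_date    TEXT,     -- date of the observation
        obs_time    TEXT,     -- time of the observation
        data_type   TEXT,     -- 'c' continuum or 'm' magnetogram
        spot_type   TEXT,     -- 'u' umbra or 'p' penumbra
        noaa_number INTEGER,  -- NOAA number of the active region
        spot_id     INTEGER,  -- serial number of the spot within the group
        x_hpc       REAL,     -- helioprojective x coordinate
        y_hpc       REAL,     -- helioprojective y coordinate
        r_polar     REAL,     -- polar R coordinate
        theta_polar REAL,     -- polar Theta coordinate
        latitude    REAL,     -- heliographic latitude
        longitude   REAL,     -- heliographic longitude
        lcm         REAL,     -- longitude from the central meridian
        area_1      REAL,     -- feature area (first unit)
        area_2      REAL,     -- feature area (second unit)
        px_total    REAL,     -- total of the pixel values within the contour
        px_mean     REAL,     -- mean pixel value
        px_min      REAL,     -- minimum pixel value
        px_max      REAL,     -- maximum pixel value
        px_std      REAL      -- standard deviation of the pixel values
    )
""")
conn.commit()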

Finally, the last layer is the web facility, which is the user-friendly online frontend of the project. The frontend is a hybrid software solution in which the HTTPS server is supported by the Python Flask framework (templating HTML pages with CSS) and JavaScript. The web service is able to display, visualise and analyse the data received from the engine backend. The user can select, filter and sort the data. The selected data can be downloaded (via the HTML page or the sFTP protocol) or analysed with the built-in plotting tool, powered by the Bokeh engine, which provides elegant and interactive plots (a screenshot is shown in Figure 3).
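A stripped-down sketch of how such a Flask/Bokeh page can be wired together is given below. The route name, the hard-coded sample values and the plotted quantity are hypothetical; this is not the actual SSC frontend code.

# Stripped-down sketch of a Flask route serving an interactive Bokeh plot
# (the route, sample data and plotted quantity are hypothetical).
from flask import Flask, render_template_string
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.resources import CDN

app = Flask(__name__)

TEMPLATE = """
<!doctype html>
<html>
  <head><title>Sunspot areas</title>{{ resources|safe }}</head>
  <body>
    <h1>Sunspot areas</h1>
    {{ div|safe }}
    {{ script|safe }}
  </body>
</html>
"""


@app.route("/areas")
def sunspot_areas():
    # In the real service these values would come from the SQL table filled by
    # the engine; here a tiny hard-coded sample stands in for that query.
    times = [0, 5, 10, 15, 20]          # minutes since the first observation
    areas = [8.6, 8.9, 9.1, 9.0, 9.4]   # illustrative sunspot areas

    plot = figure(title="Sunspot area vs. time",
                  x_axis_label="time [min]", y_axis_label="area")
    plot.line(times, areas, line_width=2)

    script, div = components(plot)
    return render_template_string(TEMPLATE, resources=CDN.render(),
                                  script=script, div=div)


if __name__ == "__main__":
    app.run(debug=True)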


Figure 3. The user-friendly web interface of the project with fully automatic software solutions based on Python Flask, Bokeh and JavaScript.

The project will be extended in the future with additional tools and types of observations. A jet recognition algorithm, based on SDO AIA images, is currently under development. Furthermore, parallelisation techniques will be implemented in the source code in the near future, possibly using GPU and/or MPI architectures.

The project is open source; the development team is therefore always looking for researchers who would like to be involved.

[1] Pesnell, W. D. (2015). Solar dynamics observatory (SDO) (pp. 179-196). Springer International Publishing.
[2] Thompson, W. T. (2006). Coordinate systems for solar image data. Astronomy & Astrophysics, 449(2), 791-803.

MHD Code Using Multi Graphical Processing Units: SMAUG+

posted 20 Sep 2018, 07:09 by Norbert G Gyenge   [ updated 20 Sep 2018, 07:36 ]

    Numerical simulations have been one of the most important tools for studying astrophysical magnetohydrodynamic (MHD) problems since the birth of computer science. MHD modelling (https://en.wikipedia.org/wiki/Magnetohydrodynamics) of the physical processes behind a complex astrophysical observation frequently requires enormous computational effort and high compute performance. SMAUG+ is a numerical finite element solver addressing the ideal, fully non-linear, three-dimensional MHD equations, which are described in detail in Griffiths et al. (2015) [1].
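    For reference, a standard conservative form of the ideal MHD equations (the exact formulation for gravitationally stratified media solved by SMAUG+ is given in [1]) is:

\[
\begin{aligned}
&\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0, \\
&\frac{\partial (\rho\mathbf{v})}{\partial t} + \nabla\cdot\!\left[\rho\mathbf{v}\mathbf{v} + \Big(p + \tfrac{B^{2}}{2\mu_{0}}\Big)\mathbf{I} - \tfrac{\mathbf{B}\mathbf{B}}{\mu_{0}}\right] = \rho\mathbf{g}, \\
&\frac{\partial e}{\partial t} + \nabla\cdot\!\left[\Big(e + p + \tfrac{B^{2}}{2\mu_{0}}\Big)\mathbf{v} - \tfrac{(\mathbf{v}\cdot\mathbf{B})\mathbf{B}}{\mu_{0}}\right] = \rho\mathbf{g}\cdot\mathbf{v}, \\
&\frac{\partial \mathbf{B}}{\partial t} = \nabla\times(\mathbf{v}\times\mathbf{B}), \qquad \nabla\cdot\mathbf{B} = 0,
\end{aligned}
\]

where \(\rho\) is the density, \(\mathbf{v}\) the velocity, \(\mathbf{B}\) the magnetic field, \(p\) the gas pressure, \(e\) the total energy density and \(\mathbf{g}\) the gravitational acceleration.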

    Advances in modern processing unit technology allow us to solve increasingly complex physical problems by using faster and more numerous central processing units (CPUs) or accelerators, such as graphical processing units (GPUs). Multi-GPU (mGPU) systems can provide further benefits, such as larger computational domains and substantial compute time savings, the latter reducing operational costs. Many studies demonstrate the effectiveness of mGPU architectures for solving various astrophysical problems [2]. mGPU systems allow us to achieve orders of magnitude performance speed-up compared to CPU cores; they also make it possible to considerably extend the investigated model size or to increase the resolution of the computational domain, thereby capturing more detail.

    Figure 1 shows the principles of an example system architecture. The red boxes represent the computational domain of the initial model configuration. The original grid is divided into four equal sub-regions, as indicated by the successive serial numbers. Each sub-region is assigned to a CPU. The CPUs communicate with each other using communication fabrics such as MPI messaging or Omni-Path technology. Exchanging information between the sub-domains through 'halo' layers is common practice in parallel computation on CPUs. The halo layers are shown as the white rectangles within the computational domains (red rectangles). The data are obtained from (or sent to) the buffer of the top (or bottom) neighbouring process (indicated by grey rectangles). By sending and receiving only the 'halo' cells, and not the full grid, we reduce the size of the communications. The CUDA platform then gives us access to GPU-accelerated solutions: the actual numerical computations are performed by the GPUs.
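    The halo exchange itself can be sketched in a few lines. The fragment below is a simplified, hypothetical illustration using mpi4py for a one-dimensional decomposition, not the actual SMAUG+ implementation (which performs the exchange between CPU buffers feeding CUDA kernels).

# Simplified, hypothetical halo exchange for a 1-D domain decomposition using
# mpi4py (not the actual SMAUG+ implementation). Each rank owns a sub-domain
# plus one 'halo' row above and below, refreshed from its neighbours.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

ny, nx = 250, 1000                      # local sub-domain size (illustrative)
grid = np.zeros((ny + 2, nx))           # +2 rows for the top/bottom halos
grid[1:-1, :] = rank                    # fill the interior with dummy data

up   = rank - 1 if rank > 0 else MPI.PROC_NULL
down = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Send our top interior row upwards and receive the lower neighbour's row into
# the bottom halo (and vice versa). Only halo rows travel, not the full grid.
comm.Sendrecv(sendbuf=grid[1, :].copy(),  dest=up,   sendtag=0,
              recvbuf=grid[-1, :],        source=down, recvtag=0)
comm.Sendrecv(sendbuf=grid[-2, :].copy(), dest=down, sendtag=1,
              recvbuf=grid[0, :],         source=up,   recvtag=1)

# In SMAUG+ the refreshed sub-domain (interior plus halos) would now be handed
# to the GPU, where the CUDA kernels perform the numerical update.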

    We ran the simulations using two different HPC facilities, namely, the University of Cambridge (Wilkes) and the University of Sheffield (ShARC) architectures. The Wilkes computer consists of 128 Dell nodes with 256 Tesla K20c GPUs. Each Tesla K20c contains 2496 GPU cores, hence the total number of GPU cores is 638976 and the total number of CPU cores is 1536. The ShARC cluster provides 8 NVIDIA Tesla K80 (24 GiB) graphical units. The total number of GPU cores is 39936. 


Figure 1. Flowchart outlining SMAUG+ with the implemented MPI parallelisation technique. The red boxes show the initial and distributed model configurations. The configuration is divided equally and spread across the different CPUs and GPUs using MPI and CUDA. The white rectangles show the 'halo' cells, and the grey rectangles show the buffers used to store the exchanged information. The red domains within each process show the actual mesh outline, while the blue boundaries mark data stored by different processes. The numerically intensive calculations are sent to and performed by the GPUs, which distribute the subdomains further across their thousands of GPU cores (green rectangles) to carry out the actual computations. The figure shows an example 2 x 2 configuration.

 
Figure 2. Orszag-Tang vortex results. The initial configuration contains 1000 x 1000 grid points and is distributed among 4 GPUs (2 x 2). The figure shows the temporal variation of the density for t1 = 0.04 s and t2 = 0.25 s. The simulations were performed on the ShARC cluster.

    The Orszag-Tang vortex is a common validation test employing two-dimensional non-linear MHD. Figure 2 is a snapshot of an example Orszag-Tang simulation. The panel shows the temporal variation of the density on a linear colourmap as various waves pass through each other; this motion creates turbulent flow on different spatial scales. Figure 2 demonstrates that there is convincing agreement between the results of SMAUG+ and other MHD solvers. We ran a series of simulations with different simulation box resolutions, as seen in Table 1.
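    For readers wishing to reproduce the test, one commonly used normalisation of the Orszag-Tang initial state is sketched below in Python; the exact values adopted for the SMAUG+ runs may differ.

# One commonly used normalisation of the Orszag-Tang initial conditions
# (the exact values adopted for the SMAUG+ runs may differ).
import numpy as np

nx = ny = 1000
gamma = 5.0 / 3.0

x = np.linspace(0.0, 1.0, nx, endpoint=False)
y = np.linspace(0.0, 1.0, ny, endpoint=False)
X, Y = np.meshgrid(x, y, indexing="ij")

rho = np.full_like(X, 25.0 / (36.0 * np.pi))    # uniform density
p   = np.full_like(X, 5.0 / (12.0 * np.pi))     # uniform gas pressure
vx, vy = -np.sin(2.0 * np.pi * Y), np.sin(2.0 * np.pi * X)

B0 = 1.0 / np.sqrt(4.0 * np.pi)
Bx, By = -B0 * np.sin(2.0 * np.pi * Y), B0 * np.sin(4.0 * np.pi * X)

# Total energy density for a code evolving conservative ideal-MHD variables.
e = p / (gamma - 1.0) + 0.5 * rho * (vx**2 + vy**2) + 0.5 * (Bx**2 + By**2)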

    Grid Size       Number of GPUs     Running Time [s]
    1000 x 1000     1 x 1              34
    1000 x 1000     2 x 2              11
    1000 x 1000     4 x 4              13
    2044 x 2044     2 x 2              41
    2044 x 2044     4 x 4              43
    4000 x 4000     4 x 4              77
    8000 x 8000     8 x 8              61
    8000 x 8000     10 x 10            41

Table 1. Timings for 100 iterations for the Orszag-Tang test. The results are based on the simulations performed at the Wilkes Cluster.

    The actual parallel performance of the applied models is determined by various factors: (i) the granularity of the parallelisable tasks, (ii) the communication overhead between the nodes, and (iii) the load balancing. Load balancing refers to the distribution of the data among the nodes; if the data distribution is not balanced, the GPUs with less load must wait until the heavily loaded GPUs finish their job. We always use equally divided configurations, hence all the GPUs are equally loaded. The granularity of the parallel task represents the amount of work carried out by a given node. The communication overhead is the cost of sending and receiving information between the different nodes. In our case the overhead has two components: the MPI node-to-node communication and the CPU-GPU information transfer. The data must be transferred from GPU memory to system memory; from there, the CPU sends the information to another CPU node, which finally transfers the data to its GPU, and so on. This continuous data transfer significantly degrades the parallel performance, and is the consequence of GPUs that are not kept arithmetically busy. Parallel slowdown can therefore result from a communication bottleneck: more GPUs must spend more time on communication. As shown above, choosing a non-optimal configuration can waste a large amount of computational power.
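    These effects can be quantified directly from Table 1. The short sketch below computes the speed-up and parallel efficiency for the 1000 x 1000 runs, taking the single-GPU time as the baseline.

# Speed-up and parallel efficiency for the 1000 x 1000 Orszag-Tang runs in
# Table 1, taking the single-GPU time as the baseline.
runs = {1: 34.0, 4: 11.0, 16: 13.0}     # number of GPUs -> running time [s]
t1 = runs[1]

for n_gpu, t in runs.items():
    speedup = t1 / t
    efficiency = speedup / n_gpu
    print(f"{n_gpu:3d} GPUs: speed-up {speedup:4.1f}, efficiency {efficiency:6.1%}")

# Output:
#   1 GPUs: speed-up  1.0, efficiency 100.0%
#   4 GPUs: speed-up  3.1, efficiency  77.3%
#  16 GPUs: speed-up  2.6, efficiency  16.3%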

    To avoid parallel slowdown, the following must be considered: (i) Increasing parallelism alone will not provide the best performance [3]; increased parallelism with non-optimal data granularity can easily cause parallel slowdown. (ii) The number of exchanged MPI messages must be reduced as much as possible for the best performance [4]; this also means that a single GPU can give better performance than multiple GPUs if the applied model size is the same. (iii) A task must be large enough to overlap the parallel communication overhead; the processes must have a higher task granularity as the number of applied GPUs increases, and to hide the communication overhead it is advisable to always keep the GPUs arithmetically busy. (iv) Communication performance can be improved by using higher-performance communication hardware.

    By applying the above principles, parallel performance speed-up is possible. For instance, the 1000 x 1000 Orszag-Tang test with 4 GPUs is around 3 times faster than the 1-GPU configuration, and the 8000 x 8000 test shows a 1.5 times speed-up between 64 and 100 GPUs (Table 1). However, the primary aim of our approach is to achieve an extended model size. A single GPU only supports a limited amount of memory, but with our method an mGPU system provides as much memory as the total across all its GPUs. The only disadvantage is the communication overhead; however, an mGPU system may still be faster and significantly cheaper than a multi-CPU approach. For the same price, a GPU contains orders of magnitude more processing cores than a CPU. Our approach provides affordable desktop high-performance computing.

    The developed software provides the opportunity to perform large-scale simulations of MHD wave propagation, mimicking the strongly magnetised solar atmosphere and, in particular, representing the lower solar atmosphere from the photosphere to the low corona. Such an approach is important, as there are a number of high-resolution ground-based (e.g. SST, the Swedish Solar Telescope, La Palma; DKIST, the Daniel K. Inouye Solar Telescope, USA, to be commissioned in 2019; or EST, the European Solar Telescope, to be realised in the second half of the next decade) and space-based (e.g. Hinode; SDO, the Solar Dynamics Observatory; IRIS, the Interface Region Imaging Spectrograph) facilities providing a wealth of previously unforeseen observational detail that now needs to be understood.

[1] Griffiths, M., Fedun, V., and Erdelyi, R. (2015). A fast MHD code for gravitationally stratified media using graphical processing units: SMAUG. Journal of Astrophysics and Astronomy, 36(1):197–223.

[2] Stone, J. M. and Norman, M. L. (1992a). ZEUS-2D: A radiation magnetohydrodynamics code for astrophysical flows in two space dimensions. I - The hydrodynamic algorithms and tests. The Astrophysical Journal Supplement Series, 80:753–790.

[3] Chen, D.-K., Su, H.-M., and Yew, P.-C. (1990). The impact of synchronization and granularity on parallel systems. SIGARCH Comput. Archit. News, 18(2SI):239–248.

[4] Thakur, R., Gropp, W. D., and Toonen, B. (2004). Minimizing synchronization overhead in the implementation of MPI one-sided communication. PVM/MPI, 3241:57–67.

Modelathon 2017

posted 27 Jun 2018, 01:58 by Ben Hughes   [ updated 27 Jun 2018, 02:05 ]

CiCS Research IT worked closely with the Insigneo Institute to provide compute resources and support for the Modelathon 2017 multi-scale modelling event.

The 2017 event was based around producing a murine in silico model to examine the effects of loading and angiogenesis on the bone healing process. The starting point for these challenges was CT and MRI images of a mouse tibia.

Competitors used CiCS Managed Desktop machines in the Diamond for pre-processing models and data before submitting them as batch jobs to ShARC for high-performance computation and remote visualisation of the post-processed data.



Polaris project - HPC computational modelling and clinical evaluation

posted 22 Jun 2018, 04:59 by Desmond M Ryan   [ updated 25 Jun 2018, 00:49 ]

POLARIS (Pulmonary, Lung and Respiratory Imaging Sheffield)
Pulmonary MR imaging at Sheffield focuses on the physics and engineering methods for lung imaging with proton and hyperpolarised gas MRI, and on their computational modelling and clinical application.
CiCS Research-IT provided advice on the purchase, installation and support of:
                    - worker, visualisation and storage nodes
                    - software installation and support

The Polaris project and CiCS jointly funded a Research Computing Support Officer post to provide both Polaris-specific and general Research-IT support.


Solar Physics Space Plasma Research Centre - working with Research-IT

posted 19 Jun 2018, 01:13 by Michael K Griffiths

The work of the Research-IT support team depends on establishing partnerships with a variety of groups, both external and internal to the University of Sheffield. Over a number of years Research-IT has worked with the Solar Physics and Space Plasma Research Centre (SP2RC). This has proven to be a successful partnership, with the following successes:
  • Graduate Research Computing Support Officer - a research student working with Research-IT support
  • Development of computational magnetohydrodynamics codes for modelling the solar atmosphere
  • Provision of software and HPC resources
  • Establishment of the NVIDIA GPU Computing Research Centre
SP2RC is at the forefront of addressing theoretical and observational issues in solar and solar system physics, including solar magneto-seismology, the dynamics of the solar atmosphere, the solar wind, the magnetosphere and space weather. The Centre is one of the largest and most dynamic solar and solar system physics research groups in the country and is well renowned internationally.

http://sp2rc.group.shef.ac.uk/
Amongst its many successes, researchers from SP2RC, as part of an international team, have discovered tornadoes in space which could hold the key to powering the atmosphere of the Sun, heating it to millions of kelvin.

Space Tornadoes Power the Atmosphere of the Sun
The partnership has been particularly successful in the development of Python programming training by our Graduate Research Computing Support Officer from SP2RC. The new course was a well-received addition to the portfolio of Doctoral Development modules and the Research-IT training courses.


Working with SP2RC, we adapted the Sheffield Advanced Code for Magnetohydrodynamics so that it could utilise accelerator hardware such as NVIDIA graphical processing units. This work has enabled Research-IT not just to engineer software for researchers, but also to gain an appreciation of the support that researchers require throughout the full lifecycle of a project. For further details see the blogpost:


The partnership with Research-IT provided the first GPU computing nodes for general usage and established the NVIDIA GPU Research Centre at Sheffield.



