AI in language processing and medical-imaging analytics

Location: Room 9.57, Worsley Building, University of Leeds

14:30-14:45: tea & coffee

14:45-15:00: Ciarán McInerney, Secretary of the Leeds/Bradford Group

Introductions

RSS Diversity and Inclusion policy - Call for new members

15:00-15:40: Prof. Eric Atwell, School of Computing, University of Leeds

How to represent the meaning of an English word as a vector of numbers: distributional lexical semantics and word embeddings.

(Download presentation)

(View recording of presentation)

In this talk, I hope to give a flavour of some of the statistical theory behind modern text analytics algorithms. If you think text analytics research could help your business, talk to me afterwards!

In text analytics applications, we often want to measure how “similar” two English words or texts are. One way is to count how similar the letter-sequences are; by that measure, “computing” and “computer” are close, but “marriage” and “wedding” are distant. But we really want to measure similarity in MEANING or semantics, not similarity in letter-sequences. To do this, we need to represent the MEANING of a word or a text as a vector of numbers. Then we can use known metrics for the distance between vectors, such as Euclidean distance: the length of the straight line between the two points in space that the vectors represent. We can use this for practical tasks such as measuring text semantic similarity, or for Word Sense Disambiguation.
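To make the vector view concrete, here is a minimal Python sketch of Euclidean distance between word vectors. The three-dimensional “meaning” vectors below are toy values invented purely for illustration; real representations have hundreds or thousands of dimensions.

```python
import math

def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy 3-dimensional "meaning" vectors, invented for illustration only.
marriage = [0.9, 0.1, 0.8]
wedding  = [0.8, 0.2, 0.7]
computer = [0.1, 0.9, 0.1]

print(euclidean(marriage, wedding))   # small distance: similar meanings
print(euclidean(marriage, computer))  # large distance: dissimilar meanings
```

A small distance between the “marriage” and “wedding” vectors, and a large one between “marriage” and “computer”, is exactly the behaviour a good semantic representation should give.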

I will introduce some ways to convert the meaning of an English word or text into a vector of numbers. A simple way is one-hot encoding: a vector with a feature for every English word, set to 1 for every word-type in the text and 0 for all other English words. Another way is to link the text to an ontology or set of concepts in a domain. Another way to capture the meaning of a word is to look at the contexts it is used in: a vector with a feature for each significant context word, known as distributional lexical semantics. But English has 100,000+ words, and this leads to very large vectors. Deep Learning methods can compress these large vectors into vectors of 1000 or even 400 numbers, known as word embeddings. The word embedding of “marriage” should be a vector close to the embedding of “wedding”.
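As a rough illustration of the first and third representations above, here is a Python sketch of a bag-of-words (one-hot-style) text vector and a simple distributional context-count vector. The seven-word vocabulary and three-sentence corpus are toy assumptions standing in for the full 100,000+ word English vocabulary; they are not from the talk.

```python
from collections import Counter

# Toy vocabulary standing in for the full English word list.
vocab = ["the", "wedding", "marriage", "ring", "computer", "happy", "fast"]

def one_hot_text(text):
    """Bag-of-words vector: 1 for each vocabulary word-type in the text."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in vocab]

def context_vector(target, corpus, window=2):
    """Distributional vector: counts of words seen near the target word."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[tokens[j]] += 1
    return [counts[w] for w in vocab]

corpus = [
    "the happy wedding",
    "the happy marriage",
    "the fast computer",
]

print(one_hot_text("the wedding ring"))
print(context_vector("wedding", corpus))   # shares contexts with "marriage"
print(context_vector("computer", corpus))  # different contexts
```

In this tiny corpus, “wedding” and “marriage” appear in the same contexts (“the”, “happy”), so their distributional vectors coincide, while “computer” gets a different vector; word embeddings compress such context vectors into much shorter dense ones.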

15:40-16:20: Dr Nishant Ravikumar, School of Computing, University of Leeds

Quantitative cardiac magnetic resonance imaging using machine learning

(Download presentation)

(View recording of presentation)

Cardiac magnetic resonance imaging forms a central part of the cardiac diagnostic workup, and is used to extract various quantitative measures that characterise cardiac structure and function. These in turn aid in detecting abnormalities and diagnosing various cardiac pathologies. Designing automated systems for object localisation, segmentation, and registration of cardiac MR images would consequently help enhance the overall diagnostic workflow, and enable the extraction of novel biomarkers with the potential to improve disease stratification. In this talk I will give an overview of past and ongoing work at our lab on designing automated systems for extracting cardiac functional indices from magnetic resonance images.

16:20 - ...: Wrap-up and evening meal

Feel free to join the speakers and members of the local-group committee for an informal meal.

Venue TBA on the day.