A seminar series by and for HDR students at the Clayton School of Information Technology, Faculty of Information Technology, Monash University, Australia.


Upcoming


Upcoming June 2012

posted May 28, 2012 1:42 AM by James Collier   [ updated May 28, 2012 5:33 PM ]

Tuesday 1 June 2012 1:00 - 2:00 pm (Room 115/63)
Speaker: Hiran Ganegedara
Title: Exploratory data analysis using scalable self-organising maps

Abstract:
There is significant growth in the amount of data available for analysis and decision-making purposes. The Self-Organising Map (SOM) and the Growing Self-Organising Map (GSOM) are widely used unsupervised techniques for visualising a data set and are useful in identifying patterns in data. Finding interesting patterns in massive volumes of data can be highly time consuming, and the time requirement grows with the quantity of data when SOM/GSOMs are used. Processing high volumes of data is a challenging task, given the limited computing power available in most computers. Recent developments in parallel and distributed computing techniques, as well as multi-core CPU architectures, have opened up a new avenue for large-scale data processing by providing high volumes of computing power. This presentation introduces a new technique which enables the SOM algorithm to scale with the number of computing resources. The presented technique will improve SOM/GSOM performance by several orders of magnitude while maintaining the same level of accuracy.
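
For readers unfamiliar with the SOM, the following is a minimal sketch of the standard sequential SOM update on a small rectangular grid. It is not the scalable technique presented in the talk; the function names, the linear decay schedules and the default parameters are all assumptions made for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for pos, w in weights.items():
        dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM; the decay schedules here are illustrative."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised uniformly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01     # neighbourhood radius shrinks
        for x in data:
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood measured on the grid, not in data space.
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

Every input visits every node in each epoch, which is why the running time of this sequential form grows with both the data volume and the map size.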

Upcoming May 2012

posted Apr 30, 2012 11:04 PM by James Collier   [ updated May 15, 2012 6:47 PM ]

Friday 18 May 2012 1:00 - 2:00 pm (Room 115/63)
Part I
Speaker: Upuli Gunasinghe
Title: Sequence Learning using the Adaptive Suffix Trie Algorithm
Authors: Upuli Gunasinghe and Damminda Alahakoon

Abstract:
Sequences occur naturally in many domains such as biology, engineering, finance and scientific research. Since humans have the inherent ability to comprehend and utilize sequences in day-to-day cognitive tasks such as speech, vision and motor control, biologically inspired sequence learning techniques are used for explanatory data analysis in these domains.

Identifying the common substrings which exist in sequences helps in determining the underlying structure and calculating the similarity between sequences. The suffix trie, suffix tree and suffix array are data structures which are used in many solutions to sequence-based problems. However, these are static data structures and not flexible tools which can be used for sequence learning. In this work we present the Adaptive Suffix Trie algorithm, a sequence learning algorithm which can be used for identifying substrings of different lengths and frequencies from a given set of sequences. In contrast to suffix data structures which store all suffixes, the adaptive suffix trie only captures the frequent substrings that occur in the given dataset, resulting in a less complex structure with only the relevant or useful information. We show how the algorithm's learning parameters can be adapted for extracting substrings with the required characteristics and then demonstrate its application in the classification of biological sequences.
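
To make the idea of keeping only frequent substrings concrete, here is a simplified sketch. It is not the Adaptive Suffix Trie algorithm itself: it builds a plain counted trie of all substrings up to a fixed length and then prunes rare branches in a separate pass. The names `max_len` and `min_count` are hypothetical parameters chosen for this illustration.

```python
class TrieNode:
    def __init__(self):
        self.count = 0       # number of times this substring was seen
        self.children = {}   # next symbol -> TrieNode

def build_counted_trie(sequences, max_len):
    """Insert every substring of length <= max_len, counting occurrences."""
    root = TrieNode()
    for seq in sequences:
        for start in range(len(seq)):
            node = root
            for symbol in seq[start:start + max_len]:
                node = node.children.setdefault(symbol, TrieNode())
                node.count += 1
    return root

def prune(node, min_count):
    """Drop every branch whose substring occurs fewer than min_count times."""
    node.children = {s: child for s, child in node.children.items()
                     if child.count >= min_count}
    for child in node.children.values():
        prune(child, min_count)

def frequent_substrings(node, prefix=""):
    """Collect the substrings remaining in the trie with their counts."""
    result = {}
    for symbol, child in node.children.items():
        substring = prefix + symbol
        result[substring] = child.count
        result.update(frequent_substrings(child, substring))
    return result
```

Because a substring can never occur more often than its own prefix, removing a rare node safely removes everything below it.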

Part II
Speaker: Upuli Gunasinghe
Title: A Sequence Based Dynamic SOM Model for Text Clustering
Authors: Upuli Gunasinghe, Sumith Matharage and Damminda Alahakoon

Abstract:
Text clustering can be considered as a four-step process consisting of feature extraction, text representation, document clustering and cluster interpretation. Most text clustering models consider text as an unordered collection of words. However, the semantics of text would be better captured if word sequences are taken into account. In this work we propose a sequence-based text clustering model where four novel sequence-based components are introduced, one in each of the four steps of the text clustering process. Experiments conducted on the Reuters dataset and Sydney Morning Herald (SMH) news archives demonstrate the advantage of the proposed sequence-based model, in terms of capturing context with semantics, accuracy and speed, compared to clustering of documents based on single words and n-gram based models.


Tuesday 8 May 2012 1:00 - 2:00 pm (Room 207/63)

Speaker: Sunil Aryal
Title: Generative classifiers based on mass

Abstract:
Generative classifiers estimate the class conditional likelihood p(x|y) and the class prior p(y) and use Bayes' rule to predict the most probable class, that is, the class that maximises the posterior p(y|x). Density estimation is required to estimate the class conditional likelihood p(x|y). Current density estimators such as the kernel density estimator and the k-nearest neighbour density estimator are impractical in large, multi-dimensional databases.

To mitigate this difficulty, some classifiers assume that attributes are conditionally independent given the class label y, and estimate the density distribution on one dimension, p(x_i|y) at a time. This assumption is too rigid and often violated in real world problems where attributes are related in some way. Some flexible Bayesian classifiers are proposed with less restrictive assumptions. 
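
The conditional independence assumption described above can be sketched as a small naive Bayes classifier, which scores each class by log p(y) plus the sum of log p(x_i|y). This is not MassBayes; the use of a Gaussian density for each attribute, the small variance floor and all function names are assumptions made for illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate p(y) and a per-attribute Gaussian p(x_i|y) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small floor keeps the variance strictly positive (an assumption).
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model

def log_joint(model, x):
    """Return log p(y) + sum_i log p(x_i|y) for every class y."""
    scores = {}
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for xi, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (xi - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores

def predict(model, x):
    """Pick the class with the largest joint score, as Bayes' rule prescribes."""
    scores = log_joint(model, x)
    return max(scores, key=scores.get)
```

Since p(x) is the same for every class, comparing the joint scores is equivalent to comparing the posteriors p(y|x).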

In this research, we propose a new type of generative classifier called "MassBayes", which estimates the likelihood by mass estimation. Mass estimation does not make any explicit assumption about the distribution. Empirical evaluations show that MassBayes yields better results than existing generative classifiers on benchmark data sets, especially on large data sets. MassBayes has sub-linear time complexity and constant space complexity; hence, it scales better with big databases.

About Sunil Aryal:
Sunil Aryal completed his bachelor's degree in Information Technology at Purbanchal University, Nepal. He is currently undertaking a Master of Information Technology (Research) at Monash University with A/Prof. Kai Ming Ting. Before joining Monash, he worked as a research assistant at Katholieke University, Leuven, Belgium. He also worked as a software developer in Sydney for two years. His research interests include data mining, machine learning and mass-based learning.

Upcoming April 2012

posted Apr 15, 2012 7:17 PM by James Collier

Thursday 19 April 2012 1:00 - 2:00 pm (Room 115/63)

Speaker: Cora Beatriz Perez-Ariza
Title: Recursive Probability Trees for Probabilistic Graphical Models

Abstract:
Recursive Probability Trees aim to represent the probabilistic information encoded in a probabilistic graphical model in a more compact and efficient way than traditional structures do. They capture context-specific independencies within the distribution, and they can also represent other peculiarities such as certain factorizations. By working with this factorized and compact structure, the inference process can be sped up. In this talk I would like to introduce the structure and the approach we have followed so far to build it, which involves looking for good approximations of the distributions when an exact representation is not possible.

Upcoming March 2012

posted Mar 15, 2012 7:36 PM by Hiran Ganegedara

Monday 19 March 2012 1:00 - 2:00 pm (Room 135/26)

Host: Nitin Mahadeo
Title: Towards mainstream use of iris biometrics 

Abstract:
The iris is the most accurate biometric to date. However, iris recognition is still in its infancy compared to other biometrics such as fingerprints or face recognition. In order for the iris to be widely accepted, it needs to be able to perform in a robust and reliable manner in a variety of imaging conditions. In this talk, we examine the strengths and weaknesses of current implementations and how they can be improved. This first part of our work focusses mainly on the segmentation stage in an iris recognition system. Different segmentation techniques are explored and a novel model-based technique for accurate iris localization in noisy images is proposed. Our results show both improved accuracy and speed. We also present a new approach for eyelid, eyelash and shadow detection in eye images. Our aim is to take the iris biometric a step forward towards mainstream use for recognition and identification purposes. New approaches for achieving better performance and reliability are also discussed.

2012

posted Mar 11, 2012 8:11 PM by Hiran Ganegedara

We will be starting the seminars for 2012 soon.
Stay tuned.

Upcoming December 2011!

posted Nov 28, 2011 8:08 PM by Hiran Ganegedara

Friday, 2 December 2011, 1:00-2:00pm (Room 135/26)

Host: Amir Basirat
Title: Distributed Associative Memory Approaches for large-scale Data Processing in Cloud Computing Environments and Wireless Sensor Networks

Abstract:
Unlike early computations that used only a few bytes of data, existing computing infrastructure is able to generate and store petabytes of data or more for day-to-day operations. This raises the question of whether our capability to recognise and process these data matches our ability to generate them. In this short talk, this question will be addressed by looking at the capability of existing recognition schemes to scale up with this growth of data. Applications such as pattern recognition are essential in providing a front-end mechanism for data processing. However, a different perspective on pattern recognition will be considered. Rather than looking at conventional approaches, such as statistical computations and deterministic learning schemes, this research focuses on a distributed processing approach for scalable pattern recognition. My research work aims to explore new methods of partitioning and distributing data, that is, resource virtualisation in the cloud and WSNs, by fundamentally re-thinking the way in which future data management models will need to be developed on the Internet. Loosely-coupled associative computing techniques, which have so far not been considered, can provide the breakthrough needed for a distributed data management scheme.

Upcoming November 2011!

posted Nov 23, 2011 9:14 PM by Hiran Ganegedara

Tuesday, 29 November 2011, 1:00-2:00pm (Room 135/26)

Host: A/Prof Graham Farr
Title: Co-authorship and other publication issues

Abstract:
This will cover matters such as:
- what kind of contribution merits a co-authorship of a paper?
- how is order of authors determined?
- how do these things vary from one discipline to another?
- how can disagreements about co-authorships be resolved?


We'll have a panel consisting of:
- A/Prof Graham Farr (HDR Co-ordinator, Clayton School of IT)
- A/Prof Maria Garcia de la Banda (Head, Caulfield School of IT)
- Dr Arun Konagurthu (Larkins Fellow)
The discussion will be chaired by Prof Kim Marriott (Head, Clayton School of IT). Panelists will talk for about 5 mins each and then there will be plenty of time for discussion and questions.
The discussion will focus on computer science but will touch on related issues in other disciplines.

Upcoming October 2011

posted Oct 26, 2011 10:41 PM by Hiran Ganegedara   [ updated Oct 26, 2011 10:41 PM ]

Thursday, 27 October 2011, 1:00-2:00pm (Room 135/26)

Host: Hiran Ganegedara
Title: Scalable Data Mining: A Sammon's Projection Based Technique for Merging Self Organising Maps

Abstract:
The Self-Organizing Map (SOM) and the Growing Self-Organizing Map (GSOM) are widely used techniques for exploratory data analysis. The key desirable features of these techniques are their applicability to real world data sets and their ability to visualize high dimensional data in a low dimensional output space. One of the core problems of using SOM/GSOM based techniques on large datasets is the high processing time requirement. One possible solution is the generation of multiple maps for subsets of data, where the subsets together make up the entire dataset. However, the advantage of topographic organization of a single map is lost in this process. I will be presenting a new technique where Sammon's projection is used to merge an array of GSOMs generated on subsets of a large dataset. I will be discussing cluster accuracy and performance analysis for several datasets. This technique is ideally suited to harness the processing power of parallel computing resources.
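
As background, Sammon's projection seeks a low dimensional layout that minimises Sammon's stress, a measure of how well pairwise distances are preserved. The sketch below only computes that stress for a given layout; it does not perform the projection or the map merging described in the talk, and the function names are assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def sammon_stress(high, low):
    """Sammon's stress between original points and their projected positions.

    high[i] and low[i] are the same item in the input and output spaces.
    """
    n = len(high)
    total_high = 0.0
    weighted_error = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_high = euclidean(high[i], high[j])
            d_low = euclidean(low[i], low[j])
            if d_high == 0.0:
                continue  # coincident inputs are skipped to avoid dividing by zero
            total_high += d_high
            weighted_error += (d_high - d_low) ** 2 / d_high
    if total_high == 0.0:
        return 0.0  # all inputs coincide, so there is nothing to preserve
    return weighted_error / total_high
```

Dividing each squared error by the original distance gives small distances more weight, so local neighbourhoods are favoured over global layout.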

Upcoming October 2011!

posted Oct 9, 2011 11:01 PM by Hiran Ganegedara

Tuesday, 11 October 2011, 1:00-2:00pm (Room 115/63)

Host: Dror Cohen
Title: "Computational Neuroscience, Physics envy and the Free-Energy Principle"

Abstract:
The biological sciences are increasingly utilising computational approaches for data analysis, as well as to better understand the governing mechanisms. Computational insights are particularly valuable in the neural sciences where the relationship between function and physiology is intricately coupled and difficult to discern. A recently proposed Free-Energy principle attempts to provide a unifying framework for the understanding of computational mechanisms throughout the cortex. We demonstrate how this principle can produce topography preservation, a feature that has been well observed in the cortex. 

Upcoming September 2011!

posted Aug 28, 2011 5:46 PM by Hiran Ganegedara   [ updated Aug 31, 2011 4:31 PM ]

Tuesday, 6 September 2011, 1:00-2:00pm (Room 115/63)


Host: Sara Miranda
Title: "The Library – your new best friend and partner in your postgraduate degree"

Abstract: 
The Hargrave-Andrew Library has staff dedicated to helping academics and students in the Clayton School of Information Technology.

Sara Miranda, the information research skills librarian, can assist in effective use of library services and resources, including databases, finding information, citing and referencing.

Noriaki Sato, the learning skills adviser, can assist with thesis writing, oral communication and presentation, and writing for research projects. Postgraduate students can arrange individual sessions with Nori, or participate in group sessions tailored to their needs.
 
In this session we will present an overview of what we do, and go into some detail on how you can use our services and facilities. If there is anything you wanted to know about the library but were reluctant or didn’t have time to ask, this is your opportunity.
