Current Projects

Project 1: Increasing Agricultural Yield in Pakistan by Statistically Analyzing and Mining Soil Tests
  • Collaborators: Dr. Tariq Mahmood, Madam Asma Hayat
  • Time Period: August 2012 – Present
  • Summary: In this project, we will apply data mining techniques to soil test data (related to Pakistani agriculture) in order to provide useful recommendations for increasing agricultural yield.

Project 2: Analyzing Terrorist Events in Pakistan to Support Counter-terrorism
  • Collaborators: Dr. Tariq Mahmood, Miss Khadija Rohail, Mr. Ghulam Mujtaba
  • Time Period: August 2011 – Present
  • Summary: Over the past couple of decades, Pakistan has witnessed a remarkable increase in the number of terrorist events across its major provinces. These events are not random, yet to date there is no concrete analysis that identifies important patterns in their occurrence, patterns that could provide valuable assistance to counter-terrorism authorities. This project presents the first effort in this direction. We obtain a reliable database of terrorist events and apply data pre-processing techniques, followed by cluster analysis with the CLOPE algorithm. We obtain clusters for two attribute combinations: “Terrorism Event – Terrorism Target” and “Terrorism Event – Terrorism Method”. We also propose a “Terrorism Intensity Statistic” (TIS), which is based on the number of casualties and injured people. We analyze the TIS values and the number and content of the clusters year by year from 2001 to 2012, separately for each of the four major Pakistani provinces.
  • Publications: This work has been published in two papers at the IEEE-based ICRAI conference. It was also short-listed for the best paper award.
  • Current Status: Applying time-series prediction techniques to the terrorist events database.
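
As an illustration of the clustering step, the sketch below implements only the first allocation pass of CLOPE, in which each record is placed in the cluster that most increases the profit criterion (cluster size times total item occurrences, divided by the number of distinct items raised to a repulsion power r). The attribute tokens (method=M1, target=T1, and so on), the function name, and the value r = 2.0 are placeholders chosen for this example and are not taken from the project's database or configuration; the refinement passes of the full algorithm are omitted.

```python
def clope_first_pass(transactions, r=2.0):
    """Place each transaction in the cluster that most increases CLOPE profit."""
    # Per cluster: n = number of transactions, size = total item occurrences,
    # items = set of distinct items seen in the cluster.
    clusters = []
    labels = []
    for raw in transactions:
        t = set(raw)
        if not t:
            raise ValueError("empty transaction")
        # Gain from opening a new cluster holding only t (S = W = |t|, N = 1).
        best_gain, best_idx = len(t) / len(t) ** r, None
        for idx, c in enumerate(clusters):
            old_width = len(c["items"])
            new_width = len(c["items"] | t)
            old = c["size"] * c["n"] / old_width ** r
            new = (c["size"] + len(t)) * (c["n"] + 1) / new_width ** r
            if new - old > best_gain:
                best_gain, best_idx = new - old, idx
        if best_idx is None:
            clusters.append({"n": 0, "size": 0, "items": set()})
            best_idx = len(clusters) - 1
        c = clusters[best_idx]
        c["n"] += 1
        c["size"] += len(t)
        c["items"] |= t
        labels.append(best_idx)
    return labels


# Placeholder records: each is a set of categorical attribute tokens.
events = [
    {"method=M1", "target=T1"},
    {"method=M1", "target=T1"},
    {"method=M2", "target=T2"},
    {"method=M2", "target=T2"},
]
labels = clope_first_pass(events)
print(labels)
```

A larger r penalizes clusters that mix many distinct attribute values, so it tends to produce more, tighter clusters; the value used in the project is not stated here.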

Project Title: Using Reinforcement Learning to Tune Document Clustering

  • Collaborators: Dr. Tariq Mahmood, Mr. Muhammad Rafi
  • Time Period: Fall 2012 – Fall 2012
  • Summary: In this project, we use Reinforcement Learning (RL) to learn the cluster in which a given document should be grouped. Our document corpus comprises three clusters: documents on automobiles, computer graphics, and software engineering. Using the Q-Learning RL algorithm, we are currently “training” the RL agent to learn which type of document belongs in which cluster. Our preliminary results reveal that different configurations need to be tried in order to fine-tune Q-Learning, which we are currently doing.
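
The idea can be sketched with a minimal tabular Q-Learning loop. This is a toy under stated assumptions, not the project's implementation: each document is reduced to a single hypothetical keyword that serves as the state, the three clusters are the actions, and the reward (+1 for the correct cluster, -1 otherwise) along with the values of alpha and epsilon are illustrative choices of the kind of configuration that needs tuning.

```python
import random

CLUSTERS = ["automobiles", "computer graphics", "software engineering"]

# Hypothetical keyword -> correct cluster index (illustrative, not the corpus).
TRUE_CLUSTER = {
    "engine": 0, "gearbox": 0,
    "shader": 1, "rendering": 1,
    "refactoring": 2, "testing": 2,
}


def train(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn Q(state, action) with epsilon-greedy exploration."""
    rng = random.Random(seed)
    states = list(TRUE_CLUSTER)
    q = {s: [0.0] * len(CLUSTERS) for s in states}
    for _ in range(episodes):
        s = rng.choice(states)
        if rng.random() < epsilon:
            a = rng.randrange(len(CLUSTERS))  # explore
        else:
            a = max(range(len(CLUSTERS)), key=lambda i: q[s][i])  # exploit
        reward = 1.0 if a == TRUE_CLUSTER[s] else -1.0
        # Each episode ends after one assignment, so the Q-Learning target
        # reward + gamma * max_a' Q(s', a') reduces to the reward alone.
        q[s][a] += alpha * (reward - q[s][a])
    return q


def assign(q, keyword):
    """Return the cluster with the highest learned value for this keyword."""
    values = q[keyword]
    return CLUSTERS[values.index(max(values))]


q_table = train()
print(assign(q_table, "shader"))
```

In a realistic setting the state would be a richer document representation and the reward would come from a clustering quality measure, which is where the choice of learning rate, exploration rate, and reward design starts to matter.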

Project Title: Are all Social Networks Structurally Similar? A Comparative Study using Network Statistics and Metrics

  • Collaborators: Dr. Tariq Mahmood, Dr. Faraz Zaidi (PAF-KIET)
  • Time Period: Fall 2011 – Spring 2012
  • Summary: The modern age has seen exponential growth in the social network data available on the web. Analysis of these networks reveals important structural information about the networks in particular and about our societies in general. More often than not, such analysis is concerned with identifying similarities among social networks and how they differ from other networks such as protein interaction networks, computer networks, and food webs. In this project, our objective is to perform a critical analysis of different social networks using structural metrics in an effort to highlight their similarities and differences. We use five different social network data sets which are contextually and semantically different from each other. We then analyze these networks using a number of different network statistics and metrics. Our results show that although these social networks have been constructed from different contexts, they are structurally similar. We also review the snowball sampling method and show its vulnerability with respect to different network metrics.
  • Current Status: Inclusion of more comparison parameters and usage of larger data sets.
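
To make the notion of a structural metric concrete, the following sketch computes two common ones, density and the average local clustering coefficient, for an undirected network given as an edge list. The choice of these two metrics and the four-node toy network are assumptions made for illustration; the summary above does not list which statistics the study uses.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Fraction of possible node pairs that are actually connected."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # coefficient taken as 0 for nodes with fewer than 2 neighbours
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Toy network: a triangle (a, b, c) with one pendant node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
toy = build_adjacency(toy_edges)
print(density(toy), average_clustering(toy))
```

Computing the same set of metrics on each data set yields one feature vector per network, which is one simple way to compare networks built from different contexts.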

Project Title: Developing a Cooking Ontology for Pakistani Recipes

  • Collaborators: Dr. Tariq Mahmood, Dr. Robert Trypuz (Assistant Professor at JPC University, Poland)
  • Time Period: Fall 2012 – Spring 2013
  • Summary: This project concerns the design and implementation of an ontology for Pakistani recipes. We are also applying data mining techniques to the ingredients of recipes in order to discover useful associations between the recipes. The ontology and the mining results will be used to supplement users' queries in order to recommend personalized recipes.
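
One way to picture the ingredient-mining step is with pairwise association rules scored by support and confidence. The sketch below is a minimal illustration only: the recipes, their ingredient sets, and the thresholds are hypothetical, and the mining technique actually used in the project may differ.

```python
from itertools import combinations

# Hypothetical recipes and ingredients, for illustration only.
recipes = {
    "biryani": {"rice", "chicken", "onion", "yogurt"},
    "pulao": {"rice", "chicken", "onion"},
    "korma": {"chicken", "onion", "yogurt"},
    "daal": {"lentils", "onion"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules (lhs, rhs, support, confidence) over single-item pairs."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = s / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, s, confidence))
    return rules


rules = pair_rules(list(recipes.values()))
for lhs, rhs, s, confidence in rules:
    print(f"{lhs} -> {rhs} (support={s:.2f}, confidence={confidence:.2f})")
```

A rule such as "rice -> chicken" could then be used to expand a user's query, for example by suggesting recipes that contain the implied ingredient.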

Project Title: Applying Association Rules (Data Mining) on WordNet Entries

  • Collaborators: Dr. Tariq Mahmood, Dr. Ahsan ul Morshed (Post Doctorate in Progress from Victoria University, Australia)
  • Time Period: Fall 2012 – Fall 2013
  • Summary: The project involves finding useful association rules (data mining) between WordNet entries in order to assist in building synonyms. The exact outputs of the project are still being decided.

Project Title: Mining Players’ and Teams’ Profiles to Support Anti-Match Fixing in Cricket

  • Participants: Dr. Tariq Mahmood + FYP Group
  • Summary: This is an ongoing project in which we are applying data mining to the profiles of cricket teams, batsmen, bowlers, and wicket-keepers in order to identify suspicious activities by these entities or persons. The aim is also to help team managers design playing strategies, e.g., given a set of ground and weather conditions, which team is more likely to win, which player will score more than 50, which bowler will take at least one wicket, how many catches will be taken, etc. The suspicious activities relate mainly to bowling, batting, and performance patterns, e.g., the number of runs a team should statistically concede in the first ten overs, the number of runs that should statistically be scored in the last ten overs, the expected number of no-balls or wides, etc.
  • Expected Outputs: We definitely plan to submit our results to the Pakistan Cricket Board (PCB) as well as to any anti-match-fixing authorities. We expect at least three conference papers and one journal publication from this work.