Tarique Anwar

My research so far has centered on Data Engineering and Management, with applications in the domains of Urban Transportation Systems, Social Networks, and Cyber Security. It is mostly based on the core concepts of Data Structures, Algorithm Design, Graph Theory, and Data Mining. A summary of some of the major projects that I have worked on (or am working on) is given below.

Early Detection of Emerging Spatial Events from Spatio-Temporal Data Streams

We often become part of various kinds of gathering activities (aka events) in our daily lives, including big fat weddings, sports matches, concerts, political or social protests, and natural disasters. Such events are associated with a particular location, and are thus called spatial events. In other words, spatial events are abstract entities that exist for a limited time at particular locations in space. They are characterized by their start and end times, the geographic coordinates of their occurrence, and some thematic attributes. The normal functioning of day-to-day activities is often severely affected by spatial events at the location where they occur. For example, a political protest or a religious procession in a big city halts nearby traffic and disrupts normal business activities, which sometimes causes huge monetary losses. While some events follow recurring patterns, others start with little advance notice. Detecting such events early, even before they take place, is very important for avoiding or managing unexpected disruptions to our routine work and for planning alternatives. These days, huge amounts of spatio-temporal mobility data about people are recorded in real time from sources that include installed sensors, crowd-sourcing platforms, smartphones, and location-based social networks. The data from these sources form data streams that are rich in information. The aim of this project is to mine these spatio-temporal mobility data streams to detect emerging spatial events at an early stage, so that unforeseen problems can be alleviated.

Members involved: Tarique Anwar, Timos Sellis, Hai L. Vu, Rahul Pal Singh

Funding source(s): ISIRD, IIT Ropar

Outcomes*:

Identifying and Tracking Influential Events in Large Social Networks

The rapid growth of social networks provides a significant opportunity for people to learn about the world. Indeed, social content is the fastest-growing category of new content in the enterprise and will eventually attain 20% market penetration, according to a study. Making sense of big social network data has many applications and could bring huge benefits to society, enterprises, and individuals. Social network data collected or aggregated from Twitter, Facebook, YouTube, LinkedIn, etc., can be used directly to identify and track, say, breaking news and important events, and to gain insights into the social activities that take place in our world. Social network data can also be enriched with domain and company data, which enables domain experts to spot underlying trends and patterns and to extract actionable knowledge that can be beneficial for product manufacturing, guiding and marketing. Previous work has focused on topical event prediction by mining text information or graph patterns. However, the research outcomes in this area are still far from satisfactory. The aim of this project is to identify influential events in real time, and to analyze how they evolve over a period of time.

Members involved: Tarique Anwar, Chengfei Liu, Rui Zhou, Md. Mushfique Anwar, Jianxin Li, Zafaryab Rasool, Afzal Azeem Chaudhary

Funding source(s): Australian Research Council (ARC) Discovery Project DP160102412

Outcomes*: [C.14]

Spatial Partitioning of Road Traffic Networks and their Temporal Evolution (PhD thesis)

Urban areas generally attract people from all interior areas. According to the current global trend, people are rapidly migrating from rural to urban areas for several reasons, including access to better livelihood services and better employment opportunities. Consequently, the population of cities all over the world is increasing significantly, raising mobility demands manyfold. This strongly motivates research in urban planning and urban computing to develop innovative technologies and move towards smart and more sustainable cities. As most of the urban population travels daily or frequently for work or study, traffic congestion has become a very important practical problem. It affects the urban population directly through extra fuel cost and extra travel time, and indirectly in many other ways. An important concern in the smart urbanization of our societies is avoiding such congestion and maintaining smooth transportation. While infrastructure development is one way to deal with this problem, another is to analyze spatial traffic data to discover congestion formation and propagation patterns, and to apply them to optimize the traffic flow. Research on road traffic network data analysis is growing, with problems such as fastest route computation, traffic clustering, traffic prediction, emerging event detection, anomaly detection and bottleneck identification. To discover congestion patterns, continuously tracking the spatiotemporal evolution of the traffic load that leads to congestion is an important problem. Research on methods to identify congested partitions effectively and track their evolution efficiently has been very limited so far. In my PhD thesis, we aim to capture the spatiotemporal evolution of urban road traffic networks.
To this end, we propose technical methods to effectively partition road traffic networks in order to obtain the differently congested partitions at a given point in time, and to incrementally update those partitions in an efficient manner in order to track their evolution in real time.
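As a rough illustration of what "differently congested partitions" can mean, the sketch below groups adjacent road segments that fall into the same congestion level. It is a deliberately simplified stand-in and not the method developed in the thesis: the density thresholds, the units, the function names, and the breadth-first grouping are all assumptions made for this example, and the incremental update step is not covered.

```python
from collections import deque


def congestion_level(density, thresholds=(20.0, 50.0)):
    """Bucket a traffic density into 0 (free), 1 (moderate) or 2 (congested).

    The thresholds and their units (vehicles/km) are hypothetical.
    """
    return sum(density >= t for t in thresholds)


def partition(adjacency, density):
    """Group segments into connected regions whose members share a congestion level."""
    label = {}        # segment -> index of the partition it belongs to
    partitions = []   # list of (level, sorted member segments)
    for start in adjacency:
        if start in label:
            continue
        level = congestion_level(density[start])
        label[start] = len(partitions)
        members, queue = [start], deque([start])
        # Breadth-first search restricted to neighbours at the same level.
        while queue:
            segment = queue.popleft()
            for neighbour in adjacency[segment]:
                if neighbour not in label and congestion_level(density[neighbour]) == level:
                    label[neighbour] = len(partitions)
                    members.append(neighbour)
                    queue.append(neighbour)
        partitions.append((level, sorted(members)))
    return partitions


# A toy corridor of five segments a-b-c-d-e with made-up densities.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
density = {"a": 10.0, "b": 15.0, "c": 60.0, "d": 70.0, "e": 12.0}
result = partition(adjacency, density)
print(result)  # [(0, ['a', 'b']), (2, ['c', 'd']), (0, ['e'])]
```

Recomputing this from scratch at every time step would be wasteful on a city-scale network, which is why updating the partitions incrementally matters for real-time tracking.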

Members involved: Tarique Anwar, Chengfei Liu, Hai L. Vu, Christopher Leckie, Md. Saiful Islam, Timos Sellis, Serge P. Hoogendoorn

Funding source(s): NICTA / CSIRO Data61

Outcomes*: [C.7], [C.8], [C.9], [C.11], [J.6], [J.7], [J.8], [T.2]

A Web Surveillance Framework for Criminal Networks Identification and Monitoring

Nowadays most concealed anti-social organizations and their supporters use the World Wide Web (WWW) for propaganda, training, recruitment and fund-raising. It is becoming a challenging task for intelligence agencies to track the movements and activities of suspects around the globe. Given the huge amount of Web data and the increase in cyber crime, many researchers have realized that data mining techniques could be very useful for national and international security initiatives. Although they primarily focus on extracting implicit and novel patterns (knowledge) from huge databases (structured data), data mining techniques have also proven to be very effective for analyzing unstructured (text documents) and semi-structured (Web documents) data. In the context of Web surveillance, data mining can be a potential means to identify terrorists and their activities by analyzing Web data and the social networks that evolve from communication networks, including e-mails, enterprise portals, online forums, and social networking sites like Facebook and Twitter. In this project, we worked to design a knowledge-based Web surveillance framework that combines information retrieval, natural language processing and data mining techniques to identify criminals and their networks on the Web. The major components of our project are: (i) a data crawling and indexing process to retrieve focused multilingual Web contents, (ii) social network analysis to identify communities and leaders, (iii) sentiment analysis, (iv) in-depth web content mining to generate criminal profiles, (v) in-depth web usage mining and link mining to track individual criminals and their linkages with the mined communities, and (vi) creation of a knowledge portal to assist in national security.
The system would provide the knowledge needed by the Saudi cyber law and security enforcement agencies for identifying individuals who could spread criminal ideologies, criminal profiling, criminal (social) network analysis, and visualization of criminals’ activities, linkages and relationships.

Members involved: Tarique Anwar, Muhammad Abulaish

Funding source(s): King Saud University, KACST NPST project 11-INF1594-02

Outcomes*: [C.3], [C.5], [Ch.1], [J.4], [J.5]

Suspect Profiling and Tracking on the Web

Facebook is a buzzword that everyone must have noticed around them; others include Twitter and Orkut. Nowadays it has become a common trend to create and maintain personal information on the WWW. Moreover, the rapid growth of online friend-making has resulted in a full-fledged global social networking community where like-minded people exchange their thoughts. Online forums are another common means that has attracted people to discuss their topics of interest, because of their simplicity and ease of access. Unfortunately, these trends have also led to their use by various tech-savvy anti-social elements for better communication among themselves and for committing various types of cyber crimes. The Web is being used as a tool to commit these crimes in a sophisticated manner, ranging from hacking for data theft or identity theft and online fraud to cyber bullying, online pedophilia and cyber terrorism. These criminal offences are among the biggest threats to the social life of common people and to the security of nations. Criminal organizations find the Web a powerful means to spread their propaganda around the globe, simplifying their efforts to raise funds, recruit people, provide training, etc. The enormous amount of open data on the exponentially growing Web can surely yield a lot of valuable information to enable suspect profiling and tracking. In the project Warak-Warak we worked to develop a Web People Search Engine for profiling suspected users and tracking their malicious activities in cyberspace.

Members involved: Tarique Anwar, Muhammad Abulaish, Khaled Alghathbar

Funding source(s): King Saud University, KACST NPST project 11-INF1594-02

Outcomes*: [C.1], [C.6], [J.3]

Keyphrase Extraction from Text Documents (Master's thesis)

To date, the majority of useful information on the Web occurs in the form of text, which is either unstructured or semi-structured in nature. The easy accessibility of electronic media and the World Wide Web has contributed to an exponentially growing overload of text information. Thus, even though there is no scarcity of information on the Web, locating, extracting and analyzing the required information from this vast unstructured collection is a complex and challenging task. These complexities call for an automatic text information processing system that identifies and extracts the right information at the right time for effective decision making. Keyphrases provide semantic metadata that summarize and characterize documents and enable readers to quickly determine whether a given article is in their fields of interest. We worked to devise a light-weight machine learning approach for automatic keyphrase extraction using various lexical and semantic features mined from text documents, with the aim of making it applicable to real-time environments. The machine learning approach first builds a prediction model using training documents with known keyphrases, and then uses the model to find keyphrases in new documents.
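The two-stage idea (train on documents with known keyphrases, then score candidates in a new document) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the approach from the thesis: the three features (relative frequency, earliness of first occurrence, phrase length), the stopword list, the plain logistic-regression learner, and all function names are choices made for this example.

```python
import math
import re

# A tiny, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to",
             "is", "are", "with", "from", "by", "we"}


def candidate_features(text, max_len=2):
    """Map each stopword-free n-gram (n <= max_len) to a small feature vector."""
    words = re.findall(r"[a-z]+", text.lower())
    total = max(len(words), 1)
    stats = {}  # phrase -> [occurrence count, index of first occurrence]
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if any(w in STOPWORDS for w in gram):
                continue
            entry = stats.setdefault(" ".join(gram), [0, i])
            entry[0] += 1
    # Features: bias, relative frequency, earliness of first occurrence, phrase length.
    return {p: [1.0, c / total, 1.0 - first / total, float(len(p.split()))]
            for p, (c, first) in stats.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(documents, epochs=200, lr=0.5):
    """Fit logistic-regression weights from (text, known_keyphrases) pairs."""
    weights = [0.0] * 4
    for _ in range(epochs):
        for text, keyphrases in documents:
            gold = {k.lower() for k in keyphrases}
            for phrase, x in candidate_features(text).items():
                y = 1.0 if phrase in gold else 0.0
                error = sigmoid(sum(w * v for w, v in zip(weights, x))) - y
                weights = [w - lr * error * v for w, v in zip(weights, x)]
    return weights


def extract(text, weights, top_k=3):
    """Return the top_k candidates of a new document, ranked by predicted score."""
    scored = {p: sigmoid(sum(w * v for w, v in zip(weights, x)))
              for p, x in candidate_features(text).items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


# Made-up training documents with known keyphrases, then an unseen document.
training = [
    ("Traffic congestion in road networks. We study traffic congestion in cities.",
     ["traffic congestion"]),
    ("Keyphrase extraction from text documents. Keyphrase extraction helps readers.",
     ["keyphrase extraction"]),
]
weights = train(training)
new_text = "Event detection from data streams. Event detection is hard."
keyphrases = extract(new_text, weights)
print(keyphrases)
```

Because the features are cheap to compute and scoring is a single dot product per candidate, a model of this shape stays light enough for real-time use, which is the property the paragraph above emphasizes.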

Members involved: Tarique Anwar, Muhammad Abulaish

Outcomes*: [C.2], [J.1], [J.2], [T.1]

*PS: Please see the publications page for the papers referred to as outcomes above.