Department of Electrical Engineering, Computer Engineering and Informatics
As a strong supporter of applied research, Dr. Herodotou focuses on innovating and solving technically challenging problems in the areas of information management, infrastructure for large-scale cloud database systems, reducing the total cost of ownership of information management systems, enabling flexible ways to query, browse, and organize rich data sets containing both structured and unstructured data, and the automated management of database and data processing systems.
The main research fields Dr. Herodotou is currently working on are:
Hierarchical Storage Management for Cluster Computing
Scalable Database Management Systems
Maritime Data Management and Analytics
Data Management for IoT and Wearable Devices
Cloud Computing Systems and Networks
Automated Tuning for Large-scale Data Processing Systems
Data-driven Applications in the Energy Domain
Data-driven Applications for Tourism
Database Query Optimization and Tuning
Applied Database Technologies
Google Scholar
US Patent 9,367,601 B2. Cost-Based Optimization of Configuration Parameters and Cluster Sizing for Hadoop. June 2016.
US Provisional Patent DU4146PROV. Systems and Methods for Cost-Based Optimization for MapReduce Workflows. March 2013.
Ongoing Research Projects
MARI-Sense: Maritime Cognitive Decision Support System
The primary general objective of the MARI-Sense project is the integration and adaptation of existing expertise and the development of novel knowledge and skills to develop the MARI-Sense Cognitive Decision Support System for Maritime Activities Planning, Emergency Response and Planning, and Maritime Spatial Planning. The secondary general objective is the development and implementation of strategies for smart, sustainable, and inclusive growth with a beneficial impact on society, technology, and the economy, powered by the diverse capabilities of members of the quadruple helix and the general public.
Funded by: The MARI-Sense Project (INTEGRATED/0918/0032) is co-financed by the European Regional Development Fund and the Republic of Cyprus through the Research and Innovation Foundation (RIF).
Role in project: Work Package Leader
STEAM: Sea Traffic Management in the Eastern Mediterranean
The general objective of the STEAM project is the efficient management of sea traffic in the Eastern Mediterranean Sea, while at the same time ensuring safety and environmental sustainability. More specifically, the project aims to develop the Port of Limassol into (i) a world-class transshipment and information hub adopting modern digital technologies brought to the maritime sector, and (ii) a driver for short sea shipping in the Eastern Mediterranean through enhanced services based on standardized ship and port connectivity.
Funded by: The STEAM Project (INTEGRATED/0916/0063) is co-financed by the European Regional Development Fund and the Republic of Cyprus through the Research and Innovation Foundation (RIF).
Role in project: Scientific Coordinator
Distributed Multi-tier Storage for Cluster Computing
Improvements in memory, storage devices, and network technologies are constantly exploited by distributed systems in order to meet the increasing data storage and I/O demands of modern large-scale data analytics. We present a novel distributed file system that is aware of storage media (e.g., memory, SSDs, HDDs, NAS) with different capacities and performance characteristics. The system offers a spectrum of usage patterns ranging from fully automating data management to providing explicit control by exposing the storage media to users.
Funded by: Cyprus University of Technology Startup Grant
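To illustrate the fully automated end of the usage spectrum described above, here is a minimal sketch of tier-aware block placement. It is only an assumed, simplified policy (fastest tier with enough free space wins), not the placement logic of the project's file system; the `Tier` class, the `place_block` function, and the capacities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """A storage tier with its remaining capacity (hypothetical model)."""
    name: str
    free_bytes: int


def place_block(tiers, block_size):
    """Place a block on the first tier with enough free space.

    `tiers` is assumed to be ordered from fastest to slowest.
    """
    for tier in tiers:
        if tier.free_bytes >= block_size:
            tier.free_bytes -= block_size  # reserve the space on that tier
            return tier.name
    raise RuntimeError("no tier has enough free space")


# Illustrative capacities only, ordered fastest first.
tiers = [Tier("memory", 100), Tier("ssd", 1000), Tier("hdd", 10000)]
first = place_block(tiers, 80)   # fits in memory
second = place_block(tiers, 80)  # memory has only 20 bytes left, falls through to ssd
```

Explicit user control, the other end of the spectrum, would correspond to the caller naming the target tier directly instead of leaving the choice to a policy like this one.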
Scaling Transactional Databases with Strong Guarantees
Database replication is a common mechanism for improving the availability and performance of distributed transactional databases, but it typically leads to lower degrees of consistency and poor scale-out support. Hihooi is a data replication middleware solution that employs a novel fully replicated shared-nothing architecture and a lightweight transaction scheduling algorithm to provide good scalability while offering full ACID guarantees.
Collaborator: Dr. Michael Sirivianos, Cyprus University of Technology
Towards a Unified Platform for Multi-Wearable Apps
Wearable technology has recently become a ubiquitous part of everyday life. Smartwatches, activity trackers, and clothing embedded with sensors are used for monitoring personal fitness data, medical devices for detecting health disorders such as sleep apnea, and professional sports devices for offering real-time feedback to athletes. However, the current landscape of wearable devices suffers from two main issues: (i) each device currently offers only a portion of the combined capabilities of all the devices, and (ii) most devices do not share data with each other and are tied to certain ecosystems. Hence, there is a strong need for a unified framework that will turn the current collection of standalone devices into a fully networked technology connected not only to other external devices (such as smartphones) but also to the cloud.
Collaborators: Dr. Andreas Pamboris, University of Cyprus / Dr. Panayiotis Andreou, UCLan Cyprus
Completed Research Projects
Sea Traffic Management Validation Project
The primary goal of this research programme is the innovative optimization of processes and services within and between ports based on enhanced collaboration and regulated information sharing among port actors. The Sea Traffic Management concept is a holistic approach to distributed services related to the berth-to-berth voyage, enabling efficient, safe, and environmentally sustainable sea transport.
Scalable Near Real-Time Failure Localization of Data Center Networks
Despite the built-in redundancy in data center networks, performance issues and device or link failures in the network can lead to user-perceived service interruptions. Therefore, determining and localizing user-impacting availability and performance issues in the network in near real time is crucial. Our key idea is to use statistical data mining techniques on large-scale active monitoring data to determine a ranked list of suspect causes, which we refine with passive monitoring signals.
Starfish: A Self-tuning System for Big Data Analytics
The Hadoop MapReduce platform is a popular choice for big data analytics. Unfortunately, Hadoop's performance out of the box leaves much to be desired, causing suboptimal use of resources, time, and money. Starfish is a self-tuning system for big data analytics that builds on Hadoop while adapting to system workloads and user needs to provide good performance automatically, without any need for users to understand and manipulate the many tuning knobs in the Hadoop platform.
Query Optimization Techniques for Partitioned Tables
Table partitioning has evolved into a powerful mechanism but is currently not utilized effectively during query optimization. We have developed new techniques to generate efficient plans for SQL queries involving multiway joins over partitioned tables. The techniques are designed for easy incorporation into bottom-up query optimizers and have been prototyped in PostgreSQL.
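As a rough illustration of why partitioning matters for join planning, the sketch below assumes two tables that are range-partitioned on the join key and keeps only the partition pairs whose key ranges overlap; the remaining pairs cannot produce join results and can be skipped. This is a generic partition-wise pruning idea, not the project's actual optimizer technique, and the `Partition` type, the `overlapping_pairs` function, and the table names are hypothetical.

```python
from typing import NamedTuple


class Partition(NamedTuple):
    """A range partition on the join key (hypothetical model)."""
    name: str
    low: int   # inclusive lower bound of the partition key
    high: int  # exclusive upper bound of the partition key


def overlapping_pairs(left, right):
    """Return the (left, right) partition name pairs whose key ranges intersect."""
    return [
        (l.name, r.name)
        for l in left
        for r in right
        if l.low < r.high and r.low < l.high
    ]


orders = [
    Partition("orders_p1", 0, 100),
    Partition("orders_p2", 100, 200),
    Partition("orders_p3", 200, 300),
]
items = [
    Partition("items_p1", 0, 150),
    Partition("items_p2", 150, 300),
]

# Only 4 of the 6 possible partition pairs need to be joined.
pairs = overlapping_pairs(orders, items)
```

A bottom-up optimizer could then cost a join plan per surviving pair instead of a single join over the full tables.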
Automating the Process of SQL Tuning
zTuned is a new system that automates SQL tuning using an experiment-driven approach. The nontrivial challenge is to plan the best set of experiments to conduct so that a satisfactory (new) plan can be found quickly. A novel feature of zTuned is a SQL-tuning-aware query optimizer, called Xplus, capable of executing plans proactively, collecting monitoring data from the runs, and iterating. Xplus has been prototyped using PostgreSQL.
MARI-Sense: Maritime Cognitive Decision Support System (INTEGRATED/0918/0032)
STEAM: Sea Traffic Management in the Eastern Mediterranean (INTEGRATED/0916/0063)
ENCASE: EnhaNcing seCurity And privacy in the Social wEb (H2020-MSCA-RISE-2015)
ENGINITE: ENGineering and Industry Innovative Training for Engineers (2017-1-CY01-KA202-026728)
STM: Sea Traffic Management Validation Project (2014-EU-TM-0206-S)
NOTRE: Network for Social Computing Research (H2020-TWINN-2015)
Distributed Multi-tier Storage for Cluster Computing
Editor
Program Committee Chair
Vice/Track Program Committee Chair
Program Committee Member
Journal Reviewer
Professional Memberships
CEI 325 - Database Systems
The course gives a solid background in databases with a focus on relational database management systems. Topics include data modeling, database design theory and methodology, data definition and manipulation languages, storage and indexing techniques, query processing and optimization, transactions, concurrency control, and recovery. The course also covers fundamentals of database management system architecture and techniques for database application development.
CEI 467/526 - Advanced Topics in Data Processing Systems
The need to store and process massive amounts of data has led to the evolution of existing database systems, while a new breed of data processing systems has emerged. This course covers a spectrum of topics from core techniques in relational data management to highly scalable data processing using parallel database systems and MapReduce. First, the course covers the basic principles of database query processing and optimization, including index structures, sort and join processing, query rewrites, and physical plan selection. Next, the course covers topics from parallel and distributed databases, including data partitioning and distributed join algorithms. Finally, the course covers scalable data processing systems such as MapReduce and NoSQL databases (column, document, and key-value stores). The course material will be drawn from textbooks as well as recent research literature. Prerequisite background: Basic database knowledge.
CEI 226 - Algorithms and Complexity
The course focuses on the design and analysis of efficient algorithms and their complexity. In particular, the course covers various topics including algorithm analysis, asymptotic analysis, recurrence relations, divide-and-conquer algorithms, dynamic programming, greedy algorithms, graph representation, graph search, minimum spanning trees, shortest paths, maximum flow, NP-Completeness, and approximation algorithms. Prerequisite background: Data Structures.