Özyeğin University, Çekmeköy Campus Nişantepe District, Orman Street, 34794 Çekmeköy - İSTANBUL

Phone : +90 (216) 564 90 00

Fax : +90 (216) 564 99 99

E-mail: info@ozyegin.edu.tr

Jul 01, 2022 - Jul 07, 2022

Dissertation Defense - Ersin Ersoy (PHDCS)

 

Ersin Ersoy Ph.D. Computer Science

Assoc. Prof. Dr. Hasan Sözer - Advisor

Date: 07.07.2022

Time: 11:00

Location: AB1 412

 

Automated Maintenance Support for Data-Tier Software

 

Assoc. Prof. Dr. Hasan Sözer, Özyeğin University

Asst. Prof. Dr. M. Furkan Kıraç, Özyeğin University

Assoc. Prof. Dr. O. Örsan Özener, Özyeğin University

Assoc. Prof. Dr. Mehmet S. Aktaş, Yıldız Technical University

Assoc. Prof. Dr. Kamer Kaya, Sabancı University

 

Abstract:

 

Data-tier software includes the data model and business logic of enterprise systems, and it is subject to long-term maintenance. Even though the user interface of these systems can be completely replaced, data-tier software usually evolves for decades. The number of domain experts with extensive knowledge of the overall software diminishes over time, and applying extensions or changes becomes increasingly effort-intensive and error-prone for new developers. In this thesis, we introduce techniques and tools to provide automated maintenance support for data-tier software. These techniques and tools aim at reducing effort and the number of errors for three challenging maintenance tasks: i) correct placement of a new object, such as a stored procedure, in data-tier software, ii) evaluating the impact of changing database tables on software modules, and iii) evaluating the impact of table extensions on other tables of the same database.

The first task is important because introducing a new object to data-tier software should not hamper its modular structure. This structure is defined by the allocation of objects among a set of schemas. Therefore, we introduce an approach and a tool to automatically predict the correct placement of new objects. We extract dependencies among various types of objects (database types, sequences, tables, procedures, functions, packages, and views) that are already placed in schemas. These dependencies are used for training an artificial neural network model, which is then used for prediction. Our industrial case studies show that our approach can reach an accuracy of 89%, whereas the baseline approach using coupling and cohesion metrics reaches at most 57.4% accuracy.

There are already several techniques and tools for supporting the second task of analyzing the impact of changes in the data model on the source code. However, they fall short in analyzing dynamically created SQL statements, queries on multiple tables, and other types of statements that allow data manipulation in PL/SQL, which is a commonly used language for developing data-tier software. We introduce techniques and a tool to parse both the data model and the source code (i.e., PL/SQL functions and procedures) taking part in all the schemas of a given database. Then, a dependency model is created based on queries and manipulation of database tables. Unlike prior studies, our tool can analyze queries that are created dynamically and that involve multiple tables, as well as PL/SQL-specific features. We use the derived dependency model to estimate effort for two common refactoring types on real systems. We observe high consistency between the automated estimations and manual estimations.

The third task is concerned with the impact of changing tables on other tables of the same database. There are only a few studies that focus on this concern. Moreover, these studies consider only the impact of deletion and modification of columns in database tables. To address this limitation, we introduce an approach and a tool for automatically detecting the impact of data model extensions on the data model itself. We employ Siamese networks to detect similarities among database tables and thus to learn implicit relations among them. Table similarities are used as the basis for identifying potential impact. We develop another tool as the baseline, which employs the cosine similarity metric to measure similarity among database tables. Results obtained with Siamese networks turned out to be better than the baseline, achieving a mean F1 score of 96.1%.
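
As an illustration only, the cosine-similarity baseline mentioned in the abstract might look roughly like the sketch below. The abstract does not say how tables are turned into vectors, so the bag-of-tokens representation built from column names, the function names, and the sample tables are all assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter


def column_vector(columns):
    """Build a bag-of-tokens vector from column names (assumed representation).

    Column names are lowercased and split on underscores, so that
    "CUSTOMER_ID" contributes the tokens "customer" and "id".
    """
    tokens = []
    for name in columns:
        tokens.extend(t for t in name.lower().split("_") if t)
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar_tables(target, tables):
    """Rank all other tables by similarity to the target table.

    Highly similar tables are candidates for being impacted when the
    target table is extended (hypothetical use of the scores).
    """
    target_vec = column_vector(tables[target])
    scores = [
        (name, cosine_similarity(target_vec, column_vector(cols)))
        for name, cols in tables.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical schema used only to exercise the functions above.
tables = {
    "CUSTOMER": ["CUSTOMER_ID", "FIRST_NAME", "LAST_NAME"],
    "CUSTOMER_ARCHIVE": ["CUSTOMER_ID", "FIRST_NAME", "LAST_NAME", "ARCHIVE_DATE"],
    "PRODUCT": ["PRODUCT_ID", "PRICE"],
}

if __name__ == "__main__":
    for name, score in rank_similar_tables("CUSTOMER", tables):
        print(f"{name}: {score:.3f}")
```

In this toy schema, the archive table ranks above the unrelated product table, which is the kind of signal a similarity-based impact analysis would flag for review when new columns are added to CUSTOMER.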

Bio:

Ersin Ersoy received his associate degree in Computer Programming from Marmara University (1999-2001). He received a degree in Computer Engineering from Kocaeli University, an MBA from Bilgi University in 2012, and an M.Sc. in Computer Engineering from Özyeğin University in 2016. He has been working as a senior manager at Turkcell group since 2000, where he is responsible for Mobile Payment and Data Analytics technology solutions. He is currently pursuing his doctorate under the supervision of Assoc. Prof. Hasan Sözer at Özyeğin University, Istanbul, Turkey. His current research interests include software engineering and machine learning.