Konstantinos Kallas


he/him or they/them
PhD student at UPenn
Co-organizer of CS PhD MentoRes
Proud member of the PLClub
Office: Levine 513
Contact: kallas@seas.upenn.edu

Curriculum Vitae
Short Bio
Personal Blog

Overview

I am a PhD student at the University of Pennsylvania, where I am fortunate to be advised by Rajeev Alur. My research interests lie at the intersection of programming languages, distributed systems, and software verification. I have done an internship at Microsoft Research, where I worked with Sebastian Burckhardt, and an internship in the Automated Reasoning Group at AWS, where I worked with Daniel Schwartz-Narbonne. Before my PhD, I did my undergraduate studies at the National Technical University of Athens, where I was advised by Kostis Sagonas.

I am a member of the Technical Steering Committee of PaSh, a shell-script parallelization project hosted by the Linux Foundation.

Together with Manos Theodosis, I am leading a mentoring initiative for people who are interested in applying to PhD programs in Computer Science and related fields. If you are interested in applying for a PhD (or just want to know more about what it entails), contact us!

Quick Links: Research, Papers, Awards, Software, Service, Personal


Note: I am committed to ensuring that everyone feels comfortable being part of our research community. I am always happy to talk to anyone (junior or senior) who feels like they want to share or discuss anything; be it a negative (or positive) experience, an interaction with another member of our community, or an issue with university and other processes. I have often needed to seek help myself and found that having access to a listening ear is very helpful.


Research

Automatic parallelization of shell scripts

Links: PaSh, PaSh-JIT paper (OSDI 2022), PaSh paper (Best Paper ⭐ EuroSys 2021), Dataflow Model paper (ICFP 2021), Shell Future paper (HotOS 2021), Short video (1st place ⭐ POPL SRC), Shell Future talk (Distinguished Presentation ⭐ HotOS 2021), PaSh is hosted by the Linux Foundation

Collaborators: Nikos Vasilakis, Michael Greenberg, Tammam Mustafa, Achilleas Benetopoulos, Lazar M. Cvetković, Thurston Dang, Shivam Handa, Dimitris Karnikis, Konstantinos Mamouras, and Martin Rinard

Together with Nikos Vasilakis and Michael Greenberg, I am leading a research project on the automatic parallelization of shell scripts. The ultimate goal is to improve our understanding of the shell by approaching it from a programming languages viewpoint, and to build frameworks that enable further studies and analyses of it. We have described this vision for the future of the shell in this paper, which appeared at HotOS '21, where we also organized a panel on future avenues for shell research.

The paper that started this research was published at EuroSys 2021 (link). One of the main challenges that we had to overcome is that shell commands are arbitrary black boxes that can be written in a plethora of programming languages, which makes analyzing them directly infeasible. We addressed this by developing an annotation language that captures a few key properties of shell commands, which our system then uses to parallelize a script. These annotations are written once per command and can be shared among users in the form of annotation libraries. We have also formalized an order-aware dataflow model that is equivalent to a "scheduling-free" fragment of the shell (paper link). We use this model as a convenient representation on which we apply transformations that expose parallelism. After we are done parallelizing, the dataflow graph is transformed back to a shell script that can execute on any standard shell.
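To make the idea of per-command annotations more concrete, here is a minimal toy sketch in Python. It is not PaSh's annotation language or implementation: the `Annotation` class, the `ANNOTATIONS` and `COMMANDS` tables, and the `grep_a` command are all hypothetical names made up for illustration, and the "commands" are plain Python functions over lists of lines rather than real processes. The sketch only shows how knowing a command's class (and how to combine partial outputs) is enough to split its input, run copies on each chunk, and still get the sequential result.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Lines = List[str]


@dataclass
class Annotation:
    """Hypothetical annotation: a command's class and how to combine partial outputs."""
    kind: str  # e.g. "stateless" or "pure" (illustrative labels)
    aggregate: Optional[Callable[[List[Lines]], Lines]] = None


def concat(parts: List[Lines]) -> Lines:
    # Stateless commands: concatenating per-chunk outputs in order suffices.
    return [line for part in parts for line in part]


def merge_sorted(parts: List[Lines]) -> Lines:
    # Sorting: per-chunk outputs are sorted, so an ordered merge recombines them.
    return list(heapq.merge(*parts))


# Toy stand-ins for shell commands (assumption: they only read their input lines).
COMMANDS: Dict[str, Callable[[Lines], Lines]] = {
    "grep_a": lambda lines: [l for l in lines if "a" in l],
    "sort": sorted,
}

# Written once per command; could be shared as a library.
ANNOTATIONS: Dict[str, Annotation] = {
    "grep_a": Annotation("stateless", concat),
    "sort": Annotation("pure", merge_sorted),
}


def split(lines: Lines, width: int) -> List[Lines]:
    """Split into at most `width` contiguous chunks, preserving input order."""
    size = max(1, -(-len(lines) // width))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def run_stage(name: str, lines: Lines, width: int) -> Lines:
    cmd = COMMANDS[name]
    ann = ANNOTATIONS.get(name)
    if ann is None or ann.aggregate is None or width <= 1:
        return cmd(lines)  # no annotation: fall back to sequential execution
    # Chunks are processed one after another here; a real system would run
    # them as concurrent processes.
    return ann.aggregate([cmd(chunk) for chunk in split(lines, width)])


def run_pipeline(stages: List[str], lines: Lines, width: int) -> Lines:
    for name in stages:
        lines = run_stage(name, lines, width)
    return lines
```

For example, `run_pipeline(["grep_a", "sort"], data, 3)` produces the same lines as `run_pipeline(["grep_a", "sort"], data, 1)` for any input `data`, because each stage's aggregator restores the output that the sequential command would have produced.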

Our work is open-source, available on GitHub, and hosted under the Linux Foundation.

Partial order driven stream processing

Links: Flumina on GitHub, Diffstream on GitHub, Dependency-guided synchronization paper (PPoPP 2022), Diffstream paper (OOPSLA 2020)

Collaborators: Rajeev Alur, Filip Niksic, and Caleb Stanford

Existing abstractions for stream processing consider streams to be either totally ordered sequences, completely unordered relations, or some fixed point in between, e.g., CQL considers streams to be sequences of relations. However, these representations face a number of issues. If a representation does not capture adequate order, streaming queries can produce erroneous results due to nondeterminism and out-of-order input data. If a representation is "too ordered", then it fails to expose the parallelism and the optimizations that the lack of order makes available. We propose a flexible partial order abstraction that can capture fine-grained ordering requirements, allowing for correct and maximally parallel stream processing. Our first paper on this work was published at OOPSLA 2020 (link), where we describe a differential testing framework (code) for stream processing applications that allows users to define the ordering requirements on their application's output, improving testing accuracy. We are also working on a programming model that can exploit the partially ordered nature of the input stream, generating highly parallel implementations. This work is going to appear at PPoPP 2022, and a prototype of the code is available on GitHub.
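As a rough illustration of what user-defined ordering requirements on an output can mean, the following Python sketch checks whether two output streams are equal up to reordering of independent events. It is a toy under stated assumptions, not the DiffStream API: the function name `equivalent_up_to_reordering`, the `dependent` predicate, and the `(key, value)` event shape are hypothetical, and the check assumes that `dependent` is symmetric.

```python
from typing import Callable, Hashable, Sequence, Tuple

Event = Tuple[Hashable, int]  # hypothetical event shape: (key, value)


def equivalent_up_to_reordering(
    xs: Sequence[Event],
    ys: Sequence[Event],
    dependent: Callable[[Event, Event], bool],
) -> bool:
    """Return True if ys can be obtained from xs by swapping adjacent
    events that are NOT dependent on each other.

    Assumes `dependent` is symmetric.
    """
    remaining = list(ys)
    for x in xs:
        for i, y in enumerate(remaining):
            if y == x:
                # Everything skipped before this match was independent of x,
                # so x could be moved to the front of `remaining`.
                del remaining[i]
                break
            if dependent(x, y):
                # x would have to cross an event it is ordered with.
                return False
        else:
            return False  # x does not occur in the rest of ys
    return not remaining


def same_key(a: Event, b: Event) -> bool:
    # Example requirement: events with the same key must keep their relative order.
    return a[0] == b[0]
```

With `same_key` as the dependence relation, the streams `[("a", 1), ("b", 1), ("a", 2)]` and `[("b", 1), ("a", 1), ("a", 2)]` are considered equivalent, while `[("a", 2), ("b", 1), ("a", 1)]` is not, because it reorders two events on key `"a"`. A totally ordered stream corresponds to a predicate that always returns `True`, and a completely unordered one to a predicate that always returns `False`.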


Select Publications

A complete list of my publications, talks, and reports can be found here or on my Google Scholar profile.


Awards


Software


Service


Personal

In my free time I enjoy human activities. I really enjoy lying down and falling asleep in nature, with or without the presence of other people. During the late 2010s I was obsessed with escape rooms but I am not going that often anymore. I occasionally like producing rhythmic sounds from electric guitars, usually in the context of some jam session.