9–10 Jun 2026
UiT - The Arctic University of Norway in Tromsø
Europe/Oslo timezone

A Benchmark Dataset for Graph Regression with Homogeneous and Multi‑Relational Variants

Not scheduled
20m
Auditorium Cerebrum (UiT - The Arctic University of Norway in Tromsø)

UiT - The Arctic University of Norway, Universitetsvegen 61, 9019 Tromsø, Norway
Poster

Speaker

Antonio Longa (UiT)

Description

Research software increasingly depends on structured code representations, yet publicly available graph-regression benchmarks remain concentrated in domains such as chemistry and offer limited support for studying software-centric, execution-aware graphs. We present RelSC, an open benchmark for software performance prediction that converts Java programs into graph representations and pairs them with measured execution-time labels. RelSC is released in two complementary variants: RelSC-H, a homogeneous flow-augmented AST representation, and RelSC-M, a multi-relational version that preserves semantically distinct relationships in program structure.
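The difference between the two variants can be illustrated with a minimal sketch. The toy program, node labels, and relation names below are hypothetical and are not taken from RelSC's actual schema; the sketch only shows how one set of typed edges could yield either a single untyped edge list (in the spirit of RelSC-H) or one edge list per relation (in the spirit of RelSC-M).

```python
from collections import defaultdict

# Hypothetical toy graph for the Java statements:
#   int x = 1; int y = x + 2;
# Node labels and relation names are illustrative assumptions,
# not RelSC's actual schema.
nodes = {
    0: "Block",
    1: "VarDecl:x",
    2: "Literal:1",
    3: "VarDecl:y",
    4: "BinaryOp:+",
}

# Typed edges as (source, relation, target).
typed_edges = [
    (0, "ast_child", 1),
    (1, "ast_child", 2),
    (0, "ast_child", 3),
    (3, "ast_child", 4),
    (1, "control_flow", 3),  # the declaration of x executes before that of y
    (1, "data_flow", 4),     # the value of x is read by the expression x + 2
]


def to_homogeneous(edges):
    """Drop relation labels: one deduplicated, sorted edge list."""
    return sorted({(src, dst) for src, _, dst in edges})


def to_multi_relational(edges):
    """Keep relation labels: one edge list per relation."""
    grouped = defaultdict(list)
    for src, rel, dst in edges:
        grouped[rel].append((src, dst))
    return dict(grouped)


homogeneous_edges = to_homogeneous(typed_edges)
relational_edges = to_multi_relational(typed_edges)
```

In the homogeneous form a model sees all six edges as the same kind of connection, whereas the multi-relational form lets a heterogeneous model treat syntax, control flow, and data flow separately.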

Beyond the dataset itself, we release a reproducible construction pipeline, standardized train/validation/test splits, and ready-to-use PyTorch Geometric objects, enabling consistent reuse and comparison across studies. We benchmark source-code, AST-based, homogeneous GNN, and heterogeneous GNN baselines on two corpora that reflect both real-world build variability and controlled execution settings. Our results show that semantic augmentation of ASTs with control-flow and data-flow information substantially improves prediction quality, while richer multi-relational structure introduces additional robustness challenges, especially in smaller projects.
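As a rough illustration of why fixed splits matter for comparison across studies, the sketch below builds a seeded train/validation/test split and scores a trivial mean-prediction baseline with mean absolute error. The split fractions, the seed, the metric, and the execution times are all made-up assumptions for illustration; they are not RelSC's published splits, metrics, or data.

```python
import random


def split_indices(n, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle 0..n-1 with a fixed seed and cut into train/val/test lists.

    The fractions are illustrative defaults, not RelSC's actual ratios.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up execution-time labels in seconds, one per program graph.
times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

train_idx, val_idx, test_idx = split_indices(len(times), seed=42)

# Trivial baseline: always predict the mean training label.
train_mean = sum(times[i] for i in train_idx) / len(train_idx)
test_mae = mean_absolute_error(
    [times[i] for i in test_idx],
    [train_mean] * len(test_idx),
)
```

Because the shuffle is seeded, every run yields the same partition, so any model evaluated on `test_idx` can be compared against the same baseline.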

We position RelSC not only as a machine learning benchmark, but as research software infrastructure: a reusable and extensible resource for reproducible evaluation of program-analysis methods, performance regression studies, and future tools for CI/CD, code optimization, and performance-aware scheduling.

Authors

Antonio Longa (UiT)
Mr Marcus Vukojevic (University of Trento)
Mr Morteza Haghir Chehreghani (Chalmers University of Technology)
Mr Peter Samoaa (Chalmers University of Technology)
