Current projects
Partner Projects
Projects in collaboration with industry partners
DaFlEx: Performance portability through dataflow extraction [Prof. Torsten Hoefler, Prof. Luca Benini]
The goal of this project is to extract dataflow information from programs written in imperative programming languages and create efficient versions of these for multiple hardware platforms. We leverage and extend the powerful DaCe framework to expose parallelism and improve application performance.
Key achievements:
- Allow dataflow extraction from MLIR programs (Bridging Control-Centric and Data-Centric Optimization)
- Extraction of dataflow representations of C programs (Lifting C Semantics for Dataflow Optimization)
- Extraction of dataflow representations of Fortran programs with a focus on climate and weather models
- Improvements to the translation of pointer iteration patterns in C towards data-centric representations
Contact
Alexandru Calotoiu
A New Methodology and Open-Source Benchmark Suite for Evaluating Data Movement Bottlenecks: A Processing-in-Memory Case Study [Prof. Onur Mutlu]
Our methodology to characterize data movement bottlenecks can enable the adoption of processing-in-memory in real-world computing systems.
Key achievements:
- Programming a Real-World Processing-in-Memory Architecture
- System Support for Processing-using-Memory Architectures
- Accelerating Data-Intensive Stencil Applications with Processing-near-Memory Architectures
Contact
Juan Gomez Luna
Machine-Learning-Assisted Intelligent Micro-architectures to Reduce Memory Access Latency [Prof. Onur Mutlu]
Machine-learning-based control and speculation policies help us create intelligent microarchitectures for next-generation processors.
Key achievements:
- Data-driven and HW/SW co-designed techniques for prefetching (presented in PACT SRC’23)
- Data-driven approaches for managing hybrid memory/store systems (presented in ISCA’23)
Contact
Rahul Bera
Sensor Fusion [Prof. Luc Van Gool]
In the Sensor Fusion project at the Computer Vision Lab, sensor fusion network architectures are developed for semantic understanding of driving scenes under varying and adverse visual conditions. In particular, complementary sensor information is fused adaptively to recognize the content of each scene, depending on the visual conditions at hand and the robustness of each sensor to them.
Key achievements:
- Introduce a transformer-based sensor fusion architecture for dense 2D semantic perception that effectively fuses large numbers of input modalities with minimal computational overhead compared to its unimodal counterpart
- Design weakly supervised domain adaptation methods for semantic segmentation based on cross-domain image-level correspondences by adaptively refining pseudolabels and contrastively aligning features
- Construct the first large-scale multimodal driving dataset for dense semantic perception under diverse visual conditions, including different combinations of time of day, visibility, and type of precipitation, and featuring a frame camera, an event camera, a lidar, a radar, and an IMU/GNSS sensor
Contact
Christos Sakaridis
Integrity and Access Control in Distributed Memory Systems [Prof. S. Capkun, Prof. S. Shinde]
Highly distributed memory systems have become one of the main enablers of modern computing platforms. In this project, we investigate how to design and build appropriate access control and integrity mechanisms for such systems.
Key achievements:
- Designed a data-center scale confidential computing architecture
- Added TEE support to two accelerators (AI and storage)
- Marginal overhead (0.42-8%)
Contact
Supraja Sridhara
Application-Specific Architectures [Prof. Torsten Hoefler]
Spatial (or dataflow) devices are a viable and interesting alternative to classical computer organizations. They offer massive parallelism, with hundreds (or even thousands) of processing elements that can communicate through a fast network on chip. Many spatial architectures are offered today as ML accelerators, but how to map and schedule applications on these devices is still an open challenge. With this project, we investigate methodologies and tools for the rapid development and evaluation of Domain-Specific Architectures (DSAs) or Domain-Specific Systems on Chip (DSSoCs), to be able to adapt to the rapid evolution of algorithms in a cost-effective way.
Key achievements:
- Proposed a computational model to reason about application scheduling for spatial accelerators.
- Proposed scheduling solutions that leverage the unique characteristics of these devices.
- Proposed a proof-of-concept framework for Application-Specific Architecture design. The framework takes a user-provided application as input and performs a Design Space Exploration phase. The goal of this exploration is to return one or more macro-level architecture descriptions of a System on Chip (SoC) that can execute the application with good performance/power/area trade-offs. The framework uses DaCe as its frontend.
Contact
Tiziano De Matteis
Towards Optimal Next-Generation Heterogeneous Interconnects for High-Performance Irregular Workloads [Prof. Torsten Hoefler]
In this project, we aim at developing next-generation massively-parallel and heterogeneous interconnects, as well as efficient algorithms and paradigms for solving challenging classes of unstructured irregular workloads such as graph computations.
Key achievements:
- Design of novel paradigms & architectures for scalable and high-accuracy irregular AI applications: Graph of Thoughts (published in AAAI'24), HOT (published in LoG'23), attentional graph neural networks with global tensor formulations (published in Supercomputing’23), and cached operator reordering.
- Design of novel paradigms & architectures for scalable irregular graph applications: the graph database interface (published in Supercomputing’23, Best Paper Finalist), ProbGraph (Best paper award at Supercomputing'22), Neural Graph Databases (published in LoG’22), and harnessing Graph Neural Networks for motif prediction (published in KDD’22).
- Design of scalable interconnects: Sparse Hamming Graph, which is a customizable network-on-chip topology (published in DAC'23) and HexaMesh, a topology that enables scaling to hundreds of chiplets with an optimized chiplet arrangement (published in DAC'23).
- Analysis and taxonomy of data organization, system designs, and graph queries (published in ACM Computing Surveys (CSUR)) and an in-depth concurrency analysis of parallel and distributed graph neural networks (published in IEEE TPAMI).
Contact
Maciej Besta
Cross-layer Hardware/Software Techniques to Enable Powerful Computation and Memory Optimizations [Prof. Onur Mutlu]
Cross-layer techniques provide expressive interfaces to transfer semantic information about applications, in order to improve performance, energy efficiency, security, QoS, and many more properties of computing systems.
Key achievements:
- Leveraging underutilized cache resources to accelerate address translation
- Employing Hybrid Address Mappings for Efficient Address Translation (accepted at MICRO 2023)
- Development of an open-source simulation framework for memory management and virtual memory research, which includes a cross-layer interface to associate application metadata with memory regions
Design of ExG-glasses for Brain-Computer-Interfaces [Prof. Luca Benini]
The project targets the development of inconspicuous smart glasses for recording of EOG (ocular) and EEG (brain) signals with a fully-dry setup, coupled with onboard ultra low-power processing capabilities.
Key achievements:
- first prototype designed and presented at Tokyo Wearable Expo
- demo of EOG-based speller
Contact
Andrea Cossettini
Research Grants
Internal research projects supported by an EFCL grant
Unified Management of Address Spaces and Files and Its Implications on Security and Performance [Prof. K. Razavi]
File systems are riddled with critical security vulnerabilities, particularly when it comes to the management of i-nodes. Can we tackle these problems by taking advantage of recent storage technologies? This project explores new designs that improve the performance, greatly simplify the design, and strengthen the security of future file systems.
Key achievements:
- built the fastest PM file system on the planet
- made a submission to the OSDI conference based on the design/implementation and the results.
Contact
Prof. Kaveh Razavi
Scalabel: Distributed Human Machine Collaboration System for Visual Data Annotation [Prof. F. Yu]
Large-scale data annotation has become the fuel for modern machine learning applications in industry. However, the success of the paradigm and the hunger for annotated data also incur significant labor and engineering costs. This project proposes Scalabel, a large-scale human-machine collaboration platform for visual data annotation.
Key achievements:
- Redesigned the whole system framework and user interfaces based on the latest technologies. New interface elements such as brushes and label cards were added to improve the flexibility and efficiency of the user interface.
- On the backend, the system supports the latest segmentation and tracking models to accelerate visual data labeling. Users can also plug in their own models.
- Released a working system with a user-friendly interface and integration with various deep learning models such as SAM at https://github.com/SysCV/nutsh.
- Released well-maintained documentation at https://nutsh.ai/docs.
Contact
Prof. Fisher Yu
Proving Properties to Improve HLS-Produced Circuits [Prof. L. Josipovic]
The goal of this project is to use formal methods (e.g., BDD-based reachability analysis, induction-based proofs) to systematically simplify dataflow circuits obtained from high-level code (e.g., C/C++) and advance the area and timing efficiency of their hardware implementations.
Quadrupedal Robot for Visually Impaired People Assistance [Dr. M. Magno]
The project will be based on the Unitree A1 quadrupedal robot, involving both the aspect of autonomous robot navigation and Human Robot Interaction with the visually impaired person. The goal is to reach autonomous navigation in a dynamic indoor environment, with the possibility of extending the operating range to outdoor environments.
Key achievements:
- Created a new Bachelor course: 227-0085-58L Projekte & Seminare: Autonomous Cars and Robots
- Various interviews, including international exposure (Reuters, ETH news, RSI https://www.youtube.com/watch?v=oyYWoCH7ij0)
- Participation in Scientifica 2023
- First complete autonomous navigation assistance demo
Practical energy savings with rate adaptation in today’s computer networks [Prof. L. Vanbever]
We have known for years that modulating the available capacity of a computer network (the number of Gbps it can transport) can save a lot of energy… in theory. This project aims to materialize such savings by filling the gap between theory and practice. This requires adjusting for the limitations of today’s networking hardware and designing rate adaptation controllers that integrate smoothly with existing networking protocols.
Contact
Romain Jacob
Blended Projects
Internal research projects with multiple PIs supported by an EFCL grant
Ultrasound Image Data Recycler [Prof. L. Benini & Prof. L. Van Gool]
In this project, we are developing a physically informed neural network architecture that is able to convert ultrasound images back into raw frequency data. In order to provide doctors with a human-readable image for their diagnoses, this step is usually carried out in reverse order, whereby the more comprehensive raw data is lost. However, novel wearable ultrasound devices need to be trained on raw data sets so that the networks can operate at the extreme edge. Therefore, we aim to develop a model that can perform robust back conversion regardless of vendor type, anatomical structure, or subjects,...
PIM Acceleration of Nanopore Raw-signal-based Genome Analysis [Prof. O. Mutlu & Prof. T. Jang]
Nanopore sequencing is a widely utilized, high-throughput, and low-cost genome sequencing technology, which is capable of sequencing long genome fragments into raw electrical signals. Our goal for this project is to enable real-time analysis for the sequencing of multiple nanopores by using energy-efficient and highly-parallel processing-in-memory techniques.
Key achievement:
- Accelerated raw signal analysis by codesigning the algorithm and the PIM-based architecture
Contact
Haiyu Mao
Student projects
Smaller projects based on pre-PhD research supported by an EFCL grant
Efficient Smart Edge Computing for Controlling Unmanned Aerial Vehicles using a Brain–Machine Interface [Prof. L. Benini]
In this project, we develop miniaturized, comfortable, and non-stigmatizing BMIs based on dry EEG electrodes to decode users’ intentions in real time. We use the BMI paradigm of motor movement and/or imagery, i.e., the subject moves or imagines the movement of a body part while the BMI device collects EEG data and decodes the subject’s intention. Moreover, the BMI device features on-board processing capabilities using an open-source Parallel Ultra-Low-Power platform based on the RISC-V instruction set architecture. The goal is to acquire EEG data, process it locally at the edge in real time, and finally send out the command to control a flying drone.
Key achievements:
- Shorter training & calibration time; continuous adaptation to new sessions with minimal resources
- Inference time of only 6 ms
- Energy as low as 30 µJ/inference
- Total system power envelope: 8 mW when performing inference every 100 ms; 30 h of inference operation on a 65 mAh battery
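The reported power and battery figures can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: the 3.7 V nominal battery voltage is an assumption (typical for a single-cell Li-Po battery, not stated in the project description), and the function names are illustrative.

```python
# Back-of-the-envelope check of the reported energy and battery figures.
# Assumption (not from the project description): 3.7 V nominal battery voltage.

ENERGY_PER_INFERENCE_J = 30e-6  # 30 uJ per inference (reported best case)
INFERENCE_PERIOD_S = 0.100      # one inference every 100 ms (reported)
SYSTEM_POWER_W = 8e-3           # 8 mW total system power envelope (reported)
BATTERY_CAPACITY_AH = 65e-3     # 65 mAh battery (reported)
NOMINAL_VOLTAGE_V = 3.7         # assumed nominal voltage


def average_power_w(energy_j: float, period_s: float) -> float:
    """Average power of a task that spends energy_j once every period_s."""
    return energy_j / period_s


def battery_life_h(capacity_ah: float, voltage_v: float, power_w: float) -> float:
    """Idealized runtime in hours: stored energy (Wh) divided by power draw (W)."""
    return capacity_ah * voltage_v / power_w


inference_power_w = average_power_w(ENERGY_PER_INFERENCE_J, INFERENCE_PERIOD_S)
runtime_h = battery_life_h(BATTERY_CAPACITY_AH, NOMINAL_VOLTAGE_V, SYSTEM_POWER_W)

print(f"Average inference power: {inference_power_w * 1e3:.2f} mW")
print(f"Estimated runtime: {runtime_h:.1f} h")
```

Under these assumptions, inference itself accounts for only about 0.3 mW of the 8 mW envelope, and the idealized runtime comes out at roughly 30 h, consistent with the reported figure. The estimate ignores converter losses and battery derating.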
Contact
Xiaying Wang
Improving multi-row activation in off-the-shelf DRAM chips for in-memory operations [Prof. O. Mutlu]
We observe that off-the-shelf DRAM chips are capable of simultaneously activating up to 32 rows, which enables us to achieve high reliability and performance in Processing-using-DRAM operations.
Key achievement:
- Simultaneous Many-Row Activation in DRAM: Experimental Analysis and Characterization of Real DRAM Chips