GSoD 2024 Project Proposal - STEllAR-GROUP/hpx GitHub Wiki

HPX Education Initiative - STE||AR Group

About HPX

HPX is a C++ standard-conforming library for concurrency and parallelism, managed by The STE||AR Group. It implements all the APIs and facilities defined by the C++ Standard and extends them to distributed computing. Additionally, the implementation covers functionality proposed in the ongoing C++ standardization process. Beyond aligning with the C++11/14/17/20/23 ISO standards, the HPX API also follows the programming guidelines of the Boost collection of C++ libraries.

The STE||AR Group aims to develop high-quality, freely available, open-source programming models suitable for all scales, from Raspberry Pis and multi-socket SMP nodes to Beowulf clusters with thousands of nodes. The HPX runtime system's development is application-driven, ensuring API robustness and stability. Engaging with application developers has facilitated a smooth transition for users to this new system.

The HPX community focuses on enhancing the scalability of current applications and unlocking the potential of future systems by exposing parallelism at all levels, from threads to tasks.

About HPX Documentation

Project’s current issues

HPX is a popular library that garners the attention of professionals, researchers, and students alike. However, the lack of educational resources and the complexity of the HPX library pose several challenges for our users. Because HPX is a sophisticated library with advanced features, newcomers frequently find it difficult to grasp its advanced concepts and best practices. Structured educational content can flatten the learning curve for using HPX effectively.

We propose creating a pilot open-source educational program that provides a range of learning materials, including video tutorials, documentation, blog posts, and an AI-powered tutor. This program aims to promote modern and safe programming practices, especially in high-performance computing (HPC) settings, by utilizing the HPX library's APIs. It seeks to educate and empower users and developers with safe and modern programming models and practices.

Project’s scope

As part of our mission at the STE||AR Group, we are committed to (1) developing modern, safe, and scalable high-performance computing (HPC) software solutions, and (2) supporting community education by disseminating knowledge and developing skills. With the increasing demand for HPC, driven by the expanding availability of data and computational resources, we propose to launch an initiative to create a suite of modern, open-source, structured learning resources for the HPX community. This initiative will produce educational content in the form of video tutorials (example), user manuals, and blog posts. The following topics will be covered:

  1. Introduction to HPX: This aims at providing a foundational understanding of HPX, discussing its architecture, key features, and the rationale behind its design. The goal is to present HPX capabilities in the context of HPC.

  2. Getting Started with HPX: Targeting beginners and those new to HPX, this series will offer a step-by-step guide on setting up the HPX environment, installing dependencies, and executing simple applications. Our aim here is to lower the entry barrier for new users.

  3. Parallel Algorithms in HPX: Here, we will delve into various algorithms implemented in HPX and their use cases in real-world applications.

  4. Vectorization and SIMD in HPX: In this module, we will cover common vectorization techniques in HPC and introduce their corresponding HPX APIs.

  5. Futures and Asynchronous Programming: We will explore HPX's capabilities in asynchronous programming, detailing how futures and async functionalities can be leveraged to write efficient, non-blocking code. We will use real-world examples to demonstrate these concepts, providing users with practical use cases.

  6. Performance Analysis and Optimization: A crucial skill in HPC is the ability to analyze and optimize performance. This module will introduce tools and methodologies for profiling HPX applications, guiding users in identifying bottlenecks and optimizing code for better performance.

  7. Task Scheduling and Custom Executors: Understanding HPX's task scheduling and the creation of custom executors is vital for developing scalable applications. This module will provide in-depth explanations and examples to facilitate understanding of advanced HPX features for developing custom execution orders and strategies to exploit concurrency across tasks.

  8. HPX for Distributed Computing: Distributed computing is a cornerstone of HPC. This section will showcase HPX features that facilitate distributed programming, covering setup, configuration, and effective resource management in distributed applications.

  9. Migration Guides: For users transitioning from other HPC frameworks to HPX, this series will offer guidance and best practices to help adopt HPX.

  10. HPX in Practice: In this module, we will review open-source libraries and applications that use HPX to familiarize users with the HPX ecosystem.

Beyond content generation, we plan to improve the AI-powered tutor that we developed as a proof of concept. The goal is to refine this tool, enhancing its quality and accuracy. The proposed improvements will make it a more effective educational resource, furthering users' understanding of HPX in a dynamic and interactive manner.

As this is an extensive list of project ideas, we will consider hiring two technical writers to perform the work. Depending on their expertise and interests, we'll assign tasks accordingly. Our primary focus will be on the first five topics, prioritizing their execution. The remaining five topics will serve as stretch goals, offering additional opportunities for exploration and development as resources permit.

We believe the proposed work will benefit both HPX users and the HPC community in general. We will cover core ideas in high-performance computing and use the HPX API to demonstrate these concepts. By presenting the materials in video and blog post formats, we plan to accommodate different learning styles and preferences, ensuring that users can engage with the content in the most effective way for them. In short, we aim to empower developers, researchers, and students with the knowledge and skills necessary to leverage HPX in their HPC applications.

Measuring our project’s success

Last year, HPX received an average of 15 pull requests per month to add or update code or documentation. The majority of these pull requests came from previous contributors. We believe that this education initiative will result in more pull requests from new contributors. Since many of our active contributors began through either GSoD or GSoC, we appreciate GSoD's initiative to support open-source projects, and we believe this is a great opportunity for our project to grow.

We will use a survey (Q&A) to ask users and contributors how they feel about the updates to the documentation. Since the GSoC period overlaps with the anticipated GSoD project time, we believe this is a great opportunity to evaluate how quickly people familiarize themselves with HPX before and after the documentation changes.

We would consider the project successful if, after publication of the new documentation:

  • More than 80% of survey participants report being satisfied with the updates.
  • We see an increase in the number of pull requests, which would indicate that people learn more quickly how to use HPX and develop features for it.
  • We see newcomers being able to familiarize themselves with HPX without explicit guidance from an existing member.
  • Members of the group find it easier to use new features and have clear guidelines on how to structure new content and where to place it.

Timeline

The project itself will take approximately six months to complete. Once the technical writer is hired, we will spend a month on orientation, then move on to developing documentation, depending on the projects chosen by the technical writer. An example timeline is provided below.

Dates               | Action Items
May                 | Orientation
June - August       | Create video tutorials
September - October | Create documentation
November            | Project completion

Project budget

Item                                        | Amount | Running Total | Notes
Technical writer restructures documentation | 10000  | 10000         | 2 writers x 5000 each or 1 writer x 10000
Swag for volunteer proof-reading            | 1000   | 11000         | 2 volunteer stipends x 500 each
TOTAL                                       | 11000  |               |

Additional information

Previous experience with technical writers or documentation

Our mentors have extensive experience working with technical writers, and one of them is a technical writer. Our current technical writer joined us during GSoD 2020 and has continued to make significant contributions to our documentation. Over the years, we have made substantial improvements to the documentation, including updating the theme and layout to make it more user-friendly and readable, enhancing the API reference, and updating many pages of the Manual, Examples, and Quickstart guide.

To ensure a thorough review process, we hold regular meetings where we discuss the technical writer's progress and next steps. Additionally, our group forum is always available, and we have an active community that is eager to answer questions. Through this experience, we have learned how to effectively organize our work and mentor individuals while maintaining their interest.

Previous participation in Google Season of Docs and Google Summer of Code

In recent years, our project has been selected to participate in both the GSoC and GSoD programs, providing us with valuable experience and opportunities to improve the quality of our project and support its goals. This year, we have already been accepted for GSoC, and we have received a lot of interest from individuals who are eager to participate in our project. This experience will help us further improve our project this year during GSoD, should we be selected.