Lab 3 ‐ Code Generation - ftsrg-edu/ase-labs GitHub Wiki

Introduction

Building on the LSP and IDE developed in the last laboratory session, in this session you will implement the .dataspace to .ts code generator components that realize the modeled service chains.

The expected learning outcomes of this session are the following.

  • 🎯 You should have sufficient skill to extend partially implemented code generator components.
  • 🎯 You should have sufficient skill to extend the code generator component with new features.

Notation

Tip

The guide is annotated using the following notation.

Regular text: Information related to the laboratory.

💡 Extra insights about the current topic.

✏️ You should perform some specific exercise.

📋 CHECK: You should check something and write down your experience.

🔍 You should do some research to continue.

Suggested Reading

Starter Project

Since the last session, we have extended the starter project with the final version of the LSP server (grammar and validation) and a new generate command in the Cli application.

We have also added an example project called dataspace-example inside the root directory, which serves as an example for using the code generation capabilities.

Tasks

Running Example

The QS World University ranking is a well-known ranking of universities worldwide. This running example describes a service chain that connects an imagined University to this ranking service while respecting the Students' PII consent.

The system has multiple stakeholders. The University has a list of their Students and their personal information, but their grades are stored in the Moodle system. Moodle can map each student's grades. Since QSWorldUniversity does not (and should not) trust the University to calculate the aggregate data, we offload the task to a mutually agreed upon AnalyticsCompany. Finally, the aggregate results are sent to the QSWorldUniversity database.

flowchart TB
  A[University] -->|students| B(Moodle)
  B -->|student grades| C(AnalyticsCompany)
  C -->|aggregate| D[QSWorldUniversity]

PII concerns arise from the fact that the Students are the subjects of the University's student data, and they can decide whom they allow to process their personal data. In this example, the Students gave explicit permission to Moodle (and to the University, since it holds the data). Now, your job will be to create a domain-specific language that allows the specification of models such as this example while providing compliance validation checks to ensure PII consents are respected.

Task 0: Project Onboarding (0 points)

For this session, we added a new launch configuration called Run Generator. It builds the project and then executes the Cli application with the generator --configuration dataspace-example/data-space-config.json arguments. By default, if no --configuration flag is specified, the generator looks for a data-space-config.json file in the directory where it is executed; the launch configuration overrides this default to use the file in the dataspace-example directory.

The data-space-config.json file configures the input .dataspace model, what files the application will generate, and where they will be placed. The initial version specifies students.dataspace and gen as the model and output directories, both relative to the location of the configuration file.

✏️ Run the generator from your VSCode instance using the Run Generator launch configuration!

📋 CHECK: Some initial files should be generated into the dataspace-example/gen directory.

✏️ Now, open the dataspace-example folder in a new VSCode instance. Initialize the project with npm install, and check out the generated files.

The generated files should consist of the following.

dataspace-example/gen
├─ base-client.ts           # Contains the BaseClient class, which all *Client implementations will extend.
├─ base-server.ts           # Contains the BaseServer class, which all *Server implementations will extend.
├─ base-service-chain.ts    # Contains the BaseServiceChain class, which all service chain implementations will extend.
├─ schemas.ts               # Contains all Schema definitions generated from the model.
├─ stakeholders.ts          # Contains an interface for each Stakeholder implementation, defining its methods.
├─ clients.ts               # Contains all *Client implementations generated from the model, implementing their relevant Stakeholder interfaces.
├─ servers.ts               # Contains all *Server implementations generated from the model, forwarding the calls to the specified Stakeholder logic instance.
└─ service-chains.ts        # Contains all service chain implementations generated from the model.

  • The base-*.ts files have already been implemented, so you do not need to edit them. However, the generated files will use them, so you should familiarize yourself with them.
  • The stakeholders.ts and schemas.ts files are fully generated; their generators do not need to be implemented.
  • The clients.ts and servers.ts files are partially generated; you will need to extend their generators.
  • Finally, service-chains.ts is almost empty; you will need to implement the generator yourself.

Throughout the session, you will develop the code generator in the first VSCode instance and will only use the second to check out the generated files and, at the end, to run an example project.

✏️ To familiarize yourself with the generation process, place a breakpoint at the first line of the generate.ts#generateAction function and step through the generation process.

📋 CHECK: You should understand which functions are called, which functions you need to edit, and which functions you can take inspiration from.

Task 1: Client Generator (2 points)

In contrast with the template engines used in the practice sessions, we will use JavaScript's template literals (string interpolation) to generate our files, extended with some helper functions that simplify the creation of well-formatted code.

The two helpers used are the following: a tag function and a regular function.

expandToNode`
    export interface ${schema.name} {
        // TODO
    }
`

The expandToNode tag function takes the template literal specified between the '`' (backtick) characters and returns a GeneratorNode out of it. This node encodes the internal indentation information, which allows this literal to be indented consistently.

joinToNode(model.schemas, mapSchemaToInterface, { appendNewLineIfNotEmpty: true })

The joinToNode function takes a collection, a mapping function, and some options as arguments. The mapping function will be called for each element in the collection. The options can specify how to stringify the resulting GeneratorNode, e.g., appending new lines or inserting separators in between the strings.

Of course, the two functions can be nested inside each other as needed; see generate-schemas.ts and generate-stakeholders.ts for reference.
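💡 To see what such helpers do conceptually, here is a minimal sketch with plain-string stand-ins. The names dedent and joinLines are hypothetical and are not part of the starter project; unlike expandToNode and joinToNode, they return strings instead of GeneratorNodes and do not re-indent multi-line substitutions.

```typescript
// Hypothetical stand-in for expandToNode: a tag function that returns a plain
// string and strips the indentation shared by all lines of the template.
function dedent(strings: TemplateStringsArray, ...values: unknown[]): string {
    // Re-assemble the literal: static parts interleaved with the substitutions.
    const raw = strings.reduce(
        (acc, part, i) => acc + part + (i < values.length ? String(values[i]) : ''),
        ''
    );
    // Drop the blank lines that surround the template body.
    const lines = raw.replace(/^\s*\n/, '').replace(/\n\s*$/, '').split('\n');
    // The common indentation is the smallest indent among the non-empty lines.
    const indents = lines
        .filter(line => line.trim().length > 0)
        .map(line => line.length - line.trimStart().length);
    const common = indents.length > 0 ? Math.min(...indents) : 0;
    return lines.map(line => line.slice(common)).join('\n');
}

// Hypothetical stand-in for joinToNode: map every element and join the results.
function joinLines<T>(items: readonly T[], toText: (item: T) => string): string {
    return items.map(toText).join('\n');
}

// Simplified schema shape, assumed for illustration only.
interface SchemaSketch { name: string }

const toInterface = (schema: SchemaSketch): string => dedent`
    export interface ${schema.name} {
        // TODO
    }
`;

// Composition: the mapping function passed to joinLines uses the tag function.
const generatedInterfaces = joinLines(
    [{ name: 'StudentData' }, { name: 'StudentGrade' }],
    toInterface
);
```

The template can stay indented to match the surrounding generator code, while the produced text starts at column zero.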

Tip

You can insert backticks using the Alt Gr + 7 combination!


The first pieces of code to generate are the *Client classes. Each *Client class should implement its relevant Stakeholder interface by connecting to an HTTP server using a specific URL and endpoint. To increase code reuse, each class extends the BaseClient class, which defines the getMethod and postMethod methods, simplifying the implementation.

✏️ Extend the generate-client.ts file at the // TODO: comments according to the following reference clients.ts output file.

/* eslint-disable */
import { BaseClient } from './base-client.js';

import * as schemas from './schemas.js';
import * as stakeholders from './stakeholders.js';

export class UniversityClient extends BaseClient implements stakeholders.University {
    students(): Promise<schemas.StudentData[]> {
        return this.getMethod('students');
    }
}
export class StudentClient extends BaseClient implements stakeholders.Student {
}
export class MoodleClient extends BaseClient implements stakeholders.Moodle {
    calculateGrades(input: schemas.StudentData[]): Promise<schemas.StudentGrade[]> {
        return this.postMethod('calculateGrades', input);
    }
}
export class AnalyticsCompanyClient extends BaseClient implements stakeholders.AnalyticsCompany {
    calculateAggregate(input: schemas.PseudonymizedStudentGrade[]): Promise<schemas.StudentAggregate[]> {
        return this.postMethod('calculateAggregate', input);
    }
}
export class QSWorldUniversityClient extends BaseClient implements stakeholders.QSWorldUniversity {
    recordAggregate(input: schemas.StudentAggregate[]): Promise<void> {
        return this.postMethod('recordAggregate', input);
    }
}

Data sets are implemented using the HTTP GET method, while services and consumes are implemented using POST. Use the proper input and output types according to the generated schemas.ts and stakeholders.ts files!
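💡 The GET/POST rule above can be sketched as a small mapping function. The MemberSketch type is a hypothetical, simplified view of a stakeholder member; the AST types in the starter project will differ, and the real generator should use expandToNode rather than plain strings.

```typescript
// Hypothetical, simplified view of a stakeholder member (not the real AST).
type MemberSketch =
    | { kind: 'dataSet'; name: string; output: string }
    | { kind: 'service'; name: string; input: string; output: string }
    | { kind: 'consume'; name: string; input: string };

// Produces the text of one client method, following the reference output.
function clientMethod(member: MemberSketch): string {
    if (member.kind === 'dataSet') {
        // Data sets take no input and are fetched with GET.
        return [
            `${member.name}(): Promise<schemas.${member.output}[]> {`,
            `    return this.getMethod('${member.name}');`,
            `}`,
        ].join('\n');
    }
    // Services return data, consumes return nothing; both are sent with POST.
    const returnType = member.kind === 'service' ? `schemas.${member.output}[]` : 'void';
    return [
        `${member.name}(input: schemas.${member.input}[]): Promise<${returnType}> {`,
        `    return this.postMethod('${member.name}', input);`,
        `}`,
    ].join('\n');
}
```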

📋 CHECK: Rerun the generator. The generated clients.ts file should be similar to the provided reference.

💡 Note how you don't need to worry about name clashes across the generated classes and their methods! This is why we needed all those naming convention validation rules, as they ensure the generated code will also be correct.

Task 2: Server Generator (3 points)

The next task is to generate the corresponding Server classes. The *Server classes will be the counterparts to the generated *Client classes. Similarly to the BaseClient class, the BaseServer class is implemented to abstract away the setup of the HTTP server, simplifying the generated classes.

The BaseServer class has a registerEndpoints(app: Express): void method, which child classes should implement to register their own routes. The routes should be the same as those used by the *Client classes.

class SomeServer extends BaseServer {
    registerEndpoints(app: Express): void {
        app.get('/students', async (req, res) => {
            const result = await this.logic.students();
            res.json(result);
        });
        app.post('/calculateAggregate', async (req, res) => {
            const result = await this.logic.calculateAggregate(req.body);
            res.json(result);
        });
    }
}

To allow the customization of the actual logic behind the server implementations, they take an instance of the associated Stakeholder interface as a parameter (called logic). Make sure to forward the actual calls to the logic property!
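💡 The forwarding idea can be tried out without any HTTP setup. The sketch below uses hypothetical names (UniversitySketch, UniversityServerSketch) and a plain method in place of an Express route handler; it only illustrates that the server depends on the interface, not on a concrete implementation.

```typescript
// Hypothetical stakeholder interface, in the style of stakeholders.ts.
interface UniversitySketch {
    students(): Promise<{ name: string }[]>;
}

// The "server" depends only on the interface; the concrete logic is passed in.
class UniversityServerSketch {
    private readonly logic: UniversitySketch;

    constructor(logic: UniversitySketch) {
        this.logic = logic;
    }

    // Stand-in for a route handler: forward the call and return the result.
    async handleStudents(): Promise<{ name: string }[]> {
        return this.logic.students();
    }
}

// A fixed in-memory implementation, e.g. for trying out the server in isolation.
const fixedLogic: UniversitySketch = {
    students: () => Promise.resolve([{ name: 'Alice Smith' }]),
};
const sketchServer = new UniversityServerSketch(fixedLogic);
```

Swapping fixedLogic for another object that satisfies the interface changes the behavior without touching the server class.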

💡 This pattern is called "dependency inversion" and can be simplified using "dependency injection" technologies. We are not using any DI tools in this session.

✏️ Extend the generate-server.ts file at the // TODO: implement mapStakeholderToClass! line to generate server implementations compatible with the Client classes. Make sure to forward each call to the logic instance's appropriate method!

📋 CHECK: Rerun the generator. The generated servers.ts file should be compatible with the generated client classes!

Task 3: Service Chain Generator (4 points)

Finally, service chain classes put all of the above together and orchestrate the data transfer between the stakeholders. The generated service chain classes take one instance of each used Stakeholder interface and call their respective functions with the last output as a parameter.

The BaseServiceChain class has an abstract execute method, which is overridden in each service chain class. It also implements a pseudonymize method, which can generate randomized IDs for specific PII values, ensuring the same ID is returned for the same PII value.
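💡 The pseudonymization contract (same PII value in, same randomized ID out) can be sketched with a lookup table. This is not the BaseServiceChain implementation; the class name and the ID format are assumptions made for illustration.

```typescript
// Sketch of a pseudonymizer: within one instance, the same PII value always
// maps to the same randomized ID, and different values get different IDs.
class PseudonymizerSketch {
    private readonly ids = new Map<string, string>();

    pseudonymize(value: string): string {
        let id = this.ids.get(value);
        if (id === undefined) {
            // The counter prefix keeps IDs unique; the random suffix hides ordering
            // details only loosely. Math.random is NOT cryptographically secure.
            id = `p${this.ids.size}-${Math.random().toString(36).slice(2, 10)}`;
            this.ids.set(value, id);
        }
        return id;
    }
}
```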

The partial generator implementation generates the execute method implementation, calling stepN() methods one by one and forwarding their outputs to the next step.

✏️ Extend the generate-service-chains.ts file at the // TODO: implement step method generator! line to generate the stepN() methods with the correct signature (parameter, return type) as used in the execute method! Make sure to handle the valueMapping of the step if it exists!

Tip

You should first try to implement the service-chains.ts file inside the dataspace-example directory and then write the actual generator implementation. Decide how you will handle the value mapping between steps and what types you will use as parameters and return types.
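💡 As a starting point for such a hand-written experiment, here is one possible shape of a two-step chain. All types, the step bodies, and the value mapping (studentId replaced by a pseudonymized id) are hypothetical; the classes generated from your model will differ.

```typescript
// Hypothetical schemas and stakeholder interfaces for a two-step chain.
interface GradeSketch { studentId: string; finalGrade: number }
interface PseudonymizedGradeSketch { id: string; finalGrade: number }
interface SourceSketch { grades(): Promise<GradeSketch[]> }
interface SinkSketch { record(input: PseudonymizedGradeSketch[]): Promise<void> }

class ChainSketch {
    private readonly source: SourceSketch;
    private readonly sink: SinkSketch;
    private readonly pseudonymize: (value: string) => string;

    constructor(source: SourceSketch, sink: SinkSketch, pseudonymize: (value: string) => string) {
        this.source = source;
        this.sink = sink;
        this.pseudonymize = pseudonymize;
    }

    // Calls the steps one by one, forwarding each output to the next step.
    async execute(): Promise<void> {
        const output1 = await this.step1();
        await this.step2(output1);
    }

    private step1(): Promise<GradeSketch[]> {
        return this.source.grades();
    }

    private async step2(input: GradeSketch[]): Promise<void> {
        // Value mapping: convert the previous output into the type the next
        // stakeholder expects, dropping or pseudonymizing PII fields on the way.
        const mapped = input.map(grade => ({
            id: this.pseudonymize(grade.studentId),
            finalGrade: grade.finalGrade,
        }));
        await this.sink.record(mapped);
    }
}
```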

📋 CHECK: Run the generator. The generated service chain implementation should be valid TypeScript code that passes the required data between the stakeholders, respecting the value mappings defined in the model!

Task 4: Logic Implementation (1 point)

Now that the generator components are implemented, it is time to try out the generated code!

✏️ To do so, open your dataspace-example VSCode instance.

The folder has a similar preconfigured structure, with the package.json file containing the necessary scripts to compile, build, and run the examples. The dataspace script executes the Dataspace Cli generator (from the ../dist/cli/main.cjs file) to generate the TypeScript code. To use it, ensure that the root project is built.

💡 In real-life projects, the generator component should be published with specific versions to ensure loose coupling between the generator's implementation and the user project.

✏️ Try to build the project using npm run build.

You may have noticed that no .js file has been created. The reason is that the project contains no entry points and no code that uses the generated infrastructure.

How do we solve this issue? One idea could be to generate the runner code as well, but that would eliminate our customization ability. The other could be to implement the code manually, but we can do better than that.

With code generators, one interesting use case is stub generation. Stubs are basically initial implementations of user code based on some configuration. The benefit of using stubs is that they are only generated if the given file does not exist, meaning our manual modifications will remain in the code. You may remember this pattern as the "Generation Gap" pattern from the lectures.
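💡 The difference between regular generated files and stubs comes down to one existence check. The sketch below uses a hypothetical FileStoreSketch abstraction so that it runs without touching the disk; in Node.js it could be backed by existsSync and writeFileSync from node:fs. It is an illustration, not the starter project's implementation.

```typescript
// Minimal file-store abstraction, assumed for illustration.
interface FileStoreSketch {
    exists(path: string): boolean;
    write(path: string, content: string): void;
}

// Regular generated files are overwritten on every generator run.
function writeGenerated(store: FileStoreSketch, path: string, content: string): void {
    store.write(path, content);
}

// Stubs are written only if the file is missing, so manual edits survive.
// Returns true if the stub was created, false if an existing file was kept.
function writeStub(store: FileStoreSketch, path: string, content: string): boolean {
    if (store.exists(path)) {
        return false;
    }
    store.write(path, content);
    return true;
}

// In-memory store to try out the behavior.
class InMemoryStoreSketch implements FileStoreSketch {
    readonly files = new Map<string, string>();
    exists(path: string): boolean { return this.files.has(path); }
    write(path: string, content: string): void { this.files.set(path, content); }
}
```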

✏️ To enable stub generation, extend the data-space-config.json file with the following settings.

{
    "modelPath": "students.dataspace",
    "genPath": "gen",
    "stubs": {
        "srcPath": "src",
        "serverRunnerStubs": {
            "Moodle": {
                "defaultPort": 3001
            },
            "University": {
                "defaultPort": 3002
            },
            "AnalyticsCompany": {
                "defaultPort": 3003
            },
            "QSWorldUniversity": {
                "defaultPort": 3004
            }
        },
        "serviceChainStubs": {
            "CollectUniversityRankingData": {
                "clients": {
                    "Moodle": {
                        "defaultUrl": "http://localhost:3001"
                    },
                    "University": {
                        "defaultUrl": "http://localhost:3002"
                    },
                    "AnalyticsCompany": {
                        "defaultUrl": "http://localhost:3003"
                    },
                    "QSWorldUniversity": {
                        "defaultUrl": "http://localhost:3004"
                    }
                }
            }
        }
    }
}

This configuration allows stub code generation into the src directory and configures the server runner and service chain stubs to be generated. Some default ports and URLs are also defined, making it easier to extend and customize it in the future.
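💡 The shape of this configuration can be written down as a TypeScript type, which also makes simple consistency checks possible. The type and the findPortMismatches helper below are hypothetical and derived only from the JSON shown above; the helper assumes that each defaultUrl ends with the port number and has no path.

```typescript
// Hypothetical type mirroring the JSON configuration shown above.
interface DataSpaceConfigSketch {
    modelPath: string;
    genPath: string;
    stubs?: {
        srcPath: string;
        serverRunnerStubs?: Record<string, { defaultPort: number }>;
        serviceChainStubs?: Record<string, { clients: Record<string, { defaultUrl: string }> }>;
    };
}

// Reports clients whose default URL does not point at the default port of the
// server runner configured for the same stakeholder.
function findPortMismatches(config: DataSpaceConfigSketch): string[] {
    const servers = config.stubs?.serverRunnerStubs ?? {};
    const chains = config.stubs?.serviceChainStubs ?? {};
    const problems: string[] = [];
    for (const [chainName, chain] of Object.entries(chains)) {
        for (const [stakeholder, client] of Object.entries(chain.clients)) {
            const server = servers[stakeholder];
            if (server !== undefined && !client.defaultUrl.endsWith(`:${server.defaultPort}`)) {
                problems.push(`${chainName}: ${stakeholder} client does not target port ${server.defaultPort}`);
            }
        }
    }
    return problems;
}
```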

✏️ To include the stubs in the build process, uncomment the following lines in esbuild.mjs.

    // entryPoints: [
    //     'src/service-chain/*.ts',
    //     'src/server-runner/*.ts',
    // ],

📋 CHECK: Run npm run build again. This time, you should see the initial logic, server runner, and service chain runner files generated and their corresponding .js files in the dist directory.

✏️ Finally, customize the generated logic classes with the following implementations.

export class AnalyticsCompanyLogic implements stakeholders.AnalyticsCompany {
    calculateAggregate(input: schemas.PseudonymizedStudentGrade[]): Promise<schemas.StudentAggregate[]> {
        console.log('Logging received student grades:', JSON.stringify(input, null, 2));

        const groups = new Map<string, {
            id: string;
            year: number;
            term: string;
            sumCredits: number;
            sumWeightedGrades: number;
        }>();

        for (const grade of input) {
            const key = `${grade.id}-${grade.year}-${grade.term}`;
            let group = groups.get(key);
            if (!group) {
                group = {
                    id: grade.id,
                    year: grade.year,
                    term: grade.term,
                    sumCredits: 0,
                    sumWeightedGrades: 0,
                };
                groups.set(key, group);
            }

            group.sumCredits += grade.credits;

            if (grade.signature && grade.finalGrade >= 2) {
                group.sumWeightedGrades += grade.credits * grade.finalGrade;
            }
        }

        const aggregates: schemas.StudentAggregate[] = Array.from(groups.values()).map(group => ({
            id: group.id,
            year: group.year,
            term: group.term,
            credits: group.sumCredits,
            average: group.sumCredits > 0 ? group.sumWeightedGrades / group.sumCredits : 0,
            correctedAverage: group.sumWeightedGrades / 30,
        }));

        return Promise.resolve(aggregates);
    }
}
export class MoodleLogic implements stakeholders.Moodle {
    calculateGrades(input: schemas.StudentData[]): Promise<schemas.StudentGrade[]> {
        console.log('Logging received student data:', JSON.stringify(input, null, 2));
        const grades = input.flatMap(student => [{
            name: student.name,
            studentId: student.studentId,
            year: 2023,
            term: 'Fall',
            courseName: 'Mathematics',
            credits: 4,
            signature: true,
            finalGrade: 85
        }, {
            name: student.name,
            studentId: student.studentId,
            year: 2023,
            term: 'Fall',
            courseName: 'Physics',
            credits: 3,
            signature: true,
            finalGrade: 78
        }, {
            name: student.name,
            studentId: student.studentId,
            year: 2023,
            term: 'Spring',
            courseName: 'Chemistry',
            credits: 5,
            signature: true,
            finalGrade: 92
        }, {
            name: student.name,
            studentId: student.studentId,
            year: 2024,
            term: 'Fall',
            courseName: 'Biology',
            credits: 4,
            signature: false,
            finalGrade: 45
        }]);
        return Promise.resolve(grades);
    }
}
export class QSWorldUniversityLogic implements stakeholders.QSWorldUniversity {
    recordAggregate(input: schemas.StudentAggregate[]): Promise<void> {
        console.log('Logging received aggregated data:', JSON.stringify(input, null, 2));
        return Promise.resolve();
    }
}
export class UniversityLogic implements stakeholders.University {
    students(): Promise<schemas.StudentData[]> {
        console.log('Logging sending Student data.');
        return Promise.resolve([
            { name: 'Alice Smith', studentId: '1', age: 20 },
            { name: 'Bob Johnson', studentId: '2', age: 22 },
            { name: 'Charlie Brown', studentId: '3', age: 21 }
        ]);
    }
}

✏️ In a separate terminal, run npm run servers to start the 4 servers on the specified ports. In another terminal, run npm run servicechain to execute the service chain.

📋 CHECK: Check the logs of the servers and the service chain. Ensure that each step was executed successfully and that no PII data was passed on to non-consented stakeholders!

Extra Task: OpenAPI Generator (3 IMSc points)

Our code generator can now generate ExpressJS client and server code in TypeScript. However, this locks us into this particular technology stack; to target the JVM or other platforms, we would have to reimplement the generator for each of them.

Luckily, there are open technologies out there that allow the definition of platform-independent HTTP APIs, and the generation of specific server or client implementations. One major benefit of using such a technology is that the actual logic does not depend on us, and we can inherit the development efforts of the community.

To build the definition, you can use the openapi-types npm package. OpenAPI definitions are usually written in YAML, but since YAML is a superset of JSON, you can output a JSON file as well.
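💡 As a rough idea of the target structure, the sketch below builds an OpenAPI 3.0 document as a plain object for GET-style data set endpoints. The DataSetSketch type, the function name, and the example StudentData properties are assumptions; in your solution, the openapi-types package can provide proper typings, and services and consumes still need POST operations.

```typescript
// Hypothetical, simplified view of a data set: endpoint name and element schema.
interface DataSetSketch { name: string; schema: string }

// Builds a minimal OpenAPI 3.0 document with one GET operation per data set.
function buildOpenApiSketch(
    title: string,
    dataSets: readonly DataSetSketch[],
    schemas: Record<string, unknown>
) {
    const paths: Record<string, unknown> = {};
    for (const dataSet of dataSets) {
        paths[`/${dataSet.name}`] = {
            get: {
                operationId: dataSet.name,
                responses: {
                    '200': {
                        description: `All ${dataSet.name} entries`,
                        content: {
                            'application/json': {
                                schema: {
                                    type: 'array',
                                    items: { $ref: `#/components/schemas/${dataSet.schema}` },
                                },
                            },
                        },
                    },
                },
            },
        };
    }
    return {
        openapi: '3.0.3',
        info: { title, version: '1.0.0' },
        paths,
        components: { schemas },
    };
}

// Example input; the property types are guesses based on the sample data.
const openApiSketch = buildOpenApiSketch(
    'University',
    [{ name: 'students', schema: 'StudentData' }],
    {
        StudentData: {
            type: 'object',
            properties: {
                name: { type: 'string' },
                studentId: { type: 'string' },
                age: { type: 'integer' },
            },
        },
    }
);

// JSON output is valid YAML as well, so it can be written to a file as-is.
const openApiJson = JSON.stringify(openApiSketch, null, 2);
```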

✏️ Generate an OpenAPI definition out of the .dataspace model that can be used to generate client and server code for various platforms.

📋 CHECK: Try generating server and client code for some platform, e.g., the JVM. https://openapi-generator.tech/docs/generators/java/
