Lab 3 ‐ Code Generation - ftsrg-edu/ase-labs GitHub Wiki
Continuing on the LSP and IDE developed in the last laboratory session, you will implement the .dataspace to .ts code generator components to realize the modeled service chains in this session.
The expected learning outcomes of this session are the following.
- 🎯 You should have sufficient skill to extend partially implemented code generator components.
- 🎯 You should have sufficient skill to extend the code generator component with new features.
Tip
The guide is annotated using the following notation.
Regular text: Information related to the laboratory.
💡 Extra insights about the current topic.
✏️ You should perform some specific exercise.
📋 CHECK: You should check something and write down your experience.
🔍 You should do some research to continue.
- Related practice material: Practice 6 ‐ Code Generation
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
Since the last session, we have extended the starter project with the final version of the LSP server (grammar and validation) and a new generate command in the Cli application.
We have also added an example project called dataspace-example inside the root directory, which serves as an example for using the code generation capabilities.
The QS World University ranking is a well-known ranking of universities worldwide. This running example describes a service chain that connects an imagined University to this ranking service while respecting the Students' PII consent.
The system has multiple stakeholders. The University has a list of their Students and their personal information, but their grades are stored in the Moodle system. Moodle can map each student's grades. Since QSWorldUniversity does not (and should not) trust the University to calculate the aggregate data, we offload the task to a mutually agreed upon AnalyticsCompany. Finally, the aggregate results are sent to the QSWorldUniversity database.
flowchart TB
A[University] -->|students| B(Moodle)
B -->|student grades| C(AnalyticsCompany)
C -->|aggregate| D[QSWorldUniversity]
PII concerns arise from the fact that Students are the subjects of the University's student data, and they can decide whom they allow to process their personal data. In this example, the Students gave explicit permission to Moodle (and University, since they hold the data). Now, your job will be to create a domain-specific language that allows the specification of models such as this example while providing compliance validation checks to ensure PII consents are respected.
For this session, we added a new launch configuration called Run Generator. It builds the project, and then executes the Cli application with the generator --configuration dataspace-example/data-space-config.json arguments. By default, if no --configuration flag is specified, the generator looks for a data-space-config.json file in the directory where it is executed. However, it is overridden to use the one in the dataspace-example directory.
The data-space-config.json file configures the input .dataspace model, what files the application will generate, and where they will be placed. The initial version specifies students.dataspace and gen as the model and output directories, both relative to the location of the configuration file.
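The relative-path rule can be sketched as follows. This is only an illustration, not the starter project's code: the DataSpaceConfig interface and the resolveConfigPaths function are hypothetical names, and the sketch assumes only the modelPath and genPath keys that appear in the configuration file shown later in this guide.

```typescript
import * as path from 'node:path';

// Hypothetical shape of the two settings described above.
interface DataSpaceConfig {
    modelPath: string;
    genPath: string;
}

// Resolve both settings against the directory that contains the configuration file.
// path.posix is used only to keep the result identical on every operating system.
function resolveConfigPaths(configFile: string, config: DataSpaceConfig) {
    const baseDir = path.posix.dirname(configFile);
    return {
        model: path.posix.resolve(baseDir, config.modelPath),
        gen: path.posix.resolve(baseDir, config.genPath),
    };
}

const resolved = resolveConfigPaths(
    '/work/dataspace-example/data-space-config.json',
    { modelPath: 'students.dataspace', genPath: 'gen' },
);
// resolved.model is '/work/dataspace-example/students.dataspace'
// resolved.gen is '/work/dataspace-example/gen'
```

Resolving against the configuration file rather than the working directory means the same file gives the same result regardless of where the Cli application is started.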
✏️ Run the generator from your VSCode instance using the Run Generator launch configuration!
📋 CHECK: Some initial files should be generated into the dataspace-example/gen directory.
✏️ Now, open the dataspace-example folder in a new VSCode instance. Initialize the project with npm install, and check out the generated files.
The generated files should consist of the following.
dataspace-example/gen
├─ base-client.ts # Contains the BaseClient class, which all *Client implementations will extend.
├─ base-server.ts # Contains the BaseServer class, which all *Server implementations will extend.
├─ base-service-chain.ts # Contains the BaseServiceChain class, which all service chain implementations will extend.
├─ schemas.ts # Contains all Schema definitions generated from the model.
├─ stakeholders.ts # Contains an interface for each Stakeholder implementation, defining its methods.
├─ clients.ts # Contains all *Client implementations generated from the model, implementing their relevant Stakeholder interfaces.
├─ servers.ts # Contains all *Server implementations generated from the model, forwarding the calls to the specified Stakeholder logic instance.
└─ service-chains.ts # Contains all service chain implementations generated from the model.
- The base-*.ts files have already been implemented, so you do not need to edit them. However, the generated files will use them, so you should familiarize yourself with them.
- The stakeholders.ts and schemas.ts files are fully generated; their generators do not need to be implemented.
- The clients.ts and servers.ts files are partially generated; you will need to extend their generators.
- Finally, service-chains.ts is almost empty; you will need to implement the generator yourself.
Throughout the session, you will develop the code generator in the first VSCode instance and will only use the second to check out the generated files and, at the end, to run an example project.
✏️ To familiarize yourself with the generation process, place a breakpoint at the first line of the generate.ts#generateAction function and step through the generation process.
📋 CHECK: You should understand which functions are called, which functions you need to edit, and which functions you can take inspiration from.
In contrast with the template engines used in the practice sessions, we will use JavaScript's template literals (string interpolation) to generate our files, extended with helper functions created to simplify the creation of well-formatted code.
The two helpers used are the following: a tag function and a regular function.
expandToNode`
    export interface ${schema.name} {
        // TODO
    }
`
The expandToNode tag function takes the template literal specified between the '`' (backtick) characters and returns a GeneratorNode out of it. This node encodes the internal indentation information, which allows this literal to be indented consistently.
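To see what a tag function does in general, here is a minimal, self-contained sketch. The dedent function below is a hypothetical stand-in written for this illustration; it is not the implementation of expandToNode and returns a plain string instead of a GeneratorNode. It only shows how a tag receives the literal chunks and the interpolated values, and how the common leading indentation can be stripped.

```typescript
// Hypothetical tag function: joins the template parts and removes common indentation.
function dedent(strings: TemplateStringsArray, ...values: unknown[]): string {
    // Interleave the literal chunks with the stringified interpolated values.
    const raw = strings.reduce(
        (acc, chunk, i) => acc + chunk + (i < values.length ? String(values[i]) : ''),
        '',
    );
    const lines = raw.split('\n');
    // Drop the blank first and last lines caused by the placement of the backticks.
    if (lines[0]?.trim() === '') lines.shift();
    if (lines[lines.length - 1]?.trim() === '') lines.pop();
    // Find the smallest indentation among the non-blank lines and strip it everywhere.
    const indents = lines
        .filter(line => line.trim() !== '')
        .map(line => line.length - line.trimStart().length);
    const minIndent = indents.length > 0 ? Math.min(...indents) : 0;
    return lines.map(line => line.slice(minIndent)).join('\n');
}

const schemaName = 'StudentData';
const text = dedent`
    export interface ${schemaName} {
        // TODO
    }
`;
// text starts at column 0 and keeps the relative indentation of the body line.
```

The template can be indented naturally inside the generator source while the emitted code still starts at the first column.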
joinToNode(model.schemas, mapSchemaToInterface, { appendNewLineIfNotEmpty: true })
The joinToNode function takes a collection, a mapping function, and some options as arguments. The mapping function will be called for each element in the collection. The options can specify how to stringify the resulting GeneratorNode, e.g., appending new lines or inserting separators in between the strings.
Of course, the two functions can be composed as needed; see generate-schemas.ts and generate-stakeholders.ts for reference.
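The map-then-join idea can be tried without any library. The sketch below uses a hypothetical joinLines function that works on plain strings; it is a simplified stand-in and not joinToNode itself, and its single option only imitates the appendNewLineIfNotEmpty behavior described above. The Schema interface is likewise an assumption made for the example.

```typescript
// Hypothetical stand-in for a schema element of the model.
interface Schema {
    name: string;
}

// Simplified stand-in for joinToNode: map every element, then concatenate the results.
function joinLines<T>(
    items: readonly T[],
    toText: (item: T) => string,
    options: { appendNewLineIfNotEmpty?: boolean } = {},
): string {
    return items
        .map(toText)
        // Append a line break after every non-empty piece if requested.
        .map(piece => (options.appendNewLineIfNotEmpty && piece.length > 0 ? piece + '\n' : piece))
        .join('');
}

// The mapping function produces the text for one element.
const mapSchemaToInterface = (schema: Schema): string =>
    `export interface ${schema.name} {\n    // TODO\n}`;

const schemas: Schema[] = [{ name: 'StudentData' }, { name: 'StudentGrade' }];
const generated = joinLines(schemas, mapSchemaToInterface, { appendNewLineIfNotEmpty: true });
// generated contains two interface declarations, each followed by a line break.
```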
Tip
You can insert backticks using the Alt Gr + 7 combination!
The first code to generate is the *Client classes. *Client classes should implement their relevant Stakeholder interfaces by connecting to some HTTP server using a specific URL + endpoint. To increase code reuse, each class extends the BaseClient class, which defines the getMethod and postMethod methods, simplifying the implementation of the class.
✏️ Extend the generate-client.ts file at the // TODO: comments according to the following reference clients.ts output file.
/* eslint-disable */
import { BaseClient } from './base-client.js';
import * as schemas from './schemas.js';
import * as stakeholders from './stakeholders.js';

export class UniversityClient extends BaseClient implements stakeholders.University {
    students(): Promise<schemas.StudentData[]> {
        return this.getMethod('students');
    }
}

export class StudentClient extends BaseClient implements stakeholders.Student {
}

export class MoodleClient extends BaseClient implements stakeholders.Moodle {
    calculateGrades(input: schemas.StudentData[]): Promise<schemas.StudentGrade[]> {
        return this.postMethod('calculateGrades', input);
    }
}

export class AnalyticsCompanyClient extends BaseClient implements stakeholders.AnalyticsCompany {
    calculateAggregate(input: schemas.PseudonymizedStudentGrade[]): Promise<schemas.StudentAggregate[]> {
        return this.postMethod('calculateAggregate', input);
    }
}

export class QSWorldUniversityClient extends BaseClient implements stakeholders.QSWorldUniversity {
    recordAggregate(input: schemas.StudentAggregate[]): Promise<void> {
        return this.postMethod('recordAggregate', input);
    }
}
Data sets are implemented using the GET HTTP method, while services and consumes are implemented using POST. Use the proper input and output types according to the generated schemas.ts and stakeholders.ts files!
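The GET/POST rule can be captured in a small mapping function. The sketch below is an assumption-laden illustration: the Member type is a hypothetical, simplified view of a stakeholder member (the real AST types in the starter project differ), and the function returns a plain string rather than a GeneratorNode.

```typescript
// Hypothetical, simplified view of a stakeholder member; not the real AST type.
type Member =
    | { kind: 'dataSet'; name: string; outputType: string }
    | { kind: 'service' | 'consume'; name: string; inputType: string; outputType: string };

// Produce the text of one client method following the GET/POST rule.
function mapMemberToClientMethod(member: Member): string {
    if (member.kind === 'dataSet') {
        // Data sets take no input and are fetched with GET.
        return [
            `${member.name}(): Promise<${member.outputType}> {`,
            `    return this.getMethod('${member.name}');`,
            `}`,
        ].join('\n');
    }
    // Services and consumes send their input with POST.
    return [
        `${member.name}(input: ${member.inputType}): Promise<${member.outputType}> {`,
        `    return this.postMethod('${member.name}', input);`,
        `}`,
    ].join('\n');
}

const studentsMethod = mapMemberToClientMethod({
    kind: 'dataSet',
    name: 'students',
    outputType: 'schemas.StudentData[]',
});
const recordMethod = mapMemberToClientMethod({
    kind: 'consume',
    name: 'recordAggregate',
    inputType: 'schemas.StudentAggregate[]',
    outputType: 'void',
});
```

Both results match the corresponding method bodies in the reference clients.ts file, apart from the indentation inside the class.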
📋 CHECK: Rerun the generator. The generated clients.ts file should be similar to the provided reference.
💡 Note how you don't need to worry about name clashes across the generated classes and their methods! This is why we needed all those naming convention validation rules, as they ensure the generated code will also be correct.
The next task is to generate the corresponding *Server classes. The *Server classes will be the counterparts to the generated *Client classes. Similarly to the BaseClient class, the BaseServer class is implemented to abstract away the setup of the HTTP server, simplifying the generated classes.
The BaseServer class has a registerEndpoints(app: Express): void method, which child classes should implement to register their own routes. The routes should be the same as those used by the *Client classes.
class SomeServer extends BaseServer {
    registerEndpoints(app: Express): void {
        app.get('/students', async (req, res) => {
            const result = await this.logic.students();
            res.json(result);
        });
        app.post('/calculateAggregate', async (req, res) => {
            const result = await this.logic.calculateAggregate(req.body);
            res.json(result);
        });
    }
}
To allow the customization of the actual logic behind the server implementations, they take an instance of the associated Stakeholder interface as a parameter (called logic). Make sure to forward the actual calls to the logic property!
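One way to think about the server generator is as a function that turns each member into one route registration that forwards to logic. The sketch below is hypothetical: the Member type is a simplified assumption rather than the real AST type, and the output is a plain string shaped like the SomeServer example above.

```typescript
// Hypothetical, simplified view of a stakeholder member; not the real AST type.
interface Member {
    kind: 'dataSet' | 'service' | 'consume';
    name: string;
}

// Produce the text of one route registration that forwards the call to this.logic.
function mapMemberToRoute(member: Member): string {
    const isDataSet = member.kind === 'dataSet';
    // Same rule as on the client side: data sets use GET, everything else POST.
    const httpMethod = isDataSet ? 'get' : 'post';
    // Only POST routes carry a request body to pass on.
    const argument = isDataSet ? '' : 'req.body';
    return [
        `app.${httpMethod}('/${member.name}', async (req, res) => {`,
        `    const result = await this.logic.${member.name}(${argument});`,
        `    res.json(result);`,
        `});`,
    ].join('\n');
}

const studentsRoute = mapMemberToRoute({ kind: 'dataSet', name: 'students' });
const aggregateRoute = mapMemberToRoute({ kind: 'service', name: 'calculateAggregate' });
```

Deriving the route path from the member name on both sides is what keeps the generated servers compatible with the generated clients.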
💡 This pattern is called "dependency inversion" and can be simplified using "dependency injection" technologies. We are not using any DI tools in this session.
✏️ Extend the generate-server.ts file at the // TODO: implement mapStakeholderToClass! line to generate server implementations compatible with the Client classes. Make sure to forward each call to the logic instance's appropriate method!
📋 CHECK: Rerun the generator. The generated servers.ts file should be compatible with the generated client classes!
Finally, service chain classes put all of the above together and orchestrate the data transfer between the stakeholders. The generated service chain classes take one instance of each used Stakeholder interface and call their respective functions with the last output as a parameter.
The BaseServiceChain class has an abstract execute method, which is overridden in each service chain class. It also implements a pseudonymize method, which can generate randomized IDs for specific PII values, ensuring the same ID is returned for the same PII value.
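The stable-ID behavior of pseudonymize can be pictured as a lookup table in front of a random ID generator. The following is a hypothetical sketch, not the code in base-service-chain.ts: the Pseudonymizer class name and the use of randomUUID from node:crypto are assumptions made for the illustration.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical sketch; the real BaseServiceChain may be implemented differently.
class Pseudonymizer {
    // Remembers which random ID was handed out for which PII value.
    private readonly ids = new Map<string, string>();

    pseudonymize(piiValue: string): string {
        let id = this.ids.get(piiValue);
        if (id === undefined) {
            // First time this value is seen: create and remember a new random ID.
            id = randomUUID();
            this.ids.set(piiValue, id);
        }
        return id;
    }
}

const pseudonymizer = new Pseudonymizer();
const first = pseudonymizer.pseudonymize('1');
const again = pseudonymizer.pseudonymize('1');
const other = pseudonymizer.pseudonymize('2');
// first and again are equal; other is a different ID.
```

Because the table lives in the instance, the mapping is stable only within one service chain object in this sketch.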
The partial generator implementation generates the execute method implementation, calling stepN() methods one by one and forwarding their outputs to the next step.
✏️ Extend the generate-service-chains.ts file at the // TODO: implement step method generator! line to generate the stepN() methods with the correct signature (parameter, return type) as used in the execute method! Make sure to handle the valueMapping of the step if it exists!
Tip
You should first try to implement the service-chains.ts file inside the dataspace-example directory and then write the actual generator implementation. Decide how you will handle the value mapping between steps and what types you will use as parameters and return types.
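As a starting point for the handwritten version, a value mapping between two steps could look like the sketch below. Everything in it is an assumption: the two interfaces list only the fields used by the example logic classes later in this guide, and the mapping itself (studentId pseudonymized into id, name and courseName dropped) is a guess at what the model specifies, so check it against students.dataspace and the generated schemas.ts.

```typescript
// Assumed field sets, inferred from the example logic classes; verify against schemas.ts.
interface StudentGrade {
    name: string;
    studentId: string;
    year: number;
    term: string;
    courseName: string;
    credits: number;
    signature: boolean;
    finalGrade: number;
}

interface PseudonymizedStudentGrade {
    id: string;
    year: number;
    term: string;
    credits: number;
    signature: boolean;
    finalGrade: number;
}

// Hypothetical value mapping: copy the non-PII fields and replace the student ID.
function mapGrades(
    grades: StudentGrade[],
    pseudonymize: (piiValue: string) => string,
): PseudonymizedStudentGrade[] {
    return grades.map(grade => ({
        id: pseudonymize(grade.studentId),
        year: grade.year,
        term: grade.term,
        credits: grade.credits,
        signature: grade.signature,
        finalGrade: grade.finalGrade,
    }));
}

// Usage with a deterministic stand-in for the pseudonymize method.
const pseudonymized = mapGrades(
    [{ name: 'Alice Smith', studentId: '1', year: 2023, term: 'Fall', courseName: 'Mathematics', credits: 4, signature: true, finalGrade: 85 }],
    piiValue => `pseudo-${piiValue}`,
);
```

Building a fresh object field by field, instead of spreading the input, makes it harder to leak PII fields into the next step by accident.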
📋 CHECK: Run the generator. The generated service chain implementation should be valid TypeScript code that passes the required data between the stakeholders, respecting the value mappings defined in the model!
Now that the generator components are implemented, it is time to try out the generated code!
✏️ To do so, open your dataspace-example VSCode instance.
The folder has a similar preconfigured structure, with the package.json file containing the necessary scripts to compile, build, and run the examples. The dataspace script executes the Dataspace Cli generator (from the ../dist/cli/main.cjs file) to generate the TypeScript code. To use it, ensure that the root project is built.
💡 In real-life projects, the generator component should be published with specific versions to ensure loose coupling between the generator's implementation and the user project.
✏️ Try to build the project using npm run build.
You may have noticed that no .js file has been created. The reason is that the project contains no entry points and no code that uses the generated infrastructure.
How do we solve this issue? One idea could be to generate the runner code as well, but that would eliminate our customization ability. The other could be to implement the code manually, but we can do better than that.
With code generators, one interesting use case is stub generation. Stubs are basically initial implementations of user code based on some configuration. The benefit of using stubs is that they are only generated if the given file does not exist, meaning our manual modifications will remain in the code. You may remember this pattern as the "Generation Gap" pattern from the lectures.
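The "generate only if missing" rule is small enough to sketch directly. The writeStubIfMissing function below is a hypothetical helper, not the starter project's implementation, and the file name used in the usage part is made up for the example.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Write a stub only when the target file is missing, so manual edits survive regeneration.
// Returns true if the file was created, false if an existing file was left untouched.
function writeStubIfMissing(filePath: string, content: string): boolean {
    if (fs.existsSync(filePath)) {
        return false;
    }
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, content, 'utf8');
    return true;
}

// Usage in a temporary directory.
const stubDir = fs.mkdtempSync(path.join(os.tmpdir(), 'stub-'));
const stubFile = path.join(stubDir, 'src', 'moodle-logic.ts');
const created = writeStubIfMissing(stubFile, '// generated stub\n');
// Simulate a manual customization of the stub.
fs.writeFileSync(stubFile, '// manual edit\n', 'utf8');
// Running the generator again must not overwrite the customization.
const overwritten = writeStubIfMissing(stubFile, '// generated stub\n');
const finalContent = fs.readFileSync(stubFile, 'utf8');
```

Fully generated files, in contrast, are rewritten on every run, which is why the two kinds of output go into separate directories (gen and src).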
✏️ To enable stub generation, extend the data-space-config.json file with the following settings.
{
    "modelPath": "students.dataspace",
    "genPath": "gen",
    "stubs": {
        "srcPath": "src",
        "serverRunnerStubs": {
            "Moodle": {
                "defaultPort": 3001
            },
            "University": {
                "defaultPort": 3002
            },
            "AnalyticsCompany": {
                "defaultPort": 3003
            },
            "QSWorldUniversity": {
                "defaultPort": 3004
            }
        },
        "serviceChainStubs": {
            "CollectUniversityRankingData": {
                "clients": {
                    "Moodle": {
                        "defaultUrl": "http://localhost:3001"
                    },
                    "University": {
                        "defaultUrl": "http://localhost:3002"
                    },
                    "AnalyticsCompany": {
                        "defaultUrl": "http://localhost:3003"
                    },
                    "QSWorldUniversity": {
                        "defaultUrl": "http://localhost:3004"
                    }
                }
            }
        }
    }
}
This configuration allows stub code generation into the src directory and configures the server runner and service chain stubs to be generated. Some default ports and URLs are also defined, making it easier to extend and customize it in the future.
✏️ To include the stubs in the build process, uncomment the following lines in esbuild.mjs.
// entryPoints: [
//     'src/service-chain/*.ts',
//     'src/server-runner/*.ts',
// ],
📋 CHECK: Run npm run build again. This time, you should see the initial logic, server runner, and service chain runner files generated and their corresponding .js files in the dist directory.
✏️ Finally, customize the generated logic classes with the following implementations.
export class AnalyticsCompanyLogic implements stakeholders.AnalyticsCompany {
    calculateAggregate(input: schemas.PseudonymizedStudentGrade[]): Promise<schemas.StudentAggregate[]> {
        console.log('Logging received student grades:', JSON.stringify(input, null, 2));
        const groups = new Map<string, {
            id: string;
            year: number;
            term: string;
            sumCredits: number;
            sumWeightedGrades: number;
        }>();
        for (const grade of input) {
            const key = `${grade.id}-${grade.year}-${grade.term}`;
            let group = groups.get(key);
            if (!group) {
                group = {
                    id: grade.id,
                    year: grade.year,
                    term: grade.term,
                    sumCredits: 0,
                    sumWeightedGrades: 0,
                };
                groups.set(key, group);
            }
            group.sumCredits += grade.credits;
            if (grade.signature && grade.finalGrade >= 2) {
                group.sumWeightedGrades += grade.credits * grade.finalGrade;
            }
        }
        const aggregates: schemas.StudentAggregate[] = Array.from(groups.values()).map(group => ({
            id: group.id,
            year: group.year,
            term: group.term,
            credits: group.sumCredits,
            average: group.sumCredits > 0 ? group.sumWeightedGrades / group.sumCredits : 0,
            correctedAverage: group.sumWeightedGrades / 30,
        }));
        return Promise.resolve(aggregates);
    }
}
export class MoodleLogic implements stakeholders.Moodle {
    calculateGrades(input: schemas.StudentData[]): Promise<schemas.StudentGrade[]> {
        console.log('Logging received student data:', JSON.stringify(input, null, 2));
        const grades = input.flatMap(student => [{
            name: student.name,
            studentId: student.studentId,
            year: 2023,
            term: 'Fall',
            courseName: 'Mathematics',
            credits: 4,
            signature: true,
            finalGrade: 85
        }, {
            name: student.name,
            studentId: student.studentId,
            year: 2023,
            term: 'Fall',
            courseName: 'Physics',
            credits: 3,
            signature: true,
            finalGrade: 78
        }, {
            name: student.name,
            studentId: student.studentId,
            year: 2023,
            term: 'Spring',
            courseName: 'Chemistry',
            credits: 5,
            signature: true,
            finalGrade: 92
        }, {
            name: student.name,
            studentId: student.studentId,
            year: 2024,
            term: 'Fall',
            courseName: 'Biology',
            credits: 4,
            signature: false,
            finalGrade: 45
        }]);
        return Promise.resolve(grades);
    }
}
export class QSWorldUniversityLogic implements stakeholders.QSWorldUniversity {
    recordAggregate(input: schemas.StudentAggregate[]): Promise<void> {
        console.log('Logging received aggregated data:', JSON.stringify(input, null, 2));
        return Promise.resolve();
    }
}
export class UniversityLogic implements stakeholders.University {
    students(): Promise<schemas.StudentData[]> {
        console.log('Logging sending Student data.');
        return Promise.resolve([
            { name: 'Alice Smith', studentId: '1', age: 20 },
            { name: 'Bob Johnson', studentId: '2', age: 22 },
            { name: 'Charlie Brown', studentId: '3', age: 21 }
        ]);
    }
}
✏️ In a separate terminal, run npm run servers to start the 4 servers on the specified ports. In another terminal, run npm run servicechain to execute the service chain.
📋 CHECK: Check the logs of the servers and the service chain. Ensure that each step was executed successfully and that no PII data was passed on to non-consented stakeholders!
Our code generator is now capable of generating ExpressJS client and server code in TypeScript. However, this locks us into this particular technology stack; to use the JVM or other technologies, we would have to reimplement the generator for those as well.
Luckily, there are open technologies out there that allow the definition of platform-independent HTTP APIs, and the generation of specific server or client implementations. One major benefit of using such a technology is that we do not have to maintain the platform-specific generators ourselves, and we can inherit the development efforts of the community.
To describe the definition in TypeScript, you can use the types from the openapi-types npm package. OpenAPI definitions are commonly written in YAML, but since YAML is a superset of JSON, you can output a JSON file as well.
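A minimal sketch of such a generator is shown below. It is only a starting point under several assumptions: the Member type is a hypothetical simplification of the model, the request body schema is a generic placeholder instead of the real schemas, and plain object types are used so that the example runs without the openapi-types package.

```typescript
// Hypothetical, simplified view of a stakeholder member; not the real AST type.
interface Member {
    kind: 'dataSet' | 'service' | 'consume';
    name: string;
}

// Build an OpenAPI 3.0 document object with one path per member.
function toOpenApi(title: string, members: Member[]) {
    const paths: Record<string, Record<string, unknown>> = {};
    for (const member of members) {
        // Same rule as in the generated clients: data sets use GET, the rest POST.
        const httpMethod = member.kind === 'dataSet' ? 'get' : 'post';
        const operation: Record<string, unknown> = {
            operationId: member.name,
            responses: { '200': { description: 'Successful response' } },
        };
        if (httpMethod === 'post') {
            // Placeholder schema; a real generator would emit the model's schemas here.
            operation['requestBody'] = {
                required: true,
                content: {
                    'application/json': { schema: { type: 'array', items: { type: 'object' } } },
                },
            };
        }
        paths[`/${member.name}`] = { [httpMethod]: operation };
    }
    return { openapi: '3.0.3', info: { title, version: '1.0.0' }, paths };
}

const apiDoc = toOpenApi('Moodle', [
    { kind: 'service', name: 'calculateGrades' },
]);
// JSON output is also valid YAML, so it can be written to a file as is.
const apiJson = JSON.stringify(apiDoc, null, 2);
```

Emitting one definition per stakeholder mirrors the one-server-per-stakeholder structure of the generated TypeScript code, but a single combined definition would be an equally valid design choice.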
✏️ Generate an OpenAPI definition out of the .dataspace model that can be used to generate client and server code for various platforms.
📋 CHECK: Try to generate a server and client code for some platform, e.g., the JVM. https://openapi-generator.tech/docs/generators/java/