OpenGL Coordinate Systems

The global picture

To transform coordinates from one space to the next, we use several transformation matrices, of which the most important are the model, view and projection matrices. Vertex coordinates start in local space as local coordinates and are then transformed to world coordinates, view coordinates, clip coordinates and eventually screen coordinates. The following image displays the process and shows what each transformation does:

Figure 1: Different spaces in the graphics pipeline. Source: adapted from [1].

Local coordinates are the coordinates of your object relative to its local origin; they're the coordinates your object begins in.
The next step is to transform the local coordinates to world-space coordinates, which are coordinates with respect to a larger world. These coordinates are relative to some global origin of the world, together with many other objects also placed relative to this world's origin.
Next we transform the world coordinates to view-space coordinates in such a way that each coordinate is as seen from the camera or viewer's point of view.
After the coordinates are in view space we want to project them to clip coordinates. After the perspective divide, clip coordinates fall in the -1.0 to 1.0 range (normalized device coordinates), which determines which vertices will end up on the screen. Projection to clip-space coordinates can add perspective if using perspective projection.
And lastly we transform the clip coordinates to screen coordinates in a process called the viewport transform, which maps the coordinates from the -1.0 to 1.0 range to the coordinate range defined by glViewport. The resulting coordinates are then sent to the rasterizer to turn them into fragments.
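
The chain of transforms described above can be sketched in a few lines. This is a minimal illustration, not code from [1]: the `Vec4` and `Mat4` types and the helpers `transform`, `translation`, `perspective` and `localToNdc` are hypothetical names introduced here, and the projection matrix is assumed to follow the usual `gluPerspective` convention (camera looking down the negative z axis).

```cpp
#include <cmath>

struct Vec4 { float x, y, z, w; };

// Row-major 4x4 matrix: m[row][col], applied to column vectors.
struct Mat4 { float m[4][4]; };

// Matrix * vector.
Vec4 transform(const Mat4& a, const Vec4& v) {
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4];
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0f;
        for (int c = 0; c < 4; ++c) out[r] += a.m[r][c] * in[c];
    }
    return { out[0], out[1], out[2], out[3] };
}

Mat4 translation(float tx, float ty, float tz) {
    return {{ {1, 0, 0, tx}, {0, 1, 0, ty}, {0, 0, 1, tz}, {0, 0, 0, 1} }};
}

// Perspective projection in the gluPerspective convention (an assumption of this sketch).
Mat4 perspective(float fovyRadians, float aspect, float zNear, float zFar) {
    const float f = 1.0f / std::tan(fovyRadians * 0.5f);
    return {{ {f / aspect, 0, 0, 0},
              {0, f, 0, 0},
              {0, 0, (zFar + zNear) / (zNear - zFar), 2.0f * zFar * zNear / (zNear - zFar)},
              {0, 0, -1, 0} }};
}

// local -> world -> view -> clip, then the perspective divide to the -1..1 range.
Vec4 localToNdc(const Mat4& model, const Mat4& view, const Mat4& projection, const Vec4& local) {
    const Vec4 world = transform(model, local);
    const Vec4 eye   = transform(view, world);
    const Vec4 clip  = transform(projection, eye);
    return { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w, 1.0f };
}
```

For example, with a model matrix that moves the object one unit to the right and a view matrix that pushes the scene five units away from the camera, the object's local origin lands slightly right of center in the -1.0 to 1.0 range.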

dougbinks [2]

If you look at the projection matrix calculations you will see that they do not depend on the resolution of the framebuffer, but only on the aspect ratio, defined as width / height. This is because the output of the vertex shader is in clip space which, after the perspective divide, is an axis-aligned cube with coordinates ranging from (-1.0,-1.0,-1.0) to (+1.0,+1.0,+1.0). The clip space coordinates are transformed to screen space coordinates using the viewport transform you set with glViewport.

So you don’t need to worry about transforming your window coordinates to viewport coordinates if you use the inverse projection transform, as its input is in clip space. You will likely need a transform like:

// Following code untested
double x, y;
glfwGetCursorPos(  g_pSys->window, &x, &y );

// cursor pos is in window coordinates, not gl fragment coords so use window size not framebuffer size
int w, h;
glfwGetWindowSize( g_pSys->window, &w, &h );

vec4 clip_space;
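clip_space.x = 2.0f * (float)x / (float)w - 1.0f;
clip_space.y = 1.0f - 2.0f * (float)y / (float)h;  // window y points down, clip-space y points up
clip_space.z = -1.0f;  // -1.0 for the near plane, +1.0 for the far plane
clip_space.w = 1.0f;
vec4 view_space = mInverseProjFromView * clip_space;
// with a perspective projection, divide view_space by view_space.w to get the actual position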
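
A self-contained variant of the same idea, for experimenting without GLFW or a matrix library, might look like the sketch below. It is not the forum author's code: `cursorToNdc` and `unprojectPerspective` are hypothetical helpers, and the inverse projection is written out by hand for a `gluPerspective`-style matrix, which is an assumption about how the projection was built.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ndc2 { float x, y; };

// Window coordinates (origin top-left, y down) to the -1..1 range (y up).
Ndc2 cursorToNdc(double cursorX, double cursorY, int windowWidth, int windowHeight) {
    return { static_cast<float>(2.0 * cursorX / windowWidth - 1.0),
             static_cast<float>(1.0 - 2.0 * cursorY / windowHeight) };
}

// Applies the inverse of a gluPerspective-style projection to (ndcX, ndcY, ndcZ, 1)
// and performs the divide by w, returning a view-space position.
Vec3 unprojectPerspective(float ndcX, float ndcY, float ndcZ,
                          float fovyRadians, float aspect, float zNear, float zFar) {
    const float f = 1.0f / std::tan(fovyRadians * 0.5f);
    const float c = (zFar + zNear) / (zNear - zFar);        // projection entry [2][2]
    const float d = 2.0f * zFar * zNear / (zNear - zFar);   // projection entry [2][3]
    const float vx = ndcX * aspect / f;
    const float vy = ndcY / f;
    const float vz = -1.0f;
    const float vw = (ndcZ + c) / d;
    return { vx / vw, vy / vw, vz / vw };
}
```

With the cursor at the center of an 800x600 window and `ndcZ` set to -1.0, the result is the point on the near plane straight ahead of the camera, (0, 0, -zNear).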

The orthographic projection also does not depend on the resolution. The input coordinates define a box in view space (view space holds the coordinates after the camera view transform; it is like world space with a non-rotated camera at the origin), which is converted to clip space by the projection and then, after the vertex shader, to viewport space by the current glViewport setting.

This is often confused in tutorials, which use the window coordinates as inputs to the orthographic matrix calculation. It’s fairly easy to see why the inputs should be in view space: they define a box around what you want to display.

To keep the result of an orthographic projection at the same scale horizontally and vertically in your rendered view, it’s usually best to ensure that the aspect ratio of the box is the same as the aspect ratio of your window.

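The left, right, bottom and top arguments of glOrtho are all view space coordinates; nearVal and farVal are distances to the clipping planes along the viewing direction (the negative z axis in view space).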

So if you have three points, p0, p1, p2 which define the vertices of a triangle, and you want to view this triangle with glOrtho then you do something like the following:

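// untested code
// p0, p1, p2 are in view space, in front of a camera looking down the negative z axis

float left    = min(p0.x, min(p1.x, p2.x) );
float right   = max(p0.x, max(p1.x, p2.x) );
float bottom  = min(p0.y, min(p1.y, p2.y) );
float top     = max(p0.y, max(p1.y, p2.y) );
// glOrtho takes near/far as distances along -z, so negate the view-space z values
float nearVal = -max(p0.z, max(p1.z, p2.z) );
float farVal  = -min(p0.z, min(p1.z, p2.z) );

// each pair must differ (left != right, bottom != top, nearVal != farVal), otherwise glOrtho raises GL_INVALID_VALUE
glOrtho( left, right, bottom, top, nearVal, farVal );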

If you want to ensure the aspect ratio is the same as your window, then you modify the left, right or bottom, top coordinates to maintain your desired aspect ratio.

[1] https://learnopengl.com/Getting-started/Coordinate-Systems

[2] https://discourse.glfw.org/t/converting-between-screen-coordinates-and-pixels/1841/8