Project 3: Virtual Mannequin
Assigned: Tuesday, Oct 18, 2016
Due: Thursday, Nov 10, 2016 (by 11:59PM)
[Figure: a reference implementation of the virtual mannequin, showing a model in one sculpted pose.]
In this project, you will implement a skinning system -- a way to
animate a 3D character by manipulating bones embedded inside the
character. To deform the character using skinning, you will implement
a basic mouse-based bone selection and rotation interface.
Starter Code
As before, we have provided you with starter
code, written in C++ and using OpenGL and several OpenGL helper
libraries (glm, glfw) as a starting point for implementing this
project. We have also included code for reading and writing images to
JPEG files (this requires a nontrivial amount of low-level parsing and
byte manipulation). The starter code is a working example of how to
set up OpenGL and a shader program to draw a checkerboard floor, and
move around the world using an orbital and FPS camera (as per
assignment 2).
We have also provided a reference solution demonstrating how this project
should function if you implement it correctly. The program responds to
keyboard and mouse input as specified below. If you have questions or
doubts about how your program should behave, the reference solution is
a good first place to check.
Building and Running the Starter Code
We have provided cmake scripts to automate compiling and running the
assignment code. Please do not modify the build process as it
has been designed to streamline grading.
To compile your code, type
mkdir build
cd build
cmake ..
make
to generate an executable in the bin subdirectory. For
additional information about the starter code, please see the included
README.md file.
Submission Instructions
All submissions must be uploaded to Canvas by the project due date and
time. In exceptional cases where Canvas is not working, you may turn
in the assignment by emailing the TA a Dropbox link, handing him a USB
stick, etc., but there will be no exceptions to the project due date:
late submissions will be penalized per the syllabus late policy.
Submit your project as a single .tgz file. This file should
be free of temporary files, core dumps, etc; you also should not
submit the (large) mesh files included in the starter code. For
example, you can execute the following command from the parent
folder containing the code to create an appropriate .tgz
file:
tar --exclude obj --exclude build --exclude ".*" -czf submission.tgz foldername
where foldername is the name of the folder containing your code.
In addition to your complete source code, the .tgz file
should include a README file containing, at minimum:
- Your name;
- An explanation of any required features that you know are not
working completely correctly, but would like to have considered for
partial credit;
- Anything the spec below asks you to include.
All code ***MUST*** compile and run on the 3rd floor GDC computer lab Linux machines!!!
In particular,
- Do not submit a 7zip of a Visual Studio .sln file;
- Do not include any absolute paths in your Makefiles that point to
your home folder;
- If you use any external libraries (which must be approved in
advance by the instructor) that are not installed by default on the
GDC computers, you must include the library source files with your
submission, and include building of the library in your build process.
Please triple-check that your submission adheres to these
instructions. The instructor or TA reserves the right to return,
ungraded, any submissions that fail to compile or run, or that do not
include a README file.
Tip: After uploading your files to Canvas, it is not a bad idea to
download them onto a different computer and check that everything still
compiles and runs, in case you accidentally leave out some important
files from the .tgz. It is also wise to try to upload to Canvas well
ahead of the final deadline: again, there will be no exceptions
to the lateness policy, not even for technical difficulties.
Grading
The assignment is broken down into several features described
below. Each feature is worth a number of points and will be graded
separately (although later features may build on and require a working
implementation of earlier features). Your code should be correct and
free of segfaults, deadlocks, or other buggy behavior. You will
not be graded on software engineering (use design patterns and
unit tests if they help you) but unreadable code is less likely to
receive partial credit. You should not worry too much about
performance or optimizing your code, but egregiously inefficient
implementations of any of the features will be penalized.
Required Features (150 points)
Parsing the Joints and Weights (30 points)
All data related to a character is stored in a PMD
file. This file contains the model itself - represented as a
triangle mesh along with some information about the materials and
textures to use for that mesh - as well as a skeleton rig and
blending weights specifying how moving the bones affects the
different parts of the model.
We have provided code for parsing and rendering the model mesh. Your
first task will be to build data structures for representing the
skeleton, bones, joints, and blending weights, and to populate these
data structures from the provided PMD file.
The skeleton is represented using a forest of bones which connect
pairs of joints. The PMD file contains only information about the
joints, from which the bones and skeleton can be reconstructed. Each
joint has
- an integer ID identifying the joint;
- the ID of the parent joint. If this ID is equal to -1, then the
joint has no parent - it is a root joint. Otherwise, there exists a
bone extending from the parent joint to the current joint.
- an offset vector which gives the joint's initial ("rest")
position in world coordinates in terms of its parent's initial
position. If the joint is a root joint, the offset vector simply
specifies its position with respect to the origin in world
coordinates.
Notice that several joints might share the same parent - this
represents a branching in the skeleton. For example, several finger
joints might specify a wrist joint as their parent.
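As a concrete starting point, a joint record mirroring the three fields above might look like the following sketch; the names are placeholders rather than starter-code identifiers, and rest positions follow by summing offsets up the parent chain.

#include <vector>
#include <glm/glm.hpp>

// Hypothetical joint record mirroring the three PMD fields described above.
struct Joint {
    int id = -1;                 // integer ID of this joint
    int parent_id = -1;          // -1 means this joint is a root
    glm::vec3 offset{0.0f};      // rest offset from the parent joint
                                 // (world-space position if parent_id == -1)
    std::vector<int> children;   // filled in after parsing, for tree traversal
};

// Rest position of a joint in world coordinates: walk up the parent chain,
// summing offsets.  Assumes `joints` is indexed by joint ID.
inline glm::vec3 rest_position(const std::vector<Joint>& joints, int id) {
    glm::vec3 p(0.0f);
    for (int j = id; j != -1; j = joints[j].parent_id)
        p += joints[j].offset;
    return p;
}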
You will want to convert this endpoint-based data into a bone data
structure. Each bone has a coordinate system whose origin is its first
endpoint, and whose basis vectors are
- the tangent direction \(\hat{t}\), which points from the bone's
first endpoint to its second endpoint;
- the normal direction \(\hat{n}\), an arbitrary unit vector
perpendicular to \(\hat{t}\). Given \(\hat{t}\), you can compute such a
vector with the following algorithm (see the sketch after this list).
Set \(v=\hat{t}\), then set the entry of v with smallest magnitude to 1,
and the other two entries to zero. Finally, compute \(\hat{n} = \frac{\hat{t} \times
v}{\|\hat{t} \times v\|}\).
- the binormal direction \(\hat{b} = \hat{t}\times \hat{n}\).
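The frame construction above translates almost directly into code. The following is a minimal glm sketch; the function name is illustrative and not part of the starter code.

#include <cmath>
#include <glm/glm.hpp>

// Build the bone-local frame (t, n, b) from the bone's two rest endpoints,
// using the arbitrary-perpendicular construction described above.
inline void bone_frame(const glm::vec3& from, const glm::vec3& to,
                       glm::vec3& t, glm::vec3& n, glm::vec3& b) {
    t = glm::normalize(to - from);

    // v becomes the standard basis vector e_j, where j is the index of the
    // smallest-magnitude entry of t; this guarantees v is not parallel to t.
    glm::vec3 v(0.0f);
    int smallest = 0;
    for (int i = 1; i < 3; ++i)
        if (std::abs(t[i]) < std::abs(t[smallest])) smallest = i;
    v[smallest] = 1.0f;

    n = glm::normalize(glm::cross(t, v));
    b = glm::cross(t, n);
}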
You will represent each bone's coordinate system relative to the
parent bone's coordinate system. This way, if you e.g. bend your arm,
all child bones (hand, fingers, etc) will bend with it
automatically. To do this, build two transformations for each bone:
- the translation \(T_i\), in the coordinate system of
the parent bone, which maps the parent's origin (first endpoint) to
the bone's origin. For all bones in this assignment,
\(T_i\) should translate only the first coordinate
(corresponding to translating along the parent's \(\hat{t}\)
direction). (Translations of the second or third coordinate would
correspond to dislocated bones that are not attached to the
parent bone.)
- the rotation \(R_i\) which maps the parent's
coordinate axes to the bone's coordinate axes.
If a bone has no parent, \(T_i\) and \(R_i\)
should be with respect to world coordinates.
For example, suppose bone 7 is connected to bone 4, which is connected
to bone 1, which is connected to bone 0, which has no parent. If the
coordinate systems are set up correctly, the origin of bone 7 should
be at
\[T_0R_0T_1R_1T_4R_4T_7\left[\begin{array}{c}0\\0\\0\\1\end{array}\right]\],
the second endpoint at
\[T_0R_0T_1R_1T_4R_4T_7R_7\left[\begin{array}{c}0\\0\\L\\1\end{array}\right]\]
where L is bone 7's length, and the \(\hat{t}\), \(\hat{n}\), and \(\hat{b}\) axes should be given by \[R_0R_1R_4R_7.\]
Create a data structure which stores, for each bone, its source and
destination joints, its length, and its T and R
matrices. Finally, write code to visualize the bones as line segments.
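For reference, here is one possible shape for such a data structure, together with the transform composition from the bone-7 example above, written with glm; all names are placeholders rather than starter-code identifiers.

#include <vector>
#include <glm/glm.hpp>

// Hypothetical bone record; T and R are expressed relative to the parent bone
// (or to world coordinates when parent == -1), as described above.
struct Bone {
    int parent = -1;       // index of the parent bone, -1 for a root
    int from_joint = -1;   // source joint ID
    int to_joint = -1;     // destination joint ID
    float length = 0.0f;
    glm::mat4 T{1.0f};     // translation along the parent's tangent axis
    glm::mat4 R{1.0f};     // rotation from parent axes to this bone's axes
};

// Compose transforms up the chain, matching the example formulas above:
// without the bone's own rotation you get ... T_p R_p T_i (for the origin),
// with it you get ... T_p R_p T_i R_i (for the full frame / second endpoint).
inline glm::mat4 bone_to_world(const std::vector<Bone>& bones, int i,
                               bool include_own_rotation) {
    glm::mat4 M = include_own_rotation ? bones[i].T * bones[i].R : bones[i].T;
    for (int p = bones[i].parent; p != -1; p = bones[p].parent)
        M = bones[p].T * bones[p].R * M;
    return M;
}

// Endpoints in world coordinates, per the example formulas above:
//   origin          = bone_to_world(bones, i, false) * glm::vec4(0, 0, 0, 1)
//   second endpoint = bone_to_world(bones, i, true)  * glm::vec4(0, 0, L, 1)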
We have implemented a translucency mode when rendering the mesh to
make it easier to work with the skeleton. Hit t to toggle translucency
mode. The bones should be visible within the mesh whenever
translucency mode is turned on (obviously the bones will be mostly
hidden when the mesh is opaque; this is fine).
Notes: (ignore these if they are more confusing than helpful)
Skeleton Construction:
A recursive, tree-like construction takes away much of the pain here.
Joints can have child joints and can be viewed much like tree nodes. Bones
have a (from, to) tuple which specifies the joint they come from and
the joint they go to.
Joints can keep other things, such as their reference orientation
(this is a matrix), which is useful when implementing the skinning
part.
Pose Updates:
You can keep a boolean around to tell whether the pose has been
changed or not.
Pose updates are not too hard if this is done recursively. You
should implement a function that updates a joint and recursively
updates all the children.
These updates will typically just be rotations applied to the joint.
Along with a rotation R, you will have a translation T
that moves the current joint position to the origin. The
transformation you want is \(T^{-1} R\, T\). This transformation,
when applied recursively, rotates the joint where the update starts
by R in its own coordinate system, and then rotates all of its
children relative to it.
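To make the conjugation concrete, here is a minimal sketch of a pose update that rotates a subtree rigidly about a joint's current world position. It assumes a hypothetical per-joint record holding the current world position and child IDs, and it omits the bookkeeping for the deformed bone orientations, which should be updated alongside the positions.

#include <vector>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>  // glm::translate

// Hypothetical joint state for pose updates: the joint's current world
// position and the IDs of its child joints.
struct PoseJoint {
    glm::vec3 position{0.0f};
    std::vector<int> children;
};

// Apply a fixed world-space transform M to every descendant of joint `id`.
inline void transform_descendants(std::vector<PoseJoint>& joints, int id,
                                  const glm::mat4& M) {
    for (int child : joints[id].children) {
        joints[child].position =
            glm::vec3(M * glm::vec4(joints[child].position, 1.0f));
        transform_descendants(joints, child, M);
    }
}

// Rotate the subtree rooted at joint `id` about that joint's current position:
// conjugate R by the translation T that moves the joint to the origin,
// i.e. M = T^{-1} R T, as in the note above.
inline void rotate_about_joint(std::vector<PoseJoint>& joints, int id,
                               const glm::mat4& R) {
    const glm::vec3 p = joints[id].position;
    const glm::mat4 M = glm::translate(glm::mat4(1.0f), p) * R *
                        glm::translate(glm::mat4(1.0f), -p);
    transform_descendants(joints, id, M);
}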
Constructing the Bone Orientation Matrix:
You will have the direction of the bone and you will need to produce a
matrix that gives an orientation to the bone. This is similar to
a camera LookAt() routine, but there is no eye position, and
you only get to start with one vector.
You just need to generate a vector that is *not* parallel to the bone
direction. One way to do this is to find the smallest component of
the bone direction and generate \(e_j\) where j is the index of the
smallest component. Then take two cross products to get the other
two vectors.
Bone Picking (50 points)
As the user moves the mouse around the screen, you should highlight
nearby bones; these highlighted bones are then the ones the user will
rotate by clicking and dragging the mouse. Implement functionality
that determines, whenever the user moves the mouse, which bone should
be highlighted:
- (5 points) Each time the user moves the mouse, convert
the position of the mouse cursor in screen coordinates to a ray in
world coordinates, whose origin p is at the eye's position, and
whose direction v is such that the ray from p in the
v direction pierces the near plane at the pixel containing the
mouse cursor. (A ray-generation sketch appears after this feature list.)
- (35 points) The bones, when rendered as line segments,
are very thin and hard to click on; to make the user interface easier
to use, we will thicken the bones into cylinders of radius
kCylinderRadius for the purposes of detecting which bone the
user intends to click on.
The details of ray-cylinder intersection are up to you. One way to
proceed is to first find the closest point \(c_0\) on the ray \(p +
tv\) to the infinite line passing through the cylinder's axis. If the
distance between \(c_0\) and the infinite line is greater than
kCylinderRadius, the ray misses the cylinder. Also, if the
value of t corresponding to \(c_0\) is negative, the cylinder
is behind the eye and the ray misses the cylinder.
Otherwise, project \(c_0\) onto the cylinder's axis, and check whether
the projected point is between the two cylinder endcaps. If so, the
ray hits the cylinder; otherwise, the ray misses (there is a corner
case where the ray hits one endcap, pierces the entire length of the
cylinder, and leaves through the other endcap, which is declared a
miss by this procedure. But this corner case is rare enough that you
can ignore it.)
If the ray hits one or more cylinders, the first such cylinder hit
becomes the highlighted cylinder. Otherwise, no cylinders are
highlighted.
- (10 points) Visualize highlighted bones by
drawing a wireframe prism approximating a cylinder of radius
kCylinderRadius (and same length as the bone) around the
highlighted bone. (I recommend precomputing the vertices and faces of
a single prism, scaling it to match the length of the bone, and
transforming it to the correct position and orientation using the R
and T matrices; a sketch of such a prism appears after the notes
below.) Also visualize the \(\hat{n}\) and \(\hat{b}\)
directions for the highlighted bone, using short red and blue line
segments anchored at the bone's origin point. (Obviously, the
\(\hat{t}\) direction points down the length of the bone, so there is no
need to visualize it separately.)
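As referenced in the first item of the feature list above, one possible ray-generation sketch built on glm::unProject follows. It assumes your camera code already exposes view and projection matrices and the window size, and that mouse coordinates come from GLFW (y measured from the top of the window); all names are placeholders.

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>  // glm::unProject

// Unproject the mouse position at two depths and take the direction between
// them.  The mouse y is flipped because GLFW reports y from the top of the
// window, while OpenGL window coordinates start at the bottom.
inline void mouse_ray(float mouse_x, float mouse_y,
                      float window_width, float window_height,
                      const glm::mat4& view, const glm::mat4& projection,
                      glm::vec3& ray_origin, glm::vec3& ray_direction) {
    const glm::vec4 viewport(0.0f, 0.0f, window_width, window_height);
    const float wy = window_height - mouse_y;

    const glm::vec3 near_pt = glm::unProject(glm::vec3(mouse_x, wy, 0.0f),
                                             view, projection, viewport);
    const glm::vec3 far_pt  = glm::unProject(glm::vec3(mouse_x, wy, 1.0f),
                                             view, projection, viewport);

    // The near-plane point lies on the same line through the eye, so either
    // it or the eye position itself can serve as the ray origin.
    ray_origin    = near_pt;
    ray_direction = glm::normalize(far_pt - near_pt);
}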
Notes: (ignore these if they are more confusing than helpful)
There are two parts to bone picking:
- Ray generation.
Implement a ScreenToWorld function that generates a world
space point given screen space (x, y).
There are several ways to do this:
- Use the camera matrix:
  - calculate the width and height (w, h) of the image plane in world
    coordinates (you need to know the field of view);
  - calculate ndc.x, ndc.y;
  - use (ndc.x, ndc.y), (w, h), look, up, tangent to get the world
    space coordinates.
- Use the projection matrix:
  - unproject (x, y, 0) to get the world space coordinates;
  - once you have this, subtract the eye position to get a world space ray.
- Another method is:
  - let q be unproject (x, y, 0),
  - let p be unproject (x, y, 1),
  - take the ray through the two points p and q.
- Cylinder intersection.
Implement a cylinder/ray intersection routine where the cylinder
points straight up and is centered at the origin (or lying on its
side), has radius R, and height H. Transform the ray into the
cylinder's coordinates. Use the bone orientation matrix to
do this. The orientation matrix needs to be *just* the orientation
(no translation).
Turn this problem into a 2D problem. Project the line onto the (x, z)
plane and intersect a line with a circle of radius R. If the ray
misses the circle, reject the ray. Else check the intersection times
\(t_0\) and \(t_1\) and see if they are good. Compute the intersection points
and take the closest good one. A point is good if it has a y-value
between 0 and H. This algorithm does not handle intersection at the
caps.
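A minimal sketch of the 2D-projection test just described, assuming the ray has already been transformed into the cylinder's local coordinates (axis along local y, base at the origin); the function name and parameters are placeholders, and end caps are ignored, as the note allows.

#include <cmath>
#include <glm/glm.hpp>

// Ray vs. upright cylinder of radius R and height H in local coordinates.
// Returns true and the smallest valid t on a hit.
inline bool intersect_local_cylinder(const glm::vec3& o, const glm::vec3& d,
                                     float R, float H, float& t_hit) {
    // Project onto the (x, z) plane and intersect with a circle of radius R.
    const float a = d.x * d.x + d.z * d.z;
    const float b = 2.0f * (o.x * d.x + o.z * d.z);
    const float c = o.x * o.x + o.z * o.z - R * R;
    if (a < 1e-8f) return false;              // ray (nearly) parallel to the axis
    const float disc = b * b - 4.0f * a * c;
    if (disc < 0.0f) return false;            // misses the infinite cylinder

    const float s  = std::sqrt(disc);
    const float t0 = (-b - s) / (2.0f * a);
    const float t1 = (-b + s) / (2.0f * a);

    // A hit is "good" if it is in front of the ray origin and its y value lies
    // between the two end caps; take the closest good one.
    for (float t : {t0, t1}) {
        if (t < 0.0f) continue;
        const float y = o.y + t * d.y;
        if (y >= 0.0f && y <= H) { t_hit = t; return true; }
    }
    return false;
}

To use it, first transform the ray origin and direction into bone-local coordinates (using the inverse of the bone's deformed transform), then call the routine with kCylinderRadius and the bone's length.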
Cylinder drawing:
Generate an \(n \times n\) two-dimensional grid of lines.
Use a model transformation which is the bone transformation with a
scale along the y-axis applied to stretch the cylinder out, and a
scale in the x-z plane to widen the cylinder. The bone transformation
won't contain a scale, since it has to be purely a rotation.
The vertex shader can wrap the line mesh around (0,0) using
\(\cos(\theta), \sin(\theta)\).
Bone Drawing:
You are just drawing a bunch of lines. In fact, all the shader does is
draw a single line. The line is transformed by the bone transform and
a scale along the y-axis. You should instance the line, sending the
per-bone transformation matrices to the shader.
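For the wireframe prism around the highlighted bone (the 10-point item above), one simple option is to precompute the line segments of a unit prism once and reuse it for every highlighted bone. The sketch below is one such construction; the names are placeholders, and the choice of y as the lengthwise axis is only a convention.

#include <vector>
#include <cmath>
#include <glm/glm.hpp>

// Line segments of a unit wireframe prism: radius 1, length 1 along local y.
// Each consecutive pair of returned vertices is one GL_LINES segment; scale by
// (kCylinderRadius, bone_length, kCylinderRadius) and apply the bone's
// transform when drawing.
inline std::vector<glm::vec3> unit_prism_lines(int sides = 12) {
    const float kPi = 3.14159265358979f;
    std::vector<glm::vec3> lines;
    for (int i = 0; i < sides; ++i) {
        const float a0 = 2.0f * kPi * float(i)     / float(sides);
        const float a1 = 2.0f * kPi * float(i + 1) / float(sides);
        const glm::vec3 p0(std::cos(a0), 0.0f, std::sin(a0));
        const glm::vec3 p1(std::cos(a1), 0.0f, std::sin(a1));
        const glm::vec3 up(0.0f, 1.0f, 0.0f);
        // bottom ring edge, top ring edge, and the vertical edge at this corner
        lines.insert(lines.end(), {p0, p1, p0 + up, p1 + up, p0, p0 + up});
    }
    return lines;
}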
Bone Manipulation (25 points)
When the user clicks and drags the mouse, the camera rotates, as in
assignment 2. You will implement a second mode, toggled by pressing
the M key, where clicking and dragging the mouse rotates the
highlighted bone (and all child bones attached to the highlighted
bone) instead. To do this you will track not only the original
coordinate system \(R_i\) of each bone, but also the deformed
coordinate system \(S_i\) of the bone. Initialize \(S_i = R_i\); as the
user rotates the bones, the two will no longer match.
- (5 points) Modify your bone data structure to also track
\(S_i\) for each bone. Modify your rendering and bone-picking code to
use the deformed bone position (given by the \(T_i\) and \(S_i\)) rather
than the starting bone position (given by the \(T_i\) and \(R_i\)).
- (15 points) Left-clicking and dragging the mouse should
rotate the bone relative to the camera coordinate system. That is,
convert the drag direction into a vector in world coordinates, take
its cross product with the look direction, and rotate all basis
vectors \(S_i\) of the bone about this axis by rotation_speed
radians. (This calculation should remind you of the FPS camera
controls; a sketch appears at the end of this section.)
- (5 points) Pressing the left and right arrow keys in
bone-movement mode should roll the highlighted bone: spin the
bone about its tangent axis \(\hat{t}\) by roll_speed
radians.
Notice that rotating a bone changes \(S_i\), and so implicitly also
changes the origin and orientation of all child and descendant bones
(just like bending your arm also affects your wrist, fingers, etc.)
If you have set up your transformations properly, bones should
transform correctly automatically when you change their ancestors.
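The drag-rotation step in the 15-point item above might look like the following sketch. Here drag_world, look, and rotation_speed stand in for whatever your camera code and configuration provide, and the sketch assumes the basis vectors of \(S_i\) are expressed in world coordinates; adjust which side the rotation multiplies on if yours are stored relative to the parent.

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>  // glm::rotate

// Rotate a bone's deformed frame S about an axis derived from the mouse drag:
// the cross product of the world-space drag direction with the camera's look
// direction, by rotation_speed radians.
inline glm::mat4 rotate_deformed_frame(const glm::mat4& S,
                                       const glm::vec3& drag_world,
                                       const glm::vec3& look,
                                       float rotation_speed) {
    const glm::vec3 axis = glm::cross(drag_world, look);
    if (glm::length(axis) < 1e-6f) return S;   // degenerate drag: do nothing
    const glm::mat4 R = glm::rotate(glm::mat4(1.0f), rotation_speed,
                                    glm::normalize(axis));
    return R * S;   // rotate the bone's basis vectors in world space
}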
Linear Blend Skinning (35 points)
You now have all the machinery needed to implement linear blend
skinning. The instructions below walk you through how to do this using
the provided weights in the PMD file, and assuming you will animate
the vertex positions on the CPU. You are welcome to implement
linear-blend skinning entirely on the GPU instead, for extra credit.
The PMD file contains weights which specify how much influence each
joint has on each vertex of the model mesh. You will first need to
convert these into a (# bones) \(\times\) (#
vertices) matrix of per-bone blending weights, which tell you how
much influence each bone has on each vertex of the mesh.
- (10 points) Compute and store weights \(w_{ij}\),
encoding the influence of bone i on vertex j, in a data
structure. We have provided code for reading joint weights from the
PMD file. To convert joint weights to bone weights, assign a weight
for joint i to all bones which have that joint as their source
(parent) joint. This means that each joint weight might turn into
multiple different bone weights; similarly, joints at the leaves of
the skeleton forest (at the fingertips, for instance) will have no
joint weights.
- (10 points) Every frame, for
each bone, compute the overall transformation \(U_i\) that maps
from the bone's undeformed coordinate system to world
coordinates, and the overall transformation \(D_i\) that maps from the
bone's deformed coordinates to world coordinates.
- (15 points) Implement linear-blend skinning: every frame,
transform each undeformed vertex \(v_j\) of the ogre using the
transformation \[\tilde{v}_j = \sum_{\mathrm{bones\ } i} w_{ij}
D_iU_i^{-1}v_j\] to get the current, deformed position of the vertex
\(\tilde{v}_j\).
Notes: (ignore these if they are more confusing than helpful)
Linear Blend Skinning:
This is easier than it seems.
- Compute blending matrices:
For each bone you have a reference pose. Invert that pose. In fact,
invert it once and save it for later; it never changes.
You have a current animated pose as well. So, for each bone j
you will produce a blending matrix: \(B_j\) = current_bone_transformation
* inv_reference_pose.
Compute all the blending matrices first.
- Transform each vertex:
The transformation \(T_i\) for a vertex i is a sum of the
\(B_j\) weighted by the weight \(w_{ij}\), where \(w_{ij}\) here corresponds
to bone j and vertex i (note the index order is flipped relative to
the spec above).
Transform vertex i by \(T_i\).
- Render!
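Putting the notes above together, a CPU-side skinning pass might look like the following sketch. The containers and names are placeholders; in practice you would cache the inverted rest poses and exploit the sparsity of the weights rather than looping over every bone for every vertex.

#include <vector>
#include <glm/glm.hpp>

// Linear-blend skinning on the CPU, following the formula above:
// weights[i][j] is w_ij, the influence of bone i on vertex j.
inline void skin_vertices(const std::vector<glm::mat4>& undeformed,   // U_i
                          const std::vector<glm::mat4>& deformed,     // D_i
                          const std::vector<std::vector<float>>& weights,
                          const std::vector<glm::vec4>& rest_vertices,
                          std::vector<glm::vec4>& animated_vertices) {
    const size_t num_bones = undeformed.size();

    // Blending matrix per bone: B_i = D_i * U_i^{-1}.  U_i never changes, so
    // its inverse can be computed once and cached.
    std::vector<glm::mat4> B(num_bones);
    for (size_t i = 0; i < num_bones; ++i)
        B[i] = deformed[i] * glm::inverse(undeformed[i]);

    animated_vertices.resize(rest_vertices.size());
    for (size_t j = 0; j < rest_vertices.size(); ++j) {
        glm::mat4 M(0.0f);
        for (size_t i = 0; i < num_bones; ++i)
            M += weights[i][j] * B[i];                 // sum_i w_ij D_i U_i^{-1}
        animated_vertices[j] = M * rest_vertices[j];   // deformed vertex
    }
}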
Miscellaneous (10 points)
When the user hits the J key, save a screenshot to the
assignment's root directory. We have provided a routine for doing the
image file IO for you, but you will need to look up and implement how
to read pixels from the framebuffer using OpenGL.
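A minimal sketch of the framebuffer readback follows; the GL loader header and the exact interface of the provided JPEG-writing routine are whatever the starter code already uses.

#include <vector>
#include <GL/glew.h>   // or whichever OpenGL loader the starter code includes

// Read back the current framebuffer as tightly packed RGB bytes.
inline std::vector<unsigned char> read_framebuffer(int width, int height) {
    std::vector<unsigned char> pixels(static_cast<size_t>(width) * height * 3);
    glPixelStorei(GL_PACK_ALIGNMENT, 1);   // rows are not 4-byte padded
    glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pixels.data());
    return pixels;
}

Note that OpenGL returns rows bottom-to-top, so you may need to flip the image vertically before handing it to the image writer.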
Extra Credit (up to 50 points)
Many opportunities for extending the project exist, beyond the basic
functionality required above. You may implement additional features
for extra credit. Particularly impressive submissions will receive
more points, and you may be invited to demo an outstanding entry for
the class. Some ideas include
- some of the example models exhibit z-fighting, due to loss of
precision in the depth buffer (see comments in code near kNear in
config.h for details). This loss of precision is due to the way that
the projection matrix maps depth into normalized device coordinates
non-linearly. Fix the problem.
- instead of performing the matrix blending on the CPU, do
everything on the GPU using shaders.
- linear-blend skinning is very popular thanks to its simplicity,
but has well-known "pinching" and volume-loss artifacts. Implement
dual-quaternion skinning, the current state-of-the-art skinning
algorithm;
- find a cool mesh online and rig it with your own bones file;
- implement pose keyframing: the user specifies a few deformed
poses for the ogre, and hits a key; the program then animates the ogre
by interpolating between the keyframed poses;
Important: Any extra credit you attempt must obey the following
two cardinal rules:
- Your code must obey the above spec and match the
reference solution. In other words, you can add new features for extra
credit, but should not modify the way that existing features
function. Points will be deducted if your extra credit work
causes your program to violate the spec! This includes the
functionality of the T key: if you want to extend the number
or types of textures supported, add a new key that enables the
"extra" textures (and document it in your README).
- Include a brief explanation of any work you would like considered
for extra credit in your README file. Undocumented or poorly-documented
extra credit features risk receiving no credit.
Reminder about Collaboration Policy and Academic Honesty
You are free to talk to other students about the algorithms and theory
involved in this assignment, or to get help debugging, but all code
you submit (except for the starter code, and the external libraries
included in the starter code) must be your own. Please see the
syllabus for the complete collaboration policy.
Last modified: 10/18/16
by Don Fussell
fussell@cs.utexas.edu