Project4 #17

Open
wants to merge 40 commits into
base: master
40 commits
8618872
nothing much
Oct 19, 2016
944d680
Merge branch 'master' of github.com:CIS565-Fall-2016/Project4-CUDA-Ra…
Oct 19, 2016
1986b05
questions
Oct 19, 2016
5d70df0
started coding. not working yet'
Oct 20, 2016
c1fb342
added interect and retabbed
ethanabrooks Oct 20, 2016
53e9433
found the macro bug and the depthInit bugs. Now I can't get it to col…
Oct 21, 2016
54b9304
triangle is working. added depth. put common fields in primitve.
Oct 23, 2016
8c26fbe
view transforms are working (I think)
Oct 23, 2016
67be313
got most of _rasterize working on triangle
Oct 24, 2016
e294bce
everything is crashing :('
Oct 25, 2016
ad037f8
still debugging crashes
Oct 25, 2016
96d26c2
got a shadow ducky
Oct 25, 2016
1088dfb
got the ducky working on textures.
Oct 25, 2016
f22e613
got the ducky working on textures.
Oct 25, 2016
fc17db0
working ducky
Oct 26, 2016
e2366d9
solved the screwy texture problem. Now working on depth correction.
Oct 27, 2016
38fa59e
solved the screwy texture problem. Now working on depth correction. t…
Oct 27, 2016
b774e4d
abandoned the mutex project. still fixing textures.
Oct 27, 2016
a1bd0fa
init
Oct 27, 2016
3722e97
got duck to work after reworking init
Oct 28, 2016
8b18a33
duck is working. about to try to o antialiasing in one fatal go
Oct 28, 2016
f4736bb
init
Oct 28, 2016
faf7456
going back to working to see when mixed-depth bug happens
Oct 28, 2016
16a2c4c
solved the upside down bug
Oct 28, 2016
fe888b6
broken. working on screen going black with samplesPerPixel > 1
Oct 28, 2016
7963830
can render with supersampling although color is screwed up.
Oct 28, 2016
4ab27a7
duck is working, but no jitter and cesium truck/cow/box is not working
Oct 28, 2016
a6019c3
added checkerboard
Oct 28, 2016
8d52858
implemented supersampling with jitter
Oct 28, 2016
876105c
added tune shading
Oct 28, 2016
b19dddb
got truck and box working but not cow
Oct 29, 2016
0ca4fee
got pictures
Oct 29, 2016
7eef3f9
README
ethanabrooks Oct 29, 2016
192c34c
images
ethanabrooks Oct 29, 2016
1170dee
images
ethanabrooks Oct 29, 2016
90fcfa7
profiled
Oct 29, 2016
d422c55
Merge branch 'master' of https://github.com/lobachevzky/Project4-CUDA…
Oct 29, 2016
5b8e17e
profile
ethanabrooks Oct 29, 2016
9a38379
merged
ethanabrooks Oct 29, 2016
2a0667e
readme
ethanabrooks Oct 29, 2016
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -1,3 +1,4 @@
message("ROOT")
cmake_minimum_required(VERSION 3.0)

project(cis565_rasterizer)
Binary file added Capture.PNG
Binary file added MilkTruck.PNG
103 changes: 93 additions & 10 deletions README.md
@@ -1,20 +1,103 @@
CUDA Rasterizer
===============

[CLICK ME FOR INSTRUCTION OF THIS PROJECT](./INSTRUCTION.md)

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**

* Ethan Brooks
* Tested on: Windows 7, Intel(R) Xeon(R), GeForce GTX 1070 8GB (SIG Lab)

### Summary
For this project we implemented a graphics rasterizer. Like path tracing, rasterization is a method for converting a specification of points in 3D space into a 2D image. Whereas a path tracer simulates the movement of light rays through space, a rasterizer uses matrix transforms to project 3D objects onto the screen. Also, instead of representing objects as idealized geometric solids, as in the path tracer, a rasterizer decomposes objects into smaller "primitives", usually triangles.

Our basic pipeline is as follows:

Vertex Assembly -> Vertex Transform -> Primitive Assembly -> Rasterization -> Fragment Shading

It's worth noting that while most rasterizer pipelines look something like this, there are variations from one to the next.

## Vertex Assembly
The objects that a rasterizer uses require assembly. Initially, a glTF loader iterates through the "meshes" (collections of associated points) and fills buffers with:

- positions (in model space) of vertices
- normals (also in model space)
- pointers to shared texture images
- coordinates into these shared texture images
- vertices (associated with triangles in the next step)
- indices for associating vertices with primitives

Vertex assembly uses common indices to fill all the values associated with each vertex struct. These include:
- positions
- normals
- texture coordinates

## Vertex Transform
Our rasterizer essentially combines this step with the previous, since we transform vertex positions while assigning them to vertex structs. Transformation is actually not a single step.

Positions start in model space, where the origin is local to each mesh, and are first placed in world space, in which the origin is at an arbitrary global position.

Next they are transformed into view space, where the origin is at the camera. This space is primarily used during vertex shading, when positions relative to the viewer and relative to light sources are taken into account.

Next they are transformed into "clip" space by the projection matrix. Here objects have been projected toward the 2D plane of the screen, but parts that extend past the edge of the screen have not yet been clipped. Dividing by the w component then gives NDC ("Normalized Device Coordinate") space, which is often loosely conflated with clip space. Finally, we scale this space so that the origin is at the lower left corner of the screen and a unit corresponds to a pixel.
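
The last part of this chain, going from a projected position to pixel coordinates, can be sketched as below. This is a minimal illustration rather than the project's code: `Vec3`, `Vec4`, `clipToNdc`, and `ndcToScreen` are hypothetical names (the project ships glm and presumably uses its vector types), and the sketch assumes visible NDC values lie in [-1, 1].

```cpp
// Assumed helper types for illustration only.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Perspective divide: projected (clip-space) position -> NDC.
inline Vec3 clipToNdc(const Vec4& clip) {
    return { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
}

// Viewport scaling: NDC in [-1, 1] -> pixel units, origin at the lower left.
inline Vec3 ndcToScreen(const Vec3& ndc, int width, int height) {
    return { (ndc.x + 1.0f) * 0.5f * static_cast<float>(width),
             (ndc.y + 1.0f) * 0.5f * static_cast<float>(height),
             ndc.z };  // depth is kept for the later occlusion test
}
```

For an 800 x 600 screen, a point at the NDC origin lands at pixel (400, 300), and the NDC corner (-1, -1) lands at the lower left pixel (0, 0).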

## Primitive Assembly
This step simply associates vertices with triangles (or whatever primitive is being used). The indices mentioned earlier map each vertex to its parent primitive.
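
As a rough sketch of this step, assuming an index buffer in which every three consecutive indices form one triangle (the `Vertex`, `Triangle`, and `assembleTriangles` names are hypothetical, and `Vertex` carries only a position here):

```cpp
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };   // hypothetical: position only
struct Triangle { Vertex v[3]; };

// Copy each indexed vertex into slot (i % 3) of primitive (i / 3).
// Every iteration is independent, so on the GPU each one could be
// handled by its own thread.
std::vector<Triangle> assembleTriangles(const std::vector<Vertex>& vertices,
                                        const std::vector<unsigned int>& indices) {
    std::vector<Triangle> triangles(indices.size() / 3);
    for (std::size_t i = 0; i < triangles.size() * 3; ++i) {
        triangles[i / 3].v[i % 3] = vertices[indices[i]];
    }
    return triangles;
}
```

With four vertices and the indices `{0, 1, 2, 2, 1, 3}`, this produces two triangles that share the edge between vertices 1 and 2.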

## Rasterization
This step accounts for the bulk of the code, although the tasks it performs seem trivial. Rasterization takes on two challenges: coverage and occlusion.

# Coverage
Coverage means mapping the vertices of a primitive to the pixels that fall within the area of the primitive. We use the AABB (axis-aligned bounding box) method, where we search within the smallest possible bounding box that surrounds a primitive. Specifically, we scan from the upper left of the box and stop at the lower right, testing each pixel to see if it falls within the primitive and assigning a fragment to it if it does.
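
A minimal sketch of the bounding-box scan, under a few assumptions that are not from the project: the inside test uses signed areas (equivalent to checking that all barycentric coordinates are non-negative), pixels are tested at their centers, and the names `Vec2`, `Pixel`, `insideTriangle`, and `coveredPixels` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };
struct Pixel { int x, y; };

// Twice the signed area of triangle (a, b, c).
inline float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// A point is inside when all three barycentric coordinates are >= 0
// (they sum to 1, so none can then exceed 1).
inline bool insideTriangle(const Vec2 tri[3], Vec2 p) {
    float area = signedArea2(tri[0], tri[1], tri[2]);
    if (area == 0.0f) return false;  // degenerate triangle covers nothing
    float b0 = signedArea2(p, tri[1], tri[2]) / area;
    float b1 = signedArea2(tri[0], p, tri[2]) / area;
    float b2 = signedArea2(tri[0], tri[1], p) / area;
    return b0 >= 0.0f && b1 >= 0.0f && b2 >= 0.0f;
}

// Scan only the triangle's bounding box, clamped to the screen.
std::vector<Pixel> coveredPixels(const Vec2 tri[3], int width, int height) {
    int minX = std::max(0, static_cast<int>(std::floor(std::min({tri[0].x, tri[1].x, tri[2].x}))));
    int maxX = std::min(width - 1, static_cast<int>(std::ceil(std::max({tri[0].x, tri[1].x, tri[2].x}))));
    int minY = std::max(0, static_cast<int>(std::floor(std::min({tri[0].y, tri[1].y, tri[2].y}))));
    int maxY = std::min(height - 1, static_cast<int>(std::ceil(std::max({tri[0].y, tri[1].y, tri[2].y}))));
    std::vector<Pixel> covered;
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            Vec2 center{ x + 0.5f, y + 0.5f };
            if (insideTriangle(tri, center)) covered.push_back(Pixel{x, y});
        }
    }
    return covered;
}
```

The cost of the scan is proportional to the area of the bounding box rather than the area of the triangle, so long thin diagonal triangles test many pixels that turn out to be outside.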

# Occlusion
Points in space should only be rendered if other objects don't obstruct them. To check for occlusion we use "depth testing." We use a "Z buffer" with one entry per sample, so its size is proportional to the number of pixels (we scale it up in the case of supersampling). For each sample of the screen, we update the Z buffer with an integer that corresponds to the depth of the closest point, with larger values corresponding to greater distances. Usually we measure these depths from 0 to INT_MAX. Once all the depths have been updated, we iterate back through the sampled points and only assign fragments where the depth of the fragment equals the depth in the Z buffer, which indicates that the fragment was closer than any other at that sample point.
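
The two passes described above can be sketched on the CPU as follows. This is an analogue, not the project's kernel: `std::atomic<int>` with a compare-exchange loop stands in for an atomic minimum on the GPU, depths are assumed to be normalized to [0, 1] before quantization, and `DepthBuffer`, `quantizeDepth`, `submitDepth`, and `isVisible` are hypothetical names.

```cpp
#include <atomic>
#include <climits>
#include <cstddef>
#include <vector>

// Map a depth in [0, 1] onto the integer range [0, INT_MAX].
inline int quantizeDepth(float z) {
    float clamped = z < 0.0f ? 0.0f : (z > 1.0f ? 1.0f : z);
    // Multiply in double: INT_MAX is not exactly representable as a float.
    return static_cast<int>(static_cast<double>(clamped) * INT_MAX);
}

struct DepthBuffer {
    std::vector<std::atomic<int>> depth;
    explicit DepthBuffer(std::size_t samples) : depth(samples) {
        for (auto& d : depth) d.store(INT_MAX);  // start at "infinitely far"
    }
};

// Pass 1: keep the smallest (closest) depth seen at this sample.
inline void submitDepth(DepthBuffer& zb, std::size_t sample, int depth) {
    int current = zb.depth[sample].load();
    while (depth < current &&
           !zb.depth[sample].compare_exchange_weak(current, depth)) {
        // current was refreshed by compare_exchange_weak; retry if still closer
    }
}

// Pass 2: a fragment is written only if it owns the winning depth.
inline bool isVisible(const DepthBuffer& zb, std::size_t sample, int depth) {
    return zb.depth[sample].load() == depth;
}
```

Using integers rather than floats is what makes a single atomic minimum sufficient for the first pass; the equality check in the second pass then needs no locking.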

## Fragment Shading
During this stage, we shade the base color of each fragment to reflect material properties and lighting. We used a standard technique, Blinn-Phong shading, in which the intensity of the highlight depends on the angle between the surface normal and the "half vector", the normalized mean of the light direction and the view direction: the smaller that angle, the brighter the highlight. Thus a surface is brightest when a viewer sees it head on with a light source at the camera, and it is dark when the object is lit from behind.
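
The half-vector idea behind Blinn-Phong shading can be sketched as below. The `shininess` exponent, the separate diffuse term, and all names (`Vec3`, `dot`, `normalize`, `lambertDiffuse`, `blinnPhongSpecular`) are assumptions made for illustration; the README does not state which terms or constants the project uses.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the zero vector for zero-length input to avoid dividing by zero.
inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return len > 0.0f ? Vec3{v.x / len, v.y / len, v.z / len} : Vec3{0.0f, 0.0f, 0.0f};
}

// Diffuse term: cosine of the angle between the normal and the light direction.
inline float lambertDiffuse(Vec3 normal, Vec3 toLight) {
    return std::max(0.0f, dot(normalize(normal), normalize(toLight)));
}

// Specular term: cosine of the angle between the normal and the half vector,
// raised to a shininess exponent that controls the size of the highlight.
inline float blinnPhongSpecular(Vec3 normal, Vec3 toLight, Vec3 toViewer, float shininess) {
    Vec3 l = normalize(toLight);
    Vec3 v = normalize(toViewer);
    Vec3 half = normalize(Vec3{l.x + v.x, l.y + v.y, l.z + v.z});
    float nDotH = std::max(0.0f, dot(normalize(normal), half));
    return std::pow(nDotH, shininess);
}
```

With the light and viewer both directly in front of the surface, both terms reach their maximum of 1; with the light directly behind the surface, both fall to 0.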

## Additional Features
# Texture Mapping
In order to apply more complex color patterns, our rasterizer gives each primitive a pointer to a texture image. A texture image looks a bit like a flattened, unwrapped version of the object. Each vertex is assigned a "texture coordinate" that points to the spot in the texture image where the object gets its base color. To assign colors to fragments, which usually lie between vertices rather than exactly at them, we simply interpolate the texture coordinates of all three vertices using barycentric coordinates.

Barycentric coordinates associated with a triangle (as in our case) have values that are proportional to a point's nearness to each vertex of the triangle. For example, if a point coincides with the third vertex, its barycentric coordinate would be (0, 0, 1). Moreover, a point falls inside a triangle only if all three of its barycentric coordinates are in the range [0, 1].

Interpolation over barycentric coordinates simply involves weighting the contribution of each vertex by the value of the barycentric coordinate associated with it.

Here is an image of textures applied to a duck:

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/duckAA32-2.PNG)

and to a milk truck:

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/milktruckAA32.PNG)

The main performance cost of texture mapping is the requirement to repeatedly access global memory, both for texture coordinates and the texture image itself. However, an arbitrarily complex texture can be used with only minor additional cost in memory.

Since the texture coordinates of a triangle's three vertices are repeatedly accessed by every pixel that falls within that triangle, this is a feature that would strongly benefit from the use of shared memory.

# Antialiasing
Unlike in previous efforts, this time we used randomization to perform antialiasing. Antialiasing is a process wherein the value assigned to a pixel is actually the average of several colors calculated from points within the pixel. These points are called "samples" and the technique of taking multiple samples per pixel is known as "supersampling." The result may be seen below:

This is the duck with antialiasing (x32):

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/duckAA32.PNG)

And this is the duck without:

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/duckAA1.PNG)

In many ways the contrast is clearest in the case of geometric objects, especially when viewed at an oblique angle:

With antialiasing:

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/checkerboardAA32.PNG)

Without:

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/checkerboardAA1.PNG)
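
The jittered supersampling behind these comparisons can be sketched per pixel as below. This is a sequential CPU simplification under stated assumptions: `samplesPerPixel` follows the name used in the commit messages, while `Color`, `supersamplePixel`, and the `shade` callback (which returns the color seen at a continuous screen position) are hypothetical.

```cpp
#include <random>

struct Color { float r, g, b; };

// Average the colors of several randomly jittered samples inside pixel (px, py).
template <typename ShadeFn>
Color supersamplePixel(int px, int py, int samplesPerPixel,
                       std::mt19937& rng, ShadeFn shade) {
    std::uniform_real_distribution<float> jitter(0.0f, 1.0f);
    Color sum{0.0f, 0.0f, 0.0f};
    for (int s = 0; s < samplesPerPixel; ++s) {
        // Random offset within the pixel's footprint.
        float sx = static_cast<float>(px) + jitter(rng);
        float sy = static_cast<float>(py) + jitter(rng);
        Color c = shade(sx, sy);
        sum.r += c.r;
        sum.g += c.g;
        sum.b += c.b;
    }
    float inv = 1.0f / static_cast<float>(samplesPerPixel);
    return { sum.r * inv, sum.g * inv, sum.b * inv };
}
```

A pixel that lies entirely inside one surface averages to that surface's color, while a pixel straddling an edge averages to a blend, which is what softens the staircase pattern in the images above.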

Like texture mapping, antialiasing comes with a performance cost, and probably a more significant one. In general, runtime scales with the number of samples taken per pixel, as demonstrated by this chart:

![alt text](https://github.com/lobachevzky/Project4-CUDA-Rasterizer/blob/master/antialiasing-profile.PNG)

Two major memory optimizations include:
1. Only sampling at edges, since the effects of aliasing are really only observable there.
2. Directly averaging colors in place in the fragment buffer, instead of increasing the size of the fragment buffer, assigning separate samples to separate indices and subsequently averaging. This proved tricky, since multiple threads would have to access the same index in the fragment buffer simultaneously, leading to race conditions.

### Credits

* [tinygltfloader](https://github.com/syoyo/tinygltfloader) by [@soyoyo](https://github.com/syoyo)
* [glTF Sample Models](https://github.com/KhronosGroup/glTF/blob/master/sampleModels/README.md)
Binary file added antialiasing-profile.PNG
Binary file added box1.png
Binary file added box2.png
Binary file added boxAA1.PNG
Binary file added boxAA32.PNG
Binary file added checkerboard.PNG
Binary file added checkerboard2.PNG
Binary file added checkerboardAA1.PNG
Binary file added checkerboardAA32.PNG
16 changes: 16 additions & 0 deletions debug.txt.txt
@@ -0,0 +1,16 @@

bcCoord=0.007811,0.009287,0.982902
texCoord[0]=0.000100,0.999900
texCoord[1]=0.999900,0.999900
texCoord[2]=0.999900,0.000100
weighted texCoord=0.992090,0.017195
rescaled coord=535.728760,9.285239
color=0.905882,0.905882,0.905882

bcCoord=0.006082,0.009879,0.984039
texCoord[0]=0.000100,0.999900
texCoord[1]=0.999900,0.999900
texCoord[2]=0.999900,0.000100
weighted texCoord=0.993819,0.016058
rescaled coord=536.662292,8.671115
color=0.000000,0.000000,0.000000
Binary file added duck2.PNG
Binary file added duckAA1.PNG
Binary file added duckAA32-2.PNG
Binary file added duckAA32.PNG
1 change: 1 addition & 0 deletions external/include/glm/CMakeLists.txt
@@ -1,3 +1,4 @@
message("GLM")
set(NAME glm_dummy)

file(GLOB ROOT_SOURCE *.cpp)