CIS565-Fall-2019 · Iron-Stark · Sep 23, 2019 · Sep 24, 2019 · Sep 26, 2019 · Sep 27, 2019
diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -92,10 +92,12 @@ list(SORT sources)
 source_group(Headers FILES ${headers})
 source_group(Sources FILES ${sources})
 
-#add_subdirectory(stream_compaction)  # TODO: uncomment if using your stream compaction
+add_subdirectory(stream_compaction)
+add_subdirectory(oidn)  # Intel Open Image Denoise
 
 cuda_add_executable(${CMAKE_PROJECT_NAME} ${sources} ${headers})
 target_link_libraries(${CMAKE_PROJECT_NAME}
     ${LIBRARIES}
-    #stream_compaction  # TODO: uncomment if using your stream compaction
+    stream_compaction
+    OpenImageDenoise
     )
diff --git a/README.md b/README.md
@@ -3,11 +3,165 @@ CUDA Path Tracer
 
 **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3**
 
-* (TODO) YOUR NAME HERE
-* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Dewang Sultania
+  * [LinkedIn](https://www.linkedin.com/in/dewang-sultania/)
+* Tested on: Windows 10, Intel Xeon E-2176M @ 2.70GHz 16GB, Quadro P2000 4GB (Personal Computer)
 
-### (TODO: Your README)
+![](img/main.png)
 
-*DO NOT* leave the README to the last minute! It is a crucial part of the
-project, and we will not be able to grade you without a good README.
+### Table of Contents
 
+1. [Overview](#overview)
+2. [Graphics Features](#graphics)
+    1. [Diffusion](#diffusion)
+    2. [Reflection](#reflection)
+    3. [Refraction with Fresnel effects using Schlick's approximation](#refraction)
+    4. [Anti Aliasing](#anti-alias)
+    5. [Motion Blur](#motion-blur)
+    6. [Open Image AI Denoiser](#denoiser)
+3. [Optimization Features](#optimization)
+    1. [Stream Compaction](#stream)
+    2. [Material Sorting](#material-sort)
+    3. [Cache First Bounce](#cache)
+4. [References](#references)
+
+<a name = "overview"/>
+
+## Overview
+
+This repository contains a GPU implementation of a Monte Carlo path tracer. Path tracing is a rendering technique that generates an image by tracing paths of light through the pixels of an image plane and simulating the effects of their encounters with virtual objects. The technique is capable of producing a very high degree of visual realism, usually higher than that of typical scanline rendering methods, but at a greater computational cost. This makes ray tracing best suited for applications where a relatively long render time per frame can be tolerated, such as still images and film and television visual effects, and poorly suited for real-time applications such as video games, where speed is critical. Ray tracing can simulate a wide variety of optical effects, such as reflection, refraction, scattering, and dispersion phenomena (such as chromatic aberration).
+
+![](img/path_tracer.png)
+
+<a name = "graphics"/>
+
+## Graphics Features
+
+This section describes the graphics features that were implemented and shows their results.
+
+<a name = "diffusion"/>
+
+#### Diffusion
+
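+Diffuse shading is obtained using a cosine-weighted hemisphere sampling function. An ideal diffuse surface scatters incident light equally in all directions, so each bounce direction is drawn from the hemisphere around the surface normal with probability proportional to the cosine of its angle to the normal.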
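+
+A minimal host-side sketch of cosine-weighted hemisphere sampling follows. The Vec3 type and all helper names are assumptions made for illustration; the project itself works with glm vectors inside CUDA kernels.
+
+```cpp
+#include <cmath>
+#include <random>
+
+struct Vec3 { float x, y, z; };
+
+static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
+static Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
+static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
+static Vec3 cross(Vec3 a, Vec3 b) {
+    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
+}
+static Vec3 normalize(Vec3 a) { return scale(a, 1.0f / std::sqrt(dot(a, a))); }
+
+// Returns a unit direction in the hemisphere around the unit normal n,
+// drawn with probability density cos(theta) / pi.
+Vec3 cosineSampleHemisphere(Vec3 n, std::mt19937 &rng) {
+    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
+    float u1 = u01(rng);
+    float u2 = u01(rng);
+    float up = std::sqrt(u1);           // cos(theta)
+    float over = std::sqrt(1.0f - u1);  // sin(theta)
+    float phi = 2.0f * 3.14159265f * u2;
+
+    // Build a tangent frame from an axis that is not parallel to n.
+    Vec3 helper = std::fabs(n.x) < 0.57735f ? Vec3{1.0f, 0.0f, 0.0f}
+                : std::fabs(n.y) < 0.57735f ? Vec3{0.0f, 1.0f, 0.0f}
+                                            : Vec3{0.0f, 0.0f, 1.0f};
+    Vec3 t1 = normalize(cross(n, helper));
+    Vec3 t2 = cross(n, t1);
+
+    return add(scale(n, up),
+               add(scale(t1, std::cos(phi) * over), scale(t2, std::sin(phi) * over)));
+}
+```
+
+Because the sampling density already matches the cosine term of the rendering equation, a path tracer that samples this way can multiply the path throughput by the surface color alone at a diffuse bounce.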
+
+<a name = "reflection"/>
+
+#### Reflection
+
+Reflection is implemented using glm::reflect.
+
+![](img/reflection.jpg)
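+
+For reference, glm::reflect evaluates the mirror formula r = i - 2 (n . i) n. The sketch below writes it out with a plain Vec3 struct, which is an assumption made to keep the example free of dependencies.
+
+```cpp
+struct Vec3 { float x, y, z; };
+
+static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
+
+// Mirrors the incident direction i about the unit surface normal n.
+Vec3 reflect(Vec3 i, Vec3 n) {
+    float k = 2.0f * dot(n, i);
+    return {i.x - k * n.x, i.y - k * n.y, i.z - k * n.z};
+}
+```
+
+A ray travelling along (1, -1, 0) that hits a floor with normal (0, 1, 0) leaves along (1, 1, 0).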
+
+<a name = "refraction"/>
+
+#### Refraction with Fresnel effects using Schlick's approximation
+
+
+![](img/refraction.png)
+
+Refraction is implemented using glm::refract, with a toggle that enables Schlick's approximation of the Fresnel reflectance. The special case of total internal reflection is also handled.
+
+Without Schlick's approximation       |  With  Schlick's approximation 
+:-------------------------:|:-------------------------:
+![](img/refraction_no_fresnel.png) | ![](img/fresnel.png)
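+
+The two ingredients mentioned above can be sketched as follows. This is the textbook form of Schlick's approximation and of the total internal reflection test, not necessarily the exact code in this repository; the function names are assumptions.
+
+```cpp
+#include <cmath>
+
+// Schlick's approximation of the Fresnel reflectance for light travelling
+// from a medium with refractive index n1 into one with index n2.
+// cosTheta is the cosine of the angle between the incident ray and the normal.
+float schlickReflectance(float cosTheta, float n1, float n2) {
+    float r0 = (n1 - n2) / (n1 + n2);
+    r0 *= r0;
+    return r0 + (1.0f - r0) * std::pow(1.0f - cosTheta, 5.0f);
+}
+
+// Snell's law has no solution when sin^2 of the transmitted angle would
+// exceed 1; the ray is then totally internally reflected.
+bool totalInternalReflection(float cosTheta, float n1, float n2) {
+    float eta = n1 / n2;
+    float sin2ThetaT = eta * eta * (1.0f - cosTheta * cosTheta);
+    return sin2ThetaT > 1.0f;
+}
+```
+
+One common way to use the reflectance is to draw a uniform random number per bounce and reflect the ray when the number falls below it, refracting otherwise. For air to glass (indices 1.0 and 1.5) the reflectance at normal incidence is 0.04 and rises to 1 at grazing angles.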
+
+<a name = "anti-alias"/>
+
+#### Anti Aliasing
+
+Anti-aliasing is achieved by jittering the origin of the ray sent out from each pixel using uniform sampling.
+
+With Anti Aliasing       |  Without Anti Aliasing
+:-------------------------:|:-------------------------:
+![](img/alias.JPG) | ![](img/no-alias.JPG)
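+
+A minimal sketch of the jitter, assuming it is applied to the sample position inside the pixel before the camera ray is built. The Sample type and the function name are hypothetical.
+
+```cpp
+#include <random>
+
+struct Sample { float x, y; };
+
+// Returns a sample position inside pixel (px, py): the pixel's corner plus a
+// uniform random offset of up to one pixel along each axis.
+Sample jitteredPixelSample(int px, int py, std::mt19937 &rng) {
+    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
+    return {static_cast<float>(px) + u01(rng), static_cast<float>(py) + u01(rng)};
+}
+```
+
+Since every iteration draws a new offset, averaging the iterations blends the colors on both sides of an edge that crosses the pixel, which is what softens the staircase pattern.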
+
+<a name = "motion-blur"/>
+
+#### Motion Blur
+Motion blur is produced by averaging many samples of the scene taken at different points along an object's motion.
+
+![](img/motion_blur.png)
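+
+One way to realize this averaging is sketched below. It assumes the object moves in a straight line between two positions and that each iteration traces the scene at a random shutter time; the README does not state how the motion is parameterized, so both choices and all names are assumptions.
+
+```cpp
+#include <random>
+
+struct Vec3 { float x, y, z; };
+
+// Position of an object that moves in a straight line from start to end,
+// evaluated at shutter time t in [0, 1].
+Vec3 positionAtTime(Vec3 start, Vec3 end, float t) {
+    return {start.x + t * (end.x - start.x),
+            start.y + t * (end.y - start.y),
+            start.z + t * (end.z - start.z)};
+}
+
+// Draws a random shutter time for one iteration; averaging iterations traced
+// at different times smears the object along its path.
+float sampleShutterTime(std::mt19937 &rng) {
+    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
+    return u01(rng);
+}
+```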
+
+<a name = "denoiser"/>
+
+#### Open Image AI Denoiser
+I was able to get the denoiser mostly working, with all credit to Intel's out-of-this-world documentation (really, only aliens can understand it). Here is my blooper reel from the attempt.
+
+Blooper 1       |  Blooper 2 | Blooper 3 | Blooper 4
+:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:
+![](img/denoise_blooper.png) | ![](img/denoise_blooper2.png)|  ![](img/denoise_blooper3.png) |  ![](img/denoise_blooper4.png)
+
+Finally, after fixing my issues, I was able to get it working.
+
+According to the documentation, the library expects pixel values in little-endian format, so I had written a ReverseFloat function to convert big-endian to little-endian. Using that function is what produced the blooper reel; the float buffers were presumably already little-endian on this x86 machine. Without it, I got this result after 5 iterations:
+
+
+
+Original      |  Denoised
+:-------------------------:|:-------------------------:
+![](img/denoise_orig.png) | ![](img/denoise_decent.png)
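+
+The byte-order issue described above can be checked with a short sketch. The reverseFloatBytes function is a hypothetical stand-in for the ReverseFloat helper, whose actual code is not shown in this README.
+
+```cpp
+#include <cstdint>
+#include <cstring>
+
+static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");
+
+// True when the least significant byte of a multi-byte value is stored first.
+bool hostIsLittleEndian() {
+    const std::uint32_t probe = 1;
+    unsigned char firstByte;
+    std::memcpy(&firstByte, &probe, 1);
+    return firstByte == 1;
+}
+
+// Reverses the four bytes of a float.
+float reverseFloatBytes(float value) {
+    unsigned char b[4];
+    std::memcpy(b, &value, 4);
+    unsigned char r[4] = {b[3], b[2], b[1], b[0]};
+    float out;
+    std::memcpy(&out, r, 4);
+    return out;
+}
+```
+
+On a host where hostIsLittleEndian() returns true, the float buffers are already in the byte order the library expects, and reversing them turns every pixel value into an unrelated number.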
+
+
+Then I also passed albedo and normal buffers to the library. The results after doing that were:
+
+Original      |  Denoised
+:-------------------------:|:-------------------------:
+![](img/original_albedo_nromal.png) | ![](img/denoise_albedo_nromal.png) 
+
+Setting up and building the library was hard. I have listed the steps I took below; I hope they can serve as easy-to-follow setup documentation, because the existing one is simply great!!
+
+* Install tbb from here: https://github.com/intel/tbb/releases
+* Then run this command: ```git clone --recursive https://github.com/OpenImageDenoise/oidn.git```
+* Then copy the oidn folder into your Path Tracer folder
+* Now in your CMakeLists.txt add the lines ```add_subdirectory(oidn)``` and add ```OpenImageDenoise``` to target_link_libraries
+* Then run ```cmake-gui ..``` and add the following four entries before clicking configure:
+  * ```TBB_ROOT``` which is equal to something like ```C:/Users/dewan/Desktop/tbb2019_20190605oss_win/tbb2019_20190605oss```
+  * ```TBB_INCLUDE_DIR``` which is something like ```C:\Users\dewan\Desktop\tbb2019_20190605oss_win\tbb2019_20190605oss\include```
+  * ```TBB_LIBRARY``` which is something like ```C:\Users\dewan\Desktop\tbb2019_20190605oss_win\tbb2019_20190605oss\lib\intel64\vc14\tbb_debug.lib```
+  * ```TBB_LIBRARY_MALLOC``` which is something like ```C:\Users\dewan\Desktop\tbb2019_20190605oss_win\tbb2019_20190605oss\lib\intel64\vc14\tbbmalloc_debug.lib```
+* Now install oidn from here ```https://github.com/OpenImageDenoise/oidn/releases``` and copy ```OpenImageDenoise.dll, tbb.dll, tbbmalloc.dll``` from the bin folder to your System32 windows folder.
+
+The code should build now; at least it did for me. I make no guarantees, though, as these steps came out of solving every error message thrown at me along the way.
+
+
+<a name = "optimization"/>
+
+## Optimization Features
+
+<a name = "stream"/>
+
+#### Stream Compaction
+
+After each bounce, some rays terminate, for example by hitting the light source. We can stop the threads assigned to these rays or, equivalently, launch fewer threads for the next bounce. Using the thrust::partition function, all active rays are kept together after every bounce, and only those are launched again.
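+
+The idea can be sketched on the CPU with std::partition, which follows the same contract as thrust::partition. The PathSegment fields shown here are assumptions made for illustration.
+
+```cpp
+#include <algorithm>
+#include <cstddef>
+#include <vector>
+
+struct PathSegment {
+    int pixelIndex;
+    int remainingBounces;  // 0 once the path has terminated
+};
+
+// Moves every path that can still bounce to the front and returns how many
+// there are; only that many threads are needed for the next bounce.
+std::size_t compactActivePaths(std::vector<PathSegment> &paths) {
+    auto firstDead = std::partition(
+        paths.begin(), paths.end(),
+        [](const PathSegment &p) { return p.remainingBounces > 0; });
+    return static_cast<std::size_t>(firstDead - paths.begin());
+}
+```
+
+Terminated paths stay in the buffer behind the partition point, so their accumulated colors remain available when the image is gathered at the end of the iteration.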
+
+<a name = "material-sort"/>
+
+#### Material Sort
+
+This idea is based on the fact that if neighboring threads are shading the same material type, they run the same instructions, which results in less warp divergence. Paths are therefore sorted by material type before shading.
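+
+A CPU sketch of the reordering is shown below, assuming one integer material id per path; the function name is hypothetical. On the GPU the same reordering could be done with a key-value sort such as thrust::sort_by_key.
+
+```cpp
+#include <algorithm>
+#include <numeric>
+#include <vector>
+
+// Returns path indices ordered so that paths sharing a material id are
+// adjacent; ties keep their original relative order.
+std::vector<int> orderByMaterial(const std::vector<int> &materialIds) {
+    std::vector<int> order(materialIds.size());
+    std::iota(order.begin(), order.end(), 0);
+    std::stable_sort(order.begin(), order.end(),
+                     [&](int a, int b) { return materialIds[a] < materialIds[b]; });
+    return order;
+}
+```
+
+For material ids 2, 0, 1, 0, 2 the resulting order is 1, 3, 2, 0, 4, so consecutive threads shade material 0, then 1, then 2.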
+
+<a name = "cache"/>
+
+#### Cache First Bounce
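+Every iteration, the first ray for a pixel starts at the camera and passes through that same pixel, so it hits the scene at the same location. We can therefore cache the first-bounce intersections during the first iteration and reuse them instead of recalculating them. This holds only when the camera rays are not jittered for anti-aliasing.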
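+
+The caching logic can be sketched as a small wrapper around the intersection pass. The Hit type, the class, and the callback signature are all assumptions made for illustration; in the project the cached data would live in a device buffer.
+
+```cpp
+#include <functional>
+#include <vector>
+
+struct Hit {
+    float t;         // distance along the ray
+    int materialId;  // material at the hit point
+};
+
+class FirstBounceCache {
+public:
+    // Returns the first-bounce hits, calling trace only on the first request.
+    const std::vector<Hit> &get(const std::function<std::vector<Hit>()> &trace) {
+        if (!valid_) {
+            hits_ = trace();
+            valid_ = true;
+        }
+        return hits_;
+    }
+
+    // Call when the camera or the scene changes.
+    void invalidate() { valid_ = false; }
+
+private:
+    std::vector<Hit> hits_;
+    bool valid_ = false;
+};
+```
+
+Each iteration after the first then starts from the cached hits and only traces the second and later bounces.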
+
+
+
+A performance comparison of these optimizations can be seen below:
+
+![](img/perf.JPG) 
+
+
+
+<a name = "references"/>
+
+## References
+
+1. https://en.wikipedia.org/wiki/Ray_tracing_(graphics)
+2. https://en.wikipedia.org/wiki/Schlick%27s_approximation
+3. http://viclw17.github.io/2018/07/17/raytracing-camera-and-msaa/
+4. https://www.andrew.cmu.edu/user/hgifford/projects/msaa.pdf
+5. https://github.com/RayTracing/InOneWeekend
diff --git a/img/alias.JPG b/img/alias.JPG
diff --git a/img/anti-alias.JPG b/img/anti-alias.JPG
diff --git a/img/denoise_albedo_nromal.png b/img/denoise_albedo_nromal.png
diff --git a/img/denoise_blooper.png b/img/denoise_blooper.png
diff --git a/img/denoise_blooper2.png b/img/denoise_blooper2.png
diff --git a/img/denoise_blooper3.png b/img/denoise_blooper3.png
diff --git a/img/denoise_blooper4.png b/img/denoise_blooper4.png
diff --git a/img/denoise_decent.png b/img/denoise_decent.png
diff --git a/img/denoise_orig.png b/img/denoise_orig.png
diff --git a/img/fresnel.png b/img/fresnel.png
diff --git a/img/main.png b/img/main.png
diff --git a/img/motion_blur.png b/img/motion_blur.png
diff --git a/img/no-alias.JPG b/img/no-alias.JPG
diff --git a/img/original_albedo_nromal.png b/img/original_albedo_nromal.png
diff --git a/img/path_tracer.png b/img/path_tracer.png
diff --git a/img/perf.JPG b/img/perf.JPG
diff --git a/img/reflection.jpg b/img/reflection.jpg
diff --git a/img/refraction.png b/img/refraction.png
diff --git a/img/refraction_no_fresnel.png b/img/refraction_no_fresnel.png
diff --git a/oidn/.gitignore b/oidn/.gitignore
@@ -0,0 +1,87 @@
+# This file is used to ignore files which are generated
+# ----------------------------------------------------------------------------
+
+*~
+*.autosave
+*.a
+*.core
+*.moc
+*.o
+*.obj
+*.orig
+*.rej
+*.so
+*.so.*
+*_pch.h.cpp
+*_resource.rc
+*.qm
+.#*
+*.*#
+core
+!core/
+tags
+.DS_Store
+.directory
+*.debug
+*.prl
+*.app
+moc_*.cpp
+ui_*.h
+qrc_*.cpp
+Thumbs.db
+*.res
+*.rc
+/.qmake.cache
+/.qmake.stash
+
+# Qt Creator generated files
+*.txt.user*
+*.pro.user*
+
+# xemacs temporary files
+*.flc
+
+# Vim temporary files
+.*.swp
+
+# Visual Studio generated files
+*.ib_pdb_index
+*.idb
+*.ilk
+*.pdb
+*.sln
+*.suo
+*.vcproj
+*vcproj.*.*.user
+*.ncb
+*.sdf
+*.opensdf
+*.vcxproj
+*vcxproj.*
+*.log
+
+# Visual Studio Code generated files
+.vscode
+
+# MinGW generated files
+*.Debug
+*.Release
+
+# Python byte code
+*.pyc
+
+# Binaries
+*.dll
+*.exe
+
+# Build directories
+build*
+
+# Dependencies
+deps
+
+# Data directories
+images
+
+# Generated files
+include/OpenImageDenoise/version.h
diff --git a/oidn/.gitmodules b/oidn/.gitmodules
@@ -0,0 +1,6 @@
+[submodule "mkl-dnn"]
+	path = mkl-dnn
+	url = ../mkl-dnn.git
+[submodule "weights"]
+	path = weights
+	url = ../oidn-weights.git
diff --git a/oidn/CHANGELOG.md b/oidn/CHANGELOG.md
@@ -0,0 +1,51 @@
+Version History
+---------------
+
+### Changes in v1.0.0:
+
+-   Improved denoising quality
+    -   More details preserved
+    -   Less artifacts (e.g. noisy spots, color bleeding with albedo/normal)
+-   Added `maxMemoryMB` filter parameter for limiting the maximum memory
+    consumption regardless of the image resolution, potentially at the cost
+    of lower denoising speed. This is internally implemented by denoising the
+    image in tiles
+-   Significantly reduced memory consumption (but slightly lower performance)
+    for high resolutions (> 2K) by default: limited to about 6 GB
+-   Added `alignment` and `overlap` filter parameters that can be queried for
+    manual tiled denoising
+-   Added `verbose` device parameter for setting the verbosity of the console
+    output, and disabled all console output by default
+-   Fixed crash for zero-sized images
+
+### Changes in v0.9.0:
+
+-   Reduced memory consumption by about 38%
+-   Added support for progress monitor callback functions
+-   Enabled fully concurrent execution when using multiple devices
+-   Clamp LDR input and output colors to 1
+-   Fixed issue where some memory allocation errors were not reported
+
+### Changes in v0.8.2:
+
+-   Fixed wrong HDR output when the input contains infinities/NaNs
+-   Fixed wrong output when multiple filters were executed concurrently on
+    separate devices with AVX-512 support. Currently the filter executions are
+    serialized as a temporary workaround, and a full fix will be included in a
+    future release.
+-   Added OIDN_STATIC_LIB CMake option for building as a static library
+    (requires CMake 3.13.0 or later)
+-   Fixed CMake error when adding the library with add_subdirectory() to a project
+
+### Changes in v0.8.1:
+
+-   Fixed wrong path to TBB in the generated CMake configs
+-   Fixed wrong rpath in the binaries
+-   Fixed compile error on some macOS systems
+-   Fixed minor compile issues with Visual Studio
+-   Lowered the CPU requirement to SSE4.1
+-   Minor example update
+
+### Changes in v0.8.0:
+
+-   Initial beta release