--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Author: Alan Reiner (USA)
Orig Date: 05 Jan, 2011
Last Update: 03 Feb, 2011
Description: Interactive fractal viewer. Uses CUDA for super-fast fractal
generation, and OpenGL for displaying and interacting
CUDA Compute    Windows   Linux   Mac
    1.0            ?       No     No
    1.1            ?       No     No
    1.2           Yes      No     No
    1.3           Yes      No     No    (All GTX 2XX)
    2.0           Yes      Yes    No    (All GTX 4XX)
    2.1           Yes      Yes    No    (GTX 460)
I don't have a Mac on which to try this.
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
*** One code base for Windows and Linux
--------------------------------------------------------------------------------
Update 04 Feb, 2011
Checked out the Win64 branch, updated the makefile, and compiled. It works!
So I merged it into the master branch and all new development will continue
from here.
--------------------------------------------------------------------------------
*** Windows Support!
--------------------------------------------------------------------------------
Update 03 Feb, 2011
After a tremendous amount of pain and frustration, it appears that NVIDIA does
not want any Windows users to get into CUDA. They don't make their SDK compile
in MSVS 2008 or 2010, and normal constructs I can use in Linux fail miserably
in Windows. I had to completely reorganize the code to get this to work...
With the exception of the mouse scroll wheel, the exact same CUDA/OpenGL code
that worked in Linux also works in Windows. That's nice...
At the moment there are two branches... I need to check out the win64 branch
in Linux and see if I can get it working via preprocessor branches. Then I'll
be able to merge the win64 branch into the master and everyone should be happy!
Then to start implementing new stuff! Dynamic colormaps, new IFS functions,
randomization...
--------------------------------------------------------------------------------
*** MAJOR UPDATE -- Integrated OpenGL with CUDA for interactive fractal viewing!
--------------------------------------------------------------------------------
Update 30 Jan, 2011
The original fractal code in CUDA was combined with CUDA-OpenGL-interop code to
allow for real-time passing of CUDA output to OpenGL for display and interaction
(via video memory -- no need to pass image through host RAM). I don't know how
NVIDIA could've made this task any more complicated... But the torture is over,
and it works (LINUX ONLY -- no Windows support yet).
Also, the code compiles and loads without crashing on devices with CUDA Compute
Capability below 2.0, but it doesn't actually display a fractal. So, for the
time being, consider this code Fermi-only.
With OpenGL installed, this produces a real-time Julia set. Left-click-drag
will pan the view, scroll wheel will zoom. Right-clicking and dragging will
adjust the C-parameter, which causes the fractal to morph in real time. At
this time, the colormap can only be changed by adding a cmap*.txt file,
updating the code in main_gl.cu, and recompiling.
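
The mouse controls described above boil down to updating a handful of view
parameters. The following is only a sketch of one way to do that in C++; the
ViewSketch struct, the function names, and the 0.9 zoom factor are all
hypothetical and are not taken from main_gl.cu.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical view state; none of these names come from main_gl.cu.
struct ViewSketch {
    double centerX, centerY;  // complex-plane point at the window center
    double unitsPerPixel;     // width of one pixel in the complex plane
    double cRe, cIm;          // Julia C-parameter
};

// Left-click-drag: shift the center opposite to the mouse motion (in pixels).
inline void pan(ViewSketch& v, int dx, int dy) {
    v.centerX -= dx * v.unitsPerPixel;
    v.centerY += dy * v.unitsPerPixel;  // window y grows downward
}

// Scroll wheel: each notch scales the view by an assumed factor of 0.9.
inline void zoom(ViewSketch& v, int notches) {
    assert(v.unitsPerPixel > 0.0);
    v.unitsPerPixel *= std::pow(0.9, notches);
}

// Right-click-drag: nudge the C-parameter by the mouse motion.
inline void adjustC(ViewSketch& v, int dx, int dy) {
    v.cRe += dx * v.unitsPerPixel;
    v.cIm -= dy * v.unitsPerPixel;
}
```

After any of these calls the fractal would simply be re-rendered with the new
parameters, which is what makes the morphing look continuous.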
---------------------
This code is a very basic fractal generator using CUDA (NVIDIA graphics card).
The code is kind of sloppy and simplistic (though Mandelbrot and Julia sets
aren't difficult), but it works if you have CUDA installed and a newer NVIDIA
graphics card. Eventually, as I learn more about fractals, I will expand this
library to generate more types of fractals.
To learn more about CUDA, or to download more of the files you need to compile
this code, you can check out my other CUDA project on GitHub,
CUDA-Image-Processing.
If you are familiar with CUDA, you will realize that fractals are an absolutely
*PERFECT* application for CUDA. You don't have to pass in an image, only a few
parameters that define the fractal, and the calculations are relatively simple
and completely parallelizable. You should see the full advantage of your GPU
with this program.
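
To make the "completely parallelizable" point concrete, here is a plain C++
reference sketch of an escape-time Mandelbrot render. It is not the kernel from
this project; the function names and the bailout radius of 2 are assumptions.
The thing to notice is that the only inputs are a few scalars, and that each
pixel is computed from its own (x, y) with no dependence on any other pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Escape-time count for one point c = (cRe, cIm): the number of iterations
// of z -> z*z + c (starting at z = 0) before |z| exceeds 2, capped at maxIter.
inline int escapeTime(double cRe, double cIm, int maxIter) {
    double zRe = 0.0, zIm = 0.0;
    int n = 0;
    while (n < maxIter && zRe * zRe + zIm * zIm <= 4.0) {
        const double t = zRe * zRe - zIm * zIm + cRe;
        zIm = 2.0 * zRe * zIm + cIm;
        zRe = t;
        ++n;
    }
    return n;
}

// Fills a width x height row-major image. Each pixel depends only on its own
// (x, y), so on a GPU every loop body below could run as an independent thread.
inline std::vector<int> renderMandelbrot(int width, int height,
                                         double xMin, double xMax,
                                         double yMin, double yMax, int maxIter) {
    assert(width > 1 && height > 1 && maxIter > 0);
    std::vector<int> img(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const double cRe = xMin + (xMax - xMin) * x / (width - 1);
            const double cIm = yMin + (yMax - yMin) * y / (height - 1);
            img[static_cast<std::size_t>(y) * width + x] = escapeTime(cRe, cIm, maxIter);
        }
    }
    return img;
}
```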
For reference, I put together a much simpler version of this same code in MATLAB,
using MATLAB complex numbers, and it took 632 s to generate a 2048x2048 Mandelbrot
with a max escape time of 256. By comparison, an 8192x8192 Mandelbrot with a max
escape time of 1024 took less than 1 second using this code. Epic!
Typically you would expect 20x-200x speedup from CUDA for most parallelizable
applications (like image processing). However, this program demonstrates that
sometimes you can achieve even more than that (10,000x?) if your problem is
just right.
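
The rough speedup figure can be reproduced from the numbers quoted above. The
sketch below is only back-of-the-envelope arithmetic with hypothetical helper
names: it treats the GPU time as exactly 1 second (the text only says "less
than 1 second", so the results are lower bounds) and assumes both runs cover
the same region of the plane.

```cpp
#include <cassert>

// Ratio of pixel counts between two square images with the given side lengths.
inline double pixelRatio(double smallSide, double largeSide) {
    return (largeSide * largeSide) / (smallSide * smallSide);
}

// Speedup if the fast run did workRatio times as much work as the slow run.
inline double speedupLowerBound(double slowSeconds, double fastSeconds,
                                double workRatio) {
    assert(slowSeconds > 0.0 && fastSeconds > 0.0 && workRatio > 0.0);
    return slowSeconds * workRatio / fastSeconds;
}

// 8192x8192 has 16x the pixels of 2048x2048, and the escape-time cap is 4x
// larger (1024 vs 256). Counting pixels only gives 632 * 16 = 10,112x;
// counting pixels times the iteration cap gives 632 * 64 = 40,448x.
inline double speedupByPixels() {
    return speedupLowerBound(632.0, 1.0, pixelRatio(2048.0, 8192.0));
}
inline double speedupByPixelsAndCap() {
    return speedupLowerBound(632.0, 1.0, pixelRatio(2048.0, 8192.0) * (1024.0 / 256.0));
}
```

Most pixels escape long before the cap, so the true ratio of work probably sits
somewhere between those two figures.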
---------------------
Update 13 Jan, 2011
Added a Colormap class for defining what colors you want in the output png.
Will eventually implement the capability to supply N colors, and a 256x3
colormap will be constructed by spreading out those N colors across the
spectrum of gray values and interpolating the missing ones. An example cmap
is provided in cmap_blue_green.txt.
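
A minimal sketch of the planned N-color interpolation, assuming evenly spaced
anchor colors and plain linear interpolation between neighbors. The Rgb alias
and buildColormap() are hypothetical names, not part of the Colormap class, and
nothing here reflects the actual layout of the cmap*.txt files.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Rgb = std::array<unsigned char, 3>;

// Spreads the N anchor colors evenly over gray levels 0..255 and linearly
// interpolates the levels in between. Requires at least two anchors.
inline std::array<Rgb, 256> buildColormap(const std::vector<Rgb>& anchors) {
    assert(anchors.size() >= 2);
    std::array<Rgb, 256> cmap{};
    const double segments = static_cast<double>(anchors.size() - 1);
    for (int level = 0; level < 256; ++level) {
        const double pos = level / 255.0 * segments;  // position in anchor space
        std::size_t lo = static_cast<std::size_t>(pos);
        if (lo >= anchors.size() - 1) lo = anchors.size() - 2;  // clamp last level
        const double t = pos - static_cast<double>(lo);
        for (int ch = 0; ch < 3; ++ch) {
            const double v = (1.0 - t) * anchors[lo][ch] + t * anchors[lo + 1][ch];
            cmap[level][ch] = static_cast<unsigned char>(v + 0.5);  // round to nearest
        }
    }
    return cmap;
}
```

Coloring a pixel is then just a table lookup: the gray level (for example, the
escape count scaled to 0..255) indexes into the 256-entry map.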
Also implemented more useful tiling to the rendering process. Rather than
writing each CUDA tile to a separate file, it now allocates one large host
image and writes each CUDA tile to the host RAM. This assumes that you are
rendering an image that is too big for your GFX card, but will fit in host
RAM. I changed this because it looks like there's nothing useful to come
out of writing separate files (no print shop will be able to do anything
with a sequence of files...)
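
The tiling idea can be sketched as a copy of each finished tile into its slot
in one large row-major host buffer. This is a host-only illustration with a
hypothetical blitTile() helper; in the real program the tile data would first
be copied back from the GPU, and the pixel type may differ from the int used
here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies one tileW x tileH tile into a larger row-major host image, with the
// tile's top-left corner placed at pixel (x0, y0).
inline void blitTile(std::vector<int>& image, int imageW, int imageH,
                     const std::vector<int>& tile, int tileW, int tileH,
                     int x0, int y0) {
    assert(x0 >= 0 && y0 >= 0 && x0 + tileW <= imageW && y0 + tileH <= imageH);
    assert(image.size() == static_cast<std::size_t>(imageW) * imageH);
    assert(tile.size() == static_cast<std::size_t>(tileW) * tileH);
    for (int row = 0; row < tileH; ++row) {
        const auto src = tile.begin() + static_cast<std::ptrdiff_t>(row) * tileW;
        const auto dst = image.begin() +
                         (static_cast<std::ptrdiff_t>(y0 + row) * imageW + x0);
        std::copy(src, src + tileW, dst);  // one contiguous row at a time
    }
}
```

Only one tile has to fit on the graphics card at a time, while the full image
only has to fit in host RAM.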
---------------------
Update 12 Jan, 2011
** Added Julia Sets:
Julia sets turned out to be very closely related to the Mandelbrot,
and it was actually only a couple of extra lines of code to modify the
kernel to do Julia instead. This also gives me the possibility of
doing 3D fractals by varying the c parameter. However, this may not
be desired, since it seems most Julia sets are rich enough in two
dimensions.
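
A small host-side sketch of why the change is so small, using std::complex
rather than the project's own classes (the function names are hypothetical).
Both fractals run the same z -> z*z + c loop; they differ only in which of the
two values comes from the pixel.

```cpp
#include <cassert>
#include <complex>

// Shared loop: iterate z -> z*z + c until |z| > 2 or maxIter is reached.
inline int iterate(std::complex<double> z, std::complex<double> c, int maxIter) {
    assert(maxIter > 0);
    int n = 0;
    while (n < maxIter && std::norm(z) <= 4.0) {  // std::norm is |z|^2
        z = z * z + c;
        ++n;
    }
    return n;
}

// Mandelbrot: the pixel supplies c, and z starts at 0.
inline int mandelbrot(std::complex<double> pixel, int maxIter) {
    return iterate(std::complex<double>(0.0, 0.0), pixel, maxIter);
}

// Julia: the pixel supplies the starting z, and c is one fixed parameter.
inline int julia(std::complex<double> pixel, std::complex<double> c, int maxIter) {
    return iterate(pixel, c, maxIter);
}
```

Sweeping c over a range and stacking the resulting 2D Julia images is one way
to read the "3D by varying the c parameter" idea.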
** Added writePngFile():
I finally succeeded in harnessing the libpng[12] library to write
png files directly from the main() function, without ever going
through MATLAB. Bypassing MATLAB saves a lot of time and RAM.
** createcolormap.m:
I created a MATLAB method for exploring colormaps. It creates a
256x3 matrix of colors, one for each gray level, to be applied to
an image loaded in MATLAB. Given the completion of writePngFile()
this won't be so useful as a MATLAB script, but will soon be adapted
to C++ so that it can be used in writePngFile() to add color
via the C++ code.
** cudaQuaternion<T> class:
Extends complex numbers to four dimensions. Originally planned to
use quaternions for 3D Mandelbrot, but found out that there is no
such thing (or, rather, the Mandelbrot is pretty boring in higher
dimensions). However, these may be useful for other kinds of
fractals...
---------------------
Update 09 Jan, 2011
** Added the cudaComplex<T> class:
Runs on devices with CUDA Compute Capability 2.0 or higher. Using this,
I can use complex numbers as native, algebraic objects, and will now
be able to implement any kind of IFS fractal. Using the new
cudaComplex<T> class increases computation time by about 5-10%
(probably due to extra memory alloc/dealloc from operators returning
copies of the answers, instead of writing directly to a variable).
Manual complex-multiplication in kernel (4096x4096 tile, 1024 max esc):
---Single-precision w/ mem copy: 0.195 s
---Double-precision w/ mem copy: 0.562 s
Using cudaComplex<T> (4096x4096 tile, 1024 max esc):
---Single-precision w/ mem copy: 0.202 s
---Double-precision w/ mem copy: 0.624 s
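
For the timings above, the overhead works out to roughly 3.6% in single
precision (0.202 / 0.195) and 11% in double precision (0.624 / 0.562). The
sketch below contrasts the two styles being compared. It is not the actual
cudaComplex<T> source: ComplexSketch, FRACTAL_HD, and the function names are
hypothetical, and the macro is just one common way to share code between host
and device builds.

```cpp
#include <cassert>

// Expands to nothing in a plain C++ build. Under nvcc (which defines
// __CUDACC__) it marks the functions as callable from host and device code.
#ifdef __CUDACC__
#define FRACTAL_HD __host__ __device__
#else
#define FRACTAL_HD
#endif

template <typename T>
struct ComplexSketch {
    T re, im;
    // Each operator returns a new object by value.
    FRACTAL_HD ComplexSketch operator+(const ComplexSketch& o) const {
        return {re + o.re, im + o.im};
    }
    FRACTAL_HD ComplexSketch operator*(const ComplexSketch& o) const {
        return {re * o.re - im * o.im, re * o.im + im * o.re};
    }
    FRACTAL_HD T normSq() const { return re * re + im * im; }
};

// Operator style: the iteration reads like the math, z -> z*z + c.
template <typename T>
FRACTAL_HD int escapeCount(ComplexSketch<T> z, ComplexSketch<T> c, int maxIter) {
    assert(maxIter > 0);
    int n = 0;
    while (n < maxIter && z.normSq() <= T(4)) {
        z = z * z + c;
        ++n;
    }
    return n;
}

// Manual style: the same single step with the multiplication expanded by hand.
template <typename T>
FRACTAL_HD void stepManual(T& zRe, T& zIm, T cRe, T cIm) {
    const T t = zRe * zRe - zIm * zIm + cRe;
    zIm = T(2) * zRe * zIm + cIm;
    zRe = t;
}
```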
** Complex<T> class:
This was a regular C++ implementation of the cudaComplex<T> before
I realized that it needed to be converted to __device__ code. I left
this class in the project even though it's completely redundant w.r.t.
std::complex<T>.
** Quaternion Class:
I implemented this in the most tedious way possible, only to realize later
that it's probably completely unnecessary. It hasn't been compiled or
tested in any way; I just wanted to get the non-commutative algebra in
there. I should be able to use them for higher-dimensional fractals,
later.
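
For reference, the non-commutative part is the Hamilton product. The following
is a standalone sketch, not the Quaternion class from this project; QuatSketch
and checkQuaternionIdentities() are hypothetical names.

```cpp
#include <cassert>

template <typename T>
struct QuatSketch {
    T w, x, y, z;  // represents w + x*i + y*j + z*k

    // Hamilton product. The order matters: a*b and b*a generally differ.
    QuatSketch operator*(const QuatSketch& q) const {
        return {w * q.w - x * q.x - y * q.y - z * q.z,
                w * q.x + x * q.w + y * q.z - z * q.y,
                w * q.y - x * q.z + y * q.w + z * q.x,
                w * q.z + x * q.y - y * q.x + z * q.w};
    }
    bool operator==(const QuatSketch& q) const {
        return w == q.w && x == q.x && y == q.y && z == q.z;
    }
};

// Quick self-check of the defining identities i*j = k and j*i = -k.
inline void checkQuaternionIdentities() {
    const QuatSketch<double> i{0, 1, 0, 0}, j{0, 0, 1, 0};
    const QuatSketch<double> k{0, 0, 0, 1}, minusK{0, 0, 0, -1};
    assert(i * j == k);
    assert(j * i == minusK);
}
```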