I’m going to begin my renderer studies by writing a SIMD, multithreaded implementation of smallpt (http://kevinbeason.com/smallpt/).
I first looked at the smallpt source code a couple of months ago, after SmallptGPU (http://davibu.interfree.it/opencl/smallptgpu/smallptGPU.html), an OpenCL GPGPU implementation of it, came out. I was learning DirectCompute at the time and thought it would be good practice to write a DirectCompute version of smallpt, but I ended up writing a DirectCompute Buddhabrot renderer (http://www.yakiimo3d.com/2010/03/29/dx11-directcompute-buddhabrot-nebulabrot/) instead.
Using SmallptGPU as a reference, I got a CPU version of smallpt working inside a DX11 StructuredBuffer render framework. I’m picking up where I left off and using that source code for my smallpt studies. With my current CPU version of smallpt, on my Core 2 Quad Q6600 at 2.4 GHz, I get around 0.46 fps rendering a 400×400 image with 2×2 supersampling (4 samples per pixel), for 400 × 400 × 4 × 0.46 = 294,400 samples/sec.
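
To make that throughput figure easy to recompute as the renderer gets faster, here is a minimal sketch of the arithmetic in C++. The function names samplesPerFrame and samplesPerSecond are made up for illustration and are not part of smallpt or SmallptGPU, and the sketch assumes one sample per subpixel per frame, which is how the number above is calculated.

```cpp
#include <cstdint>

// Hypothetical helper (not from smallpt): total samples traced for one frame.
// subpixelsPerPixel is 4 for a 2x2 supersampling grid.
// Assumes one sample per subpixel per frame.
constexpr std::int64_t samplesPerFrame(std::int64_t width,
                                       std::int64_t height,
                                       std::int64_t subpixelsPerPixel) {
    return width * height * subpixelsPerPixel;
}

// Hypothetical helper: throughput in samples per second for a measured frame rate.
constexpr double samplesPerSecond(std::int64_t width,
                                  std::int64_t height,
                                  std::int64_t subpixelsPerPixel,
                                  double framesPerSecond) {
    return static_cast<double>(samplesPerFrame(width, height, subpixelsPerPixel))
           * framesPerSecond;
}
```

Calling samplesPerFrame(400, 400, 4) gives 640,000 samples per frame, and samplesPerSecond(400, 400, 4, 0.46) gives roughly 294,400, matching the figure above. Keeping the frame rate as a separate parameter means the same helper can be reused to compare the SIMD and multithreaded versions later.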