I rewrote almost all of the shader code. The concept behind this change is 'Move'em to GPU', which means moving code that used to run on the CPU over to the GPU. Generally speaking, a GPU is fast because of its parallel processing: a CPU has just 2-4 cores, whereas a GPU has hundreds or thousands. That's great indeed, but there is one more thing the GPU does better than the CPU: it can compile and load programs dynamically, so you run only the minimum code you need, when you need it.

That's very important for a complicated system like SEGA Saturn emulation. For example, VDP2 (the Sega Saturn display unit) has 3 blend modes, 4 blend condition options, and 3 color modes. That means there are 3x4x3=36 branches to handle when drawing just one pixel. On the CPU you have to prepare all of those branches statically, which is really complicated and slows emulation down. On the GPU, because shader code can be compiled dynamically, you can assemble just the parts you need and run the generated code straight through, without branches.

Thanks to this 'Move'em to GPU' concept, I am now able to control VDP1 and VDP2 emulation properly and in fine detail.
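To illustrate the idea of assembling shader code from parts, here is a minimal sketch in Python that builds GLSL fragment shader source for one combination of settings. It is not the emulator's actual code: the mode names, the GLSL snippets, the uniform names (`u_vram`, `u_palette`, `u_back`, `u_threshold`) and the function `build_fragment_shader` are all assumptions made up for this example.

```python
from functools import lru_cache
from itertools import product

# All option names and GLSL bodies below are hypothetical placeholders,
# not the real VDP2 modes.
BLEND_MODES = {
    "none": "vec4 blend(vec4 s, vec4 d) { return s; }",
    "alpha": "vec4 blend(vec4 s, vec4 d) { return vec4(mix(d.rgb, s.rgb, s.a), 1.0); }",
    "add": "vec4 blend(vec4 s, vec4 d) { return vec4(min(s.rgb + d.rgb, vec3(1.0)), 1.0); }",
}
BLEND_CONDITIONS = {
    "always": "float blend_enabled(float x) { return 1.0; }",
    "never": "float blend_enabled(float x) { return 0.0; }",
    "less": "float blend_enabled(float x) { return 1.0 - step(u_threshold, x); }",
    "greater_equal": "float blend_enabled(float x) { return step(u_threshold, x); }",
}
COLOR_MODES = {
    "palette": "vec4 fetch_color(vec2 uv) {"
               " return texture(u_palette, vec2(texture(u_vram, uv).r, 0.5)); }",
    "rgb": "vec4 fetch_color(vec2 uv) { return texture(u_vram, uv); }",
    "mixed": "vec4 fetch_color(vec2 uv) { vec4 t = texture(u_vram, uv);"
             " return mix(texture(u_palette, vec2(t.r, 0.5)), t, step(0.5, t.a)); }",
}

HEADER = """#version 330 core
uniform sampler2D u_vram;
uniform sampler2D u_palette;
uniform sampler2D u_back;
uniform float u_threshold;
in vec2 v_uv;
out vec4 frag_color;
"""

# main() never tests which mode is active: the choice was already made
# when the source was assembled, so only the selected functions exist.
MAIN = """void main() {
    vec4 src = fetch_color(v_uv);
    vec4 dst = texture(u_back, v_uv);
    frag_color = mix(src, blend(src, dst), blend_enabled(src.a));
}
"""

@lru_cache(maxsize=None)
def build_fragment_shader(blend_mode: str, condition: str, color_mode: str) -> str:
    """Assemble GLSL source for one combination; cached so each is built once."""
    return "\n".join([
        HEADER,
        COLOR_MODES[color_mode],
        BLEND_CONDITIONS[condition],
        BLEND_MODES[blend_mode],
        MAIN,
    ])

# Every combination that could be requested (3 x 4 x 3 of them).
all_variants = list(product(BLEND_MODES, BLEND_CONDITIONS, COLOR_MODES))
```

In a real renderer the returned string would then be handed to the graphics API (for OpenGL, `glShaderSource` followed by `glCompileShader`), and the compiled program would be cached under the same key so that each variant is compiled only the first time it is needed.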
I'm really happy to share these screenshots, which show the improvements over the previous version.