Q3A with open source generated shaders! - LIBV Intentionally Breaks Videodrivers
Luc Verhaegen

Q3A with open source generated shaders! [Mar. 18th, 2013|08:28 pm]
[Current Location |Rumpelkammer]
[mood |Happy]
[music |Cornershop - Brimful of asha (norman cook remix)]

The combination of limare and open-gpu-tools can now run Quake 3 Arena timedemo without depending on the binary driver for the shader compiler!

Connor Abbott has been his amazing (16y old!) self again in the weeks after his talk at FOSDEM, and he pushed the compiler work in his open-gpu-tools tree far enough to handle basic vertex shaders. Remember that our vertex shader is a rather insane one, where the compiler has to work really hard to get scheduling absolutely right. This is why an assembler for our vertex shader was not too useful, and most of a compiler had to be written before it could generate useful results. A mammoth task, and Connor's vertex shader code is now larger than the code I have in my limare library.

So it was high time that we brought limare and OGT together to see what they were capable of with some basic shaders. Luckily, the Q3A GLES1 emulation has basic shaders. What a nice coincidence :)

So Connor turned my simple vertex shader ESSL into the high level language used by the OGT vertex shader compiler and, through the steps described at this wiki page, turned it into MBS files (Mali Binary Shader - the file type output by the standalone compiler, and also by newer binary driver integrated compilers). Limare can then load and parse those MBS files, and run the shaders. No need to involve the ARM binary anymore when we have OGT generated MBS files :)

The result was quite impressive. We had a few issues where the limare driver (which has mostly taken its cues from the output of the binary driver) and OGT disagreed over symbol layout, but apart from that, bringing up the shaders Connor produced was pretty painless. Amazingly effortless, for such a big step.

Connor then spent another day playing with the fragment shader assembler, fixed some bugs, and produced 3 fragment shaders for us. One for the clear shader used by limare directly, and 2 for Q3A. After some more symbol layout issues, these also just worked! We even seem to be error-margin faster with the MBS files (due to texture coordinate varyings being laid out differently).

So this is a really big milestone for the lima driver project. Even with our insane pre-optimized architecture, we now are able to run Quake 3 Arena without any external dependencies, and we are beating the ARM binary while doing so.

For generating your own shader MBS files, check out Connor's OGT, and then you can head straight to Connor's wiki page. My Q3A tree now has the MBS code included directly. And I pushed a dirty version of my FOSDEM limare code.

As for this new limare code, this fosdem_2013_pile branch will vanish soon, as I still need to properly pry things apart. This is run-for-the-prize code, and often includes many unrelated fixes in the same commit. It's better to do archeology on it now than 3y from now, so this needs to be split. But in the meantime, you all can go and give Q3A on a fully free driver stack on Mali hw a go :)

I will not post a video, as there really is nothing new to see. It is the exact same timedemo, running a few per mille faster. Build things, and then run it yourself on your sunxi hardware (I am still working on porting it to the new kernel of a more powerful platform). That's the best proof there is!

For building limare, check out the fosdem2013_pile branch and then just run make/make install.

For building Q3A all you need to do is run:
make ARCH=arm USE_LIMARE=1
And, when you have the full quake installed in ~ioquake3/baseq3, you can create a file called ~ioquake3/baseq3/demofour.cfg with the following content:
cg_drawfps 1
timedemo 1
set demodone  "quit"
set demoloop1 "demo four; set nextdemo vstr demodone"
vstr demoloop1
You can then run the ioquake3 binary with "+exec demofour.cfg" added to the command line, and you will have the demo running on top of fully free software!
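The steps above can be scripted. The following is only a minimal sketch: the helper name write_demo_cfg and the BASEQ3 variable are made up for illustration, and the default ./baseq3 is just a scratch location, so point BASEQ3 at your real baseq3 directory before using it. The file content is exactly the config listed above.

```shell
#!/bin/sh
# Hypothetical helper: writes demofour.cfg into the given baseq3 directory.
write_demo_cfg() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # Quoted heredoc delimiter, so the shell leaves the quake config untouched.
    cat > "$dir/demofour.cfg" <<'EOF'
cg_drawfps 1
timedemo 1
set demodone  "quit"
set demoloop1 "demo four; set nextdemo vstr demodone"
vstr demoloop1
EOF
}

# Assumption: BASEQ3 is your baseq3 directory. The default is only a scratch
# location so that the sketch runs anywhere.
BASEQ3="${BASEQ3:-./baseq3}"
write_demo_cfg "$BASEQ3"
echo "wrote $BASEQ3/demofour.cfg"

# Then start the game with the config (the binary name depends on your build):
#   <your ioquake3 binary> +exec demofour.cfg
```

The script deliberately stops short of launching the game, since the name of the ioquake3 binary differs per build.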

Now we really have covered all the basics, time to find out how Mesa will play with our plans :)

From: (Anonymous)
2014-01-30 07:28 pm (UTC)
True, and while reverse engineering is much slower (and harder!!), it can also have a positive effect: you don't rely on the possibly inefficient coding of the binary driver. You write an open source driver based on the behaviour of the binary driver, not on its code. And you get rid of bugs that exist in the binary driver (because the human perception of a driver's behaviour will filter out bugs). For firmware, however, it's a different story.
This is something I learnt from reverse engineering the whole Logitech force feedback stack. It will be implemented in a proper way for Linux in some time. Of course, that's much easier than reverse engineering a graphics driver!

Amazing job, Luc! Much respect!