Learning to write Unity Shaders

Dynamically generated infinite scrolling terrain with a custom shader (inside the Unity editor)

Unity has replaced Photoshop in terms of adding features faster than I can assimilate them. It’s been possible to write custom shaders for years, but every time I tried to do something non-trivial I would give up in frustration. Recently, however, I finally wrote some pretty nice dynamic terrain and dynamic planet code, but was very frustrated with the shading options available.

What I wanted to do, in essence, was embed multiple tiling textures inside a single texture, and then have a shader continuously interpolate between a pair of those textures based on altitude, biome, or whatever. This doesn’t appear to be something other people are doing, so I was not going to be able to take an existing shader and tweak a couple of things to make it work. I’d actually need to understand what I was doing.

If you look at the picture, you’ll see seamless transitions (based on altitude) between a sand dune texture (at sea level) and a forest texture (at middle levels) and, further up, the beginning of a transition to bare rock. I’ve got a darker blue-tinged rock below the sand, so the material I’m using looks like this:

A single texture that contains multiple tiling textures. Most of the texture is blank, but you get the idea.

Obviously there’s room for expansion. I could do some really interesting things with this technique (even more so if I can interpolate between three or four samples without choking the GPU). I haven’t figured out how to benchmark this stuff: I’m not seeing any hit on the GPU in Unity’s profiler, but I haven’t tried running this on a mobile device. So far, this shader seems to run just as fast as (say) a standard diffuse shader.

How to Start

Writing shaders is actually pretty simple. The big problems are finding useful documentation (I couldn’t find any useful documentation on ShaderLab, but nVidia’s Cg documentation turns out to do the trick) and tutorials (I couldn’t find any). It doesn’t help that Unity has made radical changes to its shader language over time (not really their fault; the underlying GPU architectures have been in flux), which makes a lot of the tutorials you do find worse than useless.

For the record, I’m still using Unity 4.6.x so this may all be obsolete in Unity 5.x. That said, I was working off the latest Unity 5.x online documentation.

The closest thing to useful tutorials I could find is in the Unity documentation, specifically the Surface Shader Examples. Sadly, you’re going to need to infer a great deal from these examples, because I couldn’t find explanations of the simplest things (e.g. how data gets from the shader’s UI to the actual pixel shader code; there’s a lot of automagical linking going on).

Shader "Custom/DynamicTerrainShader" {
	Properties {
		_MainTex ("Base (RGB)", 2D) = "white" {}
		_Color ("Main Color", Color) = (0,0,0,1)
		_WorldScale("World Scale", Float) = 0.25
		_AltitudeScale("Altitude Scale", float) = 0.25
		_TerrainBands("Terrain Bands", Int) = 4
	}
	SubShader {
		Tags { "RenderType"="Opaque" }
		LOD 200
		
		CGPROGRAM
		#pragma surface surf Lambert

		sampler2D _MainTex;
		float _WorldScale;
		float _AltitudeScale;
		float4 _Color;
		int _TerrainBands;

		struct Input {
			float3 worldPos;
			float2 uv_MainTex;
			float2 uv_BumpMap;	// declared but not actually used below
		};

		void surf (Input IN, inout SurfaceOutput o) {
			// Normalize world-space altitude into a roughly 0..1 range.
			float y = IN.worldPos.y / _AltitudeScale + 0.5;
			// Each terrain band is a horizontal strip of the texture.
			float bandWidth = 1.0 / _TerrainBands;
			// The integer part of s picks the band; the fractional part drives the blend to the next band up.
			float s = clamp(y * (_TerrainBands + 2) - 2, 0, _TerrainBands);
			float t = frac(s);
			// Piecewise linear easing: hold near the band edges, ramp more quickly through the middle.
			t = t < 0.25 ? t * 0.5 : ( t > 0.75 ? (t - 0.75) * 0.5 + 0.875 : t * 1.5 - 0.25);
			float band = floor(s) * bandWidth;
			// Tile the UVs, then inset them slightly so sampling stays inside the band.
			float2 uv = frac(IN.uv_MainTex * _WorldScale) * (bandWidth - 0.006) + 0.003;
			uv.y = uv.y + band;
			// The second sample comes from the next band up.
			float2 uv2 = uv;
			uv2.y = uv2.y + bandWidth;
			// Blend the two band samples by t.
			half4 c = tex2D(_MainTex, uv) * (1 - t) + tex2D(_MainTex, uv2) * t;
			o.Albedo = c.rgb * 0.5;
			o.Alpha = c.a;
		}
		ENDCG
	} 
	FallBack "Diffuse"
}

So here’s my explanation for what it’s worth.

To refer to properties in your code, you need to declare them (again) inside the shader body. Look at the occurrences of _MainTex as an instructional example. As far as I can tell, you have to figure out by osmosis which parameters are available where, and which type declarations in the shader body correspond to the (different) types in the Properties declaration.

The Input block is where you declare which bits of “ambient” information your shader wants from the rendering engine. Again, what you can declare in here, and what each item means, you simply have to figure out from examples. I figured out how worldPos worked from the example which turns the soldier into strips.

Note that the Input block declaration determines what is passed in as Input (referred to as IN in the body). The way the declaration works (you pick from a set of magic field names rather than declaring your own variables) is a bit puzzling, but it kind of makes sense. The SurfaceOutput object is essentially a set of parameters for the pixel that is about to be rendered. So the simplest shader body would be something like o.Albedo = float3(1, 0, 0), which gives you a constant red (depending on the basic shader type, lighting would or wouldn’t be applied, etc.).
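To make that concrete, here is roughly the smallest complete surface shader I could write along those lines (the shader name is mine, and the unused worldPos field is only there because Input has to declare something):

Shader "Custom/SolidRed" {
	SubShader {
		Tags { "RenderType"="Opaque" }
		CGPROGRAM
		#pragma surface surf Lambert

		struct Input {
			float3 worldPos;	// not used; Input just has to declare something
		};

		void surf (Input IN, inout SurfaceOutput o) {
			o.Albedo = float3(1, 0, 0);	// constant red; Lambert lighting is still applied
			o.Alpha = 1;
		}
		ENDCG
	}
	FallBack "Diffuse"
}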

Variables and calculations are all about vectors. Basically everything is a vector. A 3D point is a float3, a 2D point is a float2. You can add and multiply vectors (component by component) so (1,0.5) + (-0.5, 0.25) -> (0.5,0.75). You can mix scalars and vectors in some obvious and not-so-obvious ways (hint: it usually pays to be explicit about components).
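A scratch example of how the arithmetic behaves (the variable names and values are just mine):

float2 a = float2(1.0, 0.5);
float2 b = float2(-0.5, 0.25);
float2 sum = a + b;		// (0.5, 0.75)
float2 prod = a * b;		// (-0.5, 0.125): multiplication is component-wise too, not a dot product
float2 scaled = a * 2.0;	// (2.0, 1.0): a scalar gets applied to every component
float d = dot(a, b);		// -0.375: use dot() when you actually want a dot product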

The naming conventions are interesting. For vectors, you can use x, y, and z as shorthand for accessing specific components. I’m not sure if the fourth coordinate is w or a or something else. I’m also pretty sure that spatial coordinates are not in the order I think they are (so I do ok with foo.x, but get into trouble if I try to handle specific components via (,,) expressions). Hence lines like uv.y = uv.y + band instead of uv = uv + (0,band,0) (which doesn’t work).
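A quick sketch of what does and doesn’t seem to work (band and IN.uv_MainTex are from the shader above; everything else is mine):

float4 p = float4(1, 2, 3, 4);
float first = p.x;		// 1
float last = p.w;		// 4: p.a names the same (fourth) component
float2 xy = p.xy;		// (1, 2): swizzling pulls out any combination of components
float2 uv = IN.uv_MainTex;
uv = uv + float2(0, band);	// be explicit: a bare (0, band) or (0, band, 0) is not a vector constructor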

You may have noticed some handy functions such as floor and frac being used and wondered what else there is. I couldn’t find a list or reference on the Unity website, but eventually found this Cg standard library documentation on nVidia’s website (for its shader language). Everything I tried from this list seemed to work (on my nVidia-powered Macbook Pro).
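A few worked examples (floor, frac, and clamp are the ones the shader above leans on; lerp and saturate are also in the library and handy):

float s = 3.7;
float f = floor(s);			// 3.0: round down
float r = frac(s);			// 0.7: fractional part
float c = clamp(s, 0.0, 1.0);		// 1.0: constrain to a range (saturate(s) is shorthand for this)
float m = lerp(2.0, 4.0, 0.25);		// 2.5: linear interpolation; works component-wise on vectors too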

If you’re looking for control structures and the like, I haven’t found any aside from the ternary operator (condition ? value-if-true : value-if-false), which is fully supported, can be nested, etc. This alone would probably have driven me away just five years ago, before I learned to stop worrying and love the ternary operator.

Why no switch statements, loops, and such? I’m writing a pixel shader here, and I suspect it relies on every program executing the same instruction at the same time, so conditional loops are out. (Actually, I may be wrong about this; see the Cg language documentation. I don’t know how closely ShaderLab corresponds to Cg, though.)

Once you see that list of functions you’ll understand why I am using piecewise linear interpolation between the materials (it looks just fine to me).
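To spell the interpolation out, here is that part of the shader above, unpacked, with the final blend rewritten using lerp() (which is equivalent to the explicit (1 - t) / t weighting I used):

float t = frac(s);
// Piecewise linear easing: gentle slope near each band edge, steeper ramp through the middle.
t = t < 0.25 ? t * 0.5				// bottom quarter: stay close to the lower texture
  : t > 0.75 ? (t - 0.75) * 0.5 + 0.875		// top quarter: stay close to the upper texture
  : t * 1.5 - 0.25;				// middle half: quicker linear transition
half4 c = lerp(tex2D(_MainTex, uv), tex2D(_MainTex, uv2), t);	// blend the two band samples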

Some final points: the shader had terrible problems at the edges of the sub-textures until I changed the texture’s filter mode to Point (versus Bilinear or Trilinear). I suspect this may be a wrapping issue, but (as you may find) debugging these suckers is not easy.

One final comment — even though you’re essentially writing assembler for your GPU, shader programming is pretty forgiving — I haven’t crashed my laptop once (although Unity itself seems to periodically die during compiles).

Digital Archeology and Markup Options

I’m currently developing a tool for archivists which will allow them (or indeed anyone) to “publish” a repository of finding aids (XML documents containing metadata about collections of stuff, e.g. the papers of a 19th century lawyer) by, essentially, installing the tool, making some changes to config, and pressing “go”. Actually, you don’t need to press go. There are a bunch of similar tools around, but most of them have the dubious virtue of storing and maintaining valuable data inside a proprietary or arbitrary database, and being a lot more complicated to set up.

E.g. the typical workflow for one of these tools, we’ll call it “Fabio”, goes like this:

  1. Archivist creates a complicated XML file containing metadata about some collection of stuff.
  2. Archivist creates or uses an existing XSL file to transform this into data Fabio understands.
  3. Fabio loads the data, and there are almost certainly errors in it because (a) the XML had mistakes in it and/or (b) the XSL had errors in it.
  4. Archivist discovers some errors and fixes either (a) the XML (and reimports), (b) the XSL (and reimports), or (c) the data in Fabio. Probably (c), because reimporting data into Fabio is a pain, and the whole point of Fabio is that it’s supposed to be an “all in one” solution once you get your data munged into it.
  5. Later, more errors are discovered and fixed by the process listed in 4.

Now, the problem with all of this, as I see it, is that it’s completely nuts.

  • People who already know (to some extent) how to create and maintain XML and XSL files now need to learn how to use Fabio. Knowledge of Fabio is relatively useless knowledge (when you change jobs, the new place probably doesn’t use Fabio, but a completely different but equally stupid product).
  • Corrections may be made in either the raw data (XML), reusable scripts (XSL), or Fabio’s black box database (???). Later corrections can easily overwrite earlier corrections if, for example, one mistake is fixed inside Fabio and then another is fixed in the XML.
  • If Fabio stops being maintained or you just want to stop using it, all (or much) of your valuable data is in Fabio’s black box database. Even if you know how it’s structured you may lose stuff getting your data back out.
  • The XML repository needs separate version control anyway; otherwise, what happens if you make a change to your XSL to fix one import and then need to reimport another file that worked before but doesn’t work now?
  • Data may be lost during the import process and you won’t know.
  • Fabio needs to provide an API to expose the data to third parties. If it doesn’t expose a particular datum (e.g. because it was lost on import, or Fabio’s developers haven’t gotten around to it yet), you’re out of luck.
  • Fabio may have stupid flaws, e.g. provide unstable or ugly URLs — but that’s between you and your relevant committee, right?

My solution is intended to be as thin as possible and do as little as possible. In particular, my solution wraps a user interface around the raw data without changing the workflow you have for creating and maintaining that data. My solution is called metaview (for now) but I might end up naming it something that has an available .com or .net address (metaview.com is taken).

  • I don’t “import” the data. It stays where it is… in a folder somewhere. Just tell me where to find it. If you decide to stop using me tomorrow, your data will be there.
  • If you reorganize your data, the URLs remain unchanged (as long as you don’t rename files and files have unique names, for now).
  • Unless you’re maintaining the UI you don’t need to understand it.
  • You fix errors in your data by fixing errors in YOUR data. If you stop using me tomorrow, your fixes are in YOUR data.
  • I present the raw data directly to users if they want it (with the UI wrapped around it) so that there’s no need to wait for an API to access some specific datum, or worry about data loss during import processes.
  • Everything except base configuration (i.e. what’s my DB password and where’s the repository) is automatic.
  • I don’t try to be a “one stop solution”. No, I don’t provide tools for cleaning up your data; use the tools you already use, or something else. No, I don’t do half-assed version control; use a full-assed version control solution. Etc. I don’t even think I need to do a particularly good job of implementing search, since Google et al. do this already. I just need to make sure I expose everything (and I mean everything) to external search engines.

This has me thinking quite a bit about markup languages and templating systems. At first, I tried to decide what kind of template system to use. The problem, for me, is that for any given web development stack there seem to be a number of templating systems that is some multiple of the number of people using that stack raised to the power of the number of components in the stack. So if you’re looking at PHP (roughly two bazillion users) and MySQL (three bazillion) and Apache (four…), that’s a metric frackton of options, and knowledge of any one of those is way over on the bad side of the Knowledge Goodness Scale.

Aside: my unoriginal Knowledge Goodness Scale. This isn’t original, but I try to acquire “good” knowledge and try to avoid “bad”. The more times, contexts, and situations in which knowledge is accurate, useful, and applicable, the better it is. So knowledge about how to understand and evaluate information (such as basic logic, understanding of probability and statistics, understanding of human nature) is incredibly far over on the good end. Knowledge of how to build templates for a “content management system” that only works with PHP 5.2.7 with MySQL 5.x and Apache 2.x is closer to the bad end.

It follows that if you are going to force your users to learn something, try to make it good stuff, not bad stuff. So, let’s continue…

Archivists try to preserve knowledge and/or create new (preferably good) knowledge. They don’t produce index information or metadata about something because they want to have to do it again some day. The knowledge they’re dealing with is often very specific, but it can still be accurate and applicable across time, which is what makes it good. Developing an engine based on a bunch of technologies which themselves are unlikely to be useful across a span of time and contexts is not a good start. (Card indexes have lasted a long time. If your electricity goes out or your server crashes you can still use them today.)

So, my solution involves requiring users to change their lives as little as possible, learn as little as possible, and build on their existing good knowledge rather than acquire new bad knowledge. Instead of figuring out the foibles of Fabio, they can learn how to better create and maintain raw data.

So, what’s the approach?

  • XML is rendered using XSL — on the browser or the server. If you want the XML, click the XML link. (It looks the same on modern browsers — folks with primitive browsers will need server-side rendering.)
  • The templating system is XSL.
  • The database contains index information, but is built dynamically from the repository (as needed).

Of all the different markup languages around, XML is probably the best. It satisfies much of the original intent of HTML: to truly separate intention from presentation (order is still much too important in XML; it’s quite a struggle to reorder XML content via XSL in a nice, flexible way). It’s very widely used and supported. Many people (my target audience in particular) already know it. And it’s not limited to a particular development stack.

XSL is part of XML, so it’s easy for people who already use XML to grok, and again it’s not limited to a particular development stack.

There’s no escaping binding oneself to a development stack for interactivity, so metaview is built using the most common possible free(ish) technologies: MySQL, PHP, JavaScript, and Apache. Knowledge of these tools is probably close to the least bad knowledge to force on prospective developers/maintainers/contributors.

Less Bad Options

I do have some misgivings about two technical dependencies.

XML has many virtues, but it sucks to write. A lot of what I have to say applies just as much to things like blogs, message boards, and content management systems, but requiring users of your message board to learn XML and XSL is … well … nuts. XML and XSL for blog entries is serious overkill. If I were making a more general version of metaview (e.g. turning it into some kind of content management system, with online editing tools) I’d probably provide alternative markup options for content creators. Markdown has many virtues that are, in essence, the antithesis of XML’s virtues.

Using Markdown to Clean Up XML

Markdown — even more than HTML — is all about presentation, and only accidentally discloses intention (i.e. the fact you’re making something a heading might lead one to infer that it’s important, etc.). But unlike HTML (or XML) Markdown is easy and intuitive to write (anyone who has edited ASCII files is already 75% of the way there) and the marked up text looks good as is (one of its design features). There are a ton of “similar” markup languages, but they are all either poor clones of HTML (the worst being bbcode) or just horrible to look at (e.g. Wiki syntax). Markdown also lets you insert HTML making it (almost) supremely flexible should the need arise. So, if I wanted to create an alternative method for maintaining content, Markdown seems like a nice option.

Markdown also seems like a nice way of embedding formatted text inside XML without polluting the XML hierarchy… e.g. rather than allowing users to use some toy subset of HTML to do rudimentary formatting within nodes, which makes the DTD and all your XML files vastly more complex, you could simply have <whatever type="text/markdown"> and then have the XSL pass the text out as <pre class="markdown">, which will look fine, but can be made pretty on the client side by a tiny amount of JavaScript. In a sense, this lets you separate meaningful structure from structure that purely serves a presentational goal; in other words, it makes your XML cleaner, easier to specify, easier to write correctly, easier to read, and easier to parse.

My other problem is PHP. PHP is popular, free, powerful, and it even scales. It’s quite easy to learn, and it does almost anything I need. I’m tired of the “PHP sucks” bandwagon, but as I’ve been writing more and more code in it I am really starting to wonder “why do I hate it so much?” Well, I won’t go into that topic right now; others have covered this ground in some detail (e.g. I’m sorry but PHP sucks and What I don’t like about PHP). But there’s also the fact that it’s popular, free, powerful, and scales. Or to put it another way: PHP sucks, but it doesn’t matter. It would be nice to implement all this in Python, say, but then how many web servers are set up to serve Python-based sites easily? While it may be painful to deal with PHP configuration issues, the problem is fixable without needing root privileges on your server.

So while I think I’m stuck with PHP, I can at least (a) stick to as nice a subset of it as possible (which is hard — I’m already using two different XML libraries), and (b) write as little code as possible. Also I can architect the components to be as independent as possible so that, for example, the indexing component could be replaced with something else entirely without breaking the front end.

It’s Only Sociology

Tom Lehrer apparently performed some new songs (his lyrics set to old tunes) and videotapes have surfaced on YouTube. Here’s “Sociology”:

[embedded video: “Sociology”]

This second video features a trio of mildly funny songs about Mathematics (the first is an attempt to do for differentiation what Lehrer had already done for subtraction, the second will only make sense to those of us who have sat through one too many Analysis proofs, while the last is probably the most pointed) but unfortunately cuts off mid-sentence.

I used to watch Parkinson when I was in High School, but I must have missed this episode. If he was also interviewed, I’d love to see the interview. In any event, he performed “I got it from Agnes”.

[embedded video: “I got it from Agnes”]

In the unintentionally funny category, and proving once again that if you’re mentally challenged and you browse the web, you leave comments on YouTube, is the plaintive comment that “I still can’t figure out what it is they’ve got.”