GLSL-based image filters (FX)
over the last few years the built-in apple framework for image processing- CoreImage- has been getting slower and (particularly with 10.9) less stable for our purposes. CoreImage is heavily integrated into Quartz Composer, to the extent that these problems are evident in many QC compositions. These bugs may get ironed out- CI appears to be getting some level of active development, though we can't say the same for QC- but since we can't rely on apple to do this, we wanted to replace many (or most) of the apple-provided CoreImage filters with something that was faster for our purposes and more stable.
when we started looking at the filters that we wanted to replace, it became apparent that most of them could be put together with a simple fragment shader. for these purposes, FF/FFGL seemed like overkill- you don't need a compiler if you just want to shuffle some pixels around, and we wanted to make creating, editing, and learning from others' filters as quick and easy as possible. this is where GLSL-based FX came from.
since we just made this up, the only implementations are in vdmx (8.2.0.9 and up) and in a test app i built for testing filters. right now, everything described in this doc works (please let me know if it doesn't), but the spec should still be considered totally unreleased: i'm sharing this with a small number of people and seeking feedback, at which point changes (perhaps drastic changes) may be made. if there's any interest, i'd be glad to open-source a reference implementation as a framework so others can use this, once i've cleaned it up a bit and done substantially more testing to make sure the implementation's relatively bug-free.
on to the good stuff!
- GLSL-based FX have persistent buffers (storing an image with the effect), temporary buffers, and allow multiple rendering passes (you can render into a buffer in one pass, then read it back in another). GLSL-based FX also recognize seven different input types, which allow for interaction with the FX: events (click buttons), booleans (toggle buttons), long/int (pop-up buttons), float (slider), two-dimensional points, colors, and other images.
- GLSL-based FX require, at minimum, a fragment shader file. the only recognized file extensions are .fs or .frag. FX may have a vertex shader file too, but this is optional. if you have a vertex shader, it must be in the same directory as your frag shader and you must call the function "vv_vertShaderInit();" in its main function.
- the fragment shader you provide must start out with a comment delineated using /* to start and */ to end it. inside this comment must be a JSON dictionary. the contents of this JSON dictionary describe the image filter- its name, inputs, rendering passes, persistent buffers, etc. if this JSON dictionary can't be parsed (please use the test app), your filter can't be loaded.
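to make the file layout concrete, here's a minimal sketch of a frag shader file. it assumes two things this doc doesn't spell out: that the result is written to gl_FragColor as in ordinary legacy GLSL, and that the filter receives its image through an "image"-type input, which this sketch chooses to call "inputImage". check the "Test____.fs" samples for what your host actually expects.

```glsl
/*
{
    "DESCRIPTION": "passes the image through unchanged",
    "CATEGORIES": ["Examples"],
    "INPUTS": [
        {
            "NAME": "inputImage",
            "TYPE": "image"
        }
    ]
}
*/

// assumption: the result of the filter is written to gl_FragColor.
// "inputImage" is a name picked for this sketch, declared in INPUTS above.
void main() {
    // fetch this fragment's pixel from the input and output it untouched
    gl_FragColor = IMG_THIS_PIXEL(inputImage);
}
```

if you also want a vertex shader, the smallest one the text above allows might look like this. the doc only says it lives in the same directory and calls vv_vertShaderInit(); the sketch assumes that call takes care of the usual vertex setup.

```glsl
// hypothetical companion vertex shader, stored next to the frag shader
void main() {
    // required call; assumed here to handle the standard vertex setup
    vv_vertShaderInit();
}
```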
THE JSON DICT THAT DESCRIBES YOUR FILTER
- first of all, there are super-simple examples that cover all of this- check out the various "Test____.fs" sample filters. you will probably learn more, faster, from the examples than you'll get by reading this document.
- if there's a string in the dict stored at the "DESCRIPTION" key, this string will be displayed as a description associated with this filter in the host app.
- the "CATEGORIES" key should store an array of strings. the strings are the category names you want the filter to appear in (assuming the host app displays categories)
- the "INPUTS" key should store an array of dictionaries (each dictionary describes a different input- the inputs should appear in the host app in the order they're listed in this array). for each input dictionary:
    - the value stored with the key "NAME" must be a string. this is the name of the input, and is also the variable name of the input in your shader.
    - the value stored with the key "TYPE" must be a string. this string describes the type of the input, and must be one of the following: "event", "bool", "long", "float", "point2D", "color", or "image".
    - where appropriate, "DEFAULT", "MIN", "MAX", and "IDENTITY" may be used to further describe attributes of the input. note that "image"-type inputs don't have any of these, and that "color"-type inputs use an array of floats to describe colors. everywhere else values are stored as native JSON values where possible.
    - other notes:
        - "event" type inputs describe events that do not have an associated value- a momentary click button.
        - the "long" type input is used to implement pop-up buttons/pop-up menus in the host UI. as such, "long"-type input dictionaries have a few extra keys:
            - the "VALUES" key stores an array of integer values. this array may have repeats, and the values correspond to the labels. when you choose an item from the pop-up menu, the corresponding value from this array is sent to your shader.
            - the "LABELS" key stores an array of strings. this array may have repeats, and the strings/labels correspond to the array of values.
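as a sketch of how the input dictionaries described above might fit together, the file below declares one input of each type. the input names are made up, the four-float RGBA layout for the color "DEFAULT" is an assumption (the doc only says "an array of floats"), and writing to gl_FragColor is assumed as well. "DEFAULT" is left off the "bool", "long", and "point2D" inputs because this doc doesn't say how those defaults are written.

```glsl
/*
{
    "DESCRIPTION": "declares one input of each type",
    "CATEGORIES": ["Examples"],
    "INPUTS": [
        { "NAME": "inputImage", "TYPE": "image" },
        { "NAME": "flash", "TYPE": "event" },
        { "NAME": "invert", "TYPE": "bool" },
        {
            "NAME": "channel",
            "TYPE": "long",
            "VALUES": [0, 1, 2],
            "LABELS": ["Red", "Green", "Blue"]
        },
        {
            "NAME": "amount",
            "TYPE": "float",
            "MIN": 0.0,
            "MAX": 1.0,
            "DEFAULT": 0.5
        },
        { "NAME": "center", "TYPE": "point2D" },
        {
            "NAME": "tint",
            "TYPE": "color",
            "DEFAULT": [1.0, 0.5, 0.0, 1.0]
        }
    ]
}
*/

// assumption: output goes to gl_FragColor. all input names are hypothetical.
// "center" is declared but unused: this doc doesn't say whether point2D
// values arrive in pixel or normalized coords.
void main() {
    vec4 px = IMG_THIS_PIXEL(inputImage);
    // "invert" is a toggle button, so it reads as a plain bool
    if (invert) px.rgb = vec3(1.0) - px.rgb;
    // "channel" holds the entry from VALUES matching the chosen label
    float level = (channel == 0) ? px.r : ((channel == 1) ? px.g : px.b);
    // blend between the original pixel and a tinted version of one channel
    vec4 tinted = vec4(tint.rgb * level, px.a);
    vec4 result = mix(px, tinted, amount);
    // "flash" is a momentary click button: output white while it's set
    if (flash) result = vec4(1.0);
    gl_FragColor = result;
}
```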
- the "PERSISTENT_BUFFERS" key describes persistent buffers: these are buffers that will stay with your effect until it's deleted. this key is optional: you don't need to include it. the object at this key needs to be either an array or a dictionary:
    - if the item at "PERSISTENT_BUFFERS" is an array, each item in that array must be a string. this string is the name of the persistent buffer. this is the easiest/fastest way to add a persistent buffer. if you ask the filter to render a frame at a different resolution, persistent buffers are resized to accommodate.
    - if the item at "PERSISTENT_BUFFERS" is a dictionary, then each key-value pair in that dictionary describes a buffer (the key is the name of the buffer, the value is a dictionary with its properties).
        - if the buffer dictionary has a value for the keys "WIDTH" or "HEIGHT", that value is expected to be a string with an equation describing the width/height of the buffer. this equation may reference variables: the width and height of the image requested from this filter are passed to the equation as "$WIDTH" and "$HEIGHT", and the value of any other inputs declared in "INPUTS" can also be passed to this equation (for example, the value from the float input "blurAmount" would be represented in an equation as "$blurAmount"). this equation is evaluated once, when you initially pass the filter a frame. for more information (constants, built-in functions, etc) on math expression evaluations, please see the documentation for the excellent DDMathParser by Dave DeLong, which is what we're presently using.
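the two forms of "PERSISTENT_BUFFERS" might look like the header fragments below. these show only the opening comment (the shader body is left out), and the buffer and input names are made up. first the array form, which just names a buffer:

```glsl
/*
{
    "DESCRIPTION": "array form: one persistent buffer named lastFrame",
    "INPUTS": [
        { "NAME": "inputImage", "TYPE": "image" }
    ],
    "PERSISTENT_BUFFERS": ["lastFrame"]
}
*/
```

and the dictionary form, where each buffer can carry "WIDTH" and "HEIGHT" equations. the second buffer references a hypothetical float input called "scale" as "$scale":

```glsl
/*
{
    "DESCRIPTION": "dictionary form: persistent buffers with sizing equations",
    "INPUTS": [
        { "NAME": "inputImage", "TYPE": "image" },
        { "NAME": "scale", "TYPE": "float", "MIN": 0.1, "MAX": 1.0, "DEFAULT": 0.5 }
    ],
    "PERSISTENT_BUFFERS": {
        "quarterSize": {
            "WIDTH": "$WIDTH / 4.0",
            "HEIGHT": "$HEIGHT / 4.0"
        },
        "scaledCopy": {
            "WIDTH": "$WIDTH * $scale",
            "HEIGHT": "$HEIGHT * $scale"
        }
    }
}
*/
```

since the equations are evaluated once, when the filter is first passed a frame, moving the "scale" slider afterwards presumably won't resize "scaledCopy".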
- the "PASSES" key should store an array of dictionaries. each dictionary describes a different rendering pass. this key is optional: you don't need to include it, and if it's not present your effect will be assumed to be single-pass.
    - if one of the dicts in "PASSES" has a string stored at the key "TARGET", this string describes the name of the buffer this pass should render into. you can specify that a pass should render into a persistent buffer by putting its name in here. alternately, you could make up a different name and put it here- the implementation will automatically create a temporary buffer at this name, and you can still read the image back in a later pass. temporary buffers are deleted (or returned to a buffer pool) after every rendering pass.
    - if one of the dicts in "PASSES" has a string stored at the keys "WIDTH" or "HEIGHT", that string is an equation used to describe the width or height of the target buffer this pass is rendering into. for more info, see the description of the "WIDTH" and "HEIGHT" keys from "PERSISTENT_BUFFERS" (above).
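here's a sketch of a two-pass filter built from the "PASSES" keys described above. the first pass renders into a small temporary buffer (the made-up name "smallCopy"), and the second reads it back at full size. it assumes that a pass dictionary with no "TARGET" renders to the filter's final output, that output is written to gl_FragColor, and it uses the PASSINDEX variable and IMG_THIS_NORM_PIXEL() pseudo-function covered later in this doc.

```glsl
/*
{
    "DESCRIPTION": "two-pass sketch: shrink the image, then scale it back up",
    "CATEGORIES": ["Examples"],
    "INPUTS": [
        { "NAME": "inputImage", "TYPE": "image" }
    ],
    "PASSES": [
        {
            "TARGET": "smallCopy",
            "WIDTH": "$WIDTH / 8.0",
            "HEIGHT": "$HEIGHT / 8.0"
        },
        {
        }
    ]
}
*/

// assumptions: gl_FragColor is the output, and the pass with no "TARGET"
// is the one that produces the final image. "smallCopy" is a made-up name.
void main() {
    if (PASSINDEX == 0) {
        // first pass: render the input into the 1/8-size temporary buffer.
        // normalized coords are used so the size difference doesn't matter.
        gl_FragColor = IMG_THIS_NORM_PIXEL(inputImage);
    } else {
        // second pass: read the small buffer back at the full render size
        gl_FragColor = IMG_THIS_NORM_PIXEL(smallCopy);
    }
}
```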
THE REST OF YOUR SHADER
- outside of the opening comment, the code in your shader file will be modified slightly (stuff will be added at the beginning and your code will be find-and-replaced) and then run. the following describes the modifications that are made and the handful of "functions" you can call from GLSL. please refer to the included test app, which displays errors as well as the modified fragment and vertex shaders, for a more detailed view of exactly what's going on behind the scenes. if you have a problem or encounter a bug with the find-and-replacing (or any other parsing errors), please let me know.
- a variable is automatically declared for each of the INPUTS described in your JSON dict. the value of your UI items is automatically passed to this variable- if you want to get the value of an input in your shader you can simply refer to it by name. "event" and "bool"-type inputs are declared as "bool", "long" is "int", "float" is "float", "point2D" is "vec2", and "color" is "vec4".
- a variable is automatically declared for each of your "image"-type inputs.
    - the FX implementation knows ahead of time whether this "image" will be receiving a GL_TEXTURE_2D or a GL_TEXTURE_RECTANGLE_EXT, and will automatically declare the variable appropriately as either a "sampler2D" or "sampler2DRect". if the texture type changes in the future, the implementation generates a new version of the shader with the appropriate variable types: you don't have to concern yourself with the "type" of sampler at any point in the process, and can work with it transparently by referring to it with either normalized or pixel coords.
    - in addition to the variable for your sampler, several other variables are created for each image input. you'll probably never need to use these variables- they're used to communicate the size of the image, the area the image occupies in a GL texture, and the vertical flippedness of the image- but you can see them by looking at the source in the "frag shader" tab of the test app. "__imgRect" passes the texture coords of the image in the sampler as a vec4 (x-origin, y-origin, width, height) (if it's a GL_TEXTURE_2D these are normalized, if it's RECT they aren't), "__imgSize" stores the size in pixels of the image, and "__flip" is a bool describing whether or not the image is vertically flipped.
- variables are automatically declared for all the persistent buffers declared in your JSON dict. functionally, these are implemented almost identically to "image"-type inputs.
- variables are automatically declared for all the temporary buffers referred to in your array of PASSES dictionaries in your JSON dict. functionally, these are implemented almost identically to "image"-type inputs.
- the "PASSINDEX" variable is automatically declared as an int. this is 0 on the first rendering pass, and incremented by one on each subsequent rendering pass.
- the "RENDERSIZE" variable is declared as a vec2. this is set to the resolution of the current rendering pass, and may change from pass to pass (depending on whether or not the buffer resolutions specified in the JSON dict included any fancy equations).
- the "vv_fragCoord" variable is declared as a vec2. this is set to the approximate pixel location of the frag currently being evaluated.
- the "vv_fragNormCoord" variable is declared as a vec2. this is set to the approximate pixel location of the frag currently being evaluated in normalized coords.
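the automatically declared variables described above can be sketched in use like this: a vignette that reads a float input by name, measures distance with vv_fragNormCoord, and uses vv_fragCoord with RENDERSIZE to black out a thin border. the input names and the 4-pixel border are made up for the example, and writing to gl_FragColor is an assumption.

```glsl
/*
{
    "DESCRIPTION": "darkens the image toward the edges and adds a thin border",
    "CATEGORIES": ["Examples"],
    "INPUTS": [
        { "NAME": "inputImage", "TYPE": "image" },
        { "NAME": "strength", "TYPE": "float", "MIN": 0.0, "MAX": 1.0, "DEFAULT": 0.5 }
    ]
}
*/

// assumption: output goes to gl_FragColor. input names are hypothetical.
void main() {
    vec4 px = IMG_THIS_PIXEL(inputImage);

    // vv_fragNormCoord is normalized, so (0.5, 0.5) is the middle of the render
    float d = distance(vv_fragNormCoord, vec2(0.5));
    // "strength" is the float input, referred to directly by its NAME
    float falloff = 1.0 - strength * smoothstep(0.0, 0.7071, d);

    // vv_fragCoord is in pixels, so compare it against RENDERSIZE to find
    // the outer 4 pixels of the current rendering pass
    bool nearEdge = vv_fragCoord.x < 4.0 || vv_fragCoord.y < 4.0 ||
                    vv_fragCoord.x > RENDERSIZE.x - 4.0 ||
                    vv_fragCoord.y > RENDERSIZE.y - 4.0;
    if (nearEdge) falloff = 0.0;

    gl_FragColor = vec4(px.rgb * falloff, px.a);
}
```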
- pseudo-functions- fetching pixels from "image"-type inputs
    - this implementation provides four different pseudo-functions for fetching pixel values from samplers. behind the scenes, these pseudo-functions are find-and-replaced with calls to other functions (which have significantly longer and more complicated calls that depend on whether the samplers are 2D or RECT, whether the passed image was flipped or not, its location in the GL resource, etc)- you can see exactly how this works by using the test app to check out the compiled fragment shader for any of the provided sample fx.
    - vec4 IMG_PIXEL(inputName, vec2 pixelLocation); pass it the name of an "image"-type input and a pixel location, and it will return the color of the image at that location.
    - vec4 IMG_NORM_PIXEL(inputName, vec2 normalizedLocation); pass it the name of an "image"-type input and a normalized location (both horiz and vert tex coords range from 0.0 to 1.0, regardless of aspect ratio), and it will return the color of the image at that location.
    - vec4 IMG_THIS_PIXEL(inputName); pass it the name of an "image"-type input, and it will return the color of the pixel from that image at the location of the fragment being evaluated.
    - vec4 IMG_THIS_NORM_PIXEL(inputName); works similarly to IMG_THIS_PIXEL(), but fetches the pixel from the passed "image"-type input using normalized coords (rather than pixel-based coords).
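a sketch that exercises the pixel-fetching pseudo-functions listed above: IMG_PIXEL() with pixel offsets for a tiny horizontal blur, and IMG_NORM_PIXEL() for a mirror. the input names are made up, gl_FragColor as the output is an assumption, and the sketch also assumes the input image is the same size as the render so that vv_fragCoord lines up with image pixels. what a fetch returns outside the image bounds isn't covered in this doc.

```glsl
/*
{
    "DESCRIPTION": "horizontal 3-pixel blur, or a left-right mirror when toggled",
    "CATEGORIES": ["Examples"],
    "INPUTS": [
        { "NAME": "inputImage", "TYPE": "image" },
        { "NAME": "mirror", "TYPE": "bool" }
    ]
}
*/

// assumption: output goes to gl_FragColor. input names are hypothetical.
void main() {
    // locations are computed into variables first, which keeps the
    // pseudo-function calls simple for the find-and-replace step
    vec2 leftLoc = vv_fragCoord + vec2(-1.0, 0.0);
    vec2 rightLoc = vv_fragCoord + vec2(1.0, 0.0);

    // pixel-based fetches: this fragment plus its two horizontal neighbors
    vec4 here = IMG_THIS_PIXEL(inputImage);
    vec4 left = IMG_PIXEL(inputImage, leftLoc);
    vec4 right = IMG_PIXEL(inputImage, rightLoc);
    vec4 blurred = (left + here + right) / 3.0;

    // normalized fetch: flip the horizontal coord to mirror the image
    vec2 flippedLoc = vec2(1.0 - vv_fragNormCoord.x, vv_fragNormCoord.y);
    vec4 mirrored = IMG_NORM_PIXEL(inputImage, flippedLoc);

    gl_FragColor = mirror ? mirrored : blurred;
}
```

the normalized fetch is the safer choice whenever the image and the render might differ in size, since it doesn't depend on pixel dimensions.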