Categories
games publishing programming recruiting

Should freelancers in gamedev industry sign NDA’s?

I’m sure I’ve written about this before, but it’s come up again on Reddit, so here’s my thoughts…

IANAL, but … I have been offered more than 100 NDA’s, both as an individual and working in large companies (including a games publisher, and including a funding team).

I have signed fewer than 30 – and most of those were before I realised how “wrong” it is to sign an NDA.

These days, I only sign 1 in every 10 NDA’s sent my way – and yet I still end up working for, or doing business with, at least 50% of the people who initially “required” an NDA.

Often, people send me a bad, broken, poorly worded, typo-riddled NDA and I say “I’m not signing that. Fix it, get it down to 1.5 pages max, and I’ll reconsider” — and they usually don’t bother, they just tell me their “secrets” anyway. They didn’t really need or want the NDA!

Also: In most jurisdictions, an NDA can’t realistically be more than 2 pages, maybe 3 if it’s badly worded … if it’s more than 2 pages, you’re either in a niche industry (not games), or … it’s not an NDA, it’s a contract that’s pretending to be an NDA, perhaps for evil reasons. When someone sends me a 5-10 page NDA I say “send me a 1-2 page NDA. Don’t send me a secret employment contract”. AGain, they often simply carry on talking to me without NDA.

One large games company (100 staff) sent me (IIRC) a TWENTY TWO page NDA. Buried on page 17 was something like: “we own all converations with you, and at YOUR cost we can confiscate all hard-disks and storage belonging to you or your company if we have any suspicion you might be developing ideas similar to our own”.

(It specifically took ownership of any ideas similar – even if we’d never discussed them. It was disgusting.)

There are some exceptions to be aware of – but the company can easily explain + justify this to you. For instance, if the company is based on a patent, then in Europe they may be required to prove everyone is under NDA or risk invalidating the patent (not the same as USA, where you can talk about patents publically, IIRC).

TL;DR – many “NDA’s” are a scam. Refuse them without hesitation. Or question them, and discover the company didn’t really need it anyway. Most “decent” companies won’t care that you’re not under NDA; the obsession with NDA’s usually comes from the kind of dumb-ass employer that is a PITA to work for.

(to reiterate: IANAL. I’m happy giving advice, but legally you shouldn’t trust random strangers on the internet, and I take no responsibility for your actions ! :))

Categories
advocacy usability

Great use of UX: too-close signs on cars

5 years as an iOS developer has pushed me to understand and improve my product/physical/functional design skills. Especially “industrial” design – problems and solutions.

Here’s a great one I saw the other day: combining a Snellen Chart with Stopping Distances, i.e.:

Screen Shot 2014-04-26 at 15.24.14 + dg_070645

Existing (bad) solutions

The status quo on driving “too close for safety” is provocative, antagonistic. Have a look at a search for eBay: “too close bumper sticker”:

Screen Shot 2014-04-26 at 15.57.35

There’s a couple of major problems with this:

  1. It’s wrong. If you’re stationary, 6 feet behind the vehicle, you can read it – but you’re definitely NOT too close.
  2. It’s combative and offensive. It appeals to the anger and frustration of the driver in front – when we’re trying to modify the behaviour of the driver behind. Offending people who you’re trying to convince to change … is generally unsuccessful.

But combining the charts above, we get…

Net result

A plate you attach to your vehicle that informs the viewer how much too close/far they are right now, using a dynamic scale:

Screen Shot 2014-04-26 at 15.26.36

I couldn’t get a close enough photo to show it clearly, so here’s my redrawing of what the plate looked like:

too-close-am-vector

Categories
advocacy bitching programming Web 0.1 web 2.0

Why you shouldn’t use webfonts instead of images

Warning: principled rant against sloppy design and bad coding about to start; kids these days! Get off my lawn!

There’s a terrible disease affecting modern web developers – Twitter.com has just fallen ill with it, and it could be a long time before they cure themselves.

Deleting all images, and replacing them with the dreaded “web font”.

It’s the wrong solution for the problem, it doesn’t do what you think it does, and it pisses all over some of the Web’s core principles. If you’re a web developer, and you respect your craft, you shouldn’t even consider this, not for one moment. Here, for instance, is how Twitter currently looks for me – ugly, and very hard to use!

Screen Shot 2014-04-15 at 10.18.17

GitHub was another recent victim of this disease. They’re still in recovery, but they at least made their site “slightly usable” by adding tooltips:

Screen Shot 2014-04-15 at 10.30.18

What’s going on?

The problem

Core Web principle:

Information is everything; presentation is optional. It’s acceptable to forego presentation, so long as everyone can access the information.

Effect: When you load a webpage, your browser requests every image separately. This is a long way short of the “most optimal” code implementation.

Is this a big problem? For most of us … Not really. The system works, it’s flexible, it’s powerful – it’s a little inefficient, but for the corporations that care there are plenty of hacks and optimizations they can deploy.

The other problem

Most artists should be creating Vector images most of the time, but the software vendors who made the editing tools for Vectors … all died out around 15 years ago. Back then, the advantages of Vectors were small, because most screens were low res.

We still haven’t recovered. There are many standards for bitmap image files, and two very popular ones – PNG, JPG. There are many standards for vector image files, but no popular ones.

Effect: web developers end up looking to Web Fonts as a de facto “vector image” standard.

SVG, or not to SVG?

There is an official Web/HTML approved vector standard – SVG – in wide use, with strong support in all current browsers.

Does it work? Let’s see (http://caniuse.com/svg) …

Screen Shot 2014-04-15 at 10.45.46
Screen Shot 2014-04-15 at 10.45.33

… but many software companies ignore it. For instance, Apple allows programmers to use both PNG and JPG (why?) in core iOS, but not SVG (despite having a full SVG parser built-in to their web browser). Many programmers I speak to believe that SVG isn’t supported at all – FUD wins again.

The other, other problem

A core principle of the Web is that information is accurately described (the M in HTML). A recent trend in web development is “progressive enhancement”.

def. “Progressive Enhancement HTML”: a webpage written the way you were supposed to write it, instead of being hacked-together by unskilled monkeys

HTML has always been “progressive” – this was a core principle 20 years ago. But HTML was so easy to use and abuse that many of us (most of us? nearly all of us?) have been writing poor HTML most of that time. Shame on us (shame on me, certainly – been there, done that :( ).

But … the key point here is: Progressive Enhancement isn’t an “optional extra”, it is the Web. If you fight PE, you’re fighting the entire web infrastructure – and we know how that war will end.

…whatever. What about Web fonts?

So, when you remove an “image” and put a “web font letter” there instead, and change that letter so that the font contains an image you wanted …

…your HTML is now a lie.

Maybe you’re the kind of web developer who scorns blind people (and partially blind), who ignores the Internationalization features of software. You laugh in the face of Accessibility Standards, so you can reduce development time.

But HTML doesn’t make these things optional. They are so core to HTML that they are “always on”, even if you personally never use (need) them. One of the beautiful features of HTML is that if you do nothing, most of the Accessibility is automatically done for you.

With HTML, you have to go out of your way to prevent Accessibility. For instance: replacing images with magic-letters from a magic font.

You’re not blind; why does the Web Font fail?

The thing about custom Web Fonts is … the user can disable them.

Again, this is fundamental to the web. Partly for the Accessibility issue (who are you to decide which users require Accessibility? No. It’s for the user to decide).

But also to support the web principles of openness, and user-control (not corporation-control). My machine, my browser, my choice.

Just as you cannot prevent users from hitting the “View -> Zoom” menu option and making your web page take more or fewer pixels on their screen (I’ve worked with web designers – mostly ex-print designers – who HATED this, and felt it was a feature that should be banned) … you cannot force a crappy font on the user.

Information is everything; presentation is optional. It’s acceptable to forego presentation, so long as everyone can access the information.

In my case: I have a MacBook Air. While wonderful in many ways, they have tiny (11″), non-retina screens, and they’re laptops – so the screen is often further away than I’d like. When WebFonts came to CSS, a lot of the websites I use (art sites, design agencies) started using “beautiful but TINY” fonts that were unreadable. Game Studios still do this today, sadly – lots of hard-to-read but “edgy!” fonts and bad colour choices.

Sometimes, the only way I can do my day to day work is to disable the crappy 3rd party fonts.

Another solution?

SVG says “Hi!”. Think about it.

Categories
network programming programming

Debugging odd problems when writing game servers

Cas (@PuppyGames) (of PuppyGames fame – Titan Attacks, Revenge of the Titans, etc) is working on an awesome new MMO RTS … thing.

He had a weird networking problem, and was looking for suggestions on possible causes. I used to do a lot of MMO dev, so we ran through the “typical” problems. This isn’t exhaustive, but as a quick-n-dirty checklist, I figured it could help some other indie gamedevs doing multiplayer/network code.

Scenario

  1. Java, Windows 7
  2. DataInputStream.readFully() from a socket… a local socket at that… taking 4.5 seconds to read the first bytes
  3. and I’ve already read a few bits and bobs out of it

Initial thoughts

First thought: in networking, everything has a realm of durations. The speed of light means that a network connection from New York to London takes a few milliseconds (unavoidably) – everything is measured in ms, unless you’re on a local machine, in which case: the OS internally works in microseconds (almost too small to measure). Since Cas’s problem is measured in SECONDS – but is on localhost – it’s probably something external to the OS, external to the networking itself.

Gut feel would be something like Nagle’s algorithm, although I’m sure everyone’s already configured that appropriately ;).

That, or a routing table that’s changing dynamically – e.g. first packet is triggering a “let me start the dialup connection for you”, causing everything to pause (note that “4 seconds” is in the realm of time it takes for modems to connect; it’s outside the realm of ethernet delays, as already noted)

My general advice for “bizarre server-networking bugs”: start doing the things you know are “wrong” from a performance view, and prove that each makes it worse. If one does not, that de facto that one is somehow not configured as intended.

Common causes / things to check

1. Contention: waht’s the CPU + C libraries doing in background? If system has any significant (but small) load, you could be contended on I/O, CPU mgiht not be interrupting to shunt data from hardware up to software (kernel), up to software (OS user), up to JVM

Classic example: MySQL DB running on a system can block I/O even with small CPU load

2. file-level locking in OS FS. Check using lsof which files your JVM is accessing, and what other processes in the machine may have same files open.

(NB: there’s a bunch of tools like “lsof” which let you see which files are in use by which processes in a unix/linux system. As a network programmer, you should learn them all – they can save a lot of time in development. As a network programmer, you need to be a competent SysAdmin!)

Classic problem: something unexpected has an outstanding read (or a pre-emptive write lock) which is causing silliness. I remember several times having a text editor open, or etc, that was inadvertently slowing access to local files (doh).

3. can you snoop the traffic? (i.e. its socket, not pipe)

Classic problem: some unrelated service is bombarding the loopback with crap. eg. Windows networking (samba on linux) going nuts

Try running Wireshark, filtered to show only 127. and see what’s going through at same time

Also: this is a good way to check where the delay is happening. Is it at point of send, or point of receive? … both?

There’s “net traffic” and there’s “net traffic”. Wireshark often shows traffic that OS normally filters out from monitoring apps…

4. Check your routing table? AND: Check it’s not changing before / aftter the attempted read?

5. Try enabling Nagle, see if it has any effect?

My point is: use this as a check: it ought to make things worse. If not … perhaps the disabling wasn’t working?

6. Have you done ANY traffic shaping (or firewalling) on this machine at any time in the past?

Linux: in particular, check the iptables output. Might be an old iptables rule still stuck in there – or even a firewall rule.

Linux + Windows: disable all firewalls, completely.

7. Similarly: do you have any Anti-Virus software?

Disconnect your test machines from the internet, and remove all AV.

AV software can even corrupt your files when they mistakenly think that the binary files used by the compiler/linker are “suspicious” (IIRC, that happened to us with early PlayStation3 devkits).

8. On a related note: security / IPS tools installed? They will often insert artificial delays silently.

CHECK YOUR SYSTEM LOG FILES! …whatever is causing the delay is quite possibly reporting at least a “warning” of some kind.

9. (in this case, Cas’s socket is using SSL): Perhaps something is looking up a certificate remotely over the web?

…checked your certificate chain? (if there’s an unknown / suspicious cert in the chain, your OS might be trying to check / resolve it before it allows the connection)

10. (in this case, Cas is using custom SSL code in Java to “hack” it): Get a real SSL cert from somewhere, see if it behaves any different

(I think it would be very reasonable for your OS to detect any odd SSL stuff and delay it, as part of anti-malware / virus protection!)

11. “Is it cos I is custom?” – Setup an off-the-shelf webserver on the same machine and check if YOUR READING CODE can read from that SSL localhost fast?

(it often helps to get an answer to “is it the writer, the reader, both … is it the hacked SSL, or all SSL?” etc)

12. (getting specific now, as the general ideas didn’t help): if you bind to the local IP address instead of 127 ?

13. Howabout reverse-DNS? (again, 4 seconds is in the realm of a failed DNS lookup)

Might be a reverse DNS issue … e.g. something in OS (kernel or virus protection), or in JVM SSL library, that’s trying to reverse-lookup the 127 address in order to resolve some aspect of SSL.

I’ve had to put fake entries into hosts file in the past to speed-up this stuff, some libs just need a quick answer

Result

Turns out we were able to stop here…

Cas: “Well, whaddya know. It was a hosts lookup after all – because I’m using 127.0.0.2 and 127.0.0.3 it was doing a hostname lookup. I added those to hosts and all is solved

Me: “(dance)” (yeah. It was skype. The dance smiley comes in handy)

Categories
entity systems

Unconference on Entity Systems – Pre-Signup

Entity Systems are widely used in gamedev and starting to appear in mainstream IT / software development. But we want MORE developers and designers to benefit from this…

So, Richard Lord and I are going to do a mini conference on ES ideas/designs/uses/implementation/etc.

I’ll arrange venue, agenda (e.g. a Keynote) – but this will be an Unconference, so most/all sessions will be interactive, freeform, with no fixed Speakers.

To book a venue, I need an idea of numbers. So, if you’d like to come, please fill out this google form. Note: I’m planning to do this in Brighton, (UK) – less than 30 minutes from an International Airport (Gatwick).

https://docs.google.com/forms/d/156Sb9Qhx4RIX1d0BZg_22MVlUSoUb-gRqwrOq0zeT-w/viewform

Key info:

  • Date: Summer/Autumn 2014
  • Location: Brighton, UK
  • Cost: minimal (depends on demand + venue)
Categories
entity systems

Your best links and articles on Entity Systems / Component Systems

The Entity Systems wiki is pretty good – simple source-code examples in 5 different languages, and links to 10 x richer, more complex “frameworks”.

But it’s got little (almost nothing) in links to articles, introductions, tutorials on the topic. I know there’s a lot that’s been written – just look at all the trackbacks on my old ES blog posts.

Let’s fill out a page on the wiki with tutorials, techniques, etc. What are your favourites (and can you give a 1-sentence summary of what each one does well?)

Categories
advocacy games design games industry

How to Become a Game Designer (2014)

Step 1: Do not buy the book with this title.

Writing a book is tough, and I respect the time and effort that goes into it. But I don’t rate a book highly simply because it was hard to write: it has to fulfil it’s purpose, and help the reader. Some books are so poor they actively hinder the reader.

This one is being marketed to bloggers to promote it (for cash) without reading it. This (no review copies) might be a beginner’s mistake. I hope so; the alternative is horribly cynical.

For bonus points, the email was sent from a bouncing email address. If you want a job in the games industry, you probably don’t want to be advised by someone so careless/clueless they send out emails from broken email accounts.

That annoyed me enough to look through the marketing materials for this book, and it seems weak.

The author does not appear to have been a recruiter, nor a hiring manager, and apparently has never run a studio. If you read a book on this topic – make sure the author has spent years recruiting, hiring, and managing people in the industry. That’s the core expertise you want from the author – if they don’t have it, they need a really good excuse.

Further, the title seems deliberately misleading – linkbait for readers. It’s titled “become a game designer” but the content appears to be “get a job, somehow, doing something – anything you can scrape-by on – in the games industry”. Core topics are missing from those listed – e.g. how to advance your skills as a game-designer (key to getting hired!), etc.

…but I haven’t read the book, so I can only give vague suggestions and guesses at what it contains.

TL;DR

Personally I’d steer clear of all such books, and spend the time entering Game Jams instead. Every jam you enter will teach you more, and give you more experience that’s directly exciting to hiring managers, than reading books like this one.

I have classes of teenagers making their own games with little or no help; if they can do it, why can’t you?

Categories
entity systems games design programming

Data Structures for Entity Systems: Contiguous memory

This year I’m working on two different projects that need an Entity System (ES). One of them is a non-game app written natively on iOS + Android. The other is an FPS in Unity3D.

There are good, basic Open-Source ES’s out there today (and c.f. the sidebar there). I tried porting a few, but none of them were optimized for performance, and most of them were too tightly coupled to a single programming language or platform. I’ve started a new ES of my own – Aliqua.org – to fix these problems, and I’m already using it in an app that’s in alpha-testing.

I’ll be blogging experiences, challenges, and ideas as I go.

2016 Update: Support me on Patreon, writing about Entity Systems and sharing tech demos and code examples

Background: focus on ES theory, or ES practice?

If you’re new to ES’s, you should read my old blog posts (2007 onwards), or some of the source code + articles from the ES wiki.

My posts focussed on theory: I wanted to inspire developers, and get people using an ES effectively. I was fighting institutionalised mistakes – e.g. the over-use of OOP in ES development – and I wrote provocatively to try and shock people out of their habits.

But I avoided telling people “how” to implement their ES. At the extreme, I feared it would end up specifying a complete Game Engine:

…OK. Fine. 7 years later, ES’s are widely understood and used well. It’s time to look at the practice: how do you make a “good” ES?

NB: I’ve always assumed that well-resourced teams – e.g. AAA studios – need no help writing a good ES. That’s why I focussed on theory: once you grok it, implementation concerns are no different from writing any game-engine code. These posts are aimed at non-AAA teams: those who lack the money (or expertise) to make an optimized ES first time around.

For my new ES library, I’m starting with the basics: Data Structures, and how you store your ES data in memory.

Where you see something that can be done better – please comment!

Aside on Terminology: “Processors, née Systems”

ES “Systems” should be batch-processing algorithms: you give them an array/stream of homogeneous data, and they repeat one algorithm on each row/item. Calling them “Processors” instead of “Systems” reduces confusion.

Why care about Data Structures?

There is a tension at the heart of Entity Systems:

  • In an ES game, we design our code to be Functional: independent, data-oriented, highly efficient for streaming, batching, and multi-threaded execution. Individual Processors should be largely independent, and easy to split out onto different CPU cores.
  • With the “Entity” (ID) itself, we tie those Functional chunks together into big, messy, inter-dependent, cross-functional … well, pretty much: BLOBs. And we expect everything to Just Work.

If our code/data were purely independent, we’d have many options for writing high-performance code in easy ways.

If our data were purely chunked, fixed at compile-time, we’d have tools that could auto-generate great code.

But combining the two, and muddling it around at runtime, poses tricky problems. For instance:

  1. Debugging: we’ve gone from clean, separable code you can hold in your head … to amorphous chunks that keep swelling and contracting from frame-to-frame. Ugh.
  2. Performance: we pretend that ES’s are fast, cache-efficient, streamable … but at runtime they’re the opposite: re-assembled every frame from their constituent parts, scattered all over memory
  3. Determinism: BLOBs are infamously difficult to reason about. How big? What’s in them? What’s the access cost? … we probably don’t know.

With a little care, ES’s handle these challenges well. Today I’m focussing on performance. Let’s look at the core need here:

  • Each frame, we must:
    1. Iterate over all the Processors
    2. For each Processor:
      1. Establish what subset of Entity/Component blobs it needs (e.g. “everything that has both a Position and a Velocity”)
      2. Select that from the global Entity/Component pool
      3. Send the data to the CPU, along with the code for the Processor itself

The easiest way to implement selection is to use Maps (aka Associative Arrays, aka Dictionaries). Each Processor asks for “all Components that meet [some criteria]”, and you jump around in memory, looking them up and putting them into a List, which you hand to the Processor.

But Maps scatter their data randomly across RAM, by design. And the words “jump around in memory” should have every game-developer whimpering: performance will be bad, very bad.

NB: my original ES articles not only use Maps, but give complete source implementations using them. To recap: even in 2011, Android phones could run realtime 30 FPS games using this. It’s slow – but fast enough for simple games

Volume of data in an ES game

We need some figures as a reference point. There’s not enough detailed analysis of ES’s in particular, so a while back I wrote an analysis of Components needed to make a Bomberman clone.

…that’s effectively a high-end mobile game / mid-tier desktop game.

Reaching back to 2003, we also have the slides from Scott’s GDC talk on Dungeon Siege.

…that’s effectively a (slightly old) AAA desktop game.

From that, we can predict:

  • Number of Component-types: 50 for AA, 150 for AAA
  • Number of unique assemblages (sets of Component-types on an Entity): 1k for AA, 10k for AAA
  • Number of Entities at runtime: 20k for AA, 100k for AAA
  • Size of each Component in bytes: 64bits * 10-50 primitives = 100-500 bytes

How do OS’s process data, fast?

In a modern game the sheer volume of data slows a modern computer to a crawl – unless you co-operate with the OS and Hardware. This is true of all games. CPU and RAM both run at a multiple of the bus-speed – the read/write part is massively slow compared to the CPU’s execution speed.

OS’s reduce this problem by pre-emptively reading chunks of memory and caching them on-board the CPU (or near enough). If the CPU is processing M1, it probably wants M2 next. You transfer M2 … Mn in parallel, and if the CPU asks for them next, it doesn’t have to wait.

Similarly, RAM hardware reads whole rows of data at once, and can transfer it faster than if you asked for each individual byte.

Net effect: Contiguous memory is King

If you store your data contiguously in RAM, it’ll be fast onto the Bus, the CPU will pre-fetch it, and it’ll remain in cache long enough for the CPU(s) to use it with no extra delays.

NB: this is independent of the programming-language you’re using. In C/C++ you can directly control the data flow, and manually optimize CPU-caching – but whatever language you use, it’ll be compiled down to something similar. Careful selection and use of data-structures will improve CPU/cache performance in almost all languages

But this requires that your CPU reads and writes that data in increasing order: M1, M2, M3, …, M(n).

With Data Structures, we’ll prioritize meeting these targets:

  1. All data will be as contiguous in RAM as possible; it might not be tightly-packed, but it will always be “in order”
  2. All EntitySystem Processors will process their data – every frame (tick) – in the order it sits in RAM
    • NOTE: a huge advantage of ES’s (when used correctly) is that they don’t care what order you process your gameobjects. This simplifies our performance problems
  3. Keep the structures simple and easy to use/debug
  4. Type-safety, compile-time checks, and auto-complete FTW.

The problem in detail: What goes wrong?

When talking about ES’s we often say that they allow or support contiguous data-access. What’s the problem? Isn’t that what we want?

NB: I’ll focus on C as the reference language because it’s the closest to raw hardware. This makes it easier to describe what’s happening, and to understand the nuances. However, these techniques should also be possible directly in your language of choice. e.g. Java’s ByteBuffer, Objective-C’s built-in C, etc.

Usually you see examples like a simple “Renderer” Processor:

  • Reads all Position components
    • (Position: { float: x, float y })
  • Each tick, draws a 10px x 10px black square at the Position of each Component

We can store all Position components in a tightly-packed Array:

compressed-simple-array

This is the most efficient way a computer can store / process them – everything contiguous, no wasted space. It also gives us the smallest possible memory footprint, and lets the RAM + Bus + CPU perform at top speed. It probably runs as fast or faster than any other engine architecture.

But … in reality, that’s uncommon or rare.

The hard case: One Processor reads/writes multiple Component-types

To see why, think about how we’d update the Positions. Perhaps a simple “Movement” Processor:

  • Reads all Position components and all Velocity components
    • (Position: { float: x, float y })
    • (Velocity: { float: dx, float dy })
  • Each tick, scales Velocity.dx by frame-time, and adds it to Position.x (and repeats for .dy / .y)
  • Writes the results directly to the Position components

“Houston, we have a problem”

This is no longer possible with a single, purely homogeneous array. There are many ways we can go from here, but none of them are as trivial or efficient as the tight-packed array we had before.

Depending on our Data Structure, we may be able to make a semi-homogeneous array: one that alternates “Position, Velocity, Position, Velocity, …” – or even an array-of-structs, with a struct that wraps: “{ Position, Velocity }”.

…or maybe not. This is where most of our effort will go.

The third scenario: Cross-referencing

There’s one more case we need to consider. Some games (for instance) let you pick up items and store them in an inventory. ARGH!

…this gives us an association not between Components (which we could handle by putting them on the same Entity), but between Entities.

To act on this, one of our Processors will be iterating across contiguous memory and will suddenly (unpredictably) need to read/write the data for a different Entity (and probably a different ComponentType) elsewhere.

This is slow and problematic, but it only happens thousands of times per second … while the other cases happen millions of times (they have to read EVERYTHING, multiple times – once per Processor). We’ll optimize the main cases first, and I’ll leave this one for a later post.

Iterating towards a solution…

So … our common-but-difficult case is: Processors reading multiple Components in parallel. We need a good DS to handle this.

Iteration 1: a BigArray per ComponentType

The most obvious way forwards is to store the EntityID of each row into our Arrays, so that you can match rows from different Arrays.

If we have a lot of spare memory, instead of “tightly-packing” our data into Arrays, we can use the array-index itself as the EntityID. This works because our EntityID’s are defined as integers – the same as an array-index.

rect3859

Usage algorithm:

  • For iterating, we send the whole Array at once
  • When a Processor needs to access N Components, we send it N * big-arrays
  • For random access, we can directly jump to the memory location
    • The Memory location is: (base address of Array) + (Component-size * EntityID)
    • The base-address can easily be kept/cached with the CPU while iterating
    • Bonus: Random access isn’t especially random; with some work, we could optimize it further

Problem 1: Blows the cache

This approach works for our “simple” scenario (1 Component / Processor). It seems to work for our “complex” case (multiple Components / Processor) – but in practice it fails.

We iterate through the Position array, and at each line we now have enough info to fetch the related row from the Velocity array. If both arrays are small enough to fit inside the CPU’s L1 cache (or at least the L2), then we’ll be OK.

Each instance is 500 bytes
Each BigArray has 20k entries

Total: 10 MegaBytes per BigArray

This quickly overwhelms the caches (even an L3 Cache would struggle to hold a single BigArray, let alone multiple). What happens net depends a lot on both the algorithm (does it read both arrays on every row? every 10th row?), and the platform (how does the OS handle RAM reads when the CPU cache is overloaded?).

We can optimize this per-platform, but I’d prefer to avoid the situation.

Problem 2: memory usage

Our typeArray’s will need to be approimately 10 megabytes each:

For 1 Component type: 20,000 Entities * 50 variables * 8 bytes each = 8 MB

…and that’s not so bad. Smaller components will give smaller typeArrays, helping a bit. And with a maximum of 50 unique ComponentTypes, we’ve got an upper bound of 500 MB for our entire ES. On a modern desktop, that’s bearable.

But if we’re doing mobile (Apple devices in 2014 still ship with 512 MB RAM), we’re way too big. Or if we’re doing dynamic textures and/or geometry, we’ll lose a lot of RAM to them, and be in trouble even on desktop.

Problem 3: streaming cost

This is tied to RAM usage, but sometimes it presents a bottleneck before you run out of memory.

The data has to be streamed from RAM to the CPU. If the data is purely contiguous (for each component-type, it is!), this will be “fast”, but … 500 MB data / frame? DDR3 peaks around 10 Gigabytes / second, i.e.:

Peak frame rate: 20 FPS … divided by the number of Processors

1 FPS sound good? No? Oh.

Summary: works for small games

If you can reduce your entity count by a factor of 10 (or even better: 100), this approach works fine.

  • Memory usage was only slightly too big; a factor of 10 reduction and we’re fine
  • CPU caching algorithms are often “good enough” to handle this for small datasets

The current build of Aliqua is using this approach. Not because I like it, but because it’s extremely quick and easy to implement. You can get surprisingly far with this approach – MyEarth runs at 60 FPS on an iPad, with almost no detectable overhead from the ES.

Iteration 2: the Mega-Array of Doom

Even on a small game, we often want to burst up to 100,000+ Entities. There are many things we could do to reduce RAM usage, but our biggest problem is the de-contiguous data (multiple independent Arrays). We shot ourselves in the foot. If we can fix that, our code will scale better.

es-datastructures-structured-bigarray

In an ideal world, the CPU wants us to interleave the components for each Entity. i.e. all the Components for a single Entity are adjacent in memory. When a Processor needs to “read from the Velocity and write to the Position”, it has both of them immediately to hand.

Problem 1: Interleaving only works for one set at a time

If we interleave “all Position’s with all Velocity’s”, we can’t interleave either of them with anything else. The Velocity’s are probably being generated by a different Processor – e.g. a Physics Engine – from yet another ComponentType.

mega-array

So, ultimately, the mega-array only lets us optimize one Processor – all the rest will find their data scattered semi-randomly across the mega-array.

NB: this may be acceptable for your game; I’ve seen cases where one or two Processors accounted for most of the CPU time. The authors optimized the DS for one Processor (and/or had a duplicate copy for the other Processor), and got enough speed boost not to worry about the rest

Summary: didn’t really help?

The Mega Array is too big, and it’s too interconnected. In a lot of ways, our “lots of smaller arrays – one per ComponentType” was a closer fit. Our Processors are mostly independent of one another, so our ideal Data Structure will probably consist of multiple independent structures.

Perhaps there’s a halfway-house?

Iteration 3: Add internal structure to our MegaArray

When you use an Entity System in a real game, and start debugging, you notice something interesting. Most people start with an EntityID counter that increases by 1 each time a new Entity is created. A side-effect is that the layout of components on entities becomes a “map” of your source code, showing how it executed, and in what order.

e.g. With the Iteration-1 BigArrays, my Position’s array might look like this:

rect3859

  1. First entity was an on-screen “loading” message, that needed a position
  2. BLANK (next entity holds info to say if loading is finished yet, which never renders, so has no position)
  3. BLANK (next entity is the metadata for the texture I’m loading in background; again: no position)
  4. Fourth entity is a 3d object which I’ll apply the texture to. I create this once the texture has finished loading, so that I can remove the “loading” message and display the object instead
  5. …etc

If the EntityID’s were generated randomly, I couldn’t say which Component was which simply by looking at the Array like this. Most ES’s generate ID’s sequentially because it’s fast, it’s easy to debug (and because “lastID++;” is quick to type ;)). But do they need to? Nope.

If we generate ID’s intelligently, we could impose some structure on our MegaArray, and simplify the problems…

  1. Whenever a new Entity is created, the caller gives a “hint” of the Component Types that entity is likely to acquire at some time during this run of the app
  2. Each time a new unique hint is presented, the EntitySystem pre-reserves a block of EntityID’s for “this and all future entities using the same hint”
  3. If a range runs out, no problem: we add a new range to the end of the MegaArray, with the same spec, and duplicate the range in the mini-table.
  4. Per frame, per Processor: we send a set of ranges within the MegaArray that are needed. The gaps will slow-down the RAM-to-CPU transfer a little – but not much

es-datastructures-structured-megaarray

Problem 1: Heterogeneity

Problem 1 from the MegaArray approach has been improved, but not completely solved.

When a new Entity is created that intends to have Position, Velocity, and Physics … do we include it as “Pos, Vel”, “Pos, Phys” … or create a new template, and append it at end of our MegaArray?

If we include it as a new template, and insist that templates are authoritative (i.e. the range for “Pos, Vel” templates only includes Entities with those Components, and no others) … we’ll rapidly fragment our mini-table. Every time an Entity gains or loses a Component, it will cause a split in the mini-table range.

Alternatively, if we define templates as indicative (i.e. the range for “Pos, Vel” contains things that are usually, but not always Pos + Vel combos), we’ll need some additional info to remember precisely which entities in that range really do have Pos + Vel.

Problem 2: Heterogeneity and Fragmentation from gaining/losing Components

When an Entity loses a Component, or gains one, it will mess-up our mini-table of ranges. The approach suggested above will work … the mini-table will tend to get more and more fragmented over time. Eventually every range is only one item long. At that point, we’ll be wasting a lot of bus-time and CPU-cache simply tracking which Entity is where.

NB: As far as I remember, it’s impossible to escape Fragmentation when working with dynamic data-structures – it’s a fundamental side effect of mutable data. So long as our fragmentating problems are “small” I’ll be happy.

Problem 3: Heterogeneity and Finding the Components within the Array

If we know that “Entity 4” starts at byte-offset “2048”, and might have a Position and Velocity, that’s great.

But where do we find the Position? And the Velocity?

They’re at “some” offset from 2048 … but unless we know all the Components stored for Entity 4 … and what order they were appended / replaced … we have no idea which. Raw array-data is typeless by nature…

Iteration 4: More explicit structure; more indexing tables

We add a table holding “where does each Entity start”, and tables for each Component stating “offset for that Component within each Entity”. Conveniently, this also gives us a small, efficient index of “which Entities have Component (whatever)”:

es-datastructures-structured-megaarray-by-component

Problem 1: non-contiguous data!

To iterate over our gameobjects, we now need:

  • One big mega-array (contiguous)
  • N x mini arrays (probably scattered around memory)

Back to square one? Not quite – the mini-arrays are tiny. If we assume a limit of 128,000 entities, and at most 8kb of data for all Components on an Entity, our tables will be:

[ID: 17bits][Offset: 13 bits] = 30 bits per Component

…so that each mini-array is 1-40 kB in size. That’s small enough that several could fit in the cache at once.

Good enough? Maybe…

At this point, our iterations are quite good, but we’re seeing some recurring problems:

  • Re-allocation of arrays when Components are added/removed (I’ve not covered this above – if you’re not familiar with the problem, google “C dynamic array”)
  • Fragmentation (affects every iteration after Iteration 1, which doesn’t get any worse simple because it’s already as bad as it could be)
  • Cross-referencing (which I skipped)

I’ve also omitted history-tracking – none of the DS’s above facilitate snapshots or deltas of game state. This doesn’t matter for e.g. rendering – but for e.g. network code it becomes extremely important.

There’s also an elephant in the room: multi-threaded access to the ES. Some ES’s, and ES-related engines (*cough*Unity*cough*), simply give-up on MT. But the basis of an ES – independent, stateless, batch-oriented programming – is perfect for multi threading. So there must be a good way of getting there…

…which gives me a whole bunch of things to look at in future posts :).

PS … thanks to:

Writing these things takes ages. So much to say, so hard to keep it concise. I inflicted early drafts of this on a lot of people, and I wanted to say “thanks, guys” :). In no particular order (and sorry in advance if final version cut bits you thought should be in there, or vice versa): TCE’ers (especially Dog, Simon Cooke, doihaveto, archangelmorph, Hypercube, et al), ADB’ers (Amir Ebrahimi, Yggy King, Joseph Simons, Alex Darby, Thomas Young, etc). Final edit – and any stupid mistakes – are mine, but those people helped a lot with improving, simplifying, and explaining what I was trying to say.

Categories
iphone programming

OpenGL ES2 – Shader Uniforms

There’s a famous animated GIF of an infinitely swirling snake (here’s one that’s been Harry Potterised with the Slytherin logo):

Impressive, right?

What if I said it only relies upon one variable, and that you can reproduce this yourself in 3D in mere minutes? It’ll take quite a lot longer to read and understand, but once you’ve grokked it, you’ll be able to do this easily.

Background reading

Make sure you understand:

  1. Draw Calls
  2. VAOs + VBOs
  3. and at least one kind of Texturing:

Uniforms vs. Vertex Attributes

In theory, you don’t need Uniforms: they are a special kind of Vertex-Attribute (which gives them their name: they are a “uniform attribute”).

In practice, as we’ve already seen, bitmap-based texture-mapping in GL ES requires Uniforms to make a link between Texture and Shader.

The code for sending them to the GPU is different from sending Vertex Attributes – annoyingly, it’s more complicated – but the concept is identical. Then why do we have them?

Performance.

And, as a bonus: convenience.

How many items per Spaceship?

A typical 3D model of a spaceship has:

  • 5,000 polygons
  • 10,000 vertices
  • 5 bitmap tetures

OK, fine, so …

Performance basics

Your vertices each specify which texture(s) they’re using. If you want to change the textures, from “standard” ones to “damaged” ones, you’ll have to:

  1. Upload the new texture (one-time cost; once it’s on GPU you can fast-switch between textures)
  2. Re-Upload the “uses texture 1 (out of the 5)” vertex-attribute … once for every vertex (repeated cost: has to be done every time)

Uniforms bypass this by saying:

A GL Uniform is a Vertex Attribute that has the same value for EVERY vertex. It only needs to be uploaded ONCE and is immediately applied to every vertex

But there’s more …

Each vertex can hold up to 2kb of data (in OpenGL ES; more on desktop GL) – making our ship take 10 megabytes of GPU RAM. But that’s small by today’s standards – and as the model gets more complicated, the storage needed increases.

By contrast, the number of Uniforms needed for a model is typically constant.

The net effect is that GPU vendors can afford to use faster RAM for their Uniforms, boosting performance even further.

Convenience basics

Revisiting that spaceship, if we’re using it in a game, there’s a lot more things we’ll want to include in the 3D model:

  • 10 different versions, one for each player. They’re all the same, but some elements change colour to match the Player’s colour
  • Some of the textures animate: e.g. Landing strips, with lights that strobe
  • Gun turrets need to rotate hundreds of vertices at once, without affecting their neighbours
  • … and so on.

Each of these becomes trivial when your DrawCall has global-constants – i.e Uniforms. For instance:

  • 10 different versions…:
    • The vertices that might change colour have a vertex attribute signalling this; at render-time, the shader sees this flag on the verte, and reads from a Uniform what the “actual” colour should be. Change the uniform, and the colour changes everywhere on the model at once
  • Textures animate…:
    • When you read a U,V value from the texture-bitmap, add a number to U and/or V that comes from a Uniform.
    • Mark the teture as “GL_REPEAT”, so that GL treats it like an infinitely tiled teture
    • Increase that uniform by a tiny amount each time you render a frame (e.g. 0.001), and the texture appears to “scroll”
  • Gun turrets rotate…:
    • Use a second DrawCall to draw the turrets.
    • Each turret has a Uniform “rotation angle in degrees”
    • When rendering, your Shader pre-rotates ALL vertices in each turret by the Uniform’s value.
    • Per frame, change the Uniform for “rotation angle”, and the whole turret rotates at once
  • …etc…

Implementing and Using Uniforms in OpenGL

Render / Update cycle for Uniforms

With Vertex-Attributes, it was easy:

  1. CPU: Generate geometry (load from a .3ds file; algorithm for making a Cube; etc)
  2. GPU: Create 1 or more VBO’s to hold the data on GPU
  3. CPU->GPU: Upload the geometry from CPU to GPU in a single big chunk
  4. Every frame: GPU reads the data from local RAM, and renders it
  5. To change the data, re-do all the above

With Uniforms, it’s more tricky.

Firstly – like everything else in Shaders – Uniforms ignore any OpenGL features that already existed. The GPU intelligently selected the correct VBOs each frame, by using the data inside the VAO. But Shaders ignore the VAO, and need to be manually switched over.

Secondly – in VBO’s, OpenGL does not care what format your data is in. But with Uniforms, suddenly it does care: you have to specify, every time you upload them.

Thirdly – GL uses an efficient C-language mechanism for uploading Uniforms with minimal overhead. With VBO’s, the VAO took care of this automatically, but again: Shaders need you to do it by hand.

Together, these complicate the process:

  1. CPU: generate a value for the Uniform
  2. CPU: create an area in RAM that will hold the value, and place it there
  3. GPU: automatically creates storage for the Uniform when you compile/link the ShaderProgram
  4. CPU->GPU: switch to the specific ShaderProgram that will use the Uniform
  5. CPU->GPU: don’t send the data; instead, send the “memory-address” of the data
  6. CPU->GPU: upload using one of thirty three unique methods (instead of the one for Verte Attributes)
  7. Every frame: GPU reads the data from local RAM, but each ShaderProgram has its own copy
  8. To change the data, re-do all the above

Uploading a value to a Uniform

After you’ve linked your ShaderProgram, you can ask OpenGL about the Uniforms it found. For each Uniform, you get:

  1. The human-readable name used in the GLSL file
  2. The OpenGL-readable name generated automatically (an integer: GLint)
  3. The GLType (int, bool, float, vec2, vec3, vec4, mat2, mat3, mat4, etc)
  4. Is this one value, or an array of values? If it’s an array: how many slots in the array? (all GLSL arrays are fied length)

The GLType has to be saved, because when you want to upload, there’s a different upload method for each distinct type:

glUniform1i – sends 1 * integer

glUniform3f – sends 3 * floats

glUniformMatri4fv – sends N * 4×4-matrices, each using floats internally

etc

To handle this automatically, I wrote three chunks of code:

  1. GLK2Uniform.h/m:: stores the GLType, is it a Matrix or Vector (or float?), etc
  2. GLK2ShaderProgram.h/m .. -(NSMutableDictionary*) fetchAllUniformsAfterLinking: parses the data from the Shader, and creates GLK2Uniform instances
  3. GLK2ShaderProgram.h/m .. -(void) setValue:(const void*) value forUniform:(GLK2Uniform*) uniform: uses the GLType data etc to pick the appropriate GL method to upload this value to the specified Uniform

That last method takes “const void*” as argument: i.e. it has no type-checking. I find this much simpler than continually specifying the type. It also intelligently handles dereferencing the pointer (for matrices and vectors) or not (for ints, floats, etc).

Uniforms and VAOs: a missing feature from OpenGL

So far, we’ve used VAO’s. They’re very useful, seemingly they:

…store all the render-state that is specific to a particular Draw call

Tragically: Shaders ignore VAO’s. Once you start using Uniforms, you find that VAO’s actually:

…store all the render-state that is specific to a particular Draw call, so long as that state isn’t in a ShaderProgram (i.e. isn’t a Uniform)

Shaders are the only place where VAO’s aren’t used, and it’s very easy to forget this and have your code break in weird and wonderful ways. If you find Shader state seems to be leaking between DrawCalls, you almost certainly forgot to eplicitly switch ShaderProgram somewhere.

Note: this applies not only for rendering a DrawCall, but also for setting the value of the Uniform. You must call glUseProgram() before setting a Uniform value.

Because we often want to set Uniform values outside of the draw loop – e.g. when configuring a Shader at startup – I added a method that automatically switches to the right program for you. If you use this repeatedly on every frame, it’ll damage performance, but it’s great for checking if you’ve forgotten a glUseProgram somewhere:

GLK2ShaderProgram.m:
[objc]
-(void) setValueOutsideRenderLoopRestoringProgramAfterwards:(const void*) value forUniform:(GLK2Uniform*) uniform
{
GLint currentProgram;
glGetIntegerv( GL_CURRENT_PROGRAM, &currentProgram);

glUseProgram(self.glName);
[self setValue:value forUniform:uniform];
glUseProgram(currentProgram);
}
[/objc]

Uniforms in your Game Engine

Game-Engine code has to treat Uniforms a little specially:

  • Unlike Vertex-Attributes, we tend to update Uniforms very frequently – often every frame.
  • We have to reference them by human-readable name (instead of simply ramming them into a homogeneous C-array).
  • We have to remember to keep calling glUseProgram() each time we write to a Uniform, or render a DrawCall.

You can layer it in fancy OOP wrappers, but ultimately you’re forced to have a Hashtable/Map somewhere that goes from “human-readable Uniform name” to “chunk of memory holding the current value, that can be sent to the GPU whenever it changes”.

Desktop GL is different; they modified the GLSL / Shader spec so that it allowed for slightly tighter integration of variables with your main app. Sadly, they didn’t include those features in GL ES

I’ve tried it a few different ways, but the problem is that you have to store a “string” mapping to a “C-struct”. Worse, OpenGL ignores the value of the struct, it only uses the memory-address. So that struct has to be at a stable location in RAM.

This might not seem a problem, but Apple’s system for storing structs in NSDictionary is to create and destroy them on-the-fly (on the stack) – so there’s never a stable memory-address.

Here’s my current best workaround for ObjectiveC OpenGL apps…

An intelligent, C-based, “Map” class

Observations:

  1. All our data will be structs
    1. C can easily store data if it’s homogeneous
    2. and OpenGL only has circa 10 unique structs for Uniforms
    3. …so: 10 arrays will be enough to store “all possible” Uniform values for a given ShaderProgram
  2. Data is unique per ShaderProgram
    1. C arrays-of-structs can’t change size once created :(
    2. But: the Uniforms for a ShaderProgram are hard-coded, cannot change at runtime
    3. …so: we can create one Map per ShaderProgram, and we know it will always be correct
  3. C-strings are horrible, and we want to avoid them like the plague
    1. We can easily convert C-strings into Objective-C strings (NSString)
    2. Apple’s NSArray stores NSString’s, and returns an int when you ask “which slot contains NSString* blah?”
    3. C allows int’s for direct-fetching of locations in an array-of-structs
    4. …so: we can have a Data Structure of NSString’s, and a separate C-array of structs, and they never have to interact

GLK2UniformMap.h:
[objc]
@interface GLK2UniformMap : NSObject

+(GLK2UniformMap*) uniformMapForLinkedShaderProgram:(GLK2ShaderProgram*) shaderProgram;

– (id)initWithUniforms:(NSArray*) allUniforms;

-(GLKMatrix2*) pointerToMatrix2Named:(NSString*) name;
-(GLKMatrix3*) pointerToMatrix3Named:(NSString*) name;
-(GLKMatrix4*) pointerToMatrix4Named:(NSString*) name;
-(void) setMatrix2:(GLKMatrix2) value named:(NSString*) name;
-(void) setMatrix3:(GLKMatrix3) value named:(NSString*) name;
-(void) setMatrix4:(GLKMatrix4) value named:(NSString*) name;

-(GLKVector2*) pointerToVector2Named:(NSString*) name;
-(GLKVector3*) pointerToVector3Named:(NSString*) name;
-(GLKVector4*) pointerToVector4Named:(NSString*) name;
-(void) setVector2:(GLKVector2) value named:(NSString*) name;
-(void) setVector3:(GLKVector3) value named:(NSString*) name;
-(void) setVector4:(GLKVector4) value named:(NSString*) name;

@end
[/objc]

You create a GLK2UniformMap from a specific GLK2ShaderProgram. It reads the ShaderProgram, finds out how many Uniforms of each GLType there are, and allocates C-arrays for each of them.

Later, you can use the “setBLAH:named:” methods to set-by-value any struct. Importantly, this does NOT take a pointer! This ensures you can create a struct on the fly – all of Apple’s GLKit methods do this. e.g. you can do:

[objc]

GLK2UniformMap* mapOfUniforms = …

[mapOfUniforms setVector3: GLKVector3Make( 0.0, 1.0, 0.0 ) named:@"position"];

[/objc]

Connecting the GLK2UniformMap to a GLK2DrawCall

In previous posts, I created the GLK2UniformValueGenerator protocol. This is a simple protocol that uses the same method signatures as used by OpenGL’s Uniform-upload commands.

We etend GLK2UniformMap, and implement that protocol, to create something we can attach to a GLK2DrawCall, and have our rendering do everything else automatically:

GLK2UniformMapGenerator.h:
[objc]
@interface GLK2UniformMapGenerator : GLK2UniformMap <GLK2UniformValueGenerator>

+(GLK2UniformMapGenerator*) generatorForShaderProgram:(GLK2ShaderProgram*) shaderProgram;
+(GLK2UniformMapGenerator *)createAndAddToDrawCall:(GLK2DrawCall *)drawcall;

@end
[/objc]

Internally, the methods are very simple, e.g.:

GLK2UniformMapGenerator.m:
[objc]
@implementation GLK2UniformMapGenerator

-(GLKMatrix2*) matrix2ForUniform:(GLK2Uniform*) v inDrawCall:(GLK2DrawCall*) drawCall
{
return [self pointerToMatrix2Named:v.nameInSourceFile];
}

[/objc]

NB: in the protocol, I included the GLK2DrawCall that’s making the request. This is unnecessary. In future updates to the source, I’ll probably remove that argument.

Animated textures: the magic of Uniforms

Finally, let’s do something interesting: animate a texture-mapped object.

The sample code has jumped ahead a bit on GitHub, as I’ve been using it to demo things to a couple of different people.

Have a look around the project, but I’ve split into two projects. One contains the reusable library code, the other contains a Demo app that shows the library-code in use.

I simplified all the reusable render code to date into a Library class: GLK2DrawCallViewController (etends Apple’s GLKViewController)

I’ve also moved the boilerplate “create a triangle”, “create a cube” etc code into a Demo class: CommonGLEngineCode

The sample project – permanent link to branch for this article – has a simple ViewController that loads a snake image and puts it on a triangle:

AnimatedTextureViewController.m:
[objc]
@interface AnimatedTextureViewController ()
@property(nonatomic,retain) GLK2UniformMapGenerator* generator;
@end

@implementation AnimatedTextureViewController

-(NSMutableArray*) createAllDrawCalls
{
/** All the local setup for the ViewController */
NSMutableArray* result = [NSMutableArray array];

/** — Draw Call 1:

triangle that contains a CALayer texture
*/

GLK2DrawCall* dcTri = [CommonGLEngineCode drawCallWithUnitTriangleAtOriginUsingShaders:
[GLK2ShaderProgram shaderProgramFromVertexFilename:@"VertexProjectedWithTexture" fragmentFilename:@"FragmentTextureScrolling"]];

[/objc]

That’s using the refactored CommonGLEngineCode class to make a unit triangle appear roughly in the middle of the screen.

Then we setup the UniformMapGenerator (no values yet):

[objc]

self.generator = [GLK2UniformMapGenerator createAndAddToDrawCall:dcTri];

[/objc]

NB: the generator class automatically detects requests for Sampler2D, and ignores them. Those are only used for texture-mapping, which we handle automatically inside the GLK2DrawCall class (see previous post for details).

[objc]

/** Load a scales texture – I Googled "Public Domain Scales", you can probably find much better */
GLK2Texture* newTexture = [GLK2Texture textureNamed:@"fakesnake.png"];
/** Make the texture infinitely tiled */
glBindTexture( GL_TEXTURE_2D, newTexture.glName);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);

/** Add the GL texture to our Draw call / shader so it uses it */
GLK2Uniform* samplerTexture1 = [dcTri.shaderProgram uniformNamed:@"s_texture1"];
[dcTri setTexture:newTexture forSampler:samplerTexture1];

[/objc]

…again: c.f. previous post for details of what’s happening here, nothing’s changed.

[objc]

/** Set the projection matrix to Identity (i.e. "dont change anything") */
GLK2Uniform* uniProjectionMatrix = [dcTri.shaderProgram uniformNamed:@"projectionMatrix"];
GLKMatrix4 rotatingProjectionMatrix = GLKMatrix4Identity;
[dcTri.shaderProgram setValueOutsideRenderLoopRestoringProgramAfterwards:&rotatingProjectionMatrix forUniform:uniProjectionMatrix];

[result addObject:dcTri];

return result;
}
[/objc]

…ditto.

Finally, we now have to implement a callback to update our Generator’s built-in structs and ints and floats once per frame:

[objc]
-(void)willRenderDrawCallUsingVAOShaderProgramAndDefaultUniforms:(GLK2DrawCall *)drawCall
{
/** Generate a smoothly increasing value using GLKit’s built-in frame-count and frame-timers */
double framesOutOfFramesPerSecond = (self.framesDisplayed % (4*self.framesPerSecond)) / (double)(4.0*self.framesPerSecond);

[self.generator setFloat: framesOutOfFramesPerSecond named:@"timeInSeconds"];
}
[/objc]

Run the project, tap the button, and you should see snakey skin scrolling along the surface of a 3D triangle:

Screen Shot 2014-02-20 at 01.51.20

From scales-on-a-triangle to realistic snake

The scrolling works by moving our offset across the surface of the triangle. Doing this with a Uniform means that the speed is constant relative to the corners of the triangle.

i.e. if you make the triangle smaller, it will take the same time to cover the distance, but it’s covering a shorter distance, so appears to move slower.

We’re getting the effect – for free! – of skin bunching up and stretching out. All you have to do is make your triangles shorter on the inside of a snake-coil, and longer on the outside.

If you model your snake the easiest possible way, this bunching will happen automatically. Simply take a cylinder and bend it with a transform – the vertex attributes (that force the texture to map across each triangle) won’t change, but the triangle sizes will, causing realistic bunching/stretching of the skin.

Categories
entity systems iphone programming

Cross-platform Entity System design

I’ve started a new open-source project to make an ES designed specifically to be ported to different platforms and work the same way, with the same class/method names, on every platform:

Aliqua.org

Here’s some thoughts on “why” I’m doing this, and “how” I’m making it work…

Categories
iphone programming

GLKit Extended: Refactoring the View Controller

If you’ve been following my tutorials on OpenGL ES 2 for iOS, by the time you finish Texturing (Part 6 or so) you’ll have a lot of code crammed into a single UIViewController. This is intentional: the tutorials are largely self-contained, and only create classes and objects where OpenGL itself uses class-like-things.

…but most of the code is re-used from tutorial to tutorial, and it’s getting in the way. It’s time for some refactoring.

Categories
entity systems programming

Artemis Entity System in ObjectiveC

I wanted to try the latest version of Artemis, and I had an old game project that was quickly written in OOP style. So I went looking for an ObjC port…

Existing port: outdated

There was a port linked on the Artemis site that was OK – but had no documentation or updates, and Artemis has moved on since then.

There were also no unit tests etc – and the current Artemis getting-started wouldn’t work with this port (because so much has changed). So I started a new port…

New port: ObjC Artemis 100%

Primary aim:

  • make it identical to the core Artemis.
Categories
computer games dev-process devdiary entity systems games design

2014 Entity Systems: what are your Unity3D questions and problems?

In 2014, I’ll be making a new game in Unity that makes intensive use of an Entity System.

This will give me lots of ammo for a new post exploring the pros and cons of Unity’s “partial” Entity System architecture. I’ve been thinking about this a lot for the last couple of years, but I’ve been unhappy with the draft posts, and didn’t publish them.

Over the next couple of months, I’d love to hear from all of you the challenges, confusions, problems, and questions you have about this. I’m not promising quick answers – but it will help shape the blog posts I write, and as soon as I have a good enough set of answers, I’ll start posting them :).

Categories
community dev-process programming project management

Reporting issues in Open Source software

I maintain a medium-sized open-source project. I welcome bugs and ideas via GitHub’s excellent “Issues” tab.

Some Issues are well-written, others are not. Strangely: Issues get resolved fastest are often the shortest – as little as 2 or 3 sentences. Doing it “right” is easier than I expected…

Categories
iphone programming

OpenGL ES 2 – Textures 2 of 3: Texture Mapping

Recap: Texturing in OpenGL can be achieved in two separate ways, using different API’s and hardware features. The preferred, modern approach – procedural texturing – uses nothing more than a Fragment shader (this is really what Fragment shaders were created for: Texturing).

Part 5 (Textures: 1 of 3) covered that in detail – but there’s another option: Texture-Mapping (also known as “UV Mapping”, it’s identical).

Categories
GamesThatTeach

GamesThatTeach: Conway’s Life, gamified…

Quick and simple conversion of Life into an actual game (instead of a mathematical “game”):

https://thomashunter.name/games/strategic-game-of-life/

A couple of observations:

  • “Life” makes for a terrible game: it’s far too complicated to translate between “Thing I want to achieve” and “how I would choose/use/combine the pieces at hand to achieve it”
  • …Life using a library might work better, or Life with more human-friendly rules
  • Even in a simple form, the game aspects work quite well: you quickly start to learn *if you have a window open to Wikipedia or similar* how to remember and recreate the classic shapes
Categories
server admin

WordPress’s Akismet pushes Database to 30x real size

Like half the planet, I use WordPress as a blogging platform. There are many good things about it. One of those used to be Akismet (if you ignore the slightly unpleasant sales strategy: it’s free, if you give WordPress some personal details) – but today I hit a very serious bug in Akismet. I’ve had to delete Akismet – but WordPress’s coders don’t clean up after themselves, so you have to do some manual clean-up too. Read on.

Categories
iphone

Apple’s dirty secret: Xcode and Libraries

This is a post for every iPhone/iPad developer. Every day, we try to use “best practices” when writing code, and building apps.

Categories
iphone

OpenGL ES2 – Textures 1 of 3: Texturing Triangles using Shaders

OpenGL Texturing has been re-invented multiple times, with different hardware implementations each time. As a developer, you need to be comfortable with all of them.

Categories
iphone

How to keep code and images in multiple bundles in Xcode5

Apple invented “bundles” and “frameworks” to greatly improve the ease of development and maintenance of OS X projects. They are really good.

But Apple modified Xcode to try and stop people using these on iOS. I don’t know why.

Today I delved into this to find out the correct, working, simplest/easiest solution…