Monday, November 05, 2012

APIs Should Not Be Copyrightable

My friend Chris Adamson has a post up about why he thinks the effort, post Oracle v. Google, to keep APIs non-copyrightable is flawed.

Chris makes two main points here: 1) protecting interoperability isn't a primary concern, and 2) APIs are substantial collections of creative work and deserve protection.

The first point is interesting. The operative graph here is...

Interoperability doesn’t end if APIs are copyrighted, it just means that people and companies who create stuff control how it’s used — that’s literally what copyright is, after all — which may or may not include seeking/wanting/tolerating interoperability or reimplementation.
OK. That is one perspective. However, it would be a fascinating divergence between copyright and patent law. Patent law has always supported a clear exception for "reverse engineering for the sake of interoperability." Even if your design includes a 62-tooth gear at 0.5 cm, and no one has ever made one before, that doesn't mean you can stop people from making a 0.5 cm, 62-tooth gear as a spare part for your device. I think it is pretty clear the same applies to software.

But the part of this argument that bothers me the most is that it starts by brushing over the question of what an API is. (The second argument suffers a little from this too, but it isn't as much of a concern.) Is an API a memory location a computer JMPs to in order to begin executing code? Obviously not. An API requires a symbology. So let's try to narrow it down:

An API is a set of symbols that instruct a machine to behave in a predictable, predefined way.
That is a pretty big definition, isn't it? By jumping immediately to the Java/C/ALGOL-style definition of an API, we jump right past a lot of things. So here is a quick example:

pencolor red
fd 100
rt 120
fd 100
rt 120
fd 100
rt 60
Many of you may immediately recognize this, but what is it? Is this using an API to draw on the screen? Is this a data file format for outputting to a plotter? Does it matter? How much of LOGO can you reuse before you have "stolen" the API? If I use "rt" to mean "turn right," is that OK? What about REST APIs? Could I copyright a URL that ends in /users/[id] and /users/[id]/friends? Is that different from getUser(id) or getUser(id).getFriends()? Why would the latter be protected and the former not?
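To make the REST comparison concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the FRIENDS data, the User class, the handle() dispatcher, and the Python-style names get_user and get_friends standing in for getUser and getFriends. It only shows that a URL template and a method chain can be two spellings of the same set of symbols.

```python
import re

# Hypothetical in-memory data, invented purely for illustration.
FRIENDS = {1: [2, 3], 2: [1], 3: [1]}


class User:
    def __init__(self, user_id):
        self.id = user_id

    def get_friends(self):
        # Method-call spelling: get_user(id).get_friends()
        return [User(friend_id) for friend_id in FRIENDS[self.id]]


def get_user(user_id):
    if user_id not in FRIENDS:
        raise KeyError(user_id)
    return User(user_id)


# URL-template spelling: /users/[id] and /users/[id]/friends
ROUTES = [
    (re.compile(r"^/users/(\d+)$"),
     lambda uid: {"id": get_user(uid).id}),
    (re.compile(r"^/users/(\d+)/friends$"),
     lambda uid: [friend.id for friend in get_user(uid).get_friends()]),
]


def handle(path):
    """Map a URL path onto the same functions the method calls use."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler(int(match.group(1)))
    raise LookupError(path)


if __name__ == "__main__":
    via_url = handle("/users/1/friends")
    via_method = [friend.id for friend in get_user(1).get_friends()]
    print(via_url == via_method)
```

Both spellings end up in the same two functions. The only thing that differs is the punctuation a caller types.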

Determining what is an API versus what we would call a "data file" is getting harder and harder. OpenOffice reads Word files, Word reads WordPerfect files, WordPerfect reads WordStar files. Are these files, which contain a fixed symbology to tell a computer how to output something onto a screen and/or a bit of paper, not expressions of an API? How does that differ from an interpreted language?
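One way to see how thin the line is: the sketch below, in Python, treats the LOGO snippet from earlier as a plain text "file" and feeds it to a toy interpreter, then issues the same commands as ordinary method calls. The Turtle class, the COMMANDS table, and run() are all hypothetical names made up for this illustration, not any real LOGO implementation, and the turtle only records line segments rather than drawing them.

```python
import math

# The earlier snippet, stored as if it were the contents of a data file.
PROGRAM = """\
pencolor red
fd 100
rt 120
fd 100
rt 120
fd 100
rt 60
"""


class Turtle:
    """Toy turtle that records line segments instead of drawing them."""

    def __init__(self):
        self.x = 0.0
        self.y = 0.0
        self.heading = 90.0  # degrees; 90 means pointing "up"
        self.color = "black"
        self.segments = []

    def pencolor(self, name):
        self.color = name

    def fd(self, distance):
        angle = math.radians(self.heading)
        new_x = self.x + float(distance) * math.cos(angle)
        new_y = self.y + float(distance) * math.sin(angle)
        self.segments.append(((self.x, self.y), (new_x, new_y), self.color))
        self.x, self.y = new_x, new_y

    def rt(self, degrees):
        # Turning right decreases the heading angle.
        self.heading = (self.heading - float(degrees)) % 360


# The "file format" and the "API" share one vocabulary.
COMMANDS = {"pencolor": Turtle.pencolor, "fd": Turtle.fd, "rt": Turtle.rt}


def run(text, turtle):
    """Interpret each line of text as a command name plus arguments."""
    for line in text.splitlines():
        parts = line.split()
        if parts:
            COMMANDS[parts[0]](turtle, *parts[1:])
    return turtle


if __name__ == "__main__":
    from_file = run(PROGRAM, Turtle())

    from_calls = Turtle()
    from_calls.pencolor("red")
    for turn in (120, 120, 60):
        from_calls.fd(100)
        from_calls.rt(turn)

    print(from_file.segments == from_calls.segments)
```

Read as a file, the text is data. Read as a sequence of calls, it is a program written against an API. The interpreter in the middle is about a dozen lines.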

The thing is, in 1992 a lawsuit already determined that the Hayes Command Set for modems wasn't protected. Surely that was an API, if there is any possible definition of one. And it isn't just Hayes: "API" compatibility has been at the core of the entire PC industry since its inception. Think of language reimplementations, AMD using the Intel instruction set, and "Sound Blaster" becoming the default audio API on MS-DOS and Windows for many years. Chris might feel that these uses were unfair, but I shudder to think what the industry would look like today without them. It might also be fair to say that I am being somewhat farcical in saying URL templates could be copyrightable. However, if we have learned anything about copyright and patent law in recent years, it is that what seems to be "common sense" among practitioners is rarely how things shake out in a legislature or jury room.

Chris's second main point is captured in this graph:

The software architect who designs a public API has to make value judgements about readability, feasibility, practicality, implementability, and so on. She has to conceive of both how the code will be implemented, and how it will be used, how it is likely to consume resources (storage, I/O, db, CPU) under different use scenarios, and how to deliver value to whoever calls it. In a way, this is the most abstract, highest-level of thinking we do in software. Why would that be unworthy of copyright, but the drudgery of all the for-next blocks in its implementation be protected? This is backwards!

Again, no one is arguing that an API isn't a creative work. However, simply being a creative work is not enough to warrant copyright protection. Clothing designs, recipes, and many other significant works of creativity are not covered by blanket copyright. Indeed, as much as the NFL and MLB might hate it, statements of fact, even if they include references to copyrighted works, are not protected. There also used to be a huge industry creating indexes and concordances, which seems to me akin to reusing an API, and that work was considered a protected activity.

But beyond this, I am forced to fall back on analogy. The Encyclopedia Britannica has represented a monumental amount of work for a great many scholars for decades. I tend to look at software as being very much akin to writing encyclopedias, as you are coordinating authoritative locations for expressions of ideas, attempting to reference other articles and be as concise as possible without omitting key ideas. Now, suppose I took the Table of Contents of the Encyclopedia Britannica, itself a couple hundred pages, and paid a bunch of people to build a new encyclopedia with those entries. Have I violated the copyright of the Britannica? I would say no. Certainly there was creative input that went into the selection of those topics, and there is definitely an editorial product there. However, the Table of Contents is generally not something that we would consider "the work." Rather, like indexes and concordances, it is a fact about the work.

Now, let's say I took that Table of Contents, edited it down to one third of the original size, and produced "Cooper's Brief Encyclopedia." I started with Britannica's TOC, made my own editorial judgement as to what was important and what was not, then paid a bunch of people to fill in the pages. Surely this is analogous to Google's use of a ~30% subset of the Java API in Android, no?

That is, an API is a creative work, but it is also a simple statement of fact about the larger creative work, not a work unto itself. To say it deserves the same protection as the implementation is to open a very large can of worms, not just in the software world, but outside of it.