Schema Schmema

Written by Nexcerpt on June 3rd, 2011 in Dating & Online.

According to nearly simultaneous announcements yesterday from Google, Yahoo, and Bing, this new project, Schema.org, brought together their collective best efforts to standardize certain markup elements in one Schema.

Summary: more evidence that corporations should not be in charge of Anything Important ™.

As Bing’s release said, the web is full of “messy bits of data”: an incomprehensible volume of material lacking structure. Schema.org wants to make the web easier to read for the computers that increasingly do our reading for us.

You might read the name “James Cameron,” recognize him as a movie director, and remember his films. Schema.org wants computers to exchange hidden codes that say at least as much. If you do that sort of thing (tagging data with codes), the idea seems simple. If you don’t, it seems bizarre.

To help you visualize the work, a search engine might have access to hidden codes that specify “James Cameron” (a Person with the property “birthDate” of type “Date,” and the property “nationality” of type “Country,” and the property “affiliation” of type “Organization,” and the property “performerIn” of type “Event…”). People would see only a name. Search engines would see details, and be able to discover useful connections.
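If that still seems bizarre, here is a rough sketch in Python of the same idea. The property names and types come straight from the example above; the dictionary layout and the two helper functions are my own invention for illustration, not Schema.org's actual markup syntax.

```python
# Hypothetical in-memory picture of the hidden codes behind one name.
# Property names and expected types are taken from the example above;
# the structure itself is an assumption made for illustration.
person = {
    "type": "Person",
    "name": "James Cameron",
    "properties": {
        "birthDate": "Date",
        "nationality": "Country",
        "affiliation": "Organization",
        "performerIn": "Event",
    },
}


def visible_text(item):
    """What a person reading the page sees: only the name."""
    return item["name"]


def machine_view(item):
    """What a search engine could see: the type, plus each property and its expected type."""
    lines = [f'{item["name"]} is a {item["type"]}']
    for prop, expected_type in item["properties"].items():
        lines.append(f"  {prop} -> {expected_type}")
    return lines


print(visible_text(person))
print("\n".join(machine_view(person)))
```

The reader gets one line; the machine gets the whole list.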

Much of the web lacks such structured detail. That lack, of course, reflects a wide range of failures — failures of planning, and of organization, and of understanding, and of systems — but primarily the failure of those who manage such data to exercise much discipline behind the scenes. There are other, more obvious priorities… for example, spelling “Cameron” correctly.

So, Schema.org must have solved the behind-the-scenes discipline problem, right? Joining the most technologically advanced companies in the world? Using the best minds they can collectively muster? Applying the most rigorous organizational thinking? Haha. Dream on.

Let’s say someone wants to sell stuff online. (Lots of people want to do that. It’s the economy, stupid.) Obviously, they want to organize their stuff within a disciplined, structured system. Here’s the system offered by the Young Geniuses at Schema.org:

First, recognize that a “Thing” can be some “Intangible” like a “Quantity” which has “Mass.” What could be more intangible than mass? OK. With that firmly in mind…

The “Thing” — which is in the “Offer” — is the “Item.”

See how thoughtfully defined words move you toward a more organized, disciplined, structured way of thinking? Especially such logical and exquisitely parallel naming conventions!

Property "itemCondition" is of type "OfferItemCondition."
Property "availability" is of type "ItemAvailability."

See the pattern there? No? Right! There isn’t one.
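For the skeptics, a small Python sketch that tries to recover a rule from those two pairs. The helper names are mine, and the "rule" being tested (capitalize the property, then see what was glued on the front) is only my guess at what a convention might have looked like.

```python
# The two property/type pairs quoted above.
pairs = {
    "itemCondition": "OfferItemCondition",
    "availability": "ItemAvailability",
}


def capitalize_first(name):
    """Upper-case only the first letter, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


def prefix_added(prop, type_name):
    """Return whatever sits in front of the capitalized property name,
    or None if the type name does not end with it at all."""
    core = capitalize_first(prop)
    if type_name.endswith(core):
        return type_name[: -len(core)]
    return None


for prop, type_name in pairs.items():
    print(prop, "->", type_name, "| prefix:", prefix_added(prop, type_name))
```

One type gets “Offer” bolted on, the other gets “Item.” Two pairs, two different rules.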

It’s mind-boggling that people of such intelligence, from organizations of such power, didn’t actually sit down and make some decisions. Schema.org allows a property and type to share the same name. For example, aggregateRating (starting with lower case for property) with AggregateRating (starting with upper case for type). And yet, on the same page, the type “Review” serves the property “reviews.” In facts, that occurs everywheres… from tops to bottoms of the entires schemas.

Perhaps I’m digging too deep, too soon. Let’s look at the top-level designations. These should be perfectly organized and sensibly structured for ideal discipline!

“Article” can be of type “NewsArticle” or “ScholarlyArticle” or “BlogArticle”… no, “BlogPosting.” Oops.

Then, “Blog” uses the type “BlogPosting” to articulate the property “blogPosts.” (Which is not the type… in case you were confused.)

I’m not harping on my personal pet peeve. If I were, I’d point out my horror upon seeing names like “BusinessEvent,” “ComedyEvent,” and “DanceEvent.” (Because, obviously, we’d never want to sort: “EventBusiness,” “EventComedy,” “EventDance…” for example, to organize our documentation.)

No, I’m harping on naming thirteen thing-item-objects in a particular way — “{some}Event” — and then inserting a fourteenth called “Festival.” Not “FestivalEvent.” Because… that would be confusing.

Harping that each of these can have “SuperEvent” and “SubEvents.” Not “SubEvent.”

Harping that property “bookFormat” is of type “BookFormatType.” Literally the property with the word “Type” added.

Harping that their science training held sway when they named a Thing’s Intangible Quantity “Mass” rather than “Weight.” But science got lost in naming one “Landform” a “BodyOfWater” rather than “Water.” Consider this inchoate offense against all our sensibilities:

More specific types
* Canal
* LakeBodyOfWater
* OceanBodyOfWater
* Pond
* Reservoir
* RiverBodyOfWater
* SeaBodyOfWater
* Waterfall

…wherein we must be reminded that (lest we confuse such unusual terms) “Ocean,” “Sea,” “Lake,” and “River” all constitute a “BodyOfWater,” though “Pond” and “Reservoir” do not.
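You don’t have to take my word for it. A few lines of Python, using exactly the eight names listed above, will sort the faithful from the heretics. The function name is mine; nothing here comes from Schema.org’s own tooling.

```python
# The "More specific types" list, exactly as quoted above.
subtypes = [
    "Canal",
    "LakeBodyOfWater",
    "OceanBodyOfWater",
    "Pond",
    "Reservoir",
    "RiverBodyOfWater",
    "SeaBodyOfWater",
    "Waterfall",
]

PARENT = "BodyOfWater"


def split_by_suffix(names, suffix):
    """Separate names that repeat the parent type as a suffix from names that do not."""
    with_suffix = [n for n in names if n.endswith(suffix)]
    without_suffix = [n for n in names if not n.endswith(suffix)]
    return with_suffix, without_suffix


tagged, untagged = split_by_suffix(subtypes, PARENT)
print("Carry the suffix:", tagged)
print("Do not:", untagged)
```

Eight siblings, split evenly between two naming schemes.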

There are moments when we’re led to believe someone was awake. Type “AudioObject” for property “audio,” type “VideoObject” for property “video,” and type “MediaObject” for — D’oh! — for property “encodings.”

The group has squandered most opportunities to structure this system. It’s relentlessly bad at the most fundamental level. One property of Place is named “maps,” which is of type “Map.” Another property named “photos” is of type “Photograph.” Yet another property named “geo” is of type “GeoCoordinates.” It’s as though they never even looked at that list.
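Here is what looking at that list might have involved, sketched in Python. The pairs are the ones quoted in this post; the categories and the `relation` function are my own rough classification, assumed purely for illustration.

```python
# Property -> type pairs quoted in this post.
pairs = {
    "aggregateRating": "AggregateRating",
    "reviews": "Review",
    "maps": "Map",
    "photos": "Photograph",
    "geo": "GeoCoordinates",
    "encodings": "MediaObject",
}


def relation(prop, type_name):
    """Describe how a property name relates to its type name (hypothetical categories)."""
    cap = prop[:1].upper() + prop[1:]
    if cap == type_name:
        return "same name"
    if cap == type_name + "s":
        return "plural of type"
    if type_name.startswith(cap):
        return "type extends property"
    if cap.endswith("s") and type_name.startswith(cap[:-1]):
        return "type extends singular"
    return "no obvious rule"


rules = {prop: relation(prop, type_name) for prop, type_name in pairs.items()}
for prop, rule in rules.items():
    print(f"{prop} -> {pairs[prop]}: {rule}")
print("Distinct rules in use:", len(set(rules.values())))
```

Six pairs need five different explanations, and one of those explanations is a shrug.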

With such an eye toward organization and discipline, these same people soon will be defining new tools to manage our credit card usage. Or private medical records. Or public elections. Imagine how much easier it will be to understand… everything.
