Saturday 11 February 2012

The Web, Time to retract the wheels?

It is said that one of the most important things about the growth and dissemination of the Web is the fact that HTML as well as CSS and Javascript are textual formats which can be read by human beings and copied.

The View Source Principle

This is considered such an important principle, by some, and as part of the Web orthodoxy, that it has a name, "The View Source Principle".

This principle has been quoted over a long period by many highly respected Web pioneers.

For example:
The Web has mostly been built by hackers, originally for hackers, and is well-known to have spread virally via the “View Source” principle: find something you like, View Source, and figure it out.
Tim Bray 2003-06-03
However, I think the virtuousness of the principle needs to be questioned. Both from a straightforward "even good ideas need to be tested" point of view, and also from the point of view that time has moved on in all these years and things ain't what they used to be. Whether it ever should have been an overarching or over-riding principle is moot. Whether it still is, is more interesting.

Of course I'm not the first person to observe this. The discussion goes back a long, long way, for example:
The "view source principle" should be treated respectfully, but it must be weighed against other requirements and constraints on the Web architecture.
Mike Champion, 2003-10-23
Here are some reasons why the virtuousness of the "View Source Principle" may be suspect:

Usage is not Exemplification

The assumptions you make from live usage may be wrong. Examples illustrate a point and are made with a specific didactic purpose. Live usage is not created, in general, with exemplification in mind. Even assuming that the content has been created by a human being and that the human being is a competent (or better) practitioner of the Web arts, whatever they did may not be a good example for what you plan to do. There may be a better way of doing it, or the author may have balanced competing design principles in making the decisions they did, and those may not be at all applicable to your circumstances. Generally speaking, you don't know by inspection.

Copying Good Practice is Good, Copying Bad Practice is Bad

A more extreme version of the above point is that the author may not have been an expert, or even competent. Their usage may be wrong, out of date, or for whatever reason not in line with good practice.

Times Change

The Web has moved on. Increasingly, it's not primarily composed of HTML that has been created in a text editor. It may be that already the majority of Web content is not. For example, this blog is composed using the standard "Blogger" tools, a Javascript-based somewhat wysiwyg editor.

Try doing a "View Source" on this page. You learn that for some reason <p> elements are not used. It would seem that for this post, at least, instead of using <p> elements, <div class="p"> is used instead. Presumably there's some CSS somewhere that specifies the same kind of visual representation of a div that a <p> would. Is that good practice that should be copied? Might you infer that a Web document should be composed of anonymous containers in a tree structure with appropriate visual styling? You might. Is that good practice? Not in my book, no.
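To make that concrete, here is a minimal sketch of what a mechanical "View Source" reading amounts to. The markup in SAMPLE is a made-up fragment in the style described above, not the actual Blogger output, and the TagTally class is purely illustrative.

```python
from collections import Counter
from html.parser import HTMLParser


class TagTally(HTMLParser):
    """Count start tags, keyed as 'tag' or 'tag.class'."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        cls = dict(attrs).get("class")
        key = f"{tag}.{cls}" if cls else tag
        self.counts[key] += 1


# Hypothetical fragment: paragraphs expressed as styled divs.
SAMPLE = '<div class="p">First paragraph.</div><div class="p">Second.<br /></div>'

tally = TagTally()
tally.feed(SAMPLE)
tally.close()
print(dict(tally.counts))
```

A reader who only has the delivered markup to go on sees two div.p containers and no p elements at all, and has no way of telling whether that was a considered choice or an accident of the tool.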

If the viral nature of View Source is good, from some points of view, it's bad from others, since it can just as easily spread bad practice and misinformation as it can anything else.

Unit of Authorship

Originally, unitary Web pages were composed in textual editors and early Web sites often had the Web page as both the unit of authorship and the unit of consumption. Some things seem to follow from that, like having the same authoring language as delivery language (HTML). But it seems to me that the assumption that the authoring language and delivery language are the same is wholly open to challenge. More on that elsewhere, later, probably.

Most Web sites today - or many at least - do not have pages that correspond to units of authorship. This one (the one you're reading) is probably a little unusual, in fact, in that the majority of the page is a single unit of authorship (sorry to keep using that ugly term) with only a limited amount of site-wide framing and additional content.

But the structure of HTML seems to follow from this (unspoken) assumption. A good example being that the <style> element was not allowed outside of <head> in HTML until HTML5. I don't know whether it was included in HTML5 to facilitate fragment processing. It's also true that you can't properly embed HTML and XML documents inside other documents without resorting to ugly (and actually impractical) escaping and commenting (you can't put a comment inside a comment so that really doesn't work well at all.)
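The embedding problem can be sketched in a few lines. This is only an illustration: the fragment in `inner` is invented, and Python's standard `html.escape` stands in for whatever escaping a real authoring tool would apply.

```python
from html import escape, unescape

# Hypothetical fragment to be embedded inside another document.
inner = '<p class="note">Fish &amp; chips <!-- inner comment --></p>'

# Option 1: escape it. Every markup-significant character is rewritten,
# so what travels over the wire no longer looks like what was authored.
escaped = escape(inner)
print(escaped)

# The escaping is reversible, but only by processing, not by reading.
assert unescape(escaped) == inner

# Option 2: wrap it in a comment. The inner "-->" closes the outer
# comment early, so the tail of the fragment leaks out as live markup.
commented = "<!-- " + inner + " -->"
first_close = commented.index("-->")
leaked = commented[first_close + len("-->"):]
print(repr(leaked))
```

Escaping works but defeats readability, and commenting fails as soon as the embedded unit contains a comment of its own.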

Even if you can work around this embedding problem, it ought to be the other way round, namely that the language should be designed to facilitate composing multiple discrete units of authorship into a unit of delivery/consumption.

It's not View Source, it's View Transfer Syntax

If you do View Source, what you're looking at, in fact, is unlikely to be what someone did directly (i.e. by writing HTML) to achieve a particular effect you want to copy. I might have created my Web page using PHP and you might want to do something similar using Ruby-on-Rails - does View Source on the HTML that is created as a result of my PHP executing help you much? And so much, that it's considered a fundamental principle?
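A small sketch of the gap between what is authored and what is delivered. Python's `string.Template` stands in here for PHP, Rails or any other server-side generator, and the template names and the `render` function are invented for the illustration.

```python
from html import escape
from string import Template

# What the author actually wrote: two templates and a loop.
PAGE_TEMPLATE = Template("<ul>\n$items</ul>\n")
ITEM_TEMPLATE = Template("  <li>$title</li>\n")


def render(titles):
    """Expand the templates into the HTML that is sent to the browser."""
    items = "".join(ITEM_TEMPLATE.substitute(title=escape(t)) for t in titles)
    return PAGE_TEMPLATE.substitute(items=items)


# What View Source shows: only the expanded result.
delivered = render(["Wheels", "Flight & landing"])
print(delivered)
```

Nothing in the delivered text reveals the templates, the loop or the escaping step, which are the parts you would need in order to do something similar in a different framework.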

Bootstrapping the Web imposed certain requirements

Bootstrapping the Web successfully was probably at least partly a result of there being minimal dependencies on tools and the fact that you could create Web pages using a simple text editor, save it to file store and have it delivered untransformed to a browser.

Operating the Web imposes different requirements

Moving forward a few years, though, and a different view starts to predominate. Now, the Web consists of content that's transferred between computers in a form designed for human beings to create and read, but that is verbose, inconvenient to generate and inconvenient to process for computers. It's increasingly rare for a delivered page to correspond to an unprocessed piece of content mapped to file store. The content is rarely unprocessed server side and is likely also to be processed client side.

The virtue of being able to capture content in transit and be able to interpret it using only a minimal tool is an extremely limited virtue. Though debugging is important, in general, the robustness and efficiency of an operational system once live are of equal or greater concern. To compromise the efficiency and robustness of live systems in the interests of being able to use a trivial debugging tool seems out of balance.

Debugging Tools

Using a text editor on raw content is in any case a matter of habit rather than priority. If you're debugging markup you're likely to be a lot better off using a validating parser to check the content. And given the inherent complexity of such a tool whether the markup was or was not originally human readable is moot. How hard would it actually be to have tools that provided a more easily human digestible view of the transfer syntax given that it was not originally so?
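As a rough illustration of letting a parser do the checking, here is a sketch using Python's standard XML parser. It only tests well-formedness of XHTML-style markup, which is a much weaker check than a validating parser performs, and the function name and sample strings are invented.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return None if markup parses as XML, else the (line, column) of the error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return err.position
    return None


good = "<div><p>One</p><p>Two</p></div>"
bad = "<div><p>One<p>Two</p></div>"  # first <p> is never closed

print(check_well_formed(good))
print(check_well_formed(bad))
```

The two strings differ by four characters and are easy to confuse by eye, yet the parser reports the problem and where it noticed it, whether or not a human could have read the markup in the first place.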

View Source in a browser is useful for debugging, but what's even more useful is inspecting the DOM as built by the browser, as exposed by Firebug or by Inspect Element in WebKit. View Source doesn't tell you what the browser has done with the markup; it tells you what the author did before the browser sorted it out into something it understands.

False Friend

There's an insidious aspect to this supposed virtuousness of "View Source" - and that is that it is strongly implied that you "can" or even "should" create the human readable content "by hand".

Although we probably accept that being able to use a simple content creation tool available on any platform was an important ingredient to the early Web, it's actually quite hard for even quite adept humans to write HTML correctly. Always has been.

And today it's even harder than ever, given that the Web has become more sophisticated and the components of it are more numerous. Today you have to master 4 syntaxes - (X)HTML, CSS, Javascript and URLs (URIs or IRIs) - none of which appear to have given any thought to harmonious co-existence with any of the others. And that's just the syntax. Never mind more consequential issues relating to grammar, the DOM and other things.

I think it's fair to say that even the most skilled practitioners cannot create more than an extremely simple modern Web site without error by using basic tools. It's way too much to ask for practitioners who are merely "functionally skilled".


And as for teaching it to unskilled people who'd like to have even basic skills and are trying to attain functional skills or better? Well, that's really the whole story behind this sequence of posts.

I haven't researched this and can't in any sense prove it - but I imagine that the teaching of basic mathematical skills is one of the most long-term and hardest undertakings that we attempt as a matter of routine with children. Well, with those children who are fortunate enough to receive a systematic education as a matter of expectation, of course.

Most will speak. Many will write. A large number fall by the wayside of mathematics despite over 10 years of tuition. The notational systems of mathematics have always seemed to me to be moderately coherent - though undoubtedly this hasn't always been the case. If you had to learn four different notations to get to a modest level of competence that would only make things so much harder, wouldn't it?

It's important to understand most of the aspects of mathematics that we teach. Arithmetic is "essential" for everyday life for most people. Being able to create a Web page is not essential in the same way, but in order for "The Web to reach its full potential" it ought to be within the reach of most high school educated people to be able to create a Web presence - beyond a most basic "hello world" (with the obligatory yellow-on-purple sideways scrolling ticker, of course - so sad that that still exists.)

Priesthoods and all That

I don't think for a moment that the current set of Web technologies was deliberately created to be hard. In fact, if anything, quite the opposite. However, it's worth wondering whether there are vested interests in making tools and services that depend to some degree on it being as hard as it is. Would Web consultants, educators and others rejoice in a simpler Web more open to less skilled people? Just because this is a paranoid point of view doesn't mean we shouldn't look at it :-)

What we should be looking for is the continual deskilling of routine tasks. This allows skills to be applied to higher level and higher value activities. It used to take an expert to make a Web page at all. It still takes an expert to make a Web page that stands a half-way decent chance of rendering in a usable way across a range of delivery targets. The business of building moderately functional cross-platform Web sites is something that is beyond the capabilities even of most priests.

Where are the tools then?

It doesn't matter, though, you may say. Look, you are using Blogger to create this page and its companion pages and to do so you haven't used any Web knowledge that you may profess to have.

That's fair, up to a point. However, using this tool I routinely create Web pages that pull in information from a number of different sources, and which I duplicate across various different destinations. I could not create those Web pages without a knowledge of HTML. I need that knowledge to sort out the Tag Soup that results from the cross pasting that is necessary to make those compositions. If I don't then I end up with a mish-mash of styles, line spacing and so on that is - well - awful. And it's not just because of the <div> <br /> HTML used here.

I don't know of any widely accepted tool set for Web creation that covers more than a niche aspect of the market or use cases for Web site creation. At best we seem to have syntax aware highlighting and optional validation. Deployment and testing cycles seem extremely poorly catered for.

One possible interpretation of this situation is that since the current set of standard components were not designed for creation by tools, it turns out that creating tools for them is really hard. A "straightforward" HTML editor needs to have a split personality if it wants to provide you with an expected wysiwyg creation experience - along with a desirable logical view.

(I accept that I've never used the Adobe tools which apparently are quite good. It would be nice to think that this was an area for a vibrant market and open competition, though).

(I guess, to be fair, I'm also not aware of any very good wysiwyg non-Web word processor, either. At least not one that allows me to pick up a document edited by someone else and continue editing it, oblivious of the assumptions they've made in creating it).

Starting Conditions vs Conditions for Growth and Scalability

I don't know what, specifically, Mike Champion had in mind by way of "other requirements and constraints on the Web architecture" in his quote above. I should try to ask him - I don't want to have wrongly co-opted his point in support of my own. Meanwhile here are some further thoughts on why this - and related vestiges of the Web's beginnings - should not be considered inviolate and indeed should be considered similarly to the coccyx - interesting from a historical perspective only. The coccyx isn't usually considered harmful. I'm suggesting that View Source is.

View Source may well have been a significant contributor to the Web taking off. But the Web has now taken off and, just as an airplane in flight doesn't need wheels, the Web today doesn't need View Source. Wheels get in the way of efficient flight, and the current set of standard components (HTML, CSS, JavaScript and URL syntax) in their present form, together with View Source as a design principle and delivery format, get in the way of efficient creation and operation of Web pages.

Deep respect, View Source, you showed the way - may your retirement be long and happy.
