Saturday, February 21, 2009 

The 'Semantic Web' vs 'Emergent Semantics' on the web

…or syllogisms vs neologisms

Tim Berners-Lee - “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”

This statement projects the typical view of the ‘semantic web’ that somehow the chaotic and loosely defined nature of the web can be tamed by applying syllogistic deductive logic. However, syllogisms often lead to the inadvertent application of generalizations that, while seeking to prove truth, end up only proving that there are always exceptions to any rule. As a pertinent example, (and to maintain a theme):

All people are unique individual humans
All Facebook users are people
Therefore, all Facebook users are unique individual humans

So, one could obviously factor in the indeterminable probability of 'Facebook-Trolls', but to put in place a system that corrects the contextual mistakes of syllogisms would be a gargantuan task. As Clay Shirky stated back in November 2003 (discussing the semantic web): “Any requirement that a given statement be cross-checked against a library of context-giving statements, which would have still further context, would doom the system to death by scale.”
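To make the failure of the syllogism concrete, here is a minimal Python sketch. The account names and owners are invented purely for illustration, and the check is only one assumed way of modelling the 'unique individual humans' premise: the conclusion can hold only if no human sits behind more than one account.

```python
from collections import Counter

# Hypothetical data: every account name and owner here is invented.
account_owner = {
    "alice_real": "Alice",
    "bob_real": "Bob",
    "troll_persona_1": "Mallory",
    "troll_persona_2": "Mallory",
    "troll_persona_3": "Mallory",
}

def syllogism_holds(owners):
    """True only if every account maps to a distinct human."""
    humans = list(owners.values())
    return len(humans) == len(set(humans))

def duplicated_humans(owners):
    """Return the humans who own more than one account, with counts."""
    counts = Counter(owners.values())
    return {human: n for human, n in counts.items() if n > 1}

print(syllogism_holds(account_owner))    # False
print(duplicated_humans(account_owner))  # {'Mallory': 3}
```

A single troll is enough to falsify the conclusion, and in practice the mapping from accounts to humans is exactly the information nobody has.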

In the end, it could be said that it’s the ‘top-down’ nature of the push for the ‘semantic web’ that makes it so obviously not a ‘bottom-up’ phenomenon.

The new breed of P2P search projects that are contending for the ‘next big thing in search’ holy-grail, like Faroo and Minerva, have taken the semantic overlay network (SON) approach, organizing peer-nodes and data objects into clusters in accordance with the inherent semantics of the content in these networks.

These projects look at the semantics of an existing resource and attempt to use semantic rules and processes to facilitate the search and retrieval of data or files from that resource. The trouble is that this is a little like the semantic web approach, where semantic rules are formulated to attempt to make some order with regard to an open-ended amount of heterogeneous web data.

This application of ‘semantics’ is in contrast to the way the term is being used in the fields of ‘Emergent-Semantics’ and ‘Semiotic-Dynamics’, which are more concerned with neologisms (newly coined words or expressions) and evolving language systems, and specifically with ‘tagging’ and ‘folksonomies’ as evidence of these phenomena (see the work of Ciro Cattuto).

‘Emergent-Semantics’ and ‘Semiotic-Dynamics’ are relatively new fields of study that have gained some interest due in part to the general interest in ‘Semantic-Web’ research, but more specifically to the recognized properties of folksonomies, which display power-law and small-world characteristics.

These fields “study how semiotic relations can originate, spread, and evolve over time in populations, by combining recent advances in linguistics and cognitive science with methodological and theoretical tools from complex systems and computer science.”

The stated aims of the ‘Semantic Web’, “a universal medium for data, information, and knowledge exchange making it possible for the web to understand and satisfy the requests of people and machines to use the web content” [Berners-Lee 2001], seem somewhat quixotic compared with the immediacy and relevance of the study of the ‘emergent semantics’ of the web and the plainly obvious evolving language systems characterised by the tagging phenomenon, which are unmistakably ‘bottom-up’ in nature.

So, what are the practical applications of ‘bottom-up’ emergent-semantic systems? I’ll have to leave that for another post.

Wednesday, August 20, 2008 

'Friends'… your new enemies

or how ‘closed’ may become the new ‘open’…

I have a friend who, up until recently, was quite a good friend, but then something strange happened. His dark, mischievous sense of humor, which had always been one of the qualities that made him unique and often terribly funny, suddenly discovered a vehicle that offered him something akin to supernatural powers: the power to transform himself into anyone he wished, or to be multiple people at the same time; the power to gain the confidence and trust of strangers by morphing into the identity of their trusted friends. On top of this, he had the power to anonymously wreak social havoc, distress and disorder, and then disappear like a thief in the night.

How did he obtain these supernatural powers? He signed up with Facebook, and slowly but surely became a Facebook “Troll”. Unfortunately, he is not alone. There are many individuals that exploit the unintended gaps within the fabric of sites like Facebook to impersonate and humiliate people that they don’t know.

One alarming aspect of this phenomenon is that these people are able to conduct this activity only by making quasi-partners of legitimate web-sites and services like Facebook and GMail, which is often used to generate fake email addresses to qualify for additional user accounts on social networking sites.

So, with human nature being what it is, one thing that we can depend on is that the trend will continue and there is very little that can be done about it. This then leads to the conclusion that in many ways the web has reached a point akin to what is known as the ‘tragedy of the commons’… meaning that the common area that became popular has now become too popular. So popular that in fact many of the benefits have been spoiled.

It’s clear that many people will profoundly regret innocently releasing their private pictures and personal details on the web, because once released, they may never be completely retrievable.

Which brings me to the idea of ‘open’ vs ‘closed’… Is it just me, or does the idea of a closed personal network to exchange information with friends seem so much more appealing than an open one?

I think there is a huge area of opportunity here, to appeal to ‘non-consumers’ of open-networks. These would be networks that people used to conduct genuine conversations with real friends from the real world. They would not necessarily be exclusive of strangers, but rather protective of relationships. New acquaintances could be invited in based on genuine qualification, again, in the real world.

My guess is that this period in the first decade of the 21st Century will be characterized by recollections of how so many people got burned by being ‘too open’.

Tuesday, May 27, 2008 

Web-Advertising is sooooo broken....

Danah Boyd had a great discussion going on her Zephoria Blog in late 2007, called: "Who clicks on ads? And what might this mean?" There's some really worthwhile information there, starting with some quotes from Dave Morgan (AOL Global Advertising Strategy):

"99% of web-users don't click on ads... and only a tiny % of those actually purchase!"

But wait... it gets worse!

"Ninety-nine percent of Web users do not click on ads on a monthly basis. Of the 1% that do, most only click once a month. Less than two tenths of one percent click more often. That tiny percentage makes up the vast majority of banner ad clicks. ~ Who are these "heavy clickers"? They are predominantly female, indexing at a rate almost double the male population. They are older. They are predominantly Midwesterners, with some concentrations in Mid-Atlantic States and in New England. What kinds of content do they like to view when they are on the Web? Not surprisingly, they look at sweepstakes far more than any other kind of content. Yes, these are the same people that tend to open direct mail and love to talk to telemarketers."

That's actually pretty revealing data, especially the rough demographic profile of the clickers themselves. This may go some way to explaining the proliferation of those really annoying gambling pop-up ads and flashing, vibrating banners proclaiming to a (surely implausibly gullible) web-user that, by some incredible stroke of luck, they have just won a really neat prize! ~ Regrettably, market forces don't lie... it seems these ads are targeted at the only people who dependably click.

As is often the case, the blog's commentariat kicks in with some worthwhile observations:

"There is another aspect to the question of "who is clicking on these ads" that I don't believe has been raised. That is, if the ads are taken as indicative of "level of interest in the population at large", then the people who are clicking on those ads are the ones who are driving marketing decisions for the world at large."

and KEVIN: "...Its the marketer that gets hurt - HE/She advertised to the wrong person. The clicker did not buy anything- (do we know if they convert?) It means that ads in the web world are worth even less than we thought. It means that Google's revenue and business model is a huge scam?"

[my note: even Google’s AdWords averages only around a 2% CTR, and Google consultant Professor Hal Varian has stated that less than 2% of ads might get clicks and less than 2% of clicks might convert to sales, meaning that only around 0.04% of ads shown might result in sales... ~ A $40B web-ad market might sound impressive, but according to Sir Martin Sorrell of WPP, the wider advertising market is a trillion-dollar market; so with such dubious current ROI, that $40B might really be largely driven by hype, and the pressure to be 'a player']
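For anyone who wants to check that arithmetic, here is a rough Python sketch. The two 2% rates are the upper bounds quoted in the note; the one-million-impression figure and the function name are my own assumptions, chosen only to make the numbers easy to read.

```python
def funnel(impressions, click_rate, conversion_rate):
    """Return (clicks, sales) expected from a number of ad impressions."""
    clicks = impressions * click_rate
    sales = clicks * conversion_rate
    return clicks, sales

CLICK_RATE = 0.02        # "less than 2% of ads might get clicks"
CONVERSION_RATE = 0.02   # "less than 2% of clicks might convert to sales"
IMPRESSIONS = 1_000_000  # hypothetical round number

overall_rate = CLICK_RATE * CONVERSION_RATE
clicks, sales = funnel(IMPRESSIONS, CLICK_RATE, CONVERSION_RATE)

print(f"overall sales rate per ad shown: {overall_rate:.2%}")
print(f"clicks per million ads: {round(clicks)}")
print(f"sales per million ads: {round(sales)}")
```

At best, then, a million ads shown yields about 20,000 clicks and about 400 sales, and both rates are ceilings rather than averages.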

But my favorite comment comes from CASEY:

"I actually work for a company that does a lot of online advertising campaigns, so I think I can shed some light this. The honest-to-god truth is that the people in charge of these campaigns have absolutely no idea what they're talking about. They describe their target audiences with phrases like, "Interested Non-Users," or by using terms they've made up, such as the gag-worthy "prosumer."
[SUBSTANTIAL EDIT] "...Of course, the punchline to all of this is the fact that most click-throughs don't translate to actual sales. If an ad campaign is relying on accidental click-throughs, or on attracting the attention of a niche market who can't afford what they're selling, then the joke is on the person footing the bill. The model is clearly broken, and most people in the industry know that, but the people signing the checks aren't in on the joke."

This all indicates a kind of grand illusion based on volume metrics: if the total number of clicks is counted in millions, even a tiny percentage will bring in some sales. However, it’s the same logic as spam and ‘cold-calling’: if you call 100 people and only get one buyer, it’s a sale, but you really annoy the other 99.

A post entitled "Bye Bye Ads" quotes usability expert Jakob Nielsen: "The most prominent result from the new eyetracking studies is not actually new. We simply confirmed for the umpteenth time that banner blindness is real. Users almost never look at anything that looks like an advertisement, whether or not it's actually an ad."

These are Clayton Christensen's 'Non-Consumers'... 'Non-Consumers' of web advertising.

Monday, February 25, 2008 

The Medium is the Mess....

Although Web leviathans like YouTube, MySpace and Facebook all clearly leverage aspects of the many-to-many/peer-to-peer trend, they also usurp and plunder the power freely given by their users by constraining them inside the legacy client-server system of the web. The difficulty is that, in the web context, the P2P meme’s pluralistic tendencies, as is obvious on sites like 'The P2P Foundation' (of which I am a member), tend to see the term 'P2P' applied in ever-increasing ways, arguably diluting some of its power and potential, and its valid identity as a technical system born of the Internet that actually predates 'The Web' by about 20 years.

A more objective Value-Axis of the Internet?

From where I stand there is a clear ‘value-axis’ existing on the Internet, and a rather peculiar ‘Cargo-Cult’ type adherence to a dominant cultural meme called “The Web” which as a term is used too often interchangeably with the term “Internet”. This simple semantic muddle must end, as it is the source of a lot of confused reasoning.

There are three primary components in a value-axis of the Internet: Connectivity, Communications and Transactions. Of these three, Connectivity is the most fundamental, followed by Communications and then Transactions, with all other general applications (information, entertainment, blogs, websites, web2.0 etc) sitting above these three. This simple taxonomy ranks factors in terms of which is more primary in its ability to ‘enable’ the others.

Websites and portals (Facebook, MySpace, Salesforce, etc) are at the upper end of this scale (i.e. least fundamental). This does not mean that consumer or business websites and ASP-based web-services are not important, but rather that as a rule these sites function atop a foundation of established connectivity, communications and transaction protocols, and are not in themselves ‘fundamental’ in the sense that they exclusively enable higher applications.

The Web itself sits on layer 2, ‘Communications’. After all, the Web, for all the hype associated with it, really just resembles a massive Amusement Park accessed by obtaining a ‘Browser’ ticket. In other words the Browser is your ticket, and you ride this communication platform, which is actually built on the more fundamental Connectivity layer. It’s no secret where the value truly resides in this mega-market duality. Browsers are free, Connectivity you pay for, and the ‘Attention Economy’ (acknowledgement to Umair Haque) sits like an ecosystem above all that, with Google currently at the top of the food-chain.
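For the more literal-minded, the value-axis can be sketched as a small ordered data structure in Python. This is only an illustration: the class and function names are mine, 'APPLICATIONS' is my label for "all other general applications", and the placements are simply the ones stated in this post.

```python
from enum import IntEnum

class Layer(IntEnum):
    """Lower number = more fundamental (enables the layers above it)."""
    CONNECTIVITY = 1
    COMMUNICATIONS = 2
    TRANSACTIONS = 3
    APPLICATIONS = 4  # websites, portals, blogs, web2.0 etc

# Placements as described in the text; the dictionary itself is illustrative.
PLACEMENT = {
    "the Web": Layer.COMMUNICATIONS,
    "Facebook": Layer.APPLICATIONS,
    "MySpace": Layer.APPLICATIONS,
    "Salesforce": Layer.APPLICATIONS,
}

def enabling_layers(layer):
    """Return the layers that must already exist beneath the given layer."""
    return [lower for lower in Layer if lower < layer]

def more_fundamental(a, b):
    """True if item a sits on a more fundamental layer than item b."""
    return PLACEMENT[a] < PLACEMENT[b]

print([l.name for l in enabling_layers(Layer.COMMUNICATIONS)])  # ['CONNECTIVITY']
print(more_fundamental("the Web", "Facebook"))                   # True
```

The point of the ordering is the 'enables' relation: nothing on a higher layer works unless everything beneath it is already in place.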

In his illuminating article ‘Content is Not King’ written in 2001, Andrew Odlyzko nailed it with prescient clarity, even though he, like so many, has used the term Internet, when he could well have been referring to the Web.

“The Internet is widely regarded as primarily a content delivery system. Yet historically, connectivity has mattered much more than content. Even on the Internet, content is not as important as is often claimed, since it is e-mail that is still the true "killer app."”
- Andrew Odlyzko, First Monday

Email (by the way) has the same status as the Web: it is a communication platform on layer two. Andrew Odlyzko does not distinguish between Communications and Connectivity. In his article referred to above, they are to all intents and purposes the same, yet his message is clear. It’s the connectivity between people that is more fundamental (and valued) than the content exchanged.

The Web and the Internet are not interchangeable concepts

So we need to appreciate that the internet and the World Wide Web are quite different things. The internet is a network that is in fact a loose array of interconnected networks. The Web has been superimposed on this global network, and is the dominant overlay-system, but it is not the only possible system that can utilize that network. The web has allowed many hundreds of millions of people to find and download information from ‘servers’ via protocols like DNS (Domain Name System) and HTTP (Hypertext Transfer Protocol), and to communicate with each other via email by use of DNS and SMTP (Simple Mail Transfer Protocol). However, these protocols serve to lock users into the ‘client’ paradigm, where ‘clients’ have to accept the terms of the businesses that control the web servers. This system also helps to make the Web and email systems vulnerable to a wide array of security problems. Albert Benschop pulls back the curtains in this slightly ominous description.

“The exponential growth and far-reaching commercialization of the web have lead to an ever-stronger manifestation of the power structures of society in the virtual world. At present specialized computers channel the data traffic on the Internet and portals and search machines such as AOL, Google and Yahoo! dominate and exploit the market of the internet-dollars. Strongly concentrated hubs have arisen that play a crucial role in the Internet traffic. They are monster-servers, diverting their information to millions of regular web-users.”
- Albert Benschop, Peculiarities of CyberSpace- University of Amsterdam

The client/server paradigm of the World Wide Web, overlaid on the internet in the late 1980s, with its multiple layers of servers sitting on their underlying enabling protocols (DNS, SMTP, FTP etc), represented, at the time, a ground-breaking innovation and has gone on to become a global phenomenon. However, as the Web has grown, its hierarchical structure, identity and addressing protocols have also facilitated many of its almost intractable negative externalities.

For all the web’s vulnerabilities to attack and corruption, there is considerable ‘lock-in’ to WWW legacy systems, with the marketplace in general having built up a history of blind acceptance of, trust in and familiarity with its processes. This is a large part of the conundrum typified in the usual search for solutions to the web’s problems.

Projects like APML (Attention Profiling Mark-up Language), BCCF (the Buyer Centric Commerce Forum) and Project VRM (Vendor Relationship Management) are all well-intentioned projects by switched-on people who want to do something about the inherent inequities and privacy problems of the web, and are arguably contained within this larger P2P pluralism. But… with the greatest respect, they all miss the point. Doing it actually ‘on the Web’ is self-defeating because it’s not a level playing field. There’s an orthodoxy present on the Web as dominant as the Catholic Church during the Middle Ages.

This is where an understanding of the pure definition of P2P, as it has developed on the Internet, may provide an instructive counter-weight, and clues to dealing with the over-hyped and over-rated orthodoxies of the web.