Extremely XML

XML is one of the more exciting file formats of the past few decades. Rather than just being a convenient way to store information, it tends to open up access to information to more software than any other format in history.

Tuesday, June 27, 2006

PC Pro: News: paper at w3 conference uses semantic web to turn social web into information goldmine

Data mining seeks to bring out valuable nuggets of knowledge buried deep in a morass of data. Not only can it do that, it often does.

Really poor data mining jumps to conclusions about people based on false premises. An example would be the assumption that matching first and last names proves matching identity. It does not. Everyone knows it does not.

Good information is qualified by carefully matching up multiple pieces of information - and taking context into account.
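
To make that concrete, here is a minimal sketch of what "more than first and last name" could look like. The `Profile` fields, the `match_confidence` function, and the thresholds are all my own illustrative assumptions, not anything taken from the researchers' paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """One record describing a person; every field except name is optional."""
    name: str
    email: Optional[str] = None
    homepage: Optional[str] = None
    affiliation: Optional[str] = None


def _norm(value: Optional[str]) -> Optional[str]:
    """Lowercase and trim a value so trivial differences do not block a match."""
    return value.strip().lower() if value else None


def match_confidence(a: Profile, b: Profile) -> str:
    """Classify how strongly two records appear to describe the same person.

    The labels and thresholds are assumptions made for illustration only.
    """
    name_a, name_b = _norm(a.name), _norm(b.name)
    same_name = name_a is not None and name_a == name_b
    # Count corroborating fields that are present on both records and agree.
    pairs = (
        (a.email, b.email),
        (a.homepage, b.homepage),
        (a.affiliation, b.affiliation),
    )
    corroborating = sum(
        1 for x, y in pairs if _norm(x) is not None and _norm(x) == _norm(y)
    )
    if same_name and corroborating >= 1:
        return "likely same person"
    if same_name:
        return "name only: unconfirmed"
    return "no match"
```

The point of the middle label is that a name-only match is kept as a question rather than treated as an answer.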

Semantic web researchers recently used their skills to piece together a puzzle involving a huge number of people.

They analyzed a bunch of FOAF files, figured out who-knew-who - and compared that with a list of C.S. researchers.
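
Since FOAF is usually published as RDF/XML, the who-knew-who step can be sketched with nothing more than Python's standard XML parser. This is a rough illustration, not the researchers' pipeline: the sample document, the names in it, and the helper functions `knows_edges` and `researchers_in_network` are made up for the example, and it only handles the simple nested `foaf:knows` layout shown here. Note that it compares people by name alone, which is exactly the weak kind of matching described above, so treat the output as candidates to verify.

```python
import xml.etree.ElementTree as ET

# Namespace prefix in ElementTree's "{uri}tag" notation.
FOAF = "{http://xmlns.com/foaf/0.1/}"

# A made-up FOAF file for illustration.
SAMPLE_FOAF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Alice Example</foaf:name>
    <foaf:knows>
      <foaf:Person><foaf:name>Bob Example</foaf:name></foaf:Person>
    </foaf:knows>
    <foaf:knows>
      <foaf:Person><foaf:name>Carol Example</foaf:name></foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""


def knows_edges(foaf_xml):
    """Return a set of (owner, friend) name pairs found in one FOAF document."""
    root = ET.fromstring(foaf_xml.strip())
    edges = set()
    # Top-level foaf:Person elements are treated as the owners of the file.
    for person in root.findall(f"{FOAF}Person"):
        owner = person.findtext(f"{FOAF}name")
        if not owner:
            continue
        # Each foaf:knows wraps a nested foaf:Person describing a friend.
        for friend in person.findall(f"{FOAF}knows/{FOAF}Person"):
            name = friend.findtext(f"{FOAF}name")
            if name:
                edges.add((owner.strip(), name.strip()))
    return edges


def researchers_in_network(edges, researchers):
    """Return the names that appear both in the FOAF edges and in the list."""
    people = {name for edge in edges for name in edge}
    return people & set(researchers)
```

Calling `knows_edges(SAMPLE_FOAF)` yields the two pairs for Alice, and `researchers_in_network` then narrows the network down to whoever also shows up in a list of researcher names.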

They took that a step further, and tried to determine how prevalent potential conflict of interest (COI) issues were.

PC Pro:
The plan was to map a simple social network Friend of a Friend, where individuals listed their immediate friends, against a commercial bibliographic database of authors computer science papers.

The latter was Semantically tagged, whereby records are attributed additional data describing each record, for example subject, date, author and so on. This means that online information can be meaningful not just to people viewing it, but also to computers accessing that data.

The goal of the research project was to discover whether there were any conflicts of interest between those authors putting forward papers and those chosen to review them. The researchers claimed the project brought out inferences that a simple topographical view would have missed.
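
As a rough picture of the author-versus-reviewer check described in that excerpt, here is a small sketch over a merged who-knows-who graph. The function names, the sample names in the usage note, and the three labels are my assumptions; they are not the paper's own classification, and a shared acquaintance is only a hint worth a closer look.

```python
from collections import defaultdict


def build_graph(edges):
    """Turn (person, person) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def coi_level(graph, author, reviewer):
    """Label the relationship between an author and a reviewer.

    "direct": the two are linked to each other.
    "shared acquaintance": they have at least one contact in common.
    "none found": no link within two hops in this data.
    """
    author_contacts = graph.get(author, set())
    reviewer_contacts = graph.get(reviewer, set())
    if reviewer in author_contacts:
        return "direct"
    if author_contacts & reviewer_contacts:
        return "shared acquaintance"
    return "none found"
```

With edges such as `[("Ann", "Ben"), ("Ben", "Cy")]`, Ann and Ben come back as "direct" while Ann and Cy come back as "shared acquaintance", the kind of second-hop link a flat list of names would not reveal.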


The result is that the researchers gleaned some interesting facts while preparing their paper, Semantic Analytics on Social Networks: Experiences in Addressing the Problem of Conflict of Interest Detection. These are facts they would not have stumbled over or inferred any other way.

What makes it interesting is how many people in the US have dumped their information into MySpace and other social websites. Even more interesting is that they identify, on that same site, who their friends are. Sadly, I doubt most of those friends listed on that site really are friends in any conventional meaning of the word.

That is where context actually comes in.

However, the mechanism is still valid. The source of the input data just has to be of adequate quality. The FOAF data the researchers drew on was closer to that than MySpace. So their conclusions, if identities were matched by more than first and last name, are probably interesting.

That... makes the Semantic Web a whole lot more interesting.