http://www.developer.com/java/other/article.php/1475381/Role-of-Java-in-the-Semantic-Web.htm
Tim Berners-Lee, the inventor of the World Wide Web, defines the Semantic Web as:
"The extension of the current web in which information is given well-defined meaning, better
enabling computers and humans to work in cooperation."
The Semantic Web is an abstract representation of data on the World Wide Web, based on the RDF (Resource Description
Framework) standards and other standards still to be defined. It is being developed by the W3C (World Wide Web Consortium),
with participation from academic researchers and industrial partners. Data can be defined and linked in such a way
that it supports more effective discovery, automation, integration, and reuse across different applications.
Most of today's Web content is designed for humans to read and understand, not for machines and computer
programs to manipulate meaningfully. Computers can adeptly parse Web pages for layout and routine processing but, in general,
machines have no reliable way to process the semantics. The Semantic Web will bring structure to the meaningful content
of Web pages, so that software agents roaming from page to page or from site to site can readily carry out sophisticated
automated tasks for users.
The Semantic Web is an extension of the World Wide Web, in which information is given well-defined meaning, better enabling
computers and people to work in cooperation. The first steps in incorporating the Semantic Web into the structure of the
existing Web are already under way. In the near future, these developments will roll out significant new functionality as
machines become much better able to process and understand the data that the Web merely displays at present. To date, the Web
has developed most rapidly as a medium for documents for humans rather than for data and information that can be processed
automatically. If you want something from the Web, you have to get it manually. By "manually" I mean
that if you want to find specific information or a product on the internet, for example a book to buy, then you
must sit at your computer and search the most popular online bookstores, category by category, for titles that match what you want.
The Semantic Web aims to remove this manual dependency (users should be able to rely on software to do the task autonomously), and it
will be as decentralized as possible, just like the Internet.
The key to the development of the Semantic Web is Machine Intelligence. Other terms that are frequently used interchangeably
with Machine Intelligence are Machine Learning, Computational Intelligence, Soft-Computing and Artificial Intelligence.
Although industry and academia use the five terms interchangeably, researchers in these fields treat them as
different branches. For example, Artificial Intelligence involves symbolic computation, while Soft-Computing involves
intensive numeric computation.
The following sub-branches of Machine Intelligence (mainly symbolic Artificial Intelligence) are being addressed for the Semantic Web:
Although symbolic Artificial Intelligence is what is currently being built into Semantic Web data representation, there
is little doubt that software tool vendors and developers will eventually incorporate the Soft-Computing paradigm
into it. The advantage that Soft-Computing adds to symbolic Artificial Intelligence is that it makes software
applications adaptive: a Soft-Computing program can deal with, and adapt to, unforeseen
input that was not built into it. This is in contrast to the non-adaptive nature of purely symbolic
Artificial Intelligence, which cannot deal with or adapt to unforeseen input (stimuli).
There are a number of Machine Intelligence-related JSRs (Java Specification Requests) in the JCP (Java Community Process), two of which are
currently in public review. These JSRs are listed below:
As can be seen from the above list, only a small part of Machine Intelligence is currently covered by JSRs
in the JCP. The list is expected to grow as new Machine Intelligence-related JSRs are proposed to the JCP.
The various disciplines of Machine Intelligence, which have existed for over fifty years, have been applied successfully in many areas of
software, and it is only now that they are being applied to the internet through extensions such as the Semantic Web. New
branches of Machine Intelligence are constantly being developed.
Knowledge Acquisition is defined as the extraction of knowledge from different sources of information. Examples
include extracting knowledge from a human expert in a specific domain (such as a doctor, lawyer, or financial advisor),
extracting trading rules from a stock exchange database, and extracting linguistic rules from a linguistic database.
Knowledge Representation is defined as the expression of knowledge in computer-tractable form, so that it can be used
to help software agents perform well. Software agents (also called softbots, as opposed to robots, which are mechanical devices)
are programs that perceive their environment through sensors and act upon that environment through
effectors. A Knowledge Representation language is defined by two aspects:
For the Semantic Web to function, computers must have access to structured collections of information and sets of
inference rules that they can use to conduct automated reasoning. Traditional knowledge-representation systems typically
have been centralized, requiring everyone to share exactly the same definition of common concepts such as "fruit" or "vehicle."
But central control is stifling, and increasing the size and scope of such a system rapidly becomes unmanageable. These systems
usually carefully limit the questions that can be asked so that the computer can answer reliably, or answer at
all. In avoiding such problems, traditional knowledge-representation systems generally each had their own narrow and
idiosyncratic set of rules for making inferences about their data. For example, a genealogy system, acting on a database
of family trees, might include the rule "a wife of an uncle is an aunt." Even if the data could be transferred from one
system to another, the rules, existing in a completely different form, usually could not.
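To make the portability problem concrete, here is a minimal sketch of how the rule "a wife of an uncle is an aunt" might be hard-coded in a traditional system. The class and method names (GenealogyRules, addUncle, addWife, auntsOf) are hypothetical and chosen only for illustration. Note that the rule lives inside procedural Java code, so another system could import the family-tree data but not the rule itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical genealogy system with one inference rule buried in its code. */
class GenealogyRules {
    // person -> the uncles of that person
    private final Map<String, Set<String>> unclesOf = new HashMap<>();
    // husband -> wife
    private final Map<String, String> wifeOf = new HashMap<>();

    void addUncle(String person, String uncle) {
        unclesOf.computeIfAbsent(person, k -> new HashSet<>()).add(uncle);
    }

    void addWife(String husband, String wife) {
        wifeOf.put(husband, wife);
    }

    /** Applies the rule "a wife of an uncle is an aunt" to the stored facts. */
    Set<String> auntsOf(String person) {
        Set<String> aunts = new HashSet<>();
        for (String uncle : unclesOf.getOrDefault(person, Collections.<String>emptySet())) {
            String wife = wifeOf.get(uncle);
            if (wife != null) {
                aunts.add(wife);
            }
        }
        return aunts;
    }

    public static void main(String[] args) {
        GenealogyRules family = new GenealogyRules();
        family.addUncle("Ann", "Bob");   // Bob is Ann's uncle
        family.addWife("Bob", "Carol");  // Carol is Bob's wife
        System.out.println(family.auntsOf("Ann"));
    }
}
```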
The Semantic Web provides a language that expresses both data and rules for reasoning about the data and that allows
rules from any existing knowledge-representation system to be exported onto the Web. eXtensible Markup Language (XML)
and the Resource Description Framework (RDF) are important technologies for developing the Semantic Web. XML lets
everyone create their own tags: hidden labels that annotate Web pages or sections of text on a page. Scripts and programs
can make use of these tags in sophisticated ways, but the script writer has to know what the page writer uses each
tag for. XML allows users to add arbitrary structure to their documents but says nothing about what the structures
mean. The meaning is expressed by RDF, which encodes it in sets of triples, each one being rather like the subject,
verb and object of an elementary sentence. These triples can be written using XML tags. In RDF, a document makes
assertions that particular things (fruit, Web pages, movies and so on) have properties such
as "is a type of" (an orange is a type of fruit) or "is a kind of" (Die Hard is a kind of action movie),
with certain values (fruit, action movie). This sort of structure turns out to be a natural way to describe
the vast majority of the data processed by machines. Subject and object are each identified by a Uniform
Resource Identifier (URI), just as used in a link on a Web page. (URLs, Uniform Resource Locators, are
the most common type of URI.) The verbs are also identified by URIs, which enables anyone to define a
new concept, a new verb, just by defining a URI for it somewhere on the Web. The triples of RDF form webs
of information about related things. Because RDF uses URIs to encode this information in a document, the
URIs ensure that concepts are not just words in a document but are tied to a unique definition that everyone
can find on the Web.
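The triple structure described above can be sketched in a few lines of Java. This is only an in-memory illustration, not an RDF library: the classes Triple, TripleStore and TripleDemo are hypothetical, and the http://example.org/terms# namespace is a placeholder rather than a real vocabulary.

```java
import java.util.ArrayList;
import java.util.List;

/** One statement: subject, predicate (the "verb") and object, each named by a URI. */
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

/** A tiny in-memory collection of triples with a simple pattern query. */
final class TripleStore {
    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Returns the object of every triple that matches the given subject and predicate. */
    List<String> objectsOf(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                result.add(t.object);
            }
        }
        return result;
    }
}

class TripleDemo {
    // Placeholder namespace, assumed for this example only.
    static final String EX = "http://example.org/terms#";

    static TripleStore sample() {
        TripleStore store = new TripleStore();
        store.add(EX + "Orange", EX + "isATypeOf", EX + "Fruit");
        store.add(EX + "DieHard", EX + "isAKindOf", EX + "ActionMovie");
        return store;
    }

    public static void main(String[] args) {
        System.out.println(sample().objectsOf(EX + "Orange", EX + "isATypeOf"));
    }
}
```

Because the predicate is itself a URI, anyone can introduce a new "verb" simply by using a new URI in the middle position; the store needs no change.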
Knowledge Engineering is defined as: "Selecting of a Logic for building Knowledge-Base Systems, with the
implementation of the Proof Theory and also the Inference of new Facts". Thus Knowledge Engineering specifies
what is true, and the inference procedure figures out how to turn the facts into a solution. Since a fact is true
regardless of what task one is trying to solve, knowledge bases can, in principle, be reused for a variety of different
tasks without modification. This is in sharp contrast to procedural programming, where a slight modification
requires recompiling the program. Thus Knowledge Engineering takes a declarative approach.
Agent-based software engineering makes all sorts of systems and resources interoperable by providing an interface
based on first-order logic. An important step in the development of a knowledge base is to decide on a vocabulary
of predicates, functions and constants. For example, should Size be a function or a predicate? Would Bigness be a better
name than Size? Should Small be a constant or a predicate? Once the choices have been made, the result is a vocabulary
known as an ontology. Thus an ontology is a document or file of vocabulary that formally defines the relations among
terms. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules.
The taxonomic hierarchy defines classes of objects and relations among them. Taxonomies have been used explicitly for
centuries in technical fields. For example, systematic biology aims to provide a taxonomy of all living and extinct
species; library science has developed a taxonomy of all fields of knowledge, encoded as the Dewey Decimal system; tax
authorities and other government departments have developed extensive taxonomies of occupations and commercial products.
First-order logic (FOL) is defined as a general-purpose representation language that is based on an ontological
commitment to the existence of objects and relations in the world. FOL makes it easy to state facts about categories,
either by relating objects to the categories or by quantifying over their members, for example:
Since Tomatoes is a category and is also a member of DomesticatedSpecies, DomesticatedSpecies must be a category
of categories. One can have categories of categories of categories, but they are not much use. Although subclass and
instance relations are the most important ones for categories, there is a need to be able to state relations between
categories that are not subclasses of each other. For example, if you say only that Males and Females are subclasses of Animals,
then you have not said that a male cannot be a female. If two or more categories have no members in common, they are
called disjoint.
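The difference between the subclass relation and disjointness can be illustrated with a small sketch that models each category simply as the set of its members. This extensional model is an assumption made for the example (a logic-based system would state these relations as axioms), and the Categories helper class and the member names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that treats a category as the set of its members. */
class Categories {
    static Set<String> category(String... members) {
        return new HashSet<>(Arrays.asList(members));
    }

    /** Subclass relation: every member of sub is also a member of sup. */
    static boolean isSubclass(Set<String> sub, Set<String> sup) {
        return sup.containsAll(sub);
    }

    /** Disjointness: the two categories have no members in common. */
    static boolean areDisjoint(Set<String> a, Set<String> b) {
        return Collections.disjoint(a, b);
    }

    public static void main(String[] args) {
        Set<String> animals = category("rex", "tom", "daisy");
        Set<String> males = category("rex", "tom");
        Set<String> females = category("daisy");

        // Both are subclasses of Animals, but that alone says nothing about overlap.
        System.out.println(isSubclass(males, animals) && isSubclass(females, animals));
        // Disjointness is a separate statement that has to be checked (or asserted) on its own.
        System.out.println(areDisjoint(males, females));
    }
}
```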
The ontologies on the Web range from large taxonomies categorizing Web sites (such as on Yahoo!) to categorizations
of products for sale and their features (such as on Amazon.com). For example, an address (a category) may be defined as
a type of location (a category), and city codes (a property) may be defined to apply only to locations, and so on. Classes,
subclasses and relations among entities are a very powerful tool for Web use. A large number of relations can be
expressed among entities by assigning properties to classes and allowing subclasses to inherit such properties.
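As a rough sketch of property inheritance in a taxonomy, the class below models the address/location example. OntologyClass and the property names "cityCode" and "street" are hypothetical; real ontology languages offer far richer constructs than this.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical node in a taxonomy: a class with an optional parent and its own properties. */
class OntologyClass {
    final String name;
    final OntologyClass parent; // null for a root class
    private final Set<String> ownProperties = new HashSet<>();

    OntologyClass(String name, OntologyClass parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Declares a property on this class and returns this class for chaining. */
    OntologyClass withProperty(String property) {
        ownProperties.add(property);
        return this;
    }

    /** True if this class is the given class or one of its descendants. */
    boolean isA(OntologyClass other) {
        for (OntologyClass c = this; c != null; c = c.parent) {
            if (c == other) {
                return true;
            }
        }
        return false;
    }

    /** Own properties plus everything inherited from ancestor classes. */
    Set<String> properties() {
        Set<String> all = new HashSet<>();
        for (OntologyClass c = this; c != null; c = c.parent) {
            all.addAll(c.ownProperties);
        }
        return all;
    }

    public static void main(String[] args) {
        OntologyClass location = new OntologyClass("Location", null).withProperty("cityCode");
        OntologyClass address = new OntologyClass("Address", location).withProperty("street");
        // An address is a type of location, so it inherits the cityCode property.
        System.out.println(address.isA(location) && address.properties().contains("cityCode"));
    }
}
```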
Applying inference rules to ontologies provides powerful logical deductions. With ontology pages on the Web,
solutions to terminology problems begin to emerge. The meaning of the terms, vocabularies or XML codes used on a Web
page can be defined by pointers from the page to an ontology. Different ontologies need to provide equivalence
relations (stating which of their terms have the same meaning); otherwise there would be conflict and confusion.
Ontologies can enhance the functioning of the Web in different ways. They can be used in a simple
fashion to improve the accuracy of Web searches: the search program can look for only those pages that refer to precise
vocabularies and concepts instead of all the ones using ambiguous keywords. More advanced applications will use ontologies
to relate the information on a page to the associated taxonomy hierarchies, knowledge structures and inference rules.
As defined previously, software agents perceive their environment through sensors and act upon that environment
through effectors. Autonomous agents are software agents whose behaviour is determined by both their own
experience and their built-in knowledge. Here is an example of a non-autonomous (physical) agent: if a clock manufacturer knew that
the clock owner would be going to Australia on some particular date, then a mechanism could be built in to adjust
the hands automatically by six hours at just the right time. This would certainly be successful behaviour, but
the intelligence seems to belong to the clock's designer rather than to the clock itself. The same applies to software agents:
the software seems to exhibit intelligence, but it is actually the programmer's effort rather than the software's own. Autonomy
gives true intelligence, where the software agent can use its past experience plus its built-in knowledge to deduce new goals.
Intelligent web agents, such as software travel agents, would do everything for the user: the agents find possible
ways to meet the user's needs and offer the user choices for achieving them. Much as a travel agent might give you a list of
several flights you could take, or a choice of flying versus taking a train, a web agent should offer a range of possible ways
to get you what you need on the web. Suppose you delegate to the agents the task of finding the best flight route and the minimum cost for a
trip from London to Washington via Paris, stopping over for two nights at a reasonably priced hotel that must
be near a beach in a city where the temperature is below room temperature. As you can see, this is a complex query that today needs
a human to search the WWW for flights, prices, hotels, weather and so forth to arrive at the best deal. Instead, you can specify what
you want and leave the agents to complete the task. You (the user) can take a break, make yourself a cup of coffee
and come back to your computer, by which time the agent may have returned (the elapsed time depends on the complexity of the query). It will
propose a best flight route, the minimum cost for two nights in Paris, a hotel that is near a beach, and
so forth. All this is done by a machine; you have been freed from the burden of having to sit and browse the internet for
perhaps hours, trying to achieve your goal. With the Semantic Web, software agents can rely on their own experience, and can learn,
adapt and gain new knowledge during their exchanges with the hotel ontologies, weather agency ontologies,
airline ontologies and so on.
Many automated Web-based services already exist without semantics, but other programs such as agents have no way to locate
one that will perform a specific function. This process, called service discovery, can happen only when there is a common
language to describe a service in a way that lets other agents "understand" both the function offered and how to take advantage
of it. Services and agents can advertise their function by, for example, depositing such descriptions in directories similar
to the Yellow Pages.
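A directory of this kind can be sketched as a simple registry in which providers advertise under a function name and agents look providers up by that name. This is an illustrative assumption, not the API of any real discovery scheme: ServiceDirectory, the function URI and the provider identifiers are all hypothetical. The point is that advertiser and searcher only meet if they use the same term for the function, which is what a shared ontology is meant to supply.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical "Yellow Pages" where services advertise the function they perform. */
class ServiceDirectory {
    // function URI (taken from a shared vocabulary) -> providers offering that function
    private final Map<String, List<String>> providersByFunction = new HashMap<>();

    /** A service deposits a description of what it does. */
    void advertise(String functionUri, String provider) {
        providersByFunction.computeIfAbsent(functionUri, k -> new ArrayList<>()).add(provider);
    }

    /** An agent looks for services performing the given function; empty if none are known. */
    List<String> discover(String functionUri) {
        List<String> found = providersByFunction.get(functionUri);
        return found == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(found);
    }

    public static void main(String[] args) {
        ServiceDirectory directory = new ServiceDirectory();
        // Placeholder identifiers, assumed for this example only.
        directory.advertise("http://example.org/services#FlightBooking", "agent://airline-a");
        System.out.println(directory.discover("http://example.org/services#FlightBooking"));
    }
}
```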
There are some low-level service-discovery schemes which are currently available, such as Microsoft's Universal Plug and
Play, which focuses on connecting different types of devices, and Sun Microsystems's Jini, which aims to connect services. These
initiatives, however, attack the problem at a structural or syntactic level and rely heavily on standardization of a predetermined
set of functionality descriptions. The Semantic Web is more flexible by comparison. The consumer and producer agents can reach
a shared understanding by exchanging ontologies, which provide the vocabulary needed for discussion. Agents can even bootstrap
new reasoning capabilities when they discover new ontologies. Semantics also makes it easier to take advantage of a service
that only partially matches a request.
There are a number of tools (commercial and free-source) written in Java for developing ontologies and
knowledge bases (see Resources). The following tools will help you build ontologies, logical reasoning systems,
agent systems and knowledge bases in Java. Here are some of the popular ones (see Resources for download links):
The tools described above are only some of the vast number of Java tools and APIs for machine intelligence
that would be instrumental in Semantic Web application development. The number of Java tools available for machine
intelligence (free-source and commercial) is growing week by week. New machine intelligence tools and APIs written in Java are announced
at SourceForge regularly.
The Java language will play an influential role in the future of the Semantic Web, as all the current FIPA-compliant software agent
tools (commercial and free-source) are written in Java. Why? Because of Java's advantages for distributed computing and
the WORA (Write Once, Run Anywhere) principle. Software agents can move as mobile code from host to host, carrying their state
with them. Note that the .NET platform relies on XML, and applications built on it can only move data (mobile data) around
between different applications, while Java can dispatch software agents as mobile code (plus mobile data) wherever a JVM is
available. The ability of software agents to act autonomously comes from true mobility of the code, because it is the code, not the data, that executes the
computational task. An agent can freeze its computation at host A, save its state, then move (with its
current state) to host B and continue the computation where it left off at host A. There is a huge role for Java
in the Semantic Web, in which a JSR such as the Java Agent Services API (JSR-87) is already FIPA compliant.
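The freeze, move and resume cycle can be sketched with standard Java object serialization. This is a simplified illustration under stated assumptions: CountingAgent is a hypothetical class, the byte array stands in for the network transfer between host A and host B, and only the agent's state travels here. A real mobile-agent platform would also have to ship or load the agent's class on the receiving host.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical agent whose task is to add up the numbers 0, 1, 2, ... one step at a time. */
class CountingAgent implements Serializable {
    private static final long serialVersionUID = 1L;

    private int nextValue = 0; // where the computation will resume
    private long total = 0;    // result accumulated so far

    /** Performs the given number of steps of the computation. */
    void work(int steps) {
        for (int i = 0; i < steps; i++) {
            total += nextValue;
            nextValue++;
        }
    }

    long total() {
        return total;
    }

    /** "Host A": freezes the agent, including its current state, into bytes. */
    byte[] freeze() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** "Host B": restores the agent so it can continue where it left off. */
    static CountingAgent thaw(byte[] frozen) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(frozen))) {
            return (CountingAgent) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not restore agent", e);
        }
    }

    public static void main(String[] args) {
        CountingAgent onHostA = new CountingAgent();
        onHostA.work(3);                        // partial work on host A
        byte[] inTransit = onHostA.freeze();    // stand-in for sending over the network
        CountingAgent onHostB = thaw(inTransit);
        onHostB.work(2);                        // continues from the saved state
        System.out.println(onHostB.total());
    }
}
```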
Sione Palu has developed software for Publishing Systems, Imaging,
Web Applications, Symbolic and Computer Algebra Systems for secondary education. Palu graduated from the University of Auckland,
New Zealand, with a science degree (B.Sc) in mathematics and computing. His interests involve the application of Java and
Mathematics in the fields of mathematical modelling and simulations, symbolic AI and soft-computing, numerical analysis,
image processing, wavelets, digital signal processing, control systems and computational finance.
Role of Java in the Semantic Web
October 3, 2002
Definition
Introduction
Machine Intelligence
Knowledge Acquisitions and Representations
General Ontology
Software Agents
Available Java Tools
Example of backward-chaining reasoning: Suppose that I leave my pot of potatoes to boil in the kitchen while I go to the lounge
to watch a favourite TV show. I estimate that the potatoes will be cooked in about half an hour, after which I should come
back to the kitchen and turn off the oven. The TV show is exciting, so I forget that I have a pot of potatoes cooking; the pot keeps boiling
until all the water has evaporated and the potatoes burn. The fire alarm in the kitchen goes off, and when I hear it I turn my head in the direction
of the kitchen and notice smoke coming out. With no direct evidence (I am still in the lounge and have not seen what is going on in the kitchen) that this was
caused by the pot of potatoes (my own negligence in failing to go back and turn off the oven after half an hour), I have already drawn that conclusion by
using backward-chaining reasoning. Here is the chain of reasoning: (burnt_pot_of_potatoes <= smoke <= trigger_fire_alarm). As mentioned
above, I am still in the lounge when I hear the alarm and have not seen my pot yet, but I immediately hypothesize that my pot of potatoes is burnt, given
the facts "fire_alarm_went_off" and "smoke_coming_out_of_the_kitchen".
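A backward chainer starts from the hypothesis and works back towards known facts. The sketch below is a minimal propositional version written in plain Java; it is not the JTP or JESS API, the BackwardChainer class is hypothetical, and the two rules are one assumed way of encoding the chain given above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Minimal propositional backward chainer (illustrative only). */
class BackwardChainer {
    private final Set<String> facts = new HashSet<>();
    // conclusion -> alternative premise lists; any one list is sufficient to conclude it
    private final Map<String, List<List<String>>> rules = new HashMap<>();

    void addFact(String fact) {
        facts.add(fact);
    }

    /** Adds the rule "conclusion <= premise1 AND premise2 AND ...". */
    void addRule(String conclusion, String... premises) {
        rules.computeIfAbsent(conclusion, k -> new ArrayList<>()).add(Arrays.asList(premises));
    }

    /** Tries to prove the goal by working backwards from it to the known facts. */
    boolean prove(String goal) {
        return prove(goal, new HashSet<String>());
    }

    private boolean prove(String goal, Set<String> inProgress) {
        if (facts.contains(goal)) {
            return true;
        }
        if (!inProgress.add(goal)) {
            return false; // this goal is already being attempted: avoid circular reasoning
        }
        boolean proved = false;
        for (List<String> premises : rules.getOrDefault(goal, Collections.<List<String>>emptyList())) {
            boolean allHold = true;
            for (String premise : premises) {
                if (!prove(premise, inProgress)) {
                    allHold = false;
                    break;
                }
            }
            if (allHold) {
                proved = true;
                break;
            }
        }
        inProgress.remove(goal);
        return proved;
    }

    public static void main(String[] args) {
        BackwardChainer kb = new BackwardChainer();
        kb.addFact("fire_alarm_went_off");
        // Assumed encoding of: burnt_pot_of_potatoes <= smoke <= fire alarm
        kb.addRule("smoke_coming_out_of_the_kitchen", "fire_alarm_went_off");
        kb.addRule("burnt_pot_of_potatoes", "smoke_coming_out_of_the_kitchen");
        System.out.println(kb.prove("burnt_pot_of_potatoes"));
    }
}
```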
Forward chaining: Suppose I have just walked out through my front door to go to work one morning, looked up at the sky, and noticed that it is cloudy. I immediately turn back
and go into the house to get an umbrella. Given the fact that it is a cloudy morning, I conclude that I need an umbrella because it is going to rain. It has not rained yet, but I reason
in a forward manner: the knowledge base in my brain has a rule that says, "If it is a cloudy morning, then take an umbrella when going to work (rain is expected)". Rules
in JTP and also in JESS are written in a LISP-like syntax (everything is a list).
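Forward chaining runs in the opposite direction: it starts from the known facts and keeps firing rules until nothing new can be derived. The sketch below is plain Java rather than JTP or JESS rule syntax; the ForwardChainer class and the fact names are hypothetical, chosen to mirror the umbrella example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Minimal propositional forward chainer (illustrative only). */
class ForwardChainer {
    private static final class Rule {
        final String conclusion;
        final List<String> premises;

        Rule(String conclusion, List<String> premises) {
            this.conclusion = conclusion;
            this.premises = premises;
        }
    }

    private final Set<String> facts = new LinkedHashSet<>();
    private final List<Rule> rules = new ArrayList<>();

    void addFact(String fact) {
        facts.add(fact);
    }

    /** Adds the rule "IF premise1 AND premise2 AND ... THEN conclusion". */
    void addRule(String conclusion, String... premises) {
        rules.add(new Rule(conclusion, Arrays.asList(premises)));
    }

    /** Fires rules repeatedly until a full pass adds no new fact, then returns all facts. */
    Set<String> run() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                // facts.add returns true only if the conclusion was not already known
                if (facts.containsAll(rule.premises) && facts.add(rule.conclusion)) {
                    changed = true;
                }
            }
        }
        return facts;
    }

    public static void main(String[] args) {
        ForwardChainer kb = new ForwardChainer();
        kb.addFact("cloudy_morning");
        kb.addFact("going_to_work");
        kb.addRule("rain_expected", "cloudy_morning");
        kb.addRule("take_umbrella", "rain_expected", "going_to_work");
        System.out.println(kb.run().contains("take_umbrella"));
    }
}
```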
Outlook for Java in Semantic Web
Resources
Downloads and Links
Books
Journal Publications
About the author