FIELD OF THE INVENTION
The invention relates to a system and method for facilitating user interaction with multi-relational ontologies.
BACKGROUND OF THE INVENTION
Knowledge within a given domain may be represented in many ways. One form of knowledge representation may comprise a list representing all available values for a given subject. For example, knowledge in the area of “human body tissue types” may be represented by a list including “hepatic tissue,” “muscle tissue,” “epithelial tissue,” and many others. To represent the total knowledge in a given domain, a number of lists may be needed. For instance, one list may be needed for each subject contained in a domain. Lists may be useful for some applications; however, they generally lack the ability to define relationships between the terms comprising the lists. Moreover, the further division and subdivision of subjects in a given domain typically results in the generation of additional lists, which often include repeated terms, and which do not provide comprehensive representation of concepts as a whole.
Some lists, such as structured lists, may enable computer-implemented keyword searching. The shallow information store often contained in list-formatted knowledge, however, may lead to searches that return incomplete representations of a concept in a given domain.
An additional method of representing knowledge is through thesauri. Thesauri are similar to lists, but they further include synonyms provided alongside each list entry. Synonyms may be useful for improving the recall of a search by returning results for related terms not specifically provided in a query. Thesauri still fail, however, to provide information regarding relationships between terms in a given domain.
Taxonomies build on thesauri by adding an additional level of relationships to a collection of terms. For example, taxonomies provide parent-child relationships between terms. “Anorexia is-a eating disorder” is an example of a parent-child relationship via the “is-a” relationship form. Other parent-child relationship forms, such as “is-a-part-of” or “contains,” may be used in a taxonomy. The parent-child relationships of taxonomies may be useful for improving the precision of a search by removing false positive search results. Unfortunately, exploring only hierarchical parent-child relationships may limit the type and depth of information that may be conveyed using a taxonomy. Accordingly, the use of lists, thesauri, and taxonomies presents drawbacks for those attempting to explore and utilize knowledge organized in these traditional formats.
Additional drawbacks may be encountered when searches of electronic data sources are conducted. As an example, searches of electronic data sources typically return a voluminous number of results, many of which tend to be only marginally relevant to the specific problem or subject being investigated. Researchers or other individuals are then often forced to spend valuable time sorting through a multitude of search results to find the most relevant results. It is estimated, for example, that scientists spend 20% of their time searching for information existing in a particular area. This is time that highly trained investigative researchers must spend simply uncovering background knowledge. Furthermore, when an electronic search is conducted, data sources containing highly relevant information may not be returned to a researcher because the concept sought by the researcher is identified by a different set of terms in the relevant data source. This may lead to an incomplete representation of the knowledge in a given subject area. These and other drawbacks exist.
SUMMARY OF THE INVENTION
The invention addresses these and other drawbacks. According to one embodiment, the invention relates to a system and method for facilitating user interaction with multi-relational ontologies. According to one aspect of the invention, one or more multi-relational ontologies may comprise domain-specific ontologies that may be used individually or collectively, in whole or in part, based on user preferences, user access rights, or other criteria.
As used herein, a domain may include a subject matter topic such as, for example, a disease, an organism, a drug, or other topic. A domain may also include one or more entities such as, for example, a person or group of people, a corporation, a governmental entity, or other entities. A domain involving an organization may focus on the organization's activities. For example, a pharmaceutical company may produce numerous drugs or focus on treating numerous diseases. An ontology built on the domain of that pharmaceutical company may include information on the company's drugs, their target diseases, or both. A domain may also include an entire industry such as, for example, automobile production, pharmaceuticals, legal services, or other industries. Other types of domains may be used.
As used herein, an ontology may include a collection of assertions. An assertion may include a pair of concepts that have some specified relationship. One aspect of the invention relates to the creation of a multi-relational ontology. A multi-relational ontology is an ontology containing pairs of related concepts. For each pair of related concepts there is a broad set of descriptive relationships connecting them. As each concept within each pair may also be paired (and thus related by multiple descriptive relationships) with other concepts within the ontology, a complex set of logical connections is formed. These complex connections provide a comprehensive “knowledge network” of what is known directly and indirectly about concepts within a single domain. The knowledge network may also be used to represent knowledge between and among multiple domains. This knowledge network enables discovery of complex relationships between the different concepts or concept types in the ontology. The knowledge network enables, inter alia, queries involving both direct and indirect relationships between multiple concepts such as, for example, “show me all genes expressed-in liver tissue that-are-associated-with diabetes.”
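By way of non-limiting illustration, the assertion model and the example query above may be sketched in Python as follows. The `Assertion` type, the sample data, the relationship label `associated-with`, and the function names are assumptions made for this sketch only and are not part of the described system.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """A pair of concepts joined by a named relationship (hypothetical layout)."""
    subject: str
    relationship: str
    obj: str


# Hypothetical sample data, invented for illustration.
ONTOLOGY = [
    Assertion("gene A", "expressed-in", "liver tissue"),
    Assertion("gene A", "associated-with", "diabetes"),
    Assertion("gene B", "expressed-in", "liver tissue"),
    Assertion("gene C", "associated-with", "diabetes"),
    Assertion("gene C", "expressed-in", "muscle tissue"),
]


def subjects_of(assertions, relationship, obj):
    """Return every subject that has the given relationship to the given concept."""
    return {a.subject for a in assertions
            if a.relationship == relationship and a.obj == obj}


def expressed_in_and_associated_with(assertions, tissue, disease):
    """Combine two relationships to answer an indirect question about a concept."""
    expressed = subjects_of(assertions, "expressed-in", tissue)
    associated = subjects_of(assertions, "associated-with", disease)
    return sorted(expressed & associated)


result = expressed_in_and_associated_with(ONTOLOGY, "liver tissue", "diabetes")
print(result)
```

In this sketch the query is answered by intersecting two sets of subjects, one per relationship; only a concept that participates in both assertions is returned.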
According to an embodiment of the invention, one or more users may view one or more ontologies and perform other knowledge discovery processes via a graphical user interface (GUI) as enabled by a user interface module. An export manager may enable the selective export of one or more ontologies or portions of one or more ontologies. Also, the system may enable an entity to provide various ontology services, including exporting parts of one or more ontologies, the creation of custom ontologies, knowledge capture services, ontology alert services, merging of independent taxonomies, optimization of queries, and other services.
According to another aspect of the invention, a user interface module may enable a novel graphical user interface. The graphical user interface may enable a user to interact with one or more ontologies. In one embodiment, a graphical user interface may include a search pane. Within the search pane, a user may input a concept of interest, term of interest, or relevant string of characters. The system may search one or more ontologies for the concept of interest, term of interest, or the relevant string (including identifying and searching synonyms of concepts in the ontologies). The graphical user interface may then display the results of the search, including the names of the concepts returned by the search, their concept types, their synonyms, or other information. The user may then select a concept from the displayed results and utilize the functionality described below. In some embodiments, a user may select a concept (or other element of an ontology) using a mouse, a cursor, a pointer, or other selection method known in the art.
In one embodiment, the system may enable a user to add a relationship to a concept or term of interest when conducting a search of one or more ontologies. For example, a user may desire to search for concepts within one or more ontologies that “cause rhabdomyolysis.” Instead of searching for “rhabdomyolysis” alone, the relationship “causes” may be included in the search and the search results may be altered accordingly. In another embodiment, the system may enable a search using properties. In this embodiment, a user may search for all concepts or assertions with certain properties such as, for example, a certain data source, a certain molecular weight, or other property.
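As a hedged illustration, a search that matches concept names and synonyms, and that may optionally be narrowed by a relationship such as “causes,” might be sketched as follows. The data, the dictionary layout, and the names `find_concepts` and `search` are hypothetical and are not taken from the described system.

```python
# Hypothetical concepts with their types and synonyms.
CONCEPTS = {
    "rhabdomyolysis": {"type": "disease", "synonyms": ["muscle breakdown"]},
    "compound X": {"type": "compound", "synonyms": []},
    "compound Z": {"type": "compound", "synonyms": []},
}

# Hypothetical assertions as (subject, relationship, object) triples.
ASSERTIONS = [
    ("compound X", "causes", "rhabdomyolysis"),
    ("compound Z", "treats", "rhabdomyolysis"),
]


def find_concepts(term):
    """Match the term against concept names and their synonyms (case-insensitive)."""
    needle = term.lower()
    return sorted(
        name for name, info in CONCEPTS.items()
        if needle in name.lower()
        or any(needle in synonym.lower() for synonym in info["synonyms"])
    )


def search(term, relationship=None):
    """Return matching concepts, or the concepts that hold `relationship` to them."""
    matches = find_concepts(term)
    if relationship is None:
        return matches
    targets = set(matches)
    return sorted({subject for subject, rel, obj in ASSERTIONS
                   if rel == relationship and obj in targets})


print(search("muscle breakdown"))          # synonym lookup
print(search("rhabdomyolysis", "causes"))  # search narrowed by a relationship
```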
In one embodiment, the graphical user interface may include a hierarchical pane. A hierarchical pane may display a hierarchy of concept types as defined by the upper ontology. Within this hierarchy, specific instances of concept types contained within the ontology may be displayed along with certain relationships existing between these instances and their concept types. In one embodiment, the relationships that exist may include “instance,” “part-of,” or other relationships. Certain concepts may be instances (or parts) of concept types and may have additional concepts organized underneath them. In one embodiment, a user may select a concept from the hierarchical pane, and view all of the descendants of that concept. The descendants may be displayed with their accompanying assertions as a list, or in a merged graph, similar to those described in detail below.
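For illustration only, collecting all descendants of a selected concept might be sketched as a breadth-first traversal over hierarchical edges. The edge list, the relationship labels assigned to each edge, and the function name `descendants` are assumptions of this sketch.

```python
from collections import defaultdict, deque

# Hypothetical hierarchical edges as (child, relationship, parent).
EDGES = [
    ("eating disorder", "instance", "disease"),
    ("anorexia", "instance", "eating disorder"),
    ("bulimia", "instance", "eating disorder"),
    ("hepatic tissue", "instance", "tissue"),
    ("hepatocyte", "part-of", "hepatic tissue"),
]


def descendants(concept):
    """Return (child, relationship, parent) for every descendant, nearest first."""
    children = defaultdict(list)
    for child, relationship, parent in EDGES:
        children[parent].append((child, relationship))

    found = []
    seen = {concept}          # guards against revisiting a concept with several parents
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child, relationship in children[current]:
            if child not in seen:
                seen.add(child)
                found.append((child, relationship, current))
                queue.append(child)
    return found


for row in descendants("disease"):
    print(row)
```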
In one embodiment, the graphical user interface according to the invention may include a relationship pane. The relationship pane may display the relationships that are present in the hierarchical pane for a selected concept. For instance, the relationship pane may display the relationship between a selected concept and its parents. Because of the interconnectedness of an ontology, a given concept may have multiple hierarchical parents. Additionally, the relationship pane may display relationships up one or more levels in the hierarchy, down one or more levels in the hierarchy, or laterally in the hierarchy (e.g., synonyms).
In one embodiment, the graphical user interface according to the invention may include a multi-relational display pane. The multi-relational display pane may display multi-relational information regarding a selected concept. For example, the multi-relational display pane may display descriptive relationships or all known relationships of the selected concept from within one or more ontologies. The multi-relational display pane may enable display of these relationships in one or more forms.
In one embodiment, the multi-relational display pane may display concepts and relationships in graphical form. One form of graphical display that may be used includes a clustered cone graph. A clustered cone graph may display a selected concept as a central node, surrounded by sets of connected nodes, the sets of connected nodes being concepts connected by relationships. In one embodiment, the sets of connected nodes may be clustered or grouped by common characteristics. These common characteristics may include one or more of concept type, data source, relationship to the central node, associated property, or other common characteristic.
In one embodiment, connected nodes in a clustered cone graph may also have relationships with one another, which may be represented by edges connecting the connected nodes. Additionally, edges and nodes within a clustered cone graph may be varied in appearance to convey specific characteristics of relationships or concepts (e.g., thicker edges for high assertion confidence weights, etc.). The textual information underlying a node or edge in a clustered cone graph may be displayed to a user upon user selection of a node or edge. Furthermore, a connected node may be selected by a user and placed as the central node in the graph. Accordingly, all concepts directly related to the new central node may be arranged in clustered sets around the new central node.
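As a minimal sketch, the data behind a clustered cone graph might be assembled by collecting the concepts directly related to a central node and grouping them by a common characteristic. Only the grouping is shown; rendering is omitted. The concept types, assertions, and the function name `clusters` are hypothetical.

```python
from collections import defaultdict

# Hypothetical concept types and assertions, invented for illustration.
CONCEPT_TYPES = {
    "compound X": "compound",
    "disease Y": "disease",
    "gene A": "gene",
    "gene B": "gene",
    "liver tissue": "tissue",
}
ASSERTIONS = [
    ("compound X", "causes", "disease Y"),
    ("compound X", "binds", "gene A"),
    ("compound X", "regulates", "gene B"),
    ("gene A", "expressed-in", "liver tissue"),
]


def clusters(central, by="concept_type"):
    """Group the concepts directly related to `central` by type or by relationship."""
    groups = defaultdict(list)
    for subject, relationship, obj in ASSERTIONS:
        if subject == central:
            neighbor = obj
        elif obj == central:
            neighbor = subject
        else:
            continue
        key = CONCEPT_TYPES[neighbor] if by == "concept_type" else relationship
        groups[key].append(neighbor)
    return {key: sorted(members) for key, members in groups.items()}


print(clusters("compound X"))
print(clusters("gene A"))  # re-centering the graph on a connected node
```

Selecting a connected node as the new central node corresponds, in this sketch, to calling `clusters` again with that node's name.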
In one embodiment, more than one concept may be selected and placed as a merged central node (merged graph). Accordingly, all of the concepts directly related to at least one of the two or more concepts in the merged central node may be arranged in clustered sets around the merged central node. If concepts in the clustered sets have relationships to all of the merged central concepts, this quality may be indicated by varying the appearance of these connected nodes or their connecting edges (e.g., displaying them in a different color, etc.). In one embodiment, two or more nodes (concepts) sharing the same relationship (e.g., “causes”) may be selected and merged into a single central node. Thus the nodes connected to the merged central node may show the context surrounding concepts that share the selected relationship.
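One possible sketch of a merged central node follows: the connected nodes are the union of the concepts related to any merged concept, and a flag marks those related to all of them (the nodes whose appearance might be varied). The data and the names `neighbors` and `merged_graph` are assumptions of this sketch.

```python
# Hypothetical assertions as (subject, relationship, object) triples.
ASSERTIONS = [
    ("compound X", "causes", "disease Y"),
    ("compound W", "causes", "disease Y"),
    ("compound X", "binds", "gene A"),
    ("compound W", "binds", "gene B"),
]


def neighbors(concept):
    """Return every concept directly related to `concept`, in either direction."""
    related = set()
    for subject, _relationship, obj in ASSERTIONS:
        if subject == concept:
            related.add(obj)
        elif obj == concept:
            related.add(subject)
    return related


def merged_graph(central_concepts):
    """Map each connected node to True if it is related to ALL merged concepts."""
    if not central_concepts:
        return {}
    per_concept = [neighbors(c) for c in central_concepts]
    connected = set().union(*per_concept) - set(central_concepts)
    shared = set.intersection(*per_concept)
    return {node: node in shared for node in sorted(connected)}


print(merged_graph(["compound X", "compound W"]))
```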
In one embodiment, more than one concept may be aggregated into a single connected node. That is, a node connected to a central node may represent more than one concept. For example, a central node in a clustered cone graph may be a concept “compound X.” Compound X may cause “disease Y” in many different species of animals. As such, the central node of the clustered cone graph may have numerous connected nodes, each representing disease Y as it occurs in each species. If a user is not in need of immediately investigating possible differences that disease Y may have in each separate species, each of these connected nodes may be aggregated into a single connected node. The single aggregated connected node may then simply represent the fact that “compound X” causes “disease Y” in a number of species. This may simplify display of the graph, while conveying all relevant information.
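The aggregation described above might be sketched, under stated assumptions, as collapsing per-species nodes into one node per disease. The node layout with `concept` and `species` fields, the species names, and the function name `aggregate` are hypothetical.

```python
# Hypothetical connected nodes of a central node "compound X".
CONNECTED_NODES = [
    {"concept": "disease Y", "species": "mouse"},
    {"concept": "disease Y", "species": "rat"},
    {"concept": "disease Y", "species": "dog"},
    {"concept": "disease Q", "species": "rat"},
]


def aggregate(nodes):
    """Collapse nodes that share a concept into one node listing its species."""
    species_by_concept = {}
    for node in nodes:
        species_by_concept.setdefault(node["concept"], []).append(node["species"])
    return [
        {"concept": concept, "species": sorted(species), "count": len(species)}
        for concept, species in species_by_concept.items()
    ]


for node in aggregate(CONNECTED_NODES):
    print(node)
```

The per-species detail is retained inside each aggregated node, so it could be expanded again on request.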
In one embodiment, each of the sets of clustered nodes of a clustered cone graph may be faceted. Faceting may include grouping concepts within a clustered set by common characteristics. These common characteristics may include one or more of data source, concept type, common relationship, properties, or other characteristic. Faceting display within a set of connected nodes may take the form of a graph, a list, display of different colors, or other form. A user may sort through, and selectively apply, different types of faceting for each of the sets of connected nodes in a clustered cone graph. Furthermore, a user may switch faceting on or off for each of the sets of connected nodes within a clustered cone graph.
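By way of example only, faceting a single set of connected nodes by a selectable characteristic, with the option of switching faceting off, might be sketched as follows. The field names `data_source` and `relationship`, the sample nodes, and the function name `facet` are assumptions of this sketch.

```python
from collections import defaultdict

# Hypothetical set of connected nodes from one cluster.
CLUSTER = [
    {"concept": "gene A", "data_source": "source 1", "relationship": "binds"},
    {"concept": "gene B", "data_source": "source 2", "relationship": "binds"},
    {"concept": "gene C", "data_source": "source 1", "relationship": "regulates"},
]


def facet(cluster, by=None):
    """Group a cluster by the chosen characteristic; `by=None` switches faceting off."""
    if by is None:
        return {"all": [node["concept"] for node in cluster]}
    groups = defaultdict(list)
    for node in cluster:
        groups[node[by]].append(node["concept"])
    return dict(groups)


print(facet(CLUSTER, by="data_source"))
print(facet(CLUSTER, by="relationship"))
print(facet(CLUSTER))  # faceting switched off
```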
Additionally, faceting may also apply to a taxonomy view of ontology data. For example, a user may wish to reconstruct the organization of data represented in a taxonomy view such as, for example, chemical compound data. The user may reconstruct this taxonomic organization using therapeutic class, pharmacological class, molecular weight, or by other category or characteristic of the data. Other characteristics may be used to reconstruct organizations of other data.
In one embodiment, the multi-relational display pane of the graphical user interface may display information regarding a selected concept in list form (as opposed to the graphical form described above). That information may include all relationships for the selected concept, the label of each related concept, the type of each related concept, evidence information for each assertion of the related concepts, or other information. Evidence information for an assertion may include the number of pieces of evidence underlying the assertion or other information. Additionally, a user may select one or more of the assertions of the selected concept and aggregate all the related concepts of the selected assertions as selected concepts in the multi-relational display pane (either list view or graphical view [i.e., merged graph]).
In one embodiment, the multi-relational display pane may enable the display of confidence weights for assertions in one or more ontologies. Confidence weights may include a measure of the strength of evidence underlying an assertion. The multi-relational display pane may also enable application of filters to displayed data from one or more ontologies. Filters may selectively display data from one or more ontologies based on user preferences, user access rights, or other criteria. Furthermore, the multi-relational display pane and the hierarchical display pane may be linked, such that one or more concepts selected from one may become selected concepts in the other.
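A hedged sketch of filtering displayed assertions follows. It assumes, purely for illustration, that each assertion carries a stored confidence weight between 0 and 1 and a data source label; how the weight is derived is not specified here. The field names and the function name `apply_filters` are hypothetical.

```python
# Hypothetical assertions with stored confidence weights (assumed range 0 to 1).
ASSERTIONS = [
    {"subject": "compound X", "relationship": "causes", "object": "disease Y",
     "confidence": 0.9, "data_source": "public source"},
    {"subject": "compound X", "relationship": "binds", "object": "gene A",
     "confidence": 0.4, "data_source": "public source"},
    {"subject": "compound X", "relationship": "treats", "object": "disease Q",
     "confidence": 0.8, "data_source": "proprietary source"},
]


def apply_filters(assertions, min_confidence=0.0, allowed_sources=None):
    """Keep assertions meeting a confidence threshold and a data source allow-list.

    `allowed_sources=None` means no restriction by data source.
    """
    kept = []
    for assertion in assertions:
        if assertion["confidence"] < min_confidence:
            continue
        if allowed_sources is not None and assertion["data_source"] not in allowed_sources:
            continue
        kept.append(assertion)
    return kept


visible = apply_filters(ASSERTIONS, min_confidence=0.5,
                        allowed_sources={"public source"})
print([a["object"] for a in visible])
```

Here the threshold stands in for a user preference and the allow-list stands in for user access rights; other criteria could be added in the same way.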
In one embodiment, the graphical user interface of the invention may include an evidence pane. The evidence pane may display information regarding each piece of evidence for a selected assertion. The information displayed may include one or more of the data source of a piece of evidence, its version, information identifying the record or document that contains the evidence, or other information. In one embodiment, the evidence pane may include a document viewer that enables display of actual evidence-laden documents to a user. A user may also link to the data source containing the document via the evidence pane. In some embodiments, a user's access control rights may dictate the user's ability to view or link to evidence underlying a concept. For instance, a user with minimal rights may be presented with a description of the data source for a piece of evidence, but may not be able to view or access the document containing that evidence.
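For illustration only, the effect of access control rights on the evidence pane might be sketched as follows: a user with minimal rights receives the descriptive fields for each piece of evidence but not the underlying document. The rights levels, field names, sample record, and the function name `evidence_view` are assumptions of this sketch.

```python
# Hypothetical evidence records for one selected assertion.
EVIDENCE = [
    {"data_source": "journal database", "version": "v1",
     "record_id": "record 123", "document": "Full text of the document."},
]

# Hypothetical rights levels.
RIGHTS = {
    "minimal": {"view_document": False},
    "full": {"view_document": True},
}


def evidence_view(evidence, rights_level):
    """Return the evidence rows a user may see, withholding documents if not permitted."""
    can_view_document = RIGHTS[rights_level]["view_document"]
    rows = []
    for item in evidence:
        row = {
            "data_source": item["data_source"],
            "version": item["version"],
            "record_id": item["record_id"],
        }
        if can_view_document:
            row["document"] = item["document"]
        rows.append(row)
    return rows


print(evidence_view(EVIDENCE, "minimal"))
print(evidence_view(EVIDENCE, "full"))
```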
In one embodiment, the graphical user interface may include a details pane. The details pane may show one or more of properties, synonyms, evidence (concept evidence, not assertion evidence), or other information underlying a selected concept.
These and other objects, features, and advantages of the invention will be apparent through the detailed description of the preferred embodiments and the drawings attached hereto. It is also to be understood that both the foregoing general description and the following detailed description are exemplary and not restrictive of the scope of the invention.