Concept Maps

From AI KR (Artificial Intelligence Knowledge Representation) Community Group

We should try to provide some visual representation for our vocabulary.

Initial Concepts Overview

KR as a Human/Machine Interchange Format

Mysearch Summary of Intelligence

See the two diagrams on the Mysearch Website.

Alexa Meaning Representation Language (AMRL)

See this article on a proposed enhancement to AMRL to enable more sophisticated interactions.

Knowledge Representation

What is knowledge representation (KR)?

AIKR map -- draft 11/2018. See references.

Davis et al. (1993): a knowledge representation can best be understood in terms of five important and distinctly different roles: (I) Surrogate; (II) Set of ontological commitments; (III) Fragmentary theory of intelligent reasoning; (IV) Medium for pragmatically efficient computation; (V) Medium of human expression.

Human expression: “If the representation makes things possible but not easy, then as real users we may never know whether we have misunderstood the representation and just don’t know how to use it, or it truly cannot express some things we would like to say.” from Davis, R., Shrobe, H., & Szolovits, P. (1993). What is a knowledge representation? AI Magazine, 14(1), 17. Available here

“common framework that allows data to be shared and reused across application, enterprise, and community boundaries.” from Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 28-37. Available here

“early systems reasoned by performing some form of logical inference on (somewhat) human readable symbols”

“new techniques [...] models of the world in their own internal representations […] support vector machines, random forests, probabilistic graphical models, reinforcement learning, and deep learning neural networks.” from Explainable Artificial Intelligence (XAI). The Defense Advanced Research Projects Agency. August 10, 2016. Available here
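The symbolic style of reasoning mentioned in the first quotation above can be illustrated with a minimal forward-chaining sketch. The facts and rules below are hypothetical examples, not taken from the cited report, and real systems of that era used far richer logics.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule when all of its premises are already known.
            if conclusion not in derived and set(premises) <= derived:
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical, human-readable symbols and rules (illustration only).
rules = [
    (("bird", "healthy"), "can_fly"),
    (("can_fly",), "can_travel"),
]
facts = {"bird", "healthy"}

result = forward_chain(facts, rules)
print(sorted(result))
```

Every derived symbol can be traced back to the rule that produced it, which is the property the newer, internally represented models listed in the second quotation do not offer directly.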

“patterns seem to follow linguistic or old colonial lines” from Beauchesne, O.H., Map of scientific collaboration. Available here and here

Data: Structured, Semi-Structured, Unstructured

Structured content is usually held in a DBMS, typically a relational database, from Chen, P. P. S. (1976). Unstructured content is mostly Web-based, from Berners-Lee, T., Hendler, J., & Lassila, O. (2001).

Structured data includes a model of the data, a specification of how the data is stored, and the data types, e.g. (alpha)numeric. Sources of structured data include:

  • U.S. Government’s open data, available here, and the United States Census Bureau, available here
  • the DBpedia database of information structured from Wikipedia, available here
  • World Bank Open Data, available here
  • Google Trends, Correlate, and Public Data, available here, here, and here
  • the Freebase community-curated database of well-known people, places and things; the last data dump is available here
  • Dakota State University (DSU) data sources for Marketing research, available here
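As a rough illustration of the distinction drawn in this section, the sketch below holds the same fact in three forms: structured (a typed table with an explicit model), semi-structured (self-describing JSON), and unstructured (free text). The country name, the figure, and the regular expression are invented for illustration and do not come from any of the sources listed above.

```python
import json
import re
import sqlite3

# Structured: an explicit model (table schema) with declared data types.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT PRIMARY KEY, population INTEGER)")
conn.execute("INSERT INTO country VALUES (?, ?)", ("Exampleland", 1200000))
structured = conn.execute(
    "SELECT population FROM country WHERE name = ?", ("Exampleland",)
).fetchone()[0]

# Semi-structured: keys describe the data, but no schema is enforced up front.
semi_structured = json.loads('{"name": "Exampleland", "population": 1200000}')

# Unstructured: free text; the value has to be extracted heuristically.
unstructured = "Exampleland is home to about 1200000 people."
match = re.search(r"(\d+) people", unstructured)
extracted = int(match.group(1)) if match else None

print(structured, semi_structured["population"], extracted)
```

The structured form can be queried directly because the model is known in advance, whereas the unstructured form depends on a pattern that may fail as soon as the wording changes.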

Sources of unstructured data by business application, from Kuechler (2007):

  • Business Intelligence: Web, industry blogs, online databases;
  • Customer Relationship Management: customer feedback, helpdesk reports;
  • Regulatory compliance: all internally generated electronic documents;
  • Intellectual property management: Web, copyright and patent databases;
  • Call support (help desk applications): call documentation, customer feedback, email, online manuals;
  • Accounts payable/receivable analysis: invoices, customer and vendor correspondence (used frequently with traditional structured data mining and analysis);
  • Legal department support: legal databases, specific streams of organizational communications (such as customer communication, internal email).

Multiple structured and unstructured data sources can be combined in mashups:

  • Trendsmap: Google Maps + Twitter mashup
  • Poligraft: “an enhanced view of the people, organizations and relationships described within political news stories, blog posts and press releases.”

See also the NYU Web Remix course on mashups, most of it freely available here, and Duane Merrill's introduction to mashups, available here.
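A mashup of the kind listed above can be sketched as a join between a structured source and an unstructured one. The place coordinates and messages below are hypothetical stand-ins for a map service and a message stream; no real API is called.

```python
from collections import Counter

# Hypothetical structured source: place name -> (latitude, longitude).
places = {
    "Amsterdam": (52.37, 4.90),
    "Boston": (42.36, -71.06),
}

# Hypothetical unstructured source: short free-text messages.
messages = [
    "Rain again in Amsterdam today",
    "Boston marathon was great",
    "Cycling through Amsterdam",
]


def mashup(places, messages):
    """Count messages mentioning each known place and attach its coordinates."""
    counts = Counter(
        name
        for text in messages
        for name in places
        if name.lower() in text.lower()
    )
    return [
        {"place": name, "coords": places[name], "mentions": counts[name]}
        for name in places
    ]


result = mashup(places, messages)
for row in result:
    print(row)
```

Matching by substring is a deliberate simplification; a realistic mashup would need entity recognition and disambiguation to link text to places reliably.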

“The three disjuncts -- structured, semi-structured, unstructured -- do not cover the full content [...]. That which is unstructured, not seizable within knowledge representation formalisms, devoid of reference such as time or space, confined to the full realization of language, not submersible through translation, rather abiding the latter, remains extraneous to these formalisms.” from Milea, V. (2016).

Applications

  • Market intelligence: recommender systems, social and virtual games;
  • Science and technology: innovation, hypothesis testing, knowledge discovery;
  • Smart health and wellbeing: human and plant genomics, healthcare decision support, patient community analysis;
  • Security and public safety: crime analysis, computational criminology, terrorism informatics, open-source intelligence;
  • Open access for citizens to public information: public announcements, (Supreme) Court decisions, enactment and revision of legislation (see, for example, the Supreme Court Database).

“...data is now the fourth factor of production, as essential as land, labor, and capital.” From The Deciding Factor: Big Data & Decision Making, the Economist Intelligence Unit. Available here.

“We don’t have better algorithms. We just have more data.” – Peter Norvig from McAfee, A. & Brynjolfsson, E. Big Data: The Management Revolution. Harvard Business Review, October 2012. Available here

References

  1. The Deciding Factor: Big Data & Decision Making, by the Economist Intelligence Unit. Available here
  2. McAfee, A. & Brynjolfsson, E. Big Data: The Management Revolution. Harvard Business Review, October 2012.
  3. Chen, H., Chiang, R.H.L. & Storey, V.C. (2012). Business Intelligence and Analytics: From Big Data to Big Impact. MIS Quarterly, vol. 36, no. 4, 1165-1188.
  4. JSON description and resources @ json.org Available here
  5. XML page at the W3C Available here
  6. Davis, R., Shrobe, H., & Szolovits, P. (1993). What is a knowledge representation? AI Magazine, 14(1), 17. Available here
  7. Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 28-37. See the W3C Semantic Web page here
  8. RDF Primer by W3C. Available here
  9. OWL2 Primer by W3C. Available here
  10. Kuechler, W. L. (2007). Business applications of unstructured text. Communications of the ACM, 50(10), 86-93.
  11. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008. Available here
  12. Navigli, R. (2009). Word sense disambiguation: A survey. ACM Computing Surveys (CSUR), 41(2), 10.
  13. Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and trends in information retrieval, 2(1-2), 1-135. Available here
  14. Mohammad, S. M., Kiritchenko, S., & Zhu, X. NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets. Available here
  15. Liu, B. (2010). Sentiment analysis and subjectivity. Handbook of natural language processing. Available here
  16. Esuli, A., & Sebastiani, F. (2006, May). Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of LREC (Vol. 6, pp. 417-422). Available here
  17. Baccianella, S., Esuli, A., & Sebastiani, F. (2010, May). SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. In LREC (Vol. 10, pp. 2200-2204) Available here
  18. Mann, W. C., & Thompson, S. A. (1988). Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3), 243-281. Available here
  19. Heerschop, B., Goossen, F., Hogenboom, A., Frasincar, F., Kaymak, U., & de Jong, F. (2011). Polarity analysis of texts using discourse structure. In Proceedings of the 20th ACM international conference on Information and Knowledge Management (pp. 1061-1070), ACM. Available here
  20. Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113. Available here
  21. Lin, J., & Dyer, C. (2010). Data-intensive text processing with MapReduce. Morgan & Claypool Publishers. Required reading: Chapter 2. Available here
  22. Steele, J., & Iliinsky, N. (2010). Beautiful visualization. O'Reilly Media, Inc. Required reading: Chapters 1, 7, 10, and 11.
  23. Borgatti, S. P., Mehra, A., Brass, D. J., & Labianca, G. (2009). Network analysis in the social sciences. Science, 323(5916), 892-895.
  24. Holten, D. (2006). Hierarchical edge bundles: Visualization of adjacency relations in hierarchical data. IEEE Transactions on Visualization and Computer Graphics, 12(5), 741-748. Available through the ACM Digital Library here
  25. Kuechler, W. L. (2007). Business applications of unstructured text. Communications of the ACM, 50(10), 86-93. Available here
  26. Milea, V. Semi-Automated Legal Decision Support. October 2, 2016. Available here
  27. Chen, P. P. S. (1976). The entity-relationship model—toward a unified view of data. ACM Transactions on Database Systems (TODS), 1(1), 9-36. Available here

Add your proposed drafts, links and images here