| KAS On-line Demo | Try KAS on-line! |
|---|---|
| Software | KAS is available for download: kas.tar.gz, kas.zip |

The keyword acquisition system (KAS) was developed during Phase I of the Machine Learning for Record Linkage project. KAS processes Web-based resources to discover their keywords and to classify them by domain. For instance, KAS can assign the domain "semiconductors" to the URL "<http://www.semiconductor.org/>".
Users have complete control over the domain names that KAS uses. When a document is assigned to a domain, its keywords are extracted and associated with that domain as well. This allows KAS to suggest domains for newly submitted documents based on how closely each new document's keywords match the keywords already associated with existing KAS domains. For example, a user can input the URL "<http://www.e-design.org.uk/>", and KAS will process it and automatically detect that it is related to the "semiconductors" domain, based on the keywords it shares with the semiconductor.org website.
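
The idea of suggesting a domain from shared keywords can be illustrated with a minimal sketch. This is not KAS's actual algorithm, which this page does not describe: the function name `suggest_domains`, the use of Jaccard similarity as the matching score, and all of the keyword sets below are assumptions made purely for illustration.

```python
def suggest_domains(new_keywords, domain_keywords):
    """Rank known domains by keyword overlap with a new document.

    Assumption: similarity is the Jaccard index (shared keywords divided
    by all distinct keywords). KAS may score matches differently.
    """
    new_set = {k.lower() for k in new_keywords}
    scores = []
    for domain, keywords in domain_keywords.items():
        known = {k.lower() for k in keywords}
        union = new_set | known
        score = len(new_set & known) / len(union) if union else 0.0
        scores.append((domain, score))
    # Highest-scoring domain first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical keyword sets; real KAS keywords are extracted from documents.
domain_keywords = {
    "semiconductors": {"wafer", "chip", "transistor", "fabrication"},
    "gardening": {"soil", "seed", "compost"},
}
new_doc_keywords = ["chip", "transistor", "design", "layout"]

ranking = suggest_domains(new_doc_keywords, domain_keywords)
print(ranking[0][0])  # best-matching domain for the new document
```

In this sketch the new document shares two keywords with the hypothetical "semiconductors" set and none with "gardening", so "semiconductors" ranks first. A set-based score was chosen here only because it is simple; a weighted scheme would fit the same interface.
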
KAS provides an information browser that allows documents, domains, and keyword information to be viewed in a Web browser. Furthermore, users can submit a new URL to KAS and browse documents related to that URL by domain or keywords. The information in KAS's database can be exported to an XML file. This XML file is used by the machine learning component of the Machine Learning for Record Linkage project to discover additional relationships among the processed Web documents.
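
As a rough sketch of what exporting document, domain, and keyword information to XML could look like, the example below uses Python's standard `xml.etree.ElementTree` module. The element and attribute names (`kas`, `document`, `keyword`, `url`, `domain`) and the sample keywords are hypothetical; the actual schema of the KAS export file is not specified on this page.

```python
import xml.etree.ElementTree as ET


def export_to_xml(documents):
    """Serialize document records to an XML string.

    Assumption: the layout (one <document> per URL, with child <keyword>
    elements) is illustrative only and is not the real KAS export format.
    """
    root = ET.Element("kas")
    for doc in documents:
        doc_el = ET.SubElement(
            root, "document", url=doc["url"], domain=doc["domain"]
        )
        for kw in doc["keywords"]:
            ET.SubElement(doc_el, "keyword").text = kw
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; the keywords are made up for the example.
documents = [
    {
        "url": "http://www.semiconductor.org/",
        "domain": "semiconductors",
        "keywords": ["chip", "wafer"],
    }
]
xml_text = export_to_xml(documents)
print(xml_text)
```

A downstream consumer, such as a machine learning component, could then read the file back with `ET.fromstring` or `ET.parse` and iterate over the `document` elements.
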
Send questions concerning the use or installation of KAS to Dr. Kenneth M. Anderson.