Issues of the morphological analysis in comparison with the compound noun extraction analysis for a patent document

by Kyoko Yanagihori, Kazuhiko Tsuda
(University of Tsukuba, University of Tsukuba)

Date Published: 02 Dec 2013
Published In: Information Systems International Conference (ISICO)
Volume: 2013
Publisher: Departemen Sistem Informasi, Institut Teknologi Sepuluh Nopember
Language: id-ID

Keywords: text mining, patent search, compound noun, information retrieval, similarity calculation, morphological analysis

Abstract

Compound nouns occur frequently in the claims of patent applications. We compared compound noun analysis with morphological analysis as a method for retrieving similar patent documents. The study focuses on claims written in the Jepson format, with particular attention to Japanese-language claims. Our analysis indicated that the co-occurrence frequencies of morphemes and of compound nouns in claims differ significantly: compound nouns recur significantly less often than morphemes. Although this property proved useful for high-precision searches, the meaning of compound nouns had to be extended so that a wider range of similar documents could be retrieved. We accomplished this by constructing a preliminary semantic dictionary. An important finding of the analysis was that the position of a compound noun within a claim affects its meaning, and therefore the search results.
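
The contrast between morpheme-level and compound-noun-level overlap can be illustrated with a minimal sketch. It is not the authors' method: the token lists are hand-tagged English stand-ins for the output of a Japanese morphological analyzer, the helper names (`noun_morphemes`, `compound_nouns`, `jaccard`) are hypothetical, and Jaccard overlap is assumed here only as a simple similarity measure, since the abstract does not name the one used in the paper.

```python
from itertools import groupby

# Hypothetical, hand-tagged (surface, part-of-speech) pairs for two claims.
claim_a = [("image", "NOUN"), ("processing", "NOUN"), ("device", "NOUN"),
           ("comprising", "VERB"), ("a", "OTHER"),
           ("memory", "NOUN"), ("unit", "NOUN")]
claim_b = [("image", "NOUN"), ("forming", "NOUN"), ("device", "NOUN"),
           ("comprising", "VERB"), ("a", "OTHER"),
           ("control", "NOUN"), ("unit", "NOUN")]

def noun_morphemes(tokens):
    """Return the set of individual noun morphemes."""
    return {surface for surface, pos in tokens if pos == "NOUN"}

def compound_nouns(tokens):
    """Join each run of consecutive nouns into one compound noun.

    A noun with no noun neighbour is kept as a one-word "compound".
    """
    compounds = set()
    for is_noun, run in groupby(tokens, key=lambda t: t[1] == "NOUN"):
        if is_noun:
            compounds.add(" ".join(surface for surface, _ in run))
    return compounds

def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

morpheme_sim = jaccard(noun_morphemes(claim_a), noun_morphemes(claim_b))
compound_sim = jaccard(compound_nouns(claim_a), compound_nouns(claim_b))

print(round(morpheme_sim, 3))  # 0.429 (shares "image", "device", "unit")
print(compound_sim)            # 0.0 (no compound noun is shared)
```

In this toy case the two claims share three of seven noun morphemes but no compound noun, which mirrors the observation that compound nouns recur less often than morphemes: matching on compounds is stricter, and some form of semantic expansion is needed to recover related documents.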

