Comparative Analysis of String Similarity on Dynamic Query Suggestions (paper)

2018, cited by 255
Web Data Mining and Analysis; Data Management and Algorithms; Advanced Text Analysis Techniques

Details

Publication date
2018-10-01
Publication year
2018

Keywords

Web Data Mining and Analysis; Data Management and Algorithms; Advanced Text Analysis Techniques

Abstract

This research continues previous research. Searching for information through online media is now common practice, and online search has grown to include search suggestions. Suggestions can come either from keywords previously stored in the browser or from reference data in a database. Dynamic query suggestion is a feature widely applied on information search websites. Methods that can be used for query suggestion include MySQL Pattern Matching, MySQL Fulltext Index, Levenshtein Distance, and Jaccard Similarity. To determine which method is better suited to dynamic query suggestion, these methods are compared on five parameters: processing time, proximity of the suggested data to the search, calculation of a proximity rank with results sorted from the nearest data, searching based on the number of words entered, and the amount of data that can be suggested. The methods are implemented in an article website using the PHP programming language, a MySQL database, responsive web design, and AJAX as the data retrieval technique. The comparison indicates that MySQL pattern matching is the fastest method for query suggestion but is less accurate. MySQL Fulltext Index provides more accurate suggestions than the other methods and requires less processing time. Jaccard similarity has fairly good accuracy but requires more processing time. Levenshtein distance is a less suitable method for query suggestion because it is less accurate and also requires considerable time.
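
To make the two string-similarity measures named in the abstract concrete, here is a minimal Python sketch. It is not the paper's implementation, which the abstract describes as PHP with MySQL. The function names, the choice of word-level token sets for Jaccard similarity, and the lowercase normalization are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| over lowercase word sets.
    Word-level tokenization is an assumption of this sketch."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty inputs are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def rank_by_levenshtein(query: str, titles: list[str], limit: int = 5) -> list[str]:
    """Hypothetical helper: nearest titles first (smallest edit distance)."""
    q = query.lower()
    return sorted(titles, key=lambda t: levenshtein(q, t.lower()))[:limit]


def rank_by_jaccard(query: str, titles: list[str], limit: int = 5) -> list[str]:
    """Hypothetical helper: nearest titles first (largest word overlap)."""
    return sorted(titles, key=lambda t: -jaccard(query, t))[:limit]
```

Both ranking helpers score every candidate title on each call, so their cost grows with the number of stored titles. That is one plausible reason in-application similarity scoring could take longer than an indexed database lookup, though the abstract does not state the cause of the timing differences it reports.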