Big Data-2021

Prof. P Krishna Reddy and his students published the following papers at the IEEE International Conference on Big Data (Big Data), held virtually from 15 to 18 December:

Discovering Top-k Spatial High Utility Itemsets in Very Large Quantitative Spatiotemporal Databases – P Pallikila; P Veena, SBRT College Ananthapur; R U Kiran, The University of Aizu; Ram Avatar, The University of Hokkaido, Japan; Sadanori Ito, NICT, Tokyo; Koji Zettsu, NICT, Tokyo; and Prof. P K Reddy.

Research work as explained by the authors:

Spatial High Utility Itemset Mining (SHUIM) is an important knowledge discovery technique with many real-world applications. It involves discovering all itemsets that satisfy the user-specified minimum utility (minUtil) in a quantitative spatiotemporal database. The popular adoption and successful industrial application of this technique have been hindered by the following two limitations: (i) Since the rationale of SHUIM is to find all itemsets that satisfy the minUtil constraint, it often produces too many patterns, most of which may be redundant or uninteresting to the user. (ii) Specifying the right minUtil value is an open research problem in SHUIM. This paper tackles these two problems by proposing a novel model of top-k spatial high utility itemsets that may exist in a database. A new constraint, called dynamic minimum utility (dMinUtil), is explored to reduce the search space effectively. This constraint is based on greedy search, where we raise its value through five threshold-raising strategies. The paper also presents an efficient single scan algorithm that employs depth-first search to find all top-k spatial high utility itemsets. Experimental results demonstrate that our algorithm is memory and runtime efficient. We also demonstrate the usefulness of our algorithm with two real-world case studies.
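
The idea of raising a dynamic threshold while collecting the k best itemsets can be illustrated with a small sketch. This is not the authors' algorithm: it enumerates itemsets by brute force rather than with a single scan and depth-first search, it ignores the spatial constraint, and it uses only one simple raising rule (the threshold becomes the k-th best utility seen so far) instead of the five strategies mentioned above. The toy database, the function names and the variable d_min_util are assumptions made for illustration.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its
# utility in that transaction (for example, quantity x unit profit).
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 5, "c": 1},
    {"b": 2, "c": 1, "d": 8},
    {"a": 5, "b": 2, "d": 8},
]


def utility(itemset, transactions):
    """Sum the utility of `itemset` over all transactions that contain it."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total


def top_k_itemsets(transactions, k):
    """Return the k itemsets with the highest utility and the final threshold."""
    items = sorted({item for t in transactions for item in t})
    heap = []        # min-heap of (utility, itemset); root is the k-th best
    d_min_util = 0   # dynamic threshold, raised once k candidates are held
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > d_min_util:
                heapq.heapreplace(heap, (u, itemset))
            if len(heap) == k:
                d_min_util = heap[0][0]  # raise threshold to k-th best utility
    return sorted(heap, reverse=True), d_min_util


if __name__ == "__main__":
    best, threshold = top_k_itemsets(TRANSACTIONS, k=2)
    print(best, threshold)
```

The user supplies only k; no minUtil value has to be guessed. In a practical miner the rising threshold would additionally be used with a utility upper bound to skip whole branches of the search, which this sketch does not attempt.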

Discovering Relative High Utility Itemsets in Very Large Transactional Databases Using Null-Invariant Measure – R U Kiran, The University of Aizu; P Pallikila; José María Luna, University of Cordoba, Spain; P Fournier-Viger, Shenzhen University; M Toyoda, The University of Tokyo; and Prof. P K Reddy.

Research work as explained by the authors:

High utility itemset mining is an important model in data mining. It involves discovering all itemsets in a quantitative transactional database that satisfy a user-specified minimum utility (minUtil) constraint. The minUtil constraint controls the minimum value that an itemset must maintain in a database. Since the model evaluates an itemset’s interestingness using only the minUtil constraint, it implicitly assumes that all items in the database have similar utility values. However, some items have high utility, while others may have relatively low utility in a database. If minUtil is set too high, the user will miss all itemsets containing low utility items. To find itemsets that involve both high and low utility items, minUtil has to be set very low. However, this may cause a combinatorial explosion, as the items with high utility may combine with others in all possible ways. This dilemma is called the low utility item problem. This paper proposes a flexible model of relative high utility itemsets to address this problem. We introduce a new null-invariant measure, called utility ratio, to evaluate the interestingness of an itemset in the database. We also present a fast single scan algorithm to find all desired itemsets in the database. Experimental results demonstrate that the proposed algorithm is efficient. Finally, a case study on Yahoo! JAPAN retail data shows that the proposed model is useful.
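
The low utility item problem described above can be reproduced on a toy example. The sketch below is only an illustration of the dilemma under a single minUtil threshold; it does not implement the utility ratio measure, whose definition is not given here, nor the authors' single scan algorithm. The item names, profits, quantities and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical unit profits: two high utility and two low utility items.
PROFIT = {"tv": 500, "camera": 300, "cable": 5, "battery": 3}

# Hypothetical quantitative transactions: item -> purchased quantity.
TRANSACTIONS = [
    {"tv": 1, "cable": 2},
    {"camera": 1, "battery": 4},
    {"tv": 1, "camera": 1, "cable": 1},
    {"cable": 3, "battery": 2},
]


def utility(itemset, transactions, profit):
    """Utility of an itemset: quantity x profit of its items, summed over
    every transaction that contains the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_util):
    """Brute-force search for all itemsets with utility >= min_util."""
    items = sorted(profit)
    found = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if u >= min_util:
                found[itemset] = u
    return found


if __name__ == "__main__":
    strict = high_utility_itemsets(TRANSACTIONS, PROFIT, min_util=500)
    loose = high_utility_itemsets(TRANSACTIONS, PROFIT, min_util=20)
    # A high threshold hides the itemset made of low utility items ...
    print(("battery", "cable") in strict)
    # ... while a low threshold keeps it but returns many more itemsets.
    print(("battery", "cable") in loose, len(strict), len(loose))
```

With only four items the growth in output is small, but the same effect compounds quickly as the number of items grows, which is the combinatorial explosion the abstract refers to.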