Mining Focused Patterns from Big Data Streams
With rapid advances in computing power and a dramatic expansion of data collection and storage capabilities, businesses and organizations have collected vast amounts of data about their business processes. These data are modern-day treasure stores that can be mined for insights into a business's products, services, and customers. Despite considerable effort in the field of data analytics, a large gap remains between academic deliverables and business expectations, and as a result many questions remain unanswered. How can we discover actionable knowledge from dynamically changing data? How can we effectively combine human and machine intelligence to gain more useful insights from Big Data?
One major objective in Big Data analytics is to discover patterns that represent intrinsic and important properties of massive datasets in different domains. Pattern discovery has been studied extensively in the field of data mining. However, most existing techniques fail to incorporate user preferences into the pattern mining process and thus lack the ability to steer algorithms toward the more interesting parts of Big Data. In this talk, I will describe how we overcome this limitation by using user-oriented (so-called focused) approaches for mining patterns in Big Data streams. In this new problem setting, patterns are mined simultaneously according to user preferences. I will talk about the problems, challenges, and opportunities in discovering such patterns in Big Data. I will also talk about my research projects with industry and how theory can be implemented in real-world systems.