Distinguished Lecture Series

Prof. Dr. Michael J. Carey

University of California, Irvine (UCI), USA

11 February 2021, 5:00 pm

“BAD to the Bone: Managing and Serving Big Active Data”

via Zoom

Meeting-ID: 896 8340 5166

Abstract:

Nearly all of today’s Big Data systems are passive in nature, responding to queries posed by their users. This talk will describe a project that has been working to shift Big Data platforms from passive to active. In our view, a Big Active Data (BAD) system should continuously and reliably capture Big Data while enabling timely and automatic delivery of relevant information to a large pool of interested users, as well as supporting retrospective analyses of historical information. While various scalable streaming query engines exist, their active behavior is limited to a (relatively) small window of the incoming data. To overcome this limitation, we have created a BAD platform that combines ideas and capabilities from both Big Data and Active Data (e.g., publish/subscribe, streaming engines). It supports complex subscriptions that consider not only newly arrived items but also their relationships to past, stored data. Further, it can provide actionable notifications by enriching subscription results with other useful data. Our BAD platform extends an existing open-source Big Data Management System, Apache AsterixDB, with an active toolkit. The toolkit contains features to rapidly ingest semistructured data, share execution pipelines among users, manage user data subscriptions, and actively monitor the state of the data to produce individualized information for each user. This talk will describe the features and design of our current BAD data platform and briefly examine its ability to scale.

Bio:

Michael J. Carey received his B.S. and M.S. degrees from Carnegie-Mellon University and his Ph.D. from the University of California, Berkeley, in 1979, 1981, and 1983, respectively. He is currently a Bren Professor of Information and Computer Sciences at the University of California, Irvine (UCI) and a Consulting Architect at Couchbase, Inc. Before joining UCI in 2008, Mike worked at BEA Systems for seven years and led the development of BEA's AquaLogic Data Services Platform product for virtual data integration. He also spent a dozen years teaching at the University of Wisconsin-Madison, five years at the IBM Almaden Research Center working on object-relational databases, and a year and a half at e-commerce platform startup Propel Software during the infamous 2000-2001 Internet bubble. Mike is an ACM Fellow, an IEEE Fellow, a member of the National Academy of Engineering, and a recipient of the ACM SIGMOD E.F. Codd Innovations Award. His current interests all center around data-intensive computing and scalable data management (a.k.a. Big Data).