Speakers

Prof. Ihab Ilyas
Professor, University of Waterloo


Biography:

Ihab Ilyas is a professor in the Cheriton School of Computer Science at the University of Waterloo, where his main research focuses on the areas of big data and data science, with special interest in data quality and integration, managing uncertain data, and information extraction. Ihab is also a co-founder of Tamr, a startup focusing on large-scale data integration and cleaning. He is a recipient of the Ontario Early Researcher Award (2009), a Cheriton Faculty Fellowship (2013), an NSERC Discovery Accelerator Award (2014), and a Google Faculty Award (2014), and he is an ACM Distinguished Scientist. Ihab is an elected member of the VLDB Endowment board of trustees, elected SIGMOD vice chair, and an associate editor of the ACM Transactions on Database Systems (TODS). He holds a PhD in computer science from Purdue University, West Lafayette.


"Big Data Analytics and its Challenges in Large Scale Socio-economic Development"

Economy and science are now driven by data: from evidence-based medicine and big data analytics that guide spending and decision making across economic sectors, to data science infrastructure that speeds up discoveries in astronomy, chemistry, and many other scientific domains. Enterprises and governments across all verticals (e.g., healthcare, financial services, and manufacturing) have been aggressively collecting data from a variety of sources, including customers, transactions, sensors, and human contributions on social media, to build the ultimate data asset, often referred to as the "data lake", which allows data scientists to find key insights and business-driving analytics to guide the decision-making process.
In this talk, I will highlight multiple examples where data analytics has played a key role in socio-economic development (the World Bank and Internews), policy making by multiple governments, and social good such as disaster management (the MicroFilter project).
However, due to the variety and imperfection of data collection methods (for example, scraping data from social media or text documents, accessing legacy and obsolete sources, and integrating data with different schemas, units, and languages), this data asset is often dirty and siloed and cannot be used directly as intended. Hence, data cleaning has been recognized as the most time-consuming task performed by data scientists (as articulated by Forbes in 2016) and as a key hurdle to effective data science (as stated by The New York Times in 2014).


Technical Partner
Planetarium Science Center, Library of Alexandria
Chatby 21526, Alexandria, EGYPT
Tel: +(203) 4839999 - Ext.: 1766, Fax: +203 4820464
E-mail: EMMESCHOOL.2015@bibalex.org