Machine learning (ML) is globally accepted as the core technology of disruptive services such as Amazon, Google, Facebook, Baidu and Twitter. These organizations have invested heavily in ML start-ups and data scientists to exploit existing machine learning technologies and to develop new dedicated methods, e.g., taking leadership in areas such as Deep Learning. Relevant work is already under way in both business and academia. A successful machine learning service should:
1) be able to serve the appropriate tool to a given application
2) be automatic and self-tuning, i.e., require only application-domain expert intervention
3) work at scale, i.e., be computationally efficient and show low latency in streaming applications
4) provide unbiased performance estimates
5) visualize performance and decisions for non-expert quality control.
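Requirement 4 above deserves a concrete illustration: performance estimates are only unbiased if model selection is kept separate from evaluation. A minimal sketch using nested cross-validation in scikit-learn, where the dataset, model, and parameter grid are illustrative stand-ins rather than choices made by the proposal:

```python
# Hedged sketch: unbiased performance estimation via nested cross-validation.
# Dataset (breast cancer), model (SVC), and grid are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Inner loop tunes the regularization parameter C; the outer loop scores the
# tuned model on folds never seen during tuning, avoiding optimistic bias.
inner = GridSearchCV(SVC(kernel="rbf"), {"C": [0.1, 1.0, 10.0]}, cv=3)
outer_scores = cross_val_score(inner, X, y, cv=5)
```

Reporting the mean of `outer_scores` (rather than the inner grid-search score) is what keeps the estimate honest.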
We will develop representations of ML algorithms and formulate use cases/problems as queries, see e.g., Google's Prediction API. Using methods from visual analytics, we will develop interfaces for visual search in ML-algorithm space. Our developments will build on international open source tools for working at scale (e.g., Apache projects, Torch, Python tools such as scikit-learn, and DTU, VISMA, and DIKU tools; see www.apache.org, torch.ch, scikit-learn.org/stable), consolidated and optimized. Academic and business partners will work to create an actionable tool set of databases, algorithms, and user interfaces, will develop a large set of proofs of concept, and will publish best-practice recommendations based on the MLaaS.
The main objective is to produce a coherent, actionable set of open source ML tools with built-in access to Danish resources, for value creation and growth in the private and public sectors. Focus will be on cleaning, decision support and interactivity. Meta-learning methods will be pursued, such as Bayesian optimization of hyper-parameters, for a broad set of machine learning applications: supervised, semi-/unsupervised, novelty/outlier detection, multitask, multi-view, performance evaluation and visualization. We will reduce variance inflation by use of heterogeneous ensembles and research compression techniques to reduce the computational burden of ensembles and large models.
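To make the meta-learning direction concrete, the following is a minimal sketch of Bayesian optimization of a single hyper-parameter (the regularization strength of ridge regression), using a Gaussian-process surrogate and an expected-improvement acquisition function. The task, search range, and all settings are illustrative assumptions, not the project's method:

```python
# Hedged sketch: Bayesian optimization of log10(alpha) for ridge regression.
# Surrogate: Gaussian process; acquisition: expected improvement (maximizing
# cross-validated R^2). All names and settings are illustrative.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

def objective(log_alpha):
    # Cross-validated R^2 of ridge regression at the given alpha (maximized).
    return cross_val_score(Ridge(alpha=10 ** log_alpha), X, y, cv=3).mean()

rng = np.random.default_rng(0)
evaluated = [-3.0, 0.0, 3.0]              # initial design in log10(alpha)
scores = [objective(a) for a in evaluated]

for _ in range(5):                        # a few optimization iterations
    gp = GaussianProcessRegressor(normalize_y=True).fit(
        np.array(evaluated)[:, None], scores)
    cand = rng.uniform(-4, 4, size=256)[:, None]
    mu, sigma = gp.predict(cand, return_std=True)
    best = max(scores)
    # Expected improvement: how much each candidate is expected to beat
    # the incumbent, balancing the GP's mean and uncertainty.
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    nxt = float(cand[np.argmax(ei), 0])
    evaluated.append(nxt)
    scores.append(objective(nxt))

best_log_alpha = evaluated[int(np.argmax(scores))]
```

The same loop generalizes to several hyper-parameters and to other model families; the point is that expensive training runs are spent where the surrogate predicts the largest expected gain.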
The DABAI project has a strong focus on creating actionable access to a vast array of Danish and international databases. This is urgent for value creation in the currently diverse and disconnected data landscape. Actionable access will be facilitated by machine learning based "filters" connecting databases with relevant tools for analytics, visualization and decision making. For the database ML solutions, focus will be on novelty and cleaning tools, use of deep architectures to optimize decision performance, and on visualization tools enabling ML-human interaction, including visualization of uncertainty and risk assessment by simulation.
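One way such a cleaning "filter" between a database and downstream analytics could look is sketched below, with novelty/outlier detection implemented by an isolation forest. The synthetic records and the contamination level are illustrative assumptions:

```python
# Hedged sketch: a database cleaning filter based on novelty/outlier
# detection with an isolation forest. Data and thresholds are illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
records = np.vstack([
    rng.normal(0.0, 1.0, size=(500, 4)),   # typical records
    rng.normal(8.0, 1.0, size=(10, 4)),    # corrupted / anomalous records
])

detector = IsolationForest(contamination=0.02, random_state=0).fit(records)
labels = detector.predict(records)         # +1 = inlier, -1 = outlier
clean = records[labels == 1]               # pass only inliers downstream
flagged = records[labels == -1]            # route outliers to human review
```

In a deployed filter, the flagged records would be surfaced to the visualization layer for human inspection rather than silently dropped.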
The consortium hosts a wide range of machine learning activities, and consortium researchers have made fundamental contributions to machine learning. To realize the potential of machine learning at societal scale, the tools must be accessible, e.g. in the form of machine learning as a service (MLaaS), implementing a broad set of methods and an efficient interface to match methods with use cases.
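A broad method set also enables the variance reduction via heterogeneous ensembles mentioned above. A minimal sketch, combining three different model families by soft voting; the dataset and member models are illustrative stand-ins:

```python
# Hedged sketch: variance reduction with a heterogeneous ensemble that
# averages predicted probabilities across three model families.
# Dataset and member models are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

ensemble = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(),
                             LogisticRegression(max_iter=1000))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",  # average class probabilities across the members
)
scores = cross_val_score(ensemble, X, y, cv=5)
```

Because the members err differently, averaging their probabilities tends to be more stable fold-to-fold than any single member, which is the variance-inflation effect the project aims to reduce.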