Data volumes are growing enormously, and we need to develop infrastructure and tools for processing the data and extracting information from it. These are the main areas of our work:
- Big Data and analytics are almost synonymous these days, and we work on many aspects of both.
- We offer the Metacenter cluster with a large number of computers and storage.
- Data center optimization and modelling of power usage in virtualized datacenters.
- Autoscaling, Hadoop scaling, cloud and web programming.
- Monitoring and visualization of infrastructure in OpenStack and the cloud.
- Combining Apache Spark and REST APIs for large AI systems such as a question answering engine.
Big Data hints
We have a great introduction to Big Data. The materials come from the CVUT course Big Data Technologies (BDT). All lectures and the accompanying materials are available online.
- BDT introduces the basics, such as creating an account and locating data, in Metacentrum, the large data center we use in our research.
- BDT introduces Hadoop and MapReduce, including practical examples of the simplest algorithms, such as dictionary creation, a histogram of words, an inverted index for full-text search, and simple HBase usage.
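
To give a feel for the simplest algorithms mentioned above, here is a minimal sketch of a word histogram, a dictionary and an inverted index written in the MapReduce style. It is not taken from the BDT course materials and it does not use Hadoop: the helper `run_mapreduce`, the mapper and reducer names, and the two sample documents are all hypothetical, and the whole pipeline runs in a single Python process purely for illustration.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate map, shuffle (group by key) and reduce in one process.

    Hypothetical helper for illustration; a real Hadoop job distributes
    these phases across many machines.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}


def tokenize(text):
    # Assumption: lowercase, whitespace-separated words are good enough here.
    return text.lower().split()


# Histogram of words: emit (word, 1) and sum the ones per word.
def count_mapper(record):
    _doc_id, text = record
    for word in tokenize(text):
        yield word, 1


def count_reducer(_word, counts):
    return sum(counts)


# Inverted index: emit (word, doc_id) once per document, collect the ids.
def index_mapper(record):
    doc_id, text = record
    for word in set(tokenize(text)):
        yield word, doc_id


def index_reducer(_word, doc_ids):
    return sorted(doc_ids)


# Made-up sample input: (document id, text) pairs.
docs = [
    ("d1", "big data needs big clusters"),
    ("d2", "hadoop stores big data"),
]

histogram = run_mapreduce(docs, count_mapper, count_reducer)
dictionary = sorted(histogram)  # the dictionary is the set of distinct words
index = run_mapreduce(docs, index_mapper, index_reducer)

print(histogram["big"])  # 3
print(index["hadoop"])   # ['d2']
```

The same mapper and reducer logic carries over to a cluster, where the framework handles the grouping step and the distribution of input splits; only the single-process driver here is a stand-in.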