Managing Big Data with R and Hadoop
Learn how to manage and analyse big data using the R programming language and Hadoop programming framework in this online programming course from PRACE.
Who is the course for?
This course is designed for people who are interested in data science, computational statistics and machine learning and who have basic experience with them. It will also be useful for advanced undergraduate students and first-year PhD students in data analysis, statistics or bioinformatics who wish to understand how to manage big data with Hadoop using the R programming language.
We expect learners to also have basic experience with Linux and Bash, and working experience with R and matrix operations. They should also be able to download and run a virtual machine.
What topics will you cover?
Welcome to BIG DATA
Working with Hadoop
First steps in R and RHadoop
Statistical learning with RHadoop: clustering
Statistical learning with RHadoop: regression and classification