Getting Started with R Software for Data Analysis
AI and Data Science
R is a flexible environment for statistical computing and graphics that is becoming increasingly popular as a tool for gaining insight into often complex data. While in some ways similar to other programming languages (such as C, Java and Perl), R is particularly suited to data analysis because ready-made functions are available for a wide variety of statistical techniques (classical statistical tests, linear and nonlinear modeling, time series analysis, classification, clustering, ...) and graphical techniques.
The base R program can be extended with user-submitted packages, which means new techniques are often implemented in R before being available in other software. This is one of the reasons why R is becoming the de facto standard in certain fields such as bioinformatics (Bioconductor) and financial services.
This course is part of a larger course series in Data Analysis consisting of 19 individual modules. Find more information and enroll for this module via www.ipvw-ices.ugent.be
This course introduces the use of the R environment for the implementation of data management, data exploration, basic statistical analysis and automation of procedures.
It starts with a description of the R GUI, the use of the command line and an overview of basic data structures. The application of standard procedures to import data or to export results to external files will be illustrated.
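As a rough illustration of the kind of material in this first part, the sketch below builds a few basic data structures and writes a data frame to a CSV file before reading it back. The variable names and values are invented for the example, and a temporary file stands in for a real data file.

```r
# Basic data structures: a numeric vector, a factor and a data frame
# (all names and values here are made up for illustration)
height <- c(1.62, 1.75, 1.80)
group  <- factor(c("a", "b", "a"))
df     <- data.frame(id = 1:3, height = height, group = group)

# Export the data frame to a CSV file; a temporary file is used here
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)

# Import the file again, converting text columns to factors
df2 <- read.csv(path, stringsAsFactors = TRUE)

# Inspect the structure of the imported data
str(df2)
```

Typing these lines at the R command line one at a time is a reasonable way to see what each object looks like.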
The data management section covers creating new variables and subsetting, merging and stacking data sets. The data will then be explored using histograms, box plots, scatter plots, summary statistics, correlation coefficients and cross-tabulations.
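The data management and exploration steps listed above can be sketched in base R as follows. The two small data sets, their column names and the cut-off value are hypothetical and serve only to make the example self-contained; the course itself may use different data and functions.

```r
# Two small, made-up data sets sharing the key column "id"
patients <- data.frame(id = 1:4,
                       weight = c(60, 72, 85, 90),
                       height = c(1.60, 1.75, 1.80, 1.85))
visits   <- data.frame(id = c(1L, 2L, 4L),
                       sex = factor(c("F", "M", "M")))

# Create a new variable from existing columns
patients$bmi <- patients$weight / patients$height^2

# Subset: keep only the rows that satisfy a condition
heavy <- subset(patients, weight > 70)

# Merge: combine the two data sets on the shared key
merged <- merge(patients, visits, by = "id")

# Stack: append rows with the same columns
extra   <- data.frame(id = 5L, weight = 55, height = 1.58,
                      bmi = 55 / 1.58^2)
stacked <- rbind(patients, extra)

# Explore: summary statistics, correlation, cross-tabulation and plots
summary(patients$bmi)
cor(patients$weight, patients$height)
table(merged$sex)
hist(patients$bmi)
boxplot(weight ~ sex, data = merged)
plot(patients$height, patients$weight)
```

Note that `merge()` keeps only ids present in both data sets by default, so the merged result has fewer rows than `patients`.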
Simple statistical procedures that will be covered are:
Finally, installing new packages and automation of analysis procedures will also be discussed.
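A minimal sketch of what installing a package and automating an analysis step might look like is given below. The package name `ggplot2` is only an example of a CRAN package, the helper function `summarise_numeric` is hypothetical, and the built-in `iris` data set is used purely for demonstration.

```r
# Installing and loading a package (commented out: needs network access).
# "ggplot2" is just an example of a package available on CRAN.
# install.packages("ggplot2")
# library(ggplot2)

# Automation: wrap a repeated analysis step in a function
# (summarise_numeric is a hypothetical helper, not part of base R)
summarise_numeric <- function(data) {
  # Keep only the numeric columns
  num <- data[sapply(data, is.numeric)]
  # Compute mean and standard deviation per column; one row per variable
  t(sapply(num, function(x) c(mean = mean(x), sd = sd(x))))
}

# Apply the same procedure to a built-in data set
result <- summarise_numeric(iris)
print(result)
```

The same function can then be reused on any data frame, which is the basic idea behind automating a procedure.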
Practical sessions and specific exercises will be provided to allow participants to practice their R skills in interaction with the teacher.