
In this tutorial, we will explore various options for loading and saving data in R, using plain-text files, R files, and Excel spreadsheets. It also introduces the R packages that you can use to load data from databases and other common programs, like SAS and MATLAB.
R comes with many data sets pre-loaded in the datasets package, which is part of base R. You can list the pre-loaded data sets, along with a short description of each, by running:
> help(package = "datasets")
Before you load your own data, R needs to know where to look for it. Relative file paths are resolved against the working directory, which you can change with setwd and check with getwd:
> setwd("/location or directory path/")  # Set the working directory to the specified location
> getwd()  # Show the current working directory
You can place your data files straight into the folder that is the working directory, or you can move your working directory to where your data files are. You can move your working directory to any folder on your computer with the function setwd. Just give setwd the file path to your new working directory.
If the file path does not begin with your root directory, R will assume that it begins at your current working directory.
You can also change your working directory by clicking on Session > Set Working Directory > Choose Directory in the RStudio menu bar. The Windows and Mac GUIs have similar options. If you start R from a UNIX command line (as on Linux machines), the working directory will be whichever directory you were in when you called R.
You can see what files are in your working directory with list.files(). If you see the file that you would like to open in your working directory, then you are ready to proceed. How you open files in your working directory will depend on which type of file you would like to open.
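The working-directory functions described above can be tried out safely in a throwaway folder. The following is a minimal sketch: the folder prefix `wd_demo_` and the file name `example.csv` are made up for illustration, and the original working directory is restored at the end.

```r
# Remember the current working directory so it can be restored afterwards.
old_wd <- getwd()

# Create a scratch folder containing a single small file.
scratch <- tempfile("wd_demo_")
dir.create(scratch)
writeLines(c("a,b", "1,2"), file.path(scratch, "example.csv"))

# Move into the scratch folder and list what it contains.
setwd(scratch)
files_here <- list.files()
print(files_here)

# A relative path is now resolved against the new working directory.
found <- file.exists("example.csv")

# Go back to where we started.
setwd(old_wd)
```

Restoring the old directory at the end keeps the rest of a session (or script) unaffected by the temporary move.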
Plain-text files are one of the most common ways to save data. They are very simple and can be read by many different computer programs—even the most basic text editors. For this reason, public data often comes as plain-text files.
You can use read.table and its related functions to load a plain-text file.
The following table gives a quick overview of R's read functions. You can always override any of the default arguments as necessary for your data:
| Function | Defaults | Use |
|---|---|---|
| read.table | sep = "", header = FALSE | General read function (splits on any whitespace) |
| read.csv | sep = ",", header = TRUE | To read CSV files |
| read.delim | sep = "\t", header = TRUE | To read tab-delimited files |
| read.csv2 | sep = ";", header = TRUE, dec = "," | To read CSV files with European decimal format |
| read.delim2 | sep = "\t", header = TRUE, dec = "," | To read tab-delimited files with European decimal format |
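To illustrate how the defaults in the table relate to each other, here is a small sketch that writes a semicolon-separated file with decimal commas to a temporary location and reads it in two ways. The column names and values are invented for the example.

```r
# Write a tiny European-style CSV file to a temporary path.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name;score", "Ana;9,5", "Ben;7,25"), csv_path)

# read.csv2 already assumes sep = ";" and dec = ",".
scores <- read.csv2(csv_path)

# read.table gives the same data once its defaults are overridden.
scores_tbl <- read.table(csv_path, sep = ";", dec = ",", header = TRUE)

print(scores)
```

In other words, the specialised functions are mostly read.table with different default arguments, so any of them can be adapted to an unusual file by overriding sep, dec, or header.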
One specific type of plain-text file defies this pattern by using its layout to separate data cells. Each row is placed on its own line (as with other plain-text files), and each column begins at a specific number of characters from the left-hand side of the document. To achieve this, spaces are added to the end of each entry so that the next entry starts at the correct position. These documents are known as fixed-width files and usually end with the extension ".fwf". R has its own function for reading such files: read.fwf.
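As a rough sketch of how read.fwf is used, the example below creates a small fixed-width file and reads it back. The column widths (6, 8, and 2 characters) and the column names are assumptions chosen to match this made-up file; a real file needs the widths from its own documentation.

```r
# Create a small fixed-width file: columns are 6, 8, and 2 characters wide.
fwf_path <- tempfile(fileext = ".fwf")
writeLines(c("ace   spades  14",
             "king  hearts  13"), fwf_path)

# widths tells read.fwf where each column ends; strip.white removes the padding.
cards <- read.fwf(fwf_path,
                  widths = c(6, 8, 2),
                  col.names = c("face", "suit", "value"),
                  strip.white = TRUE,
                  stringsAsFactors = FALSE)

print(cards)
```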
Many data files are available on the Internet at their own web address. If you are connected to the Internet, you can open these files straight into R with read.table, read.csv, etc. by passing the web address as the file name argument of any of R's data-reading functions. For example, you could read in the poker data set from a web address like http://.../poker.csv with:
poker <- read.csv("http://.../poker.csv")
Once your data is in R, you can save it to any file format that R supports. To save it as a plain-text file, use the write family of functions:
| File Format | Function and Syntax |
|---|---|
| .csv | write.csv(r_object, file = filepath, row.names = FALSE) |
| .csv (with European decimal notation) | write.csv2(r_object, file = filepath, row.names = FALSE) |
| tab delimited | write.table(r_object, file = filepath, sep = "\t", row.names = FALSE) |
The first argument of each function is the R object that contains your data set. The file argument is the file name (including extension) that you wish to give the saved data. By default, each function will save your data into your working directory.
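A short round trip shows the write functions in action and why the table passes row.names = FALSE. This is only a sketch: the data frame `deck` and the file names are invented, and the files are written to R's temporary directory rather than the working directory.

```r
# A small data frame to save.
deck <- data.frame(face = c("king", "queen"), value = c(13, 12))

# Save without row names, then read the file back.
out_path <- file.path(tempdir(), "deck.csv")
write.csv(deck, file = out_path, row.names = FALSE)
deck_again <- read.csv(out_path)

# With the default row.names = TRUE, the row names become an extra column.
with_rn_path <- file.path(tempdir(), "deck_with_rownames.csv")
write.csv(deck, file = with_rn_path)
cols_with_rn <- names(read.csv(with_rn_path))

print(names(deck_again))
print(cols_with_rn)
```

The second file gains an unnamed first column of row numbers, which read.csv labels X when the file is loaded again.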
Large output files can be awkward to store and to share with others. Compressing them is an effective way to deal with this.
To compress a plain-text file, surround the file name or file path with the function bzfile, gzfile, or xzfile. For example:
> write.csv(poker, file = bzfile("data/poker.csv.bz2"), row.names = FALSE)
Each of the three functions uses a different compression format:
| Function | Compression Type |
|---|---|
| bzfile | bzip2 |
| gzfile | GNU zip (gzip) |
| xzfile | xz compression |
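The same pattern works for reading: wrapping the path in the matching connection function lets the read functions decompress on the fly. Below is a minimal sketch using gzfile, with an invented data frame and a temporary file path.

```r
# A small data frame to compress.
measurements <- data.frame(id = 1:3, reading = c(0.5, 1.5, 2.5))

# Write a gzip-compressed CSV by wrapping the path in gzfile().
gz_path <- tempfile(fileext = ".csv.gz")
write.csv(measurements, file = gzfile(gz_path), row.names = FALSE)

# Read it back through the same kind of connection.
restored <- read.csv(gzfile(gz_path))

print(restored)
```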
R provides two file formats of its own for storing data, .RDS and .RData. RDS files can store a single R object, and RData files can store multiple R objects.
You can open an RDS file with readRDS. For example, if the poker data was saved as poker.RDS, you could open it with:
poker <- readRDS("poker.RDS")
Opening RData files is even easier. Simply run the function load with the file:
load("file.RData")
There’s no need to assign the output to an object. The R objects in your RData file will be loaded into your R session with their original names. RData files can contain multiple R objects, so loading one may read in multiple objects. load doesn’t tell you how many objects it is reading in, nor what their names are, so it pays to know a little about the RData file before you load it.
To save data as an RData object, one can use the save function. To save data as an RDS object, use the saveRDS function. In each case, the first argument should be the name of the R object you wish to save. You should then include a file argument that has the file name or file path you want to save the data set to.
For example, if you have three R objects, a, b, and c, you could save them all in the same RData file and then reload them in another R session:
a <- 5
b <- 90
c <- 100
save(a, b, c, file = "test.RData")  # Write all three objects to one file
load("test.RData")  # In a later session: restore a, b, and c
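If you are unsure what an RData file contains, one cautious approach is to load it into a separate environment and capture the value that load returns invisibly, which is a character vector of the restored object names. The sketch below also contrasts this with an RDS file, where you choose the name yourself. The names `scratch_env` and `renamed` are hypothetical, and the files are written to temporary paths.

```r
a <- 5
b <- 90

# Save two objects to one RData file.
rdata_path <- tempfile(fileext = ".RData")
save(a, b, file = rdata_path)

# Load into a separate environment so nothing in the session is overwritten.
scratch_env <- new.env()
loaded_names <- load(rdata_path, envir = scratch_env)
print(loaded_names)

# An RDS file holds one object, and you assign it to any name on reading.
rds_path <- tempfile(fileext = ".RDS")
saveRDS(a, file = rds_path)
renamed <- readRDS(rds_path)
```

Loading into a fresh environment avoids silently replacing objects that happen to share a name with something in the file.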
Quite a few packages can help you read Excel files directly into R. Unfortunately, many of these packages do not work on all operating systems, and others have been made out of date by the .xlsx file format. One package that does work on all operating systems (and gets good reviews) is the XLConnect package. To use it, you'll need to install and load the package:
install.packages("XLConnect")
library(XLConnect)
XLConnect relies on Java to be platform-independent. So when you first open XLConnect, RStudio may ask to download a Java Runtime Environment if you do not already have one.
R also has a good number of functions that attempt to read the file types of other data-analysis programs:
| File Format | Function | Library |
|---|---|---|
| MATLAB | readMat | R.matlab |
| Minitab | read.mtp | foreign |
| SAS | read.ssd | foreign |
| SAS (XPORT format) | read.xport | foreign |
| SPSS | read.spss | foreign |
| Stata | read.dta | foreign |
| Systat | read.systat | foreign |
| ESRI ArcGIS | read.shapefile | shapefiles |
One great aspect of R is that you can also use R to connect to a database and read in data. How you do this will depend on the database management system that you use. Working with a database will usually require experience that goes beyond the skill set of a typical R user. However, if you are interested in doing this, the best place to start is by downloading these R packages and reading their documentation.
Use the RODBC package to connect to databases through an ODBC connection.
Use the DBI package to connect to databases through individual drivers. The DBI package provides a common syntax for working with different databases. You will have to download a database-specific package to use in conjunction with DBI. These packages provide the API for the native drivers of different database programs. For MySQL use RMySQL, for SQLite use RSQLite, for Oracle use ROracle, for PostgreSQL use RPostgreSQL, and for databases that use drivers based on the Java Database Connectivity (JDBC) API use RJDBC. Once you have loaded the appropriate driver package, you can use the commands provided by DBI to access your database.
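To give a feel for the DBI workflow, here is a minimal sketch that uses RSQLite with an in-memory database, chosen only because it needs no database server. The table name `cards` and its contents are invented, and the whole example is skipped if the DBI and RSQLite packages are not installed.

```r
# Run only when both packages are available (install.packages() them otherwise).
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {

  # Connect through the SQLite driver; ":memory:" creates a temporary database.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Copy a data frame into the database as a table named "cards".
  DBI::dbWriteTable(con, "cards",
                    data.frame(face = c("king", "queen"), value = c(13, 12)))

  # Send a SQL query and get the result back as a data frame.
  high_cards <- DBI::dbGetQuery(con, "SELECT face FROM cards WHERE value > 12")
  print(high_cards)

  # Always close the connection when finished.
  DBI::dbDisconnect(con)
}
```

With another database, the DBI calls would stay the same; only the driver passed to dbConnect (and its connection details) would change.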
We hope you enjoyed this tutorial. In the next one, we will discuss how to generate visualizations using R.