Hey! Today I want to show you how to read a CSV file in R. We will also look at the arguments we can use to control how the file is read. We will work with a CSV file that you can download from this link.
To create a data frame from the data, we start by calling the read.csv function:
df <- read.csv("Countries.csv", header = TRUE, col.names = c("maptools", "maps", "gapminder"), sep = ",", encoding = "utf-8", stringsAsFactors = FALSE)
## Warning in read.table(file = file, header = header, sep = sep, quote = quote, :
## header and 'col.names' are of different lengths
##         maptools            maps   gapminder
## 1    Afghanistan     Afghanistan Afghanistan
## 2 Ã…land Islands            <NA>        <NA>
## 3        Albania         Albania     Albania
## 4        Algeria         Algeria     Algeria
## 5 American Samoa  American Samoa       Samoa
## 6           <NA> Andaman Islands        <NA>
In this call we have used quite a few parameters, for example header, col.names, sep… Don't worry about that, because I'm going to explain what each one means.
header: a Boolean parameter. If we set it to TRUE, the first line of the file is used as the column names; if we set it to FALSE, the first line is read as ordinary data.
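To see the difference, here is a minimal sketch. The small inline CSV and the variable names are made up for illustration; I pass the data through the text argument so you don't need a file on disk.

```r
# Hypothetical CSV content, supplied inline instead of from a file.
csv_text <- "country,population\nAlbania,2.8\nAlgeria,44.9"

with_header <- read.csv(text = csv_text, header = TRUE)
without_header <- read.csv(text = csv_text, header = FALSE)

names(with_header)     # names taken from the first line of the data
names(without_header)  # automatic names, because no header was read
nrow(with_header)      # the first line is not counted as data
nrow(without_header)   # the first line is counted as a data row
```

With header = FALSE, the header line ends up as the first row of the data frame, which is usually not what you want.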
col.names: with this parameter we can change the column names. In this case we have written the same names that the CSV file already has.
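As a quick sketch of renaming columns while reading, here is an example with invented inline data (the names x, y and z are just placeholders). I give one name per column, since a mismatch in length is what triggers the "different lengths" warning.

```r
# Hypothetical three-column CSV, supplied inline.
csv_text <- "a,b,c\n1,2,3\n4,5,6"

# One name per column; these replace the names found in the header line.
renamed <- read.csv(text = csv_text, header = TRUE,
                    col.names = c("x", "y", "z"))

names(renamed)
```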
sep: if we have problems reading the data, we can choose the separator that the file uses. In this case we have used "," because the file is comma-separated.
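Here is a small sketch of what happens when the separator does not match the file. The semicolon-separated data below is invented for the example.

```r
# Hypothetical semicolon-separated content, supplied inline.
semi_text <- "country;capital\nAlbania;Tirana\nAlgeria;Algiers"

wrong_sep <- read.csv(text = semi_text, sep = ",")
right_sep <- read.csv(text = semi_text, sep = ";")

ncol(wrong_sep)  # everything lands in a single column
ncol(right_sep)  # the two columns are split correctly
```

If your data frame comes back with only one column, the separator is a good first thing to check.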
stringsAsFactors: if you are working with categorical data, this parameter is very useful. In versions of R before 4.0.0, read.csv() converts character strings into factors by default, so if we want to avoid that we must set this parameter to FALSE. Since R 4.0.0 the default is already FALSE.
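A minimal sketch of both settings, using invented inline data. I set the parameter explicitly in both calls so the result does not depend on your R version.

```r
# Hypothetical CSV content, supplied inline.
csv_text <- "country,continent\nAlbania,Europe\nAlgeria,Africa"

as_text <- read.csv(text = csv_text, stringsAsFactors = FALSE)
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(as_text$continent)     # plain character strings
class(as_factor$continent)   # a factor
levels(as_factor$continent)  # the distinct categories, sorted
```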
na.strings: it replaces values (e.g. characters, numbers) in your CSV file with NA. If you try read.csv("Countries.csv", na.strings = "A") you'll see that every field whose value is exactly A is replaced with NA.
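To make the matching rule concrete, here is a sketch with made-up data: only fields that are exactly "A" become NA, while values that merely start with an A are left alone.

```r
# Hypothetical CSV content, supplied inline.
csv_text <- "code,name\nA,Albania\nB,Algeria\nA,Andorra"

df_na <- read.csv(text = csv_text, na.strings = "A",
                  stringsAsFactors = FALSE)

df_na$code  # the two "A" fields are now NA
df_na$name  # untouched: no field here is exactly "A"
```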
colClasses: with this parameter you can choose the class of each column, as in this example:
colClasses = c("character", "complex", "factor", "integer", "numeric", "Date", "logical")
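As a runnable sketch, here is colClasses applied to a small invented CSV with four columns, one class per column in file order. The column names and values are assumptions for the example.

```r
# Hypothetical CSV content, supplied inline.
csv_text <- paste(
  "id,country,joined,active",
  "1,Albania,2009-04-01,TRUE",
  "2,Algeria,1962-10-08,FALSE",
  sep = "\n"
)

typed <- read.csv(text = csv_text,
                  colClasses = c("integer", "character", "Date", "logical"))

sapply(typed, class)  # one class per column, as requested
```

Dates written as year-month-day, as above, are what the "Date" class expects by default.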
If you only want to change one variable, you can pass a named vector, so that only the column you name is affected and the others are detected automatically.
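One way to do this, sketched with invented data, is to give colClasses a named vector. Here I assume a column called id whose leading zeros we want to keep.

```r
# Hypothetical CSV content, supplied inline.
csv_text <- "id,country\n001,Albania\n002,Algeria"

default_types <- read.csv(text = csv_text)
one_fixed <- read.csv(text = csv_text, colClasses = c(id = "character"))

default_types$id  # read as numbers, so the leading zeros are lost
one_fixed$id      # kept as text, exactly as written in the file
```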
skip: the number of lines of the data file to skip before beginning to read data.
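This is handy when a file starts with a few lines of notes before the real table. A minimal sketch, with invented content:

```r
# Hypothetical export with two lines of notes before the header.
csv_text <- paste(
  "Exported on 2020-01-01",
  "Source: example",
  "country,capital",
  "Albania,Tirana",
  sep = "\n"
)

df_skip <- read.csv(text = csv_text, skip = 2)

names(df_skip)  # the header is the first line after the skipped ones
nrow(df_skip)
```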
encoding: it is used to mark character strings as known to be in Latin-1 or UTF-8. For example, if you find problems loading data from Spain, you may need to set the related parameter fileEncoding = "utf-8", which tells R the encoding of the file itself.
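Here is a rough sketch of fileEncoding in action. I build a tiny Latin-1 file byte by byte (the content is invented) and then tell R how it is encoded. It assumes your session can display "Å", for example because it runs in a UTF-8 locale.

```r
# Write a small Latin-1 file as raw bytes; 0xC5 is "Å" in Latin-1.
path <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("country\n"), as.raw(0xC5),
           charToRaw("land Islands\n")), path)

# Declaring the file's encoding lets R convert the text while reading.
latin1_df <- read.csv(path, fileEncoding = "latin1",
                      stringsAsFactors = FALSE)
latin1_df$country

unlink(path)  # remove the temporary file
```

If the accented characters come out garbled, the declared encoding probably does not match the one the file was saved with.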
Those are the most important parameters for this function. Is there another one that you use regularly? Let us know in the comments!