How to read a csv file in R

Hey! Today I want to show you how to read a CSV file in R. We will also see which arguments we can use to control how the file is read. We will work with a CSV file that you can download from this link.

To load the data into a data frame, we start by calling the read.csv() function:

df <- read.csv(
    "Countries.csv",
    header = TRUE,
    col.names = c("maptools", "maps", "gapminder"),
    sep = ",",
    encoding = "UTF-8",
    stringsAsFactors = FALSE
)
## Warning in read.table(file = file, header = header, sep = sep, quote = quote, :
## header and 'col.names' are of different lengths
head(df)
##         maptools            maps   gapminder
## 1    Afghanistan     Afghanistan Afghanistan
## 2  Åland Islands            <NA>        <NA>
## 3        Albania         Albania     Albania
## 4        Algeria         Algeria     Algeria
## 5 American Samoa  American Samoa       Samoa
## 6           <NA> Andaman Islands        <NA>

In this call we have used quite a few parameters, for example header, col.names and sep. Don't worry about them, because I'm going to explain what each one means.

col.names :

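We can change the column names with this parameter. In this case we have written the same names that the CSV file already has. If the number of names differs from the number of fields in the header line, R emits the warning shown above.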
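To see col.names in isolation, here is a minimal sketch. It does not use Countries.csv: it assumes a small made-up file written to a temporary path, with a header line of three fields, so the three supplied names match and no warning appears.

```r
# Hypothetical three-column file, created only for this illustration.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c",
             "Afghanistan,Afghanistan,Afghanistan",
             "Albania,Albania,Albania"), path)

# The header line is skipped and replaced by the names given in col.names.
df <- read.csv(path,
               header = TRUE,
               col.names = c("maptools", "maps", "gapminder"))
names(df)
```

Because header = TRUE, the original first line (a, b, c) is consumed as the header and does not show up as a data row.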

sep:

If the data is not split into columns correctly, we can tell R which separator the file uses. In this case we have used "," because the file is comma-separated. That is also the default for read.csv(), so here the argument is optional.
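A quick way to see why sep matters is to read a file that uses a different separator. The sketch below assumes a made-up semicolon-separated file (the column names and values are placeholders, not part of Countries.csv).

```r
# Hypothetical semicolon-separated file, created only for this illustration.
path <- tempfile(fileext = ".csv")
writeLines(c("name;value",
             "first;1",
             "second;2"), path)

wrong <- read.csv(path)             # default sep = ",": everything lands in one column
right <- read.csv(path, sep = ";")  # the separator the file actually uses

ncol(wrong)
ncol(right)
names(right)
```

With the default separator each line is treated as a single field, so a one-column data frame is usually the first sign that sep is wrong.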

stringsAsFactors :

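If you are working with categorical data, this parameter is very useful. In R versions before 4.0.0, read.csv() converted character strings into factors by default, so to avoid that we had to set this parameter to FALSE. Since R 4.0.0 the default is already FALSE, and you set it to TRUE when you do want factors.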
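The difference is easy to check by reading the same file twice. This is a small sketch on a made-up file (the country and continent columns are assumptions for illustration), setting stringsAsFactors explicitly so the result does not depend on the R version.

```r
# Hypothetical file with a categorical column, created only for this illustration.
path <- tempfile(fileext = ".csv")
writeLines(c("country,continent",
             "Spain,Europe",
             "Kenya,Africa",
             "France,Europe"), path)

as_chr <- read.csv(path, stringsAsFactors = FALSE)  # text stays character
as_fct <- read.csv(path, stringsAsFactors = TRUE)   # text becomes factor

class(as_chr$continent)
class(as_fct$continent)
nlevels(as_fct$continent)  # number of distinct continents
```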

na.strings :

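It tells R which values in your CSV file should be read as NA. A field is replaced only when its whole content matches one of the strings: if you try read.csv("Countries.csv", na.strings = "A") you'll see that every field that is exactly A becomes NA, while a value such as Albania is left alone.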
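The whole-field matching can be verified with a tiny made-up file (the code and name columns below are assumptions for illustration only):

```r
# Hypothetical file, created only for this illustration.
path <- tempfile(fileext = ".csv")
writeLines(c("code,name",
             "A,Alpha",
             "B,Beta",
             "A,Gamma"), path)

df <- read.csv(path, na.strings = "A", stringsAsFactors = FALSE)

is.na(df$code)  # the fields that were exactly "A" are now NA
df$name         # "Alpha" only starts with A, so it is kept
```

na.strings also accepts a vector, for example na.strings = c("", "NA", "-"), when a file uses several markers for missing values.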

colClasses:

With this parameter you can choose the class of each column, giving one value per column in order, as in this example:

colClasses = c("character", "complex", "factor", "integer", "numeric", "Date", "logical")

If you only want to set the class of one variable, you can name it like this, and the remaining columns are detected automatically:

colClasses = c("variableName" = "character")
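A typical reason to set colClasses is to stop R from guessing a type you do not want. The sketch below assumes a made-up file with an id, a zip code with leading zeros, and a date in YYYY-MM-DD format; none of these columns come from Countries.csv.

```r
# Hypothetical file, created only for this illustration.
path <- tempfile(fileext = ".csv")
writeLines(c("id,zip,joined",
             "1,01234,2020-01-31",
             "2,98765,2021-06-15"), path)

guessed <- read.csv(path)
typed   <- read.csv(path,
                    colClasses = c(zip = "character", joined = "Date"))

class(guessed$zip)   # guessed as a number, so the leading zero is lost
typed$zip            # kept as text, leading zero intact
class(typed$joined)
class(typed$id)      # not named in colClasses, so still detected automatically
```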

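encoding:

The encoding argument marks character strings as known to be in Latin-1 or UTF-8; it does not convert anything. If accented characters still come out garbled, for example when loading data from Spain, you may need fileEncoding instead (for instance fileEncoding = "UTF-8" or fileEncoding = "latin1"), which declares the encoding of the file so that R re-encodes it while reading.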
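As a rough illustration of the marking behaviour, the sketch below writes a made-up UTF-8 file byte by byte (so the result does not depend on your locale) and reads it with encoding = "UTF-8". The city column and its value are assumptions for illustration.

```r
# Hypothetical UTF-8 file containing one accented value ("A Coruña").
path <- tempfile(fileext = ".csv")
writeBin(charToRaw("city\nA Coru\u00f1a\n"), path)

df <- read.csv(path, encoding = "UTF-8", stringsAsFactors = FALSE)

df$city
Encoding(df$city)  # how R has marked the string
```

For a file saved in a different encoding than the one your session uses, fileEncoding is usually the argument to try; its result depends on the encodings your platform supports.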

These are the most important parameters of this function. Is there another one that you use often? Tell us about it in the comments!