Examples

Importing Shapefiles into R

Yield data is often stored in shapefiles. To import shapefiles into R, use the readOGR() function from the rgdal package. Each shapefile must be imported individually before the results can be merged into a single data frame. For each file, provide the path to the folder containing the shapefile (dsn) and the shapefile’s layer, which is usually the file name without the .shp extension. The example below loads the data contained in three shapefiles (field_combine1, field_combine2, and field_combine3).

# Load the rgdal package
library(rgdal)

# Import three shapefiles into R
dataset1 <- readOGR(dsn = path.expand("/Users/username/Desktop/Field"),
                    layer = "field_combine1")
dataset2 <- readOGR(dsn = path.expand("/Users/username/Desktop/Field"),
                    layer = "field_combine2")
dataset3 <- readOGR(dsn = path.expand("/Users/username/Desktop/Field"),
                    layer = "field_combine3")

# Merge the three shapefiles into a single data frame
field <- rbind(data.frame(dataset1),
               data.frame(dataset2),
               data.frame(dataset3))
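
When a field is split across more than a few files, the merge step can be written once for any number of data frames. The following is a minimal sketch of that idea. It uses small made-up data frames in place of the imported shapefiles, and the names parts and field_toy are hypothetical.

```r
# Toy stand-ins for data.frame(dataset1), data.frame(dataset2), ...
# The values are invented; real yield files have many more columns.
parts <- list(
  data.frame(Yld_Vol_Dr = c(180.2, 175.9),
             coords.x1 = c(-93.10, -93.11),
             coords.x2 = c(42.00, 42.00)),
  data.frame(Yld_Vol_Dr = c(169.4, 172.0),
             coords.x1 = c(-93.12, -93.13),
             coords.x2 = c(42.01, 42.01)),
  data.frame(Yld_Vol_Dr = 181.7,
             coords.x1 = -93.14,
             coords.x2 = 42.02)
)

# Bind every data frame in the list row-wise.
# All data frames must share the same column names.
field_toy <- do.call(rbind, parts)
nrow(field_toy)
```

Because rbind() matches data frames by column name, this only works when every file was recorded with the same set of attributes.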

Cleaning Yield Data with Defaults

# Load cydr
library(cydr)

# Clean data with the default settings; each step adds a cydr error column
clean_field <- narrow_passes(field)
clean_field <- pass_end_turns(clean_field)
clean_field <- speed(clean_field)
clean_field <- residual_outliers(clean_field)

Summarizing Errors

It is often useful to combine the cydr error columns into a single column, which makes it easy to tell whether an observation is erroneous. This helps with tasks such as filtering out erroneous observations, visualizing where errors occur, and comparing errors with non-errors.

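# Load dplyr for %>% and mutate()
library(dplyr)

# Flag an observation if any cydr check marked it as an error
clean_field <- clean_field %>%
  mutate(cydr_Error = cydr_NarrowPassError |
                      cydr_PassEndError |
                      cydr_SpeedError |
                      cydr_ResidualError)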
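
With a single error column in place, counting errors takes one line. The sketch below illustrates this on a toy data frame with made-up flag values. It assumes the cydr error columns are logical with no missing values, and the names toy, n_errors, and prop_error are hypothetical.

```r
# Toy data frame standing in for cleaned output; the flag values are made up
toy <- data.frame(
  cydr_NarrowPassError = c(FALSE, TRUE,  FALSE, FALSE, FALSE),
  cydr_PassEndError    = c(FALSE, FALSE, FALSE, TRUE,  FALSE),
  cydr_SpeedError      = c(FALSE, TRUE,  FALSE, FALSE, FALSE),
  cydr_ResidualError   = c(FALSE, FALSE, FALSE, FALSE, FALSE)
)

# An observation is erroneous if any individual check flagged it
toy$cydr_Error <- toy$cydr_NarrowPassError |
  toy$cydr_PassEndError |
  toy$cydr_SpeedError |
  toy$cydr_ResidualError

n_errors   <- sum(toy$cydr_Error)   # number of flagged observations
prop_error <- mean(toy$cydr_Error)  # share of flagged observations
```

An observation flagged by two checks, such as the second row here, is still counted once in the combined column.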
 

Removing Errors

Whether you remove errors from the dataset or attempt to correct them depends on how you plan to analyze the yield data. This example removes the errors from the dataset using dplyr’s filter() function.

# Remove errors from the cleaned dataset
clean_field <- clean_field %>%
  filter(!cydr_Error)
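
If the rows themselves need to be kept, for example to preserve the spatial layout of the passes, one simple alternative to removal is to blank out the flagged yield values. The sketch below compares the two options on a toy data frame with made-up values. Masking with NA is an illustration of one possible approach, not a cydr feature, and the names toy, removed, and masked are hypothetical.

```r
# Toy data frame with made-up yields and a combined error flag
toy <- data.frame(
  Yld_Vol_Dr = c(182.4, 12.1, 176.8, 410.3),
  cydr_Error = c(FALSE, TRUE, FALSE, TRUE)
)

# Option 1: drop flagged rows (base R equivalent of filter(!cydr_Error))
removed <- toy[!toy$cydr_Error, ]

# Option 2: keep every row but set the flagged yield values to NA
masked <- toy
masked$Yld_Vol_Dr[masked$cydr_Error] <- NA
```

With the second option, later summaries need na.rm = TRUE, as in mean(masked$Yld_Vol_Dr, na.rm = TRUE).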

Plotting a Yield Map

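# Load ggplot2 for plotting
library(ggplot2)

# Plot each observation at its coordinates, coloured by yield
ggplot(clean_field, aes(coords.x1,
                        coords.x2,
                        colour = Yld_Vol_Dr)) +
  geom_point(alpha = 0.5) +
  coord_quickmap() +
  theme_minimal() +
  scale_colour_distiller(type = "div", palette = "PRGn", direction = 1)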

Exporting to CSV

Once cydr has been used to clean the yield data, data analysis can begin. Users can continue working with the data in R, or they can export it for use with other software. The example below exports a data frame to a CSV file. Upon export, a new file called cleaned_field.csv is created in the current working directory.

# Export to CSV
write.csv(clean_field, "cleaned_field.csv")
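
By default, write.csv() also writes the row names as an unnamed first column, which other software may read as an extra variable. The sketch below shows one way to leave them out and confirm the result by reading the file back. It uses a toy data frame with made-up values and writes to a temporary folder, and the names toy, out_file, and check are hypothetical.

```r
# Toy data frame standing in for clean_field; the values are made up
toy <- data.frame(
  Yld_Vol_Dr = c(182.4, 176.8),
  cydr_Error = c(FALSE, FALSE)
)

# Write to a temporary folder so the working directory is left untouched
out_file <- file.path(tempdir(), "cleaned_field.csv")

# row.names = FALSE leaves out the row-name column
write.csv(toy, out_file, row.names = FALSE)

# Read the file back to confirm the columns survived the round trip
check <- read.csv(out_file)
names(check)
```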