Here’s an example of creating a data frame in R. Let’s consider a simple dataset for demonstration purposes.
# Create a data frame
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)
# Print the data frame
print(data)
When you run this code, it creates a data frame named data with three columns (Name, Age, and Country) and four rows, each representing a person. The output will look like this:
   Name Age   Country
1  John  28       USA
2  Anna  24        UK
3 Peter  35 Australia
4 Linda  32   Germany
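To check the shape described above without reading the printed table, base R offers a few inspection functions. The following is a minimal sketch that rebuilds the same data frame; the variable names shape and cols are chosen here only for illustration.

```r
# Recreate the data frame from the example above
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)

# Number of rows and columns, in that order
shape <- dim(data)

# Column names as a character vector
cols <- names(data)

# Compact summary of each column's type and first values
str(data)
```

Note that since R 4.0.0, data.frame keeps text columns as character vectors by default; in older versions they were converted to factors unless stringsAsFactors = FALSE was passed.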
Adding More Data
You can add more data to your data frame by using the rbind function for rows or the cbind function for columns. Here’s how to add a row with rbind:
# Create an initial data frame
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)
# Create a new row to add
new_row <- data.frame(
  Name = "Emily",
  Age = 27,
  Country = "Canada"
)
# Add the new row to the data frame
data <- rbind(data, new_row)
# Print the updated data frame
print(data)
The output will now include Emily:
   Name Age   Country
1  John  28       USA
2  Anna  24        UK
3 Peter  35 Australia
4 Linda  32   Germany
5 Emily  27    Canada
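The example above covers rbind for rows. For columns, a rough sketch with cbind might look like the following. The column name Over30 and the variable names over_30 and data_wide are assumptions made for this illustration; the values are derived from the existing Age column rather than new data.

```r
# Recreate the original four-row data frame
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)

# Derive a logical vector from an existing column (one value per row)
over_30 <- data$Age > 30

# cbind attaches the vector as a new column named Over30 (illustrative name)
data_wide <- cbind(data, Over30 = over_30)

print(data_wide)
```

Keep in mind that rbind matches data frame columns by name, so the new row needs the same column names as the original, while cbind expects the new column to have one value per existing row.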
Adding a New Column
To add a new column, you can use the $ operator to assign values directly to a new column name in your data frame.
# Create an initial data frame
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)
# Add a new column for Occupation
data$Occupation <- c("Engineer", "Doctor", "Teacher", "Lawyer")
# Print the updated data frame
print(data)
The output will now include an Occupation column:
   Name Age   Country Occupation
1  John  28       USA   Engineer
2  Anna  24        UK     Doctor
3 Peter  35 Australia    Teacher
4 Linda  32   Germany     Lawyer
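The $ operator also accepts values computed from existing columns, and it refuses a vector whose length does not fit the number of rows. The sketch below illustrates both points; AgeInMonths, Bad, and result are hypothetical names used only here.

```r
# Recreate the original four-row data frame
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)

# Compute a new column from an existing one (AgeInMonths is an illustrative name)
data$AgeInMonths <- data$Age * 12

# Assigning three values to a four-row data frame raises an error,
# which tryCatch turns into a plain string here
result <- tryCatch(
  {
    data$Bad <- c(1, 2, 3)
    "assigned"
  },
  error = function(e) "error"
)

print(data)
print(result)
```

A single value, by contrast, is recycled across all rows, so data$Source <- "survey" would fill every row with the same string.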
Basic Data Frame Operations
You can perform various operations on data frames, such as filtering, sorting, and grouping, using base R functions like subset, order, and aggregate, or by using packages like dplyr, which provides a grammar of data manipulation.
# Filtering
filtered_data <- subset(data, Age > 30)
# Sorting
sorted_data <- data[order(data$Age), ]
# Using dplyr for more complex operations
library(dplyr)
grouped_data <- data %>%
  group_by(Country) %>%
  summarise(AvgAge = mean(Age))
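The text mentions aggregate, but the code above groups only with dplyr. As a rough base R counterpart that needs no extra packages, the sketch below filters with logical indexing, sorts in descending order, and computes the mean Age per Country with aggregate. The variable names older, by_age_desc, and avg_age are chosen for this illustration.

```r
# Recreate the original four-row data frame
data <- data.frame(
  Name = c("John", "Anna", "Peter", "Linda"),
  Age = c(28, 24, 35, 32),
  Country = c("USA", "UK", "Australia", "Germany")
)

# Filtering with logical indexing, equivalent to subset(data, Age > 30)
older <- data[data$Age > 30, ]

# Sorting by Age from oldest to youngest
by_age_desc <- data[order(data$Age, decreasing = TRUE), ]

# Grouping: mean Age for each Country, using the formula interface
avg_age <- aggregate(Age ~ Country, data = data, FUN = mean)

print(avg_age)
```

Because each country appears only once in this small dataset, every group mean simply equals that person's age; with repeated countries the same call would average across them.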
These are just basic examples of working with data frames in R. Data frames are powerful and flexible, allowing you to store and manipulate tabular data efficiently.