lead or lag function to get several values, not just the nth
I have a tibble with one word per row. I want to create a new variable from a function that searches for a keyword and, if it finds the keyword, creates a string composed of the keyword plus-and-minus 3 words.
The code below is close, but, rather than grabbing all three words before and after my keyword, it grabs the single word 3 ahead/behind.
library(dplyr)

df <- tibble(words = c("it", "was", "the", "best", "of", "times",
                       "it", "was", "the", "worst", "of", "times"))
df <- df %>% mutate(chunks = ifelse(words == "times",
                                    paste(lag(words, 3),
                                          words,
                                          lead(words, 3), sep = " "),
                                    NA))
The most intuitive solution would be for lead or lag to accept a vector of offsets, something like lead(words, 1:3), but that doesn't work.
Obviously I could pretty quickly do this by hand (paste(lead(words,3), lead(words,2), lead(words,1), ..., lag(words,3))), but I'll eventually want to grab the keyword plus-and-minus 50 words, which is too much to hand-code.
A solution within the tidyverse would be ideal, but any solution would be helpful. Any help would be appreciated.
r dplyr lag lead
edited 5 hours ago
asked 5 hours ago
wscampbell
4 Answers
One option would be sapply:
library(dplyr)
df %>%
  mutate(
    chunks = ifelse(words == "times",
                    sapply(1:nrow(.),
                           function(x) paste(words[pmax(1, x - 3):pmin(x + 3, nrow(.))], collapse = " ")),
                    NA)
  )
Output:
# A tibble: 12 x 2
words chunks
<chr> <chr>
1 it NA
2 was NA
3 the NA
4 best NA
5 of NA
6 times the best of times it was the
7 it NA
8 was NA
9 the NA
10 worst NA
11 of NA
12 times the worst of times
Although sapply is not an explicit lead or lag function, it can often serve the same purpose.
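Since the question mentions eventually needing plus-and-minus 50 words, the same clamped-window idea can be wrapped in a small base R helper with the window size as a parameter. This is only a sketch: the name window_chunks and its arguments are hypothetical, not part of the answer above, and it only visits the rows that match the keyword.

```r
# Hypothetical helper: paste the words within `n` positions of each
# occurrence of `keyword`; every other position gets NA.
window_chunks <- function(words, keyword, n = 3) {
  len <- length(words)
  out <- rep(NA_character_, len)
  hits <- which(words == keyword)          # positions holding the keyword
  out[hits] <- vapply(hits, function(i) {
    idx <- max(1, i - n):min(len, i + n)   # clamp the window to valid positions
    paste(words[idx], collapse = " ")
  }, character(1))
  out
}

words <- c("it", "was", "the", "best", "of", "times",
           "it", "was", "the", "worst", "of", "times")
chunks <- window_chunks(words, "times", n = 3)
chunks[6]   # "the best of times it was the"
chunks[12]  # "the worst of times"
```

Inside a pipeline this could be used as mutate(chunks = window_chunks(words, "times", n = 50)).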
answered 4 hours ago
arg0naut
Works perfectly, arg0naut. Thanks a bunch! Really, really helpful
– wscampbell
4 hours ago
You're welcome! Consider accepting the answer if it helped.
– arg0naut
4 hours ago
Similar to @arg0naut but without dplyr:
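r = 1:nrow(df)                   # all valid row indices
w = which(df$words == "times")   # rows that hold the keyword
# window of row indices around each keyword row, clipped to valid rows
wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))
df$chunks <- NA_character_
# paste each window's words together, grouped by the keyword row it belongs to
df$chunks[w] <- tapply(df$words[unlist(wm)], rep(w, lengths(wm)), FUN = paste, collapse=" ")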
# A tibble: 12 x 2
words chunks
<chr> <chr>
1 it <NA>
2 was <NA>
3 the <NA>
4 best <NA>
5 of <NA>
6 times the best of times it was the
7 it <NA>
8 was <NA>
9 the <NA>
10 worst <NA>
11 of <NA>
12 times the worst of times
The data.table translation:
library(data.table)
DT = data.table(df)
w = DT["times", on="words", which=TRUE]
wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))
DT[w, chunks := DT[unlist(wm), paste(words, collapse=" "), by=rep(w, lengths(wm))]$V1]
answered 4 hours ago
Frank
data.table::shift accepts a vector for the n (lag) argument and outputs a list, so you can pass that list to do.call(paste, ...) to paste its elements together. However, unless you're on data.table version >= 1.12, I don't think it will let you mix negative and positive n values (as below).
With data.table:
library(data.table)
setDT(df)
df[, chunks := trimws(ifelse(words != "times", NA, do.call(paste, shift(words, 3:-3, ''))))]
# words chunks
# 1: it <NA>
# 2: was <NA>
# 3: the <NA>
# 4: best <NA>
# 5: of <NA>
# 6: times the best of times it was the
# 7: it <NA>
# 8: was <NA>
# 9: the <NA>
# 10: worst <NA>
# 11: of <NA>
# 12: times the worst of times
With dplyr and only using data.table for the shift function:
library(dplyr)
df %>%
  mutate(chunks = do.call(paste, data.table::shift(words, 3:-3, fill = '')),
         chunks = trimws(ifelse(words != "times", NA, chunks)))
# # A tibble: 12 x 2
# words chunks
# <chr> <chr>
# 1 it NA
# 2 was NA
# 3 the NA
# 4 best NA
# 5 of NA
# 6 times the best of times it was the
# 7 it NA
# 8 was NA
# 9 the NA
# 10 worst NA
# 11 of NA
# 12 times the worst of times
edited 3 hours ago
answered 4 hours ago
IceCreamToucan
Here is another tidyverse solution using lag and lead:
library(dplyr)
library(tidyr)  # for unite()
# build one named call string per offset, e.g. a lag of 3 with default = ''
laglead_f <- function(what, range)
  setNames(paste(what, "(., ", range, ", default = '')"), paste(what, range))
# note: funs_() is deprecated in recent dplyr releases
df %>%
  mutate_at(vars(words), funs_(c(laglead_f("lag", 3:0), laglead_f("lead", 1:3)))) %>%
  unite(chunks, -words, sep = " ") %>%
  mutate(chunks = ifelse(words == "times", trimws(chunks), NA))
## A tibble: 12 x 2
# words chunks
# <chr> <chr>
# 1 it NA
# 2 was NA
# 3 the NA
# 4 best NA
# 5 of NA
# 6 times the best of times it was the
# 7 it NA
# 8 was NA
# 9 the NA
#10 worst NA
#11 of NA
#12 times the worst of times
The idea is to store the lagged and leading vectors (lags 3 to 0 and leads 1 to 3) in new columns with mutate_at and a named set of functions, unite those columns into a single string, and then keep that string only where words == "times", setting every other row to NA.
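The same one-vector-per-offset idea can be sketched without building function strings, assuming purrr is installed alongside dplyr. This is not part of the original answer: the variable k (the window half-width) and the result name res are assumptions introduced for illustration.

```r
library(dplyr)
library(purrr)

k <- 3  # window half-width (assumed name); set to 50 for the larger case

df <- tibble(words = c("it", "was", "the", "best", "of", "times",
                       "it", "was", "the", "worst", "of", "times"))

res <- df %>%
  mutate(
    # list of 2k + 1 vectors: lags k..1, the word itself, leads 1..k
    chunks = c(map(k:1, ~ lag(words, .x, default = "")),
               list(words),
               map(1:k, ~ lead(words, .x, default = ""))) %>%
      pmap_chr(paste) %>%   # paste the vectors together position by position
      trimws(),             # drop the padding left by the "" defaults
    chunks = if_else(words == "times", chunks, NA_character_)
  )

res$chunks[6]   # "the best of times it was the"
res$chunks[12]  # "the worst of times"
```

Because the empty-string defaults only ever appear at the outer edges of a window, trimws is enough to clean up rows near the start or end of the data.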
edited 3 hours ago
answered 3 hours ago
Maurits Evers
This is SUPER clever. Thanks a bunch, Maurits. This is really great and keeps things inside the tidyverse. Really grateful.
– wscampbell
1 hour ago
You're very welcome @wscampbell and thanks for an interesting question!
– Maurits Evers
1 hour ago