Rank groups within a grouped sequence of TRUE/FALSE and NA





I have a little nut to crack.



I have a data.frame like this:



   group criterium
1      A        NA
2      A      TRUE
3      A      TRUE
4      A      TRUE
5      A     FALSE
6      A     FALSE
7      A      TRUE
8      A      TRUE
9      A     FALSE
10     A      TRUE
11     A      TRUE
12     A      TRUE
13     B        NA
14     B     FALSE
15     B      TRUE
16     B      TRUE
17     B      TRUE
18     B     FALSE

structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE,
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE,
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA,
-18L))



I want to number the runs of consecutive TRUE values in the column criterium in ascending order, disregarding FALSE and NA. The goal is a run identifier that is unique within each level of group.



So the result should look like:



   group criterium goal
1      A        NA   NA
2      A      TRUE    1
3      A      TRUE    1
4      A      TRUE    1
5      A     FALSE   NA
6      A     FALSE   NA
7      A      TRUE    2
8      A      TRUE    2
9      A     FALSE   NA
10     A      TRUE    3
11     A      TRUE    3
12     A      TRUE    3
13     B        NA   NA
14     B     FALSE   NA
15     B      TRUE    1
16     B      TRUE    1
17     B      TRUE    1
18     B     FALSE   NA



I'm sure there is a relatively easy way to do this; I just can't think of one. I experimented with dense_rank() and other dplyr window functions, but to no avail.



Thanks for the help!










  • you can just about grab what you need with this work of beauty: as.numeric(as.factor(cumsum(is.na(d$criterium^NA)) + d$criterium^NA)) -- it just needs to be applied by group

    – user20650
    19 hours ago













  • that is a really funny solution. Very good job!

    – Humpelstielzchen
    19 hours ago











  • In your example all of group A comes first, then group B. We don't need to handle cases with group=A, criterium=TRUE interspersed with group=B, criterium=TRUE?

    – smci
    19 hours ago











  • No, when group A stops so stops the sequence for group A.

    – Humpelstielzchen
    19 hours ago













  • But I'm suggesting if you construct an example with group=A, criterium=TRUE followed by group=B, criterium=TRUE (with no FALSE's in-between), would that get a new 'goal' number or not? Some of the answers here will fail because they don't group-by group or consider the discontinuity in group.

    – smci
    19 hours ago
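
For reference, the one-liner from the first comment can be applied per group in base R. The following is a minimal sketch rather than anything posted in the thread: the helper name run_id and the toy data are hypothetical, and the toy data deliberately contains the edge case raised in the comments (group A ends with TRUE and group B starts with TRUE, with no FALSE in between).

```r
# Toy data (made up for illustration): A ends with TRUE, B starts with TRUE.
d <- data.frame(
  group     = factor(c("A", "A", "A", "A", "B", "B", "B")),
  criterium = c(NA, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Hypothetical helper wrapping the expression from the comment.
# x^NA is 1 for TRUE and NA for FALSE/NA, because 1^NA is 1 in R.
run_id <- function(x) {
  keep <- x^NA
  as.numeric(as.factor(cumsum(is.na(keep)) + keep))
}

# Apply the helper to each group and restore the original row order.
d$goal <- unsplit(lapply(split(d$criterium, d$group), run_id), d$group)
d
```

With this input, goal comes out as NA 1 NA 2 1 1 NA: the numbering restarts in group B because the helper never sees rows from another group.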




















r dplyr data.table rank






edited 19 hours ago by Humpelstielzchen

asked 21 hours ago by Humpelstielzchen








4 Answers
Another data.table approach:



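library(data.table)
setDT(dt)  # dt is the data.frame from the question; setDT() converts it by reference
# cr: one id per run of identical values over the whole criterium column
dt[, cr := rleid(criterium)][
    # TRUE rows only (NA in i counts as FALSE): renumber the runs 1, 2, ... within each group
    (criterium), goal := rleid(cr), by = .(group)]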
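
To see what the two rleid() calls do, here is a rough base-R sketch on the values of a single group. run_index() is a hypothetical stand-in for rleid(), written with rle() so that the example runs without data.table. Note that rle() puts every NA in a run of its own; that makes no difference here because only the TRUE rows are renumbered.

```r
# Hypothetical stand-in for a run-id function: one integer id per run.
run_index <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), times = r$lengths)
}

x <- c(NA, TRUE, TRUE, FALSE, TRUE, FALSE)   # made-up values for one group

cr <- run_index(x)                 # step 1: id for every run, TRUE or not
is_true <- !is.na(x) & x           # rows to keep (NA counts as FALSE)

goal <- rep(NA_integer_, length(x))
goal[is_true] <- run_index(cr[is_true])   # step 2: renumber only the TRUE runs
goal
```

Step 1 gives 1 2 2 3 4 5, and renumbering the ids of the TRUE rows (2 2 4) gives 1 1 2, so goal is NA 1 1 NA 2 NA.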





  • Tried with rleid but didn't get it to work. (+1)

    – markus
    19 hours ago











  • works for me. And seems to be the most elegant answer.

    – Humpelstielzchen
    19 hours ago



















Maybe I have over-complicated this but one way with dplyr is



library(dplyr)

df %>%
  # temp: criterium with NA treated as FALSE; temp1: block id that increases at every non-TRUE row
  mutate(temp = replace(criterium, is.na(criterium), FALSE),
         temp1 = cumsum(!temp)) %>%
  group_by(temp1) %>%
  # flag the first TRUE row of each block with 1, all other rows with 0
  mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
  group_by(group) %>%
  # running count of the flags within each group; non-TRUE rows become NA
  mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
  select(-temp, -temp1)

#    group criterium  goal
#    <fct> <lgl>     <int>
#  1 A     NA           NA
#  2 A     TRUE          1
#  3 A     TRUE          1
#  4 A     TRUE          1
#  5 A     FALSE        NA
#  6 A     FALSE        NA
#  7 A     TRUE          2
#  8 A     TRUE          2
#  9 A     FALSE        NA
# 10 A     TRUE          3
# 11 A     TRUE          3
# 12 A     TRUE          3
# 13 B     NA           NA
# 14 B     FALSE        NA
# 15 B     TRUE          1
# 16 B     TRUE          1
# 17 B     TRUE          1
# 18 B     FALSE        NA


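We first replace the NAs in the criterium column with FALSE (temp) and take the cumulative sum of its negation (temp1), so every FALSE or NA row starts a new block and the TRUE rows that follow it share that block's number. We group_by temp1 and assign 1 to the first TRUE row of each block (0 to all other rows). Finally, grouping by group, the cumulative sum of these flags numbers the runs of TRUE, and FALSE and NA rows get NA. Note that temp1 is computed before grouping by group, so a run of TRUE that continues straight across a group boundary (no FALSE or NA in between) is not counted as a new run in the second group.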
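
The same run-start idea can be written more compactly in base R. This is a sketch under my own naming (count_runs is hypothetical, not part of the answer above): a TRUE row starts a run when the previous row in the same group is not TRUE, and the cumulative sum of those starts numbers the runs. Because the start flags are computed inside each group, the numbering restarts at every group.

```r
# Data from the question.
df <- data.frame(
  group = factor(rep(c("A", "B"), times = c(12, 6))),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Hypothetical helper: number the runs of TRUE in one group's values.
count_runs <- function(x) {
  x <- !is.na(x) & x                       # treat NA like FALSE
  starts <- x & !c(FALSE, head(x, -1))     # TRUE where a run of TRUE begins
  ifelse(x, cumsum(starts), NA_integer_)   # run number for TRUE rows, NA otherwise
}

# Apply per group and restore the original row order.
df$goal <- unsplit(lapply(split(df$criterium, df$group), count_runs), df$group)
df
```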






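A pure base R solution: we can create a custom function via rle and apply it per group, i.e.



f1 <- function(x) {
  x[is.na(x)] <- FALSE                      # treat NA like FALSE
  rle1 <- rle(x)                            # run-length encode the logical vector
  y <- rle1$values                          # TRUE for runs of TRUE
  rle1$values[!y] <- 0                      # non-TRUE runs get 0
  rle1$values[y] <- cumsum(rle1$values[y])  # TRUE runs get 1, 2, 3, ...
  return(inverse.rle(rle1))                 # expand back to the original length
}


# apply f1 to each group, then set the rows that are not TRUE to NA
do.call(rbind,
        lapply(split(df, df$group), function(i) {
          i$goal <- f1(i$criterium)
          i$goal <- replace(i$goal, is.na(i$criterium) | !i$criterium, NA)
          i
        }))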
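
A quick check on a short vector shows why the extra replace() step is needed: f1() on its own returns 0, not NA, for the rows that are not TRUE. The input vector x is made up for illustration, and f1 is repeated from above so the snippet runs on its own.

```r
# f1 as defined in the answer above.
f1 <- function(x) {
  x[is.na(x)] <- FALSE
  rle1 <- rle(x)
  y <- rle1$values
  rle1$values[!y] <- 0
  rle1$values[y] <- cumsum(rle1$values[y])
  inverse.rle(rle1)
}

x <- c(NA, TRUE, TRUE, FALSE, TRUE)       # hypothetical input

raw  <- f1(x)                             # 0 1 1 0 2
goal <- replace(raw, is.na(x) | !x, NA)   # NA 1 1 NA 2
```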


Of course, if you want, you can apply it via dplyr, i.e.



library(dplyr)

df %>%
  group_by(group) %>%
  mutate(goal = f1(criterium),
         goal = replace(goal, is.na(criterium) | !criterium, NA))


which gives,




# A tibble: 18 x 3
# Groups:   group [2]
   group criterium  goal
   <fct> <lgl>     <dbl>
 1 A     NA           NA
 2 A     TRUE          1
 3 A     TRUE          1
 4 A     TRUE          1
 5 A     FALSE        NA
 6 A     FALSE        NA
 7 A     TRUE          2
 8 A     TRUE          2
 9 A     FALSE        NA
10 A     TRUE          3
11 A     TRUE          3
12 A     TRUE          3
13 B     NA           NA
14 B     FALSE        NA
15 B     TRUE          1
16 B     TRUE          1
17 B     TRUE          1
18 B     FALSE        NA






A data.table option using rle



library(data.table)
DT <- as.data.table(dat)  # dat is defined under "data" below
DT[, goal := {
  r <- rle(replace(criterium, is.na(criterium), FALSE))
  r$values <- with(r, cumsum(values) * values)
  out <- inverse.rle(r)
  replace(out, out == 0, NA)
}, by = group]
DT
#     group criterium goal
#  1:     A        NA   NA
#  2:     A      TRUE    1
#  3:     A      TRUE    1
#  4:     A      TRUE    1
#  5:     A     FALSE   NA
#  6:     A     FALSE   NA
#  7:     A      TRUE    2
#  8:     A      TRUE    2
#  9:     A     FALSE   NA
# 10:     A      TRUE    3
# 11:     A      TRUE    3
# 12:     A      TRUE    3
# 13:     B        NA   NA
# 14:     B     FALSE   NA
# 15:     B      TRUE    1
# 16:     B      TRUE    1
# 17:     B      TRUE    1
# 18:     B     FALSE   NA


step by step



Calling r <- rle(replace(criterium, is.na(criterium), FALSE)) gives an object of class rle. For illustration, the output below comes from running the steps on the whole criterium column, without grouping:



r
#Run Length Encoding
#  lengths: int [1:9] 1 3 2 2 1 3 2 3 1
#  values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...


We manipulate the values component in the following way



r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
#  lengths: int [1:9] 1 3 2 2 1 3 2 3 1
#  values : int [1:9] 0 1 0 2 0 3 0 4 0


That is, we replaced the TRUEs with the cumulative sum of values and set the FALSEs to 0. Now inverse.rle returns a vector in which each element of values is repeated lengths times



out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0


This is almost what OP wants, but we need to replace the 0s with NA



replace(out, out == 0, NA)


Because of by = group, this is done for each group separately, so the numbering restarts at 1 in group B (instead of the 4 seen in the ungrouped illustration above).



data



dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE,
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE,
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA,
-18L))





      • Wow, impressive. Thanks for introducing me to rle and inverse.rle. Greetings to Leipzig.

        – Humpelstielzchen
        20 hours ago






      • @Humpelstielzchen You're welcome. Will try to simplify and explain the logic a bit.

        – markus
        20 hours ago













      • Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a devil of a fellow. ^^

        – Humpelstielzchen
        19 hours ago












answered 20 hours ago by chinsoon12 (the rleid answer)








answered 21 hours ago by Ronak Shah (the dplyr answer)























              4














              A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.



              f1 <- function(x) {
              x[is.na(x)] <- FALSE
              rle1 <- rle(x)
              y <- rle1$values
              rle1$values[!y] <- 0
              rle1$values[y] <- cumsum(rle1$values[y])
              return(inverse.rle(rle1))
              }


              do.call(rbind,
              lapply(split(df, df$group), function(i){i$goal <- f1(i$criterium);
              i$goal <- replace(i$goal, is.na(i$criterium)|!i$criterium, NA);
              i}))


              Of course, If you want you can apply it via dplyr, i.e.



              library(dplyr)

              df %>%
              group_by(group) %>%
              mutate(goal = f1(criterium),
              goal = replace(goal, is.na(criterium)|!criterium, NA))


              which gives,




              # A tibble: 18 x 3
              # Groups: group [2]
              group criterium goal
              <fct> <lgl> <dbl>
              1 A NA NA
              2 A TRUE 1
              3 A TRUE 1
              4 A TRUE 1
              5 A FALSE NA
              6 A FALSE NA
              7 A TRUE 2
              8 A TRUE 2
              9 A FALSE NA
              10 A TRUE 3
              11 A TRUE 3
              12 A TRUE 3
              13 B NA NA
              14 B FALSE NA
              15 B TRUE 1
              16 B TRUE 1
              17 B TRUE 1
              18 B FALSE NA






              share|improve this answer






























                4














                A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.



                f1 <- function(x) {
                x[is.na(x)] <- FALSE
                rle1 <- rle(x)
                y <- rle1$values
                rle1$values[!y] <- 0
                rle1$values[y] <- cumsum(rle1$values[y])
                return(inverse.rle(rle1))
                }


                do.call(rbind,
                lapply(split(df, df$group), function(i){i$goal <- f1(i$criterium);
                i$goal <- replace(i$goal, is.na(i$criterium)|!i$criterium, NA);
                i}))


                Of course, If you want you can apply it via dplyr, i.e.



                library(dplyr)

                df %>%
                group_by(group) %>%
                mutate(goal = f1(criterium),
                goal = replace(goal, is.na(criterium)|!criterium, NA))


                which gives,




                # A tibble: 18 x 3
                # Groups: group [2]
                group criterium goal
                <fct> <lgl> <dbl>
                1 A NA NA
                2 A TRUE 1
                3 A TRUE 1
                4 A TRUE 1
                5 A FALSE NA
                6 A FALSE NA
                7 A TRUE 2
                8 A TRUE 2
                9 A FALSE NA
                10 A TRUE 3
                11 A TRUE 3
                12 A TRUE 3
                13 B NA NA
                14 B FALSE NA
                15 B TRUE 1
                16 B TRUE 1
                17 B TRUE 1
                18 B FALSE NA






                share|improve this answer




























                  4












                  4








                  4







                  A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.



                  f1 <- function(x) {
                  x[is.na(x)] <- FALSE
                  rle1 <- rle(x)
                  y <- rle1$values
                  rle1$values[!y] <- 0
                  rle1$values[y] <- cumsum(rle1$values[y])
                  return(inverse.rle(rle1))
                  }


                  do.call(rbind,
                  lapply(split(df, df$group), function(i){i$goal <- f1(i$criterium);
                  i$goal <- replace(i$goal, is.na(i$criterium)|!i$criterium, NA);
                  i}))


                  Of course, If you want you can apply it via dplyr, i.e.



                  library(dplyr)

                  df %>%
                  group_by(group) %>%
                  mutate(goal = f1(criterium),
                  goal = replace(goal, is.na(criterium)|!criterium, NA))


                  which gives,




                  # A tibble: 18 x 3
                  # Groups: group [2]
                  group criterium goal
                  <fct> <lgl> <dbl>
                  1 A NA NA
                  2 A TRUE 1
                  3 A TRUE 1
                  4 A TRUE 1
                  5 A FALSE NA
                  6 A FALSE NA
                  7 A TRUE 2
                  8 A TRUE 2
                  9 A FALSE NA
                  10 A TRUE 3
                  11 A TRUE 3
                  12 A TRUE 3
                  13 B NA NA
                  14 B FALSE NA
                  15 B TRUE 1
                  16 B TRUE 1
                  17 B TRUE 1
                  18 B FALSE NA














                  edited 18 hours ago

























                  answered 20 hours ago









                  Sotos























                      4














                      A data.table option using rle



                      library(data.table)
                      DT <- as.data.table(dat)
                      DT[, goal := {
                        r <- rle(replace(criterium, is.na(criterium), FALSE))
                        r$values <- with(r, cumsum(values) * values)
                        out <- inverse.rle(r)
                        replace(out, out == 0, NA)
                      }, by = group]
                      DT
                      # group criterium goal
                      # 1: A NA NA
                      # 2: A TRUE 1
                      # 3: A TRUE 1
                      # 4: A TRUE 1
                      # 5: A FALSE NA
                      # 6: A FALSE NA
                      # 7: A TRUE 2
                      # 8: A TRUE 2
                      # 9: A FALSE NA
                      #10: A TRUE 3
                      #11: A TRUE 3
                      #12: A TRUE 3
                      #13: B NA NA
                      #14: B FALSE NA
                      #15: B TRUE 1
                      #16: B TRUE 1
                      #17: B TRUE 1
                      #18: B FALSE NA


                      step by step



                      When we call r <- rle(replace(criterium, is.na(criterium), FALSE)) on the whole column (ignoring the groups for the moment), we get an object of class rle:



                      r
                      #Run Length Encoding
                      # lengths: int [1:9] 1 3 2 2 1 3 2 3 1
                      # values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...
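For readers new to rle, a minimal round-trip sketch may help. The short vector x is made up for illustration and is not the question's data. rle stores each run as a (length, value) pair, and inverse.rle rebuilds the original vector from those pairs.

```r
# a made-up logical vector with four runs
x <- c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)

r <- rle(x)
r$lengths   # how long each run is
# [1] 1 3 2 2
r$values    # the value each run holds
# [1] FALSE  TRUE FALSE  TRUE

# inverse.rle repeats each value by its run length and recovers x
identical(inverse.rle(r), x)
# [1] TRUE
```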


                      We manipulate the values component in the following way:



                      r$values <- with(r, cumsum(values) * values)
                      r
                      #Run Length Encoding
                      # lengths: int [1:9] 1 3 2 2 1 3 2 3 1
                      # values : int [1:9] 0 1 0 2 0 3 0 4 0


                      That is, we replaced the TRUEs with the cumulative sum of values (a running count of the TRUE runs) and set the FALSEs to 0. Now inverse.rle returns a vector in which each entry of values is repeated lengths times:



                      out <- inverse.rle(r)
                      out
                      # [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0


                      This is almost what the OP wants, but we still need to replace the 0s with NA:



                      replace(out, out == 0, NA)


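                      Inside DT[...], by = group makes these steps run separately for each group, so the counter restarts at 1 in group B instead of continuing with 4.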
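To see the per-group behaviour without data.table, the same steps can be sketched in base R. This is an illustrative stand-in for by = group, not the answer's own code: the helper name number_runs is hypothetical, and split()/unsplit() take over the grouping.

```r
# the question's data, rebuilt by hand for illustration
dat <- data.frame(
  group = rep(c("A", "B"), times = c(12, 6)),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# hypothetical helper wrapping the steps walked through above
number_runs <- function(criterium) {
  r <- rle(replace(criterium, is.na(criterium), FALSE))
  # cumsum counts the TRUE runs seen so far; multiplying by values
  # zeroes out the FALSE runs
  r$values <- cumsum(r$values) * r$values
  out <- inverse.rle(r)
  replace(out, out == 0, NA)
}

# apply the helper to each group's criterium, then restore the row order
per_group <- lapply(split(dat$criterium, dat$group), number_runs)
dat$goal <- unsplit(per_group, dat$group)

per_group$B
# [1] NA NA  1  1  1 NA
```

Because number_runs only ever sees one group's values at a time, the first run of TRUEs in group B is numbered 1.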



                      data



                      dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                      1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A",
                      "B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE,
                      FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE,
                      TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA,
                      -18L))































                      • Wow, impressive. Thanks for introducing me to rle and inverse.rle. Greetings to Leipzig.

                        – Humpelstielzchen
                        20 hours ago






                      • 1





                        @Humpelstielzchen You're welcome. Will try to simplify and explain the logic a bit.

                        – markus
                        20 hours ago













                      • Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a hell of a guy. ^^

                        – Humpelstielzchen
                        19 hours ago
























                      edited 16 hours ago

























                      answered 20 hours ago









                      markus






























