python - Can I create a new column based on when the value changes in another column?
Let's say I have a df:
print(df)
              date_time  a  b
0   10/08/2016 12:04:56  1  5
1   10/08/2016 12:04:58  1  6
2   10/08/2016 12:04:59  2  3
3   10/08/2016 12:05:00  2  2
4   10/08/2016 12:05:01  3  4
5   10/08/2016 12:05:02  3  6
6   10/08/2016 12:05:03  1  3
7   10/08/2016 12:05:04  1  2
8   10/08/2016 12:05:05  2  4
9   10/08/2016 12:05:06  2  6
10  10/08/2016 12:05:07  3  4
11  10/08/2016 12:05:08  3  2
The values in column ['a'] repeat over time. I need a new column, though, that has a new id each time they change, so that I would have the following df:
print(df)
              date_time  a  b  c
0   10/08/2016 12:04:56  1  5  1
1   10/08/2016 12:04:58  1  6  1
2   10/08/2016 12:04:59  2  3  2
3   10/08/2016 12:05:00  2  2  2
4   10/08/2016 12:05:01  3  4  3
5   10/08/2016 12:05:02  3  6  3
6   10/08/2016 12:05:03  1  3  4
7   10/08/2016 12:05:04  1  2  4
8   10/08/2016 12:05:05  2  4  5
9   10/08/2016 12:05:06  2  6  5
10  10/08/2016 12:05:07  3  4  6
11  10/08/2016 12:05:08  3  2  6
Is there a way to do this in Python? I am still new to it and hoped to find something in pandas that would help me, but I have not found it yet. In my original dataframe the values in column ['a'] change at irregular intervals, approximately every ten minutes, not every 2 rows as in my example. Does anyone have an idea how to approach this task? Thank you.
You can use the shift-cumsum pattern.
>>> df['c'] = (df.a != df.a.shift()).cumsum()
>>> df
              date_time  a  b  c
0   10/08/2016 12:04:56  1  5  1
1   10/08/2016 12:04:58  1  6  1
2   10/08/2016 12:04:59  2  3  2
3   10/08/2016 12:05:00  2  2  2
4   10/08/2016 12:05:01  3  4  3
5   10/08/2016 12:05:02  3  6  3
6   10/08/2016 12:05:03  1  3  4
7   10/08/2016 12:05:04  1  2  4
8   10/08/2016 12:05:05  2  4  5
9   10/08/2016 12:05:06  2  6  5
10  10/08/2016 12:05:07  3  4  6
11  10/08/2016 12:05:08  3  2  6
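To see why this works, it may help to split the one-liner into its two steps. The following is only a minimal sketch: it rebuilds columns a and b from the example (the timestamps are left out, since they play no role here), and the name `changed` is introduced purely for illustration.

```python
import pandas as pd

# Columns a and b from the example; date_time is omitted on purpose.
df = pd.DataFrame({
    "a": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "b": [5, 6, 3, 2, 4, 6, 3, 2, 4, 6, 4, 2],
})

# Step 1: shift() moves every value down one row, so this comparison is
# True wherever a row differs from the row before it. The first row is
# compared with NaN, so it always counts as a change.
changed = df["a"] != df["a"].shift()

# Step 2: True counts as 1 in a cumulative sum, so the counter goes up
# by one at every change and stays constant inside a run.
df["c"] = changed.cumsum()

print(df["c"].tolist())  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
```

Because each row is compared only with the row directly before it, it should not matter how long a run is, so irregular intervals between changes are fine. One thing to watch for: NaN never compares equal to NaN, so consecutive missing values in a would each get their own id.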
As a side note, this is a popular pattern for grouping. For example, to get the average b value of each such group:
df.groupby((df.a != df.a.shift()).cumsum()).b.mean()
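As a rough illustration of how this differs from grouping on a directly, here is a small self-contained sketch using columns a and b from the example. The names `run_id`, `per_run` and `per_value` are made up for this illustration.

```python
import pandas as pd

# Columns a and b from the example; date_time is omitted on purpose.
df = pd.DataFrame({
    "a": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "b": [5, 6, 3, 2, 4, 6, 3, 2, 4, 6, 4, 2],
})

# One id per consecutive run of equal values in a.
run_id = (df["a"] != df["a"].shift()).cumsum()

# Mean of b for each consecutive run (six groups).
per_run = df.groupby(run_id)["b"].mean()

# Mean of b for each distinct value of a (three groups): runs that share
# the same value but are separated in time get merged here.
per_value = df.groupby("a")["b"].mean()

print(per_run.tolist())    # [5.5, 2.5, 5.0, 2.5, 5.0, 3.0]
print(per_value.tolist())  # [4.0, 3.75, 4.0]
```

Grouping by the run id keeps the two separate runs where a is 1 apart, which is usually what you want when the values come back again later in time.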