Python - removing rows with any column containing NaN, NaTs, and nans
I currently have the data below:
    df_all.head()
    Out[2]:
       unnamed: 0 symbol       date      close      weight
    0        4061        2016-01-13  36.515889  (0.000002)
    1        4062     aa 2016-01-14  36.351784    0.000112
    2        4063    aac 2016-01-15  36.351784  (0.000004)
    3        4064    aal 2016-01-19  36.590483    0.000006
    4        4065   aamc 2016-01-20  35.934062    0.000002

    df_all.tail()
    Out[3]:
             unnamed: 0 symbol date  close weight
    1252498    26950320    nan  NaT   9.84    NaN
    1252499    26950321    nan  NaT  10.26    NaN
    1252500    26950322    nan  NaT   9.99    NaN
    1252501    26950323    nan  NaT   9.11    NaN
    1252502    26950324    nan  NaT   9.18    NaN

    df_all.dtypes
    Out[4]:
    unnamed: 0             int64
    symbol                object
    date          datetime64[ns]
    close                float64
    weight                object
    dtype: object
As can be seen, I am getting values of nan in symbol, NaT in date, and NaN in weight.
My goal: I want to remove any row that has a column containing nan, NaT, or NaN, and have a new df_clean as the result.
I don't seem to be able to apply the appropriate filter. I am not sure if I have to convert the datatypes first (although I tried this as well).
Since the symbol 'nan' is a string rather than a real missing value, it is not caught by dropna() or isnull(). You need to cast the symbol 'nan' to np.nan.
Try this:
    import numpy as np

    df_all["symbol"] = np.where(df_all["symbol"] == "nan", np.nan, df_all["symbol"])
    df_clean = df_all.dropna()  # drops rows with NaN/NaT in any column
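The approach above can be tried end to end on a small frame. This is only a minimal sketch: the sample rows are made up to mirror the question's columns, it assumes the placeholder is exactly the lowercase string 'nan', and it uses Series.replace as an alternative to np.where for the conversion step.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question's columns; the values are made up.
df_all = pd.DataFrame({
    "symbol": ["aa", "aac", "nan", "nan"],
    "date": pd.to_datetime(["2016-01-14", "2016-01-15", "2016-01-19", None]),
    "close": [36.351784, 36.351784, 36.590483, 9.84],
    "weight": ["0.000112", "(0.000004)", "0.000006", np.nan],
})

# The literal string 'nan' is an ordinary value, so isna() does not flag it
# and a plain dropna() keeps the third row.
flagged_before = int(df_all["symbol"].isna().sum())
rows_plain_dropna = len(df_all.dropna())

# Turn the placeholder string into a real missing value, then drop every row
# that has a missing value (NaN or NaT) in any column.
df_clean = (
    df_all
    .assign(symbol=df_all["symbol"].replace("nan", np.nan))
    .dropna(how="any")
)

print(flagged_before, rows_plain_dropna, len(df_clean))
```

If the weight column also holds placeholder strings instead of real NaN values (its dtype is object, so that is possible), the same replacement would have to be applied to that column before calling dropna().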