PySpark - Custom aggregation count in groupBy
An example of a custom conditional count aggregation inside groupBy in PySpark: summing 1 for rows that match a condition and 0 otherwise yields a count of only the matching rows.
import pyspark.sql.functions as F

# Conditional count: sum 1 for rows where cond holds, 0 otherwise
cnt_cond = lambda cond: F.sum(F.when(cond, 1).otherwise(0))

df.groupBy(df.date).agg(
    F.avg(df.price).alias('avg'),
    cnt_cond(df.include == 'true').alias('count_cnd')
).show()
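To see what the aggregation computes without a Spark runtime, here is a plain-Python sketch of the same logic. The sample rows and column layout (date, price, include) are hypothetical, chosen only to mirror the DataFrame above; per group it produces the average price and the count of rows where include == 'true'.

```python
from collections import defaultdict

# Hypothetical sample rows mirroring the DataFrame columns: date, price, include
rows = [
    ("2024-01-01", 10.0, "true"),
    ("2024-01-01", 20.0, "false"),
    ("2024-01-02", 30.0, "true"),
    ("2024-01-02", 40.0, "true"),
]

# Per date: [price total, row count, conditional count]
sums = defaultdict(lambda: [0.0, 0, 0])
for date, price, include in rows:
    acc = sums[date]
    acc[0] += price
    acc[1] += 1
    # Same idea as F.sum(F.when(cond, 1).otherwise(0)): add 1 only on a match
    acc[2] += 1 if include == "true" else 0

# Each date maps to (avg price, conditional count)
result = {d: (total / n, cnt) for d, (total, n, cnt) in sums.items()}
```

Note that include here is a string column, so the condition compares against the literal 'true'; if the column were a real boolean, the condition would simply be df.include.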