Data

Encoding categorical data for Power BI: Label encoding vs one-hot encoding - which encoding technique to use?
One-hot encoding and label encoding are two methods used to encode categorical data. Understand the specific advantages and disadvantages of these techniques.

There's something wrong with the Pandas API on Spark
Fix the following issues: errors when converting large datasets to pandas, slow performance in the pandas API on Spark, and column reduction that doesn't actually reduce the data.

Per-Property Rows from JSON in Spark on Microsoft Fabric
Spark doesn't always interpret JSON how we'd like. For example, if each key/value pair in a JSON object is conceptually one item, Spark won't give you a row per item by default. This article shows how to nudge Spark in the right direction.