Browse our archives by topic…
Data
Encoding categorical data for Power BI: Label encoding vs one-hot encoding - which encoding technique to use?
One-hot encoding and label encoding are two methods used to encode categorical data. Understand the specific advantages and disadvantages of these techniques.
There's something wrong with the Pandas API on Spark
Fix the following issues: errors converting large datasets to pandas, slow performance in pandas for Spark, and column reduction in pandas for Spark that doesn't reduce data.
Per-Property Rows from JSON in Spark on Microsoft Fabric
Spark doesn't always interpret JSON the way we'd like. For example, if each key/value pair in a JSON object is conceptually one item, Spark won't give you a row per item by default. This article shows how to nudge Spark in the right direction.