
Memberships

Learn Microsoft Fabric

14.3k members • Free

Fabric Dojo 织物

364 members • $30/month

42 contributions to Learn Microsoft Fabric
Data Skew in Spark notebook
I’ve been encountering data skew issues in my Spark notebook. I’ve tried implementing salting and repartitioning with different columns, but regardless of the combinations or the number of partitions (up to 200), the result is still skewed data. I’m running out of ideas on how to resolve this. Has anyone else faced similar issues or have any suggestions?
1 like • Aug '24
Hi @Stéphane Michel, I found this article on data skewness; I hope it helps: https://statusneo.com/solving-data-skewness-in-apache-spark-techniques-and-best-practices/
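To make the salting idea from the question concrete, here is a minimal pure-Python sketch (not Spark code) of why a salt spreads a hot key across partitions. Everything in it is an assumption for illustration: the partition count, the number of salts, the made-up data, and `zlib.crc32` standing in for a hash partitioner.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8   # hypothetical partition count
NUM_SALTS = 16       # hypothetical number of salt values


def partition_of(key: str) -> int:
    """Deterministic stand-in for a hash partitioner (assumption, not Spark's)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def partition_sizes(keys) -> Counter:
    """Count how many rows land in each partition."""
    return Counter(partition_of(k) for k in keys)


# Made-up skewed data: one hot key dominates, plus 100 rare keys.
rows = ["hot"] * 9_000 + [f"key_{i}" for i in range(100) for _ in range(10)]

# Salting: append a rotating suffix so the hot key becomes NUM_SALTS distinct keys.
salted_rows = [f"{key}_{i % NUM_SALTS}" for i, key in enumerate(rows)]

unsalted_max = max(partition_sizes(rows).values())
salted_max = max(partition_sizes(salted_rows).values())

print("largest partition without salt:", unsalted_max)
print("largest partition with salt:   ", salted_max)
```

The salt only helps if the salted key is the one actually used for the shuffle. In a join, the other side also has to carry every salt value, otherwise rows stop matching.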
Create your CV in Power BI
Last week I landed a new job at a company at the forefront of Fabric (and Tabular Editor) in the Nordics. I think creating a CV in Power BI helped. Maybe you can find some inspiration from the one I created. Link to CV. How to Power BI also recently released some ideas on this topic: https://www.youtube.com/watch?v=GkDOIRYiGFg Background: Coming from the Power BI/user side, I have previously found it hard to apply for jobs, getting the following reply: "we had a candidate with more database competence". I was about to send a CV as a PDF when I went to the grocery store and listened to the Explicit Measures Podcast (recommended), where they advised creating it in Power BI to show your skills. Two days later, the CV was sent in Power BI, and a few hours later the first interview was booked. Much more fun than creating a PDF. None of the recruiters I talked to had seen this before, and one said it was the most impressive he had seen in 20 years (writing CVs is not my strength). They were all enthusiastic about it. I also used some statistics from this site to show my interest in Fabric (an idea from a comment I read by @Will Needham). To improve it further, maybe I could add a hobby project as well, such as getting data from Strava (to combine work and spare-time interests). Good luck if you are looking for new opportunities.
1 like • Aug '24
Congrats @Eivind Haugen and well done.
Parallelism in Fabric
Hi, I have a question about parallelism (launching parallel tasks to make a load faster). Does PySpark in Fabric work the same as in Databricks? That is, if I am not mistaken, it is smart and parallelizes automatically. Or should I set it up myself in a Python script? Thank you. Python: https://docs.python.org/3/library/concurrent.futures.html
1 like • Aug '24
And yes, parallelism works in Fabric. Fabric will create a queue once it cannot kick off the next process.
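For context on the question: Spark itself distributes DataFrame operations across executors, while a plain Python loop on the driver runs its steps one after another. The `concurrent.futures` module linked in the question is one way to start several independent tasks from a single script. The sketch below assumes the tasks are independent and I/O-bound; `load_table` and the table names are hypothetical stand-ins, not Fabric APIs.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def load_table(name: str) -> str:
    """Hypothetical stand-in for one independent load step."""
    time.sleep(0.1)  # pretend to wait on I/O
    return f"{name}: loaded"


tables = ["customers", "orders", "products"]  # made-up table names

results = {}
with ThreadPoolExecutor(max_workers=3) as pool:
    # Submit every load up front so they can overlap.
    futures = {pool.submit(load_table, t): t for t in tables}
    for future in as_completed(futures):
        # result() re-raises any exception from the worker thread.
        results[futures[future]] = future.result()

for name in tables:
    print(results[name])
```

How many tasks actually run at once still depends on available capacity; as the reply notes, work that cannot start right away is queued.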
Databricks vs Fabric
Does anyone have comments on their experience using Databricks vs Fabric? Is this an ‘or’ decision, or can it be an ‘and’ possibility?
3 likes • Aug '24
Hi @Patricia Anderson, though there are similarities between the two, Databricks is the more mature and stable product in all areas relating to notebooks and metadata-driven solutions. That said, Fabric offers more than notebooks, and the present-day gain is for people coming from Power BI. New features are being released and Microsoft continues to improve Fabric; no doubt it will be a top product in a few years. Remember what Power BI was like in 2018 and compare it to the present day. Good luck.
Data Pipeline Failure Notification
Does anyone have a good idea for how to get a notification when a data pipeline fails? As far as I know, there are notification activities (Teams or Outlook), but they work at the activity level, not at the pipeline or workspace level. You need to add a notification activity to each pipeline activity if you want to monitor the entire pipeline properly. Someone suggested that you can wrap the pipeline within another pipeline and run an Outlook or Teams activity if the wrapped pipeline fails. Any good ideas?
0 likes • Aug '24
@Jerry Lee yes, that's true. It'll give you more detail on the point of failure. If you want to capture logs at the job-run level, you can have a script at the start of the job and another dynamic script to log any failures. The OnFailure path of every task points to it, with any parameters you want passed to the notification. I'm not sure how tidy or feasible this approach is, though.
0 likes • Aug '24
@Jerry Lee I've re-read your question. You can call a pipeline from another pipeline, and the run status of the child will propagate to the parent. In the parent pipeline, add the onFailure action to the child pipeline component to send the notification.
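The parent/child pattern described above can be modelled in a few lines of plain Python. This is only a sketch of the control flow, not Fabric pipeline syntax: the function names, the status strings, and the `notify` callback are all hypothetical.

```python
from typing import Callable, List


def run_child_pipeline(activities: List[Callable[[], None]]) -> str:
    """Run activities in order; any exception makes the whole child run fail."""
    try:
        for activity in activities:
            activity()
    except Exception:
        return "Failed"
    return "Succeeded"


def run_parent_pipeline(child_activities, notify: Callable[[str], None]) -> str:
    """Call the child once and follow a single on-failure path."""
    status = run_child_pipeline(child_activities)
    if status == "Failed":
        notify("Child pipeline failed")  # one notification for the whole run
    return status


def failing_copy() -> None:
    raise RuntimeError("copy step failed")  # made-up failure


sent = []
status = run_parent_pipeline([lambda: None, failing_copy], sent.append)
print(status, sent)
```

The point of the design is that the child needs no notification logic of its own: whichever activity fails, the parent sees one failed run and sends one message.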
Adeola Adeyemo
Level 4 • 85 points to level up
@adeola-adeyemo-8729
I'm a Data Consultant, covering data architecture, data engineering, data governance and AI/ML. I'm looking forward to learning MS Fabric. Thank you

Active 143d ago
Joined May 6, 2024
London