Activity
[Contribution activity heatmap]

Memberships

Learn Microsoft Fabric (Public • 5.4k • Free)
Fabric Dojo 织物 (Private • 202 • $39/m)

42 contributions to Learn Microsoft Fabric
Data Skew in Spark notebook
I’ve been encountering data skew issues in my Spark notebook. I’ve tried implementing salting and repartitioning with different columns, but regardless of the combinations or the number of partitions (up to 200), the result is still skewed data. I’m running out of ideas on how to resolve this. Has anyone else faced similar issues or have any suggestions?
0
2
New comment Aug 24
1 like • Aug 24
Hi @Stéphane Michel, I found this article on data skewness; I hope it helps. https://statusneo.com/solving-data-skewness-in-apache-spark-techniques-and-best-practices/
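For reference, here is a minimal sketch of the key-salting technique mentioned in the post, assuming the skew comes from a join on one hot key; the names fact_df, dim_df, customer_id and the constant NUM_SALTS are hypothetical placeholders, not code from the thread:

```python
# Hedged sketch: spread a hot join key over NUM_SALTS sub-keys so no single
# partition has to process the whole hot key. All names are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

NUM_SALTS = 16  # tune to the severity of the skew, not just the partition count

# Stand-in data: a fact table heavily skewed on customer_id, and a small dimension
fact_df = spark.range(1_000_000).withColumn("customer_id", (F.col("id") % 5).cast("int"))
dim_df = spark.createDataFrame([(i, f"name_{i}") for i in range(5)], ["customer_id", "name"])

# 1. Add a random salt to the skewed (large) side
salted_fact = fact_df.withColumn("salt", (F.rand() * NUM_SALTS).cast("int"))

# 2. Explode the small side so every salt value has a matching row
salted_dim = dim_df.withColumn(
    "salt", F.explode(F.array(*[F.lit(i) for i in range(NUM_SALTS)]))
)

# 3. Join on the composite key; the hot key is now spread across NUM_SALTS tasks
joined = salted_fact.join(salted_dim, ["customer_id", "salt"]).drop("salt")
joined.groupBy("customer_id").count().show()
```

If the skew shows up in joins specifically, it may also be worth confirming that Adaptive Query Execution's skew handling (spark.sql.adaptive.skewJoin.enabled) is active in the session before hand-rolling salting, since Spark 3.x can split skewed partitions of sort-merge joins automatically.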
Create your CV in Power BI
Last week I landed a new job at a company at the forefront of Fabric (and Tabular Editor) in the Nordics. I think creating a CV in Power BI helped. Maybe you can find some inspiration from the one I created. Link to CV. Recently, How to Power BI also released some ideas on this topic: https://www.youtube.com/watch?v=GkDOIRYiGFg

Background: Coming from the Power BI/user side, I have previously found it hard to apply for jobs, getting the reply "we had a candidate with more database competence". I was about to send a CV as a PDF when I went to the grocery store and listened to the Explicit Measures Podcast (recommended), where they advised creating it in Power BI to show your skills. Two days later the CV was sent in Power BI, and a few hours later the first interview was booked. Much more fun than creating a PDF. None of the recruiters I talked to had seen this before, and one said it was the most impressive he had seen in 20 years (writing CVs is not my strength). They were all enthusiastic about it. I also used some statistics from this site to show my interest in Fabric (a comment I read from @Will Needham).

To improve it further, maybe I could add a hobby project as well, such as getting data from Strava (to combine work and spare-time interests). Good luck if you are looking for new opportunities.
36
26
New comment Sep 25
1 like • Aug 24
Congrats @Eivind Haugen and well done.
Parallelism in Fabric
Hi, I have a question about parallelism (launching parallel tasks to make the load faster). Does PySpark in Fabric work the same as in Databricks? Meaning, if I am not mistaken, it is smart and handles parallelism automatically. Or should I make it do so explicitly in a Python script? Thank you. J. Python: https://docs.python.org/3/library/concurrent.futures.html
2
3
New comment Aug 13
1 like • Aug 12
...and yes, parallelism works in Fabric. Fabric will create a queue once it cannot kick off the next process.
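To complement the answer above: within a single Spark job, Fabric parallelizes across partitions automatically, just as Databricks does; the concurrent.futures module linked in the question only helps when you want several independent jobs (for example, several table loads) in flight at once. A minimal sketch under that assumption follows; the table list, Lakehouse paths, and staging table names are hypothetical:

```python
# Hedged sketch: run several independent Spark loads concurrently from one
# Fabric notebook using a thread pool. Spark still parallelizes each load
# internally; the threads just keep multiple jobs submitted at the same time.
from concurrent.futures import ThreadPoolExecutor, as_completed

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

tables = ["sales", "customers", "products"]  # hypothetical source tables

def load_table(name: str) -> str:
    # Each call submits its own Spark job; capacity permitting, they run
    # concurrently, otherwise the work is queued.
    df = spark.read.format("delta").load(f"Tables/{name}")  # assumed Lakehouse path
    df.write.mode("overwrite").saveAsTable(f"staging_{name}")  # hypothetical target
    return name

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(load_table, t) for t in tables]
    for fut in as_completed(futures):
        print(f"Finished loading {fut.result()}")
```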
Databricks vs Fabric
Does anyone have comments on their experiences using Databricks vs Fabric? Is this an 'or' decision, or can it be an 'and' possibility?
5
5
New comment Aug 10
3 likes • Aug 7
Hi @Patricia Anderson, though there are similarities between the two, Databricks is a more mature and stable product in all areas relating to notebooks and metadata-driven solutions. That said, Fabric offers more than notebooks, and the present-day gain is for people coming from Power BI. New features are being released and Microsoft continues to improve Fabric; no doubt it will be a top product in a few years. Remember what Power BI was like in 2018 and compare it to the present day. Good luck.
Data Pipeline Failure Notification
Does anyone have a good idea of how to get a notification when a data pipeline fails? As far as I know, there are notification activities (Teams or Outlook), but they work at the activity level, not at the pipeline or workspace level. You would need to add a notification activity for each pipeline activity if you want to monitor the entire pipeline properly. Someone suggested that you can wrap the pipeline within another pipeline and run an Outlook or Teams activity if the wrapped pipeline fails. Any good ideas?
3
11
New comment Aug 7
0 likes • Aug 7
@Jerry Lee yes, that's true. It'll give you more detail on the point of failure. If you want to capture logs at the job-run level, you could have a Script activity at the start of the job and another dynamic script to log any failures; the OnFailure path of every task points to it with any parameters you want passed to the notification. Not sure how tidy or feasible this approach is, though.
0 likes • Aug 7
@Jerry Lee I've re-read your question. You can call a pipeline from another pipeline, and the run status of the child will propagate to the parent. In the parent pipeline, add the OnFailure action to the child pipeline component to send the notification.
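As a side note to the parent/child pattern above, and not what the thread itself describes: if part of the pipeline already runs notebook code, a notebook-level fallback is to catch the failure yourself and post to a Teams incoming webhook. A hedged sketch, where TEAMS_WEBHOOK_URL and run_load() are hypothetical placeholders:

```python
# Hedged sketch of a notebook-level failure notification via a Teams
# incoming webhook. This complements, rather than replaces, the
# pipeline-level OnFailure path discussed above.
import requests

TEAMS_WEBHOOK_URL = "https://example.webhook.office.com/..."  # hypothetical URL

def run_load() -> None:
    # Stand-in for whatever work this notebook step actually performs
    raise RuntimeError("simulated failure")

try:
    run_load()
except Exception as exc:
    # Incoming webhooks accept a simple JSON payload with a "text" field
    requests.post(TEAMS_WEBHOOK_URL, json={"text": f"Pipeline step failed: {exc}"}, timeout=10)
    raise  # re-raise so the run is still marked as Failed
```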
1-10 of 42
Adeola Adeyemo
Level 4 • 86 points to level up
@adeola-adeyemo-8729
I'm a Data Consultant, covering data architecture, data engineering, data governance and AI/ML. I'm looking forward to learning MS Fabric. Thank you

Active 17d ago
Joined May 6, 2024
London