Activity
[Contribution activity heatmap, Jan-Dec]

Memberships

Learn Microsoft Fabric

Public • 5.9k • Free

Learn Power Apps

Private • 2k • $3/m

13 contributions to Learn Microsoft Fabric
Incremental Refresh for Dataflow Gen2 (Public Preview)
Hey everyone, the conference updates are now starting 👀 Starting with this incremental refresh feature added to the Dataflow Gen2. You can read more here: https://blog.fabric.microsoft.com/en-us/blog/announcing-public-preview-incremental-refresh-in-dataflows-gen2/
26
14
New comment Oct 21
Incremental Refresh for Dataflow Gen2 (Public Preview)
1 like • Oct 3
Finally, this feature is here. Better late than never. I wish the Incremental Refresh feature had been available in Dataflow Gen2 months ago; it would have reduced our pains in the ongoing implementation.
Seeking Best Practices for CSV to JSON Transformation in Fabric
Hi everyone! My name is Max, and I work as an analyst at a small company. I'm thrilled to build Fabric solutions for my tasks instead of a huge pile of different tools that are hard to log and monitor. I'm new to Microsoft Fabric, and as an exercise to learn it, I decided to migrate one of my existing projects to the Fabric architecture. I'm facing some challenges with the architecture. Here is a brief description of the old process:
1. The client sends a list of agencies in a .csv file.
2. Validation and mapping are done within a Logic App.
3. The agency table is transformed into a set of JSON objects (one object per row) using Logic App and Azure Functions. The resulting JSON object has a more complex structure than the original table, so it cannot be converted 1:1 from the table to JSON.
4. All JSONs are saved in Blob Storage (in the new architecture, I was thinking of switching to a Lakehouse to better orchestrate and log all these JSON files).
While the first two steps are straightforward to implement in Fabric (thanks to Will's videos), I'm struggling with the third step. I can't figure out the best way to split CSV rows into a set of JSON files. I have tried PySpark, but either I misunderstood something or it's not the best fit for this task: saving JSON files to Lake Storage behaves oddly. I also couldn't find a suitable solution in Dataflow or a pipeline (calling an Azure Function seems like a possible option, but I can't figure out the best way to organize it). SQL doesn't look helpful either; it doesn't accept FOR JSON for some reason. Or should I just use something outside Fabric for this specific operation? I would appreciate any ideas on how to make this transformation as efficient and effective as possible, thanks! 💚
0
6
New comment Jul 10
0 likes • Jul 9
From the explanation you provided, if you need to leverage Microsoft Fabric, there is no need to convert to JSON in this scenario unless there is a specific requirement. Fabric can ingest the CSV directly into Lakehouse tables using three approaches:
1. Dataflow Gen2
2. Copy activity in Data Factory pipelines
3. PySpark notebooks
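A minimal sketch of the notebook approach (3), assuming the notebook has a default lakehouse attached; the landing path, table name, and output folder are illustrative placeholders, not names from the thread:

import json
import os

# Read the client CSV from the lakehouse Files area.
df = spark.read.csv('Files/landing/agencies.csv', header=True, inferSchema=True)

# Simplest route, per the answer above: land the CSV as a Delta table.
df.write.mode('overwrite').format('delta').saveAsTable('agencies')

# If one JSON file per row really is required, the lakehouse's local mount
# (/lakehouse/default/Files) sidesteps Spark's part-file output: build each
# object in plain Python on the driver (fine for modest row counts) and write
# it with the standard library.
out_dir = '/lakehouse/default/Files/json_out'
os.makedirs(out_dir, exist_ok=True)
for i, row in enumerate(df.toLocalIterator()):
    record = row.asDict()  # reshape/nest here if the target JSON is not flat
    with open(f'{out_dir}/agency_{i}.json', 'w') as f:
        json.dump(record, f)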
Cross-workspace access.
Hello everyone, is it possible to access one dataflow from different workspaces?
1
6
New comment Jul 10
3 likes • Jul 9
"Accessing a dataflow from different workspace" has multiple interpretations. Please check if any of these answer your questions. All these points have an underlying assumption that all these multiple workspaces are accessible by the user. 1. If your question is that when we open up workspace A and land up on its homepage, does the dataflow from workspace B get listed here or can be searched here? Answer is No. 2. A dataflow defined in workspace A, can choose tables from lakehouse belonging to workspace B as sources and can choose destination tables as those from lakehouse belonging to workspace C. 3. A dataflow is defined in workspace A but you want to create a pipeline in workspace B, where one of the activities is expected to be thus dataflow from workspace A. Answer is Yes, this is possible.
What are your biggest pain points currently with Fabric?
Hey everyone, happy Monday! I'm currently planning out future content for the YouTube channel, and I always want to produce the content that is most relevant/helpful to you! So, some questions:
- What are your biggest pain points currently with Fabric?
- Anything you're struggling to understand?
- Things you think are important, but don't quite grasp yet?
It's this kind of engagement that led to the Power BI -> Fabric series, and then the DP-600 series, so I thought I'd open it up to you again! Look forward to hearing from you - thank you! Here are some potential topics:
- Delta file format
- integrating AI / ML / Azure AI / OpenAI
- Copilot
- Git integration / deployment pipelines / CI/CD
- data modelling / implementing SCDs
- medallion implementation
- more advanced PySpark stuff
- data pipelines
- metadata-driven workflows
- dataflows (and optimising dataflows)
- lakehouse architectures
- real-time
- data science projects
- semantic link
- migrating semantic models
- using Python to manage semantic models
- administration / automation
- Fabric API
- other...?
11
61
New comment Aug 7
What are your biggest pain points currently with Fabric?
1 like • May 21
@Will Needham, this is exactly my list. I faced each of these during my ongoing implementation at a client project. It was a thrilling experience to overcome each of these challenges by exploring multiple alternative approaches. 🙂
2 likes • May 29
@Will Needham and @Julio Ochoa, I really appreciate the effort you took in reading my long comment so patiently and answering it. Hats off to your efforts. My "knowledge repository" of Fabric features has definitely been enriched by reading these answers. Thank you very much. Some of the limitations I mentioned are due to certain constraints in the customer environment, and for most of the questions I got really good input from both of you. I will surely get back to those, however, in mid-June: the go-live of this Fabric implementation is scheduled for next week. There is excitement as well as fear. With the blessings of you and the community members, I am sure the ride will be smoother. I can't wait to share my experiences and takeaways from this Fabric implementation once the go-live and post-go-live processes are complete.
Explicit Path Question
Good morning, I have been following along with the Spark tutorial in Microsoft Fabric and ran into an interesting issue that I have yet to understand. Before leaving my office Friday, I paused the lesson after running this code:

df = spark.read.csv('Files/property-sales-extended.csv', header=True, inferSchema=True)
df.show()

The code ran fine on Friday and did again this morning. After loading another file, the code failed:

df = spark.read.csv('Files/property-sales-missing.csv', header=True, inferSchema=True)
df.show()

The error message was that the path was not valid. Same Lakehouse, same folder, same everything as far as I could tell. I solved the issue by changing the call to explicitly include the name of the Lakehouse folder:

df = spark.read.csv('Files/csv/property-sales-missing.csv', header=True, inferSchema=True)
df.show()

The ABFS path also works. Can someone help me understand the reason for this? Thanks, Steve Malcolm
1
4
New comment May 29
0 likes • May 29
@Steve Malcolm, please try this code:

df = spark.read.csv('/lakehouse/default/Files/property-sales-extended.csv', header=True, inferSchema=True)
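For reference, a sketch of the two path styles the thread itself confirms work; the workspace and lakehouse names below are placeholders:

# 1. Path relative to the notebook's default lakehouse (include the subfolder).
df = spark.read.csv('Files/csv/property-sales-missing.csv',
                    header=True, inferSchema=True)

# 2. Full ABFS (OneLake) path: unambiguous regardless of the default lakehouse.
abfs_path = ('abfss://<workspace>@onelake.dfs.fabric.microsoft.com/'
             '<lakehouse>.Lakehouse/Files/csv/property-sales-missing.csv')
df = spark.read.csv(abfs_path, header=True, inferSchema=True)
df.show()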
Chinmay Phadke
3
39 points to level up
@chinmay-phadke-9568
I am a Data Engineer.

Active 23d ago
Joined Apr 2, 2024