John Doe

Learn Microsoft Fabric

Memberships

Learn Microsoft Fabric

Public • 6.5k • Free

3 contributions to Learn Microsoft Fabric

John Doe

18d ago in

Technical

Index Structured JSON documents in Microsoft Fabric

Hello Community, Today I run several Elasticsearch clusters that index a large number of structured JSON documents spread across hundreds of indices. The clusters sit behind a microservice, X: users call X with Elastic DSL queries, and X queries Elasticsearch behind the scenes. If I want to achieve the same thing with Microsoft Fabric (where X queries Fabric instead of Elasticsearch), what options can I explore? Fabric SQL is one I am considering, but what else could provide this functionality?
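To make the "X queries Fabric instead of Elastic" idea concrete, here is a minimal sketch of how X might translate a very small subset of Elastic DSL into a parameterized SQL query. Everything named here is an assumption for illustration: the table `documents`, its JSON text column `doc`, the `?` placeholder style, and that the target SQL engine supports the T-SQL `JSON_VALUE` function. Only `term` and `bool`/`must` clauses are handled; full-text search, scoring, and aggregations are out of scope.

```python
import json
import re

# Dotted field names only; anything else is rejected because the name is
# embedded in the JSON path literal rather than passed as a parameter.
_FIELD = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def to_where(query):
    """Translate {"term": ...} or {"bool": {"must": [...]}} to (sql, params)."""
    if len(query) != 1:
        raise ValueError("expected exactly one top-level clause")
    kind, body = next(iter(query.items()))

    if kind == "term":
        if len(body) != 1:
            raise ValueError("term clause must name exactly one field")
        field, value = next(iter(body.items()))
        if not _FIELD.match(field):
            raise ValueError(f"unsupported field name: {field!r}")
        # JSON_VALUE returns text, so non-string values are compared
        # by their JSON spelling (30 -> "30", True -> "true").
        param = value if isinstance(value, str) else json.dumps(value)
        return f"JSON_VALUE(doc, '$.{field}') = ?", [param]

    if kind == "bool":
        if set(body) != {"must"}:
            raise ValueError("only bool/must is supported in this sketch")
        clauses = [to_where(c) for c in body["must"]]
        if not clauses:
            raise ValueError("empty must list")
        sql = "(" + " AND ".join(s for s, _ in clauses) + ")"
        params = [p for _, ps in clauses for p in ps]
        return sql, params

    raise ValueError(f"unsupported clause: {kind!r}")


def to_select(query, table="documents"):
    """Build a full SELECT; `table` is a hypothetical table name."""
    where, params = to_where(query)
    return f"SELECT doc FROM {table} WHERE {where}", params
```

For example, `to_select({"bool": {"must": [{"term": {"status": "active"}}]}})` yields a query string plus a parameter list that X could hand to whatever SQL client it uses. Values travel as parameters rather than being spliced into the SQL text, which is why field names are validated separately.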

New comment 3d ago

John Doe

0 likes • 11d

Thanks for your suggestion, Prateek. Can you tell me more about the complexities involved with Fabric SQL? Can't I have a column with a JSON data type, or something like that?

John Doe

Apr 1 in

Technical

Syncing lakehouse Shortcut in Files to Table

Say I have a data source (ADLS Gen2) with the following contents:

CompanyData/Employee/emp1.json
CompanyData/Employee/emp2.json
CompanyData/Company/comp1.json

The emp* and comp* files have different JSON fields and structures (and can also have nested fields). Ultimately, I want this JSON data converted to lakehouse Tables, with columns inferred from the JSON structure, at least for the top-level fields:

Tables
-- CompanyData
---- Employee (structure inferred from the JSON structure?)
------- Columns: FullName, Age, etc. Rows from emp1.json and emp2.json
---- Company
------- Columns: CompanyName, Address, etc. Rows from comp1.json

So, the questions are:

1. I believe the shortcut has to be created under "Files", because the JSON in the source is not in Delta Parquet format, so OneLake cannot automatically recognize it as a table. Is this right?
2. Can I achieve what I want automatically, out of the box? Or do I have to write a Spark job or something similar to transform the data from Files into Tables?
3. The source can keep getting modified (files added, deleted, etc.). How can I keep the lakehouse Tables in sync with the shortcut in "Files"? The source can potentially hold tens or hundreds of millions of JSON files, so periodically running a Spark job that transforms all of those files may not be a good idea.

Thanks!!
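As a rough illustration of questions 2 and 3, the sketch below processes only files that changed since the previous run, instead of re-reading everything. It is plain Python over a local folder, not a Fabric or Spark API: the manifest (relative path mapped to last-seen modification time), the `_source_file` column, and the rule "parent folder name = table name" are all assumptions made for this example. Nested fields are kept as JSON text rather than flattened.

```python
import json
from pathlib import Path


def plan_sync(root, manifest):
    """Diff the files under root against the manifest of the previous run.

    Returns (changed, deleted, new_manifest), where changed covers both
    new and modified files.
    """
    root = Path(root)
    current = {
        p.relative_to(root).as_posix(): p.stat().st_mtime_ns
        for p in root.rglob("*.json")
    }
    changed = sorted(k for k, m in current.items() if manifest.get(k) != m)
    deleted = sorted(k for k in manifest if k not in current)
    return changed, deleted, current


def load_rows(root, rel_paths):
    """Read the given files and group one row per file by target table."""
    tables = {}
    for rel in rel_paths:
        with open(Path(root) / rel, encoding="utf-8") as fh:
            doc = json.load(fh)
        # Keep top-level scalars as-is; serialize nested values to text.
        row = {
            k: json.dumps(v) if isinstance(v, (dict, list)) else v
            for k, v in doc.items()
        }
        row["_source_file"] = rel  # lets a later run replace or delete this row
        tables.setdefault(Path(rel).parent.name, []).append(row)
    return tables


def infer_columns(rows):
    """Union of top-level keys across rows, in first-seen order."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return columns
```

Each run would upsert the rows from `changed`, remove rows whose `_source_file` is in `deleted`, and persist `new_manifest` for the next run. Note that this still lists every file on each run, which may itself be too slow at hundreds of millions of files; the sketch only shows the idea of transforming the changes rather than the whole source.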

New comment Apr 2

John Doe

0 likes • Apr 2

Got it. Is there a database engine that stores key-value pairs in Parquet, that I can use from my CRUD APIs, and that I can link directly to Fabric in some way?

John Doe

Mar 10 in

Technical

Periodic Data Import from Source to Lakehouse

I have a data source that undergoes frequent CRUD operations. Most of the tutorials I have seen treat data ingestion as a one-time activity, such as loading from a file or sourcing from a blob. How can I make it periodic, so that only the delta (the new records created at the source since the last run) is ingested into the lakehouse?
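One common way to frame "only the delta" is a high-watermark load: remember the latest modification time seen so far, and on each scheduled run pick up only rows newer than that. The sketch below shows the idea in plain Python. It assumes the source exposes an `id` and a `modified_at` timestamp per record (both hypothetical names), and it uses a dict to stand in for the lakehouse table.

```python
from datetime import datetime


def incremental_load(source_rows, target, watermark: datetime) -> datetime:
    """Upsert rows modified after `watermark`; return the new watermark.

    source_rows: iterable of dicts with "id" and "modified_at" (datetime).
    target: dict keyed by id, standing in for the destination table.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["modified_at"] > watermark:
            target[row["id"]] = row  # insert or overwrite (upsert)
            new_watermark = max(new_watermark, row["modified_at"])
    return new_watermark
```

The returned watermark would be stored between runs and passed back in on the next one. A timestamp watermark picks up creates and updates, but it cannot see rows that were hard-deleted at the source; those would need a soft-delete flag or some form of change tracking on the source side.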

New comment Aug 13

John Doe

0 likes • Mar 11

Great! Thanks for this info!

Level 1

3 points to level up

John Doe

@john-doe-3502

I'm a Software Engineer, working on building data platforms for various industries!

Active 10d ago

Joined Dec 2, 2024

Sri Lanka
