
Memberships

Data Alchemy

Public • 21k • Free

65 contributions to Data Alchemy
Another Interesting Article on Semantic Chunking
https://towardsdatascience.com/a-visual-exploration-of-semantic-text-chunking-6bb46f728e30
1
0
POC - Youtube Video Transcript Knowledge Base Chatbot
The goal of this project is to develop a system that processes YouTube video transcripts, creates embeddings for smaller document chunks, and facilitates user interaction through a chat interface. The system will enable users to ask questions and receive relevant content along with context from the original video, including the URL to the video and the relevant timestamps. This post will be updated as I work through the concepts and start completing tasks.

Update #1: To add a little more clarity, I am architecting this in a way where components are added to a pipeline, and the pipeline is run periodically to check for new videos and get the transcripts for them. I also need to add a component that checks for the last video ID that was processed and uses that ID to check whether there are newer videos.

# Components

## get_youtube_transcript
#### Inputs: ``youtube URL``
#### Outputs: ``transcript`` ``message``

## clean_transcript
#### Inputs: ``transcript``
#### Outputs: ``cleaned transcript`` ``message``

## save_cleaned_transcript
#### Input: ``cleaned transcript``
#### Output: ``message``

## chunk_transcript
#### Inputs: ``cleaned transcript``
#### Outputs: ``chunks`` ``message``

## create_embeddings
#### Inputs: ``chunks``
#### Outputs: ``embeddings`` ``message``

## save_embeddings
#### Inputs: ``embeddings``
#### Outputs: ``message``
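As an illustrative sketch (not the final implementation), a component like ``chunk_transcript`` could follow a simple contract where every component returns its output plus a ``message``. The chunk size and naive sentence splitting here are placeholder choices:

```python
def chunk_transcript(cleaned_transcript, max_chars=500):
    """Split a cleaned transcript into chunks of at most max_chars,
    breaking on sentence boundaries. Returns (chunks, message)."""
    sentences = [s.strip() for s in cleaned_transcript.split(". ") if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Current chunk is full; start a new one with this sentence.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    message = f"Processed {len(sentences)} sentences into {len(chunks)} chunks."
    return chunks, message
```

Swapping in a smarter chunker later only means replacing the body; the ``(output, message)`` contract between pipeline components stays the same.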
16
19
New comment 16h ago
0 likes • 2d
This is a wall of text, but it is the result of a custom Modelfile in Ollama.

Modelfile:

FROM llama3.2:3b-instruct-q4_0
PARAMETER temperature 0.1
PARAMETER num_ctx 8192
SYSTEM You are a book editor whose only job is to take in raw text and add correct punctuation. Do not add or remove any words. If there is punctuation that is incorrect, you may remove it or move it to the appropriate position within the text.

My prompt:

(base) jjones@Jamess-MacBook-Pro ollama % ollama run llama3.2:3b-instruct-q4_0-8k
>>> 0:00 have you ever set a goal like a big goal 0:02 for the year only to lose momentum in a 0:04 few weeks well in today's video I'm 0:06 going to show you exactly how to use the 0:07 full focused planner to set powerful 0:10 quarterly goals that stick we're going 0:12 to talk about habit goals versus 0:14 achievement goals and why your key 0:15 motivation matters more than you think 0:18 hey there if you're new here I'm Riley 0:19 I'm an Air Force officer and a giant 0:21 productivity nerd if you're ready to 0:22 start setting intentional achievable 0:24 goals that you actually stick to this is 0:27 the video for you so why quarterly I get 0:31 it it sounds kind of corporate kind of 0:33 boring but there's a reason that 0:34 successful businesses actually break 0:36 down the results by quarter

LLM Output:

Here's the edited text with correct punctuation:
0:00 - Have you ever set a goal, like a big goal?
0:02 For the year, only to lose momentum in a few weeks.
0:04 Well, in today's video, I'm going to show you exactly how to use the full-focused planner to set powerful quarterly goals that stick.
0:06 We're going to talk about habit goals versus achievement goals and why your key motivation matters more than you think.
0:12 Hey there! If you're new here, I'm Riley.
0:19 I'm an Air Force officer and a giant productivity nerd.
0:21 If you're ready to start setting intentional, achievable goals that you actually stick to, this is the video for you.
0:27 So, why quarterly? It sounds kind of
0 likes • 16h
I won't post another wall of text, but I did get a basic implementation of a semantic chunker working, although it is slow on my MacBook Pro. I need to try it on my desktop PC tomorrow to see if I can gain a little speed with the GTX 1080 and 32 GB of RAM.

"messages": [ "Processed 59 sentences into 13 chunks for video ID io_LVnhtRVc." ]
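For anyone curious, the core idea can be sketched in a few lines: embed each sentence, then start a new chunk whenever the cosine similarity between adjacent sentence embeddings drops below a threshold. This is a minimal sketch, not my actual implementation; the ``embed`` callable is a placeholder for whatever embedding model you run (for example through Ollama), and the threshold is arbitrary:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, embed, threshold=0.5):
    """Group consecutive sentences into chunks, breaking wherever the
    similarity between adjacent sentence embeddings falls below threshold."""
    if not sentences:
        return []
    vecs = [embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, cur, sentence in zip(vecs, vecs[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:
            # Topic shift detected: close the current chunk.
            chunks.append(" ".join(current))
            current = [sentence]
        else:
            current.append(sentence)
    chunks.append(" ".join(current))
    return chunks
```

The slow part is computing the embeddings, not the grouping, which is why GPU speed matters here.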
Good Introduction to Context Length
https://agi-sphere.com/context-length/
1
0
The Fastest Way To Achieve Literally Anything (You'll Never Feel Lost Again)
I have been feeling a bit scattered for the last few years, unable to focus on one specific thing because of the need to stop whatever I was doing whenever a new legal document in my case was filed or a deadline was approaching. I now find myself in a condition the following video gave words to: psychic entropy. So I watched this video today, and while I was thinking about how to apply the information in my own life, a thought hit me: what if agentic AI systems were set up in a way that followed the content of this video? https://www.youtube.com/watch?v=aoB2CiFNGdc
4
3
New comment 2d ago
The Fastest Way To Achieve Literally Anything (You'll Never Feel Lost Again)
2 likes • 2d
@Paraskevi Kivroglou I am thinking of breaking agents into subagents that each do a specific part of the system in the video. I'm not super clear on the vision yet.
Scratching my head about this for a long time
The Introduction to Langchain (LLM Applications) video was very interesting, and it touches on some things I have been struggling with. My question about this video goes like this. My great battle is using my preferred LLM, in this case Ollama, and although most frameworks say they are compatible with a local Ollama, I have never been able to get a straight answer from anyone about how to make that happen. I found it relatively easy to create working models on my local machine with a GPU, so I run Ollama locally with something like AnythingLLM in a Docker container, but when I attempt to host live for the world, things start going wrong. Most cloud hosts that allow Ollama to run are pretty expensive, and if one decides to use a paid-for product, there are other restrictions like cost and number of requests. In my mind it would be perfect to run lightweight agents just like in this video and run a private LLM somewhere, or use an endpoint at Hugging Face, but that is still a mystery to me. So, oh yes, my question: if I want to replicate what David is doing in this video, how would I reference either a local or remotely hosted Ollama installation?
2
5
New comment 2d ago
0 likes • 6d
I have not gotten to the point where I want to try to host Ollama on an external server. In my limited understanding, it's going to be pretty expensive to rent the hardware you would need for Ollama to run at any acceptable speed for real use cases. I'm interested in seeing what others have to say about this. To reference a locally running Ollama instance, you would use the OpenAI Python package if writing your own code, or the OpenAI chat model in LangChain, where you would just change the base URL to point at your local Ollama instance. This is a concise article on using the Ollama API with Python: https://dev.to/jayantaadhikary/using-the-ollama-api-to-run-llms-and-generate-responses-locally-18b7
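As a minimal sketch, assuming a default local install (Ollama serves an OpenAI-compatible API at ``http://localhost:11434/v1``) and a model you have already pulled, you can even skip the client libraries and call it with the standard library; the model name here is just an example:

```python
import json
import urllib.request

# Ollama's OpenAI-compatible endpoint on the default port.
OLLAMA_BASE = "http://localhost:11434/v1"

def build_chat_request(prompt, model="llama3.2:3b"):
    """Build the JSON body for a chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1,
    }

def ask_ollama(prompt, model="llama3.2:3b"):
    """Send the request to a locally running Ollama and return the reply text."""
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/chat/completions",
        data=json.dumps(build_chat_request(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]
```

For a remotely hosted Ollama you would only change ``OLLAMA_BASE``; in LangChain the equivalent is pointing the chat model's base URL at the same address.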
0 likes • 3d
@Carl Scutt What computer are you using? I have a 2019 MacBook Pro with 16 GB of RAM, and it is enough to run Ollama for embeddings and the llama3.2:3b model fairly fast. It's enough for learning.
Jimmy Jones
5
60 points to level up
@jimmy-jones-8626
24 year veteran in the web / software industries.

Active 3h ago
Joined Aug 5, 2023
INTP
Pagosa Springs, CO