Hi everyone! I've been developing a Python script that simplifies our SEO competitor analysis. I believe it could be really useful for many of us here.
What It Does: The script uses the TextRazor API to extract entities from competitor content stored in text files. It then organizes these entities, along with their occurrence counts, in a Google Sheet. Moreover, it calculates and displays intersections of entities between different competitors, helping identify common or unique themes across multiple sources.
KEY FEATURES
ENTITY EXTRACTION
- Powered by TextRazor's API: The script extracts entities from each competitor's content, providing insights into key topics or entities they are focusing on.
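To give a rough idea of the counting step, here is a minimal sketch that tallies entity mentions once they have been pulled from the API response. It does not call TextRazor itself: `count_entities` and the sample list are hypothetical, and the lowercase/strip normalization is my assumption for illustration, not necessarily what the script in the repo does.

```python
from collections import Counter


def count_entities(entity_ids):
    """Count how often each entity appears.

    Assumption: mentions are compared case-insensitively and with
    surrounding whitespace removed; blank strings are skipped.
    """
    normalized = (e.strip().lower() for e in entity_ids if e.strip())
    return Counter(normalized)


# Hypothetical entity mentions for one competitor page.
mentions = ["Python", "SEO", "python", "Google Sheets", "SEO ", "SEO"]
counts = count_entities(mentions)
print(counts.most_common())
# → [('seo', 3), ('python', 2), ('google sheets', 1)]
```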
GOOGLE SHEETS INTEGRATION
- Organized Data: All extracted data is neatly organized and stored in Google Sheets, allowing for easy access and manipulation.
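For anyone curious how the data might be shaped before it goes into a sheet, here is a small sketch that turns entity counts into a header row plus data rows. The function name `build_rows`, the column titles, and the sort order are assumptions for illustration; the actual upload would be done by whatever Google Sheets client you use, which is left out here.

```python
def build_rows(counts):
    """Turn {entity: count} into a header row plus data rows.

    Assumption: rows are sorted by count (highest first), then
    alphabetically so the order is stable between runs.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [["Entity", "Count"]] + [[entity, count] for entity, count in ordered]


# Hypothetical counts for one competitor.
rows = build_rows({"seo": 3, "python": 2, "backlinks": 3})
for row in rows:
    print(row)
```

A list of lists like this maps directly onto a rectangular range of cells, which is why it is a convenient intermediate format.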
DYNAMIC INTERSECTIONS
- Competitor Analysis: The script can dynamically calculate and list the intersections of entities among competitors. This feature shows how many and which specific entities are shared, enhancing competitor analysis.
- Custom Headings in Sheets: Intersections for each combination of competitors are displayed under headings like "C1 ∩ C2 ∩ C3", "C1 ∩ C2", "C1 ∩ C3", and "C2 ∩ C3", where C1, C2, and so on stand for competitor 1, competitor 2, etc.
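The intersection logic can be sketched with plain Python sets. This is a simplified illustration under my own naming (`entity_intersections` and the sample data are hypothetical), not a copy of the code in the repo. It walks every combination of two or more competitors, largest first, and builds the heading from the competitor labels.

```python
from itertools import combinations


def entity_intersections(entities_by_competitor):
    """Map headings like 'C1 ∩ C2' to the sorted entities shared by that group."""
    labels = list(entities_by_competitor)
    results = {}
    # Start with all competitors together, then work down to pairs.
    for size in range(len(labels), 1, -1):
        for combo in combinations(labels, size):
            shared = set.intersection(
                *(set(entities_by_competitor[c]) for c in combo)
            )
            results[" ∩ ".join(combo)] = sorted(shared)
    return results


# Hypothetical entity sets for three competitors.
data = {
    "C1": {"seo", "python", "backlinks"},
    "C2": {"seo", "python"},
    "C3": {"seo", "backlinks"},
}
for heading, shared in entity_intersections(data).items():
    print(f"{heading}: {len(shared)} shared -> {shared}")
```

Keep in mind that the number of combinations grows quickly: 3 competitors give 4 headings, while 10 competitors give 1,013, so for larger runs you may want to limit the combination size.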
CUSTOMIZABLE AND SCALABLE
- Flexibility: Whether analyzing 2 competitors or 10, the script adjusts the analysis based on the number of data files processed, ensuring scalability and customization to fit different needs.
How It Helps:
- Strategic Insights: Quickly see what competitors are focusing on in their content strategies.
- Efficiency: Automates tedious data collection and processing tasks, allowing you to focus on strategy and execution.
- Flexibility: Works with any number of competitors and adapts the analysis accordingly.
Before You Start:
- Prepare Your Data: You need to manually extract the content from your competitors' websites and save it as text files, one per competitor, named to match comp*.txt. TextRazor can also extract content directly from URLs, but I don't recommend that here: URL extraction often pulls in unrelated text from the page's HTML, which clutters the analysis with irrelevant entities.
- Set Up Your API and Sheets: Make sure to have your TextRazor API key and Google Sheet configured and ready to go.
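As a rough sketch of the file-loading step, the snippet below reads every file matching comp*.txt from a folder and labels the contents C1, C2, and so on. The function name, the example file names (comp1.txt, comp2.txt, comp10.txt), and the numeric sort are assumptions on my part. The numeric sort is there because a plain alphabetical sort would put comp10.txt before comp2.txt.

```python
import re
import tempfile
from pathlib import Path


def load_competitor_files(folder):
    """Read comp*.txt files and label them C1, C2, ... in numeric file order."""

    def sort_key(path):
        # Use the first number in the file name; fall back to 0 if there is none.
        match = re.search(r"(\d+)", path.stem)
        return (int(match.group(1)) if match else 0, path.name)

    paths = sorted(Path(folder).glob("comp*.txt"), key=sort_key)
    return {
        f"C{i}": path.read_text(encoding="utf-8")
        for i, path in enumerate(paths, start=1)
    }


# Demo with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    for name, body in [("comp10.txt", "ten"), ("comp2.txt", "two"), ("comp1.txt", "one")]:
        Path(tmp, name).write_text(body, encoding="utf-8")
    texts = load_competitor_files(tmp)

print(texts)
# → {'C1': 'one', 'C2': 'two', 'C3': 'ten'}
```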
Feel free to experiment with it and let me know if there are any improvements or features you'd like to see! The script isn't very visually appealing yet, but I plan to improve the data visualization, including Venn diagrams, in the future. The code is here: https://github.com/kalhara2018/entity-analyzer