youtube-transcript.ai

Web scraping: Claude Code for Economists with Paul Goldsmith-Pinkham | Markus Academy | Ep. 162-3

For economists and researchers interested in using AI tools like Claude Code for efficient web scraping and data analysis.

TL;DR

This video demonstrates how to use Claude Code for web scraping, specifically extracting tariff-related risk factors from SEC filings. It guides economists through building a database and performing basic analysis directly from the command line, showcasing the power of AI in data collection.

In This Video

  1. 00:00 Introduction to Claude Code for Economists

    Welcome to the third video in a series on using Claude Code for applied economists. This episode focuses on web scraping.

  2. 01:16 Web Scraping vs. Structured Data

    Unlike structured data from APIs, web scraping involves collecting messy, unstructured data that needs significant processing.

  3. 01:49 Targeting SEC EDGAR for 10-K Filings

    We will scrape 10-K annual filings from the SEC's EDGAR database to analyze risk factors.

  4. 02:18 Analyzing Tariff-Related Risk Disclosures

    The goal is to track how 'tariff' mentions in risk factors (Item 1A) change over time.

  5. 03:15 Project Steps: Scraping, Extracting, Database, Analysis

    The process involves scraping 10-K filings, extracting Item 1A, building a database, and querying trends.

  6. 04:21 Providing Explicit Context to Claude

    Giving Claude precise information about the EDGAR API, rate limits, and user agents improves its performance.

  7. 06:08 Initiating the Project on the Command Line

    Starting a new Claude session, creating a folder, and providing the detailed prompt to begin the data collection.
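
The project steps in item 5 (extract Item 1A, build a database, query trends) can be sketched offline in Python. This is a minimal illustration, not the code produced in the video: the regular expression for locating Item 1A, the table name risk_factors, and all function names are assumptions made here, and real 10-K filings vary enough that the heading pattern will need tuning.

```python
import re
import sqlite3

# Assumed heuristic: Item 1A starts at its "Risk Factors" heading and ends
# at the next "Item 1B" or "Item 2" heading. Real filings vary.
ITEM_1A = re.compile(
    r"item\s+1a\.?\s*risk\s+factors(.*?)(?=item\s+1b\b|item\s+2\b)",
    re.IGNORECASE | re.DOTALL,
)
TARIFF = re.compile(r"\btariffs?\b", re.IGNORECASE)


def extract_item_1a(filing_text):
    """Return the longest Item 1A candidate, or '' if none is found.

    Taking the longest match skips the short table-of-contents entry.
    """
    candidates = ITEM_1A.findall(filing_text)
    return max(candidates, key=len).strip() if candidates else ""


def count_tariff_mentions(text):
    """Count whole-word occurrences of 'tariff' or 'tariffs'."""
    return len(TARIFF.findall(text))


def build_db(filings, path=":memory:"):
    """Store one row per (cik, fiscal_year, filing_text) tuple."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS risk_factors ("
        "cik TEXT, fiscal_year INTEGER, item_1a TEXT, "
        "tariff_mentions INTEGER, PRIMARY KEY (cik, fiscal_year))"
    )
    for cik, year, text in filings:
        section = extract_item_1a(text)
        conn.execute(
            "INSERT OR REPLACE INTO risk_factors VALUES (?, ?, ?, ?)",
            (cik, year, section, count_tariff_mentions(section)),
        )
    conn.commit()
    return conn


def tariff_trend(conn):
    """Return (fiscal_year, total tariff mentions) rows in year order."""
    return conn.execute(
        "SELECT fiscal_year, SUM(tariff_mentions) FROM risk_factors "
        "GROUP BY fiscal_year ORDER BY fiscal_year"
    ).fetchall()
```

Passing a file path instead of ":memory:" to build_db keeps the database on disk, so the trend query can be rerun from the command line without scraping again.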

Questions & Answers

How do you scrape data from the SEC EDGAR website using Claude Code?
You can scrape 10-K filings from the SEC EDGAR website using Claude Code by providing a structured prompt detailing the company's CIK (Central Index Key), the EDGAR filing API, rate limits, and the user agent header.
What is EDGAR and what are 10-K filings?
EDGAR is the SEC's database for regulatory filings. 10-Ks are annual filings made by companies, which include a section called Item 1A detailing material risks.
How does Claude Code handle unstructured or messy data?
Unlike structured data from an API, data scraped from the web arrives messy and unstructured. Claude Code can write and run the scripts that do the significant work of cleaning and structuring it.
What is the purpose of a user agent header in web scraping?
A user agent header is sent with each HTTP request the Python script makes. It identifies who is making the request, often with a name and email, which helps websites identify the script and makes it less likely to be blocked.
What is the rate limit for downloading files from Edgar?
The rate limit for downloading from EDGAR is about 10 requests per second. Exceeding it looks like spamming the server and can get the script blocked.
How can Claude Code be used for economic research on policy uncertainty?
Claude Code can be used to scrape and analyze data from regulatory filings like 10-Ks to study trends in risk disclosures, such as those related to tariffs and policy uncertainty.
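
The answers above about user agent headers and rate limits can be made concrete with a short Python sketch that uses only the standard library. It is an illustration under stated assumptions, not the script from the video: the helper names, the budget of 8 requests per second (chosen to stay under the roughly 10 per second limit), the placeholder name and email, and the data.sec.gov submissions URL pattern are all assumptions to check against current SEC documentation.

```python
import time
import urllib.request


class RateLimiter:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, max_per_second):
        self.min_interval = 1.0 / max_per_second
        self._last = float("-inf")  # so the first call never sleeps

    def wait(self):
        delay = self._last + self.min_interval - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        self._last = time.monotonic()


def submissions_url(cik):
    """Assumed EDGAR submissions endpoint; the CIK is zero-padded to 10 digits."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"


def build_request(url, user_agent):
    """Attach a User-Agent header that identifies the requester."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent, limiter, timeout=30):
    """Download one URL after waiting for the rate limiter."""
    limiter.wait()
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical usage (performs network access, so it is left commented out):
# limiter = RateLimiter(max_per_second=8)
# raw = fetch(submissions_url("1234"), "Jane Doe jane.doe@example.edu", limiter)
```

Sharing one RateLimiter across every download keeps the whole script under the budget, rather than each function pacing itself separately.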

Source

YouTube video. Original: https://www.youtube.com/watch?v=wqLZrKdevHs
Transcript captured and processed by youtube-transcript.ai on 2026-06-25.