beautifulsoup4

Parsing

Web scraping

What is beautifulsoup4?

BeautifulSoup4 (bs4) is a Python library for parsing HTML and XML documents. It creates a parse tree from page source code that makes it easy to navigate, search, and extract data. Combined with the requests library, it forms the basis of the most common Python web-scraping stack.

In PyRun, BeautifulSoup4 is loaded via micropip and is ready to import without any manual installation. You can parse raw HTML strings, extract tags, attributes, and text content, and explore document structure, all without a local Python environment. Because the code runs in your browser on HTML you supply yourself, it is a low-risk way to learn web scraping concepts.

Code Example

Navigate and extract data from an HTML document.

HTML Parsing with BeautifulSoup
from bs4 import BeautifulSoup

html_doc = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Article List</h1>
    <ul>
      <li><a href="/post/1" class="post">Intro to Python</a></li>
      <li><a href="/post/2" class="post">NumPy Basics</a></li>
      <li><a href="/post/3" class="post">Pandas Guide</a></li>
    </ul>
    <p class="footer">© 2025 PyRun</p>
  </body>
</html>
"""

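# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html_doc, 'html.parser')

# .string returns the text of a tag that contains a single text node.
print("Title:", soup.title.string)
print("H1:", soup.h1.string)

# find_all() with class_ returns every <a> tag that has the class "post".
print("\nLinks:")
for a in soup.find_all('a', class_='post'):
    print(f"  {a.string}  →  {a['href']}")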
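
The example above reads each link with a.string and a['href'], which works because every link there holds a single text node and has an href attribute. As a minimal sketch of a more defensive variant, the snippet below uses get_text() to collect text from nested tags and .get() to return None when an attribute is missing. The markup is hypothetical sample data, and names such as messy_html and posts are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one link has nested tags, one has no href.
messy_html = """
<ul>
  <li><a href="/post/1" class="post">Intro to <b>Python</b></a></li>
  <li><a href="/post/2" class="post">NumPy Basics</a></li>
  <li><a class="post">Draft without a link</a></li>
</ul>
"""

soup = BeautifulSoup(messy_html, "html.parser")

posts = []
for a in soup.find_all("a", class_="post"):
    posts.append({
        # get_text() joins the text of all descendants; the " " separator
        # keeps words from nested tags apart, strip=True trims whitespace.
        "title": a.get_text(" ", strip=True),
        # .get() returns None for a missing attribute, while a["href"]
        # would raise KeyError.
        "href": a.get("href"),
    })

for post in posts:
    print(post["title"], "->", post["href"])

# .string is None when a tag has more than one child node.
first_link = soup.find("a")
print("first_link.string:", first_link.string)

# .parent moves one level up the parse tree.
print("Parent tag:", first_link.parent.name)
```

Collecting the results into a list of dictionaries keeps extraction separate from printing, so the same data can be filtered or exported later.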

Why run beautifulsoup4 in PyRun?

  • Zero setup — no pip install, no virtual environment, no Python download
  • Instant results — powered by WebAssembly, runs locally in your browser
  • Share your code — generate a link and anyone can run it instantly
  • Works offline — after first load, PyRun runs without internet
