beautifulsoup4
Parsing, Web scraping
What is beautifulsoup4?
BeautifulSoup4 (bs4) is a Python library for parsing HTML and XML documents. It creates a parse tree from page source code that makes it easy to navigate, search, and extract data. Combined with the requests library, it forms the basis of the most common Python web-scraping stack.
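As a rough illustration of what "navigate and search" means on the parse tree, the sketch below uses a small made-up HTML snippet (the markup and variable names are assumptions for this example, not part of PyRun). It searches with `find()` and `find_all()`, moves between nodes with `find_next_sibling()` and `.parent`, and runs a CSS selector with `select_one()`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this sketch.
html = """
<div id="menu">
  <p class="intro">Today's specials</p>
  <ul>
    <li>Soup</li>
    <li>Salad</li>
    <li>Bread</li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Search: find() returns the first match, find_all() returns every match.
first_item = soup.find("li")
all_items = [li.get_text() for li in soup.find_all("li")]

# Navigate: move sideways and upward from a node you already hold.
second_item = first_item.find_next_sibling("li")
container = first_item.parent.parent  # li -> ul -> div

# CSS selectors are another way to search the same tree.
intro = soup.select_one("div#menu > p.intro")

print(all_items)
print(second_item.get_text())
print(container["id"])
print(intro.get_text())
```

Searching is usually the quickest way to reach a node, while navigation helps when the element you want is only identifiable by its position relative to one you have already found.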
In PyRun, BeautifulSoup4 is loaded via micropip and is ready to import. You can parse raw HTML strings, extract tags, attributes, and text content, and explore document structure, all without a local Python environment. That makes it a convenient way to practice web-scraping concepts on sample markup.
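Extracting attributes and text has a couple of edge cases worth seeing in code. The following is a minimal sketch on invented markup (the snippet and the `links` structure are assumptions for illustration): it contrasts `.string` with `get_text()` on a tag that has nested children, and reads attributes with `.get()` so that a missing `href` yields `None` rather than raising `KeyError`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this sketch.
html = """
<article>
  <h2>  Release <em>notes</em>  </h2>
  <a href="/docs" title="Docs">Read the docs</a>
  <a>No link here</a>
</article>
"""

soup = BeautifulSoup(html, "html.parser")

# .string is None when a tag has more than one child, so get_text()
# is the safer choice for mixed content such as text plus an <em>.
heading = soup.h2
heading_string = heading.string
heading_text = heading.get_text(" ", strip=True)

# Collect text and attributes for every <a> tag.
links = []
for a in soup.find_all("a"):
    links.append({
        "text": a.get_text(strip=True),
        "href": a.get("href"),     # None when the attribute is missing
        "attrs": dict(a.attrs),    # all attributes as a plain dict
    })

print(heading_string)
print(heading_text)
for link in links:
    print(link)
```

Indexing with `a["href"]` is fine when every tag is known to carry the attribute; `.get()` is the more forgiving option for markup you do not control.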
Code Example
Navigate and extract data from an HTML document.
HTML Parsing with BeautifulSoup
from bs4 import BeautifulSoup
html_doc = """
<html>
<head><title>Sample Page</title></head>
<body>
<h1>Article List</h1>
<ul>
<li><a href="/post/1" class="post">Intro to Python</a></li>
<li><a href="/post/2" class="post">NumPy Basics</a></li>
<li><a href="/post/3" class="post">Pandas Guide</a></li>
</ul>
<p class="footer">© 2025 PyRun</p>
</body>
</html>
"""
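# Parse the document with Python's built-in HTML parser.
soup = BeautifulSoup(html_doc, 'html.parser')
# Tag names used as attributes return the first matching tag.
print("Title:", soup.title.string)
print("H1:", soup.h1.string)
print("\nLinks:")
# find_all() returns every <a> tag whose class is "post".
for a in soup.find_all('a', class_='post'):
    print(f" {a.string} → {a['href']}")
Why run beautifulsoup4 in PyRun?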
- ✦ Zero setup — no pip install, no virtual environment, no Python download
- ✦ Instant results — powered by WebAssembly, runs locally in your browser
- ✦ Share your code — generate a link and anyone can run it instantly
- ✦ Works offline — after first load, PyRun runs without internet