Smolagents: Build AI Agents in Minutes with Python!

I recently discovered the smolagents library by Hugging Face, and I’m excited about how it simplifies creating AI agents.

These agents use powerful language models to write and run Python code, making them more flexible than other tools I’ve tried.

In this guide, I’ll share what I’ve learned about smolagents, their features, and how you can use them to automate your tasks.

Whether you’re a beginner or a seasoned coder, I hope my experience inspires you to try smolagents.

What Are Smol Agents?

Smol agents are the heart of the smolagents library. They let me build smart systems that think and act by generating code on the fly. Unlike rigid, rule-based systems, these agents use large language models to make decisions dynamically. Here’s what I love about them:

  • Simple Design: The core agent logic fits in roughly 1,000 lines of code, so it’s easy to understand and fix issues.
  • Code-Based Actions: Agents write Python code to perform tasks, which feels more natural than using JSON or text.
  • Safe Execution: Code runs in secure environments like E2B or Docker, protecting my system.
  • Flexible Models: I can use any language model, from Hugging Face to OpenAI.
  • Multiple Inputs: They handle text, images, videos, and audio, opening up many possibilities.
  • Community Sharing: I can share my tools and agents on the Hugging Face Hub.

I find smolagents perfect for tasks like searching the web, analyzing data, or automating web actions, which is exactly what I need for my social media posting project.

Key Features and Benefits

Here’s a quick look at why smolagents stands out, based on my experience:

| Feature | Description |
| --- | --- |
| Simple Design | Uses ~1,000 lines of code, making it easy to debug and understand. |
| Code-Based Actions | Writes Python code for tasks, allowing complex logic and smooth data handling. |
| Safe Execution | Runs code in secure E2B or Docker environments to protect your system. |
| Model Flexibility | Works with any language model, like Qwen or OpenAI, via LiteLLM. |
| Multiple Inputs | Supports text, images, videos, and audio for diverse applications. |
| Community Sharing | Shares tools and agents easily through the Hugging Face Hub. |

These features make smolagents more user-friendly than other frameworks I’ve used, which often feel overly complex or restrictive.

Getting Started with Smolagents

Setting up smolagents was straightforward on my Ubuntu 24.04 system with an NVIDIA RTX 4080. Here’s how you can do it:

  • Check Requirements:
    • You need Python 3.10 or higher (smolagents requires it). Check with `python --version`.
    • Ensure pip is installed. Run `pip --version` to confirm.
    • Sign up for a Hugging Face account and create an access token on the Hugging Face Tokens settings page.
  • Create a Virtual Environment:
    • I like keeping projects organized, so create a virtual environment:
python -m venv smolenv
  • Activate it:
    • On Windows: `smolenv\Scripts\activate`
    • On Linux/macOS: `source smolenv/bin/activate`
  • Install Smolagents:
    • Install the library with:
pip install smolagents
  • Set Your Hugging Face Token:
    • To use Hugging Face models, set your token:
      • On Linux/macOS: `export HF_TOKEN=your_token_here`
      • On Windows: `set HF_TOKEN=your_token_here`
    • Replace `your_token_here` with your token.
  • Run Your First Agent:
    • Create a file named `first_agent.py` and add:
from smolagents import CodeAgent, WebSearchTool, InferenceClientModel

model = InferenceClientModel()
agent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True)
agent.run("What is the capital of France?")
    • Run it with `python first_agent.py`. The agent searches and displays the answer.

You’re now ready to build your own agents! For more details, check the smolagents documentation.
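Re-exporting the token in every new shell gets tedious. A .env file is a handy alternative; here’s a minimal hand-rolled sketch of the idea (the python-dotenv package does this more robustly, and the file name is just a convention):

```python
# Minimal .env-style loader: reads KEY=VALUE lines into os.environ.
# A hypothetical alternative to the shell `export HF_TOKEN=...` step.
import os

def load_env_file(path=".env"):
    """Read KEY=VALUE lines from path into os.environ (existing vars win)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env_file()
```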

Examples of Smol Agents

Let me share three examples I tried, showing how versatile smolagents can be for your projects.

1. Building My First Agent

I started with a simple agent that answers questions by searching the web. Here’s the code I used:

from smolagents import CodeAgent, WebSearchTool, InferenceClientModel

model = InferenceClientModel()
agent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True)
agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")

This agent searches for the leopard’s speed and the bridge’s length, then calculates the time. It’s perfect for gathering quick facts.
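Conceptually, the code the agent writes for this task boils down to two lookups and one line of arithmetic. Here’s a hand-worked sketch with a stub `search()` function (the numbers are illustrative stand-ins for real search results, not smolagents internals):

```python
# Toy illustration of a "code action": the model chains tool calls and
# arithmetic in plain Python. search() here is a stub, not a real tool.
def search(query):
    fake_results = {
        "leopard top speed": "58 km/h",
        "Pont des Arts length": "155 m",
    }
    return fake_results[query]

speed_kmh = float(search("leopard top speed").split()[0])    # 58 km/h
length_m = float(search("Pont des Arts length").split()[0])  # 155 m

seconds = length_m / (speed_kmh * 1000 / 3600)  # convert km/h to m/s first
print(f"About {seconds:.1f} seconds")
```

A JSON-based tool-calling agent would need several round trips to chain these results; a code agent can do it in a single generated snippet.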

2. Creating a Multi-Agent System

For bigger tasks, I built a system where multiple agents work together. A manager agent assigns tasks to others, like this:

from smolagents import CodeAgent, WebSearchTool, VisitWebpageTool, InferenceClientModel

model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

# Web Search Agent: does the actual searching and page visits
web_search_agent = CodeAgent(
    tools=[WebSearchTool(), VisitWebpageTool()],
    model=model,
    name="web_search_agent",
    description="Searches the web and visits pages to gather information.",
)

# Manager Agent: delegates research to the web search agent
manager_agent = CodeAgent(tools=[], model=model, managed_agents=[web_search_agent])

manager_agent.run("Search for recent AI trends and summarize findings.")

This setup splits tasks, making it easier to research content. My system with 12GB VRAM can handle Qwen2.5-7B with 4-bit quantization, but larger models like 32B may need cloud resources.
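That VRAM claim is easy to sanity-check with back-of-envelope math: weight memory is roughly parameter count times bytes per parameter, plus overhead for activations and the KV cache. A rough sketch (the flat 20% overhead factor is my own assumption):

```python
# Back-of-envelope VRAM estimate for loading model weights.
# bytes_per_param: ~2.0 for fp16, ~0.5 for 4-bit quantization.
def vram_gb(n_params_billion, bytes_per_param, overhead=1.2):
    """Approximate GB needed, with a flat 20% overhead (an assumption)."""
    return n_params_billion * 1e9 * bytes_per_param * overhead / 1e9

print(f"Qwen2.5-7B @ 4-bit:  ~{vram_gb(7, 0.5):.1f} GB")   # fits in 12 GB
print(f"Qwen2.5-32B @ 4-bit: ~{vram_gb(32, 0.5):.1f} GB")  # exceeds 12 GB
```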

3. Automating Web Tasks with Helium

To automate web actions, I used Helium for browser tasks. Here’s an example where my agent writes a blog post:

First, install Helium:

pip install helium

Then, use this code:

from smolagents import CodeAgent, InferenceClientModel
import helium

model = InferenceClientModel()
agent = CodeAgent(tools=[], model=model, stream_outputs=True, additional_authorized_imports=["helium"])
agent.run("""
from helium import *
start_chrome('https://ghost.org')
click('Sign in')
write('username', into='Email')
write('password', into='Password')
click('Sign in')
go_to('https://ghost.org/admin')
click('New post')
write('Detailed article on smolAgents', into='Post title')
""")

This agent logs into Ghost and starts a post. I can tweak it to post on LinkedIn or X by updating the URLs and actions.

My Social Media Posting Project

My goal is to automate posts on LinkedIn and X, and smolagents fits perfectly. Here’s how I plan to use it:

  • LinkedIn Posts: A web agent logs into LinkedIn, navigates to the post page, and shares content created by another agent.
  • X Posts: An agent posts updates on X, using web searches to find trending topics.
  • Teamwork: A manager agent coordinates research and posting agents for a smooth workflow.

Below is the code for each major component of the workflow, in the order it runs.

1. Imports and Environment Setup
import os
import time
from datetime import datetime
from PIL import Image
from io import BytesIO
from dotenv import load_dotenv
from smolagents import CodeAgent, LiteLLMModel, TransformersModel
import helium

# Load credentials from .env file
load_dotenv()

LINKEDIN_USERNAME = os.getenv("LINKEDIN_USERNAME")
LINKEDIN_PASSWORD = os.getenv("LINKEDIN_PASSWORD")
X_USERNAME = os.getenv("X_USERNAME")
X_PASSWORD = os.getenv("X_PASSWORD")
    
2. Model Configuration
def get_model():
    # Option 1: Use a local Ollama model via LiteLLM
    return LiteLLMModel(
        model_id="ollama/llama3:8b",  # change to your preferred model
        api_base="http://localhost:11434",  # default Ollama endpoint
        api_key="ollama"  # LiteLLM expects a value; Ollama ignores it
    )
    
    # Option 2: Use a Hugging Face model
    # return TransformersModel(
    #     model_id="meta-llama/Llama-3.2-3B-Instruct",
    #     max_new_tokens=1024,
    #     device_map="auto"
    # )
    
3. Screenshot & Popup Helpers
def save_screenshot(step_log, agent):
    time.sleep(1.5)  # Give page time to load
    driver = helium.get_driver()
    if driver:
        png_bytes = driver.get_screenshot_as_png()
        image = Image.open(BytesIO(png_bytes))
        step_log.observations_images = [image.copy()]
        print("📸 Screenshot captured")

from smolagents import tool
from selenium.webdriver.common.by import By

@tool
def close_popups() -> str:
    """Close common social media popups (cookie banners, dialogs, prompts)."""
    try:
        driver = helium.get_driver()
        common_buttons = [
            "//button[contains(@aria-label, 'Close')]",
            "//button[contains(@aria-label, 'Dismiss')]",
            "//div[@role='dialog']//button",
            "//button[contains(text(), 'Not Now')]"
        ]
        for selector in common_buttons:
            try:
                # Selenium 4 API: find_elements with a By locator
                elements = driver.find_elements(By.XPATH, selector)
                for el in elements:
                    if el.is_displayed():
                        el.click()
                        time.sleep(0.5)
                        return "Popup closed"
            except Exception:
                continue
        return "No popups found"
    except Exception:
        return "Error checking for popups"
    
4. Research Agent
def run_research_agent(topic):
    print("🔍 Starting research agent...")
    research_agent = CodeAgent(
        model=get_model(),
        stream_outputs=True,
        additional_authorized_imports=["helium", "time", "requests", "bs4"],
        step_callbacks=[save_screenshot],
        tools=[close_popups]
    )
    research_prompt = f"""
    Research trending topics related to {topic} for a social media post.
    Follow these steps:
    1. Import helium: from helium import *
    2. Start a browser: start_chrome('https://www.google.com')
    3. Search for: "trending topics {topic} {datetime.now().strftime('%B %Y')}"
    4. Visit 2-3 sources to find current trending topics
    5. Summarize your findings into 3 trending subtopics
    FORMAT YOUR RESPONSE LIKE THIS:
    TOPIC 1: [Name]
    SUMMARY: [2-3 sentence explanation]
    TOPIC 2: [Name]
    SUMMARY: [2-3 sentence explanation]
    TOPIC 3: [Name]
    SUMMARY: [2-3 sentence explanation]
    After research, make sure to kill the browser: kill_browser()
    """
    research_results = research_agent.run(research_prompt)
    print("\n✅ Research complete!\n")
    return research_results
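Because the research prompt pins down a strict TOPIC/SUMMARY format, the agent’s reply can be post-processed mechanically. Here’s a small parser sketch (the helper name is mine, and it assumes the model actually followed the format):

```python
import re

def parse_topics(text):
    """Extract (name, summary) pairs from 'TOPIC n: ... SUMMARY: ...' blocks."""
    pattern = r"TOPIC \d+:\s*(.+?)\s*SUMMARY:\s*(.+?)(?=\s*TOPIC \d+:|\Z)"
    return [(name.strip(), summary.strip())
            for name, summary in re.findall(pattern, text, re.DOTALL)]

sample = """TOPIC 1: Agentic RAG
SUMMARY: Retrieval pipelines driven by agents.
TOPIC 2: Small models
SUMMARY: Compact LLMs running locally."""
print(parse_topics(sample))
```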
    
5. Content Creation Agent
def create_content(research, topic_type="informative"):
    print("✍️ Creating content based on research...")
    content_agent = CodeAgent(
        tools=[],  # pure text generation, no tools needed
        model=get_model(),
        stream_outputs=True
    )
    content_prompt = f"""
    Based on this research about {topic_type} topics:
    {research}
    Create:
    1. A professional LinkedIn post (250-300 words) with hashtags and a call to action
    2. A concise X post (under 280 characters) with relevant hashtags
    FORMAT YOUR RESPONSE LIKE THIS:
    === LINKEDIN POST ===
    [LinkedIn post content here]
    === X POST ===
    [X post content here]
    """
    content = content_agent.run(content_prompt)
    print("\n✅ Content creation complete!\n")
    return content
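Before handing the reply to the posting agents, it’s worth verifying that the X section actually fits the 280-character limit. A small validation sketch (the helper names are mine, mirroring the same === markers the prompt asks for):

```python
import re

def extract_x_post(content):
    """Pull the X section out of a '=== X POST ===' formatted reply."""
    m = re.search(r"=== X POST ===\s*(.*?)(?=\s*===|\Z)", content, re.DOTALL)
    return m.group(1).strip() if m else content

def fits_x_limit(post, limit=280):
    """True if the post respects X's character limit."""
    return len(post) <= limit

reply = "=== LINKEDIN POST ===\nLong post...\n=== X POST ===\nShort post #AI"
post = extract_x_post(reply)
print(post, fits_x_limit(post))
```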
    
6. LinkedIn Posting Agent
def post_to_linkedin(content):
    print("🔗 Starting LinkedIn posting agent...")
    import re
    linkedin_match = re.search(r"=== LINKEDIN POST ===\s*(.*?)(?=\s*=== X POST ===|\Z)", content, re.DOTALL)
    linkedin_post = linkedin_match.group(1).strip() if linkedin_match else content
    linkedin_agent = CodeAgent(
        model=get_model(),
        stream_outputs=True,
        additional_authorized_imports=["helium", "time"],
        step_callbacks=[save_screenshot],
        tools=[close_popups]
    )
    linkedin_prompt = f"""
    Post the following content to LinkedIn:
    {linkedin_post}
    Follow these steps EXACTLY:
    1. Import helium: from helium import *
    2. Start LinkedIn: start_chrome('https://www.linkedin.com/login')
    3. Login:
       - Write '{LINKEDIN_USERNAME}' into 'Email or phone'
       - Write '{LINKEDIN_PASSWORD}' into 'Password'
       - Click 'Sign in'
    4. Wait for login: wait_until(Text('Feed').exists, timeout_secs=30)
    5. Look for post button: click('Start a post')
    6. Write the post content in the editor
    7. Click 'Post'
    8. Wait for confirmation: wait_until(Text('Post successful').exists, timeout_secs=30)
    9. Kill browser when done: kill_browser()
    If you encounter popups, use the close_popups() tool.
    """
    result = linkedin_agent.run(linkedin_prompt)
    print("\n✅ LinkedIn posting complete!\n")
    return result
    
7. X (Twitter) Posting Agent
def post_to_x(content):
    print("🐦 Starting X posting agent...")
    import re
    x_match = re.search(r"=== X POST ===\s*(.*?)(?=\s*===|\Z)", content, re.DOTALL)
    x_post = x_match.group(1).strip() if x_match else content[:280]
    x_agent = CodeAgent(
        model=get_model(),
        stream_outputs=True,
        additional_authorized_imports=["helium", "time"],
        step_callbacks=[save_screenshot],
        tools=[close_popups]
    )
    x_prompt = f"""
    Post the following content to X (Twitter):
    {x_post}
    Follow these steps EXACTLY:
    1. Import helium: from helium import *
    2. Start X: start_chrome('https://twitter.com/i/flow/login')
    3. Wait for login page: wait_until(Text('Sign in').exists)
    4. Login:
       - Write '{X_USERNAME}' into username field
       - Click 'Next'
       - Write '{X_PASSWORD}' into password field
       - Click 'Log in'
    5. Wait for home screen: wait_until(S('#react-root').exists)
    6. Click on 'Post' or 'What is happening?'
    7. Type the content
    8. Click 'Post'
    9. Wait for confirmation and kill browser: kill_browser()
    If you encounter popups, use the close_popups() tool.
    """
    result = x_agent.run(x_prompt)
    print("\n✅ X posting complete!\n")
    return result
    
8. Manager Agent
def run_manager(research_results, content, linkedin_result, x_result):
    print("🧠 Manager agent analyzing workflow...")
    manager_agent = CodeAgent(
        tools=[],  # review-only agent, no tools needed
        model=get_model(),
        stream_outputs=True
    )
    manager_prompt = f"""
    You're the Manager Agent overseeing a social media posting workflow.
    Review these outputs:
    RESEARCH RESULTS:
    {research_results}
    CONTENT CREATED:
    {content}
    LINKEDIN POSTING:
    {linkedin_result if linkedin_result else "Skipped"}
    X POSTING:
    {x_result if x_result else "Skipped"}
    Please provide:
    1. A brief summary of the workflow execution
    2. What went well in this process
    3. Suggestions for improving future posts
    4. Recommendations for optimizing the workflow
    """
    analysis = manager_agent.run(manager_prompt)
    print("\n✅ Manager analysis complete!\n")
    return analysis
    
9. Main Workflow
def run_social_media_workflow(topic="AI and machine learning", post_type="informative"):
    print(f"\n🚀 Starting social media workflow for {topic} ({post_type} posts)\n")
    if not LINKEDIN_USERNAME or not LINKEDIN_PASSWORD:
        print("⚠️ LinkedIn credentials not found in .env file")
    if not X_USERNAME or not X_PASSWORD:
        print("⚠️ X credentials not found in .env file")
    research_results = run_research_agent(topic)
    content = create_content(research_results, post_type)
    linkedin_result = None
    x_result = None
    if LINKEDIN_USERNAME and LINKEDIN_PASSWORD:
        linkedin_result = post_to_linkedin(content)
    else:
        print("⚠️ Skipping LinkedIn posting (no credentials)")
    if X_USERNAME and X_PASSWORD:
        x_result = post_to_x(content)
    else:
        print("⚠️ Skipping X posting (no credentials)")
    manager_analysis = run_manager(research_results, content, linkedin_result, x_result)
    print("\n🏁 Social media workflow completed successfully!")
    return {
        "research": research_results,
        "content": content,
        "linkedin_result": linkedin_result,
        "x_result": x_result,
        "manager_analysis": manager_analysis
    }

if __name__ == "__main__":
    # Create .env file with your credentials before running
    run_social_media_workflow(topic="artificial intelligence", post_type="thought leadership")
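Browser automation steps fail intermittently (slow pages, surprise popups), so I’d wrap the posting calls in a retry helper. A stdlib sketch (the helper and its defaults are my own, not part of smolagents):

```python
import time

def with_retries(fn, attempts=3, delay=2.0):
    """Call fn(), retrying on any exception; a sketch for flaky browser steps."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            print(f"Attempt {attempt} failed: {exc}")
            if attempt == attempts:
                raise  # out of retries, let the caller handle it
            time.sleep(delay)

# Usage (hypothetical): with_retries(lambda: post_to_linkedin(content))
```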
    

Unlike CrewAI, where one agent’s failure stopped everything, smolagents’ modular design keeps things running smoothly.

Comparing Smolagents to Other Tools

I’ve tried Claude Desktop, CrewAI, and n8n. Here’s how smolagents stacks up:

| Tool | Strengths | Weaknesses |
| --- | --- | --- |
| smolagents | Simple, code-based, secure, Hub sharing | Newer, fewer community examples |
| CrewAI | Customizable, task-focused | One agent’s failure disrupts all |
| n8n | Easy workflows, no-code friendly | Limited AI agent features |
| Claude | Strong model, user-friendly | Limited free tier, less flexible |

Smolagents balances ease and power, ideal for my coding and web automation needs.

Conclusion

Smolagents has been a fantastic find for me. Its simplicity, secure code execution, and flexibility make it great for automating tasks.

Whether you want to build a single agent or a complex system, smolagents has you covered. I encourage you to try it and see how it can simplify your projects.

FAQ

Which models work with smolagents?

You can use almost any model, such as Qwen served through Hugging Face, or OpenAI’s models via LiteLLM.

How does smolagents keep code safe?

It runs code in secure E2B or Docker environments.

Can smolagents do more than code?

Yes, it handles web searches, data analysis, and web automation.

Is there support for smolagents?

Check the Hugging Face documentation and GitHub for help.

How do I start with smolagents?

Install it with `pip install smolagents`, set your Hugging Face token, and try the examples above.