How to Set Up the OSWorld Environment: A Comprehensive Guide

Nov 2, 2021 | Data Science

If you’re looking to set up the OSWorld environment for benchmarking multimodal agents, you’ve landed in the right place. This guide will walk you through everything you need to install the environment on your desktop or server as well as virtualized platforms like AWS and Azure. Grab a cup of coffee, and let’s dive in!

Updates to the OSWorld Environment

  • 2024-06-15: We refactored the code for VMware Integration and started supporting other platforms like VirtualBox, AWS, and Azure. Hold tight!
  • 2024-04-11: Released our paper, environment and benchmark, and project page. Check it out!

Installation Instructions

For Non-Virtualized Platforms

Starting with a system that’s not virtualized? Here’s what to do:

  1. Clone the OSWorld repository and change into its directory:
       git clone https://github.com/xlang-ai/OSWorld
       cd OSWorld
  2. Install the dependencies from requirements.txt. Using Conda to manage the environment is recommended but optional:
       # Optional: create a Conda environment
       conda create -n osworld python=3.9
       conda activate osworld
       # Install the required dependencies
       pip install -r requirements.txt
  3. Install [VMware Workstation Pro](https://www.vmware.com/products/workstation-pro/workstation-pro-evaluation.html) and set up the vmrun command. On Apple silicon Macs, install [VMware Fusion](https://support.broadcom.com/group/ecx/productdownloads?subfamily=VMware+Fusion) instead. Verify the installation by running:
       vmrun -T ws list
     If everything is set up correctly, the command prints the list of currently running virtual machines. Note: You can also use [VirtualBox](https://www.virtualbox.org) if you run into issues with VMware.
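The vmrun verification from the last step can also be scripted, which is handy on a server where you want a quick pass/fail check. The following is a minimal sketch, not part of OSWorld: the helper names find_vmrun and list_running_vms are made up for illustration, and the default host type "ws" simply mirrors the command above (on VMware Fusion the host type is typically "fusion", which you should confirm against your installation).

```python
import shutil
import subprocess


def find_vmrun():
    """Return the full path to vmrun if it is on PATH, else None."""
    return shutil.which("vmrun")


def list_running_vms(vmrun_path, host_type="ws"):
    """Run `vmrun -T <host_type> list` and return its output lines.

    Hypothetical helper: it only wraps the command shown in the guide.
    """
    result = subprocess.run(
        [vmrun_path, "-T", host_type, "list"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    if result.returncode != 0:
        # Surface whatever vmrun reported so the failure is easy to diagnose.
        raise RuntimeError(result.stderr.strip() or result.stdout.strip())
    return result.stdout.splitlines()


if __name__ == "__main__":
    path = find_vmrun()
    if path is None:
        print("vmrun not found on PATH; check your VMware installation.")
    else:
        print(f"vmrun found at {path}")
        for line in list_running_vms(path):
            print(line)
```

If the script reports that vmrun is missing even though VMware is installed, the usual cause is that the VMware install directory has not been added to PATH.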

For AWS or Azure Users

As you venture into the cloud, follow these guidelines:

On AWS

Check the AWS Guidelines for proper instance selection and region configuration.

On Azure

Support for Azure is in progress but not yet fully validated. Your patience is appreciated!

Quick Start

Run this example to interact with the OSWorld environment:

from desktop_env.desktop_env import DesktopEnv

example = {
    'id': '94d95f96-9699-4208-98ba-3c3119edf9c2',
    'instruction': 'I want to install Spotify on my current system. Could you please help me?',
    # Setup steps are given as a list so that several can run in order
    'config': [
        {
            'type': 'execute',
            'parameters': {
                'command': [
                    'python',
                    '-c',
                    'import pyautogui; import time; pyautogui.click(960, 540); time.sleep(0.5);'
                ]
            }
        }
    ],
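    'evaluator': {
        'func': 'check_include_exclude',
        # How to obtain the actual result from the VM
        'result': {
            'type': 'vm_command_line',
            'command': 'which spotify'
        },
        # What the result is compared against ('expected' sits next to 'result')
        'expected': {
            'type': 'rule',
            'rules': {
                'include': ['spotify'],
                'exclude': ['not found']
            }
        }
    }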
}

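# The action space is selected by name, so it is passed as a string
env = DesktopEnv(action_space='pyautogui')
obs = env.reset(task_config=example)
# Actions are strings of pyautogui code that the environment executes inside the VM
obs, reward, done, info = env.step('pyautogui.rightClick()')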

This example resets the environment with a single task and then sends one right-click action. As it runs, you should see log output covering the creation of the environment, the task setup, and the execution of the action.
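To make the evaluator part of the task more concrete, here is a rough sketch of how an include/exclude rule like the one in the example could be applied to the output of 'which spotify'. This is an illustrative re-implementation under my own assumptions, not OSWorld's actual check_include_exclude code; the function name check_include_exclude_sketch is hypothetical.

```python
def check_include_exclude_sketch(output, rules):
    """Score command output against include/exclude rules.

    Returns 1.0 when every 'include' string appears in the output and
    no 'exclude' string does; otherwise returns 0.0.
    Illustrative only: the real OSWorld metric may differ in details.
    """
    include = rules.get("include", [])
    exclude = rules.get("exclude", [])
    has_all_required = all(text in output for text in include)
    has_no_forbidden = all(text not in output for text in exclude)
    return 1.0 if has_all_required and has_no_forbidden else 0.0


rules = {"include": ["spotify"], "exclude": ["not found"]}

# Output you might see when Spotify is installed
installed_score = check_include_exclude_sketch("/usr/bin/spotify", rules)

# Output you might see when it is missing
missing_score = check_include_exclude_sketch("spotify not found", rules)

print(installed_score, missing_score)
```

The exclude list matters here: a "not found" message can still contain the word "spotify", so checking only the include list would wrongly count a failed install as a success.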

Understanding the Code: An Analogy

Think of the OSWorld environment like setting up a stage for a play. Each component of the code comes together to create a backdrop where your characters (programs) can perform. You start with:

  • Constructing the stage with your example, which defines the scenario and the characters involved.
  • The DesktopEnv acts like the director, overseeing the actions and ensuring everything runs smoothly.
  • Your actions, such as pyautogui.click, are like the actors following their script cues, making the right movements at the right times.

In the end, when the actors perform flawlessly and everything unfolds as planned, it symbolizes a successful environment setup!
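If you want to rehearse the play without a virtual machine, the roles above can be sketched with a toy stand-in. ToyDesktopEnv below is entirely hypothetical and does nothing on a real desktop; it only mimics the reset/step shape used in the Quick Start (an observation, a reward, a done flag, and an info dict) so you can see how a loop around the environment is structured.

```python
class ToyDesktopEnv:
    """Hypothetical stand-in for DesktopEnv; no VM is involved."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.instruction = None
        self.history = []

    def reset(self, task_config):
        # The task config sets the stage: it carries the instruction.
        self.instruction = task_config["instruction"]
        self.history = []
        return {"instruction": self.instruction, "steps_taken": 0}

    def step(self, action):
        # The director records each cue; a real environment would run it.
        self.history.append(action)
        done = len(self.history) >= self.max_steps
        obs = {"instruction": self.instruction, "steps_taken": len(self.history)}
        reward = 0.0  # the toy never evaluates the task
        info = {"last_action": action}
        return obs, reward, done, info


env = ToyDesktopEnv(max_steps=2)
obs = env.reset(task_config={"instruction": "Install Spotify"})
done = False
while not done:
    # Actions are plain strings here, mirroring the Quick Start example.
    obs, reward, done, info = env.step("pyautogui.rightClick()")

print(obs["steps_taken"], done)
```

In a real run the action string would come from your agent at every step, and the loop would normally also stop on a step budget of your choosing.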

Troubleshooting Tips

Running into issues during the setup? Here are some troubleshooting ideas:

  • Ensure you are running Python version 3.9 as specified. You can check your version by running python --version.
  • If you encounter issues installing VMware, make sure your system meets all hardware requirements.
  • For any environment-related hiccups, consult the environment interface guidelines.
  • In case of unresponsive virtual machines, restarting the services or the host machine can often resolve connectivity issues.
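The Python version tip can also be checked from inside the interpreter, which helps when several Python installations are present and it is unclear which one your Conda environment actually uses. A minimal sketch follows; the helper name python_version_matches is made up for this example, and the 3.9 default simply mirrors the version used in the installation steps.

```python
import sys


def python_version_matches(expected=(3, 9), version_info=None):
    """Return True if the major.minor version equals `expected`.

    `version_info` defaults to the running interpreter's version;
    passing a tuple such as (3, 9, 18) lets you check other values.
    """
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:2])
    return current == tuple(expected)


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_matches():
        print(f"Python {running} matches the expected 3.9 series.")
    else:
        print(f"Python {running} detected; this guide assumes Python 3.9.")
```

Run it with the same interpreter you use for OSWorld (for example, after activating the Conda environment), otherwise the check may report on a different installation.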

Conclusion

Setting up the OSWorld environment may seem daunting at first, but taken one step at a time it is manageable: clone the repository, install the dependencies, get a virtualization provider working, and run the Quick Start example.
