The new AI assistant can browse, search, and use web apps like a human

Still image from an explanatory video showing ACT-1 doing a search in a browser when asked to “find a home.”


Yesterday, California-based AI firm Adept announced Action Transformer (ACT-1), an artificial intelligence model that can perform actions in software like a human assistant when given high-level written or verbal commands. It can reportedly operate web applications and perform intelligent searches on websites while clicking, scrolling, and typing in the correct fields as if it were a person using a computer.

In a demo video tweeted by Adept, the company shows someone typing, “Find me a home in Houston that works for a family of 4. My budget is 600k” into a text entry box. When the task is submitted, ACT-1 automatically navigates a website in a web browser, clicks the appropriate areas of the page, types a search entry, and changes search parameters until a matching home appears on the screen.

Another demonstration video on Adept's website shows ACT-1 operating Salesforce with prompts such as “Add Max Nye at Adept as a new lead” and “Log a call with James Veel saying he’s considering buying 100 widgets.” ACT-1 then clicks the right buttons, scrolls, and fills out the appropriate forms to finish these tasks. Other demo videos show ACT-1 navigating Google Sheets, Craigslist, and Wikipedia through a browser.

An Adept promotional video showing ACT-1 operating Google Sheets, a web-based spreadsheet application.

How is this possible? Adept describes ACT-1 as a “large-scale Transformer.” In AI, a transformer model is a type of neural network that learns to do something by training on example data, building knowledge of the context and relationships between the elements in the data set. Transformers have been behind many recent AI innovations, including language models such as GPT-3 that can write at a nearly human level.

In the case of ACT-1, the training data appears to have come from humans operating software first, with the AI model learning from that. A person who identified themselves as a developer for ACT-1 on Hacker News wrote, “We used a combination of human demonstrations and feedback data! You need custom software both to record the demonstrations and to represent the state of the tool in a model-consumable way.”
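To make the idea of recording demonstrations and representing tool state in a form a model can consume more concrete, here is a minimal Python sketch. It is purely illustrative: Adept has not published its data format, and every name here (`Element`, `Step`, `Demonstration`, `serialize_state`, `to_training_examples`) is a hypothetical invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """One interactive element on a page (hypothetical representation)."""
    element_id: int
    role: str   # e.g. "button" or "textbox"
    label: str  # visible text or accessible name


@dataclass
class Step:
    """What the human saw, plus the action the human took."""
    elements: List[Element]
    action: str       # "click", "type", or "scroll"
    target_id: int
    text: str = ""    # only used by "type"


@dataclass
class Demonstration:
    """A full recording: the instruction and the steps that fulfilled it."""
    instruction: str
    steps: List[Step] = field(default_factory=list)


def serialize_state(elements: List[Element]) -> str:
    """Flatten the page state into one line of text per element."""
    return "\n".join(f"[{e.element_id}] {e.role}: {e.label}" for e in elements)


def to_training_examples(demo: Demonstration) -> List[dict]:
    """Turn a demonstration into one (prompt, target) text pair per step."""
    examples = []
    for step in demo.steps:
        prompt = (
            f"TASK: {demo.instruction}\n"
            f"STATE:\n{serialize_state(step.elements)}\n"
            "ACTION:"
        )
        target = f"{step.action} {step.target_id} {step.text}".strip()
        examples.append({"prompt": prompt, "target": target})
    return examples
```

In this sketch, each recorded step becomes a text prompt paired with the action the human chose, which is one plausible way to feed demonstrations to a text-based transformer. A real system would likely need a much richer state representation.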

After training, the ACT-1 model interfaces with a web browser through a Chrome extension that can “observe what’s happening in the browser and take certain actions, such as clicking, typing, and scrolling,” according to Adept. The company describes ACT-1’s observational ability as being able to generalize across websites, so rules learned on one site can apply to others.
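The observe-then-act cycle described above can be sketched as a simple loop. This is an assumption-laden toy, not Adept's implementation: `FakeBrowser` is an in-memory stand-in for a browser extension, and `scripted_policy` is a hard-coded placeholder for the point where a trained model would choose the next action. All names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str, str]  # (kind, target, text)


class FakeBrowser:
    """In-memory stand-in for an extension's observe/act interface."""

    def __init__(self) -> None:
        self.fields: Dict[str, str] = {"search": ""}
        self.scroll_y = 0
        self.clicked: List[str] = []

    def observe(self) -> dict:
        """Return a snapshot of the current page state."""
        return {
            "fields": dict(self.fields),
            "scroll_y": self.scroll_y,
            "clicked": list(self.clicked),
        }

    def act(self, action: Action) -> None:
        """Apply one of the three supported actions."""
        kind, target, text = action
        if kind == "type":
            self.fields[target] = text
        elif kind == "click":
            self.clicked.append(target)
        elif kind == "scroll":
            self.scroll_y += int(text)
        else:
            raise ValueError(f"unsupported action: {kind}")


def scripted_policy(task: str, observation: dict) -> Optional[Action]:
    """Placeholder policy; a trained model would make this decision."""
    if observation["fields"]["search"] != task:
        return ("type", "search", task)
    if "submit" not in observation["clicked"]:
        return ("click", "submit", "")
    return None  # nothing left to do


def run(
    task: str,
    browser: FakeBrowser,
    policy: Callable[[str, dict], Optional[Action]],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, pick an action, apply it; stop when the policy returns None."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(task, browser.observe())
        if action is None:
            break
        browser.act(action)
        history.append(action)
    return history
```

Calling `run("homes in Houston", FakeBrowser(), scripted_policy)` types the query and then clicks submit. The `max_steps` cap is a design choice in this sketch so that a misbehaving policy cannot loop forever.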

While automated browsing scripts already exist (and are often used to power bots with bad intentions), the powerful, generalized nature of ACT-1 implied in the demos appears to take machine automation to a new level. Already, people on Twitter are half-seriously, half-jokingly sounding alarms about the potential for abuse that this technology could bring. Should we allow an intelligent system to have this much control over our computer interfaces?

While these concerns are purely hypothetical at the moment (especially since ACT-1 does not operate autonomously), they are something to bear in mind as we speed toward human-level general AI that can interact with the outside world online. Adept even mentions this goal on its website, writing, “We believe the clearest framing of general intelligence is a system that can do anything a human can do in front of a computer.”
