Cornell researchers have taught a robot to take Airbnb photos

Aesthetics is what happens when our brain interacts with content and thinks, “oh pretty, please give me more.” Whether it’s a starry night or The Starry Night, the sound of a scenic seaside or Megan Thee Stallion’s latest single, understanding why certain sensory experiences resonate so deeply within us has given rise to an entire branch of philosophy studying art, in all its forms, as well as how it is designed, produced and consumed. While what constitutes “good” art varies as much from person to person as what constitutes porn, appreciating the finer things in life is an inherently human endeavor (sorry, Suda) – or at least it was until we taught computers how to do it too.

The study of computational aesthetics seeks to quantify beauty as expressed in human creative endeavors, essentially using mathematical formulas and machine learning algorithms to evaluate a specific piece against existing criteria, hopefully reaching an opinion equivalent to that of a human performing the same inspection. The field traces back to the early 1930s, when American mathematician George David Birkhoff devised his theory of aesthetics, M = O/C, where M is the aesthetic measure (think a numerical score), O is order and C is complexity. Under this metric, simple, orderly rooms would rank higher – i.e., as more aesthetically pleasing – than complex, chaotic scenes.
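As a toy illustration (not from the Cornell work), Birkhoff’s measure is trivial to compute once you decide how to quantify order and complexity – which is the hard part, and the part his successors argued over. A minimal sketch in Python, with made-up counts standing in for those measurements:

```python
def aesthetic_measure(order: float, complexity: float) -> float:
    """Birkhoff's aesthetic measure M = O / C.

    `order` and `complexity` are whatever positive quantities you
    choose to measure; Birkhoff counted features such as symmetries
    (order) and the number of distinct elements (complexity).
    """
    if complexity <= 0:
        raise ValueError("complexity must be positive")
    return order / complexity

# Hypothetical scores: a tidy, symmetric room beats a cluttered one,
# even though the cluttered room contains "more."
tidy = aesthetic_measure(order=8, complexity=4)        # M = 2.0
cluttered = aesthetic_measure(order=6, complexity=12)  # M = 0.5
print(tidy > cluttered)  # True: simpler and more orderly ranks higher
```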

German philosopher Max Bense and French engineer Abraham Moles both independently formalized Birkhoff’s early work into a scientific method for evaluating aesthetics in the 1950s. The International Society for Mathematical and Computational Aesthetics was founded in the 1990s, and over the past 30 years the field has evolved further, expanding into AI and computer graphics, with the ultimate goal of developing computer systems capable of judging art with the same objectivity and sensitivity as humans, if not greater. These computer vision systems have found use augmenting the judgments of human evaluators and automating rote image analysis, much as we see in medical diagnostics, as well as classifying videos and photographs to help amateur photographers improve their craft.

Recently, a team of researchers from Cornell University took a state-of-the-art computational aesthetics system one step further, building an AI that not only determines the most pleasing image in a given data set, but can also capture new, original – and above all, good – shots all by itself. Dubbed AutoPhoto, their study was presented last fall at the International Conference on Intelligent Robots and Systems. This robot photographer consists of three parts: the image evaluation algorithm, which scores a presented image for aesthetics; a wheeled Clearpath Jackal robot, on which the camera is mounted; and the AutoPhoto algorithm itself, which serves as a kind of firmware, translating the results of the image grading process into drive commands for the physical robot, effectively automating the search for an optimized shot.
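The paper doesn’t expose its internals as an API, but the three-part design implies a straightforward control loop: score the current view, nudge the robot, repeat, and keep the best frame seen. A hypothetical sketch, with every name invented here for illustration:

```python
from typing import Any, Callable

def auto_photo_loop(
    capture: Callable[[], Any],          # grabs a frame from the camera
    score: Callable[[Any], float],       # learned aesthetic model: image -> score
    propose_move: Callable[[Any], Any],  # policy: image -> next drive command
    drive: Callable[[Any], None],        # sends the command to the robot base
    max_steps: int = 20,                 # ~a dozen iterations sufficed on average
) -> tuple[Any, float]:
    """Hypothetical capture loop in the spirit of AutoPhoto: score the
    current view, move toward a better one, keep the best shot so far."""
    best_shot, best_score = None, float("-inf")
    for _ in range(max_steps):
        shot = capture()
        s = score(shot)
        if s > best_score:
            best_shot, best_score = shot, s
        drive(propose_move(shot))
    return best_shot, best_score
```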

For its image evaluation algorithm, the Cornell team, led by master’s student Hadi AlZayer, leveraged an existing learned aesthetic estimation model, which had been trained on a dataset of more than a million human-ranked photographs. AutoPhoto itself was trained virtually on dozens of 3D interior room scenes to spot optimally composed angles before the team installed it on the Jackal.

When dropped into a building on campus, as you can see in the video above, the robot starts with a slew of bad shots, but as the AutoPhoto algorithm gets its bearings, its choice of views steadily improves until the pictures rival those of local Zillow listings. On average, it took about a dozen iterations to optimize each shot, and the whole process takes only a few minutes.

“You can basically make incremental improvements to the current commands,” AlZayer told Engadget. “You can do it one step at a time, which means you can frame it as a reinforcement learning problem.” This way, the algorithm doesn’t have to conform to traditional heuristics like the rule of thirds, because it already knows what people will like: it has learned to match the look and feel of the shots it takes to the highest-ranked images in its training data, AlZayer explained.
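In reinforcement learning terms (my gloss, not the paper’s notation), the current view is the state, an incremental robot movement is the action, and the change in aesthetic score is a natural reward signal. A minimal, hypothetical shaping of that reward:

```python
def step_reward(prev_score: float, new_score: float) -> float:
    """Reward an incremental move by how much it improved the aesthetic
    score of the view (hypothetical shaping; AutoPhoto's actual reward
    design may differ)."""
    return new_score - prev_score

# A move that lifts the view's score from 0.42 to 0.55 earns +0.13;
# a move that worsens the shot is penalized, steering the policy
# toward compositions the aesthetic model ranks highly.
print(round(step_reward(0.42, 0.55), 2))  # 0.13
```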

“The hardest part was the fact that there was no existing benchmark number we were trying to improve on,” AlZayer told the Cornell Chronicle. “We had to define the whole process and the problem.”

Looking ahead, AlZayer hopes to adapt the AutoPhoto system for outdoor use, potentially swapping the ground-based Jackal for a UAV. “Simulating high-quality, realistic exterior scenes is very difficult,” AlZayer said, “simply because it is more difficult to perform reconstruction of a controlled scene.” To work around this, he and his team are currently investigating whether the AutoPhoto model can be trained on video or still images rather than 3D scenes.

