Cade Metz, "Genius Makers"

The Joy of Reading 2026. 5. 4. 02:41

Cade Metz, "Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World" (2021)

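I never expected much from AI, but as I saw and heard about its recent achievements I grew curious about what had brought them about. Reading this book, I thought: ah, of course. Neural networks! I had vaguely forgotten that the backpropagation neural network I read about in a textbook back in the mid-1980s must also have seen astonishing, better-than-expected performance gains since then, in step with advances across computing, networks, and data.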

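If our brain was not made by design, and our capacity for logical thought and our intelligence were acquired through millions of years of brain evolution, then the neural network, which imitates that process through trial-and-error 'training', can only be the right, and perhaps the only effective, approach to AI.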

From the text

When [Geoff] Hinton gave a lecture at the annual NIPS[Neural Information Processing Systems] conference [in 2007], ... on his sixtieth birthday, the phrase "deep learning" appeared in the title for the first time. It was a cunning piece of rebranding. Referring to multiple layers of neural networks, there was nothing new about "deep learning." But it was an evocative term designed to galvanize research in an area that had once again fallen from favor. (p. 64)

At Microsoft [around 2010?], using those $10,000 cards [Nvidia's GPUs(Graphics Processing Units)] to train a neural network on spoken words collected by the company's Bing voice search service, [George] Dahl pushed Hinton's speech prototype beyond the performance of anything else under development at the company. What he and [Abdel-rahman] Mohamed and Hinton showed was that a neural network could sift through a sea of very noisy speech and somehow find the stuff that mattered, the patterns that no human engineer could ever pinpoint on their own, the telltale signs that distinguished one subtle sound from another, one word from another. It was an inflection point in the long history of artificial intelligence. (p. 74)

The power of Google's GPU cluster was that it allowed the company to experiment with myriad technologies on a massive scale. Building a neural network was a task of trial and error, and with tens of thousands of GPU chips at their disposal, researchers could explore more possibilities in less time.... Spurred by the $130 million in graphics chips it sold to Google, Nvidia reorganized itself around the deep learning idea, and soon it was not merely selling chips for AI research, it was doing its own research, exploring the boundaries of image recognition and self-driving cars .... (p. 139)

After Hinton, [Yann] LeCun, and [Yoshua] Bengio published a paper on the rise of deep learning in Nature, he[Jürgen Schmidhuber] wrote a critique arguing that "the Canadians" weren't as influential as they seem to be -- that they'd built their work on the ideas of others working in Europe and Japan. And around the same time, when Ian Goodfellow presented his paper on GANs[generative adversarial networks] -- a technology whose influence would soon reverberate across the industry -- Schmidhuber stood up in the audience and chastised him for not citing similar work in Switzerland from the 1990s. (p. 141)

After reading about the match, Jordi Ensign, a forty-five-year-old computer programmer from Florida, went out and got two tattoos. AlphaGo's Move 37 was tattooed on the inside of her right arm -- and Lee Sedol's Move 78 was on the left. (p. 178)

When Ilya Sutskever published the paper that remade machine translation -- known as the Sequence to Sequence paper -- he said it was not really about translation. When Jeff Dean and Greg Corrado read it they agreed. They decided it was an ideal way of analyzing healthcare records. (p. 193)

As they did, they cast GANs and related technologies in a new light. These technologies, it seemed, were a way of generating fake news. (p. 209)

A neural network learned from such a wide array of examples that small and unexpected flaws could creep into its training without anyone ever knowing. (p. 212)

BERT was what researchers call a "universal language model." ... Universal language models are giant neural networks that learn the vagaries of language by analyzing millions of sentences written by humans.... BERT analyzed ... vast library of books as well as every article on Wikipedia, spending days poring over all this text with help from hundreds of GPU chips. (p. 273)

If they could build a large enough simulation of what humans encounter in their daily lives, these labs [OpenAI, DeepMind, ...] believed they could build AGI[artificial general intelligence]. (p. 297)

The hope was that researchers could change the equation with new kinds of computer chips -- chips that could drive this research to levels beyond both Nvidia's GPUs and Google's TPUs[tensor processing units]. Dozens of companies, including Google, Nvidia, and Intel, as well as a long line of start-ups, were building new chips just for training neural networks, so that systems built by labs like DeepMind and OpenAI could learn far more in far less time. (p. 298)

More than ever DeepMind was focused on the future. And though it operated with considerable independence, it could still draw on Google's vast resources. Since acquiring DeepMind, Google had invested $1.2 billion in its research. By 2020, in addition to the hundreds of computer scientists at the London lab, [Demis] Hassabis had hired a team of more than fifty neuroscientists to investigate the inner workings of the brain. (p. 301)

During his Turing lecture [in 2019] ... Hinton explained the rise of machine learning and explored where it might be going.... Hinton did not believe in reinforcement learning, the method Demis Hassabis and DeepMind saw as the path to AGI. It required too much data and too much processing power to succeed with practical tasks in the real world.... That same year ... when Hinton saw what reinforcement learning could do for [Pieter] Abbeel's robots, he changed his mind about the future of AI research. As Covariant's system [and Abbeel's robots] moved into the warehouse in Berlin, he called it "the AlphaGo moment" for robotics. "I have always been skeptical of reinforcement learning because it required an extraordinary amount of computation. But we've now got that," he said. Still, he didn't believe in building AGI. (p. 310)

DeepSeek

Meaghan Tobin and Cade Metz,

DeepSeek’s Sequel Set to Extend China’s Reach in Open-Source A.I.

(April 24, 2026, New York Times):

When the Chinese start-up DeepSeek published details about one of its artificial intelligence models last year, it sent shock waves through the tech industry.

The company said it had built its system by spending far less on computer chips than American rivals like OpenAI and Anthropic. It marked the start of what became known as China’s “DeepSeek moment,” shorthand for the belief that Chinese A.I. companies were ready to showcase their technical capabilities to the world.

The DeepSeek moment reflected a shift in the global A.I. landscape. The change was about not only lower costs but also openness in how the technology is shared.

DeepSeek released its models as open source, which means others can freely use and modify them. By contrast, OpenAI and Anthropic kept their leading models proprietary. The episode demonstrated that an open-source system could perform almost as well as closed versions. In the months that followed, Chinese firms released dozens of other open-source models. By the end of 2025, these models made up a significant share of global A.I. use.

DeepSeek-R1

Slate: The OpenAI Case

License: Attribution, NonCommercial, NoDerivatives
