Recently I had the opportunity to ask Larry Yaeger a few questions. This is part one of a three-part series in which we discuss handwriting technology, the future of Inkwell, and the original Newton project. In case you're in the dark, here is a brief bio for Larry, all a part of keeping you In The Loop!
Larry Yaeger has used computers to solve a wide variety of problems throughout his career. Having studied aerospace engineering with a focus on computers, he carried out pioneering computational fluid dynamic flow studies over the space shuttle and submarines.
As Director of Software Development at Digital Productions, he used a Cray X-MP supercomputer to generate the special effects for Hollywood films The Last Starfighter, 2010, and Labyrinth, as well as a number of Clio Award-winning television commercials. While with Alan Kay's Vivarium Program at Apple Computer, he designed and programmed a computer "voice" for Koko the gorilla, helped introduce Macintoshes into routine production on Star Trek: The Next Generation, and created a widely respected Artificial Life computational ecology ("Polyworld") that evolves neural architectures resulting from the mutation and recombination of genetic codes, via behavior-based, sexual reproduction of artificial organisms.
He also coauthored possibly the first book+CD-ROM title, the award-winning Visualization of Natural Phenomena. Also at Apple, in the Advanced Technology Group, he was Technical Lead in the development of the neural network-based handprint recognition system, the world's first genuinely usable handwriting recognition system, showcased in second-generation Newton PDAs and Mac OS X's "Inkwell." He currently resides in scenic Beanblossom, Indiana, and teaches and performs research in Artificial Life at Indiana University.
Larry Yaeger: Perhaps a little. But my perspective on Newton is different than most people associated with it. I'd met Steve Sakoman[1] and talked with him about his vision for the Newton, which included multiple instantiations from a very small form factor (today's "PDA") to laptop-sized (a "slate") to desktop-sized to wall-sized, all sharing information seamlessly and wirelessly. It was a neat vision. However, the project lingered and lingered and never produced anything viable, for quite a few years.
It's been so long, now, that I can't quite piece together the exact chronology, but I think it was before I even began working on handwriting recognition, or perhaps only slightly after, that I got invited to a meeting at which the ParaGraph[2] guys were demoing their technology (that ultimately ended up in the first-generation Newton). The Newton project was now under Larry Tesler[3], and he led this meeting. I got a chance to use their handwriting recognition. It got 1 out of 10 things I wrote correct. It was terrible. The experience was the same for pretty much everyone. Yet Tesler beamed. I never understood it. Later I learned he'd simply become fascinated by the technology, and its potential, and I guess that blinded him to the realities of the technology at the time.
(To be fair, the debacle of the first-generation Newton wasn't entirely ParaGraph's fault. There again, Tesler and some of the highest-level decisions put them into a situation from which it was impossible for anything good to come. But that's another story.)
I was in ATG[4], not the Newton group, at the time. I began working on handwriting recognition. Much time passed. I had some conversations with Newton engineers that were intriguing, and suggested possibilities of us working together. But handwriting recognition is a hard problem, and we were just getting going. It took us years to produce a viable technology. Meanwhile Newton was ramping up and even shipped its first-generation project. I don't think we ever seriously contemplated being part of that first generation, as we knew we weren't ready. Of course, neither was ParaGraph, as history all too starkly made clear.
So while I thought the original, sweeping Newton concept was exciting, and even the reduced vision of what first shipped was intriguing, I had my doubts about whether it could succeed. I also, at various points and with others, lobbied for a smaller form factor, but never seemed to have any impact. (Very, very late in the game I got the then-CEO, Shane Robison[5], to agree to investigate what we now call a palm-sized device, but it was too late, as Newton was soon to be reeled back into Apple and killed off.)
Still, yes, there were so many things done right about the Newton, both hardware and software, that it seemed wonderfully, radically different, in all the right ways. (Except handwriting.) The concept of the data soup, the UI, nicely architected system software, the portability... It really was a brilliant design, in many ways. In fact, if they had left handwriting recognition out entirely, forcing people to use the built-in, on-screen keyboard, I retroactively predict the Newton would have been a success from day one. As it was, the handwriting recognition was so bad, and so touted as the one true input method for the device, that it was very nearly doomed to failure from day one.
Our internal handwriting recognition work didn't get into the Newton until much later, with the release of Newton 2.0 and the MP130 (though I believe there was a mid-generation release of MP120s with Newton 2.0 and our software, but I'm not 100% certain of that). We kind of thought we might have "saved the Newton", and I still think the company and product could have been rescued, but that's yet another story.
LY: Oh yes! I used to never miss an episode of The Simpsons, and was heartily bemused, if also a bit saddened, when I saw that. And even earlier, I think, I was a huge Doonesbury/Garry Trudeau fan, and laughed and cringed at his "egg freckles" jab at the Newton. In fact, once we started seriously working on a Newton version of our recognizer, I kept a photocopy of the Doonesbury panels lampooning the Newton on the wall by my desk, as a reminder that we had to do better.
Tune in tomorrow for Part II of the three-part series!
1. Steve Sakoman was the original visionary behind the Newton project but left Apple before his vision reached a tangible stage. After Apple, Sakoman was the Chief Engineer at palmOne and the CTO at Be.
2. ParaGraph was responsible for the first-generation handwriting recognition in the Newton OS. They later developed a cursive recognition system for Newton 2.x, which was licensed to Microsoft, who now use it in Windows XP Tablet PC Edition. The software is still available under the name CalliGrapher. ParaGraph is now PhatWare.
3. Larry Tesler was the original head of the Lisa group at Apple Computer. He took charge of the Newton Group after Steve Sakoman left Apple. Larry later took over the Advanced Technology Group and then, as Chief Scientist at Apple, killed the group. Since leaving Apple, he has worked at Amazon.com as the VP of Shopping Experience and Design, and at Yahoo! Research Labs.
4. ATG, or the Advanced Technology Group, briefly became Apple Research Laboratories before being disbanded by Tesler in 1997. ATG was at least partially responsible for such innovations as "Toby's Frame Buffer", QuickTime, QuickTime VR, QuickDraw 3D, speech recognition and synthesis software, handwriting recognition software (which later became Inkwell), "V-Twin", which became Apple's core document search technology, an Ensoniq chip for the Mac, the MIDI Manager, and a DSP chip for the Mac, among other things.
5. Shane Robison is currently the Executive Vice President and Chief Strategy and Technology Officer at Hewlett-Packard. Robison was briefly the manager of the Newton Group before it was killed by Apple.
Here is part two of our three-part interview with Larry Yaeger. In part one we discussed the Newton project, and porting the Newton's handwriting recognition to OS X.
Larry Yaeger: Yes it was. Especially since we started a port to OS 9 first. Brought it to an official alpha release (and a very good, unusually stable alpha, with all known bugs of any significance fixed), and within a couple of days, certainly less than a week, got the official word from Marketing that there would be "no new features on OS 9." So it was back to the drawing board, as very little of the approach or the code for our OS 9 approach could work in OS X.
OS X, then, involved a complete redesign and a lot of new code, but I'm happy with the event-based approach we used to let both system and apps intercept the data handling at any level... points, strokes, words, phrases, with phrase termination determined in a variety of ways. It turned out very flexible, but with a very compact API.
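(Editor's illustration: the sketch below is one way to picture interception at several levels of ink data, as Larry describes it. It is not Inkwell's actual API. The names `InkLevel`, `InkDispatcher`, and `split_phrases`, and the pause-based rule for ending a phrase, are all assumptions made up for this example, written in Python for brevity.)

```python
from enum import IntEnum
from typing import Callable, Dict, List, Tuple


class InkLevel(IntEnum):
    """Hypothetical levels at which ink data could be intercepted."""
    POINT = 0
    STROKE = 1
    WORD = 2
    PHRASE = 3


# A handler returns True when it has consumed the data.
Handler = Callable[[object], bool]


class InkDispatcher:
    """Offers ink data to registered handlers, one level at a time."""

    def __init__(self) -> None:
        self._handlers: Dict[InkLevel, List[Handler]] = {
            level: [] for level in InkLevel
        }

    def register(self, level: InkLevel, handler: Handler) -> None:
        self._handlers[level].append(handler)

    def dispatch(self, level: InkLevel, data: object) -> bool:
        """Return True if some handler at this level consumed the data."""
        for handler in self._handlers[level]:
            if handler(data):
                return True
        return False


def split_phrases(
    timed_words: List[Tuple[str, float]], max_gap: float = 1.0
) -> List[List[str]]:
    """Group (text, timestamp) words into phrases.

    Assumed rule: a pause longer than max_gap seconds ends a phrase.
    This is only one of the "variety of ways" a phrase might be terminated.
    """
    phrases: List[List[str]] = []
    current: List[str] = []
    last_time = 0.0
    for text, timestamp in timed_words:
        if current and timestamp - last_time > max_gap:
            phrases.append(current)
            current = []
        current.append(text)
        last_time = timestamp
    if current:
        phrases.append(current)
    return phrases
```

In this sketch an app that wants raw strokes (a drawing program, say) would register a handler at `InkLevel.STROKE` and return True to keep the data for itself, while an app that only cares about text would leave the lower levels alone and register at `InkLevel.PHRASE`.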
LY: Well, ultimately, those of us involved in Inkwell have always hoped there would someday be a pen-based Mac. That would be the best way to take advantage of the technology. However, in the interim, there are a couple of key ways we think the technology is of value to our customers.
Our original thinking was that it would benefit graphic designers, artists, multimedia workers, etc. Those customers represent a core market for Apple. And many of them routinely work with graphics tablets already. Because going back and forth between the pen and the keyboard can be a bit annoying and causes a break in the workflow, we thought Inkwell would be handy for these kinds of professions. With handwriting recognition enabled, you can enter a caption, a filename, or whatever short bit of text without putting down the pen and moving to the keyboard.
Turns out there was another use for pretty much the same core market: gestural input. We were approached by the team working on Motion, the really cool, interactive animation application from Apple's Pro Apps division, to see if our ink-handling APIs could help them manage pen gestures as a means to control their user interface and whether our core recognition technology could help them do accurate gesture recognition. The answer was yes on both fronts, and I really enjoyed working with them to craft a nice set of gestures, gather a bunch of data, clean the data, and train a neural network to perform the recognition. Then I gave them a couple of extra software hooks to fit their specific needs, and modified our main event-flow code path to support "ink on demand" (so the pen acts just like a mouse, except when you do something special, like hold down a barrel button on the pen, or a modifier key on the keyboard), and even gave them the ability to "write in the air" above the tablet. The Motion team did a great job then implementing the gestural interface in their app, and customers seem to really like it. Apparently that sort of gestural interface was previously only available on packages costing hundreds of thousands of dollars, and ours is more accurate to boot!
You may have noticed that the "ink on demand" and "ink in air" support was moved up to the entire system, as of Tiger, so now it's available to everyone, though the gesture recognition is still unique to Motion at this time.
LY: Not really. I'm afraid I'm spoiled. I refused to use Graffiti entirely. I want to write in natural English, thank you very much. And the new "Graffiti 2" (CIC's Jot, I believe, but am not certain) is still one letter at a time, which is just too annoying. I want to be able to write words and phrases, in a normal, natural fashion.
I've tried the handwriting recognition software in various hand-held devices over the years, and they pretty much all sucked. I recently lived for a little while with an HP h6315 iPaq integrated Pocket PC and phone. It had every feature in the world, but I had to give it up, as the software was horribly buggy (I had to keep rebooting my phone, really), the integration with the phone functionality was laughable (but only to keep from crying), launching their built-in web browser immediately terminated their built-in VPN (thus making the device unusable in many environments, including the Indiana University campus, where I work these days), when the batteries drained it forgot everything, including the owner info, contact list, etc., and synchronizing with my Mac was possible but more than a little problematic. And, sadly, the handwriting recognition was completely unusable. They know it by now, of course, and make it easy to bring up an on-screen keyboard or even attach a tiny hardware QWERTY keyboard.
Which is too bad, and completely unnecessary, because the handwriting recognition on Tablet PCs these days is quite decent. It's a combination of the much-improved ParaGraph recognizer and some homebrew recognition software from within Microsoft. It has its flaws, but in the limited experimenting I've done, it seemed genuinely useful to me. Basically, I think their recognition is about on a par with ours, plus they do cursive, and we don't.
(I was working on cursive when ATG was killed off quite a few years back. Was probably about 3/4 of the way there. Still have the modified codebase, but have never had time to go back and work on it again.)
Tune in tomorrow for the final installment of this interview, in which Larry talks about the future of handwriting recognition.
The conclusion of our three-part interview with Larry Yaeger. In part two we discussed current implementations of handwriting technology and uses for the Inkwell technology.
Larry Yaeger: Hmm, I guess I just about answered this towards the end of the previous question's answer. I haven't seen any handwriting recognition software that is significantly better than the (later generation) Newton's, but the Tablet PC software has just about caught up. Sadly, it's only on Windows boxes. ;)
LY: Mostly, they don't work well enough, though there are other constraints: you don't really want to talk to your computer in a public place, and if you're halfway competent at typing, typing is faster than handwriting. But there are also situations and places in which voice and handwriting are clear winners. A friend of mine suffered so severely from carpal tunnel syndrome that he had to use voice input for programming. It was horribly tedious and error-prone at first, but he was highly motivated, and he finally made it work surprisingly well, well enough that he was able to continue programming for a living, which would not otherwise have been the case. And in classrooms and meetings of pretty much any kind, handwriting is a much better option than keyboard, for the most part: you can sketch little drawings, as needed, show connections between things with a quick arc, and be less obtrusive and quieter than someone clacking away on a keyboard.
I can understand why Apple hasn't pursued a pen-based Mac to date... an integrated tablet adds weight, thickness, and cost to a portable computer, exactly the opposite of what you're trying to optimize in the design of such machines. But technology continues to improve and costs continue to reduce, so I live in hope. Personally, I really want a pen-based Mac that, like some of the Tablet PCs, lets you easily keep your keyboard around too, if you want to. There are situations where I want the keyboard and situations where I want the pen, so I really would like to have both. Maybe someday.
LY: Hard to say. And if I actually knew anything definite, of course I couldn't say. Even though I've transitioned to Indiana University's School of Informatics, in order to pursue my research interests in artificial life, I have stayed on at Apple one day a week, in order to help sustain and nurture the technology. There's a lot of excitement about the gestural interface support in Motion, so perhaps we'll see more of that. And, of course, I'd like to see a pen-based Mac someday, though I'm not holding my breath.
LY: Well, predicting the future is a sucker's game. Normally I just avoid it. But I think both hand and voice input will be a part of future systems. Better accuracy, both from core recognition technology and from more and more "contextual understanding," that functions under a wider range of conditions (especially on the part of voice), will make the technologies ubiquitous someday, I suspect. But it's going to be a while.
LY: I have two or three main machines. I suppose I live most on my PowerBook (G4/1.5GHz 15"). (Had a first-generation 17", but found it a bit too bulky and heavy, so I went back to the 15", even though I loved the extra screen real estate of the 17".) I do most of my development on the PowerBook, besides e-mail, web, etc.
Then there's my beautiful liquid-cooled dual 2.5 GHz PowerMac G5 at the office, with the 30" Apple display. Yum! Very fast, and that screen is an absolute joy to work on. That's where I expect to be running most of my artificial life simulations.
And at home I've got a (formerly state of the art) dual 2.0 GHz PowerMac G5 with the 23" Apple Cinema Grand display (and a CRT on either side). One of the CRTs and the keyboard go through a KVM switch so I can control three other Power Macs, of various vintages. And then there's the OS X Server PowerMac G4 I'm preparing to be my main web/SMTP server, in order to replace the aging Beige PowerMac G3 that has been my trusty OS 9 web/SMTP server for lo these many years.
Thanks to Larry for the great interview, footnote information and his bio.