From Punch Cards To Pronunciation
The Evolution of Human Interfaces & Amazon Lex

Ever since there have been computers, there have been interfaces to allow people to interact with those systems. Just like the computers behind them, these interfaces have grown in scope and sophistication on the path to the holy grail: an interface which is intuitive, efficient and accessible to all. Amazon Lex takes us one step further down this path by enabling anyone to wrap their application, business logic or data in a conversational, natural language interface; I think of this type of 'chatbot' as the third generation of interfaces, each one defined and enabled by the changing ratio of computers to people.

5MB of data stored on 62,500 punch cards.

1st Gen: Machine-oriented 

When computers were rare and expensive and human time was comparatively cheap, it's no surprise that all of the effort of interaction fell to the human. With the very first interfaces on the first huge, room-sized machines, humans were forced to speak the language of the machine - literally - flipping bits in memory by hand using punch cards. No effort was made on the part of the machine to be easy to use or intuitive, and operating one was a rarefied skill.

2nd Gen: Application-oriented

As computers became cheaper and more capable, the ratio of computers to people tended towards 1:1 (and beyond, as we now have computers in our phones, watches, fridges and cars), and the onus of communicating became more evenly balanced. The computer would come halfway, making life easier by presenting a pre-defined set of buttons, clicks, controls, taps and gestures, while the user would have to meet it in the middle, translating their wishes and intent and projecting them into the application through those graphical controls. But just like speaking any foreign language, this can be hard to learn, takes a long time to master (especially if you don't use the language frequently), and even then, some things get lost in translation. Graphical user interfaces are powerful, but they are lossy when it comes to capturing and conveying the intent of a user.

3rd Gen: Intent-oriented

The Amazon Echo.

In an era where computing power is available as a utility, at the end of an API call, the availability of that power ceases to be a constraint on interacting with computer systems. The ratio of computers to humans is weighted heavily towards the computers, so the onus falls entirely on the computer to do the work of understanding and satisfying the user's intent. These 'intent-oriented' interfaces understand natural language, through both text and speech, and use it to infer and then fulfill the intent of the user.

These interfaces are more intuitive and efficient, can be embedded ubiquitously throughout the world, and are more accessible.
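The idea behind an intent-oriented interface can be sketched with a deliberately simple toy: map a free-form utterance onto one of a set of named intents. This is only an illustration of the concept - Lex itself uses deep-learning-based language understanding, not keyword matching - and the intent names below are hypothetical examples, not part of any real bot.

```python
# Toy sketch of intent recognition: pick the intent whose example keywords
# overlap most with the words in the user's utterance. Illustrative only;
# real systems like Amazon Lex use statistical language models instead.

def recognize_intent(utterance, intents):
    """Return the name of the intent whose keywords best match the utterance."""
    words = set(utterance.lower().split())
    best_name, best_score = None, 0
    for name, keywords in intents.items():
        score = len(words & keywords)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical intents, each described by a handful of trigger keywords.
INTENTS = {
    "OrderFlowers": {"flowers", "bouquet", "roses", "order"},
    "BookHotel": {"hotel", "room", "book", "reservation"},
}

print(recognize_intent("I'd like to order a dozen roses", INTENTS))  # OrderFlowers
```

Once the intent is identified, a real conversational system would go on to collect any missing details (slots, such as a delivery date) and then call business logic to fulfill the request - the part a developer wires up behind the interface.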

It's these third-generation interfaces - focused on capturing, understanding and then fulfilling the intent of the user - that were popularized by the Amazon Echo and Alexa when they launched two years ago, and which Lex - the engine that powers the natural language and voice recognition AI systems inside Alexa - now brings to all developers, on all devices and platforms.

The era of the third generation is upon us. You can be part of it by signing up for the Lex preview.

The quotes in this article were read by Brian, one of 24 different voices available in Amazon Polly.

The main post image features Matthew Goode as Hugh Alexander in The Imitation Game© 2014 - The Weinstein Company.

Say Hello to Amazon AI