Notes from AIFrontiers conference

My soothsayer friend BG told me last year that “deep learning is the next big thing”. I didn’t know what that meant. A few days ago, I attended the AIFrontiers conference in Santa Clara, California. Now I have a glimpse of what he meant :-)

What is Intelligence?

By “intelligence” in this context, I mean “smart”. Yes, we have smart phones, smart TVs, and smart speakers. But imagine much smarter software and devices… like self-driving cars!

Note that Artificial Intelligence is about understanding intelligence, while Machine Learning is a “brute force”, data-driven approach to simulating intelligence. They are related but not the same thing. There are many areas that will lead to Artificial General Intelligence (AGI), which means “software that can do any task”, as opposed to Machine Learning, which creates software that can do specific tasks. This conference was about Machine Learning, and specifically Deep Learning.

To summarize the scope of the areas, Artificial Intelligence > Machine Learning > Deep Learning.

From Analog to Digital to Intelligence

The mantra at this conference was that we will move from a software stack to an intelligence stack to solve future engineering challenges.

This was best explained by the legendary Jeff Dean in his keynote speech, talking about how many products at Google use deep learning:

Deep Learning at Google

What is Machine Learning?

Machine learning is one technique to achieve intelligence.

What is machine learning? My understanding is: it is about making computer programs whose behavior is learned from data instead of solely based on lines of code written by humans. Think spam filters - whenever we click on “Spam” or “Not Spam” buttons, the spam filtering system learns from this and the behavior changes over time to reflect that, without somebody explicitly writing code for every single email. On top of this idea, design the system to learn by itself, and it can learn and improve orders of magnitude faster.
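To make the spam filter idea concrete, here is a minimal sketch of a filter whose behavior comes from labeled clicks rather than hand-written rules. It uses a toy naive Bayes word count, which is my own choice for illustration and not something presented at the conference. The class name `TinySpamFilter` and the sample messages are made up.

```python
import math
from collections import Counter


class TinySpamFilter:
    """Toy naive Bayes filter that learns from "Spam" / "Not Spam" clicks."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record one user click; label is "spam" or "ham" (not spam)."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def _log_score(self, text, label):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        # Prior probability of the label, with add-one smoothing.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        for word in text.lower().split():
            count = self.word_counts[label][word]
            # Smoothed likelihood of seeing this word under the label.
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        return score

    def is_spam(self, text):
        return self._log_score(text, "spam") > self._log_score(text, "ham")


# Each learn() call stands in for a user clicking "Spam" or "Not Spam".
spam_filter = TinySpamFilter()
spam_filter.learn("win a free prize now", "spam")
spam_filter.learn("free money click now", "spam")
spam_filter.learn("lunch meeting at noon", "ham")
spam_filter.learn("notes from the meeting", "ham")

print(spam_filter.is_spam("free prize money"))
print(spam_filter.is_spam("meeting notes"))
```

Nobody wrote a rule saying “free” is suspicious. The filter picked that up from the examples, and it would change its mind if users started labeling such messages differently.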

What makes Machine Learning special? The system learns behaviors that are more accurate for the task and can handle more situations than the algorithms we humans could have imagined! Think of converting sentences from one human language to another, self-driving cars, etc. Think of all the situations that such systems need to handle. We could not have written code to handle every situation.

Why now? Because machine learning requires:

  1. Lots of data - which we have now thanks to (a) so many people buying mobile phones, (b) mobile phone sensors and apps generating so much data.
  2. Lots of computers - which we have now thanks to cloud computing.
  3. Lots of parallel processing power (think matrix multiplications) - which we have now thanks to Graphics Processing Units (GPUs).

What is Deep Learning?

What is deep learning? It is a machine learning technique based on “layers of neurons”, i.e. think of the billions of neurons in your brain that work together to understand, perceive, and store knowledge… deep learning tries to simulate your brain. At least, that’s the way I understood it.
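As a rough illustration of “layers of neurons”, here is a minimal sketch of data flowing through two small layers. Each neuron takes a weighted sum of its inputs and squashes the result. Those weighted sums are the matrix multiplications that GPUs are good at. The weights below are hand-picked and hypothetical; a real network would learn them from data, and real frameworks do this with optimized tensor operations rather than Python lists.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """One layer: every neuron computes a weighted sum of all inputs, then squashes it."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical weights: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.2]),
]

output = forward([1.0, 0.0], layers)
print(output)
```

“Deep” simply means stacking many such layers, so that later layers can build on what earlier layers computed.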

Jeff Dean explains deep learning

What do you want in a Machine Learning System?

Jeff Dean talked about their first internal machine learning system, the problems they faced, and what they ideally wanted:

What do you want in a Machine Learning System?
Computation Time and Research Productivity

And eventually they designed TensorFlow to achieve those desirable features.

He went on to mention the algorithms they use for different products, which I found interesting, not because I understood what they meant, but because they are pointers in case you want to learn more. After all, the whole point of attending conferences and meetups is to know what is happening out there.

Speech Recognition
Google Photos Search
Google Search
Language Translation

Some of these models can be found at

Jeff Dean also mentioned the kind of impact they have had on products, esp. converting April Fool’s Day jokes into reality:

Google Inbox Smart Reply
Algorithms behind Google Inbox Smart Reply

Jeff Dean expects more reuse of machine learning-developed models across different tasks, described as zero-shot learning:

Zero-shot learning

And more compute-based model generation:

More compute

Jeff Dean also gave a glimpse of what kind of queries they hope to achieve in the future:

Google Search queries of the future

Autonomous Driving

There was a lot of info throughout the day, so I’ll only post the topics / slides that I found interesting in the discussions:

Speakers were from Waymo (Google), Tesla Motors (not in official capacity), Baidu Autonomous Driving Unit.
Google / Waymo designing a car specifically for autonomous driving

Baidu also played videos of their self-driving cars in China, so this is not just a USA-only phenomenon. China, indeed, may have an edge in AI.

Big Data and Machine Learning in the car

This is a reason why I feel C++, the beast, is making a comeback: performance and efficient hardware usage are important again, because we now have to run a lot of processing on the Internet of Things, especially self-driving cars. And because it’s C++, correctness becomes a new risk. This might give a clue as to why Tesla Motors attracted Chris Lattner, the creator of the LLVM compiler. Speculation is that Tesla Motors wants to build an integrated autopilot system from chip to compiler.

Computer Chips specifically for autonomous driving

From Google creating custom chips called Tensor Processing Units (“TPUs”) for machine learning model generation in the cloud, to NVIDIA making chips for self-driving cars, to Intel releasing its Go platform containing 5G modems and chips for self-driving cars, efficient and performant chips for machine learning have become important. This helps explain why NVIDIA’s shares went up 225% in 2016.

The car is one node of the Internet of Things. It will connect and interact with the cloud.

This is very familiar to me because that is what we do at Automatic.

Speech-Enabled Assistants

Speakers were from Microsoft, Baidu, Amazon Alexa.


Speech is not the same as text processing; there are more nuances.
Types of chatbots


Why deep learning
Handle issues such as background noise and multiple people speaking
Handle issues such as person speaking from other end of room
They converted existing voice recordings to far-field and used that to train models
How much compute power, you ask?
GPUs to the rescue
Deep Speech works for Mandarin
Deep Speech works for multiple languages
Why focus on speech? More inclusive and faster than typing.
Speech recognition can be more accurate than typing for non-technical people
Try the TalkType app for Android
Baidu’s Goal is AI for 100 million people

Amazon Alexa:

Speech recognition process
‘LSTM’ technique

See Wikipedia entry on Long short-term memory.
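For the curious, here is a minimal sketch of one step of an LSTM cell, following the standard textbook equations rather than anything shown on the Alexa slides. To keep it readable, the input, hidden state, and cell state are single numbers instead of vectors, and the values in `params` are made up. A trained model would learn them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell. Returns the new (h, c)."""

    def gate(name, activation):
        w_x, w_h, bias = params[name]
        return activation(w_x * x + w_h * h_prev + bias)

    forget = gate("forget", sigmoid)        # how much old memory to keep
    write = gate("input", sigmoid)          # how much new information to store
    candidate = gate("candidate", math.tanh)  # the new information itself
    expose = gate("output", sigmoid)        # how much memory to reveal

    c = forget * c_prev + write * candidate  # updated long-term memory
    h = expose * math.tanh(c)                # updated short-term output
    return h, c


# Hypothetical parameters: (input weight, hidden weight, bias) for each gate.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, -0.2, 0.0),
    "candidate": (1.2, 0.4, 0.0),
    "output": (0.3, 0.7, 0.1),
}

h, c = 0.0, 0.0
for x in [0.2, -0.1, 0.7]:  # a short sequence, e.g. audio features over time
    h, c = lstm_step(x, h, c, params)
    print(h, c)
```

The cell state `c` is what lets the network carry information across many time steps, which matters for speech because the meaning of a sound depends on what came before it.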

More techniques

Natural Language Processing

Speaker was from Google Brain

He talked about how deep learning has dramatically changed the field of NLP. Focused on “end-to-end” deep learning methods.

Computer Vision (Perception)

Speakers were from OpenCV, Bosch and Google

An example of using computer vision came from Jeff Dean’s keynote speech: a tool where you enter your address, and it tells you how much roof area you have and how much money you can save by switching to solar energy!

OpenCV is a popular open source computer vision library:

OpenCV 3
Deep Learning comes to OpenCV


Street View to Vision processing to Local Business discovery, cars, cameras, vision, and maps - all in one sentence
New machine learning techniques, better data and compute, you get the idea.
Future of Perception

Impact of AI on jobs

Speaker was from McKinsey
McKinsey study focus
Based on current AI/ML capabilities: Few jobs will be fully automatable. Most jobs will only be partially automatable. That’s a relief!

Internet of Things

Speakers were from Bosch, Nervana (Intel) and Vion

Vion Vision was the most interesting. They are deploying machine learning models to devices like cameras. They demonstrated their bus-counting cameras, which help bus operators get real-time traffic data so that they can deploy more buses on high-traffic routes, etc. They even had a demo of public-area cameras that auto-detect a crowd beating up a person and send an alert to the local police station.

Vion Vision cameras
Camera counting
Custom chip for deep learning

Deep Learning Frameworks

Speakers were from Google, Facebook and Amazon

This was an amazing session where creators or prominent members of each Deep Learning Framework came up and talked about their thoughts on the framework status and future.

Rethinking slow float-based computation
Math Challenges
  • Scalability - How do I train on multiple GPUs and CPUs? OpenMPI, NCCL, ZeroMQ, etc.
  • Portability - Cloud, Mobile, IoT, cars, drones, coffee makers. Constraints - limited computation, battery life, models may be luxurious, ecosystem less developed
  • Augmented Computation Patterns - more than float dense math - quantized computation, sparse math libs, model compression, rethinking existing ops (ResNeXt)
  • Augmented Math Challenges
  • Modularity - reusability
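The “quantized computation” item in the list above can be sketched roughly as follows: store weights as small integers plus one shared scale factor instead of 32-bit floats. This is a minimal sketch of symmetric 8-bit quantization, assuming one scale for the whole list. The function names and sample weights are hypothetical, and real frameworks use more elaborate schemes.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers that share one scale factor."""
    max_int = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / max_int if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.05, 0.0, 0.64]  # made-up model weights
q, scale = quantize(weights)
restored = dequantize(q, scale)

print(q)
print(restored)
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of at most half the scale. That trade-off is attractive on phones, cars, and other devices with limited memory and battery.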
No silver bullet

Amazon mxnet:

Why another framework?
Core philosophy of mxnet
Current state of industry
Future direction
Torch next generation
Another vote for sharing components

Thank You AIFrontiers Organizers

It was an excellent conference, with well-chosen topics and the best speakers imaginable - the platform creators themselves. People who were expecting deep-dives or technical details were disappointed, but it was a great “state of the industry” conference for people like me who know nothing about the topic.

Thank you to the conference organizers, the Silicon Valley AI and Big Data Association and all the sponsors.

Ending Note

Geoffrey Moore (author of “Crossing The Chasm”) says:

In the coming decade all global enterprises, both private and public, will target the trapped value in their ineffective and inefficient outward-facing relationships with their targeted constituencies, be they consumers, clients, customers, patients, students, or citizens. Authentic sustainable engagement will become the new scarce ingredient. The as-a-service model will expand from commodity transactions to incorporate more significant life interests as well—education, health, personal development, family relationships, wealth management, safety and security, and the like. Machine learning and artificial intelligence will be the new keys to the kingdom, enabling institutions to operate at global scale with unprecedented speed, relevance, and accuracy. Operating models will prioritize customer relationship effectiveness over the supply chain efficiency, causing CRM to displace ERP as the most prominent information system, and the hot expertise will lie in user experience design, data analytics, machine learning, and artificial intelligence.