What does Meeker’s Internet Trends report tell us about voice search?

June 3, 2016 7:05 pm

This week saw the publication of Mary Meeker’s annual Internet Trends report, packed full of data and insights into the development of the internet and digital technology across the globe.

Of particular interest to us here at Search Engine Watch is a 21-page section on the evolution of voice and natural language as a computing interface, titled ‘Re-Imagining Voice = A New Paradigm in Human-Computer Interaction’.

It looks at trends in recognition accuracy, voice assistants, voice search and sales of devices like the Amazon Echo to build up an accurate picture of how the voice interface has progressed over the past few years, and how it is likely to progress in the future.

So what can we learn from the report and Meeker’s data about the role of voice in internet trends for 2016?

Voice search is growing exponentially

We know that voice is a fast-growing trend in search, as the proliferation of digital assistants and the advances in interpreting natural language queries make voice searching easier and more accurate.

But the figures from Meeker’s report show exactly to what extent voice search has grown over the past eight years, since the launch of the iPhone and Google Voice Search in 2008. Google voice queries have risen more than 35-fold from 2008 to today, according to Google Trends, with “call mom” and “navigate home” being two of the most commonly used voice commands.

A slide from Meeker's trends report showing the rise in Google Voice Search queries since 2008. The heading reads, "Google Voice Search Queries = Up >35x since 2008 and >7x since 2010, per Google Trends". The graph below it tracks the rise of three terms: "Navigate Home", "Call Mom" and "Call Dad", represented by a red line, a blue line and an aqua line respectively. All three terms have fairly low growth from 2008 to 2013, followed by a rapid rise with several sharp peaks upwards from 2013 to 2016. The 'Call Dad' trend grows the least, with the 'Call Mom' trend rising the fastest, briefly overtaken by 'Navigate Home' in 2015.

Tracking the rise of voice-specific queries such as “call mom”, “call dad” and “navigate home” is an unexpected but surprisingly accurate way to map the growth of voice search and voice commands. As an aside, anyone can track this data for themselves by entering the same terms into Google Trends. It’s interesting to think what the signature voice commands might be for tracking the use of smart home hubs like Amazon Echo in a few years’ time.

Google is, of course, by no means the only search engine experiencing this trend, and the report goes on to illustrate the rise in speech recognition and text to speech usage for the Chinese search engine Baidu. Meeker notes that “typing Chinese on a small cellphone keyboard [is] even more difficult than typing English”, resulting in “rapidly growing” usage of voice input across Baidu’s products.

A slide from Meeker's report showing growth in Baidu voice input. The header reads, "Baidu Voice = input growth >4x... Output >26x, since Q2: 14". Below it are two graphs showing upward trends in usage between Q2 of 2014 and Q1 of 2016. The left-hand graph, Baidu Speech Recognition daily usage, has a steady upward climb with a slight plateau between Q2 and Q3 of 2015, followed by a much sharper increase. The right-hand graph, Baidu Text to Speech daily usage, shows a very gradual rise from Q2 in 2014 to Q2 in 2015, followed by a steep rise up to the present day.

Meeker also plots a timeline of key milestones in the growth of voice search since 2014, noting that 10% of Baidu search queries were made by voice in September 2014, that Amazon Echo was the fastest-selling speaker in 2015, and that Andrew Ng, Chief Scientist at Baidu, has predicted that by 2020 50% of all searches will be made with either images or speech.

While developments in image search haven’t been making as much of a splash as developments with voice, image search shouldn’t be overlooked, as the technology that will allow us to ‘search’ objects in the physical world is coming on in leaps and bounds. In April, Bing implemented an update to its iOS app allowing users to search the web with photos from their phone camera, although the feature is limited to users in the United States, as they’re the only ones who can download the app.

The visual search app CamFind, which has been around since 2013, also has an uncanny ability to identify objects in the physical world and call up product listings, which has a huge amount of potential for both search and marketing.

A timeline showing milestones in the progression of voice search usage. In September 2014, Baidu reached 1 in 10 queries coming through speech. In June 2015 Siri handled more than 1 billion requests per week through speech. In 2015 the Amazon Echo was the fastest-selling speaker, representing 25% of the USA speaker market according to 1010data. In May 2016 Bing reported 25% of searches performed on the Windows 10 taskbar are voice searches, while Google reported that 1 in 5 Android mobile app searches in the USA are voice searches and the share is growing. Finally, the timeline notes Andrew Ng's prediction that by 2020 at least 50% of all searches will be either through images or speech.

Why do people use voice?

The increase in voice search and voice commands is not only due to improved technology; the most advanced technology in the world still wouldn’t see widespread adoption if it wasn’t useful. So what are voice input adopters (at least in the United States) using it to do?

The most common setting for using voice input is the home, which explains the popularity of voice-controlled smart home hubs like Amazon Echo. In second place is the car, which tallies with the most popular motivation for using voice input: “Useful when hands/vision occupied”.

Two bar charts on voice usage side by side. The left-hand chart is titled Primary Reasons for Using Voice, USA, 2016. It shows that 61% of users find voice input useful when their hands or vision are occupied, while 30% find it gives faster results, 24% have difficulty typing on certain devices, 22% think it is fun or cool, 12% use it to avoid confusing menus, and 1% have another motivation. The right-hand chart shows the primary setting for voice usage, with home on 43%, car on 36%, on the go with 19% and work at 3%.

30% of respondents found voice input faster than using text, which also makes sense: Meeker observes elsewhere in the report that humans can speak almost 4 times as quickly as they can type, at an average of 150 words per minute (spoken) versus 40 words per minute (typed). While this has always been the case, the ability of technology to accurately parse those words and quickly deliver a response is what is really beginning to make voice input faster and more convenient than text.
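
As a quick check of the arithmetic behind that comparison, here is a minimal Python sketch. The 150 and 40 words-per-minute averages are the figures quoted from the report; the function name and the 100-word example message are hypothetical, chosen only for illustration.

```python
# Average rates quoted from Meeker's report (words per minute).
SPOKEN_WPM = 150
TYPED_WPM = 40


def seconds_to_enter(word_count: int, words_per_minute: int) -> float:
    """Time in seconds to enter `word_count` words at a given rate."""
    return word_count / words_per_minute * 60


# How many times faster speaking is than typing.
speed_ratio = SPOKEN_WPM / TYPED_WPM

# Hypothetical 100-word message, used only as an illustration.
spoken_seconds = seconds_to_enter(100, SPOKEN_WPM)
typed_seconds = seconds_to_enter(100, TYPED_WPM)

print(f"Speaking is {speed_ratio:.2f}x as fast as typing")
print(f"100 words: {spoken_seconds:.0f}s spoken vs {typed_seconds:.0f}s typed")
```

The ratio works out to 3.75, or just under four times as fast. These are raw input rates only; they say nothing about recognition accuracy or response latency.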

As Andrew Ng said, in a quote that is reproduced on page 117 of the report, “No one wants to wait 10 seconds for a response. Accuracy, followed by latency, are the two key metrics for a production speech system…”

The third most popular reason for using voice input, “Difficulty typing on certain devices”, is a reminder of the important role that voice has always played, and continues to play, in making technology more accessible. The least popular setting for using voice input is at work, which could be due to the difficulty in picking out an individual user’s voice in a work environment, or due to a social reluctance to talk to a device in front of colleagues.

A pie chart showing a breakdown of different kinds of query for the Hound voice assistant app in the USA. The header reads, "Hound Voice Search and Assistant App = 6-8 queries across 4 categories per user per day". The pie chart is divided into 4 segments. The largest is General Information at 30%, followed by Personal Assistant on 27%, then Local Information at 22%, and finally Fun and Entertainment at 21%.

Meeker’s report also looks into the usage of one digital assistant in particular: Hound, an assistant app developed by the audio recognition company SoundHound, which was also recently used to add voice search capabilities to SoundHound’s music search engine of the same name.

What’s interesting about the usage breakdown for Hound, at least among the four fairly broad categories that the report divides it into, is that no one use type dominates overwhelmingly. The most popular use for Hound is ‘general information’, at 30%, above even ‘personal assistant’ (which is what Hound was designed to do) at 27%.

Put together with the 22% share of queries for ‘local information’, more than half (52%) of voice queries to Hound are information queries, suggesting that many users still see voice primarily as a gateway into search. It would be interesting to see similar graphs for usage of Siri, Cortana and Google’s assistants to determine whether this trend is borne out across the board.

A tipping point for voice?

Towards the end of the section, Meeker looks at the evolution and ownership of the Amazon Echo, which, as a device that was specifically designed to be used with voice (as opposed to smartphones, which had voice capabilities integrated into them), is perhaps the most useful product case study for the adoption of voice commands.

Meeker notes on one slide that computing industry inflection points are “typically only obvious with hindsight”. On the next, she juxtaposes the peak of iPhone sales in 2015 and the beginning of their estimated decline in 2016 with the take-off of Amazon Echo sales in the same period, seeming to suggest that one caused the other, or that one device is giving way to the other for dominance of the smart device market.

A slide from Meeker's report with the heading "iPhone sales may have peaked in 2015... while Amazon Echo device sales beginning to take off?" On the left is a column graph showing iOS smartphone unit shipments globally from 2007. The columns rise steadily up to 2015, where they peak, with a reduced number (a little over 200 million shipments) estimated for 2016. On the right is a column graph showing estimated Amazon Echo unit shipments in the USA, from Q2 in 2015 to Q1 in 2016. The columns rise in large increments in 2015, while the column for 2016 is only slightly higher, at an estimated 1 million unit shipments.

I’m not sure I would agree that the Amazon Echo is taking over from the iPhone (or from smartphones), since they’re fundamentally different devices: one is designed to be house-bound, the other portable; one is visual and the other is not; and as I pointed out above, the Amazon Echo is designed to work only with voice, while the iPhone merely has voice capabilities.

But it is interesting to view the trend as part of a shift in the computing market towards a different type of technology: an ‘always-on’, Internet of Things-connected device specifically designed to work with voice, and perhaps that’s the point that Meeker is making here.

Meeker points to the fast movement of third-party developers to build platforms which integrate the Alexa voice assistant into different devices as evidence of the expansion of “voice as computing interface”. While I think we will always depend on a visual interface for many things, this could be the beginning of a tipping point where voice commands take over from buttons and text as the primary input method for most devices and machines.

Hopefully Meeker will revisit this topic in future trends reports so that we can see how things play out over the next few years.
