Pull and Push - How Machines Deliver Text Data To Humans

💡

Learn how pull and push strategies define how relevant information is delivered to the end user

In this blog post we'll take a look at how information is delivered to human beings by machines.

There are in fact different strategies that identify not only the context of information retrieval, but also user intent and means of delivery. We'll look into what information retrieval is, how user intent defines the objective and how this objective is achieved by specific information delivery systems.

What Is Information Retrieval? ℹ️

Information Retrieval (IR) is the process of gaining knowledge from a source of data in the environment. This environment can be explored in several ways to obtain such information, depending on its state and the state of the user.

The main goal of IR is to minimize the delivery of noise and maximize the delivery of signal.
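To make signal and noise a bit more tangible, here is a minimal sketch that scores one set of results with precision and recall. Using these two metrics is my own assumption about how to quantify the idea, and the document ids are hypothetical: precision drops when noise is delivered, recall drops when signal is missed.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the results of a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually delivered
    # Precision: share of delivered documents that are signal (the rest is noise).
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of all the available signal that was delivered.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids: what the system returned vs. what the user actually needed.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four delivered documents are relevant, and two of the three relevant documents were delivered.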

Think about Google - it is safe to say that it is the world's main tool for information retrieval. Now think about Amazon - we all know how powerful its recommendation systems are. But how do Google and Amazon work, and what strategies do they use to deliver relevant content to the end user? We won't look at how search engines and recommender systems work internally, but we'll go through the strategies that they employ to support IR.

Strategies to deliver text data 🦘

We have mentioned Google and Amazon. The first because it is a search engine, and the second because of its recommender systems. They are the de facto standards of the industry because they work so well, and their widespread use suggests that users are largely satisfied with their products.

But they are radically different in terms of how they deliver information to the user in some of their specific functionalities. While they both use search and recommendation systems (for instance, Google suggests related keywords, which can be interpreted as a recommendation, while Amazon delivers product info through the search bar), we can dissect how Google delivers information through search and how Amazon delivers information through recommendation.

The user queries the system: the Pull strategy ⛓️

The name is literal: the user pulls the information from the system. An example is when a user queries a database or a search engine. In this context, the user takes the initiative and searches the environment for information.

In tangible terms, whenever we search Google for something, we are pulling information from its database to do something with that information.

Pulling involves two aspects:

  1. Querying. We query the search engine with keywords and the engine returns relevant documents. The engine's ability to deliver relevant content determines whether it is doing a good job or not. Querying works very well when users know what they are looking for.
  2. Browsing. The user navigates the structure of the documents to find the information they are looking for. This strategy works well when the user doesn't know what to look for or can't conveniently query the system.
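The two aspects of pulling can be sketched in a few lines. This is only a toy illustration under my own assumptions, not how a real search engine works: the corpus, the category labels and the functions `query` and `browse` are all hypothetical, and ranking is a plain count of shared words.

```python
from collections import defaultdict

# Hypothetical toy corpus: document id -> (category, text).
DOCS = {
    "d1": ("shoes", "where to buy sneakers in Milan"),
    "d2": ("shoes", "best running sneakers of the year"),
    "d3": ("travel", "a weekend in Milan Italy"),
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def query(keywords):
    """Querying: rank documents by how many query terms they contain."""
    terms = set(tokenize(keywords))
    scores = {}
    for doc_id, (_, text) in DOCS.items():
        overlap = len(terms & set(tokenize(text)))
        if overlap:
            scores[doc_id] = overlap
    # Highest overlap first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def browse():
    """Browsing: expose the structure (category -> documents) to navigate."""
    tree = defaultdict(list)
    for doc_id, (category, _) in DOCS.items():
        tree[category].append(doc_id)
    return dict(tree)


print(query("sneakers Milan"))
print(browse())
```

In both cases the user takes the initiative: `query` needs the user to know which words to type, while `browse` only needs them to recognize a category when they see it.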

How Do Google and Amazon Use Pull Strategies?

In search and in visualizing their results.

Whenever Google returns a SERP (Search Engine Results Page), or whenever Amazon displays a list of products, they are moving the user from the querying space to the browsing space. This does not happen if the end user lands directly on the result they looked for (through Google's "I'm Feeling Lucky" button, for instance).

Users query the system → "where to buy sneakers in Milan, Italy"

Users browse the results → documents (items) match the intent of the user

The system guesses what info is relevant: the Push strategy 👐

This strategy is used when systems take the initiative to deliver presumably relevant information to the end user. It is the strategy employed by recommendation systems. The better these systems are at pushing information, the better they perform and the more they get used.

Amazon is a perfect example of how these systems work at a professional level. Netflix's system is another one worth mentioning. We can all acknowledge how powerful these systems are, in that they directly increase (or decrease) the value of each user to the business.

But why are these systems so difficult to tune? Why is Amazon so good at suggesting items while some other engines fail when it comes to creating more complex associations?

It's because these systems require stable, clean information coming in from the end user. In other words, they must have access to user behavior data. Of course, the more traffic you have on your website, the more data you can store and feed into the system.
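To see why behavior data matters so much, here is a minimal sketch of a push strategy based on item co-occurrence ("people who bought X also bought Y"). It is a deliberately simple stand-in that I am assuming for illustration, not Amazon's or Netflix's actual method, and the purchase histories and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical behavior data: user -> set of purchased items.
HISTORY = {
    "ann": {"sneakers", "socks", "laces"},
    "bob": {"sneakers", "socks"},
    "cat": {"sneakers", "laces"},
    "dan": {"novel"},
}


def co_occurrence(history):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = Counter()
    for items in history.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(user, history, k=2):
    """Push up to k items the user does not own, ranked by co-occurrence."""
    counts = co_occurrence(history)
    owned = history[user]
    scores = Counter()
    for (a, b), n in counts.items():
        if a in owned and b not in owned:
            scores[b] += n
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]


print(recommend("bob", HISTORY))
print(recommend("dan", HISTORY))
```

The system takes the initiative for `bob` because other users' behavior overlaps with his. For `dan` it has nothing to push, since his only purchase never appears next to anything else: with little or noisy behavior data, the associations simply aren't there.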

The Problem of User Intent 🧩

Natural Language Processing is a very difficult domain. Having machines decode what humans imply during conversation is turning out to be quite the challenge.

John saw a kid with a telescope.

This sentence alone is sufficient to break any NLP algorithm of the past 20 years. The portion "with a telescope" can either refer to John (as if John saw the kid by looking through a telescope) or to the kid (as if the kid was holding a telescope when John saw him). Resolving ambiguity is one of the greatest challenges in NLP today. Google and Facebook have done great work in the field, together with many other big shots of the industry.

It goes without saying that understanding user intent during search is one tough task. Google, being the leading search engine in the world, whose job is literally to predict what the user's intention is, is still trying to figure out how to achieve this. Many updates are pushed on a weekly basis to its core algorithm, and often these updates also tune the engine's ability to better understand user queries.

Data scientists and ML engineers implementing search and recommendation systems face the problem of understanding user intent more than anything. That being said, IR systems are still allowing businesses to make millions in spite of the difficulties in accurately understanding what the user really seeks. And we are just scratching the surface.

Bonus: Google's generative AI, LaMDA

As a bonus for those who stayed with me until the end (thank you 🙏), here's a video of the incredible conversational capabilities of Google's new AI, LaMDA. Not related to IR... but who knows. Maybe in the future search will be conducted via conversation? We already have Siri and the Google Assistant...