Skip to main content

Subsection 2.5.1 What’s a Boolean Query?

We’ve been thinking of logical expressions as statements that can be true or false. And we have been writing truth tables to describe the situations in which they take on one or the other of those values.

A slightly different way to think of them is as commands: Find me situations that match my expression (in other words, situations about which my expression is true).

We call commands like this queries and you’ve probably already used several of them just today.

Video cover image

For example, suppose I enter this query into my Google search bar:

(chocolate pie) OR nuts

I expect Google to make my statement true by returning pages that match it. (Of course, there’s more to Google’s search algorithm than this; it also uses popularity and other factors. But a first approximation to what it should do is to find pages whose content matches what I’ve asked for.)

Lots of apps and web sites support Boolean queries. Most support three Boolean operators:

  • And, which is generally the default operator. We don’t write it explicitly at all. So, for example, suppose I type this query into Google:

chocolate pie

The search engine will return pages that contain both words, “chocolate” and “pie”.

  • Or. Google uses the keyword OR (it must be upper case). Ebay uses commas, which must separate terms enclosed in parentheses. For example, we could ask for rings with gold or silver by writing:

ring (gold, silver)

  • Not. Not is useful as a way to pare down the number of records that are returned. The most common way to indicate it is with a minus sign. In one experiment, the following ebay query returned 72,046 items:

Winnie the Pooh

But let’s say I’m a purist. This ebay query returned only 33,477 items:

Winnie the Pooh -Disney

Most search engines don’t support complex (arbitrarily nested) Boolean expressions. You can experiment with individual ones to see what they allow.

If you do want to experiment, we’ve found that it’s much easier to see what’s going on in ebay than it is in Google. That’s because Google uses so many extra factors in deciding what pages to return. Ebay, on the other hand, will take you literally and only return items that actually match your query (or some simple variants like singular/plural forms).

For the next set of problems, use ebay because it’s much easier there, than in Google or some other search engine, to see what effect changing a query has.

Exercises Exercises

1.

Suppose that you want to find memorabilia from Carnegie-Mellon University (previously known as Carnegie Tech and sometimes called just Carnegie).

(Part 1) Go to ebay and type in the query, Carnegie. Jot down how many hits you got.

(Part 2) Now start using the minus operator to get rid of things that have nothing to do with the school. Are there a few words that, if ruled out, pare down the search a lot?

(Part 3) Suppose that you wrote a query such as:

Carnegie –purple –hamburger

(Of course, that’s not one you’d write, but we want to talk about something of that form, so this will do.)

Which of the following Boolean expressions describes all the results that that query will return:

I. Carnegie \(\wedge \) \(\neg\) purple \(\wedge \neg \) hamburger

II. Carnegie ∧ ¬(purple ∧ hamburger)

III. Carnegie ∧ ¬(purple ∨ hamburger)

IV. ¬(purple ∨ hamburger) ∧ Carnegie

  1. None of them.

  2. Just one of them.

  3. Just two of them.

  4. Just three of them.

  5. All four of them.

Answer.
Correct answer is D.
Solution.
Explanation: I is the most obviously correct choice. The keyword Carnegie must be present. And purple must be absent and hamburger must be absent. III is also correct. Carnegie must be present and neither purple nor hamburger can be present. IV is equivalent to III. II, on the other hand is different. It says that Carnegie must be present and it’s not true that BOTH purple and hamburger are present. That’s not strong enough. By the way, in the next chapter, we’ll describe general rules for manipulating logical expressions that will make it clear why I, III and IV are equivalent.