Learn how collected data can be used to predict the future. This is an important lesson, and it is not found in any other books or websites.

Consider the number of glasses of water students drink on a normal day and on a sunny day. Today is a hot, sunny day. Can one predict how many students will need `4` glasses of water?

One student says: As per the data in the table, we can expect that more or less `9` students will drink `4` glasses of water.

Another student says: The data represents the number of students who had `4` glasses of water yesterday. There is no way to predict how many will do so today.

Is it possible to predict the future with data?

- Yes. We can expect a value more or less equal to the value in the data.
- No. It will never be possible to predict.

The answer is "Yes. We can expect a value more or less equal to the value in the data".

Consider the number of glasses of water students drink on a normal day. The data is collected for `40` students. The next day, another *class of `40` students* is considered. Can one predict the data for this other class of `40` students?

- Yes. The data would be more or less the same.
- No. The data would be completely different.

The answer is "Yes. The data would be more or less the same". The data may not be exactly equal, but one can expect it to be more or less the same.

Consider the number of glasses of water students drink on a normal day. The data is collected for `40` students. The next day, another *class of `80` students* is considered. Can one predict the data for the class of `80` students?

- Yes. The data would be more or less double the data for `40` students.
- No. The data would be completely different.

The answer is "Yes. The data would be more or less double the data for `40` students". The data may not be exactly double, but one can expect it to be more or less double.

Consider the number of glasses of water students drink on a normal day. The data is collected for `40` students. The next day, another *class of `20` students* is considered. Can one predict the data for the class of `20` students?

- Yes. The data would be more or less half the data for `40` students.
- No. The data would be completely different.

The answer is "Yes. The data would be more or less half the data for `40` students". The data may not be exactly half, but one can expect it to be more or less half.
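The scaling idea in the last three questions can be sketched in a few lines of Python. This is only an illustration: it uses the counts for `40` students quoted elsewhere in this lesson (`11`, `18`, `6`, `2`, and `3` students for `1` to `5` glasses), and the function name `expected_counts` is a made-up helper, not part of the lesson.

```python
# Counts observed for a class of 40 students on a normal day
# (glasses of water -> number of students), as quoted in this lesson.
observed = {1: 11, 2: 18, 3: 6, 4: 2, 5: 3}
observed_class_size = sum(observed.values())  # 40 students


def expected_counts(new_class_size):
    """Scale the observed counts in proportion to the new class size.

    Hypothetical helper: it returns expectations, not guaranteed counts.
    """
    factor = new_class_size / observed_class_size
    return {glasses: count * factor for glasses, count in observed.items()}


for size in (40, 80, 20):
    print(size, expected_counts(size))
```

For `20` students some of the scaled values are fractional (for example `5.5` students for `1` glass). That is fine for an expectation: the real count would be a whole number that is more or less this value.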

Consider the number of glasses of water students drink on a normal day. The data is collected for `40` students at the end of the day. The students line up and, one by one, report how many glasses of water they drank.

• First student says, `2` glasses

• Second student says, `3` glasses

• Third student says, `1` glass

And so on. Which of the following describes the data of each student?

- `11` students out of `40` would say `1` glass
- `18` students out of `40` would say `2` glasses
- `6`, `2`, and `3` students out of `40` would say `3`, `4`, and `5` glasses respectively
- All of the above

The answer is "All of the above". Note that the table is interpreted for one person at a time.

Consider the number of glasses of water students drink on a normal day. The data is collected for `40` students. One day, *only `1` student* is considered. Can one predict how many glasses the student would drink?

- Yes. One can predict the data for one student.
- No. No prediction at all is possible for one student.
- Yes and No. The prediction for one student can be given, but only in the context of large data.

The answer is "Yes and No. The prediction for one student can be given, but only in the context of large data".

Consider the number of glasses of water students drink on a normal day. The data is collected for `40` students. The next day, *only `1` student* is considered. Clearly, it is not possible to predict with certainty the number of glasses the student will report. The prediction that is possible based on the recorded data is given below.

If the data-collection is repeated `40` times,

• the data-value `1` glass would appear `11` times out of the `40` times

• the data-value `2` glasses would appear `18` times out of the `40` times

• the data-value `3` glasses would appear `6` times out of the `40` times

• the data-value `4` glasses would appear `2` times out of the `40` times

• the data-value `5` glasses would appear `3` times out of the `40` times.

This is referred to as "**probability**": the probability of the data value `1` is `11/40`.
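The same calculation can be carried out for every data value. The sketch below is an illustration only and uses Python's standard `fractions` module; the variable names are chosen for this example. Note that `Fraction` reduces to lowest terms, so `18/40` is printed as `9/20`.

```python
from fractions import Fraction

# Counts from the lesson: glasses of water -> number of students (out of 40).
observed = {1: 11, 2: 18, 3: 6, 4: 2, 5: 3}
total = sum(observed.values())  # 40

# Probability of each data value = its count divided by the total count.
probability = {
    glasses: Fraction(count, total) for glasses, count in observed.items()
}

for glasses, p in probability.items():
    print(f"P({glasses} glasses) = {p} = {float(p):.3f}")
```

The probabilities of all the data values add up to `1`, since every student reports exactly one of the values.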

Let us consider another form of data. A person is tossing a coin and recording the data. The data is shown in tally and tabular form for `40` tosses. If the coin is tossed `10` times, can one predict the result?

- Yes, the coin will show more or less `5` heads and `5` tails.
- No, it is not possible to predict.

The answer is "Yes, the coin will show more or less `5` heads and `5` tails".
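A short Python sketch can show what "more or less `5`" means here. It assumes the recorded data was `20` heads in `40` tosses, the figure used in the next question, and it simulates the coin with Python's `random` module; the seed and the helper names `expected_heads` and `simulate_heads` are arbitrary choices for this example.

```python
import random

# Recorded data (assumed from the lesson): 20 heads in 40 tosses.
heads_observed, tosses_observed = 20, 40
p_heads = heads_observed / tosses_observed  # 0.5


def expected_heads(n_tosses):
    """Expected number of heads, scaled from the recorded data."""
    return n_tosses * p_heads


def simulate_heads(n_tosses, rng):
    """Toss a simulated coin n_tosses times and count the heads."""
    return sum(rng.random() < p_heads for _ in range(n_tosses))


rng = random.Random(0)  # arbitrary seed, only to make runs repeatable
print("expected heads in 10 tosses:", expected_heads(10))
print("five simulated runs:", [simulate_heads(10, rng) for _ in range(5)])
```

The simulated runs usually land near `5` but are often not exactly `5`, which is the sense in which the prediction holds "more or less".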

Consider the data from tossing a coin. If the coin is tossed once, can one predict the result?

- Yes, the best one can say is that the result will come up as heads `20` times in `40` repetitions.
- No, it is not possible to predict.

The answer is "Yes, the best one can say is that the result will come up as heads `20` times in `40` repetitions".

**Predicting Based on Representative Data**: Data can be used to predict the outcome of events.

Data is collected over a large number of iterations/repetitions.

It is known that the result of one iteration can be one of many possibilities.

The result of one iteration is predicted in the context of the large number of repetitions.
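The summary can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the lesson: it treats the recorded counts for `40` students as the chances for a single student, draws one simulated student at a time with `random.choices`, and compares the shares after `40` and after `4000` repetitions. The function name `simulate` and the seed are arbitrary.

```python
import random
from collections import Counter

# Counts from the lesson: glasses of water -> number of students (out of 40).
observed = {1: 11, 2: 18, 3: 6, 4: 2, 5: 3}
values = list(observed)            # possible results of one iteration
weights = list(observed.values())  # how often each result was recorded


def simulate(repetitions, seed=0):
    """Draw one simulated student per repetition and tally the results."""
    rng = random.Random(seed)  # arbitrary seed, only for repeatable runs
    draws = rng.choices(values, weights=weights, k=repetitions)
    return Counter(draws)


for repetitions in (40, 4000):
    counts = simulate(repetitions)
    shares = {v: round(counts[v] / repetitions, 3) for v in values}
    print(repetitions, shares)
```

No single draw can be predicted, but as the number of repetitions grows, the shares tend to settle near `11/40`, `18/40`, `6/40`, `2/40`, and `3/40`.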
