Science, Tomatoes, and How to Read a Table

Numerous knowledge systems exist, including intuition, invention, personal experience, statistical data collection, storytelling, religious teachings, philosophical schools, and science.
Updated: December 6, 2022

They all have value in certain situations. For example, we tell our kids to listen to their intuition when they go places and leave if they feel something is wrong in their gut. And whenever we go to Penn State football games, we leave our house super early to avoid the frustration of sitting in traffic, based on our personal experience.

As Extension educators at Penn State, we primarily focus on knowledge generated through scientific studies. The strength of science-based information is that it does not rely on opinion. Instead, information is built on unbiased observations and systematic experimentation. Scientists follow the scientific method, which is a rigorous process of answering questions. You can find out more about the scientific method, including the steps involved, here: Scientific Method - Definition, Steps & Example (byjus.com).

Science is the best system we have for unlocking how the universe works. Because of science, we know how plants respond to different fertilizers, training systems, light levels, weed pressure, and so much more. However, one of the problems with science is that it involves terminology that can be difficult to explain, which can lead to misunderstandings.

Take the results from our recent tomato cultivar evaluation as an example. We have been evaluating cultivars of key vegetables since 2008 because one of the most frequent questions farmers ask is, "What cultivars should I grow?" In 2022-23, we are focusing on early-maturing red slicer tomatoes. Below is a table of some of our results from the 2022 growing season.

Plants were grown using a plasticulture system with 18-inch in-row spacing and 10 feet between rows. Water was supplied at a rate of 1 acre-inch per week, and pests were managed following recommendations in the Mid-Atlantic Vegetable Production Recommendations guide.

Table 1. Yield per plant of tomato cultivars grown at the Penn State University Russell E. Larson Agricultural Research Center in Pennsylvania Furnace, Pennsylvania, in 2022.

Table showing yield per plant of 11 tomato cultivars grown at Penn State's Larson Agricultural Research Center in 2022

*Values are means of six plants per replication and four replications; 'Red Deuce' (bolded) is the standard to which all other cultivars were compared; values followed by different letters within a column are statistically different at P ≤ 0.05. A randomized complete block design was used for the experiment, with each cultivar replicated four times. Data were analyzed using the mixed procedure, and means were separated using pdiff.

The first column, highlighted in green, lists the cultivars we evaluated. Cultivar evaluations conducted to develop recommendations generally have a "standard" cultivar. This is a widely grown cultivar that is included so that you can compare the other cultivars to it. In this case, the standard is 'Red Deuce,' which is bolded.

The next column is the mean marketable fruit yield in pounds, followed by the mean marketable fruit yield by number. These columns are highlighted in blue. The values in these columns are yields of tomatoes that are saleable.

The next two columns, highlighted in orange, are the mean unmarketable fruit yields by pounds and number. The values in these columns are yields of tomatoes that are culls, meaning they are not saleable.

The last two columns, highlighted in yellow, are the total fruit yields by pounds and number. The values in these columns combine the marketable and unmarketable yield. If you add the values for a cultivar's marketable and unmarketable pounds of fruit, they will roughly add up to the total fruit yield in pounds. Sometimes they don't add up exactly because of rounding, but the values are very close.

The first value in the table is 7.8, in purple font, which is the mean marketable fruit yield in pounds for 'Red Deuce.' The value of 7.8 means that, on average, 'Red Deuce' produced 7.8 pounds of fruit per plant.

The letter "a" follows 7.8. In fact, every value in this column is followed by an "a." This is how we signal whether the data analysis showed differences between the cultivars. When the letters are the same or overlap, the values are not different, according to the analysis. Since an "a" follows all the values in the column, they are not statistically different from each other. As a shorthand, when none of the values in a column differ from each other, the letters can be left off. We've used this shorthand in the mean total fruit yield columns (highlighted in yellow) as an example.

The first value in the next column is 13.1, highlighted in red font. This is the mean number of marketable fruit that 'Red Deuce' produced per plant. The value 13.1 is followed by "ab." The next value is 16.8 for 'Patsy.' It's highlighted in brown font and followed by an "a." Since 'Red Deuce' and 'Patsy' values for the mean number of marketable fruit share an "a," they are not statistically different from each other.

The rest of the column reveals that none of the other cultivars are different from 'Red Deuce.' Now compare 'Patsy' with 'Thunderbird,' highlighted in blue font. The value for 'Patsy' is followed by an "a" and 'Thunderbird' by a "b." This indicates that these yields are statistically different. To take it further, 'Patsy' produced more tomatoes than 'Thunderbird.'
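The letter rule can also be written out as a small check. The Python sketch below is only an illustration, not part of our analysis: the function name `statistically_different` is made up for this article, and the letters are the ones described above for the mean number of marketable fruit.

```python
def statistically_different(letters_a: str, letters_b: str) -> bool:
    """Return True when two letter groupings share no letter.

    Hypothetical helper for illustration: values whose letters are the
    same or overlap are treated as not statistically different.
    """
    return set(letters_a).isdisjoint(letters_b)


# Letter groupings described in the text for mean number of marketable fruit.
groupings = {"Red Deuce": "ab", "Patsy": "a", "Thunderbird": "b"}

pairs = [
    ("Red Deuce", "Patsy"),
    ("Red Deuce", "Thunderbird"),
    ("Patsy", "Thunderbird"),
]

for first, second in pairs:
    different = statistically_different(groupings[first], groupings[second])
    verdict = "different" if different else "not different"
    print(f"{first} vs. {second}: {verdict}")
```

Only the 'Patsy' and 'Thunderbird' pair shares no letter, so it is the only pair reported as different.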

Let’s go back to the first value in the table, 7.8. The asterisk behind 7.8 signals a note at the bottom of the table in italics. The note provides more information about how to read the table, how the experiment was designed, and how the data were analyzed. The bolded italics text states, "Values followed by different letters within a column are statistically different at P ≤ 0.05." "P" stands for probability. You can think of probability as the likelihood that an event will happen. The value of ≤ 0.05 can be thought of as a percent or how often in 100 times the event will happen. Using our study as an example, if the cultivars truly did not differ and we conducted our tomato evaluation 100 times, chance alone would make the mean marketable fruit yields in pounds look different in 5 or fewer of those times. Put another way, in at least 95 of those 100 times, the yields would not be flagged as different.

This can be confusing because, numerically, the values are different. For example, the 7.8 for 'Red Deuce' is numerically different from the 7.3 for 'Patsy,' highlighted in green font. These values are means, which are the averages of the experimental data. In our experiment, we harvested fruit from six plants of each cultivar in each row, or block, and there were four blocks. Each mean, then, is based on the yield of 24 plants. You can get the same average from different sets of numbers. For example, the average of 4 and 6 is 5, and the average of 0 and 10 is also 5. However, 4 and 6 represent a much smaller range than 0 and 10. We call this variability. It's the same for the means in the table: sometimes the range of values behind a mean is small, and sometimes it is large.
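For readers who like to check the arithmetic, here is a minimal Python sketch of the same idea. The variable names and the `value_range` helper are our own choices for illustration; the numbers are the ones used in the paragraph above.

```python
from statistics import mean

# Two sets of numbers with the same average but different spread.
narrow = [4, 6]
wide = [0, 10]


def value_range(values):
    """Range of a data set: the largest value minus the smallest."""
    return max(values) - min(values)


for values in (narrow, wide):
    print(values, "mean:", mean(values), "range:", value_range(values))

# Number of plants behind each mean in the table:
# six plants per replication times four replications.
plants_per_mean = 6 * 4
print("plants per mean:", plants_per_mean)
```

Both sets have the same mean, but the second set has a range five times larger.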

Standard deviation is used in experiments to measure the variability around the mean. In the table, we have used the letters after the numbers to signal when the means and standard deviation show differences between the values. Even though the 7.8 for 'Red Deuce' is numerically different from the 7.3 for 'Patsy,' when you consider the variability or the range between the numbers that resulted in 7.8 and 7.3, they are not different according to the data analysis.
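As a rough illustration of what standard deviation measures, the Python sketch below computes it for the two small example sets, 4 and 6 versus 0 and 10. This is not the analysis behind the table, which used the mixed procedure and pdiff as stated in the table note; it only shows that two sets with the same mean can have very different variability.

```python
from statistics import mean, stdev

# Same mean, different variability (example numbers, not trial data).
narrow = [4, 6]
wide = [0, 10]

for values in (narrow, wide):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(values, "mean:", mean(values), "standard deviation:", round(stdev(values), 2))
```

The set with the wider range also has the larger standard deviation, even though the means are identical.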

Another way to think of this is by analogy:

  • Pretend you go to a restaurant, and the server tells you that the wait for a table will be 30 minutes plus or minus 10 minutes; they are giving you the variability. You may have to wait 20 minutes on the low end and 40 minutes on the high end. Your friend goes to another restaurant, and the server tells them that the wait is 20 minutes plus or minus 15 minutes. They would have to wait between 5 minutes on the low end and 35 minutes on the high end.
  • Now, let's add the concept of the P-value using P ≤ 0.05. Since the two wait times overlap, 20 to 40 minutes and 5 to 35 minutes, we would say they were not different, even though the average wait times of 30 and 20 minutes are different numerically. In other words, 95 times in 100, the wait time for you and your friend will fall into that overlapping range and, therefore, not be different. Experiments and data analysis are more involved than this, but this analogy offers a simplified explanation.
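The restaurant analogy can be worked through in a few lines of Python. This is a simplified sketch of the analogy only, not of how statistical software separates means, and the function names `wait_range` and `overlap` are made up for this example.

```python
def wait_range(average, plus_minus):
    """Return the (low, high) wait in minutes for 'average plus or minus'."""
    return (average - plus_minus, average + plus_minus)


def overlap(range_a, range_b):
    """Return the overlapping (low, high) span of two ranges, or None."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


yours = wait_range(30, 10)    # 30 minutes plus or minus 10
friends = wait_range(20, 15)  # 20 minutes plus or minus 15

print("Your wait:", yours)
print("Friend's wait:", friends)
print("Overlap:", overlap(yours, friends))
```

Because the two ranges overlap, the analogy treats the two wait times as not different, even though the averages differ.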

With this knowledge, anyone reading the table can interpret it. For example, none of the cultivars evaluated were different from 'Red Deuce' for mean marketable fruit by weight or number, mean unmarketable fruit by weight or number, or mean total fruit by weight or number. This means that any of these cultivars, grown using the same methods in a similar environment, will produce tomato yields that are not different from those of 'Red Deuce' most of the time, so any of them may be a good option to grow.