Donald Campbell, together with a co-author, published a book on planning experiments in the field of psychology, Experimental and Quasi-Experimental Designs for Research, in which he used the phrase "perfect experiment".

“In an ideal experiment, only the independent variable (and, of course, the dependent variable, which takes on different values under different conditions) is allowed to change. Everything else remains unchanged, and therefore the dependent variable is affected only by the independent variable.”

Robert Gottsdanker, Fundamentals of psychological experiment, M., Moscow University Publishing House, 1982, p. 51.

“In our three well-designed experiments, this was certainly not the case. The weavers wore headphones and worked without them at different times, on even or odd weeks. The pieces that Jack learned using the whole and partial methods were also different. And Yoko never drank both types of tomato juice on the same day.

In each case, something else changed in addition to the independent variable. […]

As you will soon see, a perfect experiment is impossible. However, the idea itself is useful, and it is what guides us when improving real experiments.

In an ideal (impossible) experiment, the weaver would work with and without headphones at the same time! Jack Mozart would simultaneously learn the same piece using whole and partial methods!

In both of these cases, the difference in the values of the dependent variable would be due only to the independent variable, the difference in its conditions. In other words, all incidental circumstances, all other potential variables would remain at the same unchanged level.”

Robert Gottsdanker, Fundamentals of psychological experiment, M., Moscow University Publishing House, 1982, pp. 51-52.

The ideal experiment is a scientific model, a mental ideal, a standard against which real experiments can be evaluated.

If you want to test experimentally whether music programs on the radio help you learn French words, you can easily do this by repeating one of the experiments described in the previous chapter. You will most likely model your experiment on Jack Mozart's. You will determine both conditions of the independent variable in advance, study at the same time of day, and document each step of the experiment. Instead of four piano pieces, you could learn four lists of words in the following order: with the radio, without the radio, without the radio, with the radio. In other words, you will be able to use the same experimental design as Jack.


It is quite possible that you will understand some of the reasons for your own actions. But something will certainly remain unclear, above all the sequence of conditions of the independent variable, that is, the experimental design itself. This is not your fault, because you have not yet studied experimental designs. This chapter will fill that gap. Of course, you can conduct an experiment by simply imitating a model, but it is much better to understand what you are doing. No two experiments are identical, and blindly copying an experimental design often leads to difficulties. For example, Yoko could have used regular alternation between the two conditions (types of juice) in her experiment, as was done in the experiment with the weavers (using or not using headphones). But then she would probably have known the name of the juice being tested, which is exactly what she was trying to avoid by using a random sequence. Moreover, unless you know the rationale behind the various plans and designs, it will be difficult for you to evaluate the quality of the experiments you read about. And, as you remember, teaching you this is one of the main goals of our book.
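The contrast between regular alternation and a random sequence can be sketched in a few lines of Python. This is only an illustration: the labels "A" and "B" stand in for the two types of juice, the helper names are hypothetical, and the split of 18 days per type over 36 days is an assumption rather than something stated in the text.

```python
import random


def regular_alternation(conditions, total_days):
    """Cycle through the conditions in a fixed, predictable order."""
    return [conditions[day % len(conditions)] for day in range(total_days)]


def random_sequence(conditions, days_per_condition, seed=None):
    """Give every condition the same number of days, then shuffle the order."""
    rng = random.Random(seed)  # a seed makes the plan reproducible
    sequence = [c for c in conditions for _ in range(days_per_condition)]
    rng.shuffle(sequence)
    return sequence


juices = ["A", "B"]  # hypothetical labels for the two types of juice

regular_plan = regular_alternation(juices, 36)
random_plan = random_sequence(juices, 18, seed=1)  # 18 days each is assumed

print(regular_plan[:6])  # a taster could predict every next item
print(random_plan[:6])   # the next item cannot be guessed from the pattern
```

Both plans contain each type of juice equally often; they differ only in whether the order can be guessed in advance.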


In this chapter, we will compare the designs used to conduct the experiments in Chapter 1 with less successful plans for conducting the same experiments. The model for their comparison will be a “flawless” experiment (which is practically impossible). Analysis of this kind will allow us to consider the basic ideas that guide us in creating and evaluating experiments. During this analysis, we will introduce several new terms to our vocabulary. In the end, we will determine what is perfect and what is not in the three experimental designs that were used in Chapter 1. And these designs represent three ways of ordering, or three types of sequences of presentation of different conditions of the independent variable, used in an experiment with one subject.


After studying the material in this chapter, you will be able to competently plan your own experiment without imitating someone else’s. At the end of the chapter you will be asked questions on the following topics:


1. The degree of approximation of a real experiment to an ideal one.


2. Factors that violate the internal validity of the experiment.


3. Systematic and non-systematic sources of violation of internal validity.


4. Methods for increasing internal validity, methods of primary control and experimental designs.


5. Some new terms from the experimenter’s vocabulary.

JUST PLANS AND MORE SUCCESSFUL PLANS

Undoubtedly, the first condition for conducting an experiment is its organization, the presence of a plan. But not every plan can be considered successful. Let us assume that the experiments described in Chapter 1 were carried out differently, according to the following plans.


1. In the first experiment, let the weaver first wear headphones for 13 weeks, and then work without them for 13 weeks.


2. Suppose Yoko decided to use only two cans of each type of juice in her experiment, and the entire experiment took four days instead of 36.


3. Jack decided to apply the partial method of memorization to the first two pieces, and the whole method to the next two.


4. Or, keeping the same sequence of methods, Jack chose short waltzes for the experiment, rather than the longer pieces that he usually learned.


We feel quite clearly that, in comparison with the experiments previously described, all these plans are unsuccessful. And if we had a standard for comparison, we could say exactly why the original plans were better. A flawless experiment serves as such a standard. In the next section, we discuss it in detail and then see how it is used to evaluate our experiments.

A FLAWLESS EXPERIMENT

We now have examples of successfully and unsuccessfully designed experiments. Is it possible to further improve a well-designed experiment? And is it possible to make an experiment absolutely flawless? The answer is: any experiment can be improved indefinitely or, which amounts to the same thing, a perfect experiment cannot be carried out. Real experiments improve as they get closer to perfection.

The perfect experiment

Flawlessness is best defined in terms of the concept of an ideal experiment (Keppel, 1973, p. 23). In an ideal experiment, only the independent variable (and, of course, the dependent variable, which takes on different values under different conditions) is allowed to change. Everything else remains the same, so the dependent variable is affected only by the independent variable. This is certainly not the case in our three well-designed experiments. The weavers wore headphones and worked without them at different times, on even or odd weeks. The pieces that Jack learned using the whole and partial methods were also different. Yoko never drank both types of tomato juice on the same day. In each case, something else changed in addition to the independent variable. In subsequent chapters, we will cover another type of experiment, which uses different subjects for each condition of the independent variable, eliminating variations over time (such as even and odd weeks) and task differences (such as the pieces being learned). But such experiments do not meet all the requirements of an ideal experiment either, because the subjects will also be different. As you will soon see, a perfect experiment is impossible. However, the idea itself is useful, and it is what guides us when improving real experiments.


In an ideal (impossible) experiment, the weaver would work with and without headphones at the same time! Jack Mozart would simultaneously learn the same piece using whole and partial methods. In both of these cases, the difference in the values of the dependent variable would be due only to the independent variable, the difference in its conditions. In other words, all incidental circumstances, all other potential variables would remain at the same unchanged level.

Endless experiment

Poor Yoko! In her case, even a perfect experiment will not be flawless. No wonder she fears that tomato juice of the same type varies in quality from can to can. Even if she had conducted a perfect experiment, managing to drink two different types of juice from the same glass at the same time, her ratings would still apply only to particular samples of each type. Yet Yoko could eliminate the effects of variability in juice quality across cans by achieving a different kind of impossible feat. “All” she needs to do is not stop her experiment after 36 days but continue it indefinitely. Then she could average out not only the variability of each type of juice, but also possible fluctuations in her own assessments of its taste. This is an endless experiment. It is not difficult to see that it is not only impossible, but also meaningless. After all, the whole point of an experiment is to draw conclusions of wider application from a limited amount of data. However, an endless experiment, like an ideal one, also serves as our guiding idea.


In fact, Jack Mozart and the authors of the weaving workshop study could also be asked to conduct an endless experiment instead of an ideal one. After all, even if in an ideal experiment Jack discovers that the partial method is more effective for this particular piece, the question remains whether the advantages of this method will persist when he learns other pieces. The first experiment raises the same doubts: what if the weaver worked better with headphones only during the experiment? However, they (and you) need to be warned that an endless experiment also has its downsides. The very fact that subjects are presented with one of the experimental conditions may affect (during the study period) their performance under the other condition. It is possible that the partial method was more effective during the experiment only because of the contrast with the whole method. After the experiment, only one method will be used, and the contrast factor will disappear. All this shows that neither ideal nor endless experiments are completely flawless. Fortunately, they have not only different disadvantages but also different advantages, and they can serve to evaluate real experiments that are very far from perfect.

Full Compliance Experiment

Neither ideal nor infinite experiments can eliminate the shortcomings of Jack Mozart's unsuccessful version of the study - memorizing waltzes instead of sonatas. At best, Jack could conduct a brilliant experiment on waltzes - which, however, will not make them sonatas!


To completely eliminate shortcomings of this kind, a full compliance experiment is needed. This experiment is also pointless, although it is practically feasible. In his study, Jack would have to learn the very same pieces that he would go on to learn afterwards. There is no benefit from such an experiment, just as there is none from an endless one. But no one could reproach Jack for a mismatch between the pieces he learned in his experiment and the pieces he actually needs to learn.


All three types of (almost) perfect experiment are unrealistic. An ideal experiment is impossible, a full compliance experiment is meaningless, and an infinite experiment is both. They are useful as “thought” experiments. They tell us what to do to create an effective experiment. Ideal and infinite experiments show how to avoid extraneous influences and thereby achieve greater confidence that the experimental results truly reflect the relationship between the independent and dependent variables. The full compliance experiment reminds us of the need to control for other important experimental variables, which we hold constant.

GENERALIZATION, REPRESENTATION AND VALIDITY

As we established in Chapter 1, the goal of any experimental study is to ensure that conclusions based on a limited amount of data remain valid beyond the experiment. This is called generalization. Our analysis of a flawless experiment shows that the reliability of experimental conclusions is determined by at least two requirements. The validity of possible generalizations also depends on them. The first requirement is that the relationship between the independent and dependent variables found in the experiment be free from the influence of other variables. The second requirement is that the constant level of the additional variable involved in the experiment corresponds to its level in the wider field of practice.

Representativeness

We already know that a perfect experiment is impossible, but it gives us guidelines for properly designing real-world experiments. We can now ask how these principles are applied. The answer is simple: you need to determine how successfully the actual experiment represents a flawless experiment. First of all, let's see to what extent the possibility of extraneous influences on the dependent variable is excluded in our experiments.


In the original study, conducted in a weaving workshop, the subject worked for 13 weeks with headphones and 13 alternating weeks without headphones. In the "unsuccessful" version of the experiment, she wore headphones for the first 13 weeks and worked without them for the next 13. In an ideal experiment, the subject would have to work with and without headphones at the same time. It is clear that the pattern of alternating weeks approaches this ideal to a greater extent. An alternation of the two conditions, ABABABABAB and so on, is more representative of their simultaneous presentation than a sequence consisting of a single block of A followed by a single block of B.
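Why alternation comes closer to simultaneous presentation can be shown with a small sketch. It assumes, purely for illustration, that some side factor drifts steadily by one unit per week and that the two conditions themselves have no effect at all; the function name and the drift value are hypothetical and are not taken from the weaving study.

```python
def mean(values):
    return sum(values) / len(values)


def drift_bias(sequence, drift_per_week=1.0):
    """Difference (B minus A) in the average level of a steadily drifting
    side factor, when the conditions themselves have no effect."""
    levels = {"A": [], "B": []}
    for week, condition in enumerate(sequence, start=1):
        levels[condition].append(drift_per_week * week)
    return mean(levels["B"]) - mean(levels["A"])


blocked = ["A"] * 13 + ["B"] * 13  # 13 weeks of A, then 13 weeks of B
alternating = ["A", "B"] * 13      # ABAB... over the same 26 weeks

print(drift_bias(blocked))      # 13.0
print(drift_bias(alternating))  # 1.0
```

Under this assumed drift, the blocked plan attributes a spurious difference of 13 units to the conditions, while the alternating plan attributes only 1.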


In his original experiment, Jack Mozart learned pieces in the following order: whole method - partial - partial - whole. In the “unsuccessful” experiment the sequence was different: whole - whole - partial - partial. In the first case, the average positions of the whole and partial methods were the same. The whole method occupied positions 1 and 4 in the sequence, an average of 2.5. The partial method occupied positions 2 and 3, also an average of 2.5. In the “unsuccessful” experiment, by contrast, the whole method occupied positions 1 and 2, an average of 1.5, and the partial method occupied positions 3 and 4, an average of 3.5. The original experiment was again more representative of the simultaneous presentation of two conditions.
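The average positions quoted above can be checked with a short calculation. The sketch below uses the two sequences exactly as given in the text; the function name is hypothetical.

```python
def mean_positions(sequence):
    """Average position (counting from 1) of each condition in a sequence."""
    positions = {}
    for position, condition in enumerate(sequence, start=1):
        positions.setdefault(condition, []).append(position)
    return {c: sum(p) / len(p) for c, p in positions.items()}


original = ["whole", "partial", "partial", "whole"]
unsuccessful = ["whole", "whole", "partial", "partial"]

print(mean_positions(original))      # {'whole': 2.5, 'partial': 2.5}
print(mean_positions(unsuccessful))  # {'whole': 1.5, 'partial': 3.5}
```

Equal average positions mean that anything changing steadily over the course of the experiment, such as practice or fatigue, weighs on both methods to about the same degree.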


In her original experiment, Yoko drank both Rittenhouse and Buddin' Beadle juices in random order for 36 days. In the “unsuccessfully” modified version, the experiment ended after 4 days. It is clear that 36 is closer to infinity than 4. The original design represents an infinite experiment better than the modified design.
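One way to put a number on "closer to infinity" is the standard error of an average, which shrinks in proportion to the square root of the number of observations. The sketch below is only an illustration under a strong assumption: that the daily ratings are independent and equally variable. The spread of 1.0 is a hypothetical value, not a figure from the experiment.

```python
import math


def standard_error(spread, n_observations):
    """Standard error of a mean of n independent, equally variable ratings."""
    return spread / math.sqrt(n_observations)


spread = 1.0  # hypothetical day-to-day spread of the ratings

se_short = standard_error(spread, 4)   # the 4-day version
se_long = standard_error(spread, 36)   # the original 36-day experiment

# The ratio depends only on the number of days, not on the assumed spread.
print(se_short / se_long)
```

Under these assumptions the average from 36 days is about three times more stable than the average from 4 days, and it would keep improving, ever more slowly, as the experiment approached the endless one.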


The full compliance experiment is better represented in Jack's original study than in his modified version with waltzes. Although Jack did not learn all the pieces he intended to learn later, he took exactly the same type of pieces, that is, he chose the appropriate level of the additional variable. And the option with waltzes turns out to be “inadequate”, since these pieces differ in their level from those that Jack would learn in a full compliance experiment.


In summary, more reliable information about the relationship between the independent and dependent variables is provided by those experiments that better represent the ideal and infinite experiments. And the closer the level of a significant additional variable in the experiment is to its level in the full compliance experiment, the better the real situation being studied is represented in it.

Validity

Depending on how flawless real experiments are, they are called more or less valid. A flawless experiment would allow one to accurately separate the correct hypothesis from the incorrect one. If Jack Mozart could conduct a perfect experiment, he would know with absolute certainty which of his hypotheses is correct: the partial method is better or the whole method is better. Thus, when you talk about the validity of an experiment, you are assessing the quality of the work you propose to do to determine the validity of one of the competing hypotheses.

Internal validity

All three of the “failed” experiments we described lacked internal validity. This means that they do not allow us to consider the resulting picture of the relationship between the independent and dependent variables as reliable. And, as we have seen, all sorts of outside influences are to blame for this. An experiment that lacks internal validity cannot be used to determine which hypothesis about the relationship between the independent and dependent variables is true and which is false. For example, if we are not clear whether the weaver worked better because she wore headphones or because the weather was good, we cannot consider the results of the experiment sufficient to determine the true and false hypotheses about the effect of headphones on labor productivity.

The term “internal” emphasizes the essence of this type of validity. We can say that an experiment lacking internal validity is unsuccessful, so to speak, from the inside, by its very essence. Indeed, if it does not allow one to verify the reliability of the found relationship between the independent and dependent variables, it is simply useless.

External validity

The “inadequate” experiment that Jack could have conducted, learning waltzes instead of sonatas, would not have been a failure in principle. It would be a perfectly normal experiment on learning waltzes. It cannot be considered useless. Jack could have used its results if he had decided in hindsight that what he was actually looking for was the most effective method of learning waltzes. However, this experiment lacks external validity. It does not provide sufficient grounds for deciding which hypotheses about the best method of learning sonatas are true and which are false.

The term “external” refers to the definition of the topic of the experiment, that is, what exactly it is devoted to. In this case, the experiment was not externally valid because “sonatas” are just as necessary a component of the hypothesis being tested as the independent and dependent variables.

General definitions

The concepts of external and internal validity are central to our entire book. Their application in subsequent chapters is determined in basic terms by what we have just said. We will now give more formal definitions of these concepts. True, you will understand their full significance only when you become acquainted with experimental problems of a higher order. But you will already have a basis for a general understanding and further clarification of what validity is and its two types.


Let's start with a schematic representation of the experimental hypothesis:


Independent variable ... Relationship ... Dependent variable ... Levels of other variables


So, the hypothesis includes the relationship itself and the designations of both its sides. The validity of an experiment, both internal and external, is defined as follows. It is the degree of validity of the conclusion about the experimental hypothesis that the results of a given experiment provide, in comparison with the results of an experiment that is flawless in all three aspects.


The concept of internal validity of an experiment concerns only the relationship itself and does not affect what exactly is correlated. Hence, internal validity is the degree of validity of a conclusion about an experimental hypothesis based on the results of a given experiment, compared with the conclusion based on the results of ideal and infinite experiments, where changes in the independent and dependent variables occur under the same conditions, and all other side factors remain unchanged.


Any experiment also faces the problem of how well the situation being studied corresponds to the real one. The question of whether the level of an additional variable, such as the kind of music, corresponds has already arisen. We will discuss similar issues for the independent and dependent variables a little later. It is clear that questions about correspondence concern the content of what stands on either side of the relationship being studied. These are the issues of external validity. It can be defined as the degree of validity of a conclusion about the experimental hypothesis in comparison with the conclusion based on the results of an experiment with full correspondence of the independent variable, the dependent variable and the levels of all additional variables.


In this chapter we will discuss mainly the issue of internal validity. In any experiment you will face this problem from the very beginning; if internal validity is not achieved, there is no point in considering external validity. Recall that Chapter 1 presented experiments of a type for which issues of external validity hardly arise. In the next chapter we will look at experiments in which these issues come to the fore.

No guarantees

We can say that an experiment is valid without actually knowing whether the conclusions are correct. We can prove that it is invalid without knowing that the conclusions are erroneous. The reason is that we cannot know in advance which of the two competing hypotheses is correct. After all, if we knew about this, we would not have to conduct the experiment. If Jack had known in advance which of his two hypotheses was true: (1) the partial method is better or (2) the whole method is better, he might not have carried out his research.


When determining the validity of real experiments, we must compare the procedures for conducting them with the procedures for “conducting” a flawless experiment. A valid experiment represents a flawless experiment better than an invalid one. Therefore, in a valid experiment we are more likely to obtain the same results that we could achieve in a flawless experiment. It is important to remember that limited, and always imperfect, experimental data come with risks. Even the most highly valid experiment can provide inaccurate information about the correctness of the experimental hypothesis, and information obtained in an invalid experiment may turn out to be accurate. We will discuss the reasons for this risk and its impact on the interpretation of experimental results in the following chapters, primarily in Chapter 6 (“Significant Results”).

FACTORS THREATENING INTERNAL VALIDITY

We can now apply the concept of a flawless experiment (ideal and infinite) to describe what prevents the achievement of internal validity in real experiments. As we will see, some of these obstacles cannot be eliminated; they are necessarily bound up with the procedures for conducting our not-quite-flawless experiments. For example, if Jack needs to learn two pieces, he will inevitably learn one of them first. There are, however, difficulties that can be overcome if you take care of them in advance. For example, Jack already knew that he should not use the partial and whole methods at different times of day.

Changes over time

Known side factors. In an ideal experiment, different conditions of the independent variable are presented to the subject simultaneously. Jack couldn't do that, but he could at least practice at the same time of day. Time of day is a known side variable (i.e., a variable other than the independent one) that can affect the effectiveness of practice, and it must be kept constant. If Jack had not been careful, on different days he might have carried out the experiment with the windows sometimes closed and sometimes open. Street noise can greatly affect the effectiveness of practice, so it is better to keep it constant by keeping the windows closed. In the headphone experiment, which lasted more than six months, the researchers were aware of possible changes in temperature and humidity in the weaving workshop. Unfortunately, the experimental conditions did not allow them to exclude these changes. But the experimenters recorded these factors and tried to take their influence into account. And most importantly, alternating the two conditions of the independent variable reduced the influence of these factors. The experimenter should try to identify in advance all possible factors that may change over time and, above all, try to keep them at a constant level on each new trial.


Instability over time. But even if he tries his best, the experimenter will not be able to make one trial exactly like the others (apart from the difference in the levels of the independent variable). There will always be some instability over time. In an experiment, it manifests itself in the variability of side factors, as well as in some variations of the independent variable itself. Finally, there always remain completely unexplained sources of strong fluctuations in the subjects' responses, which increase the scatter of the experimental data. Let's look at specific examples of each of these three forms of instability over time.


Variability of side factors. It often happens that the experimenter knows about the existence of extraneous factors influencing the dependent variable but cannot control them directly. One of the weaver's working days could turn out to be “not the most successful” because she had gone to bed late the night before. Of course, the experimenter could try to persuade her not to do this until the experiment is completed. But the experiment lasted six months! Having dined at a restaurant the night before, Jack felt unwell while practicing one of the pieces; he should be more careful next time.


From trial to trial, environmental conditions never remain the same. Describing the experiment in the weaving workshop, the researchers state:


“It is well known that the productivity of weaving labor can be influenced by atmospheric conditions. Thus, with increasing temperature and relative humidity, the number of thread breaks decreases. On the other hand, a further increase in both, while continuing to have a beneficial effect on the physical properties of the yarn, has an adverse effect on the physiological state of people, whose performance may be reduced so as to nullify any positive effects” (Weston and Adams, 1932, p. 56).


Consequently, even by measuring temperature and humidity, it is impossible to accurately determine their effect on labor productivity. The list of secondary variables could be continued indefinitely, including subjective factors, such as, for example, the good or bad state of health of the subject during the experiment. A conscientious experimenter may detect some of these changes, but cannot avoid them. Now you understand why the experimenter strives to escape from the real world into beautiful soundproof laboratories and deal with such subjects (white rats), whose behavior he can control 24 hours a day. But even there, heaters sometimes get cold, water bottles get clogged, and rats get a runny nose.


The very presence of an experimental situation can cause lasting changes in the behavior of the subject. This was the main conclusion from the famous Hawthorne experiments, a conclusion important for all experimental psychologists. A study was conducted at the Western Electric plant in Hawthorne, Illinois, on the impact of shop floor lighting on the productivity of assembly work. Preliminary attempts to establish any pattern ended in failure. A systematic study of the workers' working conditions was then undertaken (Roethlisberger and Dickson, 1946). A major part of this study involved experimenting with a relay assembly task. It was “an assembly of telephone relays; this is an operation that is usually performed by women: you need to connect approximately 35 small parts in an ‘assembly fixture’ and secure them with four screws” (p. 20).


A special room was equipped for the experiment so that the researchers could control the working conditions and adequately evaluate the performance of the operators. Five young women who had fully mastered this type of work were selected. Two independent variables were examined: the distribution of rest periods, and the length of the working day and working week. Labor was paid according to the total number of relays assembled by the team of five.


It was found that, regardless of the distribution of rest periods and the length of the working day and week, labor productivity continued to grow for two years! The researchers report, first, a “gradual change in social relations in the group of operators in the direction of group cohesion and solidarity and, second, a change in the relationship between the operators and their supervisors. The organizers of the experiment sought to create an atmosphere of mutual support and cooperation among the girls, to relieve them of unnecessary worries and anxieties. These efforts to create the necessary conditions for the experiment indirectly led to changes in relationships between people” (pp. 58-59).


Using our terminology, this situation can be described as follows. Before the experiment, the social working conditions of the subjects were at the same level. In the experimental situation, this “side variable” moved to another level. This led to a long-term change in the dependent variable - labor productivity, despite the fact that objectively the social conditions in the experiment remained unchanged.


Independent variable. We cannot count on each condition of the independent variable remaining completely identical throughout the experiment. On some days or even weeks, the headphones might not be as comfortable to wear as on others. Despite Jack's best efforts, he may have had different attitudes towards, for example, the partial method when learning different pieces. And Yoko was aware of the variations in each of the conditions of her independent variable. The juice of the same type in any two cans is never the same, and the difference is sometimes very large. Some changes will occur even in those experiments in which, it would seem, complete uniformity of conditions has been achieved. The brightness of an electric light (as a stimulus) will change because of voltage drops in the network, and these happen quite often. Gradual natural changes may also occur during the experiment: for example, as a light bulb ages, its light may become less and less bright.


Dependent variable. When exposed to the same independent variable, the subject will not always give the same answer. This will be the case even if the experimenter is unusually skillful and punctual in eliminating instability of confounding factors and the independent variable.


The instability of the dependent variable is illustrated very clearly in the graphs depicting the results of two experiments. Fig. 2.1 shows the weekly output of subject D. in the experiment with headphones. As you can see, she had the fewest missed beats from the tenth to the twelfth week and from the eighteenth to the twenty-second. Her worst results, that is, the greatest number of missed beats, occur in the fourteenth week and at the end of the experiment. What is especially interesting is that the curves for both working conditions rise and fall together. Changes in productivity over time are undoubtedly larger than the differences between working with and without headphones.


Fig. 2.2 shows changes in the subject's responses in an experiment on choice reaction time. Trials were given every six seconds; the subject had to move a handle towards or away from himself and thereby bring two points of light together. Of course, the points were presented in a random order. Over 70 planned consecutive trials, both short-term fluctuations and more regular deviations were observed in the subject's reaction time. The shortest reaction times were observed approximately between the thirtieth and fortieth trials, and the longest between the sixtieth and seventieth. This increase cannot be considered the result of fatigue, because the subject rested just before the fortieth trial. Overall, the longest reaction times slightly exceeded 400 ms and the shortest 200 ms, i.e., the reaction time varied in a ratio of two to one.














Fig. 2.1. Weekly labor productivity of subject D. The x-axis is the sequence of weeks of the experiment. The y-axis is the number of missed beats (average per hour). Dotted line: work without headphones; solid line: work with headphones.



Fig. 2.2. Choice reaction time for 70 consecutive trials. The x-axis is the trial number (the dotted line marks the rest period). The y-axis is the reaction time (in ms). The dotted line is a movement of the handle towards oneself, the solid line away from oneself; responses with errors are marked with triangles.



Thus, in the study of reaction time, minute-by-minute and even second-by-second changes were found. They are not associated with fatigue but can rather be explained by fluctuations in attention. The graph of weaver D.'s results shows significant fluctuations in her labor productivity. Moreover, the ups and downs of the curves appear to be independent of temperature and humidity. Admittedly, the increase in the number of missed beats towards the end of the experiment can be explained by the use of artificial (gas) lighting, which was necessary because the experiment ended in the fall.


Even when the subjects' responses themselves are constant, variation can be introduced by the procedure for measuring them. The counter records each movement of the shuttle making a new beat. However, instruments are not always in good working order. And if measurements depend on subjective judgments, they will certainly be less stable. Jack considered a piece completely memorized after two flawless performances from memory. However, his performances contained quite a few small slips that were almost mistakes. Sometimes Jack might count them as mistakes, and sometimes not, owing to entirely natural fluctuations in his subjective state. Changes in the evaluation of performances could also be systematic. For example, as the experiment progressed, Jack could become increasingly strict about his mistakes.

Differences in experimental tasks

The same piece cannot be learned by two different methods at the same time, as the ideal would require. But even if the methods follow one another, they still cannot be applied to the same piece. If a piece is memorized, it is memorized. There are experiments in which it is necessary not only to present different experimental conditions at different times, but also to change the difficulty of the tasks. This is a very significant departure from an ideal experiment. How can Jack be sure that the pieces he chooses are equally difficult? Yet in any experiment on learning that uses the same subjects, the tasks for different conditions of the independent variable will necessarily be different.

Sequence Effects

In an unsuccessful version of his experiment, Jack first learned two pieces using the partial method, and then two others using the whole method. We already know that the quality of his playing can be affected by any factors that change over time, including those just described. However, there are other influences associated with the position of each condition of the independent variable in the sequence of their presentation. The influences of one condition on those that follow it are called sequence effects, order effects, or carryover effects. They can be positive or negative, general or specific. Using the partial method could have a positive effect on Jack's later learning with the whole method through added practice or through getting used to the experimental routine. It could also have a negative effect: the habit of memorizing pieces in short passages could interfere with memorizing large parts, or Jack could simply be tired from studying.
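
How a factor that changes over time can pass for an effect of the method can be shown with a small numerical sketch. It assumes, purely for illustration, that the two methods are in fact equally effective and that learning time falls by a fixed amount with each successive piece (a linear practice trend). The function name, the numbers, and the ABBA order used for comparison are hypothetical and are not data from the experiments described.

```python
def mean_difference(order, base=60.0, drift=-5.0):
    """Return mean(score under B) - mean(score under A) for a given order.

    Hypothetical model: the conditions have no true effect at all, and the
    score at position k is base + drift * k (a linear time trend only).
    """
    scores = {"A": [], "B": []}
    for position, condition in enumerate(order):
        scores[condition].append(base + drift * position)
    mean_a = sum(scores["A"]) / len(scores["A"])
    mean_b = sum(scores["B"]) / len(scores["B"])
    return mean_b - mean_a


# A = partial method, B = whole method (labels are arbitrary).
blocked = mean_difference("AABB")   # partial, partial, whole, whole
balanced = mean_difference("ABBA")  # partial, whole, whole, partial

print("AABB difference:", blocked)
print("ABBA difference:", balanced)
```

Under these assumptions the blocked order shows a difference produced entirely by the trend, while the ABBA order shows none. The cancellation holds only for a linear trend; a nonlinear trend, or a carryover effect specific to one condition, would not cancel this way.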

Experimenter's biases

When the automobile first appeared, there was a joke in the form of a riddle. Question: What is the most important cog in a car? Answer: The one who holds the steering wheel. We can ask in the same spirit. Question: Which of the factors threatening the validity of an experiment is the most dangerous? Answer: The experimenter.


If the researcher has any expectations about the results of the experiment, especially a preference for one of the conditions of the independent variable, these expectations will somehow manifest themselves during the experiment. Yoko knew well that the main thing was to create a random sequence of both types of juice. She wanted to eliminate any hint of which variety she was evaluating each morning. But Jack did not show due care. First, he selected pairs of pieces that seemed to him equal in difficulty (in order to learn the two pieces of each pair by different methods), and then he himself arranged them in a certain sequence. But if in doing so he expected the partial method to be more efficient, he could unwittingly have assigned the more difficult piece of each pair to the whole method.


In addition, subjective assessments of the quality of performance could fluctuate not only randomly (as shown above). Jack may have unwittingly favored one of the methods. Therefore, when assessing the performance of both pieces of each pair, Jack should guard against favoring the partial method and should try just as hard to achieve the best results with the whole method.


In the experiment with headphones, the researchers naturally expected headphones to increase productivity and may well have conveyed their confidence to the participants. Therefore, perhaps the weavers (on average) tried to work better with headphones.


One of the most insidious consequences of experimenter bias is the reluctance to take into account some experimental data as allegedly obtained under atypical conditions, for example, during strong street noise. Unfortunately, the experimenter's opinion about the atypicality of the conditions is often very subjective. Hence, the same noise level will be considered atypical in one state of the independent variable, but quite normal in another.


Even the accuracy of data recording may depend on the experimenter's bias. It has been shown, for example, that the records of experiments on extrasensory perception contain errors in favor of the existence of the phenomenon when the person keeping the record believes in it. Those who do not believe in extrasensory perception do not make such distortions (Kennedy, 1939). A thorough analysis of this problem as a whole is presented in the book Experimenter Influences in Psychological Research (Rosenthal, 1976).

We now have examples of successfully and unsuccessfully designed experiments. Is it possible to further improve a well-designed experiment? And is it possible to make an experiment absolutely flawless? The answer is: any experiment can be improved indefinitely, or - which is the same thing - a perfect experiment cannot be carried out. Real experiments improve as they get closer to perfection.

The perfect experiment

Impeccability is best defined in terms of the concept of an ideal experiment (Keppel, 1973, p. 23). In an ideal experiment, only the independent variable (and, of course, the dependent variable, which takes on different values under different conditions) is allowed to change. Everything else remains the same, so the dependent variable is affected only by the independent variable. This is certainly not the case in our three well-designed experiments. The weavers wore headphones and worked without them at different times, in even or odd weeks. The pieces that Jack learned using the whole and partial methods were also different. Yoko never drank both types of tomato juice on the same day. In each case, something else changed in addition to the independent variable. In subsequent chapters, we will cover a different type of experiment in which different subjects are used for each condition of the independent variable, which eliminates time variations (like even and odd weeks) and task differences (like the memorized pieces). But such experiments also do not meet all the requirements of an ideal experiment, because the subjects will also be different. As you will soon see, a perfect experiment is impossible. However, the idea itself is useful, and it is what guides us when improving real experiments.

In an ideal (impossible) experiment, the weaver would work with and without headphones at the same time! Jack Mozart would simultaneously learn the same piece using the whole and partial methods. In both of these cases, the difference in the values of the dependent variable would be due only to the independent variable, that is, to the difference between its conditions. In other words, all incidental circumstances, all other potential variables, would remain at the same unchanged level.

    Chapter 2. BASICS OF EXPERIMENTAL PLANNING

    If you want to test experimentally whether music programs on the radio help you learn French words, you can easily do so by repeating one of the experiments described in the previous chapter. You will most likely model your experiment on Jack Mozart's. You will determine both conditions of the independent variable in advance, study at the same time of day, and document each step of the experiment. Instead of four piano pieces, you could learn four lists of words in the following order: with the radio, without the radio, without the radio, with the radio. In other words, you can apply the same experimental design that Jack did.

    It is quite possible that you will understand some of the reasons for your own actions. But something will certainly remain unclear, above all the sequence of conditions of the independent variable, that is, the experimental design itself. This is not your fault, because you have not yet studied experimental designs. This chapter will fill that gap. Of course, you can conduct an experiment by simply imitating a model, but it is much better to understand what you are doing. No two experiments are identical, and blindly copying an experimental design often leads to difficulties. For example, Yoko could have used regular alternation between the two conditions (types of juice) in her experiment, as was done in the experiment with the weavers (using or not using headphones). But then she would probably have known the name of the juice being tested, and that is exactly what she was trying to avoid by using a random sequence. Moreover, if you do not know the basis of the various plans and designs, it will be difficult for you to evaluate the quality of the experiments you read about. And, as you remember, teaching you this is one of the main goals of our book.


    In this chapter we will compare the plans on which the experiments in Chapter 1 were built with less successful plans for conducting the same experiments. The model for their comparison will be a “flawless” experiment (which is practically impossible). Analyzing it will allow us to consider the basic ideas that guide us when creating and evaluating experiments. In the process, we will introduce several new terms into our vocabulary. As a result, we will determine what is flawless and what is not in the three experimental designs that were used in Chapter 1. These designs represent three ways of ordering, or three types of sequences for presenting the different conditions of the independent variable, used in a single-subject experiment.
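
The three types of sequence can be written out as a short sketch. The function names and the labels "A" and "B" for the two conditions are hypothetical, and which condition comes first is arbitrary. The sketch also assumes that the random sequence contains equal numbers of each condition, which the text does not state.

```python
import random


def regular_alternation(n_trials):
    """A, B, A, B, ... as with the weavers' alternating weeks."""
    return ["A" if i % 2 == 0 else "B" for i in range(n_trials)]


def random_sequence(n_per_condition, seed=None):
    """Shuffled order of the two conditions, as in the juice experiment.

    Assumption: each condition appears exactly n_per_condition times.
    """
    rng = random.Random(seed)
    sequence = ["A", "B"] * n_per_condition
    rng.shuffle(sequence)
    return sequence


def abba_sequence(n_blocks=1):
    """A, B, B, A repeated, as in the order used for the four piano pieces."""
    return ["A", "B", "B", "A"] * n_blocks


print(regular_alternation(6))
print(random_sequence(3, seed=1))
print(abba_sequence())
```

With regular alternation and ABBA, the subject can always tell which condition comes next. Only the shuffled order hides it, which is why it suits an experiment where knowing the condition could bias the judgment.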



    After studying the material in this chapter, you will be able to plan your own experiment competently without imitating someone else's. At the end of the chapter you will be asked questions on the following topics:

    1. The degree of approximation of a real experiment to an ideal one.

    2. Factors that violate the internal validity of the experiment.

    3. Systematic and non-systematic sources of violation of internal validity.

    4. Methods for increasing internal validity, methods of primary control and experimental designs.

    5. Some new terms from the experimenter’s vocabulary.

    JUST PLANS AND MORE SUCCESSFUL PLANS

    Undoubtedly, the first condition for conducting an experiment is its organization, the presence of a plan. But not every plan can be considered successful. Let us assume that the experiments described in Chapter 1 were carried out differently, according to the following plans.


    1. In the first experiment, let the weaver initially wear headphones for 13 weeks, and then work without them for 13 weeks.

    2. Suppose Yoko decided to use only two cans of each type of juice in her experiment, and the entire experiment took four days instead of 36.

    3. Jack decided to apply the partial method of memorization to the first two pieces, and the whole method to the next two.

    4. Or, keeping the same sequence of methods, Jack chose short waltzes for the experiment, rather than the longer pieces that he usually learned.

    We feel quite clearly that, compared with the experiments described earlier, all these plans are unsuccessful. If we had a model for comparison, we could say with complete certainty why the original plans were better. The "flawless" experiment serves as such a model. In the next section, we will discuss it in detail and then see how it is used to evaluate our experiments.

    A FLAWLESS EXPERIMENT

    We now have examples of successfully and unsuccessfully designed experiments. Is it possible to further improve a well-designed experiment? And is it possible to make an experiment absolutely flawless? The answer is: any experiment can be improved indefinitely, or, which is the same thing, a perfect experiment cannot be carried out. Real experiments improve as they get closer to perfection.

    If you want to test a hypothesis experimentally, you can run the experiment by simply imitating a model, but it is much better to understand what you are doing. No two experiments are identical, and blindly copying an experimental design often leads to difficulties.

    Undoubtedly, the first condition for conducting an experiment is its organization, the presence of a plan. But not every plan can be considered successful. Clearly, some plans are more successful than others, and some are completely unsuccessful. When deciding to conduct an experiment, we encounter the concept of experimental design. The experimental designs of the first sample studies represent three methods of ordering, or three types of sequences for presenting the different conditions of the independent variable, used in an experiment with one subject. The model for comparing them will be a “flawless” experiment, used as a reference (it is practically unfeasible).

    3.1. The concept of a “flawless” experiment

    Any experiment can be improved indefinitely, but a perfect experiment cannot be conducted. Real experiments improve as they approach a flawless experiment, which can take three forms: an ideal experiment, an endless experiment, and a full compliance experiment.

    The perfect experiment

    In an ideal experiment, only the independent variable (and, of course, the dependent variable, which takes on different values under different conditions) is allowed to change. Everything else remains the same, so the dependent variable is affected only by the independent variable. A perfect experiment is impossible. However, the idea itself is useful; it is what guides the improvement of real experiments.

    For example, in an ideal (impossible) experiment, the weaver would work with and without headphones at the same time! In this case, the difference in the values of the dependent variable would be due only to the independent variable, that is, to the difference between its conditions. In other words, all secondary circumstances, all other potential variables, would remain at the same constant level.

    Endless experiment

    In order to average out not only the variability of each condition of the independent variable but also possible fluctuations in the state of the subject himself, the experiment would have to continue ad infinitum. This is an endless experiment. It is not only impossible but also meaningless. After all, the whole point of an experiment is to draw conclusions with broader application from a limited amount of data. Nevertheless, this experiment also serves as a guiding idea.
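
Why a longer experiment comes closer to this guiding idea can be sketched with a simulation. It is an illustrative model only: it assumes that each trial's score is a fixed condition mean plus independent Gaussian noise, and all names and numbers (a 300 ms baseline, a 20 ms true difference, a 50 ms standard deviation) are invented for the example.

```python
import random
import statistics


def estimated_difference(n_trials, rng, true_difference=20.0, noise_sd=50.0):
    """Mean of B minus mean of A from n_trials noisy trials per condition.

    Hypothetical model: score = condition mean + independent Gaussian noise.
    """
    a = [rng.gauss(300.0, noise_sd) for _ in range(n_trials)]
    b = [rng.gauss(300.0 + true_difference, noise_sd) for _ in range(n_trials)]
    return statistics.fmean(b) - statistics.fmean(a)


def spread_of_estimates(n_trials, repetitions=200, seed=0):
    """Standard deviation of the estimate over repeated simulated experiments."""
    rng = random.Random(seed)
    estimates = [estimated_difference(n_trials, rng) for _ in range(repetitions)]
    return statistics.stdev(estimates)


short_spread = spread_of_estimates(10)    # a short experiment
long_spread = spread_of_estimates(1000)   # a much longer one

print("spread with 10 trials per condition:  ", round(short_spread, 1))
print("spread with 1000 trials per condition:", round(long_spread, 1))
```

In this model the estimate from the longer experiment varies far less from one run to the next. The spread never reaches zero for any finite number of trials, which is the sense in which the endless experiment is a limit rather than an attainable design.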

    An endless experiment has disadvantages. The very fact that subjects are presented with one of the experimental conditions may affect (during the study period) their performance under another condition. Therefore, neither ideal nor endless experiments are completely flawless. Fortunately, they not only have different disadvantages, but also different advantages and can serve to evaluate real experiments that are very far from a perfect experiment.

    Full Compliance Experiment

    In the unsuccessful version of the study, Jack Mozart learned waltzes instead of sonatas. Eliminating this kind of shortcoming would require a full compliance experiment. This experiment is also meaningless, because Jack would have to memorize in it the same pieces that he will go on to learn after it. But once the pieces have been learned, they cannot be learned again after the experiment ends.

    All three types of (almost) flawless experiment are unrealistic. They are useful as “thought” experiments. They tell you what to do to create an effective experiment. The ideal and endless experiments show how to avoid extraneous influences and thereby gain greater confidence that the experimental results truly reflect the relationship between the independent and dependent variables. The full compliance experiment reminds us of the need to control the other important variables of the experiment, which we keep constant.