As indicated earlier, the basic analysis consists of factoring the intercorrelations between all the scales, selecting the highest loading GD scales that are equivalent to the SD E-P-A scales, and intercorrelating the resultant composite scores for the 50 concepts as the test for equivalence between the verbal and visual forms of the instrument. The results are presented in Tables 3 and 4.
Cross-cultural Factor Analysis. The 64 GD scales and the 12 SD scales for the five sites combined generate a triangular matrix of correlations among 380 variables (76 x 5). Each correlation is based on an N of 50 (concepts) after the individual subject data are collapsed over the group of 20-25 subjects (for additional details see Osgood, Suci, and Tannenbaum, 1957; Osgood, 1964; Miron and Osgood, 1966; James, 1966). Table 3 gives the first three factors of the principal axes solution for the 380-variable correlation matrix. These three factors account for 54.8% of the total variance (36.3%, 11.3%, and 7.2%, respectively). Two additional smaller factors (4.2% and 2.7%) were also extracted, but are not presented here. The percentage of total variance accounted for by the first three factors, as well as their respective magnitudes, is comparable to the values obtained previously with the verbal scales of the pan-cultural SD (see James, 1966, Table 4).
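As a minimal sketch of how a percentage of total variance follows from a principal axes solution, the example below extracts the largest axis of a small correlation matrix by power iteration. It assumes that the factoring amounts to an eigen-decomposition of the correlation matrix with unities in the diagonal, which the text does not specify; the 3 x 3 matrix and the function names are hypothetical and are not taken from the study.

```python
import math

def first_principal_axis(R, iterations=200):
    """Largest principal axis of a correlation matrix R (list of lists).

    Uses power iteration; returns (eigenvalue, loadings).
    Assumption: unities in the diagonal, i.e. no communality estimates.
    """
    n = len(R)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue.
    Rv = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * Rv[i] for i in range(n))
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [math.sqrt(eigenvalue) * x for x in v]
    return eigenvalue, loadings

def percent_variance(eigenvalue, n_variables):
    """Share of total variance; the trace of a correlation matrix equals n."""
    return 100.0 * eigenvalue / n_variables

# Hypothetical correlations among three scales (illustrative values only).
R = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
eigenvalue, loadings = first_principal_axis(R)
pct = percent_variance(eigenvalue, len(R))
```

With 380 variables the same ratio applies: a factor's percentage is its eigenvalue divided by 380, so the three reported percentages simply add up to the 54.8% total.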
Table 3 groups the scales by language/culture and lists, for each of the three factors, the highest loading GD scales above the dotted line; immediately below the dotted line are given the four SD scales previously identified as representatives of the E factor (I), the P factor (II), and the A factor (III). The numbers appearing under the "Scale I.D." column identify the particular scales: 1-64 for the GD (refer to Fig. 1) and 65-76 for the SD (refer to Table 2). The positive and negative values of the loadings reflect the arbitrary (random) left-right orientation of the scales in the test booklets: a positive value goes with the left-hand side of both the SD and GD scales (as given in Fig. 1 and Table 2) while a negative value connects the right-hand side of the GD scale with the left-hand side of the SD scale (or vice versa) as given in Fig. 1 and Table 2. Factor I ("I Scales") is equally defined by both the GD scales listed and the E scales of the SD in each of the five locations. The loadings range from the low 70's to the low 90's with the Delhi-Hindi scales contributing slightly less to the definition of the factor. The high loadings of all 20 verbal E scales of the SD and the purity of the GD scales allow for a clear interpretation of this factor as a cross-cultural Evaluation factor for combined GD and SD scales.
Factor II ("II Scales") is much less clearly defined: the scales that load highest on it are impure, and it is defined unequally by the various cultures. For three of the cultures--Finnish, German, Japanese--both the GD and the SD P scales have sufficiently high pure loadings to allow for the factor's interpretation as a cross-cultural Potency factor for combined GD and SD scales. For American English, only a few GD scales contribute to the definition of this factor, while the SD P scales contribute almost nothing to it. For Hindi, one of the SD P scales has a substantial loading, but no other scale, either SD or GD, appears to contribute substantially to this factor.
The pattern for Factor III ("III Scales") is similar to that of Factor II. For Finnish, German, and Japanese, the loadings of most of the SD A scales and several of the GD scales are sufficiently high and relatively pure to allow for the interpretation of the factor as a cross-cultural Activity factor for combined GD and SD scales. For English, two of the SD A scales and several of the GD scales make substantial contributions to this factor's definition. For Hindi, neither the GD scales nor the SD A scales appear to contribute substantially to this third factor.
The pattern of results considered so far suggests that in our attempt to find a graphic equivalent for the verbal SD we will have greater success with the E factor than with the P and A factors. In connection with the latter, there appear to be cultural differences in the affect assigned to the GD scales that contribute to the definition of factors II and III. In particular, American-English tends to assign part of the variance of the verbal P scales to factor I, and a lesser part to factor III. Delhi-Hindi splits the variance contribution of both its P and A scales between the first and second factors. Finland-Finnish consistently assigns most of the variance of the verbal P scales to factor II, but distributes the variance of its A scales unequally over all three factors. Germany-German and Japan-Japanese exhibit more consistent patterns congruent with a P and A interpretation of factors II and III respectively, but not in every case.
The next step in the analysis consisted of selecting four particular GD scales for each of the three factors. The following criteria influenced these selections: each scale should be as "pure" as possible, with high loadings on the factor it represents and minimum loadings on the other two factors; it should maintain this pattern for all five cultures; within these two restrictions, it should have higher loadings for the factor it represents than any of the other remaining scales. Table 4 identifies the GD scales thus chosen and gives their correlations with the SD scales for each of the three factors and in each location. As can be seen, the four cross-cultural GD scales selected to represent the E factor correlate quite well with the E scales of the SD in every location. The cross-cultural GD scales selected to represent the P and A factors correlate quite inconsistently with the SD scales; the majority of the values of the coefficients are not significant (with N = 50, the value of r required at p < .05 is .28).
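The significance threshold quoted above can be checked with a short calculation. The sketch below uses the Fisher z approximation and assumes a two-tailed test, which the text does not state; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def critical_r(n, alpha=0.05):
    """Approximate two-tailed critical |r| for a sample of n pairs.

    Uses the Fisher z transformation: z(r) is roughly normal with
    standard error 1 / sqrt(n - 3). Two-tailed testing is an assumption.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = .05
    return math.tanh(z_crit / math.sqrt(n - 3))

r_50 = critical_r(50)  # threshold for coefficients based on all 50 concepts
```

Rounded to two decimals, `critical_r(50)` agrees with the .28 given in the text. Under the same assumptions, a coefficient based on only 23 concepts would need to reach roughly .41.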
Table 5 presents the intercorrelations between the composite scores for the 50 concepts computed by taking means over the four scales representing each factor using the SD and GD scales. The left-hand side of the table, under the column "(N = 50)", lists the correlation coefficients. The diagonal entries provide the estimate of the equivalence between the two instruments. The pattern previously noted in connection with the factor analysis results is obvious once again: the E ratings for concepts as indexed by the cross-cultural GD are good estimates of the E ratings given by means of the verbal SD scales; the P and A ratings for the verbal SD are not well estimated by the selected GD scales, and the degree of correspondence varies from culture to culture: the case for Finnish is better than for German and Japanese, and these in turn are more consistent than for English and Hindi.
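The composite-score comparison described above can be sketched as follows. The data, scale numbers, and helper names are hypothetical; the sign alignment of reversed scales is an assumption about how the arbitrary left-right orientation would be handled before averaging.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def composite(ratings, scale_signs):
    """Mean over the selected scales for each concept.

    ratings: scale id -> list of concept means on that scale.
    scale_signs: scale id -> +1 or -1 (-1 flips a reversed scale; assumed step).
    """
    ids = list(scale_signs)
    n_concepts = len(ratings[ids[0]])
    return [sum(scale_signs[s] * ratings[s][c] for s in ids) / len(ids)
            for c in range(n_concepts)]

# Hypothetical data: four concepts, invented scale ids and values.
gd_ratings = {10: [1.0, 0.5, -0.5, -1.0],
              11: [-1.0, -0.5, 0.5, 1.0],   # oriented in the opposite direction
              20: [0.5, -0.5, 0.5, -0.5]}
sd_composites = {"E": [2.0, 1.0, -1.0, -2.0],
                 "P": [1.0, -1.0, 1.0, -1.0]}
gd_composites = {"E": composite(gd_ratings, {10: +1, 11: -1}),
                 "P": composite(gd_ratings, {20: +1})}

# Rows are SD factors, columns are GD factors; matching pairs form the diagonal.
table = {(f_sd, f_gd): pearson_r(sd_composites[f_sd], gd_composites[f_gd])
         for f_sd in sd_composites for f_gd in gd_composites}
```

In this layout the entries `table[("E", "E")]` and `table[("P", "P")]` play the role of the diagonal entries, while the off-diagonal entries show how far a GD composite also tracks a different SD factor.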
Table 6 presents the composite means for 13 of the 50 concepts on the three factors computed on both instruments. These concepts were selected because their affect is intuitively clear and compelling, and they thus serve as a face validity check on the two instruments. The scores vary from "3.0" to "-3.0". For example, SUCCESS should be rated as highly positive on the E factor, DEFEAT as highly negative. Similarly, TRUTH, FREEDOM, POLICEMAN should receive high P scores; DEFEAT, SMOKE should be low on P. WIND, LAUGHTER, PROGRESS should be high on A, while DEFEAT, SNAKE, CLOUD should be low on A. In general, the SD ratings corroborate these expectations for all five cultures. Looking now at the GD ratings, one finds a surprising degree of face validity, more so than the size of the correlations in Table 5 would have led one to believe. Thus, SUCCESS, PROGRESS, FREEDOM, TRUTH receive high positive scores on factor I, and high negative scores obtain for ANGER, CRIME, DEFEAT. Similarly, DEFEAT, CLOUD, SMOKE receive low scores on factor II; POLICEMAN gets high scores on factor II; WIND, PROGRESS, LAUGHTER are relatively high on the third factor, while DEFEAT is high negative on this factor.
In general, the polarity of ratings on the GD scales is lower than on the SD scales, suggesting that its "affective space" tends to be more "packed" around the origin and, therefore, less differentiated. This is particularly true of the second and third factors. There are, however, certain interesting exceptions, where the GD mean is more extreme. Inspection of Table 6 reveals that for the concepts CLOUD, SMOKE and SNAKE the GD mean on factor II is not only more extreme than the corresponding mean on the SD but is also opposite in sign in all but two cases. In every case, the means for these three concepts have a minus sign on GD factor II. Inspection of the four GD scales involved (see nos. 1, 15, 17, 46 in Fig. 1) reveals that each of the four graphic pairs is characterized by a "roundness-angularity" opposition (with the "-" sign being aligned with the "roundness" alternative). Apparently, the "roundness" feature of the pictograph influences the ratings of the concepts CLOUD, SMOKE, SNAKE, which of course refer to things in the world that are round. In other words, we find evidence here for what has previously been called "denotative contamination" in connection with the SD (e.g. see Osgood, 1962; Miron and Osgood, 1966). For example, the scale hot-cold is used affectively (metaphorically) with the concept JAZZ, but non-affectively (literally, denotatively) with the concept SUN or with FIRE. Similarly with hard-soft for the concepts LOVE or RESEARCH versus DIAMOND or PEACH.
In order to examine the extent of visual denotative contamination of the GD scales relative to the SD, I divided the 50 concepts into two sets, one "concrete" and the other "abstract." I reasoned that the abstract concepts would be less liable to denotative contamination on the GD scales, hence their means should be closer approximations to the corresponding SD means than what would obtain with the concrete concepts. The right hand side of Table 5 presents this analysis under the columns marked "(N = 23A)" for the abstract concepts and "(N = 23C)" for the concrete. (Four concepts were ambiguous with respect to the abstract-concrete distinction and were not included in the comparison.)
The comparison very clearly supports the expectation: in almost every case the coefficients for the 23 abstract concepts are higher than those for the 23 concrete concepts (median absolute r: .53 vs. .27). Looking at the diagonals, which is a more directly relevant comparison for the hypothesis, the same kind of difference emerges (median absolute r: .76 vs. .48). There are some dramatic differences in the relation between the A factor of the SD and GD factor III: .59 vs. -.18 for AE, .78 vs. .64 for FF, .59 vs. .17 for GG, and .67 vs. .08 for JP. Similar, if less dramatic, differences obtain for the E and P factors. There are, however, some reversals where better correspondence is exhibited by the concrete concepts: .52 vs. .88 for the E factor with AE and .35 vs. .26 for the P factor with JP. There is also a curious inversion for the P factor with AE where the abstract concepts exhibit a slight negative correlation (-.20) between the two instruments. It appears, then, that although there is a strong suggestion for the presence of denotative contamination with the GD scales, the present data are not sufficiently consistent to be quite conclusive.
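The "median absolute r" summary used in this comparison can be illustrated with the four A-factor pairs quoted above. This is only a sketch of the summary statistic: it uses the coefficients cited in the text for AE, FF, GG, and JP, not the full set of entries in Table 5, so its medians are not expected to match the medians reported for the whole table. The variable and function names are hypothetical.

```python
from statistics import median

def median_abs_r(coefficients):
    """Median of the absolute values of a set of correlation coefficients."""
    return median(abs(r) for r in coefficients)

# A-factor diagonal coefficients quoted in the text: (abstract, concrete).
a_factor = {"AE": (0.59, -0.18),
            "FF": (0.78, 0.64),
            "GG": (0.59, 0.17),
            "JP": (0.67, 0.08)}

abstract = [pair[0] for pair in a_factor.values()]
concrete = [pair[1] for pair in a_factor.values()]

median_abstract = median_abs_r(abstract)
median_concrete = median_abs_r(concrete)

# Number of locations in which the abstract coefficient is the larger one.
abstract_wins = sum(1 for a, c in a_factor.values() if abs(a) > abs(c))
```

Taking absolute values before the median keeps a negative coefficient such as -.18 from being treated as weaker than a coefficient near zero; whether the original medians handled signs in exactly this way is an assumption.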
There is a final question that needs to be examined before going on to the discussion of the results of this study. We have noted the fact that there seem to be variations between the cultures in terms of the adequacy of the selected GD scales as estimates of the SD. The question arises whether the equivalence between the two instruments can be improved by selecting GD scales separately and independently for each culture. Four such "indigenous" GD scales were selected for each location by inspecting the correlation matrix for the data in each language/culture and identifying the GD scales which correlated most highly and purely with the four SD scales that represented each of the three factors. This search yielded four GD scales for each factor in the five cultures, as presented in Table 7. As can be seen, the best indigenous scales overlap in only a small number of cases with the cross-cultural GD scales (see Table 4). Furthermore, the same scale may tap different factors in different cultures (e.g. scale 1 is A for DH but P for GG and JP).
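One way to make the "most highly and purely" criterion concrete is sketched below. The scoring rule (the absolute correlation with the target factor minus the largest absolute correlation with any other factor) is an assumption introduced for illustration; the text describes the selection as an inspection of the correlation matrix and gives no formula. All scale numbers and coefficients are hypothetical.

```python
def select_scales(correlations, target, k=4):
    """Pick the k GD scales that relate most highly and purely to one factor.

    correlations: GD scale id -> {factor: r with that factor's SD scales}.
    Assumed purity score: |r with target| minus the largest |r| elsewhere.
    """
    def purity_score(rs):
        others = max(abs(r) for factor, r in rs.items() if factor != target)
        return abs(rs[target]) - others

    ranked = sorted(correlations,
                    key=lambda scale: purity_score(correlations[scale]),
                    reverse=True)
    return ranked[:k]

# Hypothetical correlations for four GD scales in one language/culture.
correlations = {
    1: {"E": 0.80, "P": 0.10, "A": 0.05},   # high and pure on E
    2: {"E": 0.85, "P": 0.60, "A": 0.10},   # high on E but impure
    3: {"E": -0.75, "P": 0.05, "A": 0.10},  # pure on E, reversed orientation
    4: {"E": 0.20, "P": 0.70, "A": 0.10},   # belongs to P
}

best_e = select_scales(correlations, "E", k=2)
best_p = select_scales(correlations, "P", k=1)
```

Running the selection separately on each location's matrix would allow the same scale to be picked for different factors in different cultures, which is the pattern noted for scale 1.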
The intercorrelations of composite means for the 50 concepts rated on these selected indigenous GD scales with their respective SD scales are presented in Table 8. Comparing the diagonals in this table with the corresponding figures in Table 5, it is immediately apparent that the indigenous GD scales are a much better estimate of the SD scales than the cross-cultural GD scales (median: .67 vs. .51) with some truly dramatic differences in evidence (e.g. P for AE 3 vs. .05; A for DH: .59 vs. -.21).
Looking at the differences in intercorrelations between the two instruments for the abstract versus the concrete concepts, the previously noted evidence for visual denotative contamination is again corroborated here, even more strongly than before: in 12 of the 15 diagonal comparisons, the coefficients for the abstract concepts are larger (in several cases, quite substantially).
It should be noted that the procedures used in this study are likely to underestimate the degree of equivalence of the two instruments. It will be recalled that there were five subgroups of subjects, so that each subject rated only 10 concepts. In addition, the ratings were spread over two different sessions. It would be expected that under more suitable conditions (e.g. every subject rating every concept on the selected and reduced number of GD scales), the intercorrelations between the ratings with the two instruments would yield even higher coefficients than those in Tables 5 and 8.