Creating a psychological tool applicable to individuals around the world (STRAEQ-2 item generation and selection phase).

In one of my PhD projects, I – Olivier – am developing and validating a scale (the Social Thermoregulation, Risk Avoidance, and Eating Questionnaire – 2, or the STRAEQ-2). In the first phase of this project, we asked people from different countries to generate items, so that the scale represents behaviors from people across the globe rather than mostly from the EU/US. This item-generation stage involved 152 authors from 115 universities. In this blog post we present this first phase and share the code we used for it.

Is attachment for coping with environmental threats?

Based on the premise that environmental threats shape personality (Buss, 2010; Wei et al., 2017), the goal of this project is to measure individual differences in the way people cope with their environment and to discover whether these differences are linked to attachment. Attachment theory postulates that people seek proximity to reliable others to meet their needs (Bowlby, 1969), and the psychological literature suggests that distributing the load of environmental threats across others is metabolically more efficient (Beckes & Coan, 2011). In a previous project (the STRAQ-1), Vergara et al. (2019) reported findings consistent with this idea: individual differences in the way people cope with environmental threats (temperature regulation and risk avoidance) were linked to individual differences in attachment. The reliability of the scale created and validated in that project was, however, somewhat inconsistent across the countries in which data were collected. We therefore decided to extend the STRAQ-1 findings and make the scale more reliable across countries, now focusing on three environmental threats:

  1. Fluctuations in temperature, which generate a thermoregulation need (one’s need to keep internal temperature within a comfortable range in order to survive), 
  2. Physical threats, which induce risk avoidance (one’s need to avoid predators or people who intend harm in order to survive), 
  3. Lack of food, which requires food intake (one’s need to prevent starvation).

We are also dividing each dimension into 4 subdimensions to make the scale more consistent with the attachment literature: sensitivity to the need, solitary regulation of the need, social regulation of the need, and confidence that others will help to cope with the need.

What are the behaviors that account for coping with environmental threats?

Our first step was to generate the items, and we wanted to avoid the mistakes of the past. Do people in Peru, China, Nigeria, or Sweden deal with temperature in the same way? Probably not. So, to obtain scale items that reflect a diverse range of coping behaviors, we asked our collaborators: we designed a Qualtrics survey in which they read a description of each construct, saw example items, and then generated items of their own.

In total, 737 items were generated by 53 laboratories from 32 countries. To automate the procedure, and to avoid copy-and-paste errors, we wrote an R script that imports all the generated items from Qualtrics and writes them into a text document that was then moved into a Google Doc. For each subdimension, the text document contains the subscale name followed by every text entry (item), together with the name of the country that generated it. Here is the piece of code that we used:

# this code selects the temperature sensitivity subscale (plus the respective countries) from the data frame 'items', which is the Qualtrics file containing all the generated items, and writes the items to a .txt document called 'Items.txt':

library(tidyverse)
items %>%
  select(thermo_sens, country) %>%
  write_csv("Items.txt", col_names = TRUE, append = TRUE)

# we repeat this for all the subscales, appending them to the same file (Items.txt). For example, here we add the solitary thermoregulation subscale:

items %>%
  select(thermo_soli, country) %>%
  write_csv("Items.txt", col_names = TRUE, append = TRUE)

The lead team then imported the list of items (.txt) into an easily shareable document that tracks changes (a Google Doc) to correct misspellings, reformulate items, and remove duplicates (all modifications are available here, and a clean version is here).
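For completeness, the upload to Google Drive itself can also be scripted. Here is a sketch using the googledrive package (we did this step by hand; the file and document names below are just examples):

library(googledrive)

# authenticate with the lead team's Google account (opens a browser the first time)
drive_auth()

# upload Items.txt and convert it to a Google Doc, so changes can be tracked and shared
drive_upload(media = "Items.txt",
             name  = "STRAEQ-2 generated items",
             type  = "document")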

How to select relevant and diverse behaviors (items) from a large list?

Because it would be too fatiguing for our participants to answer all 737 generated items in one sitting (on top of the other questionnaires we offer them), we reduced the number of items to be included in the main survey. We created a diverse advisory committee (9 researchers from Chile, Brazil, Morocco, Nigeria, China, the Netherlands, and France); via an online survey, this advisory committee rated the degree to which they thought each item was representative of its construct.
Then, we computed the mean and standard deviation of the ratings for each item. We selected the 10 items with the highest means and lowest standard deviations per subscale, and we replaced closely related items (~5 per subscale) to include a wider range of behaviors in the scale. To facilitate the procedure, we created dynamic tables per subscale in an R Markdown document. These tables allow you to sort by score (mean or sd), filter by country, or search for specific words in the items. Here is the code that we used to generate the tables:

# from a data frame called 'df' that contains all the items and all the subscales, we compute the mean and standard deviation of the expert ratings: 

# rowSds() comes from the matrixStats package; the nine rating columns are assumed to be named expert_1 to expert_9
library(matrixStats)

df <- df %>%
  mutate(mean = rowMeans(cbind(expert_1, expert_2, expert_3, expert_4, expert_5,
                               expert_6, expert_7, expert_8, expert_9), na.rm = TRUE),
         sd   = rowSds(cbind(expert_1, expert_2, expert_3, expert_4, expert_5,
                             expert_6, expert_7, expert_8, expert_9), na.rm = TRUE),
         mean = round(mean, digits = 2),
         sd   = round(sd, digits = 2))

# then we tidy the data (switching from wide format to long format) so that each row contains one expert rating for one item:

df_tidy <- df %>%
  gather("expert", "rating", -item, -subscale,
         -mean, -sd, -country_list, -country)

# we create a new data frame including only one subscale (here, sensitivity to temperature) and arrange it so that the items with the highest means and lowest standard deviations come first:

temp_sens <- df %>%
  filter(grepl('temp_sens', subscale)) %>% #select rows "temp_sens" in the subscale column
  arrange(sd) %>%
  arrange(desc(mean)) %>%
  select(item, mean, sd, country)

# and finally we print an interactive table that automatically displays the first 10 rows (this can be changed):

library(DT)
datatable(temp_sens)
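Because we only keep the 10 best-rated items per subscale, it can be handy to display just those rows and to add per-column filters to the table. Here is a small sketch building on the objects above (the object name temp_sens_top10 is ours, not part of the original script):

# keep the 10 items with the highest mean (ties broken by lowest sd) — the data
# frame is already arranged that way — and add column filters to the table
temp_sens_top10 <- temp_sens %>%
  slice_head(n = 10)

datatable(temp_sens_top10, filter = "top", options = list(pageLength = 10))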

We also plotted the distributions of the item ratings (overall, per expert, and per subscale) to detect whether there were issues with specific subscales. Here is the ggplot2 code that we used:

# first we create a color palette that will be used for the plot:

palette <- c("#85D4E3", "#F4B5BD", "#9C964A", "#C94A7D", "#CDC08C",
                  "#FAD77B", "#7294D4", "#DC863B", "#972D15")

# and here we plot the ratings of the experts per subscale:

expert_plot <- df_tidy %>%
  ggplot(aes(rating, fill = expert)) +
  geom_bar(show.legend = FALSE) +
  # subscale.labs is assumed to be a named vector of facet labels defined earlier in the script
  facet_grid(expert ~ subscale, labeller = labeller(subscale = subscale.labs)) +
  scale_fill_manual(values = palette)

expert_plot

We then created a world map of the countries that generated the final list of 120 STRAEQ-2 items, to see how geographically diverse the list was (we ended up making some replacements for the final project to increase the diversity of the scale).

# first, load the world.cities database from the 'maps' package in order to get the latitude and longitude of cities around the globe:

library(maps)
data(world.cities)

# you may then need to rename some countries in your data frame (here the data frame contains the 120 selected items and is called 'items_120') to match the country names used in the world.cities data frame; in our case we needed to rename three countries:

items_120$country[items_120$country == "United Kingdom"] <- "UK"
items_120$country[items_120$country == "United States"] <- "USA"
items_120$country[items_120$country == "Serbia"] <- "Serbia and Montenegro"
 
# the next step is to merge the desired columns from the world.cities data frame with your data frame, by country, in order to get the capitals' latitude and longitude next to your country names:

items_120 <- world.cities %>%
    filter(capital == 1) %>%
    select(country = country.etc, lat, lng = long) %>%
    left_join(items_120, ., by = "country")

# we compute the number of items per country; this is needed to vary the size of the dots on the final map:

df_country_count <- items_120 %>%
  group_by(country) %>%
  summarise(n = n())

# we create the data frame to generate the map:

items_120 <- left_join(items_120, df_country_count)

# we create the map with leaflet (depending on your data you may want to change the function used in the radius argument to adapt the differences in dot size, as well as the value of the fillOpacity argument):

library(leaflet)

m <- leaflet(items_120) %>% 
   addTiles() %>%
   addCircles(
       lng = ~lng,
       lat = ~lat, 
       weight = 1,
       radius = ~ log(n + 5) * 100000, 
       popup = ~paste(country, ":", item, "(", subscale, ")"),
       stroke = T, 
       opacity = 1, 
       fill = T, 
       color = "#a500a5", 
       fillOpacity = 0.09
    )

# finally we can print the map:
m
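If you want to share the interactive map with collaborators who do not use R, it can be exported as a standalone HTML file with the htmlwidgets package; a minimal sketch (the file name is just an example):

library(htmlwidgets)
saveWidget(m, "straeq2_item_map.html", selfcontained = TRUE)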

This is where we have arrived in the project so far. We are now translating the scales included in the project into various languages so that we can collect data for the final project. 

Interested in participating in the project?

It is still possible for you to join the project. We will ask you to: 

  1. Submit an IRB application at your site (if necessary). To make this step easier for you, we have written an IRB submission pack, which is available on the OSF page of the project.
  2. Translate the scales that are included in the main project via a forward-translation and back-translation method.
  3. Administer an online questionnaire to at least 100 participants at your site (more is always possible). 

Please note that, to account for authorship, we are using a tiered author list combined with the CRediT taxonomy. We also expect to publish a data paper from the project, which will facilitate future reuse of the data. We plan to collect data from approximately 11,000 participants across the globe (we currently have 118 sites in 48 countries), to validate the STRAEQ-2 scale across countries, to measure individual differences in the way people cope with environmental threats, and to explore how these differences are linked to individual differences in attachment.

This blog post was written by Olivier Dujols and Hans IJzerman.
